Virtual Machine Construction
for Dummies
Jim Huang <jserv@0xlab.org>
Rifur Ni <rifurdoma@gmail.com>
Xatier Lee <xatierlike@gmail.com>
Jan 29, 2013 @ TOSSUG / March 7, 2013 @ 新竹碼農 (Hsinchu Coders)
Rights to copy
Attribution – ShareAlike 3.0
You are free
to copy, distribute, display, and perform the work
to make derivative works
to make commercial use of the work
Under the following conditions
Attribution. You must give the original author credit.
Share Alike. If you alter, transform, or build upon this work, you may distribute the
resulting work only under a license identical to this one.
For any reuse or distribution, you must make clear to others the license terms of this work.
Any of these conditions can be waived if you get permission from the copyright holder.
Your fair use and other rights are in no way affected by the above.
License text: http://creativecommons.org/licenses/by-sa/3.0/legalcode
© Copyright 2015 0xlab
contact@0xlab.org
Corrections, suggestions, contributions and translations
are welcome!
Latest update: Feb 21, 2015
Goals of this Presentation
• Build a fully functional virtual machine from scratch
– The full source code is inside the slides.
• Basic concepts of interpreters, optimization
techniques, language specialization, and
platform-specific tweaks.
• Brainfuck is selected as the primary programming
language because
– it's a very simple Turing-complete programming
language.
– it's easier to write a compiler for it than an interpreter.
– it's easier to write an interpreter for it than to write
real programs in it.
Brainfuck Programming Language
• created in 1993 by Urban Müller
• Only 8 instructions
– Müller's Amiga compiler was 240 bytes in size
– Brian Raiter's x86/Linux compiler is 171 bytes!
+++[>+++++[>+++++>+++++>++++>++
++>+<<<<<-]>++++>+++>+++>+<<<<<
-]>>.>.>+.>.>-----.
Learn such a stupid language! Why?
• Understand how basic a Turing-complete
programming language can be.
– A common argument when programmers compare
languages is “well they’re all Turing-complete”,
meaning that anything you can do in one language
you can do in another.
• Once you’ve learnt brainfuck, you’ll understand just
how difficult it can be to use a Turing-complete
language, and how that argument holds no water.
Source: http://skilldrick.co.uk/2011/02/why-you-should-learn-brainfuck-or-learn-you-a-brainfuck-for-great-good/
Brainfuck: Turing Complete
Source:
http://jonfwilkins.blogspot.tw/2011/06/happy-99th-birthday-alan-turing.html
http://bugrammer.g.hatena.ne.jp/nisemono_san/20111114/1321218802
Brainfuck Instructions
(mapped to C language)
Command  C equivalent        Meaning
>        ++ptr;              Increment the data pointer to point to the next cell.
<        --ptr;              Decrement the data pointer to point to the previous cell.
+        ++*ptr;             Increment the byte value at the data pointer.
-        --*ptr;             Decrement the byte value at the data pointer.
.        putchar(*ptr);      Output the byte value at the data pointer.
,        *ptr = getchar();   Input one byte and store its value at the data pointer.
[        while (*ptr) {      If the byte value at the data pointer is zero, jump to the
                             instruction following the matching ] bracket. Otherwise,
                             continue execution.
]        }                   Unconditionally jump back to the matching [ bracket.
Writing a Brainfuck compiler is Easy!
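#!/usr/bin/awk -f
# Translate a Brainfuck program into C source.
BEGIN {
    print "#include <stdio.h>";
    print "int main() {";
    print " int c = 0;"; print " static int b[30000];\n";
}
{
    # Brackets are replaced first: the C emitted by the later rules
    # contains b[c], which must not be rewritten again.
    gsub(/\]/, " }\n");
    gsub(/\[/, " while(b[c] != 0) {\n");
    gsub(/\+/, " ++b[c];\n");
    gsub(/-/, " --b[c];\n");
    gsub(/>/, " ++c;\n");
    gsub(/</, " --c;\n");
    gsub(/\./, " putchar(b[c]);\n");
    gsub(/,/, " b[c] = getchar();\n");
    print $0
}
END {
    print "\n return 0;";
    print "}";
}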
Brainfuck interpreter in portable C
#include <stdio.h>
#include <stdlib.h>
int p;                              /* data pointer: index into the tape a[] */
char a[5000], f[5000], b, *s = f;   /* tape, program text, bracket depth */
void interpret(char *c)
{
    char *d;
    while (*c) {
        switch (*c++) {
        case '<': p--; break;
        case '>': p++; break;
        case '+': a[p]++; break;
        case '-': a[p]--; break;
        case '.':
            putchar(a[p]);
            fflush(stdout); break;
        case ',':
            a[p] = getchar();
            fflush(stdout); break;
        case '[':
            /* scan forward to the matching ']' */
            for (b = 1, d = c; b && *c; c++)
                b += *c == '[', b -= *c == ']';
            if (!b) {
                c[-1] = 0;          /* temporarily terminate the loop body */
                while (a[p]) interpret(d);
                c[-1] = ']'; break;
            }
            /* fall through: no matching ']' was found */
        case ']':
            puts("Unbalanced brackets"), exit(1);
        }
        if (p < 0 || p >= (int)sizeof a)
            puts("Range error"), exit(1);
    }
}
int main(int argc, char *argv[])
{
    FILE *z;
    int ch;                         /* int, so that EOF can be detected */
    if (argc < 2 || !(z = fopen(argv[1], "r")))
        return 1;
    while ((ch = getc(z)) != EOF && s < f + sizeof f - 1)
        *s++ = ch;
    *s = 0; fclose(z);
    interpret(f);
    return 0;
}
Self-Interpreter can be short!
Written by Oleg Mazonka & Daniel B. Cristofani
21 November 2003
>>>+[[-]>>[-]++>+>+++++++[<++++>>++<-]++>>+>+>+++++[>
++>++++++<<-]+>>>,<++[[>[->>]<[>>]<<-]<[<]<+>>[>]>[<+
>-[[<+>-]>]<[[[-]<]++<-[<+++++++++>[<->-]>>]>>]]<<]<]
<
[[<]>[[>]>>[>>]+[<<]<[<]<+>>-]>[>]+[->>]<<<<[[<<]<[<]
+<<[+>+<<-[>-->+<<-[>+<[>>+<<-]]]>[<+>-]<]++>>-->[>]>
>[>>]]<<[>>+<[[<]<]>[[<<]<[<]+[-<+>>-[<<+>++>-[<->[<<
+>>-]]]<[>+<-]>]>[>]>]>[>>]>>]<<[>>+>>+>>]<<[->>>>>>>
>]<<[>.>>>>>>>]<<[>->>>>>]<<[>,>>>]<<[>+>]<<[+<<]<]
Turing Complete (again)
• In fact, Brainfuck has 6 opcodes plus input/output
commands
• Gray areas for I/O (implementation dependent)
– EOF
– tape length
– cell type
– newlines
• That is enough to program!
• Extension: self-modifying Brainfuck
https://soulsphere.org/hacks/smbf/
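The EOF gray area can be made concrete with a small C sketch. The three policies below (store 0, store -1, leave the cell unchanged) are conventions commonly seen in Brainfuck implementations; the names `eof_mode` and `store_input` are hypothetical and do not come from the slides.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical names for three common EOF conventions. */
enum eof_mode { EOF_ZERO, EOF_MINUS_ONE, EOF_UNCHANGED };

/* Compute the new cell value for ',' given the result of getchar(). */
unsigned char store_input(int ch, unsigned char old, enum eof_mode mode)
{
    assert(ch == EOF || (ch >= 0 && ch <= 255));
    if (ch != EOF)
        return (unsigned char)ch;
    switch (mode) {
    case EOF_ZERO:      return 0;
    case EOF_MINUS_ONE: return 255;   /* -1 in an 8-bit wrapping cell */
    default:            return old;   /* EOF_UNCHANGED */
    }
}
```

A program such as `+[,.]` only terminates at end of input under the first policy, which is why portable Brainfuck programs have to state which convention they expect.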
Statement: while
• Implementing a while statement is easy, because the
Brainfuck [ .. ] statement is a while loop.
• Thus, while (x) { <foobar> } becomes
(move pointer to x)
[
(foobar)
(move pointer to x)
]
Statement: x=y
• Implementing assignment (copy) instructions is
complex.
• The straightforward way of doing that resets y to zero:
(move pointer to y) [ -
(move pointer to x) +
(move pointer to y) ]
• A temporary variable t is needed:
(move pointer to y) [ -
(move pointer to t) +
(move pointer to y) ]
(move pointer to t) [ -
(move pointer to x) +
(move pointer to y) +
(move pointer to t) ]
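The two loops of the temporary-variable version can be traced in C. This is a minimal sketch, assuming byte-sized cells and three distinct cell indices; the function name `bf_assign` is hypothetical, and the sketch clears x first so that the result is a true assignment (the slide's snippet adds y to whatever x already holds).

```c
#include <assert.h>
#include <stddef.h>

/* Simulate the Brainfuck idiom for x = y on a tape of byte cells.
 * x, y and t are distinct cell indices; t is a scratch cell. */
void bf_assign(unsigned char *cell, size_t x, size_t y, size_t t)
{
    assert(cell[t] == 0);   /* the idiom relies on an empty temporary */

    cell[x] = 0;            /* [-] at x (assumption: clear the target first) */

    /* y [ - t + y ] : move y into t, leaving y at zero */
    while (cell[y]) {
        cell[y]--;
        cell[t]++;
    }
    /* t [ - x + y + t ] : move t into both x and y, restoring y */
    while (cell[t]) {
        cell[t]--;
        cell[x]++;
        cell[y]++;
    }
}
```

The cost is proportional to the value being copied, which is one reason the copy-loop optimization later in the deck pays off.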
Statement: if
• The if statement is like a while-loop, but it should run
its block only once. Again, a temporary variable is
needed to implement if (x) { <foobar> }:
(move pointer to x) [ -
(move pointer to t) +
(move pointer to x) ]
(move pointer to t) [
    [ -
    (move pointer to x) +
    (move pointer to t) ]
    (foobar)
(move pointer to t) ]
Example: clean
[-]
while(cell[0]) {
--cell[0];
}
Example: cat
+ [ , . ]
cell[0] ← 1
while(cell[0]) {
Read-in a character
print it
}
Example: if-endif
$f +
$A + [
$B + /* $B = $B + 1 */
$f [-] /* end if */
]
$A = 1;
if($A) {
$B = $B + 1;
}
Example: if-else-endif
$f +
$A + [
$B + /* $B = $B + 1 */
$f [-] /* end if */
] $f [
$B - /* $B = $B - 1 */
$f [-] /* end if */
]
$A = 1;
if($A) { ++$B; } else { --$B; }
Example: Multiply (6x7)
+++ +++
[ > +++ +++ +
< - ]
cell[0] ← 6
while(cell[0]) {
cell[1] += 7;
--cell[0];
}
Example: Division (12/4)
++++++++++++ // Set first cell to 12
[ // Loop until first cell is zero (3 times)
>+ // Increment second cell
<---- // Subtract 4 from first cell
]
>. // Output second cell as a raw byte (value 3)
Example: Hello World!
++++++++[>+++++++++<-]>. // 8 x 9 = 72 (H)
<+++++[>++++++<-]>-. // 72 + (6 x 5) - 1 = 101 (e)
+++++++.. // 101 + 7 = 108 (l)
+++. // 108 + 3 = 111 (o)
<++++++++[>>++++<<-]>>. // 8 x 4 = 32 (SPACE)
<<++++[>------<-]>. // 111 - 24 = 87 (W)
<++++[>++++++<-]>. // 87 + 24 = 111 (o)
+++. // 111 + 3 = 114 (r)
------. // 114 - 6 = 108 (l)
--------. // 108 - 8 = 100 (d)
>+. // 32 + 1 = 33 (!)
Example: Hello World!
Tape trace for the program below (p points at a[0] when printing; a[1] is the loop counter):
a[0]   a[1]   output
  0      0
 72      0    H
101      0    e
108      0    ll
111      0    o
 32      0    (space)
 87      0    W
111      0    o
114      0    r
108      0    l
100      0    d
 33      0    !
 10      0    \n
>+++++++++[<++++++++>-]<.>+++++++[<++++>-]<+.
+++++++..+++.[-]>++++++++[<++++>-]<.
>+++++++++++[<+++++>-]<.>++++++++[<+++>-]<.
+++.------.--------.[-]>++++++++[<++++>-]<+.
[-]++++++++++.
Example: Bubble Sort
>>>>>,.[>>>,.]
<<<
[<<<
[>>>
[-<<<-<+>[>]>>]
<<<[<]>>
[>>>+<<<-]<
[>+>>>+<<<<-]
<<]
>>>[.[-]]
>>>[>>>]<<<]
Idea: if(b>a): swap(a, b)
Operation: decrement $a and $b. Then, store the smaller one into $t.
Idea: if(b>a): swap(a, b)
Operation: when b > a, assign the value of $b to $a.
Online interpreter: http://work.damow.net/random/bf-turing/
Brainfuck Toolchain
:: interpreter, translator, virtual machine, nested runtime ::
Nested Interpreting
• Translation:
– BF extensions → Brainfuck
– Other languages → Brainfuck
• Interpreter written in Brainfuck runs on BF VM
DEF
def Copy(s, d, t) {
s [ d+ t+ s- ]
t [ s+ t- ]
}
END
MAIN
$100=10 $200=0 $300=0
Copy($100, $200, $300)
#define Copy(argc, s, d, t) \
s[d+t+s-] t[s+t-]
$100 =10 $200 =0 $300 =0 Copy(3, $100, $200, $300)
$100 =10 $200 =0 $300 =0
$100 [ $200+ $300+ $100- ]
$300 [ $100+ $300- ]
Brainfuck translator
(use BF as the backend)
• Translate C-like language to Brainfuck
– bfc [C] → bfa [Assembly] → bf [Machine/CPU]
http://www.clifford.at/bfcpu/bfcomp.html
• Another C to Brainfuck
http://esolangs.org/wiki/C2BF
• BASIC to Brainfuck
http://esolangs.org/wiki/BFBASIC
Compile Brainfuck into ELF
Using Artificial Intelligence to Write Self-Modifying/Improving Programs
• The AI program works as follows:
– A genome consists of an array of doubles.
– Each gene corresponds to an instruction in the brainf-ck
programming language.
– Start with a population of random genomes.
– Decode each genome into a resulting program by converting
each double into its corresponding instruction and execute the
program.
– Get each program's fitness score, based upon the output it
writes to the console (if any), and rank them.
– Mate the best genomes together using roulette selection,
crossover, and mutation to produce a new generation.
– Repeat the process with the new generation until the target
fitness score is achieved.
Source: http://www.primaryobjects.com/CMS/Article149
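The decoding step (each double becomes one instruction) can be sketched in C. This is only an illustration under stated assumptions: genes are taken to lie in [0, 1) and to be split into eight equal-width buckets, and both the instruction order in `BF_OPS` and the name `decode_genome` are hypothetical rather than taken from the cited article.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bucket order; the real mapping may differ. */
static const char BF_OPS[8] = {'>', '<', '+', '-', '.', ',', '[', ']'};

/* Decode n genes (each expected in [0, 1)) into a NUL-terminated program.
 * out must have room for n + 1 bytes. */
void decode_genome(const double *genes, size_t n, char *out)
{
    assert(genes != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        int slot = (int)(genes[i] * 8.0);   /* 8 equal-width buckets */
        if (slot < 0) slot = 0;             /* clamp out-of-range genes */
        if (slot > 7) slot = 7;
        out[i] = BF_OPS[slot];
    }
    out[n] = '\0';
}
```

Because every double decodes to some instruction, crossover and mutation always yield a syntactically decodable program, although its brackets may still be unbalanced and the fitness function has to penalize that.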
Runtime Optimizations
Mandelbrot
Incremental optimizing interpreter
https://github.com/xatier/brainfuck-tools
https://github.com/xatier/brainfuck-bench
Interpreter vs. Static Compiler
Interpreters and JIT compilers:
Implementation (user-space)              Execution time (seconds)
simple bf                                91.50
slight optimizations                      8.03
bff                                       5.04
vff                                       3.10
vm + optimizations                        3.02
BF-JIT                                   17.78
BF-JIT + optimizations                    1.37
simple xbyak JIT                          3.25
xbyak JIT + optimizations                 0.93
custom JIT + aggressive optimizations     0.77
Simple lightning JIT                      1.27

Static compilers:
Implementation (user-space)              Execution time (seconds)
simple BF to C                            1.27
awib to C                                 1.05
esotope-bfc                               0.72
bftran to C                               0.66
bftran to ELF32c                          3.58

The executable generated by static compiler (2 pass: BF → C → x86_64) is likely slower than optimized interpreters.
The fastest interpreter record appears on Lenovo X230 [Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz].
Plain JIT compilation without effective optimizations is slower than portable interpreters!
11x speedup!
Walk through typical Design Patterns
• Classify the executions of Brainfuck programs
• Eliminate redundancy
– CSE: common sub-expression elimination
• Quick instructions
• Hotspot
– Replace with faster implementation
• Enable Just-In-Time compilation
Pattern: +++
++++++++++
*ptr += 10
Pattern: >>>
>>>>>>>>>>
ptr += 10
Pattern: +++
++++++++++
*ptr += 10
Unoptimized: (Fetch Instruction → Eval Increment) repeated 10 times: 10 instructions
Optimized: Fetch Instruction → Eval Calc +10: 1 instruction
Pattern: [-]
[-]
*ptr = 0
Pattern: [>+<-]
[>+<-]
*(ptr+1) += *ptr
*ptr = 0
Pattern: [-]
[-]
Interpret:
if(!*ptr) goto ];--*ptr;goto [;
Unoptimized: Fetch Instruction → Jump If Zero; Fetch Instruction → Eval Decrement; Fetch Instruction → Jump (contains branch)
Optimized: Fetch Instruction → Eval Reset Zero (1 instruction, no branch)
Optimization Techniques
• To evaluate the impact different optimization
techniques can have on performance, we need a
set of Brainfuck programs that are sufficiently
non-trivial for optimization to make sense.
– awib-0.4 (Brainfuck compiler)
– factor.b
– mandelbrot.b
– hanoi.b
– dbfi.b (self-interpreter)
– long.b
• source: https://github.com/matslina/bfoptimization
Benchmark Results
(without optimizations)
Benchmark Results
(w/ and w/o optimizations)
Benchmark Results
(Contraction)
IR            C
add(x)        mem[p] += x;
sub(x)        mem[p] -= x;
right(x)      p += x;
left(x)       p -= x;
output        putchar(mem[p]);
input         mem[p] = getchar();
open_loop     while(mem[p]) {
close_loop    }

+++++[->>>++<<<]>>>.

Without contraction:
mem[p]++;
mem[p]++;
mem[p]++;
mem[p]++;
mem[p]++;
while (mem[p]) {
    mem[p]--;
    p++;
    p++;
    p++;
    mem[p]++;
    mem[p]++;
    p--;
    p--;
    p--;
}
p++;
p++;
p++;
putchar(mem[p]);

With contraction:
mem[p] += 5;
while (mem[p]) {
    mem[p] -= 1;
    p += 3;
    mem[p] += 2;
    p -= 3;
}
p += 3;
putchar(mem[p]);
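The contraction pass itself is a run-length fold over the source text. The sketch below is one possible way to write it in C; `struct op` and `contract` are hypothetical names, and the pass deliberately does not cancel opposite commands such as `+-`.

```c
#include <assert.h>
#include <stddef.h>

/* One IR instruction: the Brainfuck command and how often it repeats. */
struct op {
    char kind;   /* one of + - < > . , [ ] */
    int count;   /* repeat count; always 1 for . , [ ] */
};

/* Fold runs of + - < > in src into single ops. Characters that are not
 * Brainfuck commands are skipped as comments.
 * Returns the number of ops written (at most cap). */
size_t contract(const char *src, struct op *out, size_t cap)
{
    size_t n = 0;
    assert(src != NULL && out != NULL);
    for (; *src && n < cap; src++) {
        char c = *src;
        int foldable = (c == '+' || c == '-' || c == '<' || c == '>');
        int command = foldable || c == '.' || c == ',' || c == '[' || c == ']';
        if (!command)
            continue;
        if (foldable && n > 0 && out[n - 1].kind == c) {
            out[n - 1].count++;          /* extend the current run */
        } else {
            out[n].kind = c;             /* start a new op */
            out[n].count = 1;
            n++;
        }
    }
    return n;
}
```

For the example program `+++++[->>>++<<<]>>>.` this yields nine ops instead of twenty commands, matching the shorter C listing.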
Benchmark Results
(Clear loops)
IR            C
add           mem[p]++;
sub           mem[p]--;
right         p++;
left          p--;
output        putchar(mem[p]);
input         mem[p] = getchar();
open_loop     while(mem[p]) {
close_loop    }
clear         mem[p] = 0;

[-]
Eval Reset Zero
Benchmark Results
(Copy loops)
[->+>+<<]
mem[p+1] += mem[p];
mem[p+2] += mem[p];
mem[p] = 0;

IR            C
add(x)        mem[p] += x;
sub(x)        mem[p] -= x;
right(x)      p += x;
left(x)       p -= x;
output        putchar(mem[p]);
input         mem[p] = getchar();
open_loop     while(mem[p]) {
close_loop    }
clear         mem[p] = 0;
copy(x)       mem[p+x] += mem[p];
Benchmark Results
(Multiplication loops)
IR            C
add(x)        mem[p] += x;
sub(x)        mem[p] -= x;
right(x)      p += x;
left(x)       p -= x;
output        putchar(mem[p]);
input         mem[p] = getchar();
open_loop     while(mem[p]) {
close_loop    }
clear         mem[p] = 0;
mul(x,y)      mem[p+x] += mem[p] * y;

[->+++>+++++++<<]
mem[p+1] += mem[p] * 3;
mem[p+2] += mem[p] * 7;
mem[p] = 0;
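Recognizing a multiplication loop amounts to checking three conditions on the loop body: it contains only `+ - < >`, the pointer ends where it started, and the counter cell changes by exactly -1 per iteration. The C sketch below checks these conditions and collects the per-offset factors; `analyse_mul_loop` and the `MAX_OFF` limit are hypothetical, and bodies containing comment characters are conservatively rejected.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OFF 8   /* hypothetical limit on how far a loop may reach */

/* Analyse a loop body (the text between '[' and ']').
 * For a multiplication loop, factor[MAX_OFF + off] receives the multiplier
 * for each offset off in [-MAX_OFF, MAX_OFF] and 1 is returned.
 * Otherwise 0 is returned. */
int analyse_mul_loop(const char *body, int factor[2 * MAX_OFF + 1])
{
    int off = 0;
    assert(body != NULL && factor != NULL);
    for (int i = 0; i < 2 * MAX_OFF + 1; i++)
        factor[i] = 0;
    for (; *body; body++) {
        switch (*body) {
        case '>': off++; break;
        case '<': off--; break;
        case '+': case '-':
            if (off < -MAX_OFF || off > MAX_OFF)
                return 0;
            factor[MAX_OFF + off] += (*body == '+') ? 1 : -1;
            break;
        default:
            return 0;   /* I/O, nested loop or comment: not a simple loop */
        }
    }
    /* The pointer must return home and the counter must drop by one. */
    return off == 0 && factor[MAX_OFF] == -1;
}
```

A copy loop is the special case in which every factor other than the counter's is 1.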
Benchmark Results
(Operation offsets)
IR            C
add(x,off)    mem[p+off] += x;
sub(x,off)    mem[p+off] -= x;
right(x)      p += x;
left(x)       p -= x;
output        putchar(mem[p+off]);
input         mem[p+off] = getchar();
open_loop     while(mem[p]) {
close_loop    }
clear         mem[p+off] = 0;
mul(x,y)      mem[p+x+off] += mem[p+off] * y;
Both the copy loop and multiplication loop
optimizations share an interesting trait: they
perform an arithmetic operation at an offset from
the current cell. In brainfuck we often find long
sequences of non-loop operations and these
sequences typically contain a fair number of <
and >. Why waste time moving the pointer
around?
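One way to avoid moving the pointer is to track the offset at translation time and emit a single pointer update at the end of each straight-line run. The C sketch below does this for runs of `+ - < >`; the function name `emit_with_offsets` and the exact output format are assumptions, not the benchmarked implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Translate a straight-line run of + - < > into C statements that address
 * cells by offset and move the pointer at most once, at the end.
 * Translation stops at the first character that is not + - < >.
 * Returns the number of statements written, or -1 if out is too small. */
int emit_with_offsets(const char *src, char *out, size_t cap)
{
    size_t used = 0;
    int off = 0, statements = 0, w;

    assert(src != NULL && out != NULL && cap > 0);
    out[0] = '\0';
    while (*src == '+' || *src == '-' || *src == '<' || *src == '>') {
        if (*src == '>' || *src == '<') {
            off += (*src == '>') ? 1 : -1;   /* track, but emit nothing yet */
            src++;
            continue;
        }
        char sign = *src;
        int count = 0;
        while (*src == sign) {               /* fold a run of + or - */
            count++;
            src++;
        }
        w = snprintf(out + used, cap - used, "mem[p%+d] %c= %d;\n",
                     off, sign, count);
        if (w < 0 || (size_t)w >= cap - used)
            return -1;
        used += (size_t)w;
        statements++;
    }
    if (off != 0) {                          /* one pointer move for the run */
        w = snprintf(out + used, cap - used, "p += %d;\n", off);
        if (w < 0 || (size_t)w >= cap - used)
            return -1;
        statements++;
    }
    return statements;
}
```

For `>+>++<<-` the sketch emits three statements and no pointer move at all, because the net pointer displacement of the run is zero.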
Benchmark Results
(Scan loops)
IR            C
add(x)        mem[p] += x;
sub(x)        mem[p] -= x;
right(x)      p += x;
left(x)       p -= x;
output        putchar(mem[p]);
input         mem[p] = getchar();
open_loop     while(mem[p]) {
close_loop    }
clear         mem[p] = 0;
mul(x,y)      mem[p+x] += mem[p] * y;
ScanLeft      p -= (long)((void *)(mem + p) - memrchr(mem, 0, p + 1));
ScanRight     p += (long)(memchr(mem + p, 0, sizeof(mem)) - (void *)(mem + p));

+<[>-]>[>]<
The problem of efficiently searching a memory
area for occurrences of a particular byte is mostly
solved by the C standard library’s memchr()
function, which operates by loading full memory
words (typically 32 or 64 bits) into a CPU register
and checking the individual 8-bit components in
parallel. This proves to be much more efficient
than loading and inspecting bytes one at a time.
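The table entries above rely on `memrchr()` and on arithmetic with `void *`, both of which are GNU extensions. A more portable sketch, assuming byte-sized cells and a tape of known size, might look as follows; `scan_right` and `scan_left` are hypothetical names, and the leftward scan uses a plain loop in place of `memrchr()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Equivalent of [>]: find the nearest zero cell at or after index p.
 * Returns its index, or size if no zero cell exists. */
size_t scan_right(const unsigned char *mem, size_t size, size_t p)
{
    assert(mem != NULL && p < size);
    const unsigned char *hit = memchr(mem + p, 0, size - p);
    return hit ? (size_t)(hit - mem) : size;
}

/* Equivalent of [<]: find the nearest zero cell at or before index p.
 * A plain loop stands in for memrchr(), which is a GNU extension.
 * Returns its index, or size if no zero cell exists. */
size_t scan_left(const unsigned char *mem, size_t size, size_t p)
{
    assert(mem != NULL && p < size);
    for (size_t i = p + 1; i-- > 0; )
        if (mem[i] == 0)
            return i;
    return size;
}
```

Returning `size` for "not found" lets the caller report a range error instead of dereferencing a null result, a case the one-line table entries do not handle.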
Benchmark Results
(apply all techniques)
2.4x speedup
130x speedup
Reference
1. Principles of Compiler Design: The Brainf*ck Compiler
http://www.clifford.at/papers/2004/compiler/
2. brainfuck optimization strategies
http://calmerthanyouare.org/2015/01/07/optimizing-brainfuck.html
3. Brainf*ck Compiler Project
http://www.clifford.at/bfcpu/bfcomp.html
4. Brainfuck code generation
http://esolangs.org/wiki/Brainfuck_code_generation

More Related Content

What's hot

The Microkernel Mach Under NeXTSTEP
The Microkernel Mach Under NeXTSTEPThe Microkernel Mach Under NeXTSTEP
The Microkernel Mach Under NeXTSTEP
Gregor Schmidt
 

What's hot (20)

Embedded Virtualization applied in Mobile Devices
Embedded Virtualization applied in Mobile DevicesEmbedded Virtualization applied in Mobile Devices
Embedded Virtualization applied in Mobile Devices
 
The Internals of "Hello World" Program
The Internals of "Hello World" ProgramThe Internals of "Hello World" Program
The Internals of "Hello World" Program
 
GDB Rocks!
GDB Rocks!GDB Rocks!
GDB Rocks!
 
from Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu Worksfrom Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu Works
 
The Microkernel Mach Under NeXTSTEP
The Microkernel Mach Under NeXTSTEPThe Microkernel Mach Under NeXTSTEP
The Microkernel Mach Under NeXTSTEP
 
Qemu JIT Code Generator and System Emulation
Qemu JIT Code Generator and System EmulationQemu JIT Code Generator and System Emulation
Qemu JIT Code Generator and System Emulation
 
Learn C Programming Language by Using GDB
Learn C Programming Language by Using GDBLearn C Programming Language by Using GDB
Learn C Programming Language by Using GDB
 
Q2.12: Debugging with GDB
Q2.12: Debugging with GDBQ2.12: Debugging with GDB
Q2.12: Debugging with GDB
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machine
 
Understand more about C
Understand more about CUnderstand more about C
Understand more about C
 
Java Web 程式之效能技巧與安全防護
Java Web 程式之效能技巧與安全防護Java Web 程式之效能技巧與安全防護
Java Web 程式之效能技巧與安全防護
 
Pwning in c++ (basic)
Pwning in c++ (basic)Pwning in c++ (basic)
Pwning in c++ (basic)
 
QEMU - Binary Translation
QEMU - Binary Translation QEMU - Binary Translation
QEMU - Binary Translation
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
 
Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)
 
ROP 輕鬆談
ROP 輕鬆談ROP 輕鬆談
ROP 輕鬆談
 
ZynqMPのブートとパワーマネージメント : (ZynqMP Boot and Power Management)
ZynqMPのブートとパワーマネージメント : (ZynqMP Boot and Power Management)ZynqMPのブートとパワーマネージメント : (ZynqMP Boot and Power Management)
ZynqMPのブートとパワーマネージメント : (ZynqMP Boot and Power Management)
 
Performance Wins with BPF: Getting Started
Performance Wins with BPF: Getting StartedPerformance Wins with BPF: Getting Started
Performance Wins with BPF: Getting Started
 
YoctoをつかったDistroの作り方とハマり方
YoctoをつかったDistroの作り方とハマり方YoctoをつかったDistroの作り方とハマり方
YoctoをつかったDistroの作り方とハマり方
 

Viewers also liked

Viewers also liked (12)

進階嵌入式系統開發與實做 (2014 年秋季 ) 課程說明
進階嵌入式系統開發與實做 (2014 年秋季 ) 課程說明進階嵌入式系統開發與實做 (2014 年秋季 ) 課程說明
進階嵌入式系統開發與實做 (2014 年秋季 ) 課程說明
 
中輟生談教育: 完全用開放原始碼軟體進行 嵌入式系統教學
中輟生談教育: 完全用開放原始碼軟體進行 嵌入式系統教學中輟生談教育: 完全用開放原始碼軟體進行 嵌入式系統教學
中輟生談教育: 完全用開放原始碼軟體進行 嵌入式系統教學
 
Making Linux do Hard Real-time
Making Linux do Hard Real-timeMaking Linux do Hard Real-time
Making Linux do Hard Real-time
 
Xvisor: embedded and lightweight hypervisor
Xvisor: embedded and lightweight hypervisorXvisor: embedded and lightweight hypervisor
Xvisor: embedded and lightweight hypervisor
 
Develop Your Own Operating Systems using Cheap ARM Boards
Develop Your Own Operating Systems using Cheap ARM BoardsDevelop Your Own Operating Systems using Cheap ARM Boards
Develop Your Own Operating Systems using Cheap ARM Boards
 
從線上售票看作業系統設計議題
從線上售票看作業系統設計議題從線上售票看作業系統設計議題
從線上售票看作業系統設計議題
 
Explore Android Internals
Explore Android InternalsExplore Android Internals
Explore Android Internals
 
給自己更好未來的 3 個練習:嵌入式作業系統設計、實做,與移植 (2015 年春季 ) 課程說明
給自己更好未來的 3 個練習:嵌入式作業系統設計、實做,與移植 (2015 年春季 ) 課程說明給自己更好未來的 3 個練習:嵌入式作業系統設計、實做,與移植 (2015 年春季 ) 課程說明
給自己更好未來的 3 個練習:嵌入式作業系統設計、實做,與移植 (2015 年春季 ) 課程說明
 
Implement Runtime Environments for HSA using LLVM
Implement Runtime Environments for HSA using LLVMImplement Runtime Environments for HSA using LLVM
Implement Runtime Environments for HSA using LLVM
 
Lecture notice about Embedded Operating System Design and Implementation
Lecture notice about Embedded Operating System Design and ImplementationLecture notice about Embedded Operating System Design and Implementation
Lecture notice about Embedded Operating System Design and Implementation
 
PyPy's approach to construct domain-specific language runtime
PyPy's approach to construct domain-specific language runtimePyPy's approach to construct domain-specific language runtime
PyPy's approach to construct domain-specific language runtime
 
Priority Inversion on Mars
Priority Inversion on MarsPriority Inversion on Mars
Priority Inversion on Mars
 

Similar to Virtual Machine Constructions for Dummies

Python for High School Programmers
Python for High School ProgrammersPython for High School Programmers
Python for High School Programmers
Siva Arunachalam
 
Nosql hands on handout 04
Nosql hands on handout 04Nosql hands on handout 04
Nosql hands on handout 04
Krishna Sankar
 
Pycon 2011 talk (may not be final, note)
Pycon 2011 talk (may not be final, note)Pycon 2011 talk (may not be final, note)
Pycon 2011 talk (may not be final, note)
c.titus.brown
 

Similar to Virtual Machine Constructions for Dummies (20)

Τα Πολύ Βασικά για την Python
Τα Πολύ Βασικά για την PythonΤα Πολύ Βασικά για την Python
Τα Πολύ Βασικά για την Python
 
Python 1
Python 1Python 1
Python 1
 
Tree Top
Tree TopTree Top
Tree Top
 
Building a DSL with GraalVM (VoxxedDays Luxembourg)
Building a DSL with GraalVM (VoxxedDays Luxembourg)Building a DSL with GraalVM (VoxxedDays Luxembourg)
Building a DSL with GraalVM (VoxxedDays Luxembourg)
 
Python于Web 2.0网站的应用 - QCon Beijing 2010
Python于Web 2.0网站的应用 - QCon Beijing 2010Python于Web 2.0网站的应用 - QCon Beijing 2010
Python于Web 2.0网站的应用 - QCon Beijing 2010
 
03 tk2123 - pemrograman shell-2
03   tk2123 - pemrograman shell-203   tk2123 - pemrograman shell-2
03 tk2123 - pemrograman shell-2
 
Building a DSL with GraalVM (CodeOne)
Building a DSL with GraalVM (CodeOne)Building a DSL with GraalVM (CodeOne)
Building a DSL with GraalVM (CodeOne)
 
PesterSec: Using Pester & ScriptAnalyzer to Detect Obfuscated PowerShell
PesterSec: Using Pester & ScriptAnalyzer to Detect Obfuscated PowerShellPesterSec: Using Pester & ScriptAnalyzer to Detect Obfuscated PowerShell
PesterSec: Using Pester & ScriptAnalyzer to Detect Obfuscated PowerShell
 
A Few of My Favorite (Python) Things
A Few of My Favorite (Python) ThingsA Few of My Favorite (Python) Things
A Few of My Favorite (Python) Things
 
Python for High School Programmers
Python for High School ProgrammersPython for High School Programmers
Python for High School Programmers
 
Malcon2017
Malcon2017Malcon2017
Malcon2017
 
901131 examples
901131 examples901131 examples
901131 examples
 
Python quickstart for programmers: Python Kung Fu
Python quickstart for programmers: Python Kung FuPython quickstart for programmers: Python Kung Fu
Python quickstart for programmers: Python Kung Fu
 
Pdxpugday2010 pg90
Pdxpugday2010 pg90Pdxpugday2010 pg90
Pdxpugday2010 pg90
 
Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython
Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPythonByterun, a Python bytecode interpreter - Allison Kaptur at NYCPython
Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython
 
Nosql hands on handout 04
Nosql hands on handout 04Nosql hands on handout 04
Nosql hands on handout 04
 
Groovy
GroovyGroovy
Groovy
 
忙しい人のためのSphinx 入門 demo
忙しい人のためのSphinx 入門 demo忙しい人のためのSphinx 入門 demo
忙しい人のためのSphinx 入門 demo
 
Pycon 2011 talk (may not be final, note)
Pycon 2011 talk (may not be final, note)Pycon 2011 talk (may not be final, note)
Pycon 2011 talk (may not be final, note)
 
JSDC 2014 - functional java script, why or why not
JSDC 2014 - functional java script, why or why notJSDC 2014 - functional java script, why or why not
JSDC 2014 - functional java script, why or why not
 

More from National Cheng Kung University

More from National Cheng Kung University (14)

Making Linux do Hard Real-time
Making Linux do Hard Real-timeMaking Linux do Hard Real-time
Making Linux do Hard Real-time
 
2016 年春季嵌入式作業系統課程說明
2016 年春季嵌入式作業系統課程說明2016 年春季嵌入式作業系統課程說明
2016 年春季嵌入式作業系統課程說明
 
進階嵌入式作業系統設計與實做 (2015 年秋季 ) 課程說明
進階嵌入式作業系統設計與實做 (2015 年秋季 ) 課程說明進階嵌入式作業系統設計與實做 (2015 年秋季 ) 課程說明
進階嵌入式作業系統設計與實做 (2015 年秋季 ) 課程說明
 
Construct an Efficient and Secure Microkernel for IoT
Construct an Efficient and Secure Microkernel for IoTConstruct an Efficient and Secure Microkernel for IoT
Construct an Efficient and Secure Microkernel for IoT
 
F9: A Secure and Efficient Microkernel Built for Deeply Embedded Systems
F9: A Secure and Efficient Microkernel Built for Deeply Embedded SystemsF9: A Secure and Efficient Microkernel Built for Deeply Embedded Systems
F9: A Secure and Efficient Microkernel Built for Deeply Embedded Systems
 
Open Source from Legend, Business, to Ecosystem
Open Source from Legend, Business, to EcosystemOpen Source from Legend, Business, to Ecosystem
Open Source from Legend, Business, to Ecosystem
 
Summer Project: Microkernel (2013)
Summer Project: Microkernel (2013)Summer Project: Microkernel (2013)
Summer Project: Microkernel (2013)
 
進階嵌入式系統開發與實作 (2013 秋季班 ) 課程說明
進階嵌入式系統開發與實作 (2013 秋季班 ) 課程說明進階嵌入式系統開發與實作 (2013 秋季班 ) 課程說明
進階嵌入式系統開發與實作 (2013 秋季班 ) 課程說明
 
Faults inside System Software
Faults inside System SoftwareFaults inside System Software
Faults inside System Software
 
Hints for L4 Microkernel
Hints for L4 MicrokernelHints for L4 Microkernel
Hints for L4 Microkernel
 
Shorten Device Boot Time for Automotive IVI and Navigation Systems
Shorten Device Boot Time for Automotive IVI and Navigation SystemsShorten Device Boot Time for Automotive IVI and Navigation Systems
Shorten Device Boot Time for Automotive IVI and Navigation Systems
 
Microkernel Evolution
Microkernel EvolutionMicrokernel Evolution
Microkernel Evolution
 
Develop Your Own Operating System
Develop Your Own Operating SystemDevelop Your Own Operating System
Develop Your Own Operating System
 
olibc: Another C Library optimized for Embedded Linux
olibc: Another C Library optimized for Embedded Linuxolibc: Another C Library optimized for Embedded Linux
olibc: Another C Library optimized for Embedded Linux
 

Recently uploaded

“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
Muhammad Subhan
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
FIDO Alliance
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc
 

Recently uploaded (20)

WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024
 
Collecting & Temporal Analysis of Behavioral Web Data - Tales From The Inside
Collecting & Temporal Analysis of Behavioral Web Data - Tales From The InsideCollecting & Temporal Analysis of Behavioral Web Data - Tales From The Inside
Collecting & Temporal Analysis of Behavioral Web Data - Tales From The Inside
 
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
 
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
 
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
 
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
 
Introduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxIntroduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptx
 
Virtual Machine Constructions for Dummies

  • 1. Virtual Machine Construction for Dummies Jim Huang <jserv@0xlab.org> Rifur Ni <rifurdoma@gmail.com> Xatier Lee <xatierlike@gmail.com> Jan 29, 2013 ! TOSSUG / March 7, 2013 ! 新竹碼農
  • 2. Rights to copy Attribution – ShareAlike 3.0 You are free to copy, distribute, display, and perform the work to make derivative works to make commercial use of the work Under the following conditions Attribution. You must give the original author credit. Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one. For any reuse or distribution, you must make clear to others the license terms of this work. Any of these conditions can be waived if you get permission from the copyright holder. Your fair use and other rights are in no way affected by the above. License text: http://creativecommons.org/licenses/by-sa/3.0/legalcode © Copyright 2015 0xlab contact@0xlab.org Corrections, suggestions, contributions and translations are welcome! Latest update: Feb 21, 2015
  • 3. Goals of this Presentation • Build a fully functional virtual machine from scratch – The full source code is inside the slides. • Basic concepts of interpreters, optimization techniques, language specialization, and platform-specific tweaks. • Brainfuck is selected as the primary programming language because – it's a very simple Turing-complete programming language. – it's easier to write its compiler than its interpreter. – it's easier to write its interpreter than its real programs.
  • 4. Brainfuck Programming Language • created in 1993 by Urban Müller • Only 8 instructions – Müller's Amiga compiler was 240 bytes in size – x86/Linux by Brian Raiter had 171 Bytes! +++[>+++++[>+++++>+++++>++++>++ ++>+<<<<<-]>++++>+++>+++>+<<<<< -]>>.>.>+.>.>-----.
  • 5. Learn such a stupid language! Why? • Understand how basic a Turing-complete programming language can be. – A common argument when programmers compare languages is “well they’re all Turing-complete”, meaning that anything you can do in one language you can do in another. • Once you’ve learnt brainfuck, you’ll understand just how difficult it can be to use a Turing-complete language, and how that argument holds no water. Source: http://skilldrick.co.uk/2011/02/why-you-should-learn-brainfuck-or-learn-you-a-brainfuck-for-great-good/
  • 8. Brainfuck Instructions (mapped to C language)
      Instruction   C equivalent        Meaning
      +             ++*ptr;             Increment the byte value at the data pointer.
      <             --ptr;              Decrement the data pointer to point to the previous cell.
      >             ++ptr;              Increment the data pointer to point to the next cell.
      -             --*ptr;             Decrement the byte value at the data pointer.
      .             putchar(*ptr);      Output the byte value at the data pointer.
      ,             *ptr = getchar();   Input one byte and store its value at the data pointer.
      [             while (*ptr) {      If the byte value at the data pointer is zero, jump to the instruction following the matching ] bracket. Otherwise, continue execution.
      ]             }                   Jump back to the matching [ bracket, where the loop condition is tested again.
  • 9. Writing a Brainfuck compiler is Easy!
      #!/usr/bin/awk -f
      BEGIN {
        print "#include <stdio.h>";
        print "int main() {";
        print " int c = 0;";
        print " static int b[30000];\n";
      }
      {
        gsub(/\]/, " }\n");
        gsub(/\[/, " while(b[c] != 0) {\n");
        gsub(/\+/, " ++b[c];\n");
        gsub(/-/, " --b[c];\n");
        gsub(/>/, " ++c;\n");
        gsub(/</, " --c;\n");
        gsub(/\./, " putchar(b[c]);\n");
        gsub(/,/, " b[c] = getchar();\n");
        print $0
      }
      END {
        print "\n return 0;";
        print "}";
      }
  • 10. Brainfuck interpreter in portable C (1/3)
      #include <stdio.h>
      #include <stdlib.h>
      int p, r, q;
      char a[5000], f[5000], b, o, *s = f;
      void interpret(char *c)
      {
          char *d;
          r++;
          while (*c) {
              switch (o = 1, *c++) {
              case '<': p--; break;
              case '>': p++; break;
              case '+': a[p]++; break;
              case '-': a[p]--; break;
  • 11. Brainfuck interpreter in portable C (2/3)
              case '.': putchar(a[p]); fflush(stdout); break;
              case ',': a[p] = getchar(); fflush(stdout); break;
              case '[':
                  /* scan forward to the matching ']' */
                  for (b = 1, d = c; b && *c; c++)
                      b += *c == '[', b -= *c == ']';
                  if (!b) {
                      c[-1] = 0;          /* temporarily terminate the loop body */
                      while (a[p])
                          interpret(d);
                      c[-1] = ']';        /* restore the bracket */
                      break;
                  }
                  /* no matching ']': fall through to the error case */
              case ']':
                  puts("Unbalanced brackets"), exit(0);
  • 12. Brainfuck interpreter in portable C (3/3)
              default: o = 0;
              }
              if (p < 0 || p > 100)
                  puts("Range error"), exit(0);
          }
          r--;
      }
      int main(int argc, char *argv[])
      {
          FILE *z;
          q = argc;
          if ((z = fopen(argv[1], "r"))) {
              while ((b = getc(z)) > 0)
                  *s++ = b;
              *s = 0;
              interpret(f);
          }
          return 0;
      }
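The interpreter on slides 10 to 12 rescans the program text for the matching bracket every time it reaches a `[`. One common alternative, which is not taken from the slides, is to match all brackets once before execution and store the result in a table. The sketch below assumes a hypothetical helper `bf_match_brackets` and a hypothetical nesting limit `BF_MAX_DEPTH`; the caller is assumed to supply a `jump` array with at least as many entries as the program has characters.

```c
#include <stddef.h>

#define BF_MAX_DEPTH 256   /* hypothetical nesting limit for this sketch */

/* For every '[' and ']' in src, store the index of its partner in jump[].
 * Entries for other characters are left untouched.
 * Returns 0 on success, -1 if brackets are unbalanced or nested too deeply. */
int bf_match_brackets(const char *src, size_t *jump)
{
    size_t stack[BF_MAX_DEPTH];   /* indices of currently open '[' */
    size_t depth = 0;

    for (size_t i = 0; src[i] != '\0'; i++) {
        if (src[i] == '[') {
            if (depth == BF_MAX_DEPTH)
                return -1;
            stack[depth++] = i;
        } else if (src[i] == ']') {
            if (depth == 0)
                return -1;        /* ']' without a preceding '[' */
            size_t open = stack[--depth];
            jump[open] = i;
            jump[i] = open;
        }
    }
    return depth == 0 ? 0 : -1;   /* leftover '[' means unbalanced */
}
```

With such a table, an interpreter loop can handle `[` as "if the cell is zero, continue at jump[pc]" and `]` as "if the cell is non-zero, continue at jump[pc]", so no scanning happens while the program runs.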
  • 13. Self-Interpreter can be short! Written by Oleg Mazonka & Daniel B. Cristofani, 21 November 2003 >>>+[[-]>>[-]++>+>+++++++[<++++>>++<-]++>>+>+>+++++[> ++>++++++<<-]+>>>,<++[[>[->>]<[>>]<<-]<[<]<+>>[>]>[<+ >-[[<+>-]>]<[[[-]<]++<-[<+++++++++>[<->-]>>]>>]]<<]<] < [[<]>[[>]>>[>>]+[<<]<[<]<+>>-]>[>]+[->>]<<<<[[<<]<[<] +<<[+>+<<-[>-->+<<-[>+<[>>+<<-]]]>[<+>-]<]++>>-->[>]> >[>>]]<<[>>+<[[<]<]>[[<<]<[<]+[-<+>>-[<<+>++>-[<->[<< +>>-]]]<[>+<-]>]>[>]>]>[>>]>>]<<[>>+>>+>>]<<[->>>>>>> >]<<[>.>>>>>>>]<<[>->>>>>]<<[>,>>>]<<[>+>]<<[+<<]<]
  • 14. Turing Complete (again) • In fact, Brainfuck has 6 opcodes plus the input and output commands • I/O is a gray area (implementation dependent) – EOF – tape length – cell type – newlines • That is enough to program! • Extension: self-modifying Brainfuck https://soulsphere.org/hacks/smbf/
  • 15. Statement: while • Implementing a while statement is easy, because the Brainfuck [ .. ] statement is a while loop. • Thus, while (x) { <foobar> } becomes (move pointer to x) [ (foobar) (move pointer to x) ]
  • 16. Statement: x=y • Implementing assignment (copy) instructions is complex. • The straightforward way of doing that resets y to zero: (move pointer to y) [ - (move pointer to x) + (move pointer to y) ] • A temporary variable t is needed: (move pointer to y) [ - (move pointer to t) + (move pointer to y) ] (move pointer to t) [ - (move pointer to x) + (move pointer to y) + (move pointer to t) ]
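The copy recipe on slide 16 can be turned into a small code generator. The sketch below is only an illustration: `bf_move` and `bf_emit_copy` are hypothetical helper names, cells are addressed by their index on the tape, the data pointer is assumed to start at cell 0, and x and t are assumed to hold zero before the generated code runs. The caller is assumed to provide a large enough output buffer.

```c
#include <stddef.h>

/* Append the '>' or '<' characters that move the data pointer
 * from cell `from` to cell `to`. Returns the new end of the output. */
static char *bf_move(char *out, int from, int to)
{
    char ch = to > from ? '>' : '<';
    int n = to > from ? to - from : from - to;
    while (n-- > 0)
        *out++ = ch;
    return out;
}

/* Write Brainfuck code into `out` that copies cell y into cell x,
 * using cell t as the temporary. The generated code leaves the data
 * pointer at t and y unchanged. */
void bf_emit_copy(char *out, int x, int y, int t)
{
    out = bf_move(out, 0, y);
    *out++ = '[';                      /* first loop: move y into t */
    *out++ = '-';
    out = bf_move(out, y, t);
    *out++ = '+';
    out = bf_move(out, t, y);
    *out++ = ']';
    out = bf_move(out, y, t);
    *out++ = '[';                      /* second loop: move t into x and y */
    *out++ = '-';
    out = bf_move(out, t, x);
    *out++ = '+';
    out = bf_move(out, x, y);
    *out++ = '+';
    out = bf_move(out, y, t);
    *out++ = ']';
    *out = '\0';
}
```

For x = 0, y = 1, t = 2 this produces `>[->+<]>[-<<+>+>]`, which follows the two loops of the slide step by step.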
  • 17. Statement: if • The if statement is like a while-loop, but it should run its block only once. Again, a temporary variable is needed to implement if (x) { <foobar> }: (move pointer to x) [ - (move pointer to t) + (move pointer to x) ] (move pointer to t) [ [ - (move pointer to x) + (move pointer to t) ] (foobar) (move pointer to t) ]
  • 19. Example: cat + [ , . ] cell[0] ← 1 while(cell[0]) { Read-in a character print it }
  • 20. Example: if-endif $f + $A + [ $B + /* $B = $B + 1 */ $f [-] /* end if */ ] $A = 1; if($A) { $B = $B + 1; }
  • 21. Example: if-else-endif $f + $A + [ $B + /* $B = $B + 1 */ $f [-] /* end if */ ] $f [ $B - /* $B = $B - 1 */ $f [-] /* end if */ ] $A = 1; if($A) { ++$B; } else { --$B; }
  • 22. Example: Multiply (6x7) +++ +++ [ > +++ +++ + < - ] cell[0] ← 6 while(cell[0]) { cell[1] += 7; --cell[0]; }
  • 23. Example: Division (12/4)
      ++++++++++++   // Set first cell to 12
      [              // Loop until first cell is zero (3 iterations)
      >+             // Increment second cell
      <----          // Subtract 4 from first cell
      ]
      >.             // Display second cell's value
  • 24. Example: Hello World!
      ++++++++[>+++++++++<-]>.    // 8 x 9 = 72 (H)
      <+++++[>++++++<-]>-.        // 72 + (6 x 5) - 1 = 101 (e)
      +++++++..                   // 101 + 7 = 108 (l)
      +++.                        // 108 + 3 = 111 (o)
      <++++++++[>>++++<<-]>>.     // 8 x 4 = 32 (SPACE)
      <<++++[>------<-]>.         // 111 - 24 = 87 (W)
      <++++[>++++++<-]>.          // 87 + 24 = 111 (o)
      +++.                        // 111 + 3 = 114 (r)
      ------.                     // 114 - 6 = 108 (l)
      --------.                   // 108 - 8 = 100 (d)
      >+.                         // 32 + 1 = 33 (!)
  • 25. Example: Hello World! [Diagram: tape cells a[0] and a[1] with data pointer p. a[0] takes the values 72 (H), 101 (e), 108 (l, l), 111 (o), 32 (space), 87 (W), 111 (o), 114 (r), 108 (l), 100 (d), 33 (!), 10 (newline) in turn, while a[1] serves as the loop counter and returns to 0.] >+++++++++[<++++++++>-]<.>+++++++[<++++>-]<+. +++++++..+++.[-]>++++++++[<++++>-]<. >+++++++++++[<+++++>-]<.>++++++++[<+++>-]<. +++.------.--------.[-]>++++++++[<++++>-]<+. [-]++++++++++.
  • 27. Example: Bubble Sort >>>>>,.[>>>,.] <<< [<<< [>>> [-<<<-<+>[>]>>] <<<[<]>> [>>>+<<<-]< [>+>>>+<<<<-] <<] >>>[.[-]] >>>[>>>]<<<] Idea: if(b>a): swap(a, b) Operation: decrement $a and $b. Then, store the smaller one into $t
  • 32. Brainfuck Toolchain :: interpreter, translator, virtual machine, nested runtime ::
  • 33. Nested Interpreting • Translation: – BF extensions → Brainfuck – Other languages → Brainfuck • Interpreter written in Brainfuck runs on BF VM
  • 34. DEF def Copy(s, d, t) { s [ d+ t+ s- ] t [ s+ t- ] } END MAIN $100=10 $200=0 $300=0 Copy($100, $200, $300)
      #define Copy(argc, s, d, t) s[d+t+s-] t[s+t-]
      $100 =10 $200 =0 $300 =0 Copy(3, $100, $200, $300)
      $100 =10 $200 =0 $300 =0 $100 [ $200+ $300+ $100- ] $300 [ $100+ $300- ]
  • 35. Brainfuck translator (use BF as the backend) • Translate C-like language to Brainfuck – bfc [C] → bfa [Assembly] → bf [Machine/CPU] http://www.clifford.at/bfcpu/bfcomp.html • Another C to Brainfuck http://esolangs.org/wiki/C2BF • BASIC to Brainfuck http://esolangs.org/wiki/BFBASIC
  • 37. Using Artificial Intelligence to Write Self-Modifying/Improving Programs • The AI program works as follows: – A genome consists of an array of doubles. – Each gene corresponds to an instruction in the brainf-ck programming language. – Start with a population of random genomes. – Decode each genome into a resulting program by converting each double into its corresponding instruction and execute the program. – Get each program's fitness score, based upon the output it writes to the console (if any), and rank them. – Mate the best genomes together using roulette selection, crossover, and mutation to produce a new generation. – Repeat the process with the new generation until the target fitness score is achieved. Source: http://www.primaryobjects.com/CMS/Article149
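The decoding step of slide 37 (each double becomes one instruction) could look roughly like the sketch below. The slide does not say how doubles map to instructions, so the mapping here is an assumption made for illustration: genes are taken to lie in [0, 1], that range is split into eight equal slices, and the slices are assigned to the instructions in the arbitrary order `><+-.,[]`. The names `bf_decode_gene` and `bf_decode_genome` are hypothetical.

```c
#include <stddef.h>

/* Map one gene to one Brainfuck instruction.
 * Assumption: genes lie in [0, 1]; out-of-range values are clamped. */
char bf_decode_gene(double gene)
{
    static const char ops[] = "><+-.,[]";
    if (gene < 0.0)
        return ops[0];
    if (gene >= 1.0)
        return ops[7];
    return ops[(int)(gene * 8.0)];    /* slice index 0..7 */
}

/* Decode n genes into a NUL-terminated program.
 * out must have room for n + 1 characters. */
void bf_decode_genome(const double *genome, size_t n, char *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = bf_decode_gene(genome[i]);
    out[n] = '\0';
}
```

The decoded string can then be handed to any of the interpreters discussed in the slides to obtain the output on which the fitness score is based.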
  • 40. Interpreter vs. Static Compiler
      Interpreters and JIT compilers:
      Implementation (user-space)               Execution time (seconds)
      simple bf                                 91.50
      slight optimizations                      8.03
      bff                                       5.04
      vff                                       3.10
      vm + optimizations                        3.02
      BF-JIT                                    17.78
      BF-JIT + optimizations                    1.37
      simple xbyak JIT                          3.25
      xbyak JIT + optimizations                 0.93
      custom JIT + aggressive optimizations     0.77
      Simple lightning JIT                      1.27
      Static compilers:
      Implementation (user-space)               Execution time (seconds)
      simple BF to C                            1.27
      awib to C                                 1.05
      esotope-bfc                               0.72
      bftran to C                               0.66
      bftran to ELF32c                          3.58
      • The executable generated by a static compiler (2 passes: BF → C → x86_64) is likely slower than optimized interpreters.
      • The fastest interpreter record appears on a Lenovo X230 [Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz].
      • Plain JIT compilation without effective optimizations is slower than portable interpreters!
      • 11x speedup!
  • 42. Walk through typical Design Patterns • Classify the executions of Brainfuck programs • Eliminate the redundant – CSE: common sub-expression elimination • Quick instructions • Hotspot – Replace with faster implementation • Enable Just-In-Time compilation
  • 43. Pattern: +++ (example: ++++++++++ becomes *ptr += 10) • Pattern: >>> (example: >>>>>>>>>> becomes ptr += 10)
  • 44. Pattern: +++ (example: ++++++++++ becomes *ptr += 10) • Before: Fetch Instruction, Eval Increment, Fetch Instruction, Eval Increment, … (10 instructions) • After: Fetch Instruction, Eval Calc +10 (1 instruction)
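Slides 43 and 44 describe contraction: a run of identical `+`, `-`, `<` or `>` instructions is folded into one instruction with a count. A minimal sketch of that folding step follows. The `bf_insn` type and the `bf_contract` function are hypothetical names chosen for this example, not part of any implementation shown in the slides.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical IR node: one opcode plus a repeat count. */
typedef struct {
    char op;      /* one of + - < > . , [ ] */
    int  count;   /* how many times the opcode repeats */
} bf_insn;

/* Fold runs of + - < > into single IR nodes.
 * Characters that are not Brainfuck instructions are skipped as comments.
 * Returns the number of nodes written; stops early when `max` is reached. */
size_t bf_contract(const char *src, bf_insn *out, size_t max)
{
    size_t n = 0;

    for (const char *c = src; *c != '\0'; c++) {
        if (strchr("+-<>.,[]", *c) == NULL)
            continue;                              /* comment character */
        int mergeable = strchr("+-<>", *c) != NULL;
        if (mergeable && n > 0 && out[n - 1].op == *c) {
            out[n - 1].count++;                    /* extend the current run */
        } else {
            if (n == max)
                break;
            out[n].op = *c;
            out[n].count = 1;
            n++;
        }
    }
    return n;
}
```

An interpreter that dispatches on these nodes executes `++++++++++` with one fetch and one addition of 10 instead of ten fetches and ten increments.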
  • 45. Pattern: [-] (becomes *ptr = 0) • Pattern: [>+<-] (becomes *(ptr+1) += *ptr; *ptr = 0)
  • 46. Pattern: [-] • Interpreted as: if(!*ptr) goto ]; --*ptr; goto [; • Before: Fetch Instruction, Jump If Zero, Fetch Instruction, Eval Decrement, Fetch Instruction, Jump (contains branches) • After: Fetch Instruction, Eval Reset Zero (1 instruction, no branch)
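The clear-loop pattern from slides 45 and 46 is easy to recognize directly in the program text. The following sketch shows one possible check; `bf_try_clear` is a hypothetical helper, and it assumes the loop is written exactly as `[-]` with no comment characters in between.

```c
#include <stddef.h>
#include <string.h>

/* If the program text at pc starts with the clear loop "[-]", set the
 * current cell to zero and return the number of characters consumed (3).
 * Otherwise return 0 and leave the cell untouched. */
size_t bf_try_clear(const char *pc, unsigned char *cell)
{
    if (strncmp(pc, "[-]", 3) != 0)
        return 0;
    *cell = 0;        /* one store replaces up to 255 loop iterations */
    return 3;
}
```

An interpreter could call this before its normal `[` handling and advance the program counter by the returned amount when it is non-zero.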
  • 47. Optimization Techniques • To evaluate the impact different optimization techniques can have on performance, we need a set of Brainfuck programs that are sufficiently non-trivial for optimization to make sense. – awib-0.4 (Brainfuck compiler) – factor.b – mandelbrot.b – hanoi.b – dbfi.b (self-interpreter) – long.b • source: https://github.com/matslina/bfoptimization
  • 49. Benchmark Results (w/ and w/o optimizations)
  • 50. Benchmark Results (Contraction)
      IR            C
      add(x)        mem[p] += x;
      sub(x)        mem[p] -= x;
      right(x)      p += x;
      left(x)       p -= x;
      output        putchar(mem[p]);
      input         mem[p] = getchar();
      open_loop     while(mem[p]) {
      close_loop    }
      Example: +++++[->>>++<<<]>>>.
      Without contraction: mem[p]++; mem[p]++; mem[p]++; mem[p]++; mem[p]++; while (mem[p]) { mem[p]--; p++; p++; p++; mem[p]++; mem[p]++; p--; p--; p--; } p++; p++; p++; putchar(mem[p]);
      With contraction: mem[p] += 5; while (mem[p]) { mem[p] -= 1; p += 3; mem[p] += 2; p -= 3; } p += 3; putchar(mem[p]);
  • 51. Benchmark Results (Clear loops)
      IR            C
      add           mem[p]++;
      sub           mem[p]--;
      right         p++;
      left          p--;
      output        putchar(mem[p]);
      input         mem[p] = getchar();
      open_loop     while(mem[p]) {
      close_loop    }
      clear         mem[p] = 0;
      Example: [-] is evaluated as a single "reset to zero" operation.
  • 52. Benchmark Results (Copy loops)
      Example: [->+>+<<] becomes mem[p+1] += mem[p]; mem[p+2] += mem[p]; mem[p] = 0;
      IR            C
      add(x)        mem[p] += x;
      sub(x)        mem[p] -= x;
      right(x)      p += x;
      left(x)       p -= x;
      output        putchar(mem[p]);
      input         mem[p] = getchar();
      open_loop     while(mem[p]) {
      close_loop    }
      clear         mem[p] = 0;
      copy(x)       mem[p+x] += mem[p];
  • 53. Benchmark Results (Multiplication loops)
      IR            C
      add(x)        mem[p] += x;
      sub(x)        mem[p] -= x;
      right(x)      p += x;
      left(x)       p -= x;
      output        putchar(mem[p]);
      input         mem[p] = getchar();
      open_loop     while(mem[p]) {
      close_loop    }
      clear         mem[p] = 0;
      mul(x,y)      mem[p+x] += mem[p] * y;
      Example: [->+++>+++++++<<] becomes mem[p+1] += mem[p] * 3; mem[p+2] += mem[p] * 7; mem[p] = 0;
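Copy loops and multiplication loops (slides 52 and 53) share a shape that can be checked mechanically: the body contains only `+ - < >`, the pointer ends where it started, and the loop cell is decremented by exactly one per iteration. The sketch below illustrates that check and the resulting update under some assumptions: `bf_analyze_mul`, `bf_apply_mul` and the reach limit `BF_SPAN` are hypothetical names, comment characters inside the body are treated as a mismatch, and the caller is assumed to guarantee that all touched offsets lie inside the tape.

```c
#include <stddef.h>
#include <string.h>

#define BF_SPAN 8   /* hypothetical limit on how far a loop body may reach */

/* Analyze a loop body (the text between '[' and ']', len characters).
 * delta must have 2 * BF_SPAN + 1 entries; delta[off + BF_SPAN] receives the
 * net change applied to the cell at offset `off` per iteration.
 * Returns 1 if the body is a copy or multiplication loop, 0 otherwise. */
int bf_analyze_mul(const char *body, size_t len, int *delta)
{
    int off = 0;

    memset(delta, 0, sizeof(int) * (2 * BF_SPAN + 1));
    for (size_t i = 0; i < len; i++) {
        switch (body[i]) {
        case '+': delta[off + BF_SPAN]++; break;
        case '-': delta[off + BF_SPAN]--; break;
        case '>': if (++off > BF_SPAN) return 0; break;
        case '<': if (--off < -BF_SPAN) return 0; break;
        default:  return 0;   /* I/O, nested loop, or comment character */
        }
    }
    /* pointer must return to the loop cell, which must count down by one */
    return off == 0 && delta[BF_SPAN] == -1;
}

/* Apply the analyzed loop in one step:
 * mem[p + off] += mem[p] * delta[off] for every off != 0, then mem[p] = 0. */
void bf_apply_mul(unsigned char *mem, size_t p, const int *delta)
{
    unsigned char *cell = mem + p;

    for (int off = -BF_SPAN; off <= BF_SPAN; off++) {
        int d = delta[off + BF_SPAN];
        if (off != 0 && d != 0)
            cell[off] = (unsigned char)(cell[off] + cell[0] * d);
    }
    cell[0] = 0;
}
```

A copy loop is simply the special case in which every non-zero factor is 1, so the same analysis covers both slides.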
  • 54. Benchmark Results (Operation offsets)
      IR            C
      add(x,off)    mem[p+off] += x;
      sub(x,off)    mem[p+off] -= x;
      right(x)      p += x;
      left(x)       p -= x;
      output        putchar(mem[p+off]);
      input         mem[p+off] = getchar();
      open_loop     while(mem[p]) {
      close_loop    }
      clear         mem[p+off] = 0;
      mul(x,y)      mem[p+x+off] += mem[p+off] * y;
      Both the copy loop and multiplication loop optimizations share an interesting trait: they perform an arithmetic operation at an offset from the current cell. In brainfuck we often find long sequences of non-loop operations and these sequences typically contain a fair number of < and >. Why waste time moving the pointer around?
  • 55. Benchmark Results (Scan loops)
      IR            C
      add(x)        mem[p] += x;
      sub(x)        mem[p] -= x;
      right(x)      p += x;
      left(x)       p -= x;
      output        putchar(mem[p]);
      input         mem[p] = getchar();
      open_loop     while(mem[p]) {
      close_loop    }
      clear         mem[p] = 0;
      mul(x,y)      mem[p+x] += mem[p] * y;
      ScanLeft      p -= (long)((void *)(mem + p) - memrchr(mem, 0, p + 1));
      ScanRight     p += (long)(memchr(mem + p, 0, sizeof(mem)) - (void *)(mem + p));
      Example: +<[>-]>[>]<
      The problem of efficiently searching a memory area for occurrences of a particular byte is mostly solved by the C standard library’s memchr() function, which operates by loading full memory words (typically 32 or 64 bits) into a CPU register and checking the individual 8-bit components in parallel. This proves to be much more efficient than loading and inspecting bytes one at a time.
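The scan loops `[>]` and `[<]` from slide 55 can be written as small helpers. This is a sketch under stated assumptions: `bf_scan_right` and `bf_scan_left` are hypothetical names, cells are single bytes, and indices are used instead of the pointer arithmetic on the slide. Because `memrchr` is a GNU extension rather than standard C, the left scan here falls back to a plain byte loop, which gives up the word-at-a-time speedup in that direction.

```c
#include <stddef.h>
#include <string.h>

/* Scan right, i.e. the loop [>]: return the index of the nearest zero cell
 * at or after p, or `size` if the rest of the tape holds no zero cell. */
size_t bf_scan_right(const unsigned char *mem, size_t size, size_t p)
{
    const unsigned char *hit = memchr(mem + p, 0, size - p);
    return hit != NULL ? (size_t)(hit - mem) : size;
}

/* Scan left, i.e. the loop [<]: return the index of the nearest zero cell
 * at or before p. Portable fallback; stops at index 0 if none is found. */
size_t bf_scan_left(const unsigned char *mem, size_t p)
{
    while (p > 0 && mem[p] != 0)
        p--;
    return p;
}
```

On platforms that provide `memrchr`, the left scan could use it in the same way the right scan uses `memchr`.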
  • 56. Benchmark Results (apply all techniques) 2.4x speedup 130x speedup
  • 57. Reference 1. Principles of Compiler Design: The Brainf*ck Compiler http://www.clifford.at/papers/2004/compiler/ 2. brainfuck optimization strategies http://calmerthanyouare.org/2015/01/07/optimizing-brainfuck.html 3. Brainf*ck Compiler Project http://www.clifford.at/bfcpu/bfcomp.html 4. Brainfuck code generation http://esolangs.org/wiki/Brainfuck_code_generation