Tag: virtual machine, compiler, performance
PyPy’s Approach to Construct Domain-specific
Language Runtime
Speed
7.4 times faster than CPython
http://speed.pypy.org
antocuni (PyCon Otto) PyPy Status Update April 07 2017 4 / 19
Why is Python slow?
Interpretation overhead
Boxed arithmetic and automatic overflow handling
Dynamic dispatch of operations
Dynamic lookup of methods and attributes
Everything can change at runtime
Extreme introspective and reflective capabilities
Francisco Fernandez Castano (@fcofdezc) PyPy November 8, 2014 8 / 51
Why is Python slow?
Boxed arithmetic and automatic overflow handling
i = 0
while i < 10000000:
    i = i + 1
Why is Python slow?
Dynamic dispatch of operations
# while i < 10000000
9 LOAD_FAST 0 (i)
12 LOAD_CONST 2 (10000000)
15 COMPARE_OP 0 (<)
18 POP_JUMP_IF_FALSE 34
# i = i + 1
21 LOAD_FAST 0 (i)
24 LOAD_CONST 3 (1)
27 BINARY_ADD
28 STORE_FAST 0 (i)
31 JUMP_ABSOLUTE 9
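A listing like the one above can be reproduced with the standard `dis` module. This is a minimal sketch: the function name `count_up` is made up for illustration, and opcode names and offsets differ between CPython versions (the slide shows Python 2 bytecode).

```python
import dis

def count_up(limit):
    # Same loop as on the slide, wrapped in a function so `i` is a fast local.
    i = 0
    while i < limit:
        i = i + 1
    return i

# The opcode names the interpreter has to dispatch on, one by one.
opnames = [ins.opname for ins in dis.get_instructions(count_up)]

if __name__ == "__main__":
    dis.dis(count_up)
```

Every entry in `opnames` is one trip through the interpreter's dispatch loop, which is the overhead the slide refers to.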
Why is Python slow?
Dynamic lookup of methods and attributes
class MyExample(object):
    pass

def foo(target, flag):
    if flag:
        target.x = 42

obj = MyExample()
foo(obj, True)
print obj.x               #=> 42
print getattr(obj, "x")   #=> 42
Why is Python slow?
Everything can change at runtime
def fn():
    return 42

def hello():
    return 'Hi! PyConEs!'

def change_the_world():
    global fn
    fn = hello

print fn()   #=> 42
change_the_world()
print fn()   #=> 'Hi! PyConEs!'
Why is Python slow?
Everything can change at runtime
class Dog(object):
    def __init__(self):
        self.name = 'Jandemor'
    def talk(self):
        print "%s: guau!" % self.name

class Cat(object):
    def __init__(self):
        self.name = 'CatInstance'
    def talk(self):
        print "%s: miau!" % self.name
Why is Python slow?
Everything can change at runtime
my_pet = Dog()
my_pet.talk()   #=> 'Jandemor: guau!'
my_pet.__class__ = Cat
my_pet.talk()   #=> 'Jandemor: miau!'
Why is Python slow?
Extreme introspective and reflective capabilities
import sys

def fill_list(name):
    frame = sys._getframe().f_back
    lst = frame.f_locals[name]
    lst.append(42)

def foo():
    things = []
    fill_list('things')
    print things   #=> [42]

foo()
PyPy Translation Toolchain
• Capable of compiling (R)Python
• Garbage collection
• Tracing just-in-time compiler generator
• Software transactional memory?
PyPy Architecture
PyPy based interpreters
• Topaz (Ruby)
• HippyVM (PHP)
• Pyrolog (Prolog)
• pycket (Racket)
• Various other interpreters (Scheme, JavaScript, Io, Game Boy)
Compiler / Interpreter
Source: Compiler Construction, Prof. O. Nierstrasz
Traditional 2 pass compiler
• intermediate representation (IR)
• front end maps legal code into IR
• back end maps IR onto target machine
• simplify retargeting
• allows multiple front ends
• multiple passes → better code
Traditional 3 pass compiler
• analyzes and changes IR
• goal is to reduce runtime
• must preserve values
Optimizer: middle end
Modern optimizers are usually built as a set of passes
• constant propagation and folding
• code motion
• reduction of operator strength
• common sub-expression elimination
• redundant store elimination
• dead code elimination
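As an illustration of what a single optimizer pass looks like, here is a minimal sketch of constant folding over a tiny expression IR. The tuple-based IR and the names `OPS` and `fold_constants` are assumptions made for this example, not taken from any particular compiler.

```python
import operator

# Hypothetical IR: an expression is an int constant, a variable name (str),
# or a tuple (op, left, right).
OPS = {"add": operator.add, "mul": operator.mul, "sub": operator.sub}

def fold_constants(expr):
    """Recursively replace operations on two constants with their result."""
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = fold_constants(left), fold_constants(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)
    return (op, left, right)

folded = fold_constants(("add", ("mul", 3, 4), ("sub", "x", ("add", 1, 1))))
print(folded)  # ('add', 12, ('sub', 'x', 2))
```

The pass maps IR to IR and preserves values, which is what allows several such passes to be chained.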
Optimization Challenges
• Preserve language semantics
• Reflection, Introspection, Eval
• External APIs
• Interpreter consists of short sequences of code
• Prevent global optimizations
• Typically implemented as a stack machine
• Dynamic, imprecise type information
• Variables can change type
• Duck Typing: method works with any object that provides accessed interfaces
• Monkey Patching: add members to "class" after initialization
• Memory management and concurrency
• Function calls through packing of operands in fat object
PyPy Functional Architecture
RPython
• Python subset
• Statically typed
• Garbage collected
• Standard library almost entirely unavailable
• Some missing builtins (print, open(), …)
• rpython.rlib
• exceptions are (sometimes) ignored
• Not really a language, rather a "state"
PyPy Interpreter
def f(x):
    return x + 1

>>> dis.dis(f)
  2    0 LOAD_FAST     0 (x)
       3 LOAD_CONST    1 (1)
       6 BINARY_ADD
       7 RETURN_VALUE
• written in RPython
• Stack-based bytecode interpreter (like JVM)
• bytecode compiler → generates bytecode
• bytecode evaluator → interprets bytecode
• object space → handles operations on objects
PyPy Bytecode Interpreter
CFG (Control Flow Graph)
• Consists of Blocks and Links
• Starting from entry_point
• "Static Single Information" (SSI) form
def f(n):
    return 3 * n + 2

Block(v1): # input argument
    v2 = mul(Constant(3), v1)
    v3 = add(v2, Constant(2))
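The block above can be modelled with a few small classes. This is a sketch only: `Constant`, `Variable`, `Operation`, `Block` and `run_block` are simplified stand-ins written for this example and are not the RPython flow object model API.

```python
class Constant(object):
    def __init__(self, value):
        self.value = value

class Variable(object):
    def __init__(self, name):
        self.name = name

class Operation(object):
    def __init__(self, opname, args, result):
        self.opname, self.args, self.result = opname, args, result

class Block(object):
    def __init__(self, inputargs):
        self.inputargs = inputargs
        self.operations = []

def run_block(block, values):
    """Evaluate one branch-free block; every variable is assigned exactly once."""
    env = dict(zip(block.inputargs, values))
    funcs = {"mul": lambda a, b: a * b, "add": lambda a, b: a + b}

    def read(arg):
        return arg.value if isinstance(arg, Constant) else env[arg]

    for op in block.operations:
        env[op.result] = funcs[op.opname](*[read(a) for a in op.args])
    return env[block.operations[-1].result]

# Block(v1) for f(n) = 3 * n + 2
v1, v2, v3 = Variable("v1"), Variable("v2"), Variable("v3")
block = Block([v1])
block.operations.append(Operation("mul", [Constant(3), v1], v2))
block.operations.append(Operation("add", [v2, Constant(2)], v3))
print(run_block(block, [5]))  # 17
```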
CFG: Static Single Information
def test(a):
    if a > 0:
        if a > 5:
            return 10
        return 4
    if a < -10:
        return 3
    return 10
• SSI: "PHIs" for all used variables
• Blocks as "functions without branches"
PyPy Advantages
• High Level Language Implementation
• to implement new features: lazily computed objects and functions, pluggable garbage collection, runtime replacement of live objects, stackless concurrency
• JIT Generation
• Object space
• Stackless
• infinite recursion
• Microthreads: Coroutines, Tasklets and Channels, Greenlets
PERCEPTION
http://abstrusegoose.com/secretarchives/under-the-hood - CC BY-NC 3.0 US
Assumptions
Pareto Principle (80-20 rule)
  • 20% of the program accounts for 80% of the runtime
  • hot-spots
Fast Path principle
  • optimize only what is necessary
  • fall back for uncommon cases
Most of the runtime is spent in loops
Always the same code paths (likely)
antocuni (Intel@Bucharest) PyPy Intro April 4 2016 9 / 32
Tracing JIT phases
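• Interpretation → Tracing (hot loop detected)
• Tracing → Compilation → Running
• Running → Interpretation (cold guard failed)
• Interpretation → Running (entering compiled loop)
• Running → Tracing (hot guard failed, once repeated failures have made the guard hot)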
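The switch between phases can be sketched with a per-loop counter. This is a toy model under stated assumptions: `ToyJit`, `HOT_THRESHOLD` and the callbacks are hypothetical names, the "compiled" code is just another Python function, and guards are left out entirely.

```python
HOT_THRESHOLD = 3  # hypothetical value; a real JIT uses a much larger threshold

class ToyJit(object):
    def __init__(self):
        self.counters = {}   # loop id -> how often the loop header was reached
        self.compiled = {}   # loop id -> stand-in for compiled machine code
        self.log = []

    def run_iteration(self, loop_id, state, interpret, compile_loop):
        """Run one loop iteration, switching phases once the loop gets hot."""
        if loop_id in self.compiled:
            self.log.append("running")
            return self.compiled[loop_id](state)
        count = self.counters.get(loop_id, 0) + 1
        self.counters[loop_id] = count
        if count >= HOT_THRESHOLD:
            # Hot loop detected: a real JIT would record this iteration as a
            # trace, then optimize and compile it.
            self.log.append("tracing+compilation")
            self.compiled[loop_id] = compile_loop()
        else:
            self.log.append("interpretation")
        return interpret(state)

def interpret(i):
    return i + 1

def compile_loop():
    return lambda i: i + 1  # stands in for the generated machine code

jit = ToyJit()
i = 0
while i < 5:
    i = jit.run_iteration("loop@1", i, interpret, compile_loop)
print(jit.log)
# ['interpretation', 'interpretation', 'tracing+compilation', 'running', 'running']
```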
Trace trees (1)
tracetree.py
def foo():
    a = 0
    i = 0
    N = 100
    while i < N:
        if i % 2 == 0:
            a += 1
        else:
            a *= 2
        i += 1
    return a
Trace trees (2)
label(start, i0, a0)
v0 = int_lt(i0, 100)
guard_true(v0)
v1 = int_mod(i0, 2)
v2 = int_eq(v1, 0)
guard_true(v2)    <-- COLD FAIL: exit through the BLACKHOLE back to the INTERPRETER
a1 = int_add(a0, 1)
i1 = int_add(i0, 1)
jump(start, i1, a1)
Trace trees (2)
label(start, i0, a0)
v0 = int_lt(i0, 100)
guard_true(v0)
v1 = int_mod(i0, 2)
v2 = int_eq(v1, 0)
guard_true(v2)    <-- HOT FAIL: a second trace is compiled for this exit
a1 = int_add(a0, 1)
i1 = int_add(i0, 1)
jump(start, i1, a1)
trace attached to the failing guard:
a1 = int_mul(a0, 2)
i1 = int_add(i0, 1)
jump(start, i1, a1)
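To make the role of guards concrete, the following sketch executes the even-iteration trace with the constants from tracetree.py and falls back to plain Python whenever a guard fails. The tuple encoding of the trace and the helper names (`run_trace`, `interpret_one_iteration`, `foo_with_trace`) are assumptions for illustration; no second trace is compiled here, so every odd iteration takes the "cold fail" route.

```python
import operator

OPS = {"int_lt": operator.lt, "int_mod": operator.mod,
       "int_eq": operator.eq, "int_add": operator.add}

# (operation, result variable, arguments); strings name variables, ints are constants
TRACE = [
    ("int_lt", "v0", ("i0", 100)),
    ("guard_true", None, ("v0",)),     # index 1: loop condition
    ("int_mod", "v1", ("i0", 2)),
    ("int_eq", "v2", ("v1", 0)),
    ("guard_true", None, ("v2",)),     # index 4: "i is even"
    ("int_add", "a1", ("a0", 1)),
    ("int_add", "i1", ("i0", 1)),
    ("jump", None, ("i1", "a1")),
]

def run_trace(trace, i, a):
    """Run the trace in a loop; return (index of the failed guard, i, a)."""
    while True:
        env = {"i0": i, "a0": a}
        for index, (op, result, args) in enumerate(trace):
            vals = [env[x] if isinstance(x, str) else x for x in args]
            if op == "guard_true":
                if not vals[0]:
                    return index, i, a   # leave the trace with the loop-entry state
            elif op == "jump":
                i, a = vals              # back to label(start, ...)
                break
            else:
                env[result] = OPS[op](*vals)

def interpret_one_iteration(i, a):
    # Plain-Python fallback, playing the role of the interpreter.
    if i % 2 == 0:
        a += 1
    else:
        a *= 2
    return i + 1, a

def foo_with_trace():
    i, a = 0, 0
    while True:
        failed, i, a = run_trace(TRACE, i, a)
        if failed == 1:                  # loop condition is false: done
            return a
        i, a = interpret_one_iteration(i, a)

def foo():
    # Reference implementation, as in tracetree.py.
    a, i, N = 0, 0, 100
    while i < N:
        if i % 2 == 0:
            a += 1
        else:
            a *= 2
        i += 1
    return a

print(foo_with_trace() == foo())  # True
```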
Part 3
The PyPy JIT
Terminology (1)
translation time: when you run "rpython targetpypy.py" to get the pypy binary
runtime: everything which happens after you start pypy
interpretation, tracing, compiling
assembler/machine code: the output of the JIT compiler
execution time: when your Python program is being executed
  • by the interpreter
  • by the machine code
Terminology (2)
interp-level: things written in RPython
[PyPy] interpreter: the RPython program which executes the final Python programs
bytecode: "the output of dis.dis". It is executed by the PyPy interpreter.
app-level: things written in Python, and executed by the PyPy interpreter
Terminology (3)
(the following is not 100% accurate but it's enough to understand the general principle)
low level op or ResOperation
  • low-level instructions like "add two integers", "read a field out of a struct", "call this function"
  • (more or less) the same level as C ("portable assembler")
  • knows about GC objects (e.g. you have getfield_gc vs getfield_raw)
jitcodes: low-level representation of RPython functions
  • sequence of low level ops
  • generated at translation time
  • 1 RPython function --> 1 C function --> 1 jitcode
Terminology (4)
JIT traces or loops
  • a very specific sequence of llops as actually executed by your Python program
  • generated at runtime (more specifically, during tracing)
JIT optimizer: takes JIT traces and emits JIT traces
JIT backend: takes JIT traces and emits machine code
General architecture
RPYTHON
def LOAD_GLOBAL(self):
    ...
def STORE_FAST(self):
    ...
def BINARY_ADD(self):
    ...
CODEWRITER (compile-time) → JITCODE
# LOAD_GLOBAL
...
p0 = getfield_gc(p0, 'func_globals')
p2 = getfield_gc(p1, 'strval')
call(dict_lookup, p0, p2)
...
# STORE_FAST
...
p0 = getfield_gc(p0, 'locals_w')
setarrayitem_gc(p0, i0, p1)
...
# BINARY_ADD
...
promote_class(p0)
i0 = getfield_gc(p0, 'intval')
promote_class(p1)
i1 = getfield_gc(p1, 'intval')
i2 = int_add(i0, i1)
if (overflowed) goto ...
p2 = new_with_vtable('W_IntObject')
setfield_gc(p2, i2, 'intval')
...
runtime: META-TRACER → OPTIMIZER → BACKEND → ASSEMBLER
PyPy trace example
def fn():
    c = a+b
    ...
LOAD_GLOBAL A
LOAD_GLOBAL B
BINARY_ADD
STORE_FAST C
# LOAD_GLOBAL A
...
p0 = getfield_gc(p0, 'func_globals')
p2 = getfield_gc(p1, 'strval')
call(dict_lookup, p0, p2)
...
# LOAD_GLOBAL B
...
p0 = getfield_gc(p0, 'func_globals')
p2 = getfield_gc(p1, 'strval')
call(dict_lookup, p0, p2)
...
# BINARY_ADD
...
guard_class(p0, W_IntObject)
i0 = getfield_gc(p0, 'intval')
guard_class(p1, W_IntObject)
i1 = getfield_gc(p1, 'intval')
i2 = int_add(i0, i1)
guard_no_overflow()
p2 = new_with_vtable('W_IntObject')
setfield_gc(p2, i2, 'intval')
...
# STORE_FAST C
...
p0 = getfield_gc(p0, 'locals_w')
setarrayitem_gc(p0, i0, p1)
...
PyPy optimizer
intbounds
constant folding / pure operations
virtuals
string optimizations
heap (multiple get/setfield, etc)
unroll
Intbound optimization (1)
intbound.py
def fn():
    i = 0
    while i < 5000:
        i += 2
    return i
Intbound optimization (2)
unoptimized
...
i17 = int_lt(i15, 5000)
guard_true(i17)
i19 = int_add_ovf(i15, 2)
guard_no_overflow()
...
optimized
...
i17 = int_lt(i15, 5000)
guard_true(i17)
i19 = int_add(i15, 2)
...
It works often
array bound checking
intbound info propagates all over the trace
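The rewrite shown above can be sketched as a small pass over a trace. This is a simplified model under stated assumptions: the tuple encoding, the function name `remove_redundant_overflow_checks`, and the use of `sys.maxsize` as the machine integer limit are choices made for this example; the real optimizer tracks lower and upper bounds for every integer variable.

```python
import sys

MAXINT = sys.maxsize  # stands in for the largest machine integer

def remove_redundant_overflow_checks(trace):
    """Turn int_add_ovf into int_add when a guarded int_lt bounds the operand."""
    upper = {}      # variable -> largest value it can hold after a guard
    compared = {}   # comparison result -> (variable, constant)
    out = []
    drop_next_guard = False
    for name, result, args in trace:
        if drop_next_guard:
            drop_next_guard = False
            if name == "guard_no_overflow":
                continue                     # the overflow check is provably dead
        if name == "int_lt" and isinstance(args[1], int):
            compared[result] = (args[0], args[1])
        elif name == "guard_true" and args[0] in compared:
            var, const = compared[args[0]]
            upper[var] = const - 1           # var < const holds from here on
        elif (name == "int_add_ovf" and args[0] in upper
              and isinstance(args[1], int)
              and 0 <= args[1] <= MAXINT - upper[args[0]]):
            out.append(("int_add", result, args))
            drop_next_guard = True
            continue
        out.append((name, result, args))
    return out

unoptimized = [
    ("int_lt", "i17", ("i15", 5000)),
    ("guard_true", None, ("i17",)),
    ("int_add_ovf", "i19", ("i15", 2)),
    ("guard_no_overflow", None, ()),
]
optimized = remove_redundant_overflow_checks(unoptimized)
for op in optimized:
    print(op)
```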
Virtuals (1)
virtuals.py
def fn():
    i = 0
    while i < 5000:
        i += 2
    return i
Virtuals (2)
unoptimized
...
guard_class(p0, W_IntObject)
i1 = getfield_pure(p0, 'intval')
i2 = int_add(i1, 2)
p3 = new(W_IntObject)
setfield_gc(p3, i2, 'intval')
...
optimized
...
i2 = int_add(i1, 2)
...
The most important optimization (TM)
It works both inside the trace and across the loop
It works for tons of cases
  • e.g. function frames
Constant folding (1)
constfold.py
def fn():
    i = 0
    while i < 5000:
        i += 2
    return i
Constant folding (2)
unoptimized
...
i1 = getfield_pure(p0, 'intval')
i2 = getfield_pure(<W_Int(2)>, 'intval')
i3 = int_add(i1, i2)
...
optimized
...
i1 = getfield_pure(p0, 'intval')
i3 = int_add(i1, 2)
...
It "finishes the job"
Works well together with other optimizations (e.g. virtuals)
It also does "normal, boring, static" constant-folding
Out of line guards (1)
outoflineguards.py
N = 2
def fn():
    i = 0
    while i < 5000:
        i += N
    return i
Out of line guards (2)
unoptimized
...
quasiimmut_field(<Cell>, 'val')
guard_not_invalidated()
p0 = getfield_gc(<Cell>, 'val')
...
i2 = getfield_pure(p0, 'intval')
i3 = int_add(i1, i2)
optimized
...
guard_not_invalidated()
...
i3 = int_add(i1, 2)
...
Python is too dynamic, but we don't care :-)
No overhead in assembler code
Used a bit "everywhere"
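The idea behind an out-of-line guard can be sketched as follows: the check is moved from the hot loop to the rare write. The class names `QuasiImmutableCell` and `CompiledLoop` and their methods are hypothetical and only model the principle; they are not PyPy's implementation.

```python
class CompiledLoop(object):
    def __init__(self, name):
        self.name = name
        self.valid = True   # becomes False once an assumption is broken

class QuasiImmutableCell(object):
    """A value that rarely changes; compiled code may treat it as a constant."""
    def __init__(self, value):
        self.value = value
        self.dependents = []   # loops compiled assuming "value never changes"

    def read_as_constant(self, loop):
        # Called while optimizing: remember who relies on the current value.
        self.dependents.append(loop)
        return self.value

    def write(self, value):
        # The cost is paid here, on the rare write, not inside the hot loop.
        self.value = value
        for loop in self.dependents:
            loop.valid = False
        self.dependents = []

N = QuasiImmutableCell(2)
loop = CompiledLoop("fn loop")
constant_n = N.read_as_constant(loop)   # the optimizer folds N to the constant 2
N.write(3)                              # someone rebinds the global
print(loop.valid)  # False
```

The compiled loop itself never re-reads or re-checks `N`, which is why the slide can claim no overhead in the assembler code.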
Hello RPython
# hello_rpython.py
import os

def entry_point(argv):
    os.write(2, "Hello, RPython!\n")
    return 0

def target(driver, argv):
    return entry_point, None

$ rpython hello_rpython.py
…
$ ./hello_rpython-c
Hello, RPython!
Goal
• BASIC interpreter capable of running Hamurabi
• Bytecode based
• Garbage Collection
• Just-In-Time Compilation
Live play session
Architecture
Source → Parser → AST → Compiler → Bytecode → Virtual Machine
10 PRINT TAB(32);"HAMURABI"	
20 PRINT TAB(15);"CREATIVE COMPUTING MORRISTOWN, NEW JERSEY"	
30 PRINT:PRINT:PRINT	
80 PRINT "TRY YOUR HAND AT GOVERNING ANCIENT SUMERIA"	
90 PRINT "FOR A TEN-YEAR TERM OF OFFICE.":PRINT	
95 D1=0: P1=0	
100 Z=0: P=95:S=2800: H=3000: E=H-S	
110 Y=3: A=H/Y: I=5: Q=1	
210 D=0	
215 PRINT:PRINT:PRINT "HAMURABI: I BEG TO REPORT TO YOU,": Z=Z+1	
217 PRINT "IN YEAR";Z;",";D;"PEOPLE STARVED,";I;"CAME TO THE CITY,"	
218 P=P+I	
227 IF Q>0 THEN 230	
228 P=INT(P/2)	
229 PRINT "A HORRIBLE PLAGUE STRUCK! HALF THE PEOPLE DIED."	
230 PRINT "POPULATION IS NOW";P	
232 PRINT "THE CITY NOW OWNS ";A;"ACRES."	
235 PRINT "YOU HARVESTED";Y;"BUSHELS PER ACRE."	
250 PRINT "THE RATS ATE";E;"BUSHELS."	
260 PRINT "YOU NOW HAVE ";S;"BUSHELS IN STORE.": PRINT	
270 REM *** MORE CODE THAT DID NOT FIT INTO THE SLIDE FOLLOWS
Parser
Source → Parser → Abstract Syntax Tree (AST)
Source → Lexer → Tokens → Parser → AST
RPLY
• Based on PLY, which is based on Lex and Yacc
• Lexer generator
• LALR parser generator
Lexer
from rply import LexerGenerator

lg = LexerGenerator()

lg.add("NUMBER", r"[0-9]+")
# …
lg.ignore(r" +")  # whitespace

lexer = lg.build().lex
# Rules are tried in order, so keywords come before NAME and longer
# operators before their prefixes.
lg.add('NUMBER', r'[0-9]*\.?[0-9]+')  # integers and decimals
lg.add('PRINT', r'PRINT')
lg.add('IF', r'IF')
lg.add('THEN', r'THEN')
lg.add('GOSUB', r'GOSUB')
lg.add('GOTO', r'GOTO')
lg.add('INPUT', r'INPUT')
lg.add('REM', r'REM')
lg.add('RETURN', r'RETURN')
lg.add('END', r'END')
lg.add('FOR', r'FOR')
lg.add('TO', r'TO')
lg.add('NEXT', r'NEXT')
lg.add('NAME', r'[A-Z][A-Z0-9$]*')
lg.add('(', r'\(')
lg.add(')', r'\)')
lg.add(';', r';')
lg.add('STRING', r'"[^"]*"')
lg.add(':', r'\r?\n')
lg.add(':', r':')
lg.add('=', r'=')
lg.add('<>', r'<>')
lg.add('-', r'-')
lg.add('/', r'/')
lg.add('+', r'\+')
lg.add('>=', r'>=')
lg.add('>', r'>')
lg.add('***', r'\*\*\*.*')
lg.add('*', r'\*')
lg.add('<=', r'<=')
lg.add('<', r'<')
>>> from basic.lexer import lex
>>> source = open("hello.bas").read()
>>> for token in lex(source):
...     print token
Token("NUMBER", "10")
Token("PRINT", "PRINT")
Token("STRING", '"HELLO BASIC!"')
Token(":", "\n")
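What the generated lexer does can be approximated with the standard `re` module alone. The sketch below is a stand-in written for this example, not RPLY's implementation: `TOKEN_SPEC`, `IGNORE` and this `lex` function are assumed names, and only a handful of the rules are included.

```python
import re

# (token type, pattern); the first rule that matches at the current position wins
TOKEN_SPEC = [
    ("NUMBER", r"[0-9]+"),
    ("PRINT", r"PRINT"),
    ("NAME", r"[A-Z][A-Z0-9$]*"),
    ("STRING", r'"[^"]*"'),
    (":", r"\r?\n|:"),
]
IGNORE = re.compile(r" +")

def lex(source):
    """Yield (token_type, text) pairs for a BASIC source string."""
    compiled = [(name, re.compile(pattern)) for name, pattern in TOKEN_SPEC]
    pos = 0
    while pos < len(source):
        skipped = IGNORE.match(source, pos)
        if skipped:
            pos = skipped.end()
            continue
        for name, regex in compiled:
            match = regex.match(source, pos)
            if match:
                yield name, match.group(0)
                pos = match.end()
                break
        else:
            raise ValueError("unexpected character %r at %d" % (source[pos], pos))

print(list(lex('10 PRINT "HELLO BASIC!"\n')))
# [('NUMBER', '10'), ('PRINT', 'PRINT'), ('STRING', '"HELLO BASIC!"'), (':', '\n')]
```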
Grammar
• A set of formal rules that defines the syntax
• terminals = tokens
• nonterminals = rules defining a sequence of one or more (non)terminals
program :	
program : line	
program : line program
line : NUMBER statements
statements : statement	
statements : statement statements
statement : PRINT :	
statement : PRINT expressions :	
expressions : expression	
expressions : expression ;	
expressions : expression ; expressions
statement : NAME = expression :
statement : IF expression THEN NUMBER :
statement : INPUT NAME :
statement : GOTO NUMBER :	
statement : GOSUB NUMBER :	
statement : RETURN :
statement : REM *** :
statement : FOR NAME = NUMBER TO NUMBER :	
statement : NEXT NAME :
statement : END :
expression : NUMBER	
expression : NAME	
expression : STRING	
expression : operation	
expression : ( expression )	
expression : NAME ( expression )
operation : expression + expression	
operation : expression - expression	
operation : expression * expression	
operation : expression / expression	
operation : expression <= expression	
operation : expression < expression	
operation : expression = expression	
operation : expression <> expression	
operation : expression > expression	
operation : expression >= expression
AST
from rply.token import BaseBox

class Program(BaseBox):
    def __init__(self, lines):
        self.lines = lines

class Line(BaseBox):
    def __init__(self, lineno, statements):
        self.lineno = lineno
        self.statements = statements

class Statements(BaseBox):
    def __init__(self, statements):
        self.statements = statements

class Print(BaseBox):
    def __init__(self, expressions, newline=True):
        self.expressions = expressions
        self.newline = newline
…
Parser
from rply import ParserGenerator

pg = ParserGenerator(["NUMBER", "PRINT", …])

@pg.production("program : ")
@pg.production("program : line")
@pg.production("program : line program")
def program(p):
    if len(p) == 2:
        return Program([p[0]] + p[1].get_lines())
    return Program(p)

@pg.production("line : NUMBER statements")
def line(p):
    return Line(p[0], p[1].get_statements())

@pg.production("op : expression + expression")
@pg.production("op : expression * expression")
def op(p):
    if p[1].gettokentype() == "+":
        return Add(p[0], p[2])
    elif p[1].gettokentype() == "*":
        return Mul(p[0], p[2])

# later entries bind tighter: * and / before + and -
pg = ParserGenerator([…], precedence=[
    ("left", ["+", "-"]),
    ("left", ["*", "/"])
])

parse = pg.build().parse
Compiler/Virtual Machine
AST → Compiler → Bytecode → Virtual Machine
class VM(object):
    def __init__(self, program):
        self.program = program
        self.pc = 0
        self.frames = []
        self.iterators = {}
        self.stack = []
        self.variables = {}
class VM(object):
    …
    def execute(self):
        while self.pc < len(self.program.instructions):
            self.execute_bytecode(self.program.instructions[self.pc])

    def execute_bytecode(self, code):
        if isinstance(code, TYPE):
            self.execute_TYPE(code)
        ...
        else:
            raise NotImplementedError(code)
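A complete, runnable miniature of this dispatch loop might look as follows. It is a sketch under assumptions: only three instructions are supported, `Program` takes its instruction list directly, output is collected in a list instead of being printed, and the program counter is advanced after each instruction, a detail the slides do not show.

```python
class Instruction(object):
    pass

class Number(Instruction):
    def __init__(self, value):
        self.value = value

class Add(Instruction):
    pass

class Print(Instruction):
    pass

class Program(object):
    def __init__(self, instructions):
        self.instructions = instructions

class VM(object):
    def __init__(self, program):
        self.program = program
        self.pc = 0
        self.stack = []
        self.output = []

    def execute(self):
        while self.pc < len(self.program.instructions):
            self.execute_bytecode(self.program.instructions[self.pc])

    def execute_bytecode(self, code):
        if isinstance(code, Number):
            self.stack.append(code.value)
        elif isinstance(code, Add):
            right = self.stack.pop()
            left = self.stack.pop()
            self.stack.append(left + right)
        elif isinstance(code, Print):
            self.output.append(str(self.stack.pop()))
        else:
            raise NotImplementedError(code)
        # Straight-line instructions move on to the next one; a jump
        # instruction would set self.pc to its target instead.
        self.pc += 1

# Roughly what "10 PRINT 1+2" could compile to
vm = VM(Program([Number(1), Number(2), Add(), Print()]))
vm.execute()
print(vm.output)  # ['3']
```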
Bytecode
class Program(object):
    def __init__(self):
        self.instructions = []

class Instruction(object):
    pass
class Number(Instruction):
    def __init__(self, value):
        self.value = value

class String(Instruction):
    def __init__(self, value):
        self.value = value
class Print(Instruction):	
def __init__(self, expressions, newline):	
self.expressions = expressions	
self.newline = newline
class Call(Instruction):	
def __init__(self, function_name):	
self.function_name = function_name
class Let(Instruction):	
def __init__(self, name):	
self.name = name
class Lookup(Instruction):	
def __init__(self, name):	
self.name = name
class Add(Instruction):	
pass	
!
class Sub(Instruction):	
pass	
!
class Mul(Instruction):	
pass	
!
class Equal(Instruction):	
pass	
!
...
class GotoIfTrue(Instruction):
    def __init__(self, target):
        self.target = target

class Goto(Instruction):
    def __init__(self, target, with_frame=False):
        self.target = target
        self.with_frame = with_frame

class Return(Instruction):
    pass
class Input(Instruction):
    def __init__(self, name):
        self.name = name
class For(Instruction):
    def __init__(self, variable):
        self.variable = variable

class Next(Instruction):
    def __init__(self, variable):
        self.variable = variable
class Program(object):
    def __init__(self):
        self.instructions = []
        self.lineno2instruction = {}

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        if exc_type is None:
            for i, instruction in enumerate(self.instructions):
                instruction.finalize(self, i)
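# finalize() on an instruction that has a target: replace the BASIC line
# number in self.target with the index of that line's first instruction.
def finalize(self, program, index):
    self.target = program.lineno2instruction[self.target]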
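The fix-up pass above can be tried in isolation. The following is a minimal sketch in plain Python, not the deck's actual compiler: `Noop` and `compile_lines` are hypothetical helpers added for illustration, and the input shape (a list of line-number/instruction pairs) is an assumption. It shows why the second pass is needed: a forward GOTO refers to a line that has not been compiled yet when the jump is emitted.

```python
class Goto(object):
    """Jump whose target starts out as a BASIC line number."""
    def __init__(self, target):
        self.target = target

    def finalize(self, program, index):
        # Replace the line number with the index of the first
        # instruction that was compiled for that line.
        self.target = program.lineno2instruction[self.target]


class Noop(object):
    """Hypothetical stand-in for any instruction that needs no fix-up."""
    def finalize(self, program, index):
        pass


class Program(object):
    def __init__(self):
        self.instructions = []
        self.lineno2instruction = {}

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        # Second pass: runs once every line has been compiled.
        if exc_type is None:
            for i, instruction in enumerate(self.instructions):
                instruction.finalize(self, i)


def compile_lines(lines):
    """lines: list of (lineno, [instructions]) pairs (assumed input shape)."""
    with Program() as program:
        for lineno, instructions in lines:
            program.lineno2instruction[lineno] = len(program.instructions)
            program.instructions.extend(instructions)
    return program


program = compile_lines([
    (10, [Goto(30)]),          # forward jump: line 30 is not compiled yet
    (20, [Noop(), Noop()]),
    (30, [Noop(), Goto(10)]),  # backward jump
])
targets = [ins.target for ins in program.instructions if isinstance(ins, Goto)]
print(targets)  # [3, 0]
```

Doing the resolution in `__exit__` keeps the compile methods simple: each statement only appends instructions, and no jump has to know whether its target line comes earlier or later in the source.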
class Program(BaseBox):
    ...
    def compile(self):
        with bytecode.Program() as program:
            for line in self.lines:
                line.compile(program)
        return program
class Line(BaseBox):
    ...
    def compile(self, program):
        program.lineno2instruction[self.lineno] = len(program.instructions)
        for statement in self.statements:
            statement.compile(program)
class Print(Statement):
    def compile(self, program):
        for expression in self.expressions:
            expression.compile(program)
        program.instructions.append(
            bytecode.Print(
                len(self.expressions),
                self.newline
            )
        )
class Let(Statement):
    ...
    def compile(self, program):
        self.value.compile(program)
        program.instructions.append(
            bytecode.Let(self.name)
        )
class Input(Statement):
    ...
    def compile(self, program):
        program.instructions.append(
            bytecode.Input(self.variable)
        )
class Goto(Statement):
    ...
    def compile(self, program):
        program.instructions.append(
            bytecode.Goto(self.target)
        )

class Gosub(Statement):
    ...
    def compile(self, program):
        program.instructions.append(
            bytecode.Goto(
                self.target,
                with_frame=True
            )
        )

class Return(Statement):
    ...
    def compile(self, program):
        program.instructions.append(
            bytecode.Return()
        )
class For(Statement):
    ...
    def compile(self, program):
        self.start.compile(program)
        program.instructions.append(
            bytecode.Let(self.variable)
        )
        self.end.compile(program)
        program.instructions.append(
            bytecode.For(self.variable)
        )
class WrappedObject(object):
    pass

class WrappedString(WrappedObject):
    def __init__(self, value):
        self.value = value

class WrappedFloat(WrappedObject):
    def __init__(self, value):
        self.value = value
class VM(object):
    ...
    def execute_number(self, code):
        self.stack.append(WrappedFloat(code.value))
        self.pc += 1

    def execute_string(self, code):
        self.stack.append(WrappedString(code.value))
        self.pc += 1
import random  # used by RND below
class VM(object):
    ...
    def execute_call(self, code):
        argument = self.stack.pop()
        if code.function_name == "TAB":
            # the argument is a WrappedFloat, so unwrap it first
            self.stack.append(WrappedString(" " * int(argument.value)))
        elif code.function_name == "RND":
            self.stack.append(WrappedFloat(random.random()))
        ...
        self.pc += 1
class VM(object):
    ...
    def execute_let(self, code):
        value = self.stack.pop()
        self.variables[code.name] = value
        self.pc += 1

    def execute_lookup(self, code):
        value = self.variables[code.name]
        self.stack.append(value)
        self.pc += 1
class VM(object):
    ...
    def execute_add(self, code):
        right = self.stack.pop()
        left = self.stack.pop()
        # operands are WrappedFloat objects, so add their unwrapped values
        self.stack.append(WrappedFloat(left.value + right.value))
        self.pc += 1
class VM(object):
    ...
    def execute_goto_if_true(self, code):
        condition = self.stack.pop()
        if condition:
            self.pc = code.target
        else:
            self.pc += 1
class VM(object):
    ...
    def execute_goto(self, code):
        if code.with_frame:
            self.frames.append(self.pc + 1)
        self.pc = code.target
class VM(object):
    ...
    def execute_return(self, code):
        self.pc = self.frames.pop()
class VM(object):
    ...
    def execute_input(self, code):
        value = WrappedFloat(float(raw_input() or "0.0"))
        self.variables[code.name] = value
        self.pc += 1
class VM(object):
    ...
    def execute_for(self, code):
        self.pc += 1
        self.iterators[code.variable] = (
            self.pc,
            self.stack.pop()
        )
class VM(object):
    ...
    def execute_next(self, code):
        loop_begin, end = self.iterators[code.variable]
        current_value = self.variables[code.variable].value
        next_value = current_value + 1.0
        # end was stored as a WrappedFloat, so compare against its value
        if next_value <= end.value:
            self.variables[code.variable] = WrappedFloat(next_value)
            self.pc = loop_begin
        else:
            del self.iterators[code.variable]
            self.pc += 1
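To see how the FOR and NEXT handlers cooperate with the stack and the variable table, the pieces can be assembled into a small stand-alone VM. This is a sketch under assumptions, written in plain Python rather than RPython: the dispatch through `getattr` on the class name is a shortcut chosen here for brevity, and the instruction list is hand-written instead of coming from the parser. It runs the equivalent of `S = 0: FOR I = 1 TO 3: S = S + I: NEXT I`.

```python
class WrappedFloat(object):
    def __init__(self, value):
        self.value = value

class Number(object):
    def __init__(self, value):
        self.value = value

class Let(object):
    def __init__(self, name):
        self.name = name

class Lookup(object):
    def __init__(self, name):
        self.name = name

class Add(object):
    pass

class For(object):
    def __init__(self, variable):
        self.variable = variable

class Next(object):
    def __init__(self, variable):
        self.variable = variable


class VM(object):
    def __init__(self, instructions):
        self.instructions = instructions
        self.pc = 0
        self.stack = []
        self.variables = {}
        self.iterators = {}

    def execute(self):
        while self.pc < len(self.instructions):
            code = self.instructions[self.pc]
            # Dispatch on the class name, e.g. Number -> execute_number.
            getattr(self, "execute_" + type(code).__name__.lower())(code)

    def execute_number(self, code):
        self.stack.append(WrappedFloat(code.value))
        self.pc += 1

    def execute_let(self, code):
        self.variables[code.name] = self.stack.pop()
        self.pc += 1

    def execute_lookup(self, code):
        self.stack.append(self.variables[code.name])
        self.pc += 1

    def execute_add(self, code):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(WrappedFloat(left.value + right.value))
        self.pc += 1

    def execute_for(self, code):
        # Remember where the loop body starts and the (wrapped) end value.
        self.pc += 1
        self.iterators[code.variable] = (self.pc, self.stack.pop())

    def execute_next(self, code):
        loop_begin, end = self.iterators[code.variable]
        next_value = self.variables[code.variable].value + 1.0
        if next_value <= end.value:
            self.variables[code.variable] = WrappedFloat(next_value)
            self.pc = loop_begin
        else:
            del self.iterators[code.variable]
            self.pc += 1


vm = VM([
    Number(0.0), Let("S"),               # S = 0
    Number(1.0), Let("I"),               # FOR I = 1 ...
    Number(3.0), For("I"),               # ... TO 3
    Lookup("S"), Lookup("I"), Add(), Let("S"),   # S = S + I
    Next("I"),                           # NEXT I
])
vm.execute()
print(vm.variables["S"].value)  # 6.0
```

The loop state lives in `iterators` rather than on the stack, which is why the body can push and pop freely without disturbing the loop bounds.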
def entry_point(argv):
    try:
        filename = argv[1]
    except IndexError:
        print("You must supply a filename")
        return 1
    content = read_file(filename)
    tokens = lex(content)
    ast = parse(tokens)
    program = ast.compile()
    vm = VM(program)
    vm.execute()
    return 0
Entry Point
JIT (in PyPy)
1. Identify "hot" loops
2. Create trace inserting guards based on observed values
3. Optimize trace
4. Compile trace
5. Execute machine code instead of interpreter
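Step 1 of the list above can be illustrated with a counter on backward jumps. This is only a sketch of the general idea, assuming a loop is identified by the target of a backward jump; `HotLoopCounter`, `record_jump` and the threshold value are hypothetical names chosen for this example and are not part of RPython's API or a description of PyPy's internals.

```python
class HotLoopCounter(object):
    """Counts backward jumps per target and reports when one becomes hot."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = {}

    def record_jump(self, source_pc, target_pc):
        """Return True exactly once, when the jump to target_pc turns hot."""
        if target_pc > source_pc:
            return False  # forward jump: not a loop back-edge
        count = self.counts.get(target_pc, 0) + 1
        self.counts[target_pc] = count
        return count == self.threshold


counter = HotLoopCounter(threshold=3)
# The same back-edge (pc 10 jumping back to pc 6) taken five times.
events = [counter.record_jump(10, 6) for _ in range(5)]
print(events)  # [False, False, True, False, False]
```

A real tracing JIT would start recording a trace at the moment `record_jump` returns True; the later steps (guards, optimization, code generation) are outside the scope of this sketch.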
from rpython.rlib.jit import JitDriver

jitdriver = JitDriver(
    greens=["pc", "vm", "program", "frames", "iterators"],
    reds=["stack", "variables"]
)
class VM(object):
    ...
    def execute(self):
        while self.pc < len(self.program.instructions):
            jitdriver.merge_point(
                vm=self,
                pc=self.pc,
                …
            )
Benchmark
10 N = 1	
20 IF N <= 10000 THEN 40	
30 END	
40 GOSUB 100	
50 IF R = 0 THEN 70	
60 PRINT "PRIME"; N	
70 N = N + 1: GOTO 20	
100 REM *** ISPRIME N -> R	
110 IF N <= 2 THEN 170	
120 FOR I = 2 TO (N - 1)	
130 A = N: B = I: GOSUB 200	
140 IF R <> 0 THEN 160	
150 R = 0: RETURN	
160 NEXT I	
170 R = 1: RETURN	
200 REM *** MOD A -> B -> R	
210 R = A - (B * INT(A / B))	
220 RETURN
Implementation                    Time
cbmbasic                          58.22s
basic-c                           5.06s
basic-c-jit                       2.34s
Python implementation (CPython)   2.83s
Python implementation (PyPy)      0.11s
C implementation                  0.03s
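The timings are easier to compare as ratios. The short script below is a sketch that only re-uses the six numbers from the benchmark slide; the `speedup` helper is a name introduced here for illustration.

```python
# Timings in seconds, copied from the benchmark slide.
timings = {
    "cbmbasic": 58.22,
    "basic-c": 5.06,
    "basic-c-jit": 2.34,
    "Python implementation (CPython)": 2.83,
    "Python implementation (PyPy)": 0.11,
    "C implementation": 0.03,
}

def speedup(slower, faster):
    """How many times faster `faster` ran than `slower`."""
    return timings[slower] / timings[faster]

baseline = "Python implementation (CPython)"
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print("%-34s %6.2fs  %8.1fx vs CPython" % (name, seconds, speedup(baseline, name)))
```

On these numbers the same interpreter source runs roughly 26 times faster under PyPy than under CPython, and the JIT roughly halves the run time of basic-c.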
Project milestones
2008 Django support
2010 First JIT-compiler
2011 Compatibility with CPython 2.7
2014 Basic ARM support
CPython 3 support
Improve compatibility with C extensions
NumPyPy
Multi-threading support
PyPy STM
PyPy STM
http://dabeaz.com/GIL/gilvis/
GIL locking
PyPy STM
10 loops, best of 3: 1.2 sec per loop
10 loops, best of 3: 822 msec per loop
from threading import Thread
def count(n):
    while n > 0:
        n -= 1
def run():
    t1 = Thread(target=count, args=(10000000,))
    t1.start()
    t2 = Thread(target=count, args=(10000000,))
    t2.start()
    t1.join(); t2.join()
def count(n):
    while n > 0:
        n -= 1
def run():
    count(10000000)
    count(10000000)
Inside the Python GIL - David Beazley
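The two variants above can be timed side by side with a small harness. This is a sketch, assuming wall-clock time from `time.time()` is precise enough for the comparison; `run_sequential`, `run_threaded` and `best_of` are names introduced here, and the iteration count is reduced from the slide's 10000000 so the script finishes quickly. Absolute numbers depend on the interpreter and the machine.

```python
import time
from threading import Thread

def count(n):
    while n > 0:
        n -= 1

def run_sequential(n):
    count(n)
    count(n)

def run_threaded(n):
    t1 = Thread(target=count, args=(n,))
    t2 = Thread(target=count, args=(n,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()

def best_of(func, n, repeats=3):
    """Run func(n) several times and return the shortest elapsed time."""
    best = None
    for _ in range(repeats):
        start = time.time()
        func(n)
        elapsed = time.time() - start
        if best is None or elapsed < best:
            best = elapsed
    return best

if __name__ == "__main__":
    n = 200000  # reduced from the slide's 10000000
    print("sequential: %.3fs" % best_of(run_sequential, n))
    print("threaded:   %.3fs" % best_of(run_threaded, n))
```

With a global interpreter lock, the threaded variant is not expected to beat the sequential one on CPU-bound work like this, since only one thread executes bytecode at a time.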
PyPy in the real world (1)
High frequency trading platform for sports bets
    low latency is a must
PyPy used in production since 2012
~100 PyPy processes running 24/7
up to 10x speedups
    after careful tuning and optimizing for PyPy
PyPy in the real world (2)
Real-time online advertising auctions
    tight latency requirement (<100ms)
    high throughput (hundreds of thousands of requests per second)
30% speedup
We run PyPy basically everywhere
Julian Berman
PyPy in the real world (3)
IoT on the cloud
5-10x faster
We do not even run benchmarks on CPython
because we just know that PyPy is way faster
Tobias Oberstein

Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 

PyPy's approach to construct domain-specific language runtime

  • 1. Tag: virtual machine, compiler, performance PyPy’s Approach to Construct Domain-specific Language Runtime
  • 2. Tag: virtual machine, compiler, performance Construct Domain-specific Language Runtime using
  • 3. Speed 7.4 times faster than CPython http://speed.pypy.org antocuni (PyCon Otto) PyPy Status Update April 07 2017 4 / 19
  • 4. Why is Python slow? Interpretation overhead Boxed arithmetic and automatic overflow handling Dynamic dispatch of operations Dynamic lookup of methods and attributes Everything can change on runtime Extreme introspective and reflective capabilities Francisco Fernandez Castano (@fcofdezc) PyPy November 8, 2014 8 / 51
  • 5. Why is Python slow? Boxed arithmetic and automatic overflow handling
      i = 0
      while i < 10000000:
          i = i + 1
  • 6. Why is Python slow? Dynamic dispatch of operations
      # while i < 10000000
       9 LOAD_FAST          0 (i)
      12 LOAD_CONST         2 (10000000)
      15 COMPARE_OP         0 (<)
      18 POP_JUMP_IF_FALSE  34
      # i = i + 1
      21 LOAD_FAST          0 (i)
      24 LOAD_CONST         3 (1)
      27 BINARY_ADD
      28 STORE_FAST         0 (i)
      31 JUMP_ABSOLUTE      9
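The listing on this slide comes from a Python 2 era CPython. A minimal sketch for reproducing it with the standard `dis` module follows; the function name `count_up` is only illustrative, and the opcode names printed depend on the interpreter version (newer CPython releases, for example, no longer emit `BINARY_ADD`).

```python
import dis


def count_up(limit):
    # Same loop as on the slide, wrapped in a function so that `i`
    # is a fast local rather than a module-level global.
    i = 0
    while i < limit:
        i = i + 1
    return i


# Opcode names generated by whichever interpreter runs this file.
opnames = [ins.opname for ins in dis.get_instructions(count_up)]

if __name__ == "__main__":
    # Every line printed here is one trip through the interpreter's
    # dispatch loop each time the instruction executes.
    dis.dis(count_up)
```

Each instruction in the output is dispatched separately on every iteration, which is the overhead the slide refers to.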
  • 7. Why is Python slow? Dynamic lookup of methods and attributes
      class MyExample(object):
          pass
      def foo(target, flag):
          if flag:
              target.x = 42
      obj = MyExample()
      foo(obj, True)
      print obj.x  #=> 42
      print getattr(obj, "x")  #=> 42
  • 8. Why is Python slow? Everything can change on runtime
      def fn():
          return 42
      def hello():
          return 'Hi! PyConEs!'
      def change_the_world():
          global fn
          fn = hello
      print fn()  #=> 42
      change_the_world()
      print fn()  #=> 'Hi! PyConEs!'
  • 9. Why is Python slow? Everything can change on runtime
      class Dog(object):
          def __init__(self):
              self.name = 'Jandemor'
          def talk(self):
              print "%s: guau!" % self.name
      class Cat(object):
          def __init__(self):
              self.name = 'CatInstance'
          def talk(self):
              print "%s: miau!" % self.name
  • 10. Why is Python slow? Everything can change on runtime
      my_pet = Dog()
      my_pet.talk()  #=> 'Jandemor: guau!'
      my_pet.__class__ = Cat
      my_pet.talk()  #=> 'Jandemor: miau!'
  • 11. Why is Python slow? Extreme introspective and reflective capabilities
      import sys
      def fill_list(name):
          frame = sys._getframe().f_back
          lst = frame.f_locals[name]
          lst.append(42)
      def foo():
          things = []
          fill_list('things')
          print things  #=> [42]
  • 13. PyPy Translation Toolchain • Capable of compiling (R)Python • Garbage collection • Tracing just-in-time compiler generator • Software transactional memory?
  • 15. PyPy based interpreters • Topaz (Ruby) • HippyVM (PHP) • Pyrolog (Prolog) • pycket (Racket) • Various other interpreters (Scheme, JavaScript, Io, Game Boy)
  • 16. Compiler / Interpreter Source: Compiler Construction, Prof. O. Nierstrasz
  • 17. Traditional 2 pass compiler • intermediate representation (IR) • front end maps legal code into IR • back end maps IR onto target machine • simplify retargeting • allows multiple front ends • multiple passes → better code
  • 18. Traditional 3 pass compiler • analyzes and changes IR • goal is to reduce runtime • must preserve values
  • 19. Optimizer: middle end. Modern optimizers are usually built as a set of passes: • constant propagation and folding • code motion • reduction of operator strength • common sub-expression elimination • redundant store elimination • dead code elimination
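Two of the passes listed above, constant folding and dead code elimination, can be sketched on a toy intermediate representation. The tuple format `(dest, op, a, b)`, the function names, and the assumption that every variable is assigned exactly once are choices made for this illustration and are not taken from the slides.

```python
import operator

# Hypothetical three-address IR: (dest, op, a, b); operands are either
# int constants or variable names (str). Each variable is assigned once.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def constant_fold(code):
    """Constant propagation and folding over straight-line code."""
    known = {}  # variable name -> constant value
    out = []
    for dest, op, a, b in code:
        a = known.get(a, a)  # propagate constants into operands
        b = known.get(b, b)
        if isinstance(a, int) and isinstance(b, int):
            known[dest] = OPS[op](a, b)  # folded, nothing emitted
        else:
            out.append((dest, op, a, b))
    return out, known


def eliminate_dead_code(code, live_out):
    """Drop instructions whose result is never used."""
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(code):
        if dest in live:
            kept.append((dest, op, a, b))
            live.discard(dest)
            live.update(x for x in (a, b) if isinstance(x, str))
    kept.reverse()
    return kept


code = [
    ("t1", "mul", 3, 4),       # constant: folds to 12
    ("t2", "add", "x", "t1"),  # becomes x + 12
    ("t3", "mul", "x", 2),     # result never used: dead
    ("t4", "add", "t2", 1),
]
folded, known = constant_fold(code)
optimized = eliminate_dead_code(folded, live_out={"t4"})
```

Running the passes one after the other mirrors the "set of passes" structure: each pass takes IR and returns IR, so they compose freely.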
  • 20. Optimization Challenges • Preserve language semantics • Reflection, Introspection, Eval • External APIs • Interpreter consists of short sequences of code • Prevent global optimizations • Typically implemented as a stack machine • Dynamic, imprecise type information • Variables can change type • Duck Typing: method works with any object that provides accessed interfaces • Monkey Patching: add members to “class” after initialization • Memory management and concurrency • Function calls through packing of operands in fat object
  • 22. RPython • Python subset • Statically typed • Garbage collected • Standard library almost entirely unavailable • Some missing builtins (print, open(), …) • rpython.rlib • exceptions are (sometimes) ignored • Not really a language, rather a "state"
  • 23. PyPy Interpreter
      def f(x):
          return x + 1
      >>> dis.dis(f)
        2   0 LOAD_FAST     0 (x)
            3 LOAD_CONST    1 (1)
            6 BINARY_ADD
            7 RETURN_VALUE
    • written in RPython • Stack-based bytecode interpreter (like JVM) • bytecode compiler → generates bytecode • bytecode evaluator → interprets bytecode • object space → handles operations on objects
  • 26. CFG (Control Flow Graph) • Consists of Blocks and Links • Starting from entry_point • “Static Single Information” form
      def f(n):
          return 3 * n + 2
      Block(v1):  # input argument
          v2 = mul(Constant(3), v1)
          v3 = add(v2, Constant(2))
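The block shown for `f(n) = 3 * n + 2` can be modelled and executed with a few lines of plain Python. This is only a sketch: the `Constant`, `Variable` and `Block` classes below are stand-ins written for this example and are not the classes used inside the RPython toolchain.

```python
import operator
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Constant:
    value: int


@dataclass(frozen=True)
class Variable:
    name: str


@dataclass
class Block:
    inputargs: list
    # Each operation is (result variable, operation name, argument list).
    operations: list = field(default_factory=list)
    exit: Variable = None  # value handed to the return link


OPS = {"mul": operator.mul, "add": operator.add}


def run_block(block, *args):
    """Evaluate a single branch-free block on concrete input values."""
    env = dict(zip(block.inputargs, args))

    def value(x):
        return x.value if isinstance(x, Constant) else env[x]

    for result, opname, op_args in block.operations:
        env[result] = OPS[opname](*(value(a) for a in op_args))
    return env[block.exit]


# The flow graph of f(n) = 3 * n + 2 from the slide.
v1, v2, v3 = Variable("v1"), Variable("v2"), Variable("v3")
block = Block(
    inputargs=[v1],
    operations=[
        (v2, "mul", [Constant(3), v1]),
        (v3, "add", [v2, Constant(2)]),
    ],
    exit=v3,
)
```

Because every variable is assigned once and the block has no branches, evaluation is a single pass over the operation list.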
  • 27. CFG: Static Single Information
      def test(a):
          if a > 0:
              if a > 5:
                  return 10
              return 4
          if a < -10:
              return 3
          return 10
    • SSI: “PHIs” for all used variables • Blocks as “functions without branches”
  • 28. PyPy Advantages • High Level Language Implementation • to implement new features: lazily computed objects and functions, pluggable garbage collection, runtime replacement of live objects, stackless concurrency • JIT Generation • Object space • Stackless • infinite recursion • Microthreads: Coroutines, Tasklets and Channels, Greenlets
  • 30. Assumptions • Pareto Principle (80-20 rule): 20% of the program accounts for 80% of the runtime; hot-spots • Fast Path principle: optimize only what is necessary; fall back for uncommon cases • Most of runtime spent in loops • Always the same code paths (likely) antocuni (Intel@Bucharest) PyPy Intro April 4 2016 9 / 32
  • 38. Tracing JIT phases Interpretation Tracing hot loop detected Compilation Running cold guard failed entering compiled loop guard failure → hot hot guard failed antocuni (Intel@Bucharest) PyPy Intro April 4 2016 11 / 32
  • 39. Trace trees (1) tracetree.py
      def foo():
          a = 0
          i = 0
          N = 100
          while i < N:
              if i%2 == 0:
                  a += 1
              else:
                  a *= 2
              i += 1
          return a
  • 40. Trace trees (2)
      label(start, i0, a0)
      v0 = int_lt(i0, 2000)
      guard_true(v0)
      v1 = int_mod(i0, 2)
      v2 = int_eq(v1, 0)
      guard_true(v2)
      a1 = int_add(a0, 10)
      i1 = int_add(i0, 1)
      jump(start, i1, a1)
  • 44. Trace trees (2) Same trace; while the failing guard_true(v2) is still cold: COLD FAIL → BLACKHOLE → INTERPRETER
  • 48. Trace trees (2) Same trace; once the failure of guard_true(v2) is hot (HOT FAIL), a second trace is attached to the guard:
      a1 = int_mul(a0, 2)
      i1 = int_add(i0, 1)
      jump(start, i1, a1)
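The cold and hot guard failures on these slides can be imitated in plain Python. This is a toy model rather than PyPy's machinery: the names `trace_even`, `bridge_odd`, `GuardFailed` and the threshold value are invented for the sketch, and it follows the constants of `tracetree.py` (`a += 1`) rather than those printed in the trace.

```python
HOT_GUARD_THRESHOLD = 3  # assumed value, chosen only for the demo


class GuardFailed(Exception):
    def __init__(self, guard):
        super().__init__(guard)
        self.guard = guard


def foo_interpreted(n):
    """Reference semantics: the loop from tracetree.py."""
    a = 0
    i = 0
    while i < n:
        if i % 2 == 0:
            a += 1
        else:
            a *= 2
        i += 1
    return a


def trace_even(i, a, n):
    """One pass through the recorded trace (the i % 2 == 0 path)."""
    if not i < n:
        raise GuardFailed("i < N")
    if not i % 2 == 0:
        raise GuardFailed("i % 2 == 0")
    return i + 1, a + 1  # jump(start, i1, a1)


def bridge_odd(i, a):
    """Second trace attached to the hot guard: the else branch."""
    return i + 1, a * 2


def interpret_one_iteration(i, a):
    """Fallback used while the guard is cold."""
    if i % 2 == 0:
        a += 1
    else:
        a *= 2
    return i + 1, a


def foo_traced(n):
    i, a = 0, 0
    stats = {"cold": 0, "bridge": 0}
    while True:
        try:
            i, a = trace_even(i, a, n)
        except GuardFailed as failure:
            if failure.guard == "i < N":
                return a, stats  # loop finished
            if stats["cold"] < HOT_GUARD_THRESHOLD:
                stats["cold"] += 1  # cold fail: back to the interpreter
                i, a = interpret_one_iteration(i, a)
            else:
                stats["bridge"] += 1  # hot fail: run the attached trace
                i, a = bridge_odd(i, a)
```

Whichever path handles the odd iterations, the result matches the interpreted loop, which is the property a guard plus fallback has to preserve.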
  • 51. Part 3 The PyPy JIT antocuni (Intel@Bucharest) PyPy Intro April 4 2016 14 / 32
  • 52. Terminology (1) • translation time: when you run "rpython targetpypy.py" to get the pypy binary • runtime: everything which happens after you start pypy • interpretation, tracing, compiling • assembler/machine code: the output of the JIT compiler • execution time: when your Python program is being executed, by the interpreter or by the machine code
  • 53. Terminology (2) interp-level: things written in RPython [PyPy] interpreter: the RPython program which executes the final Python programs bytecode: "the output of dis.dis". It is executed by the PyPy interpreter. app-level: things written in Python, and executed by the PyPy Interpreter antocuni (Intel@Bucharest) PyPy Intro April 4 2016 16 / 32
  • 54. Terminology (3) (the following is not 100% accurate but it’s enough to understand the general principle) • low level op or ResOperation: low-level instructions like "add two integers", "read a field out of a struct", "call this function"; (more or less) the same level as C ("portable assembler"); knows about GC objects (e.g. you have getfield_gc vs getfield_raw) • jitcodes: low-level representation of RPython functions; sequence of low level ops; generated at translation time; 1 RPython function --> 1 C function --> 1 jitcode
  • 55. Terminology (4) • JIT traces or loops: a very specific sequence of llops as actually executed by your Python program; generated at runtime (more specifically, during tracing) • JIT optimizer: takes JIT traces and emits JIT traces • JIT backend: takes JIT traces and emits machine code
  • 63. General architecture
    RPYTHON
      def LOAD_GLOBAL(self): ...
      def STORE_FAST(self): ...
      def BINARY_ADD(self): ...
    CODEWRITER
    JITCODE
      ...
      p0 = getfield_gc(p0, 'func_globals')
      p2 = getfield_gc(p1, 'strval')
      call(dict_lookup, p0, p2)
      ....
      ...
      p0 = getfield_gc(p0, 'locals_w')
      setarrayitem_gc(p0, i0, p1)
      ....
      ...
      promote_class(p0)
      i0 = getfield_gc(p0, 'intval')
      promote_class(p1)
      i1 = getfield_gc(p1, 'intval')
      i2 = int_add(i0, i1)
      if (overflowed) goto ...
      p2 = new_with_vtable('W_IntObject')
      setfield_gc(p2, i2, 'intval')
      ....
    compile-time | runtime
    META-TRACER, OPTIMIZER, BACKEND, ASSEMBLER
  • 69. PyPy trace example
      def fn():
          c = a+b
          ...
    Bytecode:
      LOAD_GLOBAL A
      LOAD_GLOBAL B
      BINARY_ADD
      STORE_FAST C
    Trace for each LOAD_GLOBAL:
      ...
      p0 = getfield_gc(p0, 'func_globals')
      p2 = getfield_gc(p1, 'strval')
      call(dict_lookup, p0, p2)
      ...
    Trace for BINARY_ADD:
      ...
      guard_class(p0, W_IntObject)
      i0 = getfield_gc(p0, 'intval')
      guard_class(p1, W_IntObject)
      i1 = getfield_gc(p1, 'intval')
      i2 = int_add(i0, i1)
      guard_not_overflow()
      p2 = new_with_vtable('W_IntObject')
      setfield_gc(p2, i2, 'intval')
      ...
    Trace for STORE_FAST:
      ...
      p0 = getfield_gc(p0, 'locals_w')
      setarrayitem_gc(p0, i0, p1)
      ....
  • 70. PyPy optimizer intbounds constant folding / pure operations virtuals string optimizations heap (multiple get/setfield, etc) unroll antocuni (Intel@Bucharest) PyPy Intro April 4 2016 21 / 32
  • 71. Intbound optimization (1) intbound.py
      def fn():
          i = 0
          while i < 5000:
              i += 2
          return i
  • 72. Intbound optimization (2)
    unoptimized
      ...
      i17 = int_lt(i15, 5000)
      guard_true(i17)
      i19 = int_add_ovf(i15, 2)
      guard_no_overflow()
      ...
    optimized
      ...
      i17 = int_lt(i15, 5000)
      guard_true(i17)
      i19 = int_add(i15, 2)
      ...
    • It works often • array bound checking • intbound info propagates all over the trace
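The rewrite from `int_add_ovf` to `int_add` can be sketched as a small pass over a list of trace operations. The tuple encoding of operations, the `MAXINT` value, and the function name are assumptions made for this example; PyPy's real optimizer tracks considerably more information.

```python
MAXINT = 2**63 - 1  # assumed machine word size for the sketch


def optimize_intbounds(trace):
    """Drop overflow checks that a preceding guard already rules out.

    Operations are tuples: ("int_lt", res, var, const), ("guard_true", var),
    ("int_add_ovf", res, var, const), ("guard_no_overflow",).
    """
    upper = {}        # variable -> proven inclusive upper bound
    comparisons = {}  # result of int_lt -> (variable, constant)
    out = []
    drop_next_overflow_guard = False
    for op in trace:
        name = op[0]
        if name == "int_lt":
            _, res, var, const = op
            comparisons[res] = (var, const)
            out.append(op)
        elif name == "guard_true":
            cond = op[1]
            if cond in comparisons:
                var, const = comparisons[cond]
                # After the guard, var < const is known to hold.
                upper[var] = min(upper.get(var, MAXINT), const - 1)
            out.append(op)
        elif name == "int_add_ovf":
            _, res, var, const = op
            if var in upper and const >= 0 and upper[var] + const <= MAXINT:
                out.append(("int_add", res, var, const))
                upper[res] = upper[var] + const  # bound propagates onward
                drop_next_overflow_guard = True
            else:
                drop_next_overflow_guard = False
                out.append(op)
        elif name == "guard_no_overflow":
            if drop_next_overflow_guard:
                drop_next_overflow_guard = False  # guard is redundant
            else:
                out.append(op)
        else:
            out.append(op)
    return out


unoptimized = [
    ("int_lt", "i17", "i15", 5000),
    ("guard_true", "i17"),
    ("int_add_ovf", "i19", "i15", 2),
    ("guard_no_overflow",),
]
optimized = optimize_intbounds(unoptimized)
```

The bound recorded for `i19` is what lets the information "propagate all over the trace": later operations on `i19` can be simplified the same way.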
  • 75. Virtuals (1) virtuals.py
      def fn():
          i = 0
          while i < 5000:
              i += 2
          return i
  • 76. Virtuals (2)
    unoptimized
      ...
      guard_class(p0, W_IntObject)
      i1 = getfield_pure(p0, 'intval')
      i2 = int_add(i1, 2)
      p3 = new(W_IntObject)
      setfield_gc(p3, i2, 'intval')
      ...
    optimized
      ...
      i2 = int_add(i1, 2)
      ...
    • The most important optimization (TM) • It works both inside the trace and across the loop • It works for tons of cases, e.g. function frames
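A rough way to see why the allocation disappears is a pass that keeps newly created objects "virtual" until something other than a field access touches them. The operation encoding and the function name below are made up for this sketch, which ignores guards and loop-carried state.

```python
def optimize_virtuals(trace):
    """Remove allocations that never escape the trace.

    Operations: ("new", p, cls), ("setfield_gc", p, field, value),
    ("getfield_gc", res, p, field); any other operation is treated
    as a use that lets its arguments escape.
    """
    virtuals = {}  # pointer variable -> (class name, {field: value})
    alias = {}     # variable -> value it is known to equal
    out = []

    def force(p):
        # The object escapes after all: emit the delayed allocation.
        cls, fields = virtuals.pop(p)
        out.append(("new", p, cls))
        for fieldname, value in fields.items():
            if value in virtuals:
                force(value)
            out.append(("setfield_gc", p, fieldname, value))

    for op in trace:
        name = op[0]
        args = [alias.get(a, a) for a in op[1:]]
        if name == "new":
            virtuals[args[0]] = (args[1], {})
        elif name == "setfield_gc" and args[0] in virtuals:
            virtuals[args[0]][1][args[1]] = args[2]
        elif name == "getfield_gc" and args[1] in virtuals:
            # Reading a virtual's field needs no memory access at all.
            alias[args[0]] = virtuals[args[1]][1][args[2]]
        else:
            for a in args:
                if a in virtuals:
                    force(a)
            out.append((name, *args))
    return out


# Two unrolled iterations of `i += 2` on a boxed integer.
unoptimized = [
    ("getfield_gc", "i1", "p0", "intval"),
    ("int_add", "i2", "i1", 2),
    ("new", "p3", "W_IntObject"),
    ("setfield_gc", "p3", "intval", "i2"),
    ("getfield_gc", "i4", "p3", "intval"),
    ("int_add", "i5", "i4", 2),
    ("new", "p6", "W_IntObject"),
    ("setfield_gc", "p6", "intval", "i5"),
    ("finish", "p6"),
]
optimized = optimize_virtuals(unoptimized)
```

The intermediate box `p3` is never allocated; only `p6`, which escapes through `finish`, is materialized.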
  • 79. Constant folding (1) constfold.py
      def fn():
          i = 0
          while i < 5000:
              i += 2
          return i
  • 80. Constant folding (2)
    unoptimized
      ...
      i1 = getfield_pure(p0, 'intval')
      i2 = getfield_pure(<W_Int(2)>, 'intval')
      i3 = int_add(i1, i2)
      ...
    optimized
      ...
      i1 = getfield_pure(p0, 'intval')
      i3 = int_add(i1, 2)
      ...
    • It "finishes the job" • Works well together with other optimizations (e.g. virtuals) • It also does "normal, boring, static" constant-folding
  • 83. Out of line guards (1) outoflineguards.py
      N = 2
      def fn():
          i = 0
          while i < 5000:
              i += N
          return i
  • 84. Out of line guards (2)
    unoptimized
      ...
      quasiimmut_field(<Cell>, 'val')
      guard_not_invalidated()
      p0 = getfield_gc(<Cell>, 'val')
      ...
      i2 = getfield_pure(p0, 'intval')
      i3 = int_add(i1, i2)
    optimized
      ...
      guard_not_invalidated()
      ...
      i3 = int_add(i1, 2)
      ...
    • Python is too dynamic, but we don’t care :-) • No overhead in assembler code • Used a bit "everywhere"
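The idea behind `guard_not_invalidated` is that the check happens on the rare write rather than on every read. A minimal sketch, assuming a hypothetical `QuasiImmutableCell` that remembers which compiled loops baked its value in (none of these class or function names come from PyPy):

```python
class CompiledLoop:
    """Stand-in for a piece of machine code produced by the JIT."""

    def __init__(self, run):
        self.run = run
        self.valid = True

    def invalidate(self):
        self.valid = False


class QuasiImmutableCell:
    """Holds a global that is assumed constant until somebody writes it."""

    def __init__(self, value):
        self._value = value
        self._dependents = []

    def read_as_constant(self, loop):
        # The loop may embed the value; remember it for later invalidation.
        self._dependents.append(loop)
        return self._value

    def write(self, value):
        # The cost is paid here, out of line, not inside the hot loop.
        self._value = value
        for loop in self._dependents:
            loop.invalidate()
        self._dependents = []


def compile_fn(cell):
    loop = CompiledLoop(run=None)
    n = cell.read_as_constant(loop)  # N is baked into the loop body

    def run():
        i = 0
        while i < 5000:
            i += n
        return i

    loop.run = run
    return loop


cell_N = QuasiImmutableCell(2)
loop = compile_fn(cell_N)
first_result = loop.run()
cell_N.write(3)                 # rebinding N invalidates the old code
recompiled = compile_fn(cell_N)
second_result = recompiled.run()
```

After the write, `loop.valid` is false, so a runtime built this way would discard the old code and trace again with the new value.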
  • 87. Hello RPython
      # hello_rpython.py
      import os
      def entry_point(argv):
          os.write(2, "Hello, World!\n")
          return 0
      def target(driver, argv):
          return entry_point, None
  • 88.
      $ rpython hello_rpython.py
      …
      $ ./hello_rpython-c
      Hello, World!
  • 89. Goal • BASIC interpreter capable of running Hamurabi • Bytecode based • Garbage Collection • Just-In-Time Compilation
  • 93.
      10 PRINT TAB(32);"HAMURABI"
      20 PRINT TAB(15);"CREATIVE COMPUTING MORRISTOWN, NEW JERSEY"
      30 PRINT:PRINT:PRINT
      80 PRINT "TRY YOUR HAND AT GOVERNING ANCIENT SUMERIA"
      90 PRINT "FOR A TEN-YEAR TERM OF OFFICE.":PRINT
      95 D1=0: P1=0
      100 Z=0: P=95:S=2800: H=3000: E=H-S
      110 Y=3: A=H/Y: I=5: Q=1
      210 D=0
      215 PRINT:PRINT:PRINT "HAMURABI: I BEG TO REPORT TO YOU,": Z=Z+1
      217 PRINT "IN YEAR";Z;",";D;"PEOPLE STARVED,";I;"CAME TO THE CITY,"
      218 P=P+I
      227 IF Q>0 THEN 230
      228 P=INT(P/2)
      229 PRINT "A HORRIBLE PLAGUE STRUCK! HALF THE PEOPLE DIED."
      230 PRINT "POPULATION IS NOW";P
      232 PRINT "THE CITY NOW OWNS ";A;"ACRES."
      235 PRINT "YOU HARVESTED";Y;"BUSHELS PER ACRE."
      250 PRINT "THE RATS ATE";E;"BUSHELS."
      260 PRINT "YOU NOW HAVE ";S;"BUSHELS IN STORE.": PRINT
      270 REM *** MORE CODE THAT DID NOT FIT INTO THE SLIDE FOLLOWS
  • 96. RPLY • Based on PLY, which is based on Lex and Yacc • Lexer generator • LALR parser generator
  • 97. Lexer
      from rply import LexerGenerator
      lg = LexerGenerator()
      lg.add("NUMBER", "[0-9]+")
      # …
      lg.ignore(" +")  # whitespace
      lexer = lg.build().lex
  • 98.
      lg.add('NUMBER', r'[0-9]*\.?[0-9]+')
      lg.add('PRINT', r'PRINT')
      lg.add('IF', r'IF')
      lg.add('THEN', r'THEN')
      lg.add('GOSUB', r'GOSUB')
      lg.add('GOTO', r'GOTO')
      lg.add('INPUT', r'INPUT')
      lg.add('REM', r'REM')
      lg.add('RETURN', r'RETURN')
      lg.add('END', r'END')
      lg.add('FOR', r'FOR')
      lg.add('TO', r'TO')
      lg.add('NEXT', r'NEXT')
      lg.add('NAME', r'[A-Z][A-Z0-9$]*')
      lg.add('(', r'\(')
      lg.add(')', r'\)')
      lg.add(';', r';')
      lg.add('STRING', r'"[^"]*"')
      lg.add(':', r'\r?\n')
      lg.add(':', r':')
      lg.add('=', r'=')
      lg.add('<>', r'<>')
      lg.add('-', r'-')
      lg.add('/', r'/')
      lg.add('+', r'\+')
      lg.add('>=', r'>=')
      lg.add('>', r'>')
      lg.add('***', r'\*\*\*.*')
      lg.add('*', r'\*')
      lg.add('<=', r'<=')
      lg.add('<', r'<')
  • 99.
      >>> from basic.lexer import lex
      >>> source = open("hello.bas").read()
      >>> for token in lex(source):
      ...     print token
      Token("NUMBER", "10")
      Token("PRINT", "PRINT")
      Token("STRING", '"HELLO BASIC!"')
      Token(":", "\n")
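To make the token stream above reproducible without installing RPLY, here is a stand-in lexer built on the standard `re` module. It is not RPLY's implementation: the reduced rule table and the "first matching rule wins" policy are assumptions of this sketch, and tokens are plain `(name, text)` tuples.

```python
import re

# A reduced rule table in the spirit of the slide; order matters because
# the first rule that matches at the current position is taken.
TOKEN_SPEC = [
    ("NUMBER", r"[0-9]*\.?[0-9]+"),
    ("PRINT", r"PRINT"),
    ("NAME", r"[A-Z][A-Z0-9$]*"),
    ("STRING", r'"[^"]*"'),
    (":", r"\r?\n"),
    (":", r":"),
    (";", r";"),
    ("=", r"="),
]
IGNORE = re.compile(r" +")  # whitespace between tokens
RULES = [(name, re.compile(pattern)) for name, pattern in TOKEN_SPEC]


def lex(source):
    """Yield (token name, matched text) pairs for a BASIC source string."""
    pos = 0
    while pos < len(source):
        skipped = IGNORE.match(source, pos)
        if skipped:
            pos = skipped.end()
            continue
        for name, regex in RULES:
            match = regex.match(source, pos)
            if match:
                yield (name, match.group())
                pos = match.end()
                break
        else:
            raise ValueError("unexpected character at position %d" % pos)


tokens = list(lex('10 PRINT "HELLO BASIC!"\n'))
```

Note that the newline and the colon both produce a `:` token, which is how the grammar can treat end of line and `:` as the same statement terminator.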
  • 100. Grammar • A set of formal rules that defines the syntax • terminals = tokens • nonterminals = rules defining a sequence of one or more (non)terminals
  • 102.
      program :
      program : line
      program : line program
  • 103.
      line : NUMBER statements
  • 104.
      statements : statement
      statements : statement statements
  • 105.
      statement : PRINT :
      statement : PRINT expressions :
      expressions : expression
      expressions : expression ;
      expressions : expression ; expressions
  • 106.
      statement : NAME = expression :
  • 107.
      statement : IF expression THEN NUMBER :
  • 108.
      statement : INPUT NAME :
  • 109.
      statement : GOTO NUMBER :
      statement : GOSUB NUMBER :
      statement : RETURN :
  • 110.
      statement : REM *** :
  • 111.
      statement : FOR NAME = NUMBER TO NUMBER :
      statement : NEXT NAME :
  • 113.
      expression : NUMBER
      expression : NAME
      expression : STRING
      expression : operation
      expression : ( expression )
      expression : NAME ( expression )
  • 114.
      operation : expression + expression
      operation : expression - expression
      operation : expression * expression
      operation : expression / expression
      operation : expression <= expression
      operation : expression < expression
      operation : expression = expression
      operation : expression <> expression
      operation : expression > expression
      operation : expression >= expression
  • 115. AST
      from rply.token import BaseBox
      class Program(BaseBox):
          def __init__(self, lines):
              self.lines = lines
  • 116.
      class Line(BaseBox):
          def __init__(self, lineno, statements):
              self.lineno = lineno
              self.statements = statements
  • 117.
      class Statements(BaseBox):
          def __init__(self, statements):
              self.statements = statements
  • 118.
      class Print(BaseBox):
          def __init__(self, expressions, newline=True):
              self.expressions = expressions
              self.newline = newline
  • 119.
  • 120. Parser
      from rply import ParserGenerator
      pg = ParserGenerator(["NUMBER", "PRINT", …])
  • 121.
      @pg.production("program : ")
      @pg.production("program : line")
      @pg.production("program : line program")
      def program(p):
          if len(p) == 2:
              return Program([p[0]] + p[1].get_lines())
          return Program(p)
  • 122.
      @pg.production("line : number statements")
      def line(p):
          return Line(p[0], p[1].get_statements())
  • 123.
      @pg.production("op : expression + expression")
      @pg.production("op : expression * expression")
      def op(p):
          if p[1].gettokentype() == "+":
              return Add(p[0], p[2])
          elif p[1].gettokentype() == "*":
              return Mul(p[0], p[2])
  • 124.
      pg = ParserGenerator([…], precedence=[
          ("left", ["+", "-"]),
          ("left", ["*", "/"])
      ])
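The precedence table says that `*` and `/` bind tighter than `+` and `-`, and that all four associate to the left. rply resolves this from the declared table inside its generated parser. The sketch below swaps in a different technique, precedence climbing, purely to show what those two declarations mean for the resulting tree. The `PRECEDENCE` dict, the tuple-based tree, and `parse_expression` are hypothetical names, and operands are assumed to be single tokens (no parentheses).

```python
# Hypothetical binding powers mirroring the precedence table:
# a larger number binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def parse_expression(tokens, pos=0, min_prec=1):
    """Precedence climbing over a flat token list; returns (tree, next_pos)."""
    left = tokens[pos]  # assumption: every operand is a single token
    pos += 1
    while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
        op = tokens[pos]
        # Left associativity: the right operand must bind strictly tighter,
        # so an operator of equal precedence ends the recursive call.
        right, pos = parse_expression(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left, pos


tree, _ = parse_expression(["1", "+", "2", "*", "3"])
left_assoc, _ = parse_expression(["8", "-", "3", "-", "2"])
```

With these inputs, `tree` groups the multiplication first and `left_assoc` groups the leftmost subtraction first, which is the behavior the `("left", ...)` rows ask for.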
  • 127.
      class VM(object):
          def __init__(self, program):
              self.program = program
  • 128.
      class VM(object):
          def __init__(self, program):
              self.program = program
              self.pc = 0
  • 129.
      class VM(object):
          def __init__(self, program):
              self.program = program
              self.pc = 0
              self.frames = []
  • 130.
      class VM(object):
          def __init__(self, program):
              self.program = program
              self.pc = 0
              self.frames = []
              self.iterators = {}
  • 131.
      class VM(object):
          def __init__(self, program):
              self.program = program
              self.pc = 0
              self.frames = []
              self.iterators = {}
              self.stack = []
  • 132.
      class VM(object):
          def __init__(self, program):
              self.program = program
              self.pc = 0
              self.frames = []       # return addresses for GOSUB/RETURN
              self.iterators = {}    # FOR loop state, keyed by variable name
              self.stack = []        # operand stack
              self.variables = {}    # BASIC variables
  • 133.
      class VM(object):
          …
          def execute(self):
              while self.pc < len(self.program.instructions):
                  self.execute_bytecode(self.program.instructions[self.pc])
  • 134.
      class VM(object):
          …
          def execute_bytecode(self, code):
              raise NotImplementedError(code)
  • 135.
      class VM(object):
          ...
          def execute_bytecode(self, code):
              if isinstance(code, TYPE):
                  self.execute_TYPE(code)
              ...
              else:
                  raise NotImplementedError(code)
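The `isinstance` dispatch pattern above can be tried out in isolation. This is a minimal sketch, not the talk's VM: `MiniVM` and its two instruction classes are stand-ins, and the stack holds plain floats rather than wrapped objects. Each handler is responsible for advancing `pc`, which is what lets jump instructions set it freely.

```python
class Instruction(object):
    pass


class Number(Instruction):
    def __init__(self, value):
        self.value = value


class Add(Instruction):
    pass


class MiniVM(object):
    """Hypothetical two-instruction VM illustrating the dispatch loop."""

    def __init__(self, instructions):
        self.instructions = instructions
        self.pc = 0
        self.stack = []

    def execute(self):
        # Run until the program counter walks off the end of the program.
        while self.pc < len(self.instructions):
            self.execute_bytecode(self.instructions[self.pc])

    def execute_bytecode(self, code):
        # One isinstance test per instruction type, as on the slide.
        if isinstance(code, Number):
            self.execute_number(code)
        elif isinstance(code, Add):
            self.execute_add(code)
        else:
            raise NotImplementedError(code)

    def execute_number(self, code):
        self.stack.append(code.value)
        self.pc += 1

    def execute_add(self, code):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(left + right)
        self.pc += 1


vm = MiniVM([Number(2.0), Number(3.0), Add()])
vm.execute()
```

After `execute()` returns, `vm.stack` holds the single result of the addition.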
  • 138.
      class Number(Instruction):
          def __init__(self, value):
              self.value = value
      class String(Instruction):
          def __init__(self, value):
              self.value = value
  • 139.
      class Print(Instruction):
          def __init__(self, expressions, newline):
              self.expressions = expressions
              self.newline = newline
  • 140.
      class Call(Instruction):
          def __init__(self, function_name):
              self.function_name = function_name
  • 141.
      class Let(Instruction):
          def __init__(self, name):
              self.name = name
  • 143.
      class Add(Instruction):
          pass
      class Sub(Instruction):
          pass
      class Mul(Instruction):
          pass
      class Equal(Instruction):
          pass
      ...
  • 144.
      class GotoIfTrue(Instruction):
          def __init__(self, target):
              self.target = target
      class Goto(Instruction):
          def __init__(self, target, with_frame=False):
              self.target = target
              self.with_frame = with_frame
      class Return(Instruction):
          pass
  • 145.
      class Input(Instruction):
          def __init__(self, name):
              self.name = name
  • 146.
      class For(Instruction):
          def __init__(self, variable):
              self.variable = variable
      class Next(Instruction):
          def __init__(self, variable):
              self.variable = variable
  • 147.
      class Program(object):
          def __init__(self):
              self.instructions = []
              self.lineno2instruction = {}
          def __enter__(self):
              return self
          def __exit__(self, exc_type, exc_value, tb):
              # Resolve jump targets only if compilation finished without errors.
              if exc_type is None:
                  for i, instruction in enumerate(self.instructions):
                      instruction.finalize(self, i)
  • 148.
      def finalize(self, program, index):
          # Replace the BASIC line number with the instruction index it maps to.
          self.target = program.lineno2instruction[self.target]
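The `finalize` step exists because a `GOTO` may point at a line that has not been compiled yet, so jump targets can only be turned into instruction indices once the whole program is known. The sketch below reuses the `Program` context manager from the slides. `Nop` and this stripped-down `Goto` are hypothetical stand-ins for the real instruction classes.

```python
class Nop(object):
    """Hypothetical placeholder instruction with nothing to resolve."""

    def finalize(self, program, index):
        pass


class Goto(object):
    def __init__(self, target):
        self.target = target  # a BASIC line number until finalize() runs

    def finalize(self, program, index):
        # Swap the line number for the index of that line's first instruction.
        self.target = program.lineno2instruction[self.target]


class Program(object):
    def __init__(self):
        self.instructions = []
        self.lineno2instruction = {}

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        # Only resolve targets when the body completed without an exception.
        if exc_type is None:
            for i, instruction in enumerate(self.instructions):
                instruction.finalize(self, i)


with Program() as program:
    program.lineno2instruction[10] = len(program.instructions)
    program.instructions.append(Goto(30))  # forward reference to line 30
    program.lineno2instruction[20] = len(program.instructions)
    program.instructions.append(Nop())
    program.lineno2instruction[30] = len(program.instructions)
    program.instructions.append(Nop())

resolved = program.instructions[0].target
```

When the `with` block exits, `resolved` is the instruction index recorded for line 30 rather than the line number itself.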
  • 149.
      class Program(BaseBox):
          …
          def compile(self):
              with bytecode.Program() as program:
                  for line in self.lines:
                      line.compile(program)
              return program
  • 150.
      class Line(BaseBox):
          ...
          def compile(self, program):
              program.lineno2instruction[self.lineno] = len(program.instructions)
              for statement in self.statements:
                  statement.compile(program)
  • 152.
      class Print(Statement):
          def compile(self, program):
              for expression in self.expressions:
                  expression.compile(program)
              program.instructions.append(
                  bytecode.Print(
                      len(self.expressions),
                      self.newline
                  )
              )
  • 153.
      class Print(Statement):
          ...
          def compile(self, program):
              for expression in self.expressions:
                  expression.compile(program)
              program.instructions.append(
                  bytecode.Print(
                      len(self.expressions),
                      self.newline
                  )
              )
  • 154.
      class Let(Statement):
          ...
          def compile(self, program):
              self.value.compile(program)
              program.instructions.append(
                  bytecode.Let(self.name)
              )
  • 155.
      class Input(Statement):
          ...
          def compile(self, program):
              program.instructions.append(
                  bytecode.Input(self.variable)
              )
  • 156.
      class Goto(Statement):
          ...
          def compile(self, program):
              program.instructions.append(
                  bytecode.Goto(self.target)
              )
      class Gosub(Statement):
          ...
          def compile(self, program):
              program.instructions.append(
                  bytecode.Goto(
                      self.target,
                      with_frame=True
                  )
              )
      class Return(Statement):
          ...
          def compile(self, program):
              program.instructions.append(
                  bytecode.Return()
              )
  • 157.
      class For(Statement):
          ...
          def compile(self, program):
              self.start.compile(program)
              program.instructions.append(
                  bytecode.Let(self.variable)
              )
              self.end.compile(program)
              program.instructions.append(
                  bytecode.For(self.variable)
              )
  • 158.
      class WrappedObject(object):
          pass
      class WrappedString(WrappedObject):
          def __init__(self, value):
              self.value = value
      class WrappedFloat(WrappedObject):
          def __init__(self, value):
              self.value = value
  • 159.
      class VM(object):
          …
          def execute_number(self, code):
              self.stack.append(WrappedFloat(code.value))
              self.pc += 1
          def execute_string(self, code):
              self.stack.append(WrappedString(code.value))
              self.pc += 1
  • 160.
      import random
      class VM(object):
          …
          def execute_call(self, code):
              argument = self.stack.pop()
              if code.function_name == "TAB":
                  # Unwrap the float before using it as a repeat count.
                  self.stack.append(WrappedString(" " * int(argument.value)))
              elif code.function_name == "RND":
                  self.stack.append(WrappedFloat(random.random()))
              ...
              self.pc += 1
  • 161.
      class VM(object):
          …
          def execute_let(self, code):
              value = self.stack.pop()
              self.variables[code.name] = value
              self.pc += 1
          def execute_lookup(self, code):
              value = self.variables[code.name]
              self.stack.append(value)
              self.pc += 1
  • 162.
      class VM(object):
          …
          def execute_add(self, code):
              right = self.stack.pop()
              left = self.stack.pop()
              # Operands are wrapped floats, so add the underlying values.
              self.stack.append(WrappedFloat(left.value + right.value))
              self.pc += 1
  • 163.
      class VM(object):
          …
          def execute_goto_if_true(self, code):
              condition = self.stack.pop()
              if condition:
                  self.pc = code.target
              else:
                  self.pc += 1
  • 164.
      class VM(object):
          …
          def execute_goto(self, code):
              if code.with_frame:
                  # GOSUB: remember where RETURN should resume.
                  self.frames.append(self.pc + 1)
              self.pc = code.target
  • 165.
      class VM(object):
          …
          def execute_return(self, code):
              self.pc = self.frames.pop()
  • 166.
      class VM(object):
          …
          def execute_input(self, code):
              value = WrappedFloat(float(raw_input() or "0.0"))
              self.variables[code.name] = value
              self.pc += 1
  • 167.
      class VM(object):
          …
          def execute_for(self, code):
              self.pc += 1
              # Remember where the loop body starts and the (wrapped) end value.
              self.iterators[code.variable] = (
                  self.pc,
                  self.stack.pop()
              )
  • 168.
      class VM(object):
          …
          def execute_next(self, code):
              loop_begin, end = self.iterators[code.variable]
              current_value = self.variables[code.variable].value
              next_value = current_value + 1.0
              if next_value <= end.value:
                  self.variables[code.variable] = WrappedFloat(next_value)
                  self.pc = loop_begin
              else:
                  del self.iterators[code.variable]
                  self.pc += 1
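The interplay of `execute_for` and `execute_next` is easier to follow when run end to end. The following is a simplified sketch rather than the talk's VM: instructions are plain strings with hypothetical names, values are unwrapped floats, and the loop body is hard-coded to `S = S + I`. What it keeps from the slides is the bookkeeping: `FOR` stores the index of the first body instruction together with the end value, and `NEXT` either jumps back or drops the iterator.

```python
def run_loop(start, end):
    """Simulate FOR I = start TO end: S = S + I: NEXT I with explicit pc handling."""
    program = ["LET_I", "FOR", "BODY", "NEXT"]  # hypothetical instruction names
    variables = {"S": 0.0}
    iterators = {}
    pc = 0
    while pc < len(program):
        op = program[pc]
        if op == "LET_I":
            variables["I"] = float(start)
            pc += 1
        elif op == "FOR":
            pc += 1
            iterators["I"] = (pc, float(end))  # remember where the body starts
        elif op == "BODY":
            variables["S"] += variables["I"]
            pc += 1
        elif op == "NEXT":
            loop_begin, limit = iterators["I"]
            next_value = variables["I"] + 1.0
            if next_value <= limit:
                variables["I"] = next_value
                pc = loop_begin  # jump back to the first body instruction
            else:
                del iterators["I"]  # loop finished, fall through
                pc += 1
    return variables["S"]


total = run_loop(1, 3)
```

Because the end test only happens at `NEXT`, the body runs at least once even when the start value already exceeds the end value; that follows directly from the structure on the slides.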
  • 169. Entry Point
      def entry_point(argv):
          try:
              filename = argv[1]
          except IndexError:
              print("You must supply a filename")
              return 1
          content = read_file(filename)
          tokens = lex(content)
          ast = parse(tokens)
          program = ast.compile()
          vm = VM(program)
          vm.execute()
          return 0
  • 170. JIT (in PyPy)
      1. Identify "hot" loops
      2. Create trace inserting guards based on observed values
      3. Optimize trace
      4. Compile trace
      5. Execute machine code instead of interpreter
  • 171.
      from rpython.rlib.jit import JitDriver
      jitdriver = JitDriver(
          greens=["pc", "vm", "program", "frames", "iterators"],
          reds=["stack", "variables"]
      )
  • 172.
      class VM(object):
          …
          def execute(self):
              while self.pc < len(self.program.instructions):
                  jitdriver.jit_merge_point(
                      vm=self,
                      pc=self.pc,
                      …
                  )
                  self.execute_bytecode(self.program.instructions[self.pc])
  • 173. Benchmark
      10 N = 1
      20 IF N <= 10000 THEN 40
      30 END
      40 GOSUB 100
      50 IF R = 0 THEN 70
      60 PRINT "PRIME"; N
      70 N = N + 1: GOTO 20
      100 REM *** ISPRIME N -> R
      110 IF N <= 2 THEN 170
      120 FOR I = 2 TO (N - 1)
      130 A = N: B = I: GOSUB 200
      140 IF R <> 0 THEN 160
      150 R = 0: RETURN
      160 NEXT I
      170 R = 1: RETURN
      200 REM *** MOD A -> B -> R
      210 R = A - (B * INT(A / B))
      220 RETURN
  • 174.
      cbmbasic                          58.22s
      basic-c                            5.06s
      basic-c-jit                        2.34s
      Python implementation (CPython)    2.83s
      Python implementation (PyPy)       0.11s
      C implementation                   0.03s
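To put the timings above in relative terms, the ratios can be computed directly. The sketch below only does arithmetic on the numbers from the slide; the `speedup` helper is a hypothetical name, and it assumes each pair of rows was measured on the same benchmark and machine, which the slide does not state explicitly.

```python
# Timings in seconds, copied from the benchmark slide.
timings = {
    "cbmbasic": 58.22,
    "basic-c": 5.06,
    "basic-c-jit": 2.34,
    "Python implementation (CPython)": 2.83,
    "Python implementation (PyPy)": 0.11,
    "C implementation": 0.03,
}


def speedup(slow, fast):
    """Return how many times faster `fast` ran compared with `slow`."""
    return timings[slow] / timings[fast]


pypy_vs_cpython = speedup(
    "Python implementation (CPython)", "Python implementation (PyPy)"
)
jit_vs_nojit = speedup("basic-c", "basic-c-jit")
```

On these figures the Python implementation runs roughly 26 times faster under PyPy than under CPython, and the JIT build of basic-c is a little over twice as fast as the build without it.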
  • 175. Project milestones
      2008 Django support
      2010 First JIT-compiler
      2011 Compatibility with CPython 2.7
      2014 Basic ARM support
      CPython 3 support
      Improve compatibility with C extensions
      NumPyPy
      Multi-threading support
  • 178. PyPy STM
      10 loops, best of 3: 1.2 sec per loop
      10 loops, best of 3: 822 msec per loop
      from threading import Thread
      def count(n):
          while n > 0:
              n -= 1
      def run():
          t1 = Thread(target=count, args=(10000000,))
          t1.start()
          t2 = Thread(target=count, args=(10000000,))
          t2.start()
          t1.join(); t2.join()
      def count(n):
          while n > 0:
              n -= 1
      def run():
          count(10000000)
          count(10000000)
      Inside the Python GIL - David Beazley
  • 179. PyPy in the real world (1)
      High frequency trading platform for sports bets
        • low latency is a must
      PyPy used in production since 2012
      ~100 PyPy processes running 24/7
      up to 10x speedups
        • after careful tuning and optimizing for PyPy
  • 180. PyPy in the real world (2)
      Real-time online advertising auctions
        • tight latency requirement (<100ms)
        • high throughput (hundreds of thousands of requests per second)
      30% speedup
      "We run PyPy basically everywhere" (Julian Berman)
  • 181. PyPy in the real world (3)
      IoT on the cloud
      5-10x faster
      "We do not even run benchmarks on CPython because we just know that PyPy is way faster" (Tobias Oberstein)