Pony ORM is an Object-Relational Mapper implemented in Python. It uses an unusual approach for writing database queries using Python generators. Pony analyzes the abstract syntax tree of a generator and translates it to its SQL equivalent. The translation process consists of several non-trivial stages.
This talk was given at EuroPython 2014 and reveals the internal details of the translation process.
13. Query syntax comparison
SQLAlchemy
session.query(Product).filter(
    (Product.name.startswith('A') & (Product.image == None))
    | (extract('year', Product.added) < 2014))
Django
Product.objects.filter(
    Q(name__startswith='A', image__isnull=True)
    | Q(added__year__lt=2014))
Pony
select(p for p in Product
    if p.name.startswith('A') and p.image is None
    or p.added.year < 2014)
14. Query translation
select(p for p in Product
if p.name.startswith('A') and p.image is None
or p.added.year < 2014)
• Translation from the bytecode is fast
• The bytecode translation result is cached
• The Python generator object is used as a cache key
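The caching idea can be sketched in a few lines. This is only an illustration, not Pony's actual code: `translate_cached` and `slow_translate` are hypothetical names, and the sketch assumes the cache key is the generator's code object (`gi_code`), which stays the same every time the same generator expression is evaluated.

```python
translation_cache = {}   # hypothetical cache: code object -> translation result
translate_calls = 0      # counts how often the slow path runs


def slow_translate(code):
    """Stand-in for the real bytecode-to-SQL translation (hypothetical)."""
    global translate_calls
    translate_calls += 1
    return "<translated query from line %d>" % code.co_firstlineno


def translate_cached(gen):
    """Translate a generator once per distinct generator expression."""
    key = gen.gi_code  # shared by every generator built from the same expression
    if key not in translation_cache:
        translation_cache[key] = slow_translate(key)
    return translation_cache[key]


def make_query(items):
    # Each call creates a new generator object over the same code object.
    return (x for x in items if x > 1)


first = translate_cached(make_query([1, 2, 3]))
second = translate_cached(make_query([4, 5, 6]))
```

Both calls return the same cached result, and the slow translation runs only once even though two different generator objects were passed in.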
15. Building a query step by step
q = select(o for o in Order if o.customer.id == some_id)
q = q.filter(lambda o: o.state != 'DELIVERED')
q = q.filter(lambda o: len(o.items) > 2)
q = q.order_by(Order.date_created)
q = q[10:20]
SELECT "o"."id"
FROM "Order" "o"
LEFT JOIN "OrderItem" "orderitem-1"
ON "o"."id" = "orderitem-1"."order"
WHERE "o"."customer" = ?
AND "o"."state" <> 'DELIVERED'
GROUP BY "o"."id"
HAVING COUNT("orderitem-1"."ROWID") > 2
ORDER BY "o"."date_created"
LIMIT 10 OFFSET 10
17. Python generator to SQL translation
1. Decompile bytecode and restore AST
2. Translate AST to ‘abstract SQL’
3. Translate ‘abstract SQL’ to a specific SQL dialect
19. Bytecode decompilation
• Uses the Visitor pattern
• Methods of the visitor object correspond to the bytecode instructions
• Pony keeps fragments of the AST on a stack
• Each method either adds a new part of the AST or combines existing parts
21. Bytecode decompilation
(a + b.c) in x.y
LOAD_GLOBAL a
LOAD_FAST b
LOAD_ATTR c
BINARY_ADD
LOAD_FAST x
LOAD_ATTR y
COMPARE_OP in
22. Bytecode decompilation
(a + b.c) in x.y
LOAD_GLOBAL a
LOAD_FAST b
LOAD_ATTR c
BINARY_ADD
LOAD_FAST x
LOAD_ATTR y
COMPARE_OP in
Stack
23. Bytecode decompilation
(a + b.c) in x.y
> LOAD_GLOBAL a
LOAD_FAST b
LOAD_ATTR c
BINARY_ADD
LOAD_FAST x
LOAD_ATTR y
COMPARE_OP in
Stack
Name('a')
24. Bytecode decompilation
(a + b.c) in x.y
LOAD_GLOBAL a
> LOAD_FAST b
LOAD_ATTR c
BINARY_ADD
LOAD_FAST x
LOAD_ATTR y
COMPARE_OP in
Stack
Name('b')
Name('a')
25. Bytecode decompilation
(a + b.c) in x.y
LOAD_GLOBAL a
LOAD_FAST b
> LOAD_ATTR c
BINARY_ADD
LOAD_FAST x
LOAD_ATTR y
COMPARE_OP in
Stack
Getattr(Name('b'), 'c')
Name('a')
26. Bytecode decompilation
(a + b.c) in x.y
LOAD_GLOBAL a
LOAD_FAST b
LOAD_ATTR c
> BINARY_ADD
LOAD_FAST x
LOAD_ATTR y
COMPARE_OP in
Stack
Add(Name('a'),
Getattr(Name('b'), 'c'))
27. Bytecode decompilation
(a + b.c) in x.y
LOAD_GLOBAL a
LOAD_FAST b
LOAD_ATTR c
BINARY_ADD
> LOAD_FAST x
LOAD_ATTR y
COMPARE_OP in
Stack
Name('x')
Add(Name('a'),
Getattr(Name('b'), 'c'))
28. Bytecode decompilation
(a + b.c) in x.y
LOAD_GLOBAL a
LOAD_FAST b
LOAD_ATTR c
BINARY_ADD
LOAD_FAST x
> LOAD_ATTR y
COMPARE_OP in
Stack
Getattr(Name('x'), 'y')
Add(Name('a'),
Getattr(Name('b'), 'c'))
29. Bytecode decompilation
(a + b.c) in x.y
LOAD_GLOBAL a
LOAD_FAST b
LOAD_ATTR c
BINARY_ADD
LOAD_FAST x
LOAD_ATTR y
> COMPARE_OP in
Stack
Compare('in',
Add(…), Getattr(…))
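The walkthrough above can be replayed with a small stack machine. This is a minimal sketch, not Pony's decompiler: the instruction list is copied by hand from the slides rather than taken from the `dis` module (real opcode names differ between CPython versions), and plain tuples stand in for AST node classes. The class name `Decompiler` is an assumption made for the example.

```python
class Decompiler:
    """Replays bytecode instructions and rebuilds an AST on a stack."""

    def __init__(self):
        self.stack = []

    def run(self, instructions):
        for opname, arg in instructions:
            # Dispatch to the method named after the instruction.
            getattr(self, opname)(arg)
        return self.stack.pop()

    def LOAD_GLOBAL(self, name):
        # A name lookup adds a new AST fragment to the stack.
        self.stack.append(("Name", name))

    LOAD_FAST = LOAD_GLOBAL  # locals and globals look the same in the AST

    def LOAD_ATTR(self, attr):
        obj = self.stack.pop()
        self.stack.append(("Getattr", obj, attr))

    def BINARY_ADD(self, _arg):
        # Combine the two topmost fragments into one.
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(("Add", left, right))

    def COMPARE_OP(self, op):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(("Compare", op, left, right))


instructions = [
    ("LOAD_GLOBAL", "a"),
    ("LOAD_FAST", "b"),
    ("LOAD_ATTR", "c"),
    ("BINARY_ADD", None),
    ("LOAD_FAST", "x"),
    ("LOAD_ATTR", "y"),
    ("COMPARE_OP", "in"),
]

tree = Decompiler().run(instructions)
print(tree)
```

Each method either pushes a new fragment or pops existing fragments and pushes their combination, so a single tree remains on the stack after the last instruction.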
31. Python generator to SQL translation
1. Decompile bytecode and restore AST
2. Translate AST to ‘abstract SQL’
3. Translate ‘abstract SQL’ to a concrete SQL dialect
32. What SQL should it be translated to?
(a + b.c) in x.y
[AST diagram: in at the root, with two children: + (operands a and b.c) and x.y]
33. What SQL should it be translated to?
(a + b.c) in x.y
It depends on the variable types!
34. What SQL should it be translated to?
(a + b.c) in x.y
• If a and c are numbers and y is a collection:
(? + "b"."c") IN (SELECT …)
• If a and c are strings and y is a collection:
CONCAT(?, "b"."c") IN (SELECT …)
• If a, c and y are strings:
"x"."y" LIKE CONCAT('%', ?, "b"."c", '%')
35. AST to SQL Translation
(a + b.c) in x.y
• The result of translation depends on types
• If the translator analyzes node types by itself, the logic becomes too complex
• Pony uses Monads to keep it simple
36. A Monad
• Encapsulates the node translation logic
• Generates the result of translation: ‘the abstract SQL’
• Can combine itself with other monads
The translator delegates the logic of translation to monads
37. Monad types
• StringAttrMonad
• StringParamMonad
• StringExprMonad
• StringConstMonad
• DatetimeAttrMonad
• DatetimeParamMonad
• ObjectAttrMonad
• CmpMonad
• etc…
Each monad defines a set of allowed operations and can translate itself into a part of the resulting SQL query
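A rough sketch of how such monads might combine is shown here. The class names are simplified stand-ins (`StringMonad`, `NumericMonad`), not Pony's real classes, and the `ADD` node name is an assumption. Each object wraps a fragment of abstract SQL written as nested lists and decides for itself what `+` and `in` mean for its type.

```python
class Monad:
    """Wraps a fragment of abstract SQL (nested lists)."""

    def __init__(self, sql):
        self.sql = sql


class CmpMonad(Monad):
    """Result of a comparison; defines no further operations in this sketch."""


class NumericMonad(Monad):
    def __add__(self, other):
        if not isinstance(other, NumericMonad):
            raise TypeError("cannot add a number and a non-number")
        return NumericMonad(["ADD", self.sql, other.sql])


class StringMonad(Monad):
    def __add__(self, other):
        if not isinstance(other, StringMonad):
            raise TypeError("cannot concatenate a string and a non-string")
        return StringMonad(["CONCAT", self.sql, other.sql])

    def contains(self, item):
        """Translate `item in self` for two strings into a LIKE comparison."""
        if not isinstance(item, StringMonad):
            raise TypeError("only a string can be searched for in a string")
        # Splice an inner CONCAT so the resulting pattern stays flat.
        parts = item.sql[1:] if item.sql[0] == "CONCAT" else [item.sql]
        pattern = ["CONCAT", ["VALUE", "%"]] + parts + [["VALUE", "%"]]
        return CmpMonad(["LIKE", self.sql, pattern])


# (a + b.c) in x.y, with a, c and y all strings
a = StringMonad(["PARAM", "p1"])
b_c = StringMonad(["COLUMN", "t2", "c"])
x_y = StringMonad(["COLUMN", "t1", "y"])
like = x_y.contains(a + b_c)

# The same `+` with numeric operands produces a different fragment.
total = NumericMonad(["PARAM", "p1"]) + NumericMonad(["COLUMN", "t2", "c"])
```

The translator never inspects types itself: it only asks the monads to combine, and an unsupported combination fails with a `TypeError`.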
38. AST Translation
• Using the Visitor pattern
• Walk the tree in depth-first order
• Create monads when leaving each node
39-55. AST to SQL Translation
(a + b.c) in x.y
[Animated AST diagram: in at the root, with two children: + (operands a and b.c) and x.y. Across these slides the tree is walked depth-first, and a monad label is attached to each node as the walk leaves it, in this order:]
1. a: StringParamMonad
2. b: ObjectIterMonad
3. b.c: StringAttrMonad
4. a + b.c: StringExprMonad
5. x: ObjectIterMonad
6. x.y: StringAttrMonad
7. (a + b.c) in x.y: CmpMonad
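The depth-first walk can be reproduced with Python's standard `ast` module. This is a sketch under stated assumptions: it parses source text instead of decompiling bytecode, the `NAME_MONADS` table is hypothetical type information, and attributes are assumed to be string columns, so the monad names are hard-coded labels rather than real objects.

```python
import ast

# Hypothetical type information for the free names in the expression.
NAME_MONADS = {
    "a": "StringParamMonad",
    "b": "ObjectIterMonad",
    "x": "ObjectIterMonad",
}


class MonadLabeler(ast.NodeVisitor):
    """Walks the AST depth-first and labels each node when leaving it."""

    def __init__(self):
        self.created = []  # (source fragment, monad name) in creation order

    def _leave(self, text, monad):
        self.created.append((text, monad))
        return text

    def visit_Name(self, node):
        return self._leave(node.id, NAME_MONADS[node.id])

    def visit_Attribute(self, node):
        owner = self.visit(node.value)  # children first
        return self._leave("%s.%s" % (owner, node.attr), "StringAttrMonad")

    def visit_BinOp(self, node):
        left = self.visit(node.left)
        right = self.visit(node.right)
        return self._leave("%s + %s" % (left, right), "StringExprMonad")

    def visit_Compare(self, node):
        left = self.visit(node.left)
        right = self.visit(node.comparators[0])
        return self._leave("(%s) in %s" % (left, right), "CmpMonad")


labeler = MonadLabeler()
labeler.visit(ast.parse("(a + b.c) in x.y", mode="eval").body)
for text, monad in labeler.created:
    print("%-18s %s" % (text, monad))
```

Because every method visits its children before recording its own label, the labels appear in the same order as on the slides, ending with the comparison at the root.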
56. Abstract SQL
(a + b.c) in x.y
['LIKE', ['COLUMN', 't1', 'y'],
['CONCAT',
['VALUE', '%'], ['PARAM', 'p1'],
['COLUMN', 't2', 'c'], ['VALUE', '%']
]
]
Abstract SQL makes it possible to set aside the differences between SQL dialects until the final step
57. Python generator to SQL translation
1. Decompile bytecode and restore AST
2. Translate AST to ‘abstract SQL’
3. Translate ‘abstract SQL’ to a specific SQL dialect
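The last step can be illustrated with a tiny renderer for the nested-list form. This is a simplified sketch, not Pony's SQL builder: the `render` function and the two dialect names are assumptions, and only the node types needed for the example are handled. It relies on two well-known differences: SQLite concatenates strings with `||` and takes `?` placeholders, while MySQL uses `CONCAT()`, backtick-quoted identifiers, and (in common Python drivers) `%s` placeholders.

```python
def render(node, dialect):
    """Render an abstract SQL node (nested lists) for one dialect."""
    kind, args = node[0], node[1:]
    if kind == "COLUMN":
        quote = "`" if dialect == "mysql" else '"'
        return ".".join(quote + name + quote for name in args)
    if kind == "VALUE":
        # String literal; embedded single quotes are doubled.
        return "'" + args[0].replace("'", "''") + "'"
    if kind == "PARAM":
        return "%s" if dialect == "mysql" else "?"
    if kind == "CONCAT":
        parts = [render(arg, dialect) for arg in args]
        if dialect == "mysql":
            return "CONCAT(" + ", ".join(parts) + ")"
        return "(" + " || ".join(parts) + ")"
    if kind == "LIKE":
        return render(args[0], dialect) + " LIKE " + render(args[1], dialect)
    raise ValueError("unknown node: %r" % kind)


abstract_sql = ["LIKE", ["COLUMN", "t1", "y"],
                ["CONCAT",
                 ["VALUE", "%"], ["PARAM", "p1"],
                 ["COLUMN", "t2", "c"], ["VALUE", "%"]]]

print(render(abstract_sql, "sqlite"))
print(render(abstract_sql, "mysql"))
```

The same abstract tree yields two different SQL strings, so everything before this step can stay independent of the target database.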
59. Other Pony ORM features
• Identity Map
• Automatic query optimization
• N+1 Query Problem solution
• Optimistic transactions
• Online ER Diagram Editor
60. Django ORM
s1 = Student.objects.get(pk=123)
print s1.name, s1.group.id
s2 = Student.objects.get(pk=456)
print s2.name, s2.group.id
• How many SQL queries will be executed?
• How many objects will be created?
71. Solution for the N+1 Query Problem
orders = select(o for o in Order if o.total_price > 1000) \
    .order_by(desc(Order.id)).page(1, pagesize=5)
for o in orders:
    print o.total_price, o.customer.name
SELECT o.id, o.total_price, o.customer_id,...
FROM "Order" o
WHERE o.total_price > 1000
ORDER BY o.id DESC
LIMIT 5
72-73. Solution for the N+1 Query Problem
[Diagram: Order 1, Order 3, Order 4, Order 7 and Order 9, with their customers Customer 1, Customer 4 and Customer 7. Slide 73 adds the caption: One SQL query]
74. Solution for the N+1 Query Problem
orders = select(o for o in Order if o.total_price > 1000) \
    .order_by(desc(Order.id)).page(1, pagesize=5)
for o in orders:
    print o.total_price, o.customer.name
SELECT o.id, o.total_price, o.customer_id,...
FROM "Order" o
WHERE o.total_price > 1000
ORDER BY o.id DESC
LIMIT 5
SELECT c.id, c.name, …
FROM "Customer" c
WHERE c.id IN (?, ?, ?)
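The two-query pattern can be tried directly with the standard `sqlite3` module. This sketch only imitates the SQL shown on the slides and does not use Pony; the table contents are made up to mirror the diagram (orders 1, 3, 4, 7, 9 and customers 1, 4, 7), and the `run` helper is a hypothetical wrapper that records every statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE "Order" (id INTEGER PRIMARY KEY, total_price REAL,
                          customer_id INTEGER REFERENCES Customer(id));
    INSERT INTO Customer VALUES (1, 'Ann'), (4, 'Bob'), (7, 'Eve');
    INSERT INTO "Order" VALUES (1, 1500, 1), (3, 2000, 4), (4, 1200, 1),
                               (7, 3000, 7), (9, 1100, 4), (10, 50, 7);
""")

statements = []  # every SQL statement issued after setup


def run(sql, params=()):
    """Execute a statement and remember that it was sent."""
    statements.append(sql)
    return conn.execute(sql, params).fetchall()


# Query 1: load one page of orders.
orders = run('SELECT id, total_price, customer_id FROM "Order" '
             "WHERE total_price > 1000 ORDER BY id DESC LIMIT 5")

# Query 2: load every customer referenced by those orders in one go.
customer_ids = sorted({customer_id for _, _, customer_id in orders})
placeholders = ", ".join("?" * len(customer_ids))
customers = dict(run(
    "SELECT id, name FROM Customer WHERE id IN (%s)" % placeholders,
    customer_ids))

for order_id, total_price, customer_id in orders:
    print(total_price, customers[customer_id])
```

Five orders are printed with their customer names after only two statements, instead of one extra query per order.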
75. Automatic query optimization
select(c for c in Customer
if sum(c.orders.total_price) > 1000)
SELECT "c"."id", "c"."email", "c"."password", "c"."name",
"c"."country", "c"."address"
FROM "Customer" "c"
WHERE (
SELECT coalesce(SUM("order-1"."total_price"), 0)
FROM "Order" "order-1"
WHERE "c"."id" = "order-1"."customer"
) > 1000
SELECT "c"."id"
FROM "Customer" "c"
LEFT JOIN "Order" "order-1"
ON "c"."id" = "order-1"."customer"
GROUP BY "c"."id"
HAVING coalesce(SUM("order-1"."total_price"), 0) > 1000
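The slide shows two SQL statements for the same Pony query: one with a correlated subquery and one with a LEFT JOIN and GROUP BY. A quick check with the standard `sqlite3` module suggests they select the same customers. The schema and rows here are invented for the test, and the column list is shortened to `id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Customer" ("id" INTEGER PRIMARY KEY, "name" TEXT);
    CREATE TABLE "Order" ("id" INTEGER PRIMARY KEY, "total_price" REAL,
                          "customer" INTEGER REFERENCES "Customer"("id"));
    INSERT INTO "Customer" VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Eve');
    INSERT INTO "Order" VALUES (1, 700, 1), (2, 600, 1), (3, 900, 2);
""")

# Form 1: a correlated subquery evaluated per customer.
subquery_sql = """
    SELECT "c"."id" FROM "Customer" "c"
    WHERE (
        SELECT coalesce(SUM("order-1"."total_price"), 0)
        FROM "Order" "order-1"
        WHERE "c"."id" = "order-1"."customer"
    ) > 1000
"""

# Form 2: a join with grouping, as in the second statement on the slide.
join_sql = """
    SELECT "c"."id" FROM "Customer" "c"
    LEFT JOIN "Order" "order-1" ON "c"."id" = "order-1"."customer"
    GROUP BY "c"."id"
    HAVING coalesce(SUM("order-1"."total_price"), 0) > 1000
"""

subquery_ids = sorted(row[0] for row in conn.execute(subquery_sql))
join_ids = sorted(row[0] for row in conn.execute(join_sql))
print(subquery_ids, join_ids)
```

Only customer 1 has orders totalling more than 1000, and both forms agree, including for the customer with no orders at all.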
80. Transactions
db_session
• Pony tracks which objects were changed
• No need to call save()
• Pony saves all updated objects in a single transaction automatically on leaving the db_session scope
82. Optimistic Locking
• Pony tracks which attributes were read and updated
• If an object wasn’t locked using the for_update method, Pony uses optimistic locking automatically
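The general idea of optimistic locking can be shown with plain `sqlite3`. This sketches the technique only; the slides do not show how Pony builds its own UPDATE statements. The `Product` table and the `update_quantity` helper are hypothetical. The update succeeds only if the value read earlier is still in the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (id INTEGER PRIMARY KEY, name TEXT, quantity INTEGER)")
conn.execute("INSERT INTO Product VALUES (1, 'Apple', 10)")
conn.commit()


def update_quantity(conn, product_id, seen_quantity, new_quantity):
    """Write new_quantity only if the row still holds the value we read."""
    cursor = conn.execute(
        "UPDATE Product SET quantity = ? WHERE id = ? AND quantity = ?",
        (new_quantity, product_id, seen_quantity))
    conn.commit()
    return cursor.rowcount == 1  # False means the row changed in the meantime


# Two clients read the same value, then both try to write.
seen_by_first = seen_by_second = conn.execute(
    "SELECT quantity FROM Product WHERE id = 1").fetchone()[0]

first_ok = update_quantity(conn, 1, seen_by_first, seen_by_first - 1)
second_ok = update_quantity(conn, 1, seen_by_second, seen_by_second - 3)
final_quantity = conn.execute(
    "SELECT quantity FROM Product WHERE id = 1").fetchone()[0]
print(first_ok, second_ok, final_quantity)
```

No row is locked while the clients work; the conflict is detected at write time, and the second writer has to re-read and retry.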