Recursive Query Throwdown
in MySQL 8
BILL KARWIN
PERCONA LIVE OPEN SOURCE DATABASE CONFERENCE 2017
Bill Karwin
Software developer, consultant, trainer
Using MySQL since 2000
Senior Database Architect at SchoolMessenger
Author of SQL Antipatterns: Avoiding the Pitfalls of Database Programming
Oracle ACE Director
How to Query a Tree?
Hierarchical data
§ Organization charts
§ Categories and sub-categories
§ Parts explosion
§ Threaded discussions
https://commons.wikimedia.org/wiki/File:Staff_Organisation_Diagram,_1896.jpg
Example: Threaded Comments
Adjacency List Example Data
comment_id  parent_id  author  comment
1           NULL       Fran    What’s the cause of this bug?
2           1          Ollie   I think it’s a null pointer.
3           2          Fran    No, I checked for that.
4           1          Kukla   We need to check valid input.
5           4          Ollie   Yes, that’s a bug.
6           4          Fran    Yes, please add a check
7           6          Kukla   That fixed it.
Can’t Easily Query Deep Trees
SELECT * FROM Comments c1
LEFT JOIN Comments c2 ON (c2.parent_id = c1.comment_id)
LEFT JOIN Comments c3 ON (c3.parent_id = c2.comment_id)
LEFT JOIN Comments c4 ON (c4.parent_id = c3.comment_id)
LEFT JOIN Comments c5 ON (c5.parent_id = c4.comment_id)
LEFT JOIN Comments c6 ON (c6.parent_id = c5.comment_id)
LEFT JOIN Comments c7 ON (c7.parent_id = c6.comment_id)
LEFT JOIN Comments c8 ON (c8.parent_id = c7.comment_id)
LEFT JOIN Comments c9 ON (c9.parent_id = c8.comment_id)
LEFT JOIN Comments c10 ON (c10.parent_id = c9.comment_id)
...
MySQL Workarounds
MySQL lacked support for recursive queries, so workarounds were needed
These are all denormalized designs, and most lack referential integrity
§Path enumeration
§Nested sets
§Closure table
Path Enumeration Example Data
comment_id  path      author  comment
1           1/        Fran    What’s the cause of this bug?
2           1/2/      Ollie   I think it’s a null pointer.
3           1/2/3/    Fran    No, I checked for that.
4           1/4/      Kukla   We need to check valid input.
5           1/4/5/    Ollie   Yes, that’s a bug.
6           1/4/6/    Fran    Yes, please add a check
7           1/4/6/7/  Kukla   That fixed it.
Path Enumeration Example Queries
Query ancestors of comment #7:
SELECT * FROM Comments
WHERE '1/4/6/7/' LIKE CONCAT(path, '%');
Query descendants of comment #4:
SELECT * FROM Comments
WHERE path LIKE '1/4/%';
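The two path enumeration queries above can be tried without a MySQL server. The following is a minimal sketch that assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, so `CONCAT(path, '%')` is written as `path || '%'`. The helper names `load`, `ancestors`, and `descendants` are illustrative and not part of the talk.

```python
import sqlite3

# Rows copied from the Path Enumeration example data (comment text omitted).
ROWS = [
    (1, "1/", "Fran"),
    (2, "1/2/", "Ollie"),
    (3, "1/2/3/", "Fran"),
    (4, "1/4/", "Kukla"),
    (5, "1/4/5/", "Ollie"),
    (6, "1/4/6/", "Fran"),
    (7, "1/4/6/7/", "Kukla"),
]


def load():
    """Create an in-memory Comments table holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Comments ("
        "comment_id INTEGER PRIMARY KEY, path TEXT, author TEXT)"
    )
    conn.executemany("INSERT INTO Comments VALUES (?, ?, ?)", ROWS)
    return conn


def ancestors(conn, path):
    """Rows whose path is a prefix of the given path (the node itself included)."""
    sql = "SELECT comment_id FROM Comments WHERE ? LIKE path || '%' ORDER BY path"
    return [row[0] for row in conn.execute(sql, (path,))]


def descendants(conn, path):
    """Rows whose path starts with the given path (the node itself included)."""
    sql = "SELECT comment_id FROM Comments WHERE path LIKE ? || '%' ORDER BY path"
    return [row[0] for row in conn.execute(sql, (path,))]


conn = load()
print(ancestors(conn, "1/4/6/7/"))  # [1, 4, 6, 7]
print(descendants(conn, "1/4/"))    # [4, 5, 6, 7]
```

The trailing slash in every path matters: without it, a prefix such as `1/4` would also match a hypothetical sibling path such as `1/40/`.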
Path Enumeration Pros and Cons
Pros:
§Single non-recursive query to get a tree or a subtree
Cons:
§Complex updates to add or remove a node
§Numbers are stored in a string—no referential integrity
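To illustrate why adding a node counts as a complex update here, this sketch inserts a reply in two steps: first the row, to learn its generated id, then an update that builds the path from the parent's path. It assumes SQLite through Python's sqlite3 module rather than MySQL, and `add_reply` is a hypothetical helper, not something from the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments ("
    "comment_id INTEGER PRIMARY KEY, path TEXT, author TEXT)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?)",
    [(1, "1/", "Fran"), (4, "1/4/", "Kukla")],
)


def add_reply(conn, parent_id, author):
    """Insert a reply under parent_id and return the new comment_id."""
    with conn:  # both statements commit together or not at all
        # Step 1: insert with no path, because the id is not known yet.
        cur = conn.execute(
            "INSERT INTO Comments (path, author) VALUES (NULL, ?)", (author,)
        )
        new_id = cur.lastrowid
        # Step 2: path = parent's path + new id + '/'.
        conn.execute(
            "UPDATE Comments "
            "SET path = (SELECT path FROM Comments WHERE comment_id = ?) "
            "           || ? || '/' "
            "WHERE comment_id = ?",
            (parent_id, new_id, new_id),
        )
    return new_id


new_id = add_reply(conn, 4, "Ollie")
new_path = conn.execute(
    "SELECT path FROM Comments WHERE comment_id = ?", (new_id,)
).fetchone()[0]
print(new_id, new_path)  # 5 1/4/5/
```

Nothing in the schema stops a path from naming a comment that does not exist, which is the referential integrity gap listed in the cons.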
Nested Sets
Each comment encodes its descendants using two numbers:
§ A comment’s left number is less than all numbers used by the comment’s descendants.
§ A comment’s right number is greater than all numbers used by the comment’s descendants.
§ A comment’s numbers fall between the left and right numbers of each of the comment’s ancestors.
References:
§ “Recursive Hierarchies: The Relational Taboo!” Michael J. Kamfonas,
Relational Journal, Oct/Nov 1992
§ “Trees and Hierarchies in SQL For Smarties,” Joe Celko, 2004
§ “Managing Hierarchical Data in MySQL,” Mike Hillyer, 2005
Nested Sets Example
Nested Sets Example Data
comment_id  nsleft  nsright  author  comment
1           1       14       Fran    What’s the cause of this bug?
2           2       5        Ollie   I think it’s a null pointer.
3           3       4        Fran    No, I checked for that.
4           6       13       Kukla   We need to check valid input.
5           7       8        Ollie   Yes, that’s a bug.
6           9       12       Fran    Yes, please add a check
7           10      11       Kukla   That fixed it.
Nested Sets Example Queries
Query ancestors of comment #7:
SELECT ancestor.* FROM Comments child
JOIN Comments ancestor
ON child.nsleft BETWEEN ancestor.nsleft AND ancestor.nsright
WHERE child.comment_id = 7;
Query subtree under comment #4:
SELECT descendant.* FROM Comments parent
JOIN Comments descendant
ON descendant.nsleft BETWEEN parent.nsleft AND parent.nsright
WHERE parent.comment_id = 4;
Nested Sets Pros and Cons
Pros:
§Single non-recursive query to get a tree or a subtree
Cons:
§Complex updates to add or remove a node
§Numbers are not foreign keys—no referential integrity
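The cost of adding a node can be made concrete. This sketch adds one leaf under comment #5 and has to renumber most of the existing rows to make room. It assumes SQLite through Python's sqlite3 module as a stand-in for MySQL, and `add_leaf` is a hypothetical helper showing one common way to open the gap, not the talk's own code.

```python
import sqlite3

# (comment_id, nsleft, nsright) copied from the Nested Sets example data.
ROWS = [
    (1, 1, 14), (2, 2, 5), (3, 3, 4), (4, 6, 13),
    (5, 7, 8), (6, 9, 12), (7, 10, 11),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments ("
    "comment_id INTEGER PRIMARY KEY, nsleft INTEGER, nsright INTEGER)"
)
conn.executemany("INSERT INTO Comments VALUES (?, ?, ?)", ROWS)


def add_leaf(conn, parent_id):
    """Add a childless node as the last child of parent_id; return its id."""
    with conn:
        (parent_right,) = conn.execute(
            "SELECT nsright FROM Comments WHERE comment_id = ?", (parent_id,)
        ).fetchone()
        # Shift every number at or beyond the parent's right edge by 2
        # to open a gap for the new node's (left, right) pair.
        conn.execute(
            "UPDATE Comments SET nsright = nsright + 2 WHERE nsright >= ?",
            (parent_right,),
        )
        conn.execute(
            "UPDATE Comments SET nsleft = nsleft + 2 WHERE nsleft > ?",
            (parent_right,),
        )
        cur = conn.execute(
            "INSERT INTO Comments (nsleft, nsright) VALUES (?, ?)",
            (parent_right, parent_right + 1),
        )
    return cur.lastrowid


new_id = add_leaf(conn, 5)
rows = {
    cid: (left, right)
    for cid, left, right in conn.execute("SELECT * FROM Comments")
}
print(rows[new_id])  # (8, 9)
print(rows[5])       # (7, 10)
print(rows[1])       # (1, 16)
```

In this run, five of the seven existing rows (comments 1, 4, 5, 6, and 7) were rewritten to insert a single leaf.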
Closure Table
Many-to-many table
Stores every path from each node to each of its descendants
A node even connects to itself
CREATE TABLE Closure (
ancestor INT NOT NULL,
descendant INT NOT NULL,
length INT NOT NULL,
PRIMARY KEY (ancestor, descendant),
FOREIGN KEY(ancestor) REFERENCES Comments(comment_id),
FOREIGN KEY(descendant) REFERENCES Comments(comment_id)
);
Closure Table Example
Closure Table Example Data
comment_id  author  comment
1           Fran    What’s the cause of this bug?
2           Ollie   I think it’s a null pointer.
3           Fran    No, I checked for that.
4           Kukla   We need to check valid input.
5           Ollie   Yes, that’s a bug.
6           Fran    Yes, please add a check
7           Kukla   That fixed it.
ancestor  descendant  length
1         1           0
1         2           1
1         3           2
1         4           1
1         5           2
1         6           2
1         7           3
2         2           0
2         3           1
3         3           0
4         4           0
4         5           1
4         6           1
4         7           2
5         5           0
6         6           0
6         7           1
7         7           0
Closure Table Example Queries
Query ancestors of comment #7:
SELECT c.* FROM Comments c
JOIN Closure t
ON (c.comment_id = t.ancestor)
WHERE t.descendant = 7;
Query subtree under comment #4:
SELECT c.* FROM Comments c
JOIN Closure t
ON (c.comment_id = t.descendant)
WHERE t.ancestor = 4;
Closure Table Pros and Cons
Pros:
§Single non-recursive query to get a tree or a subtree
§Referential integrity!
Cons:
§Extra table is required
§Hierarchy is stored redundantly, too easy to mess up
§Lots of joins to do most kinds of queries
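Adding a node to a closure table means copying every path that ends at the parent and extending each by one step, plus the new node's self-reference. The sketch below shows that for a new comment #8 under comment #7. It assumes SQLite through Python's sqlite3 module as a stand-in for MySQL, leaves out the Comments table and its foreign keys for brevity, and uses hypothetical helpers `add_leaf` and `subtree`.

```python
import sqlite3

# (ancestor, descendant, length) copied from the Closure Table example data.
PATHS = [
    (1, 1, 0), (1, 2, 1), (1, 3, 2), (1, 4, 1), (1, 5, 2), (1, 6, 2),
    (1, 7, 3), (2, 2, 0), (2, 3, 1), (3, 3, 0), (4, 4, 0), (4, 5, 1),
    (4, 6, 1), (4, 7, 2), (5, 5, 0), (6, 6, 0), (6, 7, 1), (7, 7, 0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Closure ("
    "ancestor INTEGER NOT NULL, descendant INTEGER NOT NULL, "
    "length INTEGER NOT NULL, PRIMARY KEY (ancestor, descendant))"
)
conn.executemany("INSERT INTO Closure VALUES (?, ?, ?)", PATHS)


def add_leaf(conn, parent_id, new_id):
    """Register new_id as a child of parent_id in the closure table."""
    with conn:
        conn.execute(
            """
            INSERT INTO Closure (ancestor, descendant, length)
            -- one new path per ancestor of the parent (parent included)
            SELECT ancestor, ?, length + 1 FROM Closure WHERE descendant = ?
            UNION ALL
            -- the node's path to itself
            SELECT ?, ?, 0
            """,
            (new_id, parent_id, new_id, new_id),
        )


def subtree(conn, root_id):
    """Ids of root_id and all of its descendants, nearest first."""
    sql = (
        "SELECT descendant FROM Closure WHERE ancestor = ? "
        "ORDER BY length, descendant"
    )
    return [row[0] for row in conn.execute(sql, (root_id,))]


add_leaf(conn, 7, 8)
print(subtree(conn, 4))  # [4, 5, 6, 7, 8]
```

One new leaf at depth 4 adds five rows here, which is the redundancy the cons list refers to.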
ANSI SQL Recursive CTE
WITHer Recursive Queries in MySQL?
SQL vendors gradually implemented SQL-99 WITH syntax:
§ IBM DB2 UDB 8 (Dec. 2002)
§ Microsoft SQL Server 2005 (Oct. 2005)
§ Sybase SQL Anywhere 11 (Aug. 2008)
§ Firebird 2.1 (Sep. 2008)
§ PostgreSQL 8.4 (Jul. 2009)
§ Oracle 11g release 2 (Sep. 2009)
§ Teradata (date and version of support unknown, at least 2009)
§ HSQLDB 2.3 (Jul. 2013)
§ SQLite 3.8.3.1 (Feb. 2014)
§ H2 (date and version unknown)
https://www.percona.com/blog/2014/02/11/wither-recursive-queries/
ANSI SQL Recursive Common Table Expression
WITH RECURSIVE cte_name (col_name, col_name, col_name) AS
(
subquery base case
UNION ALL
subquery referencing cte_name
)
SELECT ... FROM cte_name ...
https://dev.mysql.com/doc/refman/8.0/en/with.html
Generating a Series of Numbers
WITH RECURSIVE MySeries (n) AS
(
SELECT 1 AS n
UNION ALL
SELECT 1+n FROM MySeries WHERE n < 10
)
SELECT * FROM MySeries;
+------+
| n |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
+------+
Generating a Series of Dates
WITH RECURSIVE MyDates (d) AS
(
SELECT CURRENT_DATE() AS d
UNION ALL
SELECT d + INTERVAL 1 DAY FROM MyDates
WHERE d < CURRENT_DATE() + INTERVAL 7 DAY
)
SELECT * FROM MyDates;
+------------+
| d |
+------------+
| 2017-04-24 |
| 2017-04-25 |
| 2017-04-26 |
| 2017-04-27 |
| 2017-04-28 |
| 2017-04-29 |
| 2017-04-30 |
| 2017-05-01 |
+------------+
Query ancestors of comment #7
WITH RECURSIVE CommentTree (comment_id, parent_id, author, comment,
depth) AS
(
SELECT comment_id, parent_id, author, comment, 0 AS depth
FROM Comments
WHERE comment_id = 7
UNION ALL
SELECT c.comment_id, c.parent_id, c.author, c.comment, ct.depth+1
FROM CommentTree ct
JOIN Comments c ON (ct.parent_id = c.comment_id)
)
SELECT * FROM CommentTree;
Query subtree under comment #4
WITH RECURSIVE CommentTree (comment_id, parent_id, author, comment,
depth) AS
(
SELECT comment_id, parent_id, author, comment, 0 AS depth
FROM Comments
WHERE comment_id = 4
UNION ALL
SELECT c.comment_id, c.parent_id, c.author, c.comment, ct.depth+1
FROM CommentTree ct
JOIN Comments c ON (ct.comment_id = c.parent_id)
)
SELECT * FROM CommentTree;
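Both recursive queries can be run against the adjacency list sample data without a MySQL 8 server. This is a minimal sketch assuming SQLite through Python's sqlite3 module, which also accepts `WITH RECURSIVE`. The comment text column is left out to keep it short, an `ORDER BY` is added so the output order is defined, and the `ANCESTORS` and `SUBTREE` names are illustrative.

```python
import sqlite3

# (comment_id, parent_id, author) copied from the Adjacency List example data.
ROWS = [
    (1, None, "Fran"), (2, 1, "Ollie"), (3, 2, "Fran"), (4, 1, "Kukla"),
    (5, 4, "Ollie"), (6, 4, "Fran"), (7, 6, "Kukla"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments ("
    "comment_id INTEGER PRIMARY KEY, "
    "parent_id INTEGER REFERENCES Comments(comment_id), "
    "author TEXT)"
)
conn.executemany("INSERT INTO Comments VALUES (?, ?, ?)", ROWS)

# Walk upward: join each found row's parent_id to the next comment_id.
ANCESTORS = """
WITH RECURSIVE CommentTree (comment_id, parent_id, depth) AS (
  SELECT comment_id, parent_id, 0 FROM Comments WHERE comment_id = ?
  UNION ALL
  SELECT c.comment_id, c.parent_id, ct.depth + 1
  FROM CommentTree ct JOIN Comments c ON ct.parent_id = c.comment_id
)
SELECT comment_id, depth FROM CommentTree ORDER BY depth
"""

# Walk downward: join each found row's comment_id to its children's parent_id.
SUBTREE = """
WITH RECURSIVE CommentTree (comment_id, parent_id, depth) AS (
  SELECT comment_id, parent_id, 0 FROM Comments WHERE comment_id = ?
  UNION ALL
  SELECT c.comment_id, c.parent_id, ct.depth + 1
  FROM CommentTree ct JOIN Comments c ON ct.comment_id = c.parent_id
)
SELECT comment_id, depth FROM CommentTree ORDER BY depth, comment_id
"""

ancestors_of_7 = conn.execute(ANCESTORS, (7,)).fetchall()
subtree_of_4 = conn.execute(SUBTREE, (4,)).fetchall()
print(ancestors_of_7)  # [(7, 0), (6, 1), (4, 2), (1, 3)]
print(subtree_of_4)    # [(4, 0), (5, 1), (6, 1), (7, 2)]
```

The only difference between the two queries is the direction of the join condition in the recursive part.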
Recursive CTE Pros and Cons
Pros:
§ ANSI SQL-99 Standard
§ Compatible with other SQL implementations
§ Works with Adjacency List (single source of authority)
§ Referential integrity!
Cons:
§ Not compatible with earlier MySQL versions
§ Use of materialized temporary tables may cause performance problems
MySQL CTE Implementation: 💯
Thanks to @MarkusWinand for his preview analysis based on 8.0.1-dmr
http://modern-sql.com/feature/with
Big Hierarchies
ITIS: Sample Hierarchical Data
Integrated Taxonomic Information System
(https://www.itis.gov/)
§Biological database of species of animals, plants, fungi
§One big tree of 544,954 nodes
§Data comes in adjacency list & path enumeration format
§I converted to closure table for query tests
ITIS Data Model
mysql> select * from longnames
where completename = 'Eschscholzia californica';
+--------+---------------------------+
| tsn | completename |
+--------+---------------------------+
| 18956 | Eschscholzia californica |
+--------+---------------------------+
mysql> select * from hierarchy where TSN = '18956'\G
TSN: 18956
Parent_TSN: 18954
level: 11
ChildrenCount: 8
hierarchy_string: 202422-954898-846494-954900-846496-846504-18063-846547-18409-18880-18954-18956
Indexes
mysql> ALTER TABLE hierarchy ADD KEY (tsn, parent_tsn);
Query OK, 0 rows affected (1.30 sec)
Breadcrumbs Query
WITH RECURSIVE taxonomy AS
(
SELECT base.tsn, base.parent_tsn, 0 as depth
FROM hierarchy base
WHERE tsn = '18956'
UNION ALL
SELECT next.tsn, next.parent_tsn, t.depth+1
FROM hierarchy next JOIN taxonomy t
WHERE t.parent_tsn = next.tsn
)
SELECT * FROM taxonomy JOIN longnames USING (tsn)
ORDER BY depth DESC;
Breadcrumbs Query Result
+--------+------------+-------+--------------------------+
| tsn | parent_tsn | depth | completename |
+--------+------------+-------+--------------------------+
| 202422 | 0 | 11 | Plantae |
| 954898 | 202422 | 10 | Viridiplantae |
| 846494 | 954898 | 9 | Streptophyta |
| 954900 | 846494 | 8 | Embryophyta |
| 846496 | 954900 | 7 | Tracheophyta |
| 846504 | 846496 | 6 | Spermatophytina |
| 18063 | 846504 | 5 | Magnoliopsida |
| 846547 | 18063 | 4 | Ranunculanae |
| 18409 | 846547 | 3 | Ranunculales |
| 18880 | 18409 | 2 | Papaveraceae |
| 18954 | 18880 | 1 | Eschscholzia |
| 18956 | 18954 | 0 | Eschscholzia californica |
+--------+------------+-------+--------------------------+
12 rows in set (0.00 sec)
Breadcrumbs Query EXPLAIN Plan
§New note in Extra: "Recursive"
§Using index (covering index) for both base case and recursive case
§I can eliminate the filesort if I allow natural order (base case first)
§No "Using Temporary"? Not so fast…
+----+-------------+------------+--------+---------------+---------+---------+--------------+------+----------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+--------+---------------+---------+---------+--------------+------+----------+-----------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 4 | 100.00 | Using where; Using filesort |
| 1 | PRIMARY | longnames | eq_ref | PRIMARY,tsn | PRIMARY | 4 | taxonomy.tsn | 1 | 100.00 | NULL |
| 2 | DERIVED | base | ref | TSN | TSN | 4 | const | 1 | 100.00 | Using index |
| 3 | UNION | t | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | Recursive; Using where |
| 3 | UNION | next | ref | TSN | TSN | 4 | t.parent_tsn | 1 | 100.00 | Using index |
+----+-------------+------------+--------+---------------+---------+---------+--------------+------+----------+-----------------------------+
Breadcrumbs Query Performance
mysql> SELECT * FROM SYS.STATEMENTS_WITH_TEMP_TABLES\G
query: WITH RECURSIVE `taxonomy` AS ( ...
`tsn` ) ORDER BY `depth` DESC
db: itis
exec_count: 1
total_latency: 10.05 ms
memory_tmp_tables: 1
disk_tmp_tables: 0
avg_tmp_tables_per_query: 1
tmp_tables_to_disk_pct: 0
first_seen: 2017-04-24 22:07:56
last_seen: 2017-04-24 22:07:56
digest: 8438633360bedce178823bb868589fd0
Breadcrumbs Query Stages
mysql> SELECT * FROM SYS.USER_SUMMARY_BY_STAGES;
+------+--------------------------------+-------+---------------+-------------+
| user | event_name | total | total_latency | avg_latency |
+------+--------------------------------+-------+---------------+-------------+
| root | stage/sql/System lock | 40 | 6.62 ms | 165.60 us |
| root | stage/sql/Opening tables | 191 | 3.16 ms | 16.52 us |
| root | stage/sql/checking permissions | 45 | 1.50 ms | 33.44 us |
| root | stage/sql/Creating sort index | 1 | 239.63 us | 239.63 us |
| root | stage/sql/closing tables | 191 | 191.03 us | 1.00 us |
| root | stage/sql/starting | 2 | 188.44 us | 94.22 us |
| root | stage/sql/Sending data | 6 | 138.96 us | 23.16 us |
| root | stage/sql/statistics | 4 | 122.42 us | 30.60 us |
| root | stage/sql/query end | 191 | 56.67 us | 296.00 ns |
| root | stage/sql/preparing | 4 | 33.57 us | 8.39 us |
| root | stage/sql/freeing items | 2 | 27.93 us | 13.96 us |
| root | stage/sql/optimizing | 5 | 20.03 us | 4.01 us |
| root | stage/sql/executing | 7 | 15.39 us | 2.20 us |
| root | stage/sql/removing tmp table | 4 | 9.35 us | 2.34 us |
| root | stage/sql/init | 3 | 8.76 us | 2.92 us |
| root | stage/sql/Sorting result | 2 | 4.16 us | 2.08 us |
| root | stage/sql/end | 3 | 1.93 us | 644.00 ns |
| root | stage/sql/cleaning up | 2 | 1.43 us | 715.00 ns |
+------+--------------------------------+-------+---------------+-------------+
Tree Expansion Query Result
See Demo
Tree Expansion Query
WITH RECURSIVE ancestors (tsn, parent_tsn) AS (
SELECT h.tsn, h.parent_tsn FROM hierarchy AS h WHERE h.tsn = %s
UNION ALL
SELECT h.tsn, h.parent_tsn FROM hierarchy AS h JOIN ancestors AS base ON h.tsn = base.parent_tsn
),
breadcrumbs (tsn, parent_tsn, depth, breadcrumbs) AS (
SELECT h.tsn, h.parent_tsn, 0 AS depth, CAST(LPAD(h.tsn, 8, '0') AS CHAR(255)) AS breadcrumbs
FROM hierarchy AS h WHERE h.parent_tsn = 0
UNION ALL
SELECT h.tsn, h.parent_tsn, base.depth+1 AS depth, CONCAT(base.breadcrumbs, ',', LPAD(h.tsn, 8,
'0'))
FROM hierarchy AS h
JOIN ancestors AS a ON h.tsn = a.tsn
JOIN breadcrumbs AS base ON h.parent_tsn = base.tsn
)
SELECT l.tsn, l.completename, b.depth, b.breadcrumbs
FROM breadcrumbs AS b JOIN longnames AS l ON b.tsn = l.tsn
UNION
SELECT l.tsn, l.completename, b.depth+1, CONCAT(b.breadcrumbs, ',', LPAD(h.tsn, 8, '0'))
FROM breadcrumbs AS b
JOIN hierarchy AS h ON b.tsn = h.parent_tsn
JOIN longnames AS l ON l.tsn = h.tsn
ORDER BY breadcrumbs
Tree Expansion Query EXPLAIN
--------------+------------+--------+-------------+---------+-------------------+--------+----------+--------------------------------
select_type | table | type | key | key_len | ref | rows | filtered | Extra
--------------+------------+--------+-------------+---------+-------------------+--------+----------+--------------------------------
PRIMARY | <derived2> | ALL | NULL | NULL | NULL | 250230 | 100.00 | Using where
PRIMARY | l | eq_ref | PRIMARY | 4 | b.tsn | 1 | 100.00 | NULL
DERIVED | h | index | TSN | 9 | NULL | 500466 | 10.00 | Using where; Using index
UNION | base | ALL | NULL | NULL | NULL | 50046 | 100.00 | Recursive; Using where
UNION | <derived4> | ALL | NULL | NULL | NULL | 4 | 100.00 | Using where; Using join buffer
UNION | h | ref | TSN | 9 | a.tsn,base.tsn | 1 | 100.00 | Using index
DERIVED | h | ref | TSN | 4 | const | 1 | 100.00 | Using index
UNION | base | ALL | NULL | NULL | NULL | 2 | 100.00 | Recursive; Using where
UNION | h | ref | TSN | 4 | base.parent_tsn | 1 | 100.00 | Using index
UNION | h | index | TSN | 9 | NULL | 500466 | 100.00 | Using where; Using index
UNION | l | eq_ref | PRIMARY | 4 | itis.h.TSN | 1 | 100.00 | NULL
UNION | <derived2> | ref | <auto_key0> | 5 | itis.h.Parent_TSN | 10 | 100.00 | NULL
UNION RESULT | <union1,8> | ALL | NULL | NULL | NULL | NULL | NULL | Using temporary; Using filesort
--------------+------------+--------+-------------+---------+-------------------+--------+----------+--------------------------------
Maybe I need more indexes?
Unfortunately I ran out of time to analyze.
Tree Expansion Query Performance
mysql> SELECT * FROM SYS.STATEMENTS_WITH_TEMP_TABLES\G
query: WITH RECURSIVE `ancestors` ( ` ... `l`
. `completename` , `b` .
db: itis
exec_count: 1
total_latency: 1.24 s
memory_tmp_tables: 3
disk_tmp_tables: 0
avg_tmp_tables_per_query: 3
tmp_tables_to_disk_pct: 0
first_seen: 2017-04-27 01:33:14
last_seen: 2017-04-27 01:33:14
digest: 86c1417d2ff3679863db754eff425e94
Tree Expansion Query Stages
mysql> SELECT * FROM SYS.USER_SUMMARY_BY_STAGES;
+------+--------------------------------+-------+---------------+-------------+
| user | event_name | total | total_latency | avg_latency |
+------+--------------------------------+-------+---------------+-------------+
| root | stage/sql/Sending data | 12 | 979.42 ms | 81.62 ms |
| root | stage/sql/System lock | 40 | 6.34 ms | 158.52 us |
| root | stage/sql/Opening tables | 191 | 3.34 ms | 17.51 us |
| root | stage/sql/checking permissions | 53 | 1.35 ms | 25.45 us |
| root | stage/sql/starting | 2 | 356.31 us | 178.16 us |
| root | stage/sql/statistics | 12 | 271.01 us | 22.58 us |
| root | stage/sql/closing tables | 191 | 179.15 us | 937.00 ns |
| root | stage/sql/preparing | 12 | 98.18 us | 8.18 us |
| root | stage/sql/query end | 191 | 57.60 us | 301.00 ns |
| root | stage/sql/freeing items | 2 | 47.93 us | 23.96 us |
| root | stage/sql/Creating sort index | 1 | 37.38 us | 37.38 us |
| root | stage/sql/optimizing | 13 | 30.60 us | 2.35 us |
| root | stage/sql/executing | 13 | 30.27 us | 2.33 us |
| root | stage/sql/removing tmp table | 14 | 24.44 us | 1.74 us |
| root | stage/sql/init | 3 | 14.78 us | 4.93 us |
| root | stage/sql/cleaning up | 2 | 11.66 us | 5.83 us |
| root | stage/sql/Sorting result | 2 | 3.67 us | 1.84 us |
| root | stage/sql/end | 3 | 3.04 us | 1.01 us |
+------+--------------------------------+-------+---------------+-------------+
Conclusions
§Overall, MySQL 8 support for recursive CTE queries is worth the wait.
§Exotic cases exist that are beyond any optimizer.
§I'm excited to upgrade to MySQL 8.0.x ASAP!
§Now that virtually all major SQL brands support recursive CTEs, we need developer tools and popular apps to use them!
License and Copyright
Copyright 2017 Bill Karwin
http://www.slideshare.net/billkarwin
Released under a Creative Commons 3.0 License:
http://creativecommons.org/licenses/by-nc-nd/3.0/
You are free to share—to copy, distribute,
and transmit this work, under the following conditions:
Attribution. You must attribute this work to Bill Karwin.
Noncommercial. You may not use this work for commercial purposes.
No Derivative Works. You may not alter, transform, or build upon this work.

call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
ManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide DeckManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide Deck
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 

Recursive Query Throwdown

  • 1. Recursive Query Throwdown in MySQL 8
      BILL KARWIN
      PERCONA LIVE OPEN SOURCE DATABASE CONFERENCE 2017
  • 2. Bill Karwin
      Software developer, consultant, trainer
      Using MySQL since 2000
      Senior Database Architect at SchoolMessenger
      Author of SQL Antipatterns: Avoiding the Pitfalls of Database Programming
      Oracle ACE Director
  • 3. How to Query a Tree?
      Hierarchical data
      § Organization charts
      § Categories and sub-categories
      § Parts explosion
      § Threaded discussions
      https://commons.wikimedia.org/wiki/File:Staff_Organisation_Diagram,_1896.jpg
  • 5. Adjacency List Example Data
      comment_id  parent_id  author  comment
      1           NULL       Fran    What’s the cause of this bug?
      2           1          Ollie   I think it’s a null pointer.
      3           2          Fran    No, I checked for that.
      4           1          Kukla   We need to check valid input.
      5           4          Ollie   Yes, that’s a bug.
      6           4          Fran    Yes, please add a check
      7           6          Kukla   That fixed it.
  • 6. Can’t Easily Query Deep Trees
      SELECT * FROM Comments c1
      LEFT JOIN Comments c2 ON (c2.parent_id = c1.comment_id)
      LEFT JOIN Comments c3 ON (c3.parent_id = c2.comment_id)
      LEFT JOIN Comments c4 ON (c4.parent_id = c3.comment_id)
      LEFT JOIN Comments c5 ON (c5.parent_id = c4.comment_id)
      LEFT JOIN Comments c6 ON (c6.parent_id = c5.comment_id)
      LEFT JOIN Comments c7 ON (c7.parent_id = c6.comment_id)
      LEFT JOIN Comments c8 ON (c8.parent_id = c7.comment_id)
      LEFT JOIN Comments c9 ON (c9.parent_id = c8.comment_id)
      LEFT JOIN Comments c10 ON (c10.parent_id = c9.comment_id)
      ...
  • 8. MySQL Workarounds
      MySQL lacked support for recursive queries, so workarounds were needed
      These are all denormalized designs, most don’t have referential integrity
      § Path enumeration
      § Nested sets
      § Closure table
  • 9. Path Enumeration Example Data
      comment_id  path      author  comment
      1           1/        Fran    What’s the cause of this bug?
      2           1/2/      Ollie   I think it’s a null pointer.
      3           1/2/3/    Fran    No, I checked for that.
      4           1/4/      Kukla   We need to check valid input.
      5           1/4/5/    Ollie   Yes, that’s a bug.
      6           1/4/6/    Fran    Yes, please add a check
      7           1/4/6/7/  Kukla   That fixed it.
  • 10. Path Enumeration Example Queries
      Query ancestors of comment #7:
      SELECT * FROM Comments
      WHERE '1/4/6/7/' LIKE CONCAT(path, '%');
      Query descendants of comment #4:
      SELECT * FROM Comments
      WHERE path LIKE '1/4/%';
  • 11. Path Enumeration Pros and Cons
      Pros:
      § Single non-recursive query to get a tree or a subtree
      Cons:
      § Complex updates to add or remove a node
      § Numbers are stored in a string: no referential integrity
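The path enumeration queries and the update cost can be tried out with a small sketch. It is an illustration only, assuming Python's standard sqlite3 module as a stand-in for MySQL, so `||` replaces `CONCAT`. The helper names `ancestors`, `descendants` and `add_reply` are hypothetical and do not come from the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, path TEXT, author TEXT)"
)
conn.executemany(
    "INSERT INTO Comments (comment_id, path, author) VALUES (?, ?, ?)",
    [
        (1, "1/", "Fran"), (2, "1/2/", "Ollie"), (3, "1/2/3/", "Fran"),
        (4, "1/4/", "Kukla"), (5, "1/4/5/", "Ollie"), (6, "1/4/6/", "Fran"),
        (7, "1/4/6/7/", "Kukla"),
    ],
)

def ancestors(conn, path):
    """Rows whose stored path is a prefix of the given path (includes the node itself)."""
    rows = conn.execute(
        "SELECT comment_id FROM Comments WHERE ? LIKE path || '%' ORDER BY path",
        (path,),
    )
    return [r[0] for r in rows]

def descendants(conn, path):
    """Rows whose stored path starts with the given path (includes the node itself)."""
    rows = conn.execute(
        "SELECT comment_id FROM Comments WHERE path LIKE ? || '%' ORDER BY path",
        (path,),
    )
    return [r[0] for r in rows]

def add_reply(conn, parent_id, author):
    """Two steps: the new id must exist before the path string can be built."""
    new_id = conn.execute(
        "INSERT INTO Comments (author) VALUES (?)", (author,)
    ).lastrowid
    conn.execute(
        "UPDATE Comments SET path = "
        "(SELECT path FROM Comments WHERE comment_id = ?) || ? || '/' "
        "WHERE comment_id = ?",
        (parent_id, new_id, new_id),
    )
    return new_id

print(ancestors(conn, "1/4/6/7/"))
new_id = add_reply(conn, 7, "Ollie")
print(descendants(conn, "1/4/"))
```

Nothing in the schema stops a path from naming a comment that does not exist, which is the referential integrity gap listed under Cons.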
  • 12. Nested Sets
      Each comment encodes its descendants using two numbers:
      § A comment’s left number is less than all numbers used by the comment’s descendants.
      § A comment’s right number is greater than all numbers used by the comment’s descendants.
      § A comment’s numbers are between all numbers used by the comment’s ancestors.
      References:
      § “Recursive Hierarchies: The Relational Taboo!” Michael J. Kamfonas, Relational Journal, Oct/Nov 1992
      § “Trees and Hierarchies in SQL For Smarties,” Joe Celko, 2004
      § “Managing Hierarchical Data in MySQL,” Mike Hillyer, 2005
  • 14. Nested Sets Example Data
      comment_id  nsleft  nsright  author  comment
      1           1       14       Fran    What’s the cause of this bug?
      2           2       5        Ollie   I think it’s a null pointer.
      3           3       4        Fran    No, I checked for that.
      4           6       13       Kukla   We need to check valid input.
      5           7       8        Ollie   Yes, that’s a bug.
      6           9       12       Fran    Yes, please add a check
      7           10      11       Kukla   That fixed it.
  • 15. Nested Sets Example Queries
      Query ancestors of comment #7:
      SELECT ancestor.* FROM Comments child
      JOIN Comments ancestor
        ON child.nsleft BETWEEN ancestor.nsleft AND ancestor.nsright
      WHERE child.comment_id = 7;
      Query subtree under comment #4:
      SELECT descendant.* FROM Comments parent
      JOIN Comments descendant
        ON descendant.nsleft BETWEEN parent.nsleft AND parent.nsright
      WHERE parent.comment_id = 4;
  • 16. Nested Sets Pros and Cons
      Pros:
      § Single non-recursive query to get a tree or a subtree
      Cons:
      § Complex updates to add or remove a node
      § Numbers are not foreign keys: no referential integrity
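To make the "complex updates" point concrete, here is a minimal sketch of the nested sets queries plus a leaf insert. It assumes Python's sqlite3 module in place of MySQL. The renumbering in `add_leaf` is the commonly described shift-by-two approach, not something shown in the slides, and all helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments ("
    "comment_id INTEGER PRIMARY KEY, nsleft INT, nsright INT, author TEXT)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?, ?)",
    [
        (1, 1, 14, "Fran"), (2, 2, 5, "Ollie"), (3, 3, 4, "Fran"),
        (4, 6, 13, "Kukla"), (5, 7, 8, "Ollie"), (6, 9, 12, "Fran"),
        (7, 10, 11, "Kukla"),
    ],
)

def ancestors(conn, comment_id):
    """Every node whose (nsleft, nsright) range encloses the child's nsleft."""
    rows = conn.execute(
        "SELECT ancestor.comment_id FROM Comments child "
        "JOIN Comments ancestor "
        "  ON child.nsleft BETWEEN ancestor.nsleft AND ancestor.nsright "
        "WHERE child.comment_id = ? ORDER BY ancestor.nsleft",
        (comment_id,),
    )
    return [r[0] for r in rows]

def subtree(conn, comment_id):
    """Every node whose nsleft falls inside the parent's range."""
    rows = conn.execute(
        "SELECT descendant.comment_id FROM Comments parent "
        "JOIN Comments descendant "
        "  ON descendant.nsleft BETWEEN parent.nsleft AND parent.nsright "
        "WHERE parent.comment_id = ? ORDER BY descendant.nsleft",
        (comment_id,),
    )
    return [r[0] for r in rows]

def add_leaf(conn, parent_id, new_id, author):
    """Open a gap of two numbers at the parent's right edge, then insert there.

    Returns how many existing rows had to be renumbered.
    """
    (right,) = conn.execute(
        "SELECT nsright FROM Comments WHERE comment_id = ?", (parent_id,)
    ).fetchone()
    touched = conn.execute(
        "UPDATE Comments SET nsright = nsright + 2 WHERE nsright >= ?", (right,)
    ).rowcount
    touched += conn.execute(
        "UPDATE Comments SET nsleft = nsleft + 2 WHERE nsleft > ?", (right,)
    ).rowcount
    conn.execute(
        "INSERT INTO Comments VALUES (?, ?, ?, ?)", (new_id, right, right + 1, author)
    )
    return touched

print(ancestors(conn, 7), subtree(conn, 4))
touched = add_leaf(conn, 7, 8, "Ollie")
print(touched, ancestors(conn, 8))
```

Adding one reply to the deepest comment rewrites the numbers of every ancestor, and in general every node to its right as well.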
  • 17. Closure Table
      Many-to-many table
      Stores every path from each node to each of its descendants
      A node even connects to itself
      CREATE TABLE Closure (
        ancestor INT NOT NULL,
        descendant INT NOT NULL,
        length INT NOT NULL,
        PRIMARY KEY (ancestor, descendant),
        FOREIGN KEY(ancestor) REFERENCES Comments(comment_id),
        FOREIGN KEY(descendant) REFERENCES Comments(comment_id)
      );
  • 19. Closure Table Example Data
      comment_id  author  comment
      1           Fran    What’s the cause of this bug?
      2           Ollie   I think it’s a null pointer.
      3           Fran    No, I checked for that.
      4           Kukla   We need to check valid input.
      5           Ollie   Yes, that’s a bug.
      6           Fran    Yes, please add a check
      7           Kukla   That fixed it.

      ancestor  descendant  length
      1         1           0
      1         2           1
      1         3           2
      1         4           1
      1         5           2
      1         6           2
      1         7           3
      2         2           0
      2         3           1
      3         3           0
      4         4           0
      4         5           1
      4         6           1
      4         7           2
      5         5           0
      6         6           0
      6         7           1
      7         7           0
  • 20. Closure Table Example Queries
      Query ancestors of comment #7:
      SELECT c.* FROM Comments c
      JOIN Closure t ON (c.comment_id = t.ancestor)
      WHERE t.descendant = 7;
      Query subtree under comment #4:
      SELECT c.* FROM Comments c
      JOIN Closure t ON (c.comment_id = t.descendant)
      WHERE t.ancestor = 4;
  • 21. Closure Table Pros and Cons
      Pros:
      § Single non-recursive query to get a tree or a subtree
      § Referential integrity!
      Cons:
      § Extra table is required
      § Hierarchy is stored redundantly, too easy to mess up
      § Lots of joins to do most kinds of queries
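The redundancy listed under Cons shows up as soon as a node is added: one new comment needs one closure row per ancestor, plus a self row. The sketch below assumes Python's sqlite3 module rather than MySQL. The `INSERT ... SELECT` in `add_reply` is the usual way this pattern is written, not code from the slides, and the helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, author TEXT);
CREATE TABLE Closure (
  ancestor INT NOT NULL,
  descendant INT NOT NULL,
  length INT NOT NULL,
  PRIMARY KEY (ancestor, descendant),
  FOREIGN KEY (ancestor) REFERENCES Comments(comment_id),
  FOREIGN KEY (descendant) REFERENCES Comments(comment_id)
);
""")
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?)",
    [(1, "Fran"), (2, "Ollie"), (3, "Fran"), (4, "Kukla"),
     (5, "Ollie"), (6, "Fran"), (7, "Kukla")],
)
conn.executemany(
    "INSERT INTO Closure VALUES (?, ?, ?)",
    [(1, 1, 0), (1, 2, 1), (1, 3, 2), (1, 4, 1), (1, 5, 2), (1, 6, 2), (1, 7, 3),
     (2, 2, 0), (2, 3, 1), (3, 3, 0), (4, 4, 0), (4, 5, 1), (4, 6, 1), (4, 7, 2),
     (5, 5, 0), (6, 6, 0), (6, 7, 1), (7, 7, 0)],
)

def add_reply(conn, parent_id, author):
    """Insert the comment, then copy the parent's ancestor rows one level deeper."""
    new_id = conn.execute(
        "INSERT INTO Comments (author) VALUES (?)", (author,)
    ).lastrowid
    conn.execute(
        "INSERT INTO Closure (ancestor, descendant, length) "
        "SELECT ancestor, ?, length + 1 FROM Closure WHERE descendant = ? "
        "UNION ALL SELECT ?, ?, 0",
        (new_id, parent_id, new_id, new_id),
    )
    return new_id

def ancestors(conn, comment_id):
    """Ancestor ids from the root down to the node itself."""
    rows = conn.execute(
        "SELECT ancestor FROM Closure WHERE descendant = ? ORDER BY length DESC",
        (comment_id,),
    )
    return [r[0] for r in rows]

new_id = add_reply(conn, 7, "Ollie")
print(new_id, ancestors(conn, new_id))
```

If the second statement in `add_reply` is skipped or run against the wrong parent, the comment still exists but the tree silently disagrees with it, which is the "too easy to mess up" risk.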
  • 23. WITHer Recursive Queries in MySQL?
      SQL vendors gradually implemented SQL-99 WITH syntax:
      § IBM DB2 UDB 8 (Dec. 2002)
      § Microsoft SQL Server 2005 (Oct. 2005)
      § Sybase SQL Anywhere 11 (Aug. 2008)
      § Firebird 2.1 (Sep. 2008)
      § PostgreSQL 8.4 (Jul. 2009)
      § Oracle 11g release 2 (Sep. 2009)
      § Teradata (date and version of support unknown, at least 2009)
      § HSQLDB 2.3 (Jul. 2013)
      § SQLite 3.8.3.1 (Feb. 2014)
      § H2 (date and version unknown)
      https://www.percona.com/blog/2014/02/11/wither-recursive-queries/
  • 24. ANSI SQL Recursive Common Table Expression
      WITH RECURSIVE cte_name (col_name, col_name, col_name) AS
      (
        subquery base case
        UNION ALL
        subquery referencing cte_name
      )
      SELECT ... FROM cte_name ...
      https://dev.mysql.com/doc/refman/8.0/en/with.html
  • 25. Generating a Series of Numbers
      WITH RECURSIVE MySeries (n) AS
      (
        SELECT 1 AS n
        UNION ALL
        SELECT 1+n FROM MySeries WHERE n < 10
      )
      SELECT * FROM MySeries;
      +------+
      | n    |
      +------+
      |    1 |
      |    2 |
      |    3 |
      |    4 |
      |    5 |
      |    6 |
      |    7 |
      |    8 |
      |    9 |
      |   10 |
      +------+
  • 26. Generating a Series of Dates
      WITH RECURSIVE MyDates (d) AS
      (
        SELECT CURRENT_DATE() AS d
        UNION ALL
        SELECT d + INTERVAL 1 DAY FROM MyDates
        WHERE d < CURRENT_DATE() + INTERVAL 7 DAY
      )
      SELECT * FROM MyDates;
      +------------+
      | d          |
      +------------+
      | 2017-04-24 |
      | 2017-04-25 |
      | 2017-04-26 |
      | 2017-04-27 |
      | 2017-04-28 |
      | 2017-04-29 |
      | 2017-04-30 |
      | 2017-05-01 |
      +------------+
  • 27. Query ancestors of comment #7
      WITH RECURSIVE CommentTree
        (comment_id, parent_id, author, comment, depth) AS (
        SELECT comment_id, parent_id, author, comment, 0 AS depth
        FROM Comments
        WHERE comment_id = 7
        UNION ALL
        SELECT c.comment_id, c.parent_id, c.author, c.comment, ct.depth+1
        FROM CommentTree ct
        JOIN Comments c ON (ct.parent_id = c.comment_id)
      )
      SELECT * FROM CommentTree;
  • 29. Query subtree under comment #4
      WITH RECURSIVE CommentTree
        (comment_id, parent_id, author, comment, depth) AS (
        SELECT comment_id, parent_id, author, comment, 0 AS depth
        FROM Comments
        WHERE comment_id = 4
        UNION ALL
        SELECT c.comment_id, c.parent_id, c.author, c.comment, ct.depth+1
        FROM CommentTree ct
        JOIN Comments c ON (ct.comment_id = c.parent_id)
      )
      SELECT * FROM CommentTree;
  • 30. Recursive CTE Pros and Cons
      Pros:
      § ANSI SQL-99 Standard
      § Compatible with other SQL implementations
      § Works with Adjacency List (single source of authority)
      § Referential integrity!
      Cons:
      § Not compatible with earlier MySQL versions
      § Use of materialized temporary tables may cause performance problems
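The portability claim can be checked informally by running the two CommentTree queries on another engine. This sketch assumes Python's sqlite3 module (the bundled SQLite must be 3.8.3 or later) and the adjacency list data from slide 5. The `ORDER BY` clauses and the column selection are additions for a stable result. It says nothing about MySQL's performance or temporary table behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments ("
    "comment_id INTEGER PRIMARY KEY, "
    "parent_id INTEGER REFERENCES Comments(comment_id), "
    "author TEXT, comment TEXT)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?, ?)",
    [
        (1, None, "Fran", "What's the cause of this bug?"),
        (2, 1, "Ollie", "I think it's a null pointer."),
        (3, 2, "Fran", "No, I checked for that."),
        (4, 1, "Kukla", "We need to check valid input."),
        (5, 4, "Ollie", "Yes, that's a bug."),
        (6, 4, "Fran", "Yes, please add a check"),
        (7, 6, "Kukla", "That fixed it."),
    ],
)

# The two queries differ only in the direction of the join condition.
ANCESTORS_SQL = """
WITH RECURSIVE CommentTree (comment_id, parent_id, author, comment, depth) AS (
  SELECT comment_id, parent_id, author, comment, 0 AS depth
  FROM Comments WHERE comment_id = ?
  UNION ALL
  SELECT c.comment_id, c.parent_id, c.author, c.comment, ct.depth+1
  FROM CommentTree ct
  JOIN Comments c ON (ct.parent_id = c.comment_id)
)
SELECT comment_id, depth FROM CommentTree ORDER BY depth
"""

SUBTREE_SQL = """
WITH RECURSIVE CommentTree (comment_id, parent_id, author, comment, depth) AS (
  SELECT comment_id, parent_id, author, comment, 0 AS depth
  FROM Comments WHERE comment_id = ?
  UNION ALL
  SELECT c.comment_id, c.parent_id, c.author, c.comment, ct.depth+1
  FROM CommentTree ct
  JOIN Comments c ON (ct.comment_id = c.parent_id)
)
SELECT comment_id, depth FROM CommentTree ORDER BY depth, comment_id
"""

ancestors_of_7 = conn.execute(ANCESTORS_SQL, (7,)).fetchall()
subtree_of_4 = conn.execute(SUBTREE_SQL, (4,)).fetchall()
print(ancestors_of_7)
print(subtree_of_4)
```

Only the adjacency list is stored, so the `parent_id` foreign key remains the single source of authority for the tree.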
  • 31. MySQL CTE Implementation: 💯
      Thanks to @MarkusWinand for his preview analysis based on 8.0.1-dmr
      http://modern-sql.com/feature/with
  • 33. ITIS: Sample Hierarchical Data
      Integrated Taxonomic Information System (https://www.itis.gov/)
      § Biological database of species of animals, plants, fungi
      § One big tree of 544,954 nodes
      § Data comes in adjacency list & path enumeration format
      § I converted to closure table for query tests
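The slides do not show how the ITIS data was converted to a closure table. One possible approach, sketched here on the small comment tree rather than on ITIS, is to derive every (ancestor, descendant, length) row from the adjacency list with a recursive CTE. It assumes Python's sqlite3 module, and the CTE name `paths` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, parent_id INTEGER)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?)",
    [(1, None), (2, 1), (3, 2), (4, 1), (5, 4), (6, 4), (7, 6)],
)
conn.execute(
    "CREATE TABLE Closure ("
    "ancestor INT NOT NULL, descendant INT NOT NULL, length INT NOT NULL, "
    "PRIMARY KEY (ancestor, descendant))"
)

# Base case: every node reaches itself at length 0.
# Recursive case: extend each known path by one child.
conn.execute("""
INSERT INTO Closure (ancestor, descendant, length)
WITH RECURSIVE paths (ancestor, descendant, length) AS (
  SELECT comment_id, comment_id, 0 FROM Comments
  UNION ALL
  SELECT p.ancestor, c.comment_id, p.length + 1
  FROM paths p
  JOIN Comments c ON c.parent_id = p.descendant
)
SELECT ancestor, descendant, length FROM paths
""")

closure_rows = conn.execute(
    "SELECT ancestor, descendant, length FROM Closure "
    "ORDER BY ancestor, descendant"
).fetchall()
print(len(closure_rows))
```

On the seven-comment tree this reproduces the rows listed on the Closure Table Example Data slide. On a tree the size of ITIS the same statement would materialize a much larger intermediate result.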
  • 34. ITIS Data Model
      mysql> select * from longnames where completename = 'Eschscholzia californica';
      +--------+---------------------------+
      | tsn    | completename              |
      +--------+---------------------------+
      | 18956  | Eschscholzia californica  |
      +--------+---------------------------+
      mysql> select * from hierarchy where TSN = '18956'\G
                   TSN: 18956
            Parent_TSN: 18954
                 level: 11
         ChildrenCount: 8
      hierarchy_string: 202422-954898-846494-954900-846496-846504-18063-846547-18409-18880-18954-18956
  • 35. Indexes
      mysql> ALTER TABLE hierarchy ADD KEY (tsn, parent_tsn);
      Query OK, 0 rows affected (1.30 sec)
  • 36. Breadcrumbs Query
      WITH RECURSIVE taxonomy AS (
        SELECT base.tsn, base.parent_tsn, 0 as depth
        FROM hierarchy base
        WHERE tsn = '18956'
        UNION ALL
        SELECT next.tsn, next.parent_tsn, t.depth+1
        FROM hierarchy next
        JOIN taxonomy t
        WHERE t.parent_tsn = next.tsn
      )
      SELECT * FROM taxonomy
      JOIN longnames USING (tsn)
      ORDER BY depth DESC;
  • 37. Breadcrumbs Query Result
      +--------+------------+-------+--------------------------+
      | tsn    | parent_tsn | depth | completename             |
      +--------+------------+-------+--------------------------+
      | 202422 |          0 |    11 | Plantae                  |
      | 954898 |     202422 |    10 | Viridiplantae            |
      | 846494 |     954898 |     9 | Streptophyta             |
      | 954900 |     846494 |     8 | Embryophyta              |
      | 846496 |     954900 |     7 | Tracheophyta             |
      | 846504 |     846496 |     6 | Spermatophytina          |
      |  18063 |     846504 |     5 | Magnoliopsida            |
      | 846547 |      18063 |     4 | Ranunculanae             |
      |  18409 |     846547 |     3 | Ranunculales             |
      |  18880 |      18409 |     2 | Papaveraceae             |
      |  18954 |      18880 |     1 | Eschscholzia             |
      |  18956 |      18954 |     0 | Eschscholzia californica |
      +--------+------------+-------+--------------------------+
      12 rows in set (0.00 sec)
  • 38. Breadcrumbs Query EXPLAIN Plan
      § New note in Extra: "Recursive"
      § Using index (covering index) for both base case and recursive case
      § I can eliminate the filesort if I allow natural order (base case first)
      § No "Using Temporary"? Not so fast…
      +----+-------------+------------+--------+---------------+---------+---------+--------------+------+----------+-----------------------------+
      | id | select_type | table      | type   | possible_keys | key     | key_len | ref          | rows | filtered | Extra                       |
      +----+-------------+------------+--------+---------------+---------+---------+--------------+------+----------+-----------------------------+
      |  1 | PRIMARY     | <derived2> | ALL    | NULL          | NULL    | NULL    | NULL         |    4 |   100.00 | Using where; Using filesort |
      |  1 | PRIMARY     | longnames  | eq_ref | PRIMARY,tsn   | PRIMARY | 4       | taxonomy.tsn |    1 |   100.00 | NULL                        |
      |  2 | DERIVED     | base       | ref    | TSN           | TSN     | 4       | const        |    1 |   100.00 | Using index                 |
      |  3 | UNION       | t          | ALL    | NULL          | NULL    | NULL    | NULL         |    2 |   100.00 | Recursive; Using where      |
      |  3 | UNION       | next       | ref    | TSN           | TSN     | 4       | t.parent_tsn |    1 |   100.00 | Using index                 |
      +----+-------------+------------+--------+---------------+---------+---------+--------------+------+----------+-----------------------------+
  • 39. Breadcrumbs Query Performance
      mysql> SELECT * FROM SYS.STATEMENTS_WITH_TEMP_TABLES\G
                         query: WITH RECURSIVE `taxonomy` AS ( ... `tsn` ) ORDER BY `depth` DESC
                            db: itis
                    exec_count: 1
                 total_latency: 10.05 ms
             memory_tmp_tables: 1
               disk_tmp_tables: 0
      avg_tmp_tables_per_query: 1
        tmp_tables_to_disk_pct: 0
                    first_seen: 2017-04-24 22:07:56
                     last_seen: 2017-04-24 22:07:56
                        digest: 8438633360bedce178823bb868589fd0
  • 40. Breadcrumbs Query Stages
      mysql> SELECT * FROM SYS.USER_SUMMARY_BY_STAGES;
      +------+--------------------------------+-------+---------------+-------------+
      | user | event_name                     | total | total_latency | avg_latency |
      +------+--------------------------------+-------+---------------+-------------+
      | root | stage/sql/System lock          |    40 | 6.62 ms       | 165.60 us   |
      | root | stage/sql/Opening tables       |   191 | 3.16 ms       | 16.52 us    |
      | root | stage/sql/checking permissions |    45 | 1.50 ms       | 33.44 us    |
      | root | stage/sql/Creating sort index  |     1 | 239.63 us     | 239.63 us   |
      | root | stage/sql/closing tables       |   191 | 191.03 us     | 1.00 us     |
      | root | stage/sql/starting             |     2 | 188.44 us     | 94.22 us    |
      | root | stage/sql/Sending data         |     6 | 138.96 us     | 23.16 us    |
      | root | stage/sql/statistics           |     4 | 122.42 us     | 30.60 us    |
      | root | stage/sql/query end            |   191 | 56.67 us      | 296.00 ns   |
      | root | stage/sql/preparing            |     4 | 33.57 us      | 8.39 us     |
      | root | stage/sql/freeing items        |     2 | 27.93 us      | 13.96 us    |
      | root | stage/sql/optimizing           |     5 | 20.03 us      | 4.01 us     |
      | root | stage/sql/executing            |     7 | 15.39 us      | 2.20 us     |
      | root | stage/sql/removing tmp table   |     4 | 9.35 us       | 2.34 us     |
      | root | stage/sql/init                 |     3 | 8.76 us       | 2.92 us     |
      | root | stage/sql/Sorting result       |     2 | 4.16 us       | 2.08 us     |
      | root | stage/sql/end                  |     3 | 1.93 us       | 644.00 ns   |
      | root | stage/sql/cleaning up          |     2 | 1.43 us       | 715.00 ns   |
      +------+--------------------------------+-------+---------------+-------------+
  • 41. Tree Expansion Query Result
      See Demo
  • 42. Tree Expansion Query
      WITH RECURSIVE ancestors (tsn, parent_tsn) AS (
        SELECT h.tsn, h.parent_tsn
        FROM hierarchy AS h
        WHERE h.tsn = %s
        UNION ALL
        SELECT h.tsn, h.parent_tsn
        FROM hierarchy AS h
        JOIN ancestors AS base ON h.tsn = base.parent_tsn
      ),
      breadcrumbs (tsn, parent_tsn, depth, breadcrumbs) AS (
        SELECT h.tsn, h.parent_tsn, 0 AS depth,
          CAST(LPAD(h.tsn, 8, '0') AS CHAR(255)) AS breadcrumbs
        FROM hierarchy AS h
        WHERE h.parent_tsn = 0
        UNION ALL
        SELECT h.tsn, h.parent_tsn, base.depth+1 AS depth,
          CONCAT(base.breadcrumbs, ',', LPAD(h.tsn, 8, '0'))
        FROM hierarchy AS h
        JOIN ancestors AS a ON h.tsn = a.tsn
        JOIN breadcrumbs AS base ON h.parent_tsn = base.tsn
      )
      SELECT l.tsn, l.completename, b.depth, b.breadcrumbs
      FROM breadcrumbs AS b
      JOIN longnames AS l ON b.tsn = l.tsn
      UNION
      SELECT l.tsn, l.completename, b.depth+1,
        CONCAT(b.breadcrumbs, ',', LPAD(h.tsn, 8, '0'))
      FROM breadcrumbs AS b
      JOIN hierarchy AS h ON b.tsn = h.parent_tsn
      JOIN longnames AS l ON l.tsn = h.tsn
      ORDER BY breadcrumbs
  • 43. Tree Expansion Query EXPLAIN
      --------------+------------+--------+-------------+---------+-------------------+--------+----------+--------------------------------
       select_type  | table      | type   | key         | key_len | ref               | rows   | filtered | Extra
      --------------+------------+--------+-------------+---------+-------------------+--------+----------+--------------------------------
       PRIMARY      | <derived2> | ALL    | NULL        | NULL    | NULL              | 250230 |   100.00 | Using where
       PRIMARY      | l          | eq_ref | PRIMARY     | 4       | b.tsn             |      1 |   100.00 | NULL
       DERIVED      | h          | index  | TSN         | 9       | NULL              | 500466 |    10.00 | Using where; Using index
       UNION        | base       | ALL    | NULL        | NULL    | NULL              |  50046 |   100.00 | Recursive; Using where
       UNION        | <derived4> | ALL    | NULL        | NULL    | NULL              |      4 |   100.00 | Using where; Using join buffer
       UNION        | h          | ref    | TSN         | 9       | a.tsn,base.tsn    |      1 |   100.00 | Using index
       DERIVED      | h          | ref    | TSN         | 4       | const             |      1 |   100.00 | Using index
       UNION        | base       | ALL    | NULL        | NULL    | NULL              |      2 |   100.00 | Recursive; Using where
       UNION        | h          | ref    | TSN         | 4       | base.parent_tsn   |      1 |   100.00 | Using index
       UNION        | h          | index  | TSN         | 9       | NULL              | 500466 |   100.00 | Using where; Using index
       UNION        | l          | eq_ref | PRIMARY     | 4       | itis.h.TSN        |      1 |   100.00 | NULL
       UNION        | <derived2> | ref    | <auto_key0> | 5       | itis.h.Parent_TSN |     10 |   100.00 | NULL
       UNION RESULT | <union1,8> | ALL    | NULL        | NULL    | NULL              |   NULL |     NULL | Using temporary; Using filesort
      --------------+------------+--------+-------------+---------+-------------------+--------+----------+--------------------------------
      Maybe I need more indexes? Unfortunately I ran out of time to analyze.
  • 44. Tree Expansion Query Performance
      mysql> SELECT * FROM SYS.STATEMENTS_WITH_TEMP_TABLES\G
                         query: WITH RECURSIVE `ancestors` ( ` ... `l` . `completename` , `b` .
                            db: itis
                    exec_count: 1
                 total_latency: 1.24 s
             memory_tmp_tables: 3
               disk_tmp_tables: 0
      avg_tmp_tables_per_query: 3
        tmp_tables_to_disk_pct: 0
                    first_seen: 2017-04-27 01:33:14
                     last_seen: 2017-04-27 01:33:14
                        digest: 86c1417d2ff3679863db754eff425e94
  • 45. Tree Expansion Query Stages
      mysql> SELECT * FROM SYS.USER_SUMMARY_BY_STAGES;
      +------+--------------------------------+-------+---------------+-------------+
      | user | event_name                     | total | total_latency | avg_latency |
      +------+--------------------------------+-------+---------------+-------------+
      | root | stage/sql/Sending data         |    12 | 979.42 ms     | 81.62 ms    |
      | root | stage/sql/System lock          |    40 | 6.34 ms       | 158.52 us   |
      | root | stage/sql/Opening tables       |   191 | 3.34 ms       | 17.51 us    |
      | root | stage/sql/checking permissions |    53 | 1.35 ms       | 25.45 us    |
      | root | stage/sql/starting             |     2 | 356.31 us     | 178.16 us   |
      | root | stage/sql/statistics           |    12 | 271.01 us     | 22.58 us    |
      | root | stage/sql/closing tables       |   191 | 179.15 us     | 937.00 ns   |
      | root | stage/sql/preparing            |    12 | 98.18 us      | 8.18 us     |
      | root | stage/sql/query end            |   191 | 57.60 us      | 301.00 ns   |
      | root | stage/sql/freeing items        |     2 | 47.93 us      | 23.96 us    |
      | root | stage/sql/Creating sort index  |     1 | 37.38 us      | 37.38 us    |
      | root | stage/sql/optimizing           |    13 | 30.60 us      | 2.35 us     |
      | root | stage/sql/executing            |    13 | 30.27 us      | 2.33 us     |
      | root | stage/sql/removing tmp table   |    14 | 24.44 us      | 1.74 us     |
      | root | stage/sql/init                 |     3 | 14.78 us      | 4.93 us     |
      | root | stage/sql/cleaning up          |     2 | 11.66 us      | 5.83 us     |
      | root | stage/sql/Sorting result       |     2 | 3.67 us       | 1.84 us     |
      | root | stage/sql/end                  |     3 | 3.04 us       | 1.01 us     |
      +------+--------------------------------+-------+---------------+-------------+
  • 47. Conclusions
      § Overall, MySQL 8 support for recursive CTE queries is worth the wait.
      § Exotic cases exist that are beyond any optimizer.
      § I'm excited to upgrade to MySQL 8.0.x ASAP!
      § Now that virtually all major SQL brands support recursive CTEs, we need developer tools and popular apps to use them!
  • 48. License and Copyright
      Copyright 2017 Bill Karwin
      http://www.slideshare.net/billkarwin
      Released under a Creative Commons 3.0 License:
      http://creativecommons.org/licenses/by-nc-nd/3.0/
      You are free to share (to copy, distribute, and transmit this work) under the following conditions:
      Attribution. You must attribute this work to Bill Karwin.
      Noncommercial. You may not use this work for commercial purposes.
      No Derivative Works. You may not alter, transform, or build upon this work.