SlideShare a Scribd company logo
1 of 56
Download to read offline
Beyond EXPLAIN
Yuto Hayamizu
Ryoji Kawamichi
Query Optimization
From Theory To Code
2016/5/20
PGCon 2016 @ Ottawa
Before Relational …
• Querying was physical
• Need to understand
physical organization
• Navigate query execution
by yourself
2016/5/20 2
DEPT
EMP
PROJ
“Which file is this table stored in?”
“How are records linked?”
“Which access path is fast for this table?”
“What is the best order of joining tables”
…
Historically …
Before Relational …
• Querying was physical
• Need to understand
physical organization
• Navigate query execution
by yourself
2016/5/20 3
After Relational …
DEPT
EMP
PROJ
“Which file is this table stored in?”
“How are records linked?”
“Which access path is fast for this table?”
“What is the best order of joining tables”
…
• Querying is logical
• Physical organization is
black-boxed
• Just declare what you want
Historically …
Fill the Gap: Physical and Logical
2016/5/20 4
SELECT * FROM DEPT D, EMP E
WHERE E.D_ID = D.ID AND ...
• Storage I/O strategy
• Access path selection
• Join method selection
• Aggregation, sorting
• Resource allocation
• ...
Query Optimizer
If optimizer perfectly fills the gap...
2016/5/20 5
We don’t need EXPLAIN
Reality Is Tough
• Optimizer is NOT PERFECT
• Generated plans are not always optimal, sometimes
far from optimal
• We have to take care of physical behavior
• That’s why EXPLAIN is so much explained
2016/5/20 6
Go Beyond EXPLAIN
• Deeper understanding of optimization, better
control of your databases
• Theoretical fundamentals of query optimization
• From basic framework to cutting-edge technologies
• PostgreSQL Optimizer implementation
• Focusing on basic scan and join methods
• Behavior observation with TPC-H benchmark
2016/5/20 7
Outline
• Introduction
• Theory: Query Optimization Framework
• Code: PostgreSQL Optimizer
• Theory: Cutting-Edge Technologies Overview
• Summary
2016/5/20 8
Query Optimization Framework
• Cost-based optimization
• Plan selection with estimated execution cost
• Most of modern optimizers, including PostgreSQL,
are cost-based
• Rule-based optimization
• Plan selection with heuristically ranked rules
• Easy to produce the same result
• Hard to evaluate wide variety of plans
• Ex) Oracle (~10g), Hive (~0.13)
2016/5/20 9
Main Challenges in Cost-based Optimization
• Cost modeling is HARD
• Overhead of CPU, I/O, memory access, network, …
• Cardinality estimation is HARD
• Output size of scans, joins, aggregations, …
• Join ordering search is HARD
• Combinatorial explosion of join ordering and access path
• Exhaustive search is NP-hard
2016/5/20 10
System-R optimizer (1979)
• “The standard”
• Cost estimation with I/O and CPU
• Cardinality estimation with table statistics
• Bottom-up plan search
• Many of modern optimizers are “System-R style”
• PostgreSQL, MySQL, DB2, Oracle, ...
2016/5/20 11
Cost/Cardinality Estimation
• [#page fetched],[#storage API calls]
are estimated with cost formula and following
statistics
2016/5/20 12
CPU costI/O cost
COST = [#page fetched] + W * [#storage API calls]
weight parameter
• NCARD(T) ... the cardinality of relation T
• TCARD(T) ... the number of pages in relation T
• ICARD(I) ... the number of distinct keys in index I
• NINDX(I) ... the number of pages in index I
Bottom-up Plan Search
• Candidate plans for single relation
• The cheapest access path
• N-relation join ordering search
• Select the cheapest plans for each relation
• Then, find optimal join orderings of every 2-relation join
• Then, find optimal join orderings of every 3-relation join
• ... until N-relation
2016/5/20 13
Ex) A ⨝ B ⨝ C ⨝ D
Ex) A ⨝ B ⨝ C ⨝ D
Ex) A ⨝ B ⨝ C ⨝ D
Ex) A ⨝ B ⨝ C ⨝ D
Ex) A ⨝ B ⨝ C ⨝ D
Volcano/Cascades (1993)
• Top-down transformational plan search
• Yet another optimization approach
• Not well known as “System-R style”, but widely used in
practice
Ex) SQL Server, Apache Hive (Apache Calcite), Greenplum
Orca
• Extensible optimization framework
2016/5/20 19
Extensible Optimization Framework
Query Optimizer Generator
• Generalized expression of query plan not limited
to relational data model
• Users (optimizer developers) defines actual
implementations:
• Logical operator ... corresponds to relational algebra
• Physical algorithm ... corresponds to scan & join
methods such as sequential scan, index scan, hash
join, nested loop join
2016/5/20 20
Top-down Transformational Search
• Starts from an initial “logical plan”
• Generate alternative plans with:
A) Logical operator transformation
B) Physical algorithm selection
C) Enforcing sorting order
2016/5/20 21
Join Select T
Join
Select R Select S
Proj
Join
Select T
JoinSelect R
Select S
Proj
Change join ordering
Join Select T
Join
Select R
Select SProj
Projection push down
Example: 3-way join with projection
Top-down Transformational Search
• Starts from an initial “logical plan”
• Generate alternative plans with:
A) Logical operator transformation
B) Physical algorithm selection
C) Enforcing sorting order
2016/5/20 22
Join Select T
Join
Select R Select S
ProjExample: 3-way join with projection
HashJoin Select T
Join
SeqScan R SeqScan S
Proj
Join IdxScan T
Join
Select R Select S
Proj
…
Top-down Transformational Search
• Starts from an initial “logical plan”
• Generate alternative plans with:
A) Logical operator transformation
B) Physical algorithm selection
C) Enforcing sorting order
2016/5/20 23
Join Select T
Join
Select R Select S
Proj
Example: 3-way join with projection
Join Select T
Join
Sort Sort
Proj
Select R Select S
Enforce sorting order
merge join of R and S is possible now
Benefits of Top-down approach
• Possible to intentionally limit search space
• Effective pruning with branch-and-bound
• Limit search space with search time deadline
2016/5/20 24
Cost-based Optimization Basics
Two major cost-based optimization style
• System-R
• Cost modeling with statistics
• Bottom-up search
• Volcano/Cascades
• Extensible optimizer generator
• Cost estimation is user’s responsibility
• Top-down transformational search
2016/5/20 25
Outline
• Introduction
• Theory: Query Optimization Framework
• Code: PostgreSQL Optimizer
• Theory: Cutting-Edge Technologies Overview
• Summary
2016/5/20 26
PostgreSQL Optimizer
“System-R style” optimization
• Bottom-up plan search with dynamic programming
• CPU and I/O operation based cost modeling
2016/5/20 27
Seq. I/O Random I/O CPU cost per tuple
Cost of single operation
• seq_page_cost
• random_page_cost
• cpu_tuple_cost
• cpu_index_tuple_cost
• cpu_operator_cost
• (parallel_tuple_cost)
Estimated number of each operation
• Cardinality estimation with
statistics
• Cost formula for each plan type
• SeqScan, IndexScan
• NestLoopJoin, HashJoin,
MergeJoin, ...
Detailed Look At Basic Scan Types
• Sequential scan
• Efficient for accessing large potion of tables
• Index scan
• Efficient for accessing a fraction of data
2016/5/20 28
Execution cost
Query selectivity
Sequential scan
of SeqScan
29
= (# pages in a table)
= (# tuples in a table)
= #qual_operator
= (#tuples) × (weight factor of A)
+ (#tuples) × (weight factor of B)
+ ・・・
・・・ WHERE AND AND ・・・A B
cost_seqscan()
@optimizer/path/costsize.c
of IndexScan
Consists of:
(A) CPU cost of searching B+-tree
(B) CPU cost of scanning index tuples in leaf pages
(C) I/O cost of leaf pages
(D) I/O cost of heap pages
(E) CPU cost of scanning heap tuples
2014/12/04 30
of IndexScan
(A) B+-tree search
(B) Scanning index tuples in leaf pages
2014/12/04 31
+= log2(#index_tuples)
I/O cost of internal pages
Assumed to be always cached in the buffer
+= #qual_operator
× #leaf_pages ✕ #ituple_per_page × σ
Selectivity σ
Comes from statistics
Mackert and Lohman function(Yao function)
I/O count estimation with consideration of buffer caching
of IndexScan
(C) I/O cost of index leaf pages
2014/12/04 32
+= Y(effective_cache_size, #leaf_pages)
Selectivity σ
I/Ocount
of IndexScan
(D) I/O cost of heap pages
(E) CPU cost of scanning heap tuples
・ Estimate the number of scanned tuples from σ
2014/12/04 33
+= α2 × #match_pages
Correlation between index and heap ordering: α
α = 0 : I/O pattern is random α = 1 : I/O pattern is sequential
+= (1-α2) × #match_tuples
Detailed Look At Join Methods
• Hash join
• Efficient for joining large number of records
• Usually combined with sequential scans
• Nested Loop Join
• Efficient for joining small number of records
• Usually combined with index scans or small table
sequential scans
2016/5/20 34
of HashJoin
2014/12/04 35
of HashJoin
Build phase
• Cost += Cost(inner)
2014/12/04 36
+= #qual_op × #inner_tuples
+= #inner_tuples
Hashing cost
of HashJoin
2014/12/04 37
Build phase
• Cost += Cost(inner)
+= #qual_op × #inner_tuples
+= #inner_tuples
of HashJoin
2014/12/04 38
Build phase
• Cost += Cost(inner)+
Probe phase
• Cost += Cost(outer)+
+= #qual_op × #inner_tuples
+= #inner_tuples
+= #qual_op × (1 + #bucket_size × 0.5)
× #outer_tuples
+= #match_tuples
Hashing & table lookup (bucket search) cost
recordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordtuple
of HashJoin
2014/12/04 39
#buckets: 2 #buckets : 4
build
4 tuples are compared for
lookup in average 2 tuples are compared for lookup
in average
0.E+00
5.E+06
1.E+07
2.E+07
2.E+07
3.E+07
3.E+07
4.E+07
4.E+07
10000 100000 1000000 10000000 100000000
Estimated cost of 2-way HashJoin
# of records
16 records
recordrecordrecordrecordrecordrecordrecordtuple
recordrecordrecordrecordrecordrecordrecordtuple
recordrecordrecordtuple
recordrecordrecordtuple
recordrecordrecordtuple
recordrecordrecordtuple
of NestLoopJoin
2014/12/04 40
R ⨝ S Scan R r2
r1
r3
r4
Scan S with r1
s1
s2
s3
ReScan S with r2
s1
s2
s3
outer inner
of NestLoopJoin
• When #outer_tuples = 1
2014/12/04 41
R ⨝ S Scan R
r1
Scan S with r1
s1
s2
s3
outer inner
Cost = Cost(outer) + Cost(inner) +
+= #inner_tuples
+= #qual_operator × #inner_tuples
of NestLoopJoin
• When #outer_tuples > 1
R ⨝ S Scan R r2
r1
r3
r4
Scan S with r1
s1
s2
s3
ReScan S with r2
s1
s2
s3
outer inner
Cost = Cost(outer) + Cost(inner) +
+ (#outer_tuples - 1) × Cost(ReScan inner)
Higher buffer hit ratio in ReScan
→ Cost of ReScan is lower than cost of IndexScan
+= #inner_tuples × #outer_tuples
+= #qual_operator × #inner_tuples × #outer_tuples
See How It Works
• TPC-H Benchmark
• Specification and tools for benchmarking data
warehouse workload
• Open source implementation: DBT-3, pg_tpch
• Schema, data generation rules and queries
• Experiments with 100GB
• Scale Factor = 100
2016/5/20 43
Experimental Setup
• Dell R720xd
• Xeon (2sockets, 16cores)
• x24 NL-SAS HDD
• With PostgreSQL 9.5
• Default cost parameter settings
• SeqScan & HashJoin
• enable_seqscan = on, enable_hashjoin = on
and disables other methods
• IndexScan & NestLoopJoin
• enable_indexscan = on, enable_nestloop = on
and disables other methods
2016/5/20 44
TPC-H Q.1: The Simplest Case
2016/5/20 45
5.E+05
5.E+06
5.E+07
5.E+08
1 10 100 1000
Estimated cost
Selectivity (l_shipdate)
IndexScan
SeqScan
10
100
1000
10000
1 10 100 1000
Execution time (sec)
Selectivity(l_shipdate)
IndexScan
SeqScan
• Good trend estimation for each
method
• Estimated break-event point is
errorneus
• IndexScan should be more
expensive (need parameter
calibration)
SELECT count(*), ... FROM lineitem
WHERE l_shipdate BETWEEN [X] AND [Y]
TPC-H Q.3
2016/5/20 46
Estimated cost
SeqScan
customer
SeqScan
orders
SeqScan
lineitem
Hash
HashJoin
HashJoin
IndexScan
orders
IndexScan
lineitem
NestLoop
NestLoop
IndexScan
customer
Execution time (sec)
Selectivity
1.E+00
1.E+02
1.E+04
1.E+06
1.E+08
1 10 100 1000 10000 100000 1000000
1
10
100
1000
10000
1 10 100 1000 10000 100000 1000000
Selectivity
NestLoop+IndexScan
HashJoin+SeqScan
NestLoop+IndexScan
HashJoin+SeqScan Similar result as in Q.1
• Good trend estimation for each
• Erroneous break-event point
without parameter calibration
SELECT count(*), ...
FROM customer, orders, lineitem
WHERE c_custkey = o_custkey AND
o_orderkey = l_orderkey AND
c_custkey < [X] AND
c_mktsegment = ‘MACHINERY’;
100
1000
10000
100000
1 10 100 1000 10000
More Complex Case
TPC-H Q.4: Semi-Join Query
• Plan selection for semi-
join tend to be unstable
2016/5/20 47Selectivity
1.E+05
1.E+06
1.E+07
1.E+08
1.E+09
1 10 100 1000 10000
HashJoin+SeqScan
NestLoop+IndexScan
Estimated cost
Execution time (sec)
HashJoin+SeqScan
NestLoop+IndexScan SELECT count(*), ...
FROM orders
WHERE
o_orderdate >= ‘1995-01-01’ AND
o_orderdate < ‘1995-01-01’
+ interval ‘3 month’ AND
EXISTS(
SELECT * FROM lineitem
WHERE l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate)
Selectivity
1.E+06
1.E+07
1.E+08
1.E+09
1.E+10
1.E+11
1.E+00 1.E+02 1.E+04 1.E+06 1.E+08
More Complex Case
TPC-H Q.22: Anti-Join Query
• Difficulties in overall cost
trend estimation
2016/5/20 48
Selectivity
1
10
100
1000
10000
1.E+00 1.E+02 1.E+04 1.E+06 1.E+08
Estimated cost
Execution time (sec)
Selectivity
HashJoin+SeqScan
NestLoop+IndexScan
HashJoin+SeqScan
NestLoop+IndexScan
SELECT count(*), ...
FROM supplier, lineitem l1, orders, nation
WHERE s_suppkey = l1.l_suppkey AND
o_orderkey = l1.l_orderkey AND
o_orderstatus = 'F' AND
l1.l_receiptdate > l1.l_commitdate AND
EXISTS (
SELECT * FROM lineitem l2
WHERE l2.l_orderkey = l1.l_orderkey
AND l2.l_suppkey <> l1.l_suppkey)
AND NOT EXIST (
SELECT * FROM lineitem l3
WHERE l3.l_orderkey = l1.l_orderkey
AND l3.l_suppkey <> l1.l_suppkey
AND l3.l_receiptdate > l3.l_commitdate)
AND s_nationkey = n_nationkey
AND n_name = ‘JAPAN'
Summary: PostgreSQL
Optimizer
• Detailed look at cost modeling of basic methods
• SeqScan, IndexScan
• HashJoin, NestedLoopJoin
• Observation with TPC-H benchmark
• Good cost trend estimation for simple join queries
• Erroneous cheapest plan selection without parameter tuning
• Difficulties with semi-join and anti-join queries
2016/5/20 49
Outline
• Introduction
• Theory: Query Optimization Framework
• Code: PostgreSQL Optimizer
• Theory: Cutting-Edge Technologies Overview
• Summary
2016/5/20 50
Cutting-Edge Technologies
• Traditional optimization was a “closed” problem
• “Rethink the contract” ー Surajit Chaudhuri
• Feedback from previous execution
• Dynamic integration with execution
2016/5/20 51
cardinality
estimation
cost model
plan space
enumeration
(SQL) query plan
Mid-query Re-optimization
• Detects sub-optimality of executing query plan
• Query plans are annotated for later estimation
improvement
• Runtime statistics collection
• Statistics collector probes are inserted into operators of
executing query plan
• Plan modification strategy
• Discard current execution and re-optimize whole plan
• Re-optimizer only subtree of the plan that are not
started yet
• Save partial execution result and generate new SQL
using the result
2016/5/20 52
[N. Kabra et.al., SIGMOD’98]
Plan Bouquet
• Generate a set of plans for each selectivity range
• Estimation improvement with runtime statistics
collection
• Evaluation with PostgreSQL
2016/5/20 53
[A. Dutt et.al., SIGMOD’14]
Bounding Impact of Estimation Error
• “Uncertainty” analysis of cost estimation
• Optimality sensitivity to estimation error
• Execute partially to reduce uncertainty
2016/5/20 54
[T. Neumann et.al., BTW Conf ‘13]
Outline
• Introduction
• Theory: Query Optimization Framework
• Code: PostgreSQL Optimizer
• Theory: Cutting-Edge Technologies Overview
• Summary
2016/5/20 55
Summary
• Cost-based optimization framework
• System-R style bottom-up optimization
• Volcano style top-down optimization
• Detailed look at PostgreSQL optimizer
• Cost modeling of basic scan and join method
• Experiment with TPC-H benchmark
• Brief overview of cutting-edge technologies
2016/5/20 56

More Related Content

What's hot

雑なMySQLパフォーマンスチューニング
雑なMySQLパフォーマンスチューニング雑なMySQLパフォーマンスチューニング
雑なMySQLパフォーマンスチューニングyoku0825
 
PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~
PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~
PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~Miki Shimogai
 
SQLチューニング入門 入門編
SQLチューニング入門 入門編SQLチューニング入門 入門編
SQLチューニング入門 入門編Miki Shimogai
 
PostgreSQLの関数属性を知ろう
PostgreSQLの関数属性を知ろうPostgreSQLの関数属性を知ろう
PostgreSQLの関数属性を知ろうkasaharatt
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxData
 
Understanding SQL Trace, TKPROF and Execution Plan for beginners
Understanding SQL Trace, TKPROF and Execution Plan for beginnersUnderstanding SQL Trace, TKPROF and Execution Plan for beginners
Understanding SQL Trace, TKPROF and Execution Plan for beginnersCarlos Sierra
 
問合せ最適化インサイド
問合せ最適化インサイド問合せ最適化インサイド
問合せ最適化インサイドTakahiro Itagaki
 
Understanding my database through SQL*Plus using the free tool eDB360
Understanding my database through SQL*Plus using the free tool eDB360Understanding my database through SQL*Plus using the free tool eDB360
Understanding my database through SQL*Plus using the free tool eDB360Carlos Sierra
 
Adapting and adopting spm v04
Adapting and adopting spm v04Adapting and adopting spm v04
Adapting and adopting spm v04Carlos Sierra
 
Memoizeの仕組み(第41回PostgreSQLアンカンファレンス@オンライン 発表資料)
Memoizeの仕組み(第41回PostgreSQLアンカンファレンス@オンライン 発表資料)Memoizeの仕組み(第41回PostgreSQLアンカンファレンス@オンライン 発表資料)
Memoizeの仕組み(第41回PostgreSQLアンカンファレンス@オンライン 発表資料)NTT DATA Technology & Innovation
 
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in SparkSpark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in SparkBo Yang
 
Oracleの実行計画を読んでみよう! #dbts2017
Oracleの実行計画を読んでみよう!  #dbts2017Oracleの実行計画を読んでみよう!  #dbts2017
Oracleの実行計画を読んでみよう! #dbts2017Ryota Watabe
 
Low Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling ExamplesLow Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling ExamplesTanel Poder
 
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2Tanel Poder
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundMasahiko Sawada
 

What's hot (20)

雑なMySQLパフォーマンスチューニング
雑なMySQLパフォーマンスチューニング雑なMySQLパフォーマンスチューニング
雑なMySQLパフォーマンスチューニング
 
PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~
PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~
PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~
 
SQLチューニング入門 入門編
SQLチューニング入門 入門編SQLチューニング入門 入門編
SQLチューニング入門 入門編
 
PostgreSQLの関数属性を知ろう
PostgreSQLの関数属性を知ろうPostgreSQLの関数属性を知ろう
PostgreSQLの関数属性を知ろう
 
使ってみませんか?pg_hint_plan
使ってみませんか?pg_hint_plan使ってみませんか?pg_hint_plan
使ってみませんか?pg_hint_plan
 
SQL Tuning 101
SQL Tuning 101SQL Tuning 101
SQL Tuning 101
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
 
Understanding SQL Trace, TKPROF and Execution Plan for beginners
Understanding SQL Trace, TKPROF and Execution Plan for beginnersUnderstanding SQL Trace, TKPROF and Execution Plan for beginners
Understanding SQL Trace, TKPROF and Execution Plan for beginners
 
問合せ最適化インサイド
問合せ最適化インサイド問合せ最適化インサイド
問合せ最適化インサイド
 
Vacuum徹底解説
Vacuum徹底解説Vacuum徹底解説
Vacuum徹底解説
 
Understanding my database through SQL*Plus using the free tool eDB360
Understanding my database through SQL*Plus using the free tool eDB360Understanding my database through SQL*Plus using the free tool eDB360
Understanding my database through SQL*Plus using the free tool eDB360
 
Adapting and adopting spm v04
Adapting and adopting spm v04Adapting and adopting spm v04
Adapting and adopting spm v04
 
Memoizeの仕組み(第41回PostgreSQLアンカンファレンス@オンライン 発表資料)
Memoizeの仕組み(第41回PostgreSQLアンカンファレンス@オンライン 発表資料)Memoizeの仕組み(第41回PostgreSQLアンカンファレンス@オンライン 発表資料)
Memoizeの仕組み(第41回PostgreSQLアンカンファレンス@オンライン 発表資料)
 
SQLd360
SQLd360SQLd360
SQLd360
 
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in SparkSpark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
 
PostgreSQL Query Cache - "pqc"
PostgreSQL Query Cache - "pqc"PostgreSQL Query Cache - "pqc"
PostgreSQL Query Cache - "pqc"
 
Oracleの実行計画を読んでみよう! #dbts2017
Oracleの実行計画を読んでみよう!  #dbts2017Oracleの実行計画を読んでみよう!  #dbts2017
Oracleの実行計画を読んでみよう! #dbts2017
 
Low Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling ExamplesLow Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling Examples
 
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparound
 

Similar to Beyond EXPLAIN: Query Optimization From Theory To Code

Cost-Based Optimizer in Apache Spark 2.2
Cost-Based Optimizer in Apache Spark 2.2 Cost-Based Optimizer in Apache Spark 2.2
Cost-Based Optimizer in Apache Spark 2.2 Databricks
 
Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...
Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...
Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...Databricks
 
MongoDB's New Aggregation framework
MongoDB's New Aggregation frameworkMongoDB's New Aggregation framework
MongoDB's New Aggregation frameworkChris Westin
 
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized EngineApache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized EngineDataWorks Summit
 
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...Gruter
 
mongodb-aggregation-may-2012
mongodb-aggregation-may-2012mongodb-aggregation-may-2012
mongodb-aggregation-may-2012Chris Westin
 
Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...
Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...
Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...Spark Summit
 
Scalable Automatic Machine Learning in H2O
Scalable Automatic Machine Learning in H2OScalable Automatic Machine Learning in H2O
Scalable Automatic Machine Learning in H2OSri Ambati
 
Accelerating SQL queries in NoSQL Databases using Apache Drill and Secondary ...
Accelerating SQL queries in NoSQL Databases using Apache Drill and Secondary ...Accelerating SQL queries in NoSQL Databases using Apache Drill and Secondary ...
Accelerating SQL queries in NoSQL Databases using Apache Drill and Secondary ...Aman Sinha
 
Stacked Ensembles in H2O
Stacked Ensembles in H2OStacked Ensembles in H2O
Stacked Ensembles in H2OSri Ambati
 
From Pipelines to Refineries: Scaling Big Data Applications
From Pipelines to Refineries: Scaling Big Data ApplicationsFrom Pipelines to Refineries: Scaling Big Data Applications
From Pipelines to Refineries: Scaling Big Data ApplicationsDatabricks
 
The Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-SystemThe Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-Systeminside-BigData.com
 
From flat files to deconstructed database
From flat files to deconstructed databaseFrom flat files to deconstructed database
From flat files to deconstructed databaseJulien Le Dem
 
Apache Mahout 於電子商務的應用
Apache Mahout 於電子商務的應用Apache Mahout 於電子商務的應用
Apache Mahout 於電子商務的應用James Chen
 
PostgreSQL 9.4 and Beyond @ FOSSASIA 2015 Singapore
PostgreSQL 9.4 and Beyond @ FOSSASIA 2015 SingaporePostgreSQL 9.4 and Beyond @ FOSSASIA 2015 Singapore
PostgreSQL 9.4 and Beyond @ FOSSASIA 2015 SingaporeSatoshi Nagayasu
 
From Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim HunterFrom Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim HunterDatabricks
 

Similar to Beyond EXPLAIN: Query Optimization From Theory To Code (20)

Cost-Based Optimizer in Apache Spark 2.2
Cost-Based Optimizer in Apache Spark 2.2 Cost-Based Optimizer in Apache Spark 2.2
Cost-Based Optimizer in Apache Spark 2.2
 
Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...
Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...
Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...
 
MongoDB's New Aggregation framework
MongoDB's New Aggregation frameworkMongoDB's New Aggregation framework
MongoDB's New Aggregation framework
 
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized EngineApache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
 
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
 
mongodb-aggregation-may-2012
mongodb-aggregation-may-2012mongodb-aggregation-may-2012
mongodb-aggregation-may-2012
 
Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...
Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...
Cost-Based Optimizer Framework for Spark SQL: Spark Summit East talk by Ron H...
 
Scalable Automatic Machine Learning in H2O
Scalable Automatic Machine Learning in H2OScalable Automatic Machine Learning in H2O
Scalable Automatic Machine Learning in H2O
 
Accelerating SQL queries in NoSQL Databases using Apache Drill and Secondary ...
Accelerating SQL queries in NoSQL Databases using Apache Drill and Secondary ...Accelerating SQL queries in NoSQL Databases using Apache Drill and Secondary ...
Accelerating SQL queries in NoSQL Databases using Apache Drill and Secondary ...
 
DB
DBDB
DB
 
Stacked Ensembles in H2O
Stacked Ensembles in H2OStacked Ensembles in H2O
Stacked Ensembles in H2O
 
Intro_2.ppt
Intro_2.pptIntro_2.ppt
Intro_2.ppt
 
Intro.ppt
Intro.pptIntro.ppt
Intro.ppt
 
Intro.ppt
Intro.pptIntro.ppt
Intro.ppt
 
From Pipelines to Refineries: Scaling Big Data Applications
From Pipelines to Refineries: Scaling Big Data ApplicationsFrom Pipelines to Refineries: Scaling Big Data Applications
From Pipelines to Refineries: Scaling Big Data Applications
 
The Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-SystemThe Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-System
 
From flat files to deconstructed database
From flat files to deconstructed databaseFrom flat files to deconstructed database
From flat files to deconstructed database
 
Apache Mahout 於電子商務的應用
Apache Mahout 於電子商務的應用Apache Mahout 於電子商務的應用
Apache Mahout 於電子商務的應用
 
PostgreSQL 9.4 and Beyond @ FOSSASIA 2015 Singapore
PostgreSQL 9.4 and Beyond @ FOSSASIA 2015 SingaporePostgreSQL 9.4 and Beyond @ FOSSASIA 2015 Singapore
PostgreSQL 9.4 and Beyond @ FOSSASIA 2015 Singapore
 
From Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim HunterFrom Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim Hunter
 

More from Yuto Hayamizu

VLDB2015 会議報告
VLDB2015 会議報告VLDB2015 会議報告
VLDB2015 会議報告Yuto Hayamizu
 
The Art of Performance Evaluation
The Art of Performance EvaluationThe Art of Performance Evaluation
The Art of Performance EvaluationYuto Hayamizu
 
Fine-grained Measurement of PostgreSQL
Fine-grained Measurement of PostgreSQLFine-grained Measurement of PostgreSQL
Fine-grained Measurement of PostgreSQLYuto Hayamizu
 
Emacs Lispのある生活
Emacs Lispのある生活Emacs Lispのある生活
Emacs Lispのある生活Yuto Hayamizu
 
いいかげんな人のためのTransactional Memory Primer
いいかげんな人のためのTransactional Memory Primerいいかげんな人のためのTransactional Memory Primer
いいかげんな人のためのTransactional Memory PrimerYuto Hayamizu
 
C-x C-f のなかのひと
C-x C-f のなかのひとC-x C-f のなかのひと
C-x C-f のなかのひとYuto Hayamizu
 

More from Yuto Hayamizu (7)

VLDB2015 会議報告
VLDB2015 会議報告VLDB2015 会議報告
VLDB2015 会議報告
 
The Art of Performance Evaluation
The Art of Performance EvaluationThe Art of Performance Evaluation
The Art of Performance Evaluation
 
Fine-grained Measurement of PostgreSQL
Fine-grained Measurement of PostgreSQLFine-grained Measurement of PostgreSQL
Fine-grained Measurement of PostgreSQL
 
指の話
指の話指の話
指の話
 
Emacs Lispのある生活
Emacs Lispのある生活Emacs Lispのある生活
Emacs Lispのある生活
 
いいかげんな人のためのTransactional Memory Primer
いいかげんな人のためのTransactional Memory Primerいいかげんな人のためのTransactional Memory Primer
いいかげんな人のためのTransactional Memory Primer
 
C-x C-f のなかのひと
C-x C-f のなかのひとC-x C-f のなかのひと
C-x C-f のなかのひと
 

Recently uploaded

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 

Recently uploaded (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Beyond EXPLAIN: Query Optimization From Theory To Code

  • 1. Beyond EXPLAIN Yuto Hayamizu Ryoji Kawamichi Query Optimization From Theory To Code 2016/5/20 PGCon 2016 @ Ottawa
  • 2. Before Relational … • Querying was physical • Need to understand physical organization • Navigate query execution by yourself 2016/5/20 2 DEPT EMP PROJ “Which file is this table stored in?” “How are records linked?” “Which access path is fast for this table?” “What is the best order of joining tables” … Historically …
  • 3. Before Relational … • Querying was physical • Need to understand physical organization • Navigate query execution by yourself 2016/5/20 3 After Relational … DEPT EMP PROJ “Which file is this table stored in?” “How are records linked?” “Which access path is fast for this table?” “What is the best order of joining tables” … • Querying is logical • Physical organization is black-boxed • Just declare what you want Historically …
  • 4. Fill the Gap: Physical and Logical 2016/5/20 4 SELECT * FROM DEPT D, EMP E WHERE E.D_ID = D.ID AND ... • Storage I/O strategy • Access path selection • Join method selection • Aggregation, sorting • Resource allocation • ... Query Optimizer
  • 5. If optimizer perfectly fills the gap... 2016/5/20 5 We don’t need EXPLAIN
  • 6. Reality Is Tough • Optimizer is NOT PERFECT • Generated plans are not always optimal, sometimes far from optimal • We have to take care of physical behavior • That’s why EXPLAIN is so much explained 2016/5/20 6
  • 7. Go Beyond EXPLAIN • Deeper understanding of optimization, better control of your databases • Theoretical fundamentals of query optimization • From basic framework to cutting-edge technologies • PostgreSQL Optimizer implementation • Focusing on basic scan and join methods • Behavior observation with TPC-H benchmark 2016/5/20 7
  • 8. Outline • Introduction • Theory: Query Optimization Framework • Code: PostgreSQL Optimizer • Theory: Cutting-Edge Technologies Overview • Summary 2016/5/20 8
  • 9. Query Optimization Framework • Cost-based optimization • Plan selection with estimated execution cost • Most of modern optimizers, including PostgreSQL, are cost-based • Rule-based optimization • Plan selection with heuristically ranked rules • Easy to produce the same result • Hard to evaluate wide variety of plans • Ex) Oracle (~10g), Hive (~0.13) 2016/5/20 9
  • 10. Main Challenges in Cost-based Optimization • Cost modeling is HARD • Overhead of CPU, I/O, memory access, network, … • Cardinality estimation is HARD • Output size of scans, joins, aggregations, … • Join ordering search is HARD • Combinatorial explosion of join ordering and access path • Exhaustive search is NP-hard 2016/5/20 10
  • 11. System-R optimizer (1979) • “The standard” • Cost estimation with I/O and CPU • Cardinality estimation with table statistics • Bottom-up plan search • Many of modern optimizers are “System-R style” • PostgreSQL, MySQL, DB2, Oracle, ... 2016/5/20 11
  • 12. Cost/Cardinality Estimation • [#page fetched],[#storage API calls] are estimated with cost formula and following statistics 2016/5/20 12 CPU costI/O cost COST = [#page fetched] + W * [#storage API calls] weight parameter • NCARD(T) ... the cardinality of relation T • TCARD(T) ... the number of pages in relation T • ICARD(I) ... the number of distinct keys in index I • NINDX(I) ... the number of pages in index I
  • 13. Bottom-up Plan Search • Candidate plans for single relation • The cheapest access path • N-relation join ordering search • Select the cheapest plans for each relation • Then, find optimal join orderings of every 2-relation join • Then, find optimal join orderings of every 3-relation join • ... until N-relation 2016/5/20 13
  • 14. Ex) A ⨝ B ⨝ C ⨝ D
  • 15. Ex) A ⨝ B ⨝ C ⨝ D
  • 16. Ex) A ⨝ B ⨝ C ⨝ D
  • 17. Ex) A ⨝ B ⨝ C ⨝ D
  • 18. Ex) A ⨝ B ⨝ C ⨝ D
  • 19. Volcano/Cascades (1993) • Top-down transformational plan search • Yet another optimization approach • Not well known as “System-R style”, but widely used in practice Ex) SQL Server, Apache Hive (Apache Calcite), Greenplum Orca • Extensible optimization framework 2016/5/20 19
  • 20. Extensible Optimization Framework Query Optimizer Generator • Generalized expression of query plan not limited to relational data model • Users (optimizer developers) defines actual implementations: • Logical operator ... corresponds to relational algebra • Physical algorithm ... corresponds to scan & join methods such as sequential scan, index scan, hash join, nested loop join 2016/5/20 20
  • 21. Top-down Transformational Search • Starts from an initial “logical plan” • Generate alternative plans with: A) Logical operator transformation B) Physical algorithm selection C) Enforcing sorting order 2016/5/20 21 Join Select T Join Select R Select S Proj Join Select T JoinSelect R Select S Proj Change join ordering Join Select T Join Select R Select SProj Projection push down Example: 3-way join with projection
  • 22. Top-down Transformational Search • Starts from an initial “logical plan” • Generate alternative plans with: A) Logical operator transformation B) Physical algorithm selection C) Enforcing sorting order 2016/5/20 22 Join Select T Join Select R Select S ProjExample: 3-way join with projection HashJoin Select T Join SeqScan R SeqScan S Proj Join IdxScan T Join Select R Select S Proj …
  • 23. Top-down Transformational Search • Starts from an initial “logical plan” • Generate alternative plans with: A) Logical operator transformation B) Physical algorithm selection C) Enforcing sorting order 2016/5/20 23 Join Select T Join Select R Select S Proj Example: 3-way join with projection Join Select T Join Sort Sort Proj Select R Select S Enforce sorting order merge join of R and S is possible now
  • 24. Benefits of Top-down approach • Possible to intentionally limit search space • Effective pruning with branch-and-bound • Limit search space with search time deadline 2016/5/20 24
  • 25. Cost-based Optimization Basics Two major cost-based optimization style • System-R • Cost modeling with statistics • Bottom-up search • Volcano/Cascades • Extensible optimizer generator • Cost estimation is user’s responsibility • Top-down transformational search 2016/5/20 25
  • 26. Outline • Introduction • Theory: Query Optimization Framework • Code: PostgreSQL Optimizer • Theory: Cutting-Edge Technologies Overview • Summary 2016/5/20 26
  • 27. PostgreSQL Optimizer “System-R style” optimization • Bottom-up plan search with dynamic programming • CPU and I/O operation based cost modeling 2016/5/20 27 Seq. I/O Random I/O CPU cost per tuple Cost of single operation • seq_page_cost • random_page_cost • cpu_tuple_cost • cpu_index_tuple_cost • cpu_operator_cost • (parallel_tuple_cost) Estimated number of each operation • Cardinality estimation with statistics • Cost formula for each plan type • SeqScan, IndexScan • NestLoopJoin, HashJoin, MergeJoin, ...
  • 28. Detailed Look At Basic Scan Types • Sequential scan • Efficient for accessing large potion of tables • Index scan • Efficient for accessing a fraction of data 2016/5/20 28 Execution cost Query selectivity Sequential scan
  • 29. of SeqScan 29 = (# pages in a table) = (# tuples in a table) = #qual_operator = (#tuples) × (weight factor of A) + (#tuples) × (weight factor of B) + ・・・ ・・・ WHERE AND AND ・・・A B cost_seqscan() @optimizer/path/costsize.c
  • 30. of IndexScan Consists of: (A) CPU cost of searching B+-tree (B) CPU cost of scanning index tuples in leaf pages (C) I/O cost of leaf pages (D) I/O cost of heap pages (E) CPU cost of scanning heap tuples 2014/12/04 30
  • 31. of IndexScan (A) B+-tree search (B) Scanning index tuples in leaf pages 2014/12/04 31 += log2(#index_tuples) I/O cost of internal pages Assumed to be always cached in the buffer += #qual_operator × #leaf_pages ✕ #ituple_per_page × σ Selectivity σ Comes from statistics
  • 32. Mackert and Lohman function(Yao function) I/O count estimation with consideration of buffer caching of IndexScan (C) I/O cost of index leaf pages 2014/12/04 32 += Y(effective_cache_size, #leaf_pages) Selectivity σ I/Ocount
  • 33. of IndexScan (D) I/O cost of heap pages (E) CPU cost of scanning heap tuples ・ Estimate the number of scanned tuples from σ 2014/12/04 33 += α2 × #match_pages Correlation between index and heap ordering: α α = 0 : I/O pattern is random α = 1 : I/O pattern is sequential += (1-α2) × #match_tuples
  • 34. Detailed Look At Join Methods • Hash join • Efficient for joining large number of records • Usually combined with sequential scans • Nested Loop Join • Efficient for joining small number of records • Usually combined with index scans or small table sequential scans 2016/5/20 34
  • 36. of HashJoin Build phase • Cost += Cost(inner) 2014/12/04 36 += #qual_op × #inner_tuples += #inner_tuples Hashing cost
  • 37. of HashJoin 2014/12/04 37 Build phase • Cost += Cost(inner) += #qual_op × #inner_tuples += #inner_tuples
  • 38. of HashJoin 2014/12/04 38 Build phase • Cost += Cost(inner)+ Probe phase • Cost += Cost(outer)+ += #qual_op × #inner_tuples += #inner_tuples += #qual_op × (1 + #bucket_size × 0.5) × #outer_tuples += #match_tuples Hashing & table lookup (bucket search) cost
  • 39. recordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordtuple of HashJoin 2014/12/04 39 #buckets: 2 #buckets : 4 build 4 tuples are compared for lookup in average 2 tuples are compared for lookup in average 0.E+00 5.E+06 1.E+07 2.E+07 2.E+07 3.E+07 3.E+07 4.E+07 4.E+07 10000 100000 1000000 10000000 100000000 Estimated cost of 2-way HashJoin # of records 16 records recordrecordrecordrecordrecordrecordrecordtuple recordrecordrecordrecordrecordrecordrecordtuple recordrecordrecordtuple recordrecordrecordtuple recordrecordrecordtuple recordrecordrecordtuple
  • 40. of NestLoopJoin 2014/12/04 40 R ⨝ S Scan R r2 r1 r3 r4 Scan S with r1 s1 s2 s3 ReScan S with r2 s1 s2 s3 outer inner
  • 41. of NestLoopJoin • When #outer_tuples = 1 2014/12/04 41 R ⨝ S Scan R r1 Scan S with r1 s1 s2 s3 outer inner Cost = Cost(outer) + Cost(inner) + += #inner_tuples += #qual_operator × #inner_tuples
  • 42. of NestLoopJoin • When #outer_tuples > 1 R ⨝ S Scan R r2 r1 r3 r4 Scan S with r1 s1 s2 s3 ReScan S with r2 s1 s2 s3 outer inner Cost = Cost(outer) + Cost(inner) + + (#outer_tuples - 1) × Cost(ReScan inner) Higher buffer hit ratio in ReScan → Cost of ReScan is lower than cost of IndexScan += #inner_tuples × #outer_tuples += #qual_operator × #inner_tuples × #outer_tuples
  • 43. See How It Works • TPC-H Benchmark • Specification and tools for benchmarking data warehouse workload • Open source implementation: DBT-3, pg_tpch • Schema, data generation rules and queries • Experiments with 100GB • Scale Factor = 100 2016/5/20 43
  • 44. Experimental Setup • Dell R720xd • Xeon (2sockets, 16cores) • x24 NL-SAS HDD • With PostgreSQL 9.5 • Default cost parameter settings • SeqScan & HashJoin • enable_seqscan = on, enable_hashjoin = on and disables other methods • IndexScan & NestLoopJoin • enable_indexscan = on, enable_nestloop = on and disables other methods 2016/5/20 44
  • 45. TPC-H Q.1: The Simplest Case 2016/5/20 45 5.E+05 5.E+06 5.E+07 5.E+08 1 10 100 1000 Estimated cost Selectivity (l_shipdate) IndexScan SeqScan 10 100 1000 10000 1 10 100 1000 Execution time (sec) Selectivity(l_shipdate) IndexScan SeqScan • Good trend estimation for each method • Estimated break-event point is errorneus • IndexScan should be more expensive (need parameter calibration) SELECT count(*), ... FROM lineitem WHERE l_shipdate BETWEEN [X] AND [Y]
  • 46. TPC-H Q.3 2016/5/20 46 Estimated cost SeqScan customer SeqScan orders SeqScan lineitem Hash HashJoin HashJoin IndexScan orders IndexScan lineitem NestLoop NestLoop IndexScan customer Execution time (sec) Selectivity 1.E+00 1.E+02 1.E+04 1.E+06 1.E+08 1 10 100 1000 10000 100000 1000000 1 10 100 1000 10000 1 10 100 1000 10000 100000 1000000 Selectivity NestLoop+IndexScan HashJoin+SeqScan NestLoop+IndexScan HashJoin+SeqScan Similar result as in Q.1 • Good trend estimation for each • Erroneous break-event point without parameter calibration SELECT count(*), ... FROM customer, orders, lineitem WHERE c_custkey = o_custkey AND o_orderkey = l_orderkey AND c_custkey < [X] AND c_mktsegment = ‘MACHINERY’;
  • 47. 100 1000 10000 100000 1 10 100 1000 10000 More Complex Case TPC-H Q.4: Semi-Join Query • Plan selection for semi- join tend to be unstable 2016/5/20 47Selectivity 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1 10 100 1000 10000 HashJoin+SeqScan NestLoop+IndexScan Estimated cost Execution time (sec) HashJoin+SeqScan NestLoop+IndexScan SELECT count(*), ... FROM orders WHERE o_orderdate >= ‘1995-01-01’ AND o_orderdate < ‘1995-01-01’ + interval ‘3 month’ AND EXISTS( SELECT * FROM lineitem WHERE l_orderkey = o_orderkey AND l_commitdate < l_receiptdate) Selectivity
  • 48. 1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+00 1.E+02 1.E+04 1.E+06 1.E+08 More Complex Case TPC-H Q.22: Anti-Join Query • Difficulties in overall cost trend estimation 2016/5/20 48 Selectivity 1 10 100 1000 10000 1.E+00 1.E+02 1.E+04 1.E+06 1.E+08 Estimated cost Execution time (sec) Selectivity HashJoin+SeqScan NestLoop+IndexScan HashJoin+SeqScan NestLoop+IndexScan SELECT count(*), ... FROM supplier, lineitem l1, orders, nation WHERE s_suppkey = l1.l_suppkey AND o_orderkey = l1.l_orderkey AND o_orderstatus = 'F' AND l1.l_receiptdate > l1.l_commitdate AND EXISTS ( SELECT * FROM lineitem l2 WHERE l2.l_orderkey = l1.l_orderkey AND l2.l_suppkey <> l1.l_suppkey) AND NOT EXIST ( SELECT * FROM lineitem l3 WHERE l3.l_orderkey = l1.l_orderkey AND l3.l_suppkey <> l1.l_suppkey AND l3.l_receiptdate > l3.l_commitdate) AND s_nationkey = n_nationkey AND n_name = ‘JAPAN'
  • 49. Summary: PostgreSQL Optimizer • Detailed look at cost modeling of basic methods • SeqScan, IndexScan • HashJoin, NestedLoopJoin • Observation with TPC-H benchmark • Good cost trend estimation for simple join queries • Erroneous cheapest plan selection without parameter tuning • Difficulties with semi-join and anti-join queries 2016/5/20 49
  • 50. Outline • Introduction • Theory: Query Optimization Framework • Code: PostgreSQL Optimizer • Theory: Cutting-Edge Technologies Overview • Summary 2016/5/20 50
  • 51. Cutting-Edge Technologies • Traditional optimization was a “closed” problem • “Rethink the contract” ー Surajit Chaudhuri • Feedback from previous execution • Dynamic integration with execution 2016/5/20 51 cardinality estimation cost model plan space enumeration (SQL) query plan
  • 52. Mid-query Re-optimization • Detects sub-optimality of executing query plan • Query plans are annotated for later estimation improvement • Runtime statistics collection • Statistics collector probes are inserted into operators of executing query plan • Plan modification strategy • Discard current execution and re-optimize whole plan • Re-optimizer only subtree of the plan that are not started yet • Save partial execution result and generate new SQL using the result 2016/5/20 52 [N. Kabra et.al., SIGMOD’98]
  • 53. Plan Bouquet • Generate a set of plans for each selectivity range • Estimation improvement with runtime statistics collection • Evaluation with PostgreSQL 2016/5/20 53 [A. Dutt et.al., SIGMOD’14]
  • 54. Bounding Impact of Estimation Error • “Uncertainty” analysis of cost estimation • Optimality sensitivity to estimation error • Execute partially to reduce uncertainty 2016/5/20 54 [T. Neumann et.al., BTW Conf ‘13]
  • 55. Outline • Introduction • Theory: Query Optimization Framework • Code: PostgreSQL Optimizer • Theory: Cutting-Edge Technologies Overview • Summary 2016/5/20 55
  • 56. Summary • Cost-based optimization framework • System-R style bottom-up optimization • Volcano style top-down optimization • Detailed look at PostgreSQL optimizer • Cost modeling of basic scan and join method • Experiment with TPC-H benchmark • Brief overview of cutting-edge technologies 2016/5/20 56