Copyright © 2016, Oracle and/or its affiliates. All rights reserved.
MySQL Optimizer Overview
Olav Sandstå
Senior Principal Engineer
MySQL Optimizer Team, Oracle
April 19, 2016
Program Agenda
1. Logical transformations
2. Cost-based optimizations
3. Analyzing access methods
4. Join optimizer
5. Plan refinements
6. Subquery optimizations
7. Query execution plan
MySQL Optimizer
SELECT a, b
FROM t1
JOIN t2
ON t1.a = t2.b
JOIN t3
ON t2.b = t3.c
WHERE t2.d > 20
AND t2.d < 30;
Figure: the optimizer takes the query, statistics (from the storage engines), and table/index info (from the data dictionary) as input and produces a plan: a join tree over t1, t2, and t3 that uses a table scan, a range scan, and ref access.
MySQL Architecture
Figure: path of a query through the server:
1. SQL query
2. Parser
3. Resolver: semantic check, name resolution
4. Optimizer:
– Logical transformations
– Cost-based optimizer: join order and access methods
– Plan refinement
5. Query execution plan
6. Query execution, reading from the storage engine (InnoDB, MyISAM)
7. Query result
MySQL Optimizer Characteristics
• Produces the query plan that uses the fewest resources
– IO and CPU
• Optimizes a single query
– No inter-query optimizations
• Produces a left-deep linear query execution plan
Figure: left-deep plan in which t1 is joined with t2, the result with t3, and that result with t4; the access methods shown are two table scans, a range scan, and ref access.
Optimizer Overview
Main phases
Figure: the architecture pipeline (parser, resolver, optimizer, query execution, storage engine) with the optimizer phases expanded:
• Logical transformations
– Negation elimination
– Equality and constant propagation
– Evaluation of constant expressions
– Conversion of outer to inner join
– Subquery transformation
• Prepare for cost-based optimization
– Ref access analysis
– Range access analysis
– Estimation of condition fan out
– Constant table detection
• Cost-based optimizer: join order and access methods
– Access method selection
– Join order
• Plan refinement
– Table condition pushdown
– Access method adjustments
– Sort avoidance
– Index condition pushdown
Logical Transformations
• Logical transformations of query conditions:
– Negation elimination
– Equality propagation
– Evaluate constant expressions
– Remove trivial conditions
• Conversion of outer to inner join
• Merging of views and derived tables
• Subquery transformations
Goals: a simpler query to optimize and execute; prepare for later optimizations.
Example: Logical Transformations
SELECT * FROM t1 JOIN t2 WHERE
t1.a = t2.a AND t2.a = 9 AND (NOT (t1.a > 10 OR t2.b > 3) OR (t1.b = t2.b + 7 AND t2.b = 5));
Negation elimination:
t1.a = t2.a AND t2.a = 9 AND (t1.a <= 10 AND t2.b <= 3 OR (t1.b = t2.b + 7 AND t2.b = 5));
Equality/constant propagation:
t1.a = 9 AND t2.a = 9 AND (9 <= 10 AND t2.b <= 3 OR (t1.b = 5 + 7 AND t2.b = 5));
Evaluate constant expressions:
t1.a = 9 AND t2.a = 9 AND (9 <= 10 AND t2.b <= 3 OR (t1.b = 12 AND t2.b = 5));
Trivial condition removal (9 <= 10 is TRUE):
t1.a = 9 AND t2.a = 9 AND (t2.b <= 3 OR (t1.b = 12 AND t2.b = 5));
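The four transformation steps should leave the meaning of the WHERE clause unchanged. A minimal sketch that checks this by brute force over small integer values is shown here. It is only an illustration: the function names are hypothetical, the columns are treated as plain integers, and SQL NULL semantics are ignored.

```python
from itertools import product


def original(t1a, t1b, t2a, t2b):
    # WHERE clause as written in the query on the slide.
    return (t1a == t2a and t2a == 9
            and (not (t1a > 10 or t2b > 3) or (t1b == t2b + 7 and t2b == 5)))


def transformed(t1a, t1b, t2a, t2b):
    # Condition after negation elimination, constant propagation,
    # constant evaluation, and trivial condition removal.
    return (t1a == 9 and t2a == 9
            and (t2b <= 3 or (t1b == 12 and t2b == 5)))


def first_mismatch(values=range(0, 15)):
    # Try every combination of column values; return the first row where
    # the two predicates disagree, or None if they always agree.
    for row in product(values, repeat=4):
        if original(*row) != transformed(*row):
            return row
    return None


print(first_mismatch())
```

The value range 0 to 14 is chosen so that it covers every constant in the condition (3, 5, 9, 10, 12).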
Cost-based Query Optimization
General idea:
• Assign cost to operations
• Assign cost to partial or alternative plans
• Search for plan with lowest cost
Figure: example plan, a join tree over t1, t2, and t3 using a table scan, a range scan, and ref access.
Cost-based Query Optimizations
The main cost-based optimizations:
• Index and access method
– Table scan
– Index scan
– Range scan
– Index lookup (ref access)
• Join order
• Join buffering strategy
• Subquery strategy
Figure: example plan, a join tree over t1, t2, and t3 using a table scan, a range scan, and ref access.
Optimizer Cost Model
Figure: the cost model takes a candidate operation (for example a range scan on t1, or a JOIN) and returns a cost estimate and a row estimate.
• Cost formulas: access methods, join, subquery
• Cost constants: CPU, IO
• Inputs:
– Metadata: record and index size, index information, uniqueness
– Statistics: table size, cardinality, range estimates
– Cost model configuration (new in MySQL 5.7)
Cost Estimates
• The cost for executing a query
• Cost unit:
– “read a random data page from disk”
• Main cost factors:
– IO cost:
• #pages read from table
• #pages read from index
– CPU cost:
• Evaluating query conditions
• Comparing keys/records
• Sorting keys
• Main cost constants (new in MySQL 5.7: configurable):
Cost                                      Default value
Reading a random disk page                1.0
Reading a data page from memory buffer    1.0
Evaluating query condition                0.2
Comparing key/record                      0.1
Cost Model Examples
SELECT * FROM t1 WHERE a BETWEEN 20 AND 23
Table scan:
• IO-cost: #pages in table * IO_BLOCK_READ_COST
• CPU-cost: #records * ROW_EVALUATE_COST
Range scan (on secondary index):
• IO-cost: #records_in_range * IO_BLOCK_READ_COST
• CPU-cost: #records_in_range * ROW_EVALUATE_COST (evaluate range condition) +
#records_in_range * ROW_EVALUATE_COST (evaluate WHERE condition)
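The two formulas can be written out as a small calculation. The sketch below is illustrative only. It assumes that IO_BLOCK_READ_COST is the 1.0 default listed for reading a random disk page and that ROW_EVALUATE_COST is the 0.2 default listed for evaluating a query condition. The page and record counts are made-up inputs, and the function names are hypothetical.

```python
IO_BLOCK_READ_COST = 1.0  # assumed: default for reading a random disk page
ROW_EVALUATE_COST = 0.2   # assumed: default for evaluating a query condition


def table_scan_cost(pages_in_table, records):
    io_cost = pages_in_table * IO_BLOCK_READ_COST
    cpu_cost = records * ROW_EVALUATE_COST
    return io_cost + cpu_cost


def range_scan_cost(records_in_range):
    io_cost = records_in_range * IO_BLOCK_READ_COST
    cpu_cost = (records_in_range * ROW_EVALUATE_COST     # evaluate range condition
                + records_in_range * ROW_EVALUATE_COST)  # evaluate WHERE condition
    return io_cost + cpu_cost


def cheapest(costs):
    # Pick the access method with the lowest estimated cost.
    return min(costs, key=costs.get)


# Hypothetical table: 100 pages, 10,000 records, 12 records in the range.
costs = {
    "table scan": table_scan_cost(100, 10_000),
    "range scan": range_scan_cost(12),
}
print(cheapest(costs), costs)
```

With these made-up numbers the range scan is far cheaper, because it touches 12 records instead of 10,000.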
Selecting Access Method
Finding the optimal method to read data from storage engine
• For each table, find the best access method:
1. Check if the access method is useful
2. Estimate cost of using access method
3. Select the cheapest to be used
• Choice of access method is cost based
Main access methods:
• Table scan
• Index scan
• Index lookup (ref access)
• Range scan
• Index merge
• Loose index scan
Index Lookup (Ref Access)
• Read all records with a given key value using an index
• Examples:
SELECT * FROM t1 WHERE t1.key = 7;
SELECT * FROM t1, t2 WHERE t1.key = t2.key;
• “eq_ref”:
– Reading from a unique index, max one record returned
• “ref”:
– Reading from a non-unique index or a prefix of an index, possibly multiple records returned
– The record estimate is based on the cardinality number from index statistics
Ref Access Analysis
• Determine which indexes can be used for index lookup in a join
SELECT city.name AS capital, language.name
FROM city
JOIN country ON city.country_id = country.country_id
JOIN language ON country.country_id = language.country_id
WHERE city.city_id = country.capital;
Figure: the tables and the columns involved:
– country: country_id, capital
– city: country_id, city_id, name
– language: country_id, name
Range Optimizer
• Goal: find the “minimal” ranges to read for each index
• Example:
SELECT * FROM t1 WHERE (key1 > 10 AND key1 < 20) AND key2 > 30
• Range scan using INDEX(key1): 10 < key1 < 20
• Range scan using INDEX(key2): 30 < key2
Range Optimizer, cont.
• Range optimizer selects the “useful” parts of the WHERE condition:
– Conditions comparing a column value with a constant:
key > 3
key = 4
key IS NULL
key BETWEEN 4 AND 6
key LIKE "abc%"
key IN (10,12,..)
– Nested AND/OR conditions are supported
• Result: list of disjoint ranges that need to be read from index
• Cost estimate based on number of records in each range:
– Record estimate is found by asking the storage engine (“index dives”)
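The "list of disjoint ranges" result can be illustrated with a small interval merge. This is a simplified sketch, not the range optimizer's actual algorithm: it assumes closed integer ranges on a single key and ignores open bounds, NULL, and multi-part keys. The function name is hypothetical.

```python
def disjoint_ranges(ranges):
    """Merge closed integer ranges (low, high) into a sorted, disjoint list."""
    merged = []
    for low, high in sorted(ranges):
        if merged and low <= merged[-1][1]:
            # Overlaps the previous range: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], high))
        else:
            merged.append((low, high))
    return merged


# key = 4 OR key BETWEEN 4 AND 6 OR key IN (10, 12)
print(disjoint_ranges([(4, 4), (4, 6), (10, 10), (12, 12)]))
# prints [(4, 6), (10, 10), (12, 12)]
```

Each resulting range would then get its own record estimate, and the estimates are summed to cost the range scan.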
Range Access for Multi-part Index
Example table with multi-part index
• Table: columns pk, a, b, c, d
• INDEX idx (a, b, c);
• Logical storage layout of index:
Figure: index entries ordered by a (10, 11, 12, 13); within each value of a, ordered by b (1 to 5); within each value of b, ordered by c.
Range Access for Multi-part Index, cont
• Equality on 1st index part?
– Can add condition on 2nd index part to range condition
• Example:
SELECT * from t1 WHERE a IN (10,11,13) AND (b=2 OR b=4)
• Resulting range scan: six ranges, (a, b) = (10,2), (10,4), (11,2), (11,4), (13,2), (13,4)
Range Access for Multi-part Index, cont
• Non-equality on 1st index part:
– Can NOT add condition on 2nd index part in range condition
• Example:
SELECT * from t1 WHERE a > 10 AND a < 13 AND (b=2 OR b=4)
• Resulting range scan: one range, a > 10 AND a < 13 (all values of b for a = 11 and a = 12 are read)
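The difference between the two multi-part index cases can be sketched as follows. This is an illustration under simplifying assumptions, not MySQL's implementation: the function name and the tuple encoding of the condition on a are hypothetical, and ranges are returned as readable strings.

```python
from itertools import product


def plan_range_scan(a_cond, b_values):
    """Return (ranges, residual_condition) for INDEX idx (a, b, c).

    a_cond is ("in", [values]) or ("between_exclusive", low, high).
    """
    if a_cond[0] == "in":
        # Equality on the first index part: the condition on b can be
        # folded into the ranges, one point range per (a, b) combination.
        ranges = [f"a = {a} AND b = {b}"
                  for a, b in product(sorted(a_cond[1]), sorted(b_values))]
        residual = None
    else:
        # Non-equality on the first index part: only a bounds the range;
        # the condition on b must be checked for every entry that is read.
        _, low, high = a_cond
        ranges = [f"{low} < a < {high}"]
        residual = "b IN (" + ", ".join(str(b) for b in sorted(b_values)) + ")"
    return ranges, residual


print(plan_range_scan(("in", [10, 11, 13]), [2, 4]))
print(plan_range_scan(("between_exclusive", 10, 13), [2, 4]))
```

The first call yields six narrow ranges and no leftover condition; the second yields one wide range plus a condition on b that is applied to each entry read.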
Index Merge
• Use multiple indexes on the same table
• Implemented index merge strategies:
– Index Merge Union
• OR conditions between different indexes
– Index Merge Intersect
• AND conditions between different indexes
– Index Merge Sort-Union
• OR conditions where condition is a range
• Example:
SELECT * FROM t1 WHERE a=10 OR b=10
Figure: the entries for 10 are read from INDEX(a) and from INDEX(b), and their union gives the result for a=10 OR b=10.
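Index merge union and intersect behave like set operations on the row ids returned by each index. The sketch below models that idea with Python sets. The table contents and the index_lookup helper are made up for illustration and do not reflect how the storage engine reads indexes.

```python
# Hypothetical table t1: row id -> column values.
rows = {
    1: {"a": 10, "b": 3},
    2: {"a": 7, "b": 10},
    3: {"a": 10, "b": 10},
    4: {"a": 1, "b": 2},
}


def index_lookup(column, value):
    # Stand-in for reading matching row ids from a single-column index.
    return {rowid for rowid, row in rows.items() if row[column] == value}


union = index_lookup("a", 10) | index_lookup("b", 10)      # a=10 OR b=10
intersect = index_lookup("a", 10) & index_lookup("b", 10)  # a=10 AND b=10
print(sorted(union), sorted(intersect))
# prints [1, 2, 3] [3]
```

Row 3 matches both indexes but appears only once in the union, which is why the merge step has to eliminate duplicates.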
Loose Index Scan
• Optimization for GROUP BY and DISTINCT:
SELECT a, b FROM t1 GROUP BY a, b;
SELECT DISTINCT a, b FROM t1;
SELECT a, MIN(b) FROM t1 GROUP BY a;
• GROUP BY/DISTINCT must be on the prefix of a multipart index
Figure: logical layout of INDEX idx (a, b, c), with a = 10 to 13 and b = 1 to 5 within each value of a.
Join Optimizer
• Goal:
– Given a JOIN of N tables, find the best JOIN ordering
• “Greedy search strategy”:
– Start with all 1-table plans
– Expand each plan with remaining tables
• Depth-first
– If “cost of partial plan” > “cost of best plan”:
• “prune” plan
– Heuristic pruning:
• Prune less promising partial plans
• N! possible plans
Figure: search tree of join orders for t1, t2, t3, and t4.
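The depth-first expansion with cost-based pruning can be sketched in a few lines. This is a toy model, not the MySQL join optimizer: it does exhaustive depth-first search with cost pruning only (no heuristic pruning), the cost of a step is simply the number of rows it produces, and every row count and rows-per-key figure below is hypothetical.

```python
import math

# Hypothetical statistics for the three tables.
TABLE_ROWS = {"city": 4000, "country": 200, "language": 1000}
ROWS_PER_KEY = {"city": 20, "country": 1, "language": 5}
JOIN_EDGES = {frozenset(("city", "country")), frozenset(("country", "language"))}


def extend(prefix, rows_so_far, table):
    """Return (rows, added_cost) for appending `table` to the partial plan."""
    if not prefix:
        rows = TABLE_ROWS[table]                  # first table: read all rows
    elif any(frozenset((p, table)) in JOIN_EDGES for p in prefix):
        rows = rows_so_far * ROWS_PER_KEY[table]  # index lookup per input row
    else:
        rows = rows_so_far * TABLE_ROWS[table]    # no join condition: cross product
    return rows, rows


def best_join_order(tables):
    best = {"cost": math.inf, "order": None}

    def expand(prefix, remaining, rows, cost):
        if cost >= best["cost"]:
            return  # prune: this partial plan cannot beat the best complete plan
        if not remaining:
            best["cost"], best["order"] = cost, prefix
            return
        for table in sorted(remaining):
            new_rows, added = extend(prefix, rows, table)
            expand(prefix + (table,), remaining - {table}, new_rows, cost + added)

    expand((), frozenset(tables), 1, 0)
    return best["order"], best["cost"]


print(best_join_order(TABLE_ROWS))
# prints (('country', 'language', 'city'), 21200)
```

In this toy run the orders that start with a cross product (for example city followed by language) are pruned after two tables, which is the effect the pruning rule is meant to have.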
Join Optimizer Illustrated
SELECT city.name as capital, language.name
FROM city
JOIN country ON city.country_id = country.country_id
JOIN language ON country.country_id = language.country_id
WHERE city.city_id = country.capital;
Figure: search tree that begins at "start" and expands the join orders of language, country, and city; the plan costs shown in the tree are 26568, 32568, 627, 1245, and 862.
Record and Cost Estimates for JOIN
Condition filter effect (new in MySQL 5.7)
• tx JOIN tx+1
• records(tx+1) = records(tx) * condition_filter_effect * records_per_key
Figure: tx is joined to tx+1 using ref access. Labels:
– records(tx): number of records read from tx
– condition filter effect: determines how many of those records pass the table conditions on tx
– records_per_key: from the cardinality statistics for the index
How to Calculate Condition Filter Effect, step 1
New in MySQL 5.7
SELECT office_name
FROM office JOIN employee
WHERE office.id = employee.office_id AND
employee.name = "John" AND
employee.first_office_id <> office.id;
A condition contributes to the condition filter effect for a table only if:
– It references a field in the table
– It is not used by the access method
– It depends on an available value:
• employee.name = "John" will always contribute to filter on employee
• employee.first_office_id <> office.id; depends on JOIN order
How to Calculate Condition Filter Effect, step 2
SELECT *
FROM office JOIN employee ON office.id = employee.office_id
WHERE office_name = "San Francisco" AND
employee.name = "John" AND age > 21 AND
hire_date BETWEEN "2014-01-01" AND "2014-06-01";
Filter estimate based on what is available:
1. Range estimate
2. Index statistics
3. Guesstimate:
Condition       Filter estimate
=               0.1
<=, <, >, >=    1/3
BETWEEN         1/9
NOT <op>        1 - SEL(<op>)
AND             P(A and B) = P(A) * P(B)
OR              P(A or B) = P(A) + P(B) - P(A and B)
…               …
Calculating Condition Filter Effect for Tables
Example
SELECT *
FROM office JOIN employee ON office.id = employee.office_id
WHERE office_name = "San Francisco" AND
employee.name = "John" AND age > 21 AND
hire_date BETWEEN "2014-01-01" AND "2014-06-01";
Filter estimate per condition:
– office_name = "San Francisco": 0.03 (index)
– employee.name = "John": 0.1 (guesstimate)
– age > 21: 0.89 (range)
– hire_date BETWEEN "2014-01-01" AND "2014-06-01": 0.11 (guesstimate)
Condition filter effect for tables:
– office: 0.03
– employee: 0.1 * 0.11 * 0.89
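The combination rules and the join record formula can be tried out numerically. The sketch below is a rough illustration: the function names are hypothetical, conditions are assumed to be independent (as the AND rule implies), and the record count and records_per_key passed to join_records are made-up inputs.

```python
from math import prod


def and_filter(filters):
    # P(A and B) = P(A) * P(B), assuming independent conditions.
    return prod(filters)


def or_filter(p_a, p_b):
    # P(A or B) = P(A) + P(B) - P(A and B)
    return p_a + p_b - p_a * p_b


def join_records(records_tx, condition_filter_effect, records_per_key):
    # records(tx+1) = records(tx) * condition_filter_effect * records_per_key
    return records_tx * condition_filter_effect * records_per_key


# employee: name = "John" (0.1), hire_date BETWEEN (0.11), age > 21 (0.89)
employee_filter = and_filter([0.1, 0.11, 0.89])
print(round(employee_filter, 5))

# Hypothetical: 10,000 office rows read, filter 0.03, 50 employees per office key.
print(join_records(10_000, 0.03, 50))
```

The employee filter comes out just under 1%, so the optimizer would expect roughly one in a hundred employee rows to survive the three conditions.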
Finalizing Query Plan
Main optimizations:
• Assigning query conditions to tables
– Evaluate conditions as early as possible in join order
• ORDER BY optimization: avoid sorting
– Change to different index
– Read in descending order
• Change to a cheaper access method
– Example: Use range scan instead of table scan or ref access
• Index Condition Pushdown
ORDER BY Optimizations
• General solution: “File sort”
– Store query result in temporary table before sorting
– If data volume is large, may need to sort in several passes with intermediate storage on disk
• Optimizations:
– Switch to use index that provides result in sorted order
– For “LIMIT n” queries, maintain priority queue on n top items in memory instead of file sort
Index Condition Pushdown
• Pushes conditions that can be evaluated on the index down to storage engine
– Works only on indexed columns
• Goal: evaluate conditions without having to access the actual record
– Reduces number of disk/block accesses
– Reduces CPU usage
Figure: the MySQL server passes query conditions down to the storage engine, which evaluates them on the index before accessing the table data.
Overview of Subquery Optimizations
Subquery category:
• IN (SELECT …)
• NOT IN (SELECT …)
• FROM (SELECT …)
• <CompOp> ALL/ANY (SELECT ..)
• EXISTS/other
Strategy:
• Semi-join
• Materialization
• IN ➜ EXISTS
• Merged
• Materialized
• MAX/MIN re-write
• Execute subquery
New in MySQL 5.7
Optimization of IN Subqueries
A. Semi-join Transformation
1. Transform IN (and =ANY) subquery to semi-join:
SELECT * FROM t1
WHERE query_where AND outer_expr IN (SELECT inner_expr FROM t2 WHERE cond2)
➜
SELECT * FROM t1 SEMIJOIN t2 ON outer_expr = inner_expr
WHERE query_where AND cond2
2. Apply transformations/strategies for avoiding/removing duplicates:
• Table pullout
• Duplicate Weedout
• First Match
• LooseScan
• Semi-join materialization
3. Optimize using cost-based JOIN optimizer
Optimization of IN Subqueries, cont.
B. Subquery Materialization
• Only for non-correlated subqueries
• Execute subquery once
– Store result in temporary table with unique index (removes duplicates)
• Outer query does lookup in temporary table
SELECT title FROM film
WHERE film_id IN
(SELECT film_id FROM actor WHERE name="Bullock")
Figure: the subquery result is materialized into a temporary table with an index, and the outer query does lookups in it.
Optimization of IN Subqueries, cont.
C. IN ➜ EXISTS transformation
• Convert IN subquery to EXISTS subquery by “push-down” IN-equality to subquery:
SELECT title FROM film
WHERE film_id IN (SELECT film_id FROM actor WHERE name="Bullock")
➜
SELECT title FROM film
WHERE EXISTS (SELECT 1 FROM actor
WHERE name="Bullock" AND film.film_id = actor.film_id)
• Benefit: subquery will evaluate fewer records
• Note: special handling if pushed down expressions can be NULL
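That the IN form and the EXISTS form return the same rows can be checked on a toy data set. The sketch below uses Python's built-in sqlite3 module as a stand-in database, so it says nothing about how MySQL executes either query. The schema and the sample rows are assumptions made for the example, and no NULL values are involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE actor (name TEXT, film_id INTEGER);
    INSERT INTO film VALUES (1, 'Speed'), (2, 'Gravity'), (3, 'Alien');
    INSERT INTO actor VALUES
        ('Bullock', 1), ('Bullock', 2), ('Reeves', 1), ('Weaver', 3);
""")

in_query = """
    SELECT title FROM film
    WHERE film_id IN (SELECT film_id FROM actor WHERE name = 'Bullock')
    ORDER BY title
"""
exists_query = """
    SELECT title FROM film
    WHERE EXISTS (SELECT 1 FROM actor
                  WHERE name = 'Bullock' AND film.film_id = actor.film_id)
    ORDER BY title
"""

in_titles = [row[0] for row in conn.execute(in_query)]
exists_titles = [row[0] for row in conn.execute(exists_query)]
print(in_titles, exists_titles)
# prints ['Gravity', 'Speed'] ['Gravity', 'Speed']
```

The EXISTS form carries the extra equality on film_id into the subquery, which is what lets the subquery look at fewer actor rows per outer row.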
Understanding the Query Plan
EXPLAIN
• Use EXPLAIN to print the final query plan:
EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.a = t2.a WHERE b > 10 AND c > 10;
+----+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+
| 1  | SIMPLE      | t1    | range | PRIMARY,idx1  | idx1 | 4       | NULL | 12   | 33.33    | Using index condition |
| 2  | SIMPLE      | t2    | ref   | idx2          | idx2 | 4       | t1.a | 1    | 100.00   | NULL                  |
+----+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+
• Explain for a running query (new in MySQL 5.7):
EXPLAIN FOR CONNECTION connection_id;
Understanding the Query Plan
Structured EXPLAIN
Added in MySQL 5.7
• JSON format:
EXPLAIN FORMAT=JSON SELECT …
• Contains more information:
– Used index parts
– Pushed index conditions
– Cost estimates
– Data estimates
EXPLAIN FORMAT=JSON
SELECT * FROM t1 WHERE b > 10 AND c > 10;
EXPLAIN
{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "17.81"
    },
    "table": {
      "table_name": "t1",
      "access_type": "range",
      "possible_keys": [
        "idx1"
      ],
      "key": "idx1",
      "used_key_parts": [
        "b"
      ],
      "key_length": "4",
      "rows_examined_per_scan": 12,
      "rows_produced_per_join": 3,
      "filtered": "33.33",
      "index_condition": "(`test`.`t1`.`b` > 10)",
      "cost_info": {
        "read_cost": "17.01",
        "eval_cost": "0.80",
        "prefix_cost": "17.81",
        "data_read_per_join": "63"
      },
      ………
      "attached_condition": "(`test`.`t1`.`c` > 10)"
    }
  }
}
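Because the structured output is JSON, it can be inspected with a script. The sketch below parses a trimmed copy of the output shown on the slide (the elided part is left out) and checks how the cost fields relate in this one example. It is not a general statement about every plan, and the variable names are hypothetical.

```python
import json

# Trimmed copy of the EXPLAIN FORMAT=JSON output from the slide.
explain_output = """
{"query_block": {
    "select_id": 1,
    "cost_info": {"query_cost": "17.81"},
    "table": {
        "table_name": "t1",
        "access_type": "range",
        "key": "idx1",
        "rows_examined_per_scan": 12,
        "filtered": "33.33",
        "cost_info": {"read_cost": "17.01",
                      "eval_cost": "0.80",
                      "prefix_cost": "17.81"}}}}
"""

block = json.loads(explain_output)["query_block"]
table = block["table"]

# Cost values are reported as strings, so convert them before doing arithmetic.
costs = {name: float(value) for name, value in table["cost_info"].items()}
total = costs["read_cost"] + costs["eval_cost"]

print(table["table_name"], table["access_type"],
      round(total, 2), block["cost_info"]["query_cost"])
# prints t1 range 17.81 17.81
```

In this single-table example, read_cost plus eval_cost matches both prefix_cost and the overall query_cost.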
Understanding the Query Plan
Visual Explain in MySQL Workbench
Optimizer Trace
Understand HOW a query is optimized
• Trace of the main steps and decisions made by the optimizer
SET optimizer_trace="enabled=on";
SELECT * FROM t1 WHERE a > 10;
SELECT * FROM
INFORMATION_SCHEMA.OPTIMIZER_TRACE;
"table": "`t1`",
"range_analysis": {
  "table_scan": {
    "rows": 54,
    "cost": 13.9
  },
  "best_covering_index_scan": {
    "index": "idx",
    "cost": 11.903,
    "chosen": true
  },
  "analyzing_range_alternatives": {
    "range_scan_alternatives": [
      {
        "index": "idx",
        "ranges": [
          "10 < a"
        ],
        "rowid_ordered": false,
        "using_mrr": false,
        "index_only": true,
        "rows": 12,
        "cost": 3.4314,
        "chosen": true
      }
Influencing the Optimizer
When the optimizer does not do what you want:
• Add indexes
• Use hints (new hint syntax and new hints in MySQL 5.7):
– Index hints: USE INDEX, FORCE INDEX, IGNORE INDEX
– Join order: STRAIGHT_JOIN
– Subquery strategy: /*+ SEMIJOIN(FirstMatch) */
– Join buffer strategy: /*+ BKA(table1) */
• Adjust optimizer_switch flags:
– set optimizer_switch="condition_fanout_filter=OFF"
• Ask a question in the MySQL optimizer forum
Summary
• Query transformations
• Selecting data access method
• Join optimizer
• Subquery optimizations
• Plan refinements
Questions?
Optimizer: What's New in 5.7 and Sneak Peek at 5.8
Thursday at 11:00
Ballroom C

More Related Content

What's hot

Histograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQLHistograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQLSergey Petrunya
 
Optimizing queries MySQL
Optimizing queries MySQLOptimizing queries MySQL
Optimizing queries MySQLGeorgi Sotirov
 
Infobright Column-Oriented Analytical Database Engine
Infobright Column-Oriented Analytical Database EngineInfobright Column-Oriented Analytical Database Engine
Infobright Column-Oriented Analytical Database EngineAlex Esterkin
 
M|18 Deep Dive: InnoDB Transactions and Write Paths
M|18 Deep Dive: InnoDB Transactions and Write PathsM|18 Deep Dive: InnoDB Transactions and Write Paths
M|18 Deep Dive: InnoDB Transactions and Write PathsMariaDB plc
 
Using Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query PerformanceUsing Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query Performanceoysteing
 
How to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better PerformanceHow to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better Performanceoysteing
 
Mysql Explain Explained
Mysql Explain ExplainedMysql Explain Explained
Mysql Explain ExplainedJeremy Coates
 
Inno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code StructureInno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code StructureMySQLConference
 
Understanding SQL Trace, TKPROF and Execution Plan for beginners
Understanding SQL Trace, TKPROF and Execution Plan for beginnersUnderstanding SQL Trace, TKPROF and Execution Plan for beginners
Understanding SQL Trace, TKPROF and Execution Plan for beginnersCarlos Sierra
 
Regular Expressions 101 Introduction to Regular Expressions
Regular Expressions 101 Introduction to Regular ExpressionsRegular Expressions 101 Introduction to Regular Expressions
Regular Expressions 101 Introduction to Regular ExpressionsDanny Bryant
 
Local Secondary Indexes in Apache Phoenix
Local Secondary Indexes in Apache PhoenixLocal Secondary Indexes in Apache Phoenix
Local Secondary Indexes in Apache PhoenixRajeshbabu Chintaguntla
 
The InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLThe InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLMorgan Tocker
 
InfluxDB 101 - Concepts and Architecture | Michael DeSa | InfluxData
InfluxDB 101 - Concepts and Architecture | Michael DeSa | InfluxDataInfluxDB 101 - Concepts and Architecture | Michael DeSa | InfluxData
InfluxDB 101 - Concepts and Architecture | Michael DeSa | InfluxDataInfluxData
 
CDC Stream Processing with Apache Flink
CDC Stream Processing with Apache FlinkCDC Stream Processing with Apache Flink
CDC Stream Processing with Apache FlinkTimo Walther
 
PostgreSQL: Joining 1 million tables
PostgreSQL: Joining 1 million tablesPostgreSQL: Joining 1 million tables
PostgreSQL: Joining 1 million tablesHans-Jürgen Schönig
 
InnoDB Flushing and Checkpoints
InnoDB Flushing and CheckpointsInnoDB Flushing and Checkpoints
InnoDB Flushing and CheckpointsMIJIN AN
 
MySQL Query And Index Tuning
MySQL Query And Index TuningMySQL Query And Index Tuning
MySQL Query And Index TuningManikanda kumar
 

What's hot (20)

Histograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQLHistograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQL
 
Optimizing queries MySQL
Optimizing queries MySQLOptimizing queries MySQL
Optimizing queries MySQL
 
Infobright Column-Oriented Analytical Database Engine
Infobright Column-Oriented Analytical Database EngineInfobright Column-Oriented Analytical Database Engine
Infobright Column-Oriented Analytical Database Engine
 
M|18 Deep Dive: InnoDB Transactions and Write Paths
M|18 Deep Dive: InnoDB Transactions and Write PathsM|18 Deep Dive: InnoDB Transactions and Write Paths
M|18 Deep Dive: InnoDB Transactions and Write Paths
 
Sql oracle
Sql oracleSql oracle
Sql oracle
 
Using Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query PerformanceUsing Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query Performance
 
How to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better PerformanceHow to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better Performance
 
Mysql Explain Explained
Mysql Explain ExplainedMysql Explain Explained
Mysql Explain Explained
 
Inno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code StructureInno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code Structure
 
Understanding SQL Trace, TKPROF and Execution Plan for beginners
Understanding SQL Trace, TKPROF and Execution Plan for beginnersUnderstanding SQL Trace, TKPROF and Execution Plan for beginners
Understanding SQL Trace, TKPROF and Execution Plan for beginners
 
Regular Expressions 101 Introduction to Regular Expressions
Regular Expressions 101 Introduction to Regular ExpressionsRegular Expressions 101 Introduction to Regular Expressions
Regular Expressions 101 Introduction to Regular Expressions
 
Local Secondary Indexes in Apache Phoenix
Local Secondary Indexes in Apache PhoenixLocal Secondary Indexes in Apache Phoenix
Local Secondary Indexes in Apache Phoenix
 
The InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLThe InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQL
 
InfluxDB 101 - Concepts and Architecture | Michael DeSa | InfluxData
InfluxDB 101 - Concepts and Architecture | Michael DeSa | InfluxDataInfluxDB 101 - Concepts and Architecture | Michael DeSa | InfluxData
InfluxDB 101 - Concepts and Architecture | Michael DeSa | InfluxData
 
CDC Stream Processing with Apache Flink
CDC Stream Processing with Apache FlinkCDC Stream Processing with Apache Flink
CDC Stream Processing with Apache Flink
 
advanced sql(database)
advanced sql(database)advanced sql(database)
advanced sql(database)
 
How to Use JSON in MySQL Wrong
How to Use JSON in MySQL WrongHow to Use JSON in MySQL Wrong
How to Use JSON in MySQL Wrong
 
PostgreSQL: Joining 1 million tables
PostgreSQL: Joining 1 million tablesPostgreSQL: Joining 1 million tables
PostgreSQL: Joining 1 million tables
 
InnoDB Flushing and Checkpoints
InnoDB Flushing and CheckpointsInnoDB Flushing and Checkpoints
InnoDB Flushing and Checkpoints
 
MySQL Query And Index Tuning
MySQL Query And Index TuningMySQL Query And Index Tuning
MySQL Query And Index Tuning
 

Viewers also liked

MySQL Manchester TT - Security
MySQL Manchester TT  - SecurityMySQL Manchester TT  - Security
MySQL Manchester TT - SecurityMark Swarbrick
 
Zend Core on IBM i - Security Considerations
Zend Core on IBM i - Security ConsiderationsZend Core on IBM i - Security Considerations
Zend Core on IBM i - Security ConsiderationsZendCon
 
PHP on IBM i Tutorial
PHP on IBM i TutorialPHP on IBM i Tutorial
PHP on IBM i TutorialZendCon
 
Tiery Eyed
Tiery EyedTiery Eyed
Tiery EyedZendCon
 
MySQL Tech Tour 2015 - 5.7 Connector/J/Net
MySQL Tech Tour 2015 - 5.7 Connector/J/NetMySQL Tech Tour 2015 - 5.7 Connector/J/Net
MySQL Tech Tour 2015 - 5.7 Connector/J/NetMark Swarbrick
 
MySQL Manchester TT - Replication Features
MySQL Manchester TT  - Replication FeaturesMySQL Manchester TT  - Replication Features
MySQL Manchester TT - Replication FeaturesMark Swarbrick
 
Application Diagnosis with Zend Server Tracing
Application Diagnosis with Zend Server TracingApplication Diagnosis with Zend Server Tracing
Application Diagnosis with Zend Server TracingZendCon
 
MySQL Manchester TT - 5.7 Whats new
MySQL Manchester TT - 5.7 Whats newMySQL Manchester TT - 5.7 Whats new
MySQL Manchester TT - 5.7 Whats newMark Swarbrick
 
Why MySQL High Availability Matters
Why MySQL High Availability MattersWhy MySQL High Availability Matters
Why MySQL High Availability MattersMark Swarbrick
 
Oracle Compute Cloud Service快速实践
Oracle Compute Cloud Service快速实践Oracle Compute Cloud Service快速实践
Oracle Compute Cloud Service快速实践Zhaoyang Wang
 
Oracle cloud ravello介绍及测试账户申请
Oracle cloud ravello介绍及测试账户申请Oracle cloud ravello介绍及测试账户申请
Oracle cloud ravello介绍及测试账户申请Zhaoyang Wang
 
PHP and Platform Independance in the Cloud
PHP and Platform Independance in the CloudPHP and Platform Independance in the Cloud
PHP and Platform Independance in the CloudZendCon
 
Framework Shootout
Framework ShootoutFramework Shootout
Framework ShootoutZendCon
 
PHP on Windows - What's New
PHP on Windows - What's NewPHP on Windows - What's New
PHP on Windows - What's NewZendCon
 
Oracle cloud 使用云市场快速搭建小型电商网站
Oracle cloud 使用云市场快速搭建小型电商网站Oracle cloud 使用云市场快速搭建小型电商网站
Oracle cloud 使用云市场快速搭建小型电商网站Zhaoyang Wang
 
Zend_Tool: Practical use and Extending
Zend_Tool: Practical use and ExtendingZend_Tool: Practical use and Extending
Zend_Tool: Practical use and ExtendingZendCon
 
Oracle Compute Cloud Service介绍
Oracle Compute Cloud Service介绍Oracle Compute Cloud Service介绍
Oracle Compute Cloud Service介绍Zhaoyang Wang
 
MySQL Optimizer Overview

  • 1. MySQL Optimizer Overview. Olav Sandstå, Senior Principal Engineer, MySQL Optimizer Team, Oracle. April 19, 2016. Copyright © 2016, Oracle and/or its affiliates. All rights reserved.
  • 2. Program Agenda: 1. Logical transformations; 2. Cost-based optimizations; 3. Analyzing access methods; 4. Join optimizer; 5. Plan refinements; 6. Subquery optimizations; 7. Query execution plan.
  • 3. MySQL Optimizer
      SELECT a, b
      FROM t1
      JOIN t2 ON t1.a = t2.b
      JOIN t3 ON t2.b = t3.c
      WHERE t2.d > 20 AND t2.d < 30;
      [Diagram: the query optimizer takes the query, statistics (from the storage engines) and table/index info (from the data dictionary) as input, and produces a plan that joins t1, t2 and t3 using a table scan, a range scan and ref access.]
  • 4. MySQL Architecture
      [Diagram: SQL query → Parser → Resolver (semantic check, name resolution) → Optimizer (logical transformations; cost-based optimizer: join order and access methods; plan refinement) → Query execution plan → Query execution → Query result. Query execution reads from the storage engine (InnoDB, MyISAM).]
  • 5. MySQL Optimizer Characteristics
      • Produces the query plan that uses the least resources
        – IO and CPU
      • Optimizes a single query
        – No inter-query optimizations
      • Produces a left-deep linear query execution plan
      [Diagram: left-deep join tree over t1, t2, t3 and t4, with access methods table scan, table scan, range scan and ref access.]
  • 6. Optimizer Overview: Main phases
      [Diagram: the MySQL architecture from slide 4, with the steps of each optimizer phase listed alongside.]
      • Logical transformations
        – Negation elimination
        – Equality and constant propagation
        – Evaluation of constant expressions
        – Conversions of outer to inner join
        – Subquery transformation
      • Prepare for cost-based optimization
        – Ref access analysis
        – Range access analysis
        – Estimation of condition fan out
        – Constant table detection
      • Cost-based optimizer: join order and access methods
        – Access method selection
        – Join order
      • Plan refinement
        – Table condition pushdown
        – Access method adjustments
        – Sort avoidance
        – Index condition pushdown
  • 7. Program Agenda: 1. Logical transformations; 2. Cost-based optimizations; 3. Analyzing access methods; 4. Join optimizer; 5. Plan refinements; 6. Subquery optimizations; 7. Query execution plan.
  • 8. Logical Transformations
      • Logical transformations of query conditions:
        – Negation elimination
        – Equality propagations
        – Evaluate constant expressions
        – Remove trivial conditions
      • Conversion of outer to inner join
      • Merging of views and derived tables
      • Subquery transformations
      Result: a simpler query to optimize and execute, prepared for later optimizations.
  • 9. Example: Logical Transformations
      Original query:
        SELECT * FROM t1 JOIN t2
        WHERE t1.a = t2.a AND t2.a = 9
          AND (NOT (t1.a > 10 OR t2.b > 3) OR (t1.b = t2.b + 7 AND t2.b = 5));
      After negation elimination:
        t1.a = t2.a AND t2.a = 9
          AND (t1.a <= 10 AND t2.b <= 3 OR (t1.b = t2.b + 7 AND t2.b = 5));
      After equality/const propagation:
        t1.a = 9 AND t2.a = 9
          AND (9 <= 10 AND t2.b <= 3 OR (t1.b = 5 + 7 AND t2.b = 5));
      After evaluating const expressions (9 <= 10 is TRUE, 5 + 7 is 12):
        t1.a = 9 AND t2.a = 9
          AND (9 <= 10 AND t2.b <= 3 OR (t1.b = 12 AND t2.b = 5));
      After trivial condition removal:
        t1.a = 9 AND t2.a = 9
          AND (t2.b <= 3 OR (t1.b = 12 AND t2.b = 5));
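One way to convince yourself that the rewrite chain on slide 9 preserves the meaning of the WHERE clause is to compare the first and last forms by brute force. The following is a minimal sketch, not anything taken from the optimizer itself: the function names are made up for illustration, the columns are assumed to hold small non-NULL integers, and SQL's three-valued NULL logic is deliberately ignored.

```python
from itertools import product


def original(t1a, t1b, t2a, t2b):
    # WHERE clause exactly as written in the original query on slide 9.
    return (t1a == t2a and t2a == 9
            and (not (t1a > 10 or t2b > 3)
                 or (t1b == t2b + 7 and t2b == 5)))


def simplified(t1a, t1b, t2a, t2b):
    # WHERE clause after trivial condition removal (last step on slide 9).
    return (t1a == 9 and t2a == 9
            and (t2b <= 3 or (t1b == 12 and t2b == 5)))


def conditions_agree(values=range(0, 15)):
    # Try every combination of the four column values in the given domain.
    return all(original(*row) == simplified(*row)
               for row in product(values, repeat=4))


if __name__ == "__main__":
    print("equivalent on test domain:", conditions_agree())
```

The check only covers the chosen integer domain, so it is evidence rather than proof, but it is a quick way to catch a mistake when working through such a rewrite by hand.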
  • 10. Program Agenda: 1. Logical transformations; 2. Cost-based optimizations; 3. Analyzing access methods; 4. Join optimizer; 5. Plan refinements; 6. Subquery optimizations; 7. Query execution plan.
  • 11. Cost-based Query Optimization
      General idea:
      • Assign cost to operations
      • Assign cost to partial or alternative plans
      • Search for plan with lowest cost
      [Diagram: join plan over t1, t2 and t3 using a table scan, a range scan and ref access.]
  • 12. Cost-based Query Optimizations
      The main cost-based optimizations:
      • Index and access method
        – Table scan
        – Index scan
        – Range scan
        – Index lookup (ref access)
      • Join order
      • Join buffering strategy
      • Subquery strategy
      [Diagram: join plan over t1, t2 and t3 using a table scan, a range scan and ref access.]
  • 13. Optimizer Cost Model
      [Diagram: an operation (for example a range scan on t1, or a JOIN) goes into the cost model, which returns a cost estimate and a row estimate.]
      • Cost formulas: access methods, join, subquery
      • Cost constants: CPU, IO
      • Metadata: record and index size, index information, uniqueness
      • Statistics: table size, cardinality, range estimates
      • Cost model configuration
      [Callout: New in MySQL 5.7]
  • 14. Cost Estimates
      • The cost for executing a query
      • Cost unit:
        – "read a random data page from disk"
      • Main cost factors:
        – IO cost:
          • #pages read from table
          • #pages read from index
        – CPU cost:
          • Evaluating query conditions
          • Comparing keys/records
          • Sorting keys
      • Main cost constants:
          Cost                                      Default value
          Reading a random disk page                1.0
          Reading a data page from memory buffer    1.0
          Evaluating query condition                0.2
          Comparing key/record                      0.1
      [Callout: New in MySQL 5.7: Configurable]
  • 15. Cost Model Examples
      SELECT * FROM t1 WHERE a BETWEEN 20 AND 23
      Table scan:
      • IO-cost: #pages in table * IO_BLOCK_READ_COST
      • CPU-cost: #records * ROW_EVALUATE_COST
      Range scan (on secondary index):
      • IO-cost: #records_in_range * IO_BLOCK_READ_COST
      • CPU-cost: #records_in_range * ROW_EVALUATE_COST (evaluate range condition)
                  + #records_in_range * ROW_EVALUATE_COST (evaluate WHERE condition)
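The two formulas on slide 15 can be turned into a few lines of arithmetic. This is a sketch under stated assumptions, not the server's cost code: the constant names come from the slide, their values are the defaults listed on slide 14 (1.0 for reading a page, 0.2 for evaluating a condition), and adding the IO and CPU parts into one total is an assumption made here for illustration. The page and record counts in the usage lines are made-up inputs.

```python
IO_BLOCK_READ_COST = 1.0   # default from slide 14: reading a random disk page
ROW_EVALUATE_COST = 0.2    # default from slide 14: evaluating a query condition


def table_scan_cost(pages_in_table, records):
    io_cost = pages_in_table * IO_BLOCK_READ_COST
    cpu_cost = records * ROW_EVALUATE_COST
    return io_cost + cpu_cost  # assumption: total cost is IO plus CPU


def range_scan_cost(records_in_range):
    # Secondary index range scan: one block read per record in the range.
    io_cost = records_in_range * IO_BLOCK_READ_COST
    cpu_cost = (records_in_range * ROW_EVALUATE_COST      # evaluate range condition
                + records_in_range * ROW_EVALUATE_COST)   # evaluate WHERE condition
    return io_cost + cpu_cost


if __name__ == "__main__":
    # Hypothetical table: 100 pages, 1000 records, 12 of them in the range.
    print("table scan:", table_scan_cost(100, 1000))
    print("range scan:", range_scan_cost(12))
```

With these made-up numbers the range scan is far cheaper, but the formulas also show why a range that covers a large share of the table can lose to a table scan: the range scan pays one block read per record rather than per page.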
  • 16. Program Agenda: 1. Logical transformations; 2. Cost-based optimizations; 3. Analyzing access methods; 4. Join optimizer; 5. Plan refinements; 6. Subquery optimizations; 7. Query execution plan.
  • 17. Selecting Access Method: Finding the optimal method to read data from storage engine
      • For each table, find the best access method:
        1. Check if the access method is useful
        2. Estimate cost of using access method
        3. Select the cheapest to be used
      • Choice of access method is cost based
      Main access methods:
      • Table scan
      • Index scan
      • Index lookup (ref access)
      • Range scan
      • Index merge
      • Loose index scan
  • 18. Index Lookup (Ref Access)
      • Read all records with a given key value using an index
      • Examples:
          SELECT * FROM t1 WHERE t1.key = 7;
          SELECT * FROM t1, t2 WHERE t1.key = t2.key;
      • "eq_ref":
        – Reading from a unique index, max one record returned
      • "ref":
        – Reading from a non-unique index or a prefix of an index, possibly multiple records returned
        – The record estimate is based on cardinality number from index statistics
  • 19. Ref Access Analysis
      • Determine which indexes can be used for index lookup in a join
          SELECT city.name AS capital, language.name
          FROM city
          JOIN country ON city.country_id = country.country_id
          JOIN language ON country.country_id = language.country_id
          WHERE city.city_id = country.capital;
      [Diagram: tables and the columns involved in the join conditions: country (country_id, capital), city (country_id, city_id, name), language (country_id, name).]
  • 20. Range Optimizer
      • Goal: find the "minimal" ranges to read for each index
      • Example:
          SELECT * FROM t1 WHERE (key1 > 10 AND key1 < 20) AND key2 > 30
      • Range scan using INDEX(key1): the interval between 10 and 20
      • Range scan using INDEX(key2): the interval above 30
  • 21. Range Optimizer, cont.
      • Range optimizer selects the "useful" parts of the WHERE condition:
        – Conditions comparing a column value with a constant:
            key > 3
            key = 4
            key IS NULL
            key BETWEEN 4 AND 6
            key LIKE "abc%"
            key IN (10,12,..)
        – Nested AND/OR conditions are supported
      • Result: list of disjoint ranges that need to be read from index
      • Cost estimate based on number of records in each range:
        – Record estimate is found by asking the storage engine ("index dives")
  • 22. Range Access for Multi-part Index: Example table with multi-part index
      • Table: columns pk, a, b, c, d
      • INDEX idx (a, b, c);
      • Logical storage layout of index:
        [Diagram: index entries ordered by a (10, 11, 12, 13), then by b (1 to 5 within each value of a), then by c.]
  • 23. Range Access for Multi-part Index, cont.
      • Equality on 1st index part?
        – Can add condition on 2nd index part to range condition
      • Example:
          SELECT * FROM t1 WHERE a IN (10,11,13) AND (b=2 OR b=4)
      • Resulting range scan:
        [Diagram: in the index layout from slide 22, only the entries with b = 2 and b = 4 are read, under each of a = 10, a = 11 and a = 13.]
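The range construction on slide 23 can be mimicked on a toy index. The sketch below is only an illustration: the index is simplified to the two key parts (a, b) and modelled as a sorted Python list, and the function names are made up. It shows how equality conditions on both key parts combine into six single-point ranges that can each be located by a binary search.

```python
from bisect import bisect_left, bisect_right
from itertools import product

# Toy index on (a, b) with the key values from slide 22, kept in sorted order.
INDEX = sorted(product([10, 11, 12, 13], [1, 2, 3, 4, 5]))


def point_ranges(a_values, b_values):
    # Equality on the first key part lets the condition on the second key
    # part become part of the range: one point range per (a, b) pair.
    return [(a, b) for a in sorted(a_values) for b in sorted(b_values)]


def scan(index, ranges):
    # Read only the index entries that fall inside each point range.
    rows = []
    for key in ranges:
        lo = bisect_left(index, key)
        hi = bisect_right(index, key)
        rows.extend(index[lo:hi])
    return rows


if __name__ == "__main__":
    ranges = point_ranges([10, 11, 13], [2, 4])
    print(ranges)
    print(scan(INDEX, ranges))
```

With a non-equality condition on `a`, as on the following slide, no such list of points exists, so the scan has to read every entry between the bounds on `a` and check `b` afterwards.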
  • 24. Range Access for Multi-part Index, cont.
      • Non-equality on 1st index part:
        – Can NOT add condition on 2nd index part in range condition
      • Example:
          SELECT * FROM t1 WHERE a > 10 AND a < 13 AND (b=2 OR b=4)
      • Resulting range scan: a > 10 AND a < 13
        [Diagram: in the index layout from slide 22, all entries between a = 10 and a = 13 are read.]
  • 25. Index Merge
      • Use multiple indexes on the same table
      • Implemented index merge strategies:
        – Index Merge Union
          • OR conditions between different indexes
        – Index Merge Intersect
          • AND conditions between different indexes
        – Index Merge Sort-Union
          • OR conditions where condition is a range
      • Example:
          SELECT * FROM t1 WHERE a=10 OR b=10
        [Diagram: a lookup for 10 in INDEX(a) and a lookup for 10 in INDEX(b); the result for a=10 OR b=10 is the union of the two.]
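The union strategy from slide 25 amounts to combining the row ids returned by two separate index lookups. Here is a small sketch of that idea, assuming a toy table held in a dictionary keyed by row id; the data and the helper names are invented for illustration, and a real index lookup would not scan every row the way this stand-in does.

```python
# Hypothetical table t1: row id -> column values.
ROWS = {
    1: {"a": 10, "b": 3},
    2: {"a": 7, "b": 10},
    3: {"a": 10, "b": 10},
    4: {"a": 1, "b": 2},
}


def index_lookup(rows, column, value):
    # Stand-in for a lookup in INDEX(column): the set of matching row ids.
    return {row_id for row_id, row in rows.items() if row[column] == value}


def index_merge_union(rows, conditions):
    # OR between conditions on different indexes: union of the row id sets,
    # so a row matching both conditions is returned only once.
    result = set()
    for column, value in conditions:
        result |= index_lookup(rows, column, value)
    return sorted(result)


if __name__ == "__main__":
    print(index_merge_union(ROWS, [("a", 10), ("b", 10)]))
```

Replacing the union with a set intersection gives the corresponding sketch for Index Merge Intersect, where the conditions are combined with AND.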
  • 26. Loose Index Scan
      • Optimization for GROUP BY and DISTINCT:
          SELECT a, b FROM t1 GROUP BY a, b;
          SELECT DISTINCT a, b FROM t1;
          SELECT a, MIN(b) FROM t1 GROUP BY a;
      • GROUP BY/DISTINCT must be on the prefix of a multipart index
      [Diagram: the index layout from slide 22 (a = 10 to 13, b = 1 to 5 within each a, then c).]
  • 27. Program Agenda: 1. Logical transformations; 2. Cost-based optimizations; 3. Analyzing access methods; 4. Join optimizer; 5. Plan refinements; 6. Subquery optimizations; 7. Query execution plan.
  • 28. Join Optimizer
      • Goal:
        – Given a JOIN of N tables, find the best JOIN ordering
      • "Greedy search strategy":
        – Start with all 1-table plans
        – Expand each plan with remaining tables
          • Depth-first
        – If "cost of partial plan" > "cost of best plan":
          • "prune" plan
        – Heuristic pruning:
          • Prune less promising partial plans
      [Diagram: search tree of join orders over t1, t2, t3 and t4; N! possible plans.]
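The depth-first expansion with cost-based pruning described on slide 28 can be sketched as follows. Everything about the cost function is an assumption made for illustration: a plan's cost is taken to be the sum of the estimated row counts after each join step, and the table sizes and join selectivities are invented. The sketch implements only the "partial plan already costs more than the best complete plan" rule and leaves out the heuristic pruning the slide also mentions.

```python
def extend(prefix, fanout, table, rows, selectivity):
    # Estimated rows after joining `table` onto the partial plan `prefix`.
    out = fanout * rows[table]
    for other in prefix:
        out *= selectivity.get(frozenset((table, other)), 1.0)
    return out


def best_join_order(rows, selectivity):
    best = {"cost": float("inf"), "order": None}

    def search(prefix, fanout, cost):
        if cost >= best["cost"]:
            return  # prune: this partial plan cannot beat the best plan found
        if len(prefix) == len(rows):
            best["cost"], best["order"] = cost, tuple(prefix)
            return
        for table in rows:  # expand depth-first with each remaining table
            if table not in prefix:
                out = extend(prefix, fanout, table, rows, selectivity)
                search(prefix + [table], out, cost + out)

    search([], 1.0, 0.0)
    return best["order"], best["cost"]


if __name__ == "__main__":
    # Hypothetical statistics: table sizes and join condition selectivities.
    rows = {"t1": 10, "t2": 100, "t3": 1000}
    selectivity = {frozenset(("t1", "t2")): 0.01,
                   frozenset(("t2", "t3")): 0.001}
    print(best_join_order(rows, selectivity))
```

Because step costs are never negative, a partial plan that already costs as much as the best complete plan can be dropped without losing the optimum. In this example every order starting with t3 is cut off after its first table.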
  • 29. Join Optimizer Illustrated
      SELECT city.name AS capital, language.name
      FROM city
      JOIN country ON city.country_id = country.country_id
      JOIN language ON country.country_id = language.country_id
      WHERE city.city_id = country.capital;
      [Diagram: search tree from "start" over the join orders of language, country and city; the plans shown have cost=26568, cost=32568, cost=627, cost=1245 and cost=862.]
  • 30. Record and Cost Estimates for JOIN: Condition filter effect (New in MySQL 5.7)
      • t_x JOIN t_x+1 (ref access on t_x+1)
      • records(t_x+1) = records(t_x) * condition_filter_effect * records_per_key
        – records(t_x): number of records read from t_x
        – condition_filter_effect: records passing the table conditions on t_x
        – records_per_key: cardinality statistics for index
  • 31. How to Calculate Condition Filter Effect, step 1 (New in MySQL 5.7)
      SELECT office_name
      FROM office JOIN employee
      WHERE office.id = employee.office_id
        AND employee.name = "John"
        AND employee.first_office_id <> office.id;
      A condition contributes to the condition filter effect for a table only if:
        – It references a field in the table
        – It is not used by the access method
        – It depends on an available value:
          • employee.name = "John" will always contribute to filter on employee
          • employee.first_office_id <> office.id depends on JOIN order
  • 32. How to Calculate Condition Filter Effect, step 2
      SELECT *
      FROM office JOIN employee ON office.id = employee.office_id
      WHERE office_name = "San Francisco"
        AND employee.name = "John"
        AND age > 21
        AND hire_date BETWEEN "2014-01-01" AND "2014-06-01";
      Filter estimate based on what is available:
        1. Range estimate
        2. Index statistics
        3. Guesstimate:
             Operator        Filter estimate
             =               0.1
             <=, <, >, >=    1/3
             BETWEEN         1/9
             NOT <op>        1 – SEL(<op>)
             AND             P(A and B) = P(A) * P(B)
             OR              P(A or B) = P(A) + P(B) – P(A and B)
             …               …
  • 33. Calculating Condition Filter Effect for Tables: Example
      SELECT *
      FROM office JOIN employee ON office.id = employee.office_id
      WHERE office_name = "San Francisco"                           0.03 (index)
        AND employee.name = "John"                                  0.1 (guesstimate)
        AND age > 21                                                0.89 (range)
        AND hire_date BETWEEN "2014-01-01" AND "2014-06-01";        0.11 (guesstimate)
      Condition filter effect for tables:
        – office: 0.03
        – employee: 0.1 * 0.11 * 0.89
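The arithmetic on slides 30 to 33 is easy to reproduce. The sketch below uses the combination rules from slide 32 (independence for AND, inclusion-exclusion for OR) and the per-condition estimates from slide 33; the helper names are made up, and the inputs to `estimated_join_rows` in the usage lines are invented numbers, not figures from the talk.

```python
def sel_and(*selectivities):
    # P(A and B) = P(A) * P(B), assuming the conditions are independent.
    result = 1.0
    for s in selectivities:
        result *= s
    return result


def sel_or(a, b):
    # P(A or B) = P(A) + P(B) - P(A and B)
    return a + b - sel_and(a, b)


def sel_not(a):
    # NOT <op>: 1 - SEL(<op>)
    return 1.0 - a


def estimated_join_rows(records_prev, condition_filter_effect, records_per_key):
    # records(t_x+1) = records(t_x) * condition_filter_effect * records_per_key
    return records_prev * condition_filter_effect * records_per_key


# Slide 33: filter effect for employee and office.
EMPLOYEE_FILTER = sel_and(0.1, 0.11, 0.89)
OFFICE_FILTER = 0.03

if __name__ == "__main__":
    print("employee filter:", round(EMPLOYEE_FILTER, 5))
    # Hypothetical: 100 office rows read, 5 employees per office_id key value.
    print("rows after join:", estimated_join_rows(100, OFFICE_FILTER, 5))
```

The employee filter comes out just under 1%, which shows how quickly several independent guesstimates multiply into a very small fraction.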
  • 34. Program Agenda: 1. Logical transformations; 2. Cost-based optimizations; 3. Analyzing access methods; 4. Join optimizer; 5. Plan refinements; 6. Subquery optimizations; 7. Query execution plan.
  • 35. Finalizing Query Plan
      Main optimizations:
      • Assigning query conditions to tables
        – Evaluate conditions as early as possible in join order
      • ORDER BY optimization: avoid sorting
        – Change to different index
        – Read in descending order
      • Change to a cheaper access method
        – Example: Use range scan instead of table scan or ref access
      • Index Condition Pushdown
  • 36. ORDER BY Optimizations
      • General solution: "File sort"
        – Store query result in temporary table before sorting
        – If data volume is large, may need to sort in several passes with intermediate storage on disk
      • Optimizations:
        – Switch to use index that provides result in sorted order
        – For "LIMIT n" queries, maintain priority queue on n top items in memory instead of file sort
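The "LIMIT n" optimization on slide 36 relies on a bounded priority queue: only the current best n rows are kept in memory while the input is scanned once. As a rough illustration of the idea (not of the server's implementation), Python's standard `heapq.nsmallest` behaves this way; the row data and the `top_n` wrapper below are made up.

```python
import heapq


def top_n(rows, n, key):
    # Equivalent in result to sorted(rows, key=key)[:n], but nsmallest keeps
    # only n candidates at a time while it consumes the input once.
    return heapq.nsmallest(n, rows, key=key)


if __name__ == "__main__":
    # Hypothetical rows (id, price), as for: ... ORDER BY price LIMIT 2
    rows = [(1, 50), (2, 10), (3, 40), (4, 20), (5, 30)]
    print(top_n(rows, 2, key=lambda row: row[1]))
```

For small n the memory used is independent of the size of the result being ordered, which is the property that lets the full sort and its temporary storage be skipped.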
  • 37. Index Condition Pushdown
      • Pushes conditions that can be evaluated on the index down to storage engine
        – Works only on indexed columns
      • Goal: evaluate conditions without having to access the actual record
        – Reduces number of disk/block accesses
        – Reduces CPU usage
      [Diagram: query conditions move from the MySQL server down to the storage engine, where they are evaluated on the index before the table data is read.]
  • 38. Program Agenda: 1. Logical transformations; 2. Cost-based optimizations; 3. Analyzing access methods; 4. Join optimizer; 5. Plan refinements; 6. Subquery optimizations; 7. Query execution plan.
  • 39. Overview of Subquery Optimizations
      Subquery category:
      • IN (SELECT …)
      • NOT IN (SELECT …)
      • FROM (SELECT …)
      • <CompOp> ALL/ANY (SELECT ..)
      • EXISTS/other
      Strategy:
      • Semi-join
      • Materialization
      • IN ➜ EXISTS
      • Merged
      • Materialized
      • MAX/MIN re-write
      • Execute subquery
      [Callout: New in MySQL 5.7]
  • 40. Optimization of IN subqueries: A. Semi-join Transformation
      1. Transform IN (and =ANY) subquery to semi-join:
           SELECT * FROM t1
           WHERE query_where AND outer_expr IN (SELECT inner_expr FROM t2 WHERE cond2)
         becomes
           SELECT * FROM t1 SEMIJOIN t2 ON outer_expr = inner_expr
           WHERE query_where AND cond2
      2. Apply transformations/strategies for avoiding/removing duplicates:
           Table pullout, Duplicate Weedout, First Match, LooseScan, Semi-join materialization
      3. Optimize using cost-based JOIN optimizer
  • 41. Optimization of IN Subqueries, cont.: B. Subquery Materialization
      • Only for non-correlated subqueries
      • Execute subquery once
        – store result in temporary table with unique index (removes duplicates)
      • Outer query does lookup in temporary table
          SELECT title FROM film
          WHERE film_id IN (SELECT film_id FROM actor WHERE name="Bullock")
      [Diagram: the subquery result is materialized into a temporary table with an index; the outer query does lookups in it.]
  • 42. Optimization of IN Subqueries, cont.: C. IN ➜ EXISTS transformation
      • Convert IN subquery to EXISTS subquery by "push-down" IN-equality to subquery:
          SELECT title FROM film
          WHERE film_id IN (SELECT film_id FROM actor WHERE name="Bullock")
        becomes
          SELECT title FROM film
          WHERE EXISTS (SELECT 1 FROM actor
                        WHERE name="Bullock" AND film.film_id = actor.film_id)
      • Benefit: subquery will evaluate fewer records
      • Note: special handling if pushed down expressions can be NULL
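Strategies B and C from slides 41 and 42 produce the same result in two different ways, which can be shown on a toy data set. The sketch below is an illustration only: the tables are small Python lists with invented contents, the function names are hypothetical, and NULL handling (the special case the slide warns about) is left out.

```python
# Hypothetical data: film(film_id, title) and actor(film_id, name).
FILM = [(1, "Film A"), (2, "Film B"), (3, "Film C")]
ACTOR = [(1, "Bullock"), (2, "Bullock"), (2, "Reeves"), (3, "Weaver")]


def in_by_materialization(film, actor, name):
    # B: run the non-correlated subquery once and keep the distinct film ids
    # (the set plays the role of the temporary table with a unique index).
    materialized = {film_id for film_id, actor_name in actor
                    if actor_name == name}
    # The outer query then does one lookup per film row.
    return [title for film_id, title in film if film_id in materialized]


def in_by_exists(film, actor, name):
    # C: push the IN-equality into the subquery and evaluate it per outer
    # row, stopping at the first match as EXISTS does.
    return [title for film_id, title in film
            if any(actor_name == name and actor_film_id == film_id
                   for actor_film_id, actor_name in actor)]


if __name__ == "__main__":
    print(in_by_materialization(FILM, ACTOR, "Bullock"))
    print(in_by_exists(FILM, ACTOR, "Bullock"))
```

Materialization pays the cost of the subquery once up front, while the EXISTS form re-evaluates a narrower subquery for each outer row; which one is cheaper depends on the sizes involved, which is why the choice is left to the cost model.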
  • 43. Program Agenda: 1. Logical transformations; 2. Cost-based optimizations; 3. Analyzing access methods; 4. Join optimizer; 5. Plan refinements; 6. Subquery optimizations; 7. Query execution plan.
  • 44. Understanding the Query Plan: EXPLAIN
      • Use EXPLAIN to print the final query plan:
          EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.a = t2.a WHERE b > 10 AND c > 10;
          +----+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+
          | id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                 |
          +----+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+
          | 1  | SIMPLE      | t1    | range | PRIMARY,idx1  | idx1 | 4       | NULL | 12   | 33.33    | Using index condition |
          | 2  | SIMPLE      | t2    | ref   | idx2          | idx2 | 4       | t1.a | 1    | 100.00   | NULL                  |
          +----+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+
      • Explain for a running query (New in MySQL 5.7):
          EXPLAIN FOR CONNECTION connection_id;
  • 45. Understanding the Query Plan: Structured EXPLAIN
      • JSON format:
          EXPLAIN FORMAT=JSON SELECT …
      • Contains more information:
        – Used index parts
        – Pushed index conditions
        – Cost estimates
        – Data estimates
      [Callout: Added in MySQL 5.7]
      EXPLAIN FORMAT=JSON SELECT * FROM t1 WHERE b > 10 AND c > 10;
      EXPLAIN
      {
        "query_block": {
          "select_id": 1,
          "cost_info": { "query_cost": "17.81" },
          "table": {
            "table_name": "t1",
            "access_type": "range",
            "possible_keys": [ "idx1" ],
            "key": "idx1",
            "used_key_parts": [ "b" ],
            "key_length": "4",
            "rows_examined_per_scan": 12,
            "rows_produced_per_join": 3,
            "filtered": "33.33",
            "index_condition": "(`test`.`t1`.`b` > 10)",
            "cost_info": {
              "read_cost": "17.01",
              "eval_cost": "0.80",
              "prefix_cost": "17.81",
              "data_read_per_join": "63"
            },
            ………
            "attached_condition": "(`test`.`t1`.`c` > 10)"
          }
        }
      }
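Because the structured EXPLAIN output is plain JSON, it can be inspected with any JSON library. The sketch below parses a trimmed copy of the output shown on slide 45 (the elided part and a few fields are left out) and pulls out the access method and cost figures. The `summarize` helper is made up for illustration, and the observation that read_cost plus eval_cost equals prefix_cost is simply what the slide's numbers show for this single-table query.

```python
import json

# Trimmed copy of the EXPLAIN FORMAT=JSON output from slide 45.
EXPLAIN_JSON = """
{ "query_block": {
    "select_id": 1,
    "cost_info": { "query_cost": "17.81" },
    "table": {
      "table_name": "t1",
      "access_type": "range",
      "key": "idx1",
      "rows_examined_per_scan": 12,
      "rows_produced_per_join": 3,
      "filtered": "33.33",
      "cost_info": { "read_cost": "17.01", "eval_cost": "0.80",
                     "prefix_cost": "17.81" }
    }
} }
"""


def summarize(explain_text):
    table = json.loads(explain_text)["query_block"]["table"]
    cost = table["cost_info"]
    # Costs are reported as strings, so convert before doing arithmetic.
    return {
        "table": table["table_name"],
        "access_type": table["access_type"],
        "key": table["key"],
        "read_plus_eval": round(float(cost["read_cost"])
                                + float(cost["eval_cost"]), 2),
        "prefix_cost": float(cost["prefix_cost"]),
    }


if __name__ == "__main__":
    print(summarize(EXPLAIN_JSON))
```

A script along these lines can be handy for comparing the estimated cost of a query before and after adding an index, without reading the full JSON by eye.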
  • 46. Understanding the Query Plan: Visual Explain in MySQL Workbench
  • 47. Optimizer Trace: Understand HOW a query is optimized
      • Trace of the main steps and decisions done by the optimizer
          SET optimizer_trace="enabled=on";
          SELECT * FROM t1 WHERE a > 10;
          SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;
      Trace excerpt:
          "table": "`t1`",
          "range_analysis": {
            "table_scan": {
              "rows": 54,
              "cost": 13.9
            },
            "best_covering_index_scan": {
              "index": "idx",
              "cost": 11.903,
              "chosen": true
            },
            "analyzing_range_alternatives": {
              "range_scan_alternatives": [
                {
                  "index": "idx",
                  "ranges": [ "10 < a" ],
                  "rowid_ordered": false,
                  "using_mrr": false,
                  "index_only": true,
                  "rows": 12,
                  "cost": 3.4314,
                  "chosen": true
                }
  • 48. Influencing the Optimizer
      When the optimizer does not do what you want:
      • Add indexes
      • Use hints (new hint syntax and new hints in MySQL 5.7):
        – Index hints: USE INDEX, FORCE INDEX, IGNORE INDEX
        – Join order: STRAIGHT_JOIN
        – Subquery strategy: /*+ SEMIJOIN(FirstMatch) */
        – Join buffer strategy: /*+ BKA(table1) */
      • Adjust optimizer_switch flags:
        – set optimizer_switch="condition_fanout_filter=OFF"
      • Ask question in the MySQL optimizer forum
  • 49. Summary
      • Query transformations
      • Selecting data access method
      • Join optimizer
      • Subquery optimizations
      • Plan refinements
      Questions?
      Next session: Optimizer: What's New in 5.7 and Sneak Peek at 5.8, Thursday at 11:00, Ballroom C

Editor's Notes

  1. I think it is time to start. I would like to welcome you to this presentation on the MySQL optimizer. My name is Olav Sandstå and I work in the MySQL optimizer team.
  2. The goal for this session is to give an overview of how the MySQL optimizer works. I will start with a short overview of the optimizer and then present each of its main optimizations in more detail. I will try to leave some time for questions at the end of the presentation, but if you have short and easy questions, feel free to ask during the presentation.
  3. The query optimizer takes as input a SQL query and produces a plan for how this query should be executed. For complex queries there are many possible query plans. The goal is that the optimizer should be able to find the best query plan. In order to optimize the query, the optimizer uses information from the data dictionary and statistics from the storage engine about the data in the tables. At the end of this session you all should know a bit more about what happens inside the “optimizer cloud” and how it works.
  4. Let us first start with an overview of the architecture of MySQL to show where the query optimizer fits in. When a query arrives, it first goes through the parser. The second step is the "resolver", which does name resolution and semantic checks of the tables and columns in the query. These two steps produce a query tree representation of the SQL query that goes into the optimizer. The optimizer will optimize the query and produce a query execution plan. This query execution plan will then be executed: data will be read from the storage engine and the result will be returned to the client. The rest of this presentation will go into more detail about what happens inside the optimizer module.
  5. Here are some initial characteristics of the optimizer:
  6. Now we should be ready to look at what is inside the optimizer. Let us first go back to the MySQL architecture I presented a few slides earlier. Each query that is optimized goes through four main stages. I will show the stages here and briefly mention what they do. The first stage is the logical transformations. This stage is about simplifying the query and preparing it for later optimizations. You do not need to look at the details in the yellow boxes; I will cover them in detail later in the presentation. The second stage is to make preparations for the main optimizer. Here we mostly analyze alternative ways of reading data from tables. The third is the main join optimizer. And the final stage is to make some final adjustments and optimizations to the query plan.
  7. Let us start with the first stage of the optimizer which is the logical transformations the optimizer does to the query.
  8. The logical transformations the optimizer has are mostly rule based. The reason for doing logical transformations on the query is either to simplify the query or to rewrite it in order to prepare it for the later optimizations. Here is a list of the main transformations we do. To simplify the query conditions, we apply a set of rule-based transformations. I will show an example of these on the next slide. We convert outer joins to inner joins. We merge views and derived tables into the main query. And finally, we have a large set of transformations that apply to subqueries. I will get back to the subquery optimizations at the end of the presentation.
  9. This is a join of two tables with a fairly complex WHERE condition. The slide shows how we apply different rule-based transformations to the WHERE clause in order to simplify it. The final result is easier to optimize and costs less during execution.
  10. 8 minutes: Before going to the next stage of the optimizer, I will spend a few minutes presenting how the optimizer does cost-based optimizations.
  11. 8 minutes: The general way we do cost-based optimization of queries in MySQL is as follows. First, we calculate the cost of all alternative ways of reading data from tables. This includes looking at all useful ways of using different indexes and different access methods. Then we build alternative plans for how the query can be executed. This is mostly done by the join optimizer. For each alternative plan, we calculate the cost. Finally, we select the query plan with the lowest cost. This plan is then executed.
  12. In MySQL the main cost-based optimizations are: choosing which index and which access method to use for each table; the join order; which join buffer strategy to use; and, for subqueries, which subquery strategy to use.
  13. In order to do cost-based optimization, we need a model for what the cost of a query is. Here is a very simplified view of the cost model in MySQL. As input it takes a basic operation, like reading data from a table or joining two tables. As output it produces an estimate of the cost of doing this operation. In addition to the cost estimate, it will in most cases also produce an estimate of how many rows this operation will produce. The cost model consists of a lot of formulas for calculating cost and record estimates for different operations. In addition to the cost formulas, the cost model consists of a set of “cost constants”. These are the costs of the basic operations that the MySQL server does when executing a query. The cost model uses information from the data dictionary and statistics from the storage engines to do its calculations. The main statistics are the number of rows in a table, cardinality, and range estimates. All of these are produced by the storage engine. From the data dictionary we use information about records and indexes, like the length of records and keys, uniqueness, and whether they can be NULL. One thing that is new in 5.7 is that the cost model has been made configurable. The cost constants are now stored in database tables and can be changed. I will go into more detail on this on the next slide.
  14. In the MySQL cost model the basic cost unit is the cost of reading a random data page from disk. All other cost numbers are relative to this cost unit. The main cost factors we include when estimating the cost of a query are IO cost and CPU cost. For IO, we estimate the number of pages we need to read from tables and indexes. For CPU, the main contributions are the cost of evaluating query conditions and of comparing keys and records. The main cost constants we use in the cost calculations are: reading a random page from disk, which has a cost of 1.0; reading a page from the database buffer; evaluating the query condition on a record, which has a cost of 0.2; and comparing keys or records, which has a cost of 0.1.
  15. To hopefully make it a bit clearer how the cost model works, I will show an example. This query can be executed as a table scan, or as a range scan if we add a secondary index. For a table scan, the server must read the entire table and evaluate the query condition for all records. The IO cost estimate for this is based on the number of pages in the table, and the CPU cost is based on having to evaluate the query condition on each record. The second alternative is to run this as a range scan. The optimizer will ask the storage engine for an estimate of how many records are in the range between 20 and 23. The cost model will compute the cost estimate as follows. The IO cost will be dominated by having to look each record up in the base table. The cost of reading the secondary index will be small compared to all these lookups. So the IO cost will be calculated as the number of pages we will need to read from the base table, one per record. The formula for computing the CPU cost is as follows. We first multiply the number of records to read by the cost of evaluating a condition. This corresponds to having to evaluate the range condition on every record. Then we add the same cost a second time, this time to account for the CPU cost of evaluating the WHERE condition. The final choice of whether this query will be done as a table scan or a range scan depends on which of these has the lowest cost estimate.
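The comparison described in note 15 can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the server's actual code: the function names and the page and row counts are hypothetical, and only the cost constants 1.0 and 0.2 are taken from the notes.

```python
# Cost constants quoted in the notes, relative to one random page read from disk.
IO_BLOCK_READ_COST = 1.0  # reading a random page from disk
ROW_EVALUATE_COST = 0.2   # evaluating the query condition on one record


def table_scan_cost(pages_in_table, rows_in_table):
    """IO for every page in the table plus one condition evaluation per row."""
    io_cost = pages_in_table * IO_BLOCK_READ_COST
    cpu_cost = rows_in_table * ROW_EVALUATE_COST
    return io_cost + cpu_cost


def range_scan_cost(rows_in_range):
    """One base-table page read per record, plus two condition evaluations."""
    io_cost = rows_in_range * IO_BLOCK_READ_COST
    cpu_cost = rows_in_range * ROW_EVALUATE_COST   # range condition
    cpu_cost += rows_in_range * ROW_EVALUATE_COST  # WHERE condition
    return io_cost + cpu_cost


def choose_access_method(pages_in_table, rows_in_table, rows_in_range):
    """Return the name of the cheaper alternative and both cost estimates."""
    costs = {
        "table scan": table_scan_cost(pages_in_table, rows_in_table),
        "range scan": range_scan_cost(rows_in_range),
    }
    return min(costs, key=costs.get), costs


# Hypothetical table: 1,000 pages, 100,000 rows, 5,000 rows in the range.
method, costs = choose_access_method(1000, 100000, 5000)
print(method, costs)
```

With these made-up numbers the range scan wins. If the range covered half of the table, the per-record page lookups would make the table scan the cheaper choice.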
  16. 14 minutes: The second stage of the optimizer is to analyze possible access methods for reading the data needed by the query.
  17. 14 minutes: The goal is to find, for each table in the query, the best way to read the data from the table. For each table in the query we do the following: check if an access method is possible and useful; estimate the cost of using that access method; and select the access method with the lowest cost. The box on the right of the slide lists the main access methods that are used by MySQL. I will go into more detail about most of these on the following slides.
  18. Index lookup, or ref access, is an access method for reading all records with a given key value using an index. This is the main access method for most of the tables in a join. The first table can be read with different access methods, but if the following tables have useful indexes, then ref access will be used. In the first example we will use ref access to look up all records where the key value is seven. In the second example we will be able to use ref access on the second table when doing the join operation. In the optimizer and in the EXPLAIN output we distinguish between two different ref access methods, equality ref (eq_ref) and normal ref (ref). The first one, equality ref, is used when reading from a unique index. The second one is used when reading from a non-unique index or from a prefix of an index.
  19. One important thing the optimizer does when evaluating access methods is ref access analysis. This example shows a query that joins three tables. The query finds the name of the capital and the languages that are used in each country. By analyzing the query and checking which columns have indexes, the optimizer determines which indexes can be used for ref access for this query. The optimizer looks at the query conditions. If we start with the first one, we see that country_id can be used for joining the city and country tables by using index lookup. In the same way, by looking at the second query condition, we see that country and language can be joined using index lookup. Based on this, we can also use country_id for index lookup between the city and language tables. Finally, the last condition in the WHERE clause tells us that we can use index lookup on city_id in the city table from the country table. This graph is used later by the join optimizer to know which tables can be added using index lookup.
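The ref access analysis in note 19 can be sketched as follows. This is a minimal illustration, not the optimizer's implementation: the function name is hypothetical, the column `country.capital` and the set of indexed columns are assumptions made for the example, and only `country_id` and `city_id` are mentioned in the notes. Columns that are equal, directly or transitively, are grouped, and an edge is recorded whenever one table can drive an index lookup into an indexed column of another table.

```python
def ref_access_edges(equalities, indexed_columns):
    """Return (source_table, indexed_target_column) pairs for possible ref access.

    equalities: pairs of "table.column" strings that the WHERE clause equates.
    indexed_columns: set of "table.column" strings that have a usable index.
    """
    # Group columns that are equal to each other, directly or transitively.
    groups = []
    for left, right in equalities:
        merged = {left, right}
        untouched = []
        for group in groups:
            if group & merged:
                merged |= group
            else:
                untouched.append(group)
        groups = untouched + [merged]

    # Any column in a group can supply the key for an indexed column
    # of a different table in the same group.
    edges = set()
    for group in groups:
        for target in group:
            if target not in indexed_columns:
                continue
            target_table = target.split(".")[0]
            for source in group:
                source_table = source.split(".")[0]
                if source_table != target_table:
                    edges.add((source_table, target))
    return edges


# Hypothetical conditions and indexes for the city/country/language example.
equalities = [
    ("city.country_id", "country.country_id"),
    ("country.country_id", "language.country_id"),
    ("country.capital", "city.city_id"),
]
indexed = {"country.country_id", "language.country_id", "city.city_id"}
edges = ref_access_edges(equalities, indexed)
print(sorted(edges))
```

The pair `("city", "language.country_id")` appears in the result even though no condition mentions city and language together, which is the transitive step the note describes.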
  20. The next important access method is range access. For each index on the table, the range optimizer will try to find the minimal range that needs to be read using that index. Example: in this example the table has an index on key1 and an index on key2. The range optimizer will analyze the query condition and determine which part of each of these indexes it has to read.
  21. The range optimizer is able to use all parts of the query condition that compare an indexed column against a constant value. It supports nested AND and OR conditions. The result from the range optimizer is a list of ranges that need to be read from each index. The cost estimate is based on the number of records that need to be read from each range. The index that has the lowest cost estimate will be selected.
  22. For indexes that only contain a single column, range access is fairly easy to estimate. If the index contains multiple parts, it gets more complex. Here is an example of an index that covers three columns (a, b, c) in a table. The layout of the index is such that it is first sorted on values from the first column, a. Within each a value, we have the corresponding b values sorted. And similarly for the c values.
  23. The range optimizer is able to find which ranges to read on this multipart index, but there are some specific requirements that must be fulfilled. Conditions on the first column can always be used by the range optimizer. If the condition on the first index part is an equality condition, it can also use the conditions on the second index part. Here is an example where it can use conditions on the first two columns. The condition on a is an equality condition: a should be either 10, 11 or 13. The second index part can therefore be added. And since the condition on the second index part is also an equality condition (b should be 2 or 4), we could have added conditions on c if there were any. So after having run the range optimizer, the resulting range scan would read the following ranges from the index when executing this query.
  24. Let us look at another example. The query is almost the same, but this time the condition on the first index part is not an equality condition: a should be larger than 10 and less than 13. In this case we can use the condition on a as a range criterion, but we cannot use the condition on b. The resulting range scan produced by the range optimizer is shown below. We see that much more of the index has to be read.
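The rule from notes 23 and 24 for multipart indexes can be sketched as follows. This is a simplified illustration with a hypothetical function name: each condition is labeled either "eq" (an equality or a list of equalities, such as a being 10, 11 or 13) or "range", and the range optimizer's real analysis is considerably more detailed.

```python
def usable_key_parts(index_columns, conditions):
    """Return the leading index columns whose conditions can bound the range scan.

    index_columns: columns of the index in order, e.g. ["a", "b", "c"].
    conditions: maps a column name to "eq" or "range".
    """
    used = []
    for column in index_columns:
        kind = conditions.get(column)
        if kind is None:
            break  # no condition on this key part, so later parts cannot be used
        used.append(column)
        if kind != "eq":
            break  # a non-equality condition stops the use of later key parts
    return used


# Note 23: equality conditions on a and b, so both key parts are used.
print(usable_key_parts(["a", "b", "c"], {"a": "eq", "b": "eq"}))
# Note 24: a range on a, so the condition on b cannot narrow the scan.
print(usable_key_parts(["a", "b", "c"], {"a": "range", "b": "eq"}))
```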
  25. When using range access we are only reading from a single index. In some cases we can read from multiple indexes simultaneously and use this to reduce the number of records that need to be read from the table. This is called index merge. Three index merge strategies are implemented. Example: a single index cannot handle OR-ed conditions on different columns.
  26. The last access method is called loose index scan. This is an optimization for queries containing GROUP BY or DISTINCT where the GROUP BY/DISTINCT is on a prefix of a multipart index. If we look at the last of the three example queries, we are grouping on a and would like to have the lowest b value. By using loose index scan we can do this very efficiently by just reading the first index entry for each a value, and then “jumping” to the next a value without having to read the index entries in between.
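The “jump” in note 26 can be imitated on a sorted Python list that stands in for an index on (a, b). This is only a sketch of the idea under that assumption: the function name and the sample entries are hypothetical, and a binary search plays the role of the index seek to the next a value.

```python
from bisect import bisect_right


def loose_index_scan_min_b(index_entries):
    """Return [(a, lowest b)] per a value, as for SELECT a, MIN(b) ... GROUP BY a.

    index_entries: list of (a, b) tuples sorted the way an index on (a, b) is.
    """
    result = []
    position = 0
    while position < len(index_entries):
        a, b = index_entries[position]
        # The first entry for each a value holds the lowest b for that a.
        result.append((a, b))
        # Seek past every remaining entry with the same a value.
        position = bisect_right(index_entries, (a, float("inf")), lo=position)
    return result


# Hypothetical index contents.
entries = [(1, 3), (1, 5), (1, 9), (2, 2), (2, 7), (4, 1)]
print(loose_index_scan_min_b(entries))
```

Only one entry per group is read in full; the entries in between are skipped by the seek rather than scanned one by one.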
  27. 22 minutes: Then we should be ready for the join optimizer, which is the third stage of the optimizer. The join optimizer does the main job of deciding the final join order for the tables in the query.
  28. 22 minutes: The goal of the join optimizer is to find the best join order for the tables in the query. With N tables, there are N factorial (N!) possible plans. For instance, if you have ten tables, there are 3.6 million possible join orders to consider. In most cases we do not have to evaluate all possible join orders. This figure shows the alternative plans that we would evaluate for a 4-table join. Our join optimizer uses a “greedy search strategy” to explore the possible join orders. We start with all 1-table plans. For each of these we expand the plan by adding the other tables in depth-first order. When adding a new table to the plan, we estimate the cost of this plan. If the cost is larger than the cost of the currently best plan, we prune this plan. How this works is best illustrated with an example.
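The depth-first search with pruning from note 28 can be sketched as follows. This is a simplified illustration, not the join optimizer itself: the function names, the table sizes, and the toy cost function (which treats every join as a cross product) are assumptions chosen only to make the example self-contained.

```python
import math


def best_join_order(tables, cost_of_adding):
    """Depth-first search over join orders, pruning partial plans that
    already cost at least as much as the best complete plan found so far."""
    best = {"order": None, "cost": math.inf}

    def extend(prefix, remaining, cost_so_far):
        if not remaining:
            # Complete plan; pruning guarantees it beats the previous best.
            best["order"], best["cost"] = prefix, cost_so_far
            return
        for table in remaining:
            cost = cost_so_far + cost_of_adding(prefix, table)
            if cost >= best["cost"]:
                continue  # prune this partial plan
            extend(prefix + [table], [t for t in remaining if t != table], cost)

    extend([], list(tables), 0.0)
    return best["order"], best["cost"]


# Hypothetical row counts; the toy cost of adding a table is the number of
# row combinations produced when it is joined to the current prefix.
rows = {"t1": 1000, "t2": 10, "t3": 100}


def cross_product_cost(prefix, table):
    return math.prod(rows[t] for t in prefix) * rows[table]


order, cost = best_join_order(rows, cross_product_cost)
print(order, cost)
print(math.factorial(10))  # number of join orders for ten tables
```

With this toy cost function the search starts from the smallest table, and two of the six complete orders are never costed because their two-table prefixes are already too expensive once the third table is added.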
  29. When adding a new table to the join we: select the best access method (using the ref access graph we made earlier); estimate the number of rows; and calculate the cost of adding this table (both the cost of reading the table and the cost of the join). It is important to note that we select the best plan here based on a cost estimate. In order to get good cost estimates for each alternative plan, we need to be able to estimate the number of records that each step in the join will produce.
  30. In the previous example I showed how the cost estimates for different join orders were used for selecting the best one. This slide shows how we calculate the estimate of how many records we expect to read from a table in a join. In this slide we have read data from one table and are going to add the next table to the join. The number of records that we estimate will be read from the second table is computed like this. We start with the number of records read from the first table, then we calculate how many of these will be filtered away by the query conditions on this table, and finally we use the index statistics to get an estimate of how many records we will need to read from the second table. The inclusion of the condition filter effect is new in MySQL 5.7. This should make the estimate of how many records will be read from the next table in a join more accurate than earlier. In our testing we see that a lot of multi-table join queries get a better join order due to this.
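The estimate in note 30 amounts to a product of three factors, which can be written out as follows. This is a sketch with hypothetical names and numbers: `condition_filter` is the fraction of rows from the first table that survive its query conditions, and `rows_per_key` is the index-statistics estimate of matching rows in the next table per lookup.

```python
def rows_read_from_next_table(prefix_rows, condition_filter, rows_per_key):
    """Estimate how many rows will be read from the next table in the join."""
    surviving_rows = prefix_rows * condition_filter
    return surviving_rows * rows_per_key


# Hypothetical numbers: 100 rows read from the first table, a quarter of them
# pass the conditions on that table, and each lookup matches about 10 rows.
with_filter = rows_read_from_next_table(100, 0.25, 10)
# A filter of 1.0 corresponds to ignoring the condition filter effect.
without_filter = rows_read_from_next_table(100, 1.0, 10)
print(with_filter, without_filter)
```

The lower estimate makes plans that start with a heavily filtered table look cheaper, which is how the condition filter effect can change the chosen join order.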
  31. Let us look at how we calculate the “condition filter effect” for a table. This query has three conditions in the WHERE clause. For each table we find which conditions will be used for filtering away records. We do this by looking at each condition. In order to have any effect, the condition must: reference a field in the table; not be used by the access method (because then it is already taken into account when calculating the number of records that will be read); and be compared against an available value. For instance, employee.name = 'John' can always be evaluated when reading the employee table, while first_office_id <> id depends on the table order of the join.
  32. So when we have determined which conditions should be used for calculating the condition filter effect, we need to find out how many of the records they will filter out. We base this on the following. If this is an indexed column and it has a range predicate, then we use the range estimate. If no range estimate is available, we use index statistics. For non-indexed columns, we use guesstimates; here are some examples.
  33. 30 minutes: After the join optimizer has run, the final join order has been decided. Still, there are some adjustments we can make to improve the query plan.
  34. 30 minutes: The main optimizations are:
  35. 34 minutes: As I said at the start of my presentation, I would get back to subquery optimizations at the end of this talk.
  36. 34 minutes
  37. This is a good example of a transformation where we re-write the query to a similar query that will be less costly to execute.
  38. 40 minutes: That was the end of what the optimizer does. Let us quickly spend a few minutes looking at what the optimizer has produced.
  39. 40 minutes
  40. If you want to see the cost numbers for a given query, we have added them to the JSON format of EXPLAIN (EXPLAIN FORMAT=JSON) in 5.7. Here you see the EXPLAIN output for the same query we just used in the example of the cost model for range access.
  41. If you need to understand why the optimizer selects a given plan, then optimizer trace can be used. In the optimizer trace you will find all the main steps and decisions made by the optimizer. To get the optimizer trace, you first enable optimizer trace, then run your query, and finally get the trace from the information schema table named OPTIMIZER_TRACE. This example shows a part of the optimizer trace for this query. The part that is shown is the result of analyzing which access method to use: table scan, covering index scan, or range scan.
  42. There might be cases where the optimizer fails to find the best query plan. Luckily, there are some ways you can deal with that and force it to select a better plan. This slide lists some of the options that can be used: