Advanced query optimizer
tuning and analysis
Sergei Petrunia
Timour Katchaounov
Monty Program Ab
MySQL Conference And Expo 2013
2 07:48:08 AM
● Introduction
– What is an optimizer problem
– How to catch it
● old and new tools
● Single-table selects
– brief recap from 2012
– ref access
● index statistics
– join condition pushdown
– join plan efficiency
– query plan vs reality
● Big I/O bound JOINs
– Batched Key Access
● Aggregate functions
● Subqueries
3 07:48:08 AM
Is there a problem with query optimizer?
• Database performance is affected by many factors
• One of them is the query optimizer
• Is my performance problem caused by the optimizer?
4 07:48:08 AM
Signs that there is a query optimizer problem
• Some (not all) queries are slow
• A query seems to run longer than it ought to
– And examines more records than it ought to
• Usually, query remains slow regardless of
other activity on the server
5 07:48:08 AM
Catching slow queries, the old ways
● Watch the Slow query log
– Percona Server/MariaDB:
# Thread_id: 1 Schema: dbt3sf10 QC_hit: No
# Query_time: 2.452373 Lock_time: 0.000113 Rows_sent: 0 Rows_examined: 1500000
# Full_scan: Yes Full_join: No Tmp_table: No Tmp_table_on_disk: No
# Filesort: No Filesort_on_disk: No Merge_passes: 0
SET timestamp=1333385770;
select * from customer where c_acctbal < -1000;
• Run SHOW PROCESSLIST periodically
– Run pt-query-digest on the log
6 07:48:08 AM
SHOW EXPLAIN
• Available in MariaDB 10.0+
• Displays EXPLAIN of a running statement
MariaDB> show processlist;
|Id|User|Host |db |Command|Time|State |Info
| 1|root|localhost|dbt3sf1|Query | 10|Sending data|select max(o_totalprice) ...
| 2|root|localhost|dbt3sf1|Query | 0|init |show processlist
MariaDB> show explain for 1;
|id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra |
|1 |SIMPLE |orders|ALL |NULL |NULL|NULL |NULL|1498194|Using where|
MariaDB [dbt3sf1]> show warnings;
|Level|Code|Message |
|Note |1003|select max(o_totalprice) from orders where year(o_orderDATE)=1995|
7 07:48:08 AM
● Intended usage
● Why not just run EXPLAIN again
– Difficult to replicate setups
● Temporary tables
● Optimizer settings
● Storage engine's index statistics
● ...
– No uncertainty about whether you're looking at
the same query plan or not.
8 07:48:08 AM
Catching slow queries (NEW)
● use performance_schema
● Many ways to analyze via queries
– events_statements_summary_by_digest
● count_star, sum_timer_wait,
min_timer_wait, avg_timer_wait, max_timer_wait
● digest_text, digest
● sum_rows_examined, sum_created_tmp_disk_tables,
– events_statements_history
● sql_text, digest_text, digest
● timer_start, timer_end, timer_wait
● rows_examined, created_tmp_disk_tables,
9 07:48:08 AM
Catching slow queries (NEW)
• Modified Q18 from DBT3
select c_name, c_custkey, o_orderkey, o_orderdate,
o_totalprice, sum(l_quantity)
from customer, orders, lineitem
where o_totalprice > ?
and c_custkey = o_custkey
and o_orderkey = l_orderkey
group by c_name, c_custkey, o_orderkey,
o_orderdate, o_totalprice
order by o_totalprice desc, o_orderdate
• App executes Q18 many times with
? = 550000, 500000, 400000, ...
10 07:48:08 AM
Catching slow queries (NEW)
● Find candidate slow queries
● Simple tests: select_full_join > 0,
created_tmp_disk_tables > 0, etc
● Complex conditions:
max execution time > X sec OR
min/max time vary a lot:
select max_timer_wait/avg_timer_wait as max_ratio,
avg_timer_wait/min_timer_wait as min_ratio
from events_statements_summary_by_digest
where max_timer_wait > 1000000000000
or max_timer_wait / avg_timer_wait > 2
or avg_timer_wait / min_timer_wait > 2\G
11 07:48:08 AM
Catching slow queries (NEW)
*************************** 5. row ***************************
DIGEST: 3cd7b881cbc0102f65fe8a290ec1bd6b
DIGEST_TEXT: SELECT `c_name` , `c_custkey` , `o_orderkey` , `o_orderdate` ,
`o_totalprice` , SUM ( `l_quantity` ) FROM `customer` , `orders` , `lineitem` WHERE
`o_totalprice` > ? AND `c_custkey` = `o_custkey` AND `o_orderkey` = `l_orderkey` GROUP BY
`c_name` , `c_custkey` , `o_orderkey` , `o_orderdate` , `o_totalprice` ORDER BY `o_totalprice`
DESC , `o_orderdate` LIMIT ?
SUM_TIMER_WAIT: 3251758347000
MIN_TIMER_WAIT: 3914209000 → 0.0039 sec
AVG_TIMER_WAIT: 1083919449000
MAX_TIMER_WAIT: 3204044053000 → 3.2 sec
SUM_LOCK_TIME: 555000000
FIRST_SEEN: 1970-01-01 03:38:27
LAST_SEEN: 1970-01-01 03:38:43
max_ratio: 2.9560
min_ratio: 276.9192
High variance of
execution time
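The two ratios in this row can be re-derived by hand. A minimal Python sketch, assuming only that the performance_schema timer columns count picoseconds (the helper name `ps_to_sec` is ours, not part of any API):

```python
# Timer values copied from the digest row above (picoseconds).
MIN_TIMER_WAIT = 3_914_209_000
AVG_TIMER_WAIT = 1_083_919_449_000
MAX_TIMER_WAIT = 3_204_044_053_000

PS_PER_SEC = 10**12  # performance_schema timers are in picoseconds


def ps_to_sec(picoseconds):
    """Convert a picosecond timer value to seconds."""
    return picoseconds / PS_PER_SEC


max_ratio = MAX_TIMER_WAIT / AVG_TIMER_WAIT  # slowest run vs. average run
min_ratio = AVG_TIMER_WAIT / MIN_TIMER_WAIT  # average run vs. fastest run

# Same three tests as the WHERE clause of the digest query.
is_candidate = (
    ps_to_sec(MAX_TIMER_WAIT) > 1
    or max_ratio > 2
    or min_ratio > 2
)

print(f"min={ps_to_sec(MIN_TIMER_WAIT):.4f}s max={ps_to_sec(MAX_TIMER_WAIT):.1f}s")
print(f"max_ratio={max_ratio:.4f} min_ratio={min_ratio:.4f} candidate={is_candidate}")
```

A large min_ratio with a modest max_ratio, as here, suggests that most executions are fast and a few are very slow.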
12 07:48:08 AM
Catching slow queries (NEW)
● Check the actual queries and constants
● The events_statements_history table
select timer_wait/1000000000000 as exec_time, sql_text
from events_statements_history
where digest in
(select digest from events_statements_summary_by_digest
where max_timer_wait > 1000000000000
or max_timer_wait / avg_timer_wait > 2
or avg_timer_wait / min_timer_wait > 2)
order by timer_wait;
13 07:48:08 AM
Catching slow queries (NEW)
| exec_time | sql_text |
| 0.0039 | select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
from customer, orders, lineitem
where o_totalprice > 550000 and c_custkey = o_custkey ... LIMIT 10 |
| 0.0438 | select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
from customer, orders, lineitem
where o_totalprice > 500000 and c_custkey = o_custkey ... LIMIT 10 |
| 3.2040 | select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
from customer, orders, lineitem
where o_totalprice > 400000 and c_custkey = o_custkey ... LIMIT 10 |
orders.o_totalprice > ? is less and less selective
14 07:48:08 AM
Actions after finding the slow query
• Bad query plan
– Rewrite the query
– Force a good query plan
• Bad optimizer settings
– Do tuning
• Query is inherently complex
– Don't waste time with it
– Look for other solutions.
15 07:48:08 AM
● Introduction
– What is an optimizer problem
– How to catch it
● old and new tools
● Single-table selects
– brief recap from 2012
– ref access
● index statistics
– join condition pushdown
– join plan efficiency
– query plan vs reality
● Big I/O bound JOINs
– Batched Key Access
● Aggregate functions
● Subqueries
16 07:48:08 AM
Consider a simple select
select * from orders
where o_orderDate BETWEEN '1992-06-06' and '1992-07-06' and
o_clerk='Clerk#000009506'
● Check the query plan:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | SIMPLE | orders | ALL | NULL | NULL | NULL | NULL | 15084733 | Using where |
● Run the query:
19 rows in set (7.65 sec)
• 15M rows were scanned, 19 rows in output
• Query plan seems inefficient
– (note: this logic doesn't directly apply to group/order by queries).
17 07:48:08 AM
Query plan analysis
• Entire table is scanned
• WHERE condition checked
after records are read
– Not used to limit
#examined rows.
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | SIMPLE | orders | ALL | NULL | NULL | NULL | NULL | 15084733 | Using where |
select * from orders
where o_orderDate BETWEEN '1992-06-06' and '1992-07-06' and
o_clerk='Clerk#000009506'
18 07:48:08 AM
Let's add an index
alter table orders add key i_o_orderdate (o_orderdate);
select * from orders
where o_orderDate BETWEEN '1992-06-06' and '1992-07-06' and
o_clerk='Clerk#000009506'
|id|select_type|table |type |possible_keys|key |key_len|ref |rows |Extra |
|1 |SIMPLE |orders|range|i_o_orderdate|i_o_orderdate|4 |NULL|306322|Using where|
● Query time:
19 rows in set (0.76 sec)
• Outcome
– Down to reading 300K rows
– Still, 300K >> 19 rows.
19 07:48:08 AM
Finding out which indexes to add
● index (o_orderdate)
● index (o_clerk)
Check selectivity of conditions that will use the index
select * from orders
where o_orderDate BETWEEN '1992-06-06' and '1992-07-06' and
o_clerk='Clerk#000009506'
select count(*) from orders
where o_orderDate BETWEEN '1992-06-06' and '1992-07-06';
306322 rows
select count(*) from orders where o_clerk='Clerk#000009506'
1507 rows.
20 07:48:08 AM
Try adding composite indexes
● index (o_clerk, o_orderdate)
|id|select_type|table |type |possible_keys|key |key_len|ref |rows|Extra |
|1 |SIMPLE |orders|range|i_o_clerk_...|i_o_clerk_date|20 |NULL|19 |Using where|
Bingo! 100% efficiency
● index (o_orderdate, o_clerk)
|id|select_type|table |type |possible_keys|key |key_len|ref |rows |Extra |
|1 |SIMPLE |orders|range|i_o_date_c...|i_o_date_clerk|20 |NULL|360354|Using where|
Much worse!
• If condition uses multiple columns, composite index will be most efficient
• Order of columns matters
– Explanation why is outside the scope of this tutorial. Covered in last year's tutorial.
Conditions must be in SARGable form
• Condition must represent a range
• It must have form that is recognized by the optimizer
o_orderDate BETWEEN '1992-06-01' and '1992-06-30'
year(o_orderDate)=1992 and month(o_orderdate)=6
TO_DAYS(o_orderDATE) between TO_DAYS('1992-06-06') and
o_clerk LIKE 'Clerk#000009506'
o_clerk LIKE '%Clerk#000009506%'
column IN (1,10,15,21, ...)
(col1, col2) IN ( (1,1), (2,2), (3,3), …). 
22 07:48:08 AM
New in MySQL-5.6: optimizer_trace
● Lets you see the ranges
set optimizer_trace=1;
explain select * from orders
where o_orderDATE between '1992-06-01' and '1992-07-03' and
o_orderdate not in ('1992-01-01', '1992-06-12','1992-07-04');
select * from information_schema.optimizer_trace\G
● Will print a big JSON struct
● Search for range_scan_alternatives.
23 07:48:08 AM
New in MySQL-5.6: optimizer_trace
"range_scan_alternatives": [
  {
    "index": "i_o_orderdate",
    "ranges": [
      "1992-06-01 <= o_orderDATE < 1992-06-12",
      "1992-06-12 < o_orderDATE <= 1992-07-03"
    ],
    "index_dives_for_eq_ranges": true,
    "rowid_ordered": false,
    "using_mrr": false,
    "index_only": false,
    "rows": 319082,
    "cost": 382900,
    "chosen": true
  },
  {
    "index": "i_o_date_clerk",
    "ranges": [
      "1992-06-01 <= o_orderDATE < 1992-06-12",
      "1992-06-12 < o_orderDATE <= 1992-07-03"
    ],
    "index_dives_for_eq_ranges": true,
    "rowid_ordered": false,
    "using_mrr": false,
    "index_only": false,
    "rows": 406336,
    "cost": 487605,
    "chosen": false,
    "cause": "cost"
  }
]
● Considered ranges are shown in range_scan_alternatives
● This is actually the original use case of optimizer_trace
● Alas, recent mysql-5.6 displays misleading info about ranges on multi-component keys (will file a bug)
● Still, very useful.
24 07:48:08 AM
Source of #rows estimates for range
select * from orders
where o_orderDate BETWEEN '1992-06-06' and '1992-07-06'
|id|select_type|table |type |possible_keys|key |key_len|ref |rows |Extra |
|1 |SIMPLE |orders|range|i_o_orderdate|i_o_orderdate|4 |NULL|306322|Using where|
• “records_in_range” estimate
• Done by diving into index
• Usually is fairly accurate
• Not affected by ANALYZE
25 07:48:08 AM
Simple selects: conclusions
• Efficiency == “#rows_scanned is close to #rows_returned”
• Indexes and WHERE conditions reduce #rows scanned
• Index estimates are usually accurate
• Multi-column indexes
– “handle” conditions on multiple columns
– Order of columns in the index matters
• optimizer_trace lets you view the ranges
– But misrepresents ranges over multi-column indexes.
26 07:48:08 AM
Now, we will skip some topics
One can also speed up simple selects with
● index_merge access method
● index access method
● Index Condition Pushdown
We don't have time for these now, check out the last
year's tutorial.
27 07:48:08 AM
● Introduction
– What is an optimizer problem
– How to catch it
● old and new tools
● Single-table selects
– brief recap from 2012
– ref access
● index statistics
– join condition pushdown
– join plan efficiency
– query plan vs reality
● Big I/O bound JOINs
– Batched Key Access
● Aggregate functions
● Subqueries
28 07:48:08 AM
A simple join
select * from customer, orders where c_custkey=o_custkey
• “Customers with their orders”
29 07:48:08 AM
Execution: Nested Loops join
select * from customer, orders where c_custkey=o_custkey
for each customer C {
  for each order O {
    if (C.c_custkey == O.o_custkey)
      produce record(C, O);
  }
}
• Complexity:
– Scans table customer
– For each record in customer, scans table orders
• Is this ok?
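The pseudocode above can be made runnable to see where the cost comes from. A minimal Python sketch on tiny made-up tables (the row dictionaries and the `compared` counter are illustrative, not server internals):

```python
def nested_loop_join(customers, orders):
    """Join on c_custkey = o_custkey; also count row pairs compared."""
    result, compared = [], 0
    for c in customers:          # scan table customer
        for o in orders:         # for each customer row, scan table orders
            compared += 1
            if c["c_custkey"] == o["o_custkey"]:
                result.append((c, o))
    return result, compared


# Made-up data: 4 customers, 12 orders, every order belongs to one customer.
customers = [{"c_custkey": k} for k in range(1, 5)]
orders = [{"o_orderkey": i, "o_custkey": 1 + i % 4} for i in range(12)]

joined, compared = nested_loop_join(customers, orders)
print(len(joined), compared)
```

The number of compared pairs is always len(customers) * len(orders), no matter how few of them match.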
30 07:48:08 AM
Execution: Nested loops join (2)
select * from customer, orders where c_custkey=o_custkey
for each customer C {
  for each order O {
    if (C.c_custkey == O.o_custkey)
      produce record(C, O);
  }
}
|id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra |
|1 |SIMPLE |customer|ALL |NULL |NULL|NULL |NULL|148749 | |
|1 |SIMPLE |orders |ALL |NULL |NULL|NULL |NULL|1493631|Using where|
31 07:48:08 AM
Execution: Nested loops join (3)
select * from customer, orders where c_custkey=o_custkey
for each customer C {
  for each order O {
    if (C.c_custkey == O.o_custkey)
      produce record(C, O);
  }
}
|id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra |
|1 |SIMPLE |customer|ALL |NULL |NULL|NULL |NULL|148749 | |
|1 |SIMPLE |orders |ALL |NULL |NULL|NULL |NULL|1493631|Using where|
rows to read
from customer
rows to read from orders
32 07:48:08 AM
Execution: Nested loops join (4)
select * from customer, orders where c_custkey=o_custkey
|id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra |
|1 |SIMPLE |customer|ALL |NULL |NULL|NULL |NULL|148749 | |
|1 |SIMPLE |orders |ALL |NULL |NULL|NULL |NULL|1493631|Using where|
• Scan a 1,493,631-row table 148,749 times
– Consider 1,493,631 * 148,749 row combinations
• Is this query inherently complex?
– We know each customer has his own orders
– size(customer x orders)= size(orders)
– Lower bound is
1,493,631 + 148,749 + costs to match customer<->order.
33 07:48:08 AM
Using index for join: ref access
alter table orders add index i_o_custkey(o_custkey)
select * from customer, orders where c_custkey=o_custkey
34 07:48:08 AM
ref access - analysis
|id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra|
|1 |SIMPLE |customer|ALL |PRIMARY |NULL |NULL |NULL |148749| |
|1 |SIMPLE |orders |ref |i_o_custkey |i_o_custkey|5 |customer.c_custkey|7 | |
select * from customer, orders where c_custkey=o_custkey
● One ref lookup scans 7 rows.
● In total: 7 * 148,749=1,041,243 rows
– `orders` has 1.4M rows
– no redundant reads from `orders`
● The whole query plan
– Reads all customers
– Reads 1M orders (of 1.4M)
● Efficient!
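The same join with an index lookup on the inner table can be sketched as follows. This is a simplification: a Python dict stands in for the B-tree index `i_o_custkey`, and the tiny tables are made up.

```python
from collections import defaultdict


def build_index(rows, column):
    """Map each key value to the rows holding it (stand-in for a B-tree)."""
    index = defaultdict(list)
    for row in rows:
        index[row[column]].append(row)
    return index


def ref_join(customers, orders_index):
    """For each customer, read only the orders with a matching o_custkey."""
    result, rows_read = [], 0
    for c in customers:                                  # scan customer
        matching = orders_index.get(c["c_custkey"], [])  # one ref lookup
        rows_read += len(matching)
        result.extend((c, o) for o in matching)
    return result, rows_read


customers = [{"c_custkey": k} for k in range(1, 5)]
orders = [{"o_orderkey": i, "o_custkey": 1 + i % 4} for i in range(12)]

i_o_custkey = build_index(orders, "o_custkey")
joined, rows_read = ref_join(customers, i_o_custkey)
print(len(joined), rows_read)
```

Every row read from `orders` ends up in the result, which is what "no redundant reads" means on the slide.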
35 07:48:08 AM
Conditions that can be used for ref access
● Can use equalities
– tbl.key=other_table.col
– tbl.key=const
– tbl.key IS NULL
● For multipart keys, will use largest prefix
– keypart1=... AND keypart2= … AND keypartK=... .
36 07:48:08 AM
Conditions that can't be used for ref access
● Doesn't work for non-equalities
t1.key BETWEEN t2.col1 AND t2.col2
● Doesn't work for OR-ed equalities
t1.key=t2.col1 OR t1.key=t2.col2
– Except for ref_or_null
t1.key=... OR t1.key IS NULL
● Doesn't “combine” ref and range
– t.keypart1 BETWEEN c1 AND c2 AND
– t.keypart2 BETWEEN c1 AND c2 AND
t.keypart1=t2.col .
37 07:48:08 AM
Is ref always efficient?
● Efficient, if column has many different values
– Best case – unique index (eq_ref)
● A few different values – not useful
● Skewed distribution: depends on which part the
join touches
38 07:48:08 AM
ref access estimates - index statistics
• How many rows will match
tbl.key_column = $value
for an arbitrary $value?
• Index statistics
show keys from orders where key_name='i_o_custkey'
*************************** 1. row ***************
Table: orders
Non_unique: 1
Key_name: i_o_custkey
Seq_in_index: 1
Column_name: o_custkey
Collation: A
Cardinality: 214462
Sub_part: NULL
Packed: NULL
Null: YES
Index_type: BTREE
show table status like 'orders'
*************************** 1. row ****
Name: orders
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 1495152
Avg_row_length: 133
Data_length: 199966720
Max_data_length: 0
Index_length: 122421248
Data_free: 6291456
average = Rows /Cardinality = 1495152 / 214462 = 6.97.
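The estimate is plain arithmetic on the two outputs above. A small Python sketch (the function name `avg_rows_per_key` is ours):

```python
def avg_rows_per_key(table_rows, cardinality):
    """Table-wide average number of rows per distinct key value."""
    return table_rows / cardinality


# Rows from SHOW TABLE STATUS, Cardinality from SHOW KEYS.
estimate = avg_rows_per_key(1_495_152, 214_462)
print(f"{estimate:.2f}")
```

This is the figure that shows up, rounded, in the rows column of EXPLAIN for the ref access on i_o_custkey.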
39 07:48:08 AM
ref access – conclusions
● Based on t.key=... equality conditions
● Can make joins very efficient
● Relies on index statistics for estimates.
40 07:48:08 AM
Optimizer statistics
● MySQL/Percona Server
– Index statistics
– Persistent/transient InnoDB stats
● MariaDB
– Index statistics, persistent/transient
● Same as Percona Server (via XtraDB)
– Persistent,
index-independent statistics.
41 07:48:08 AM
Index statistics
● Cardinality lets you calculate a table-wide
average #rows-per-key-prefix
● It is a statistical value (inexact)
● Exact collection procedure depends on the
storage engine
– InnoDB – random sampling
– MyISAM – index scan
– Engine-independent – index scan.
42 07:48:08 AM
Index statistics in MySQL 5.6
● Sample [8] random index leaf pages
● Table statistics (stored)
– rows - estimated number of rows in a table
– Other stats not used by optimizer
● Index statistics (stored)
– fields - #fields in the index
– rows_per_key - rows per 1 key value, per prefix fields
([1 column value], [2 columns value], [3 columns value], …)
– Other stats not used by optimizer.
43 07:48:08 AM
Index statistics updates
● Statistics updated when:
– ANALYZE TABLE tbl_name [, tbl_name] …
– A table is opened for the first time
(after server restart)
– A table has changed >10%
– When InnoDB Monitor is turned ON.
44 07:48:08 AM
Displaying optimizer statistics
● MySQL 5.5, MariaDB 5.3, and older
– Issue SQL statements to count rows/keys
– Indirectly, look at EXPLAIN for simple queries
● MariaDB 5.5, Percona Server 5.5 (using XtraDB)
– information_schema.[innodb_index_stats, innodb_table_stats]
– Read-only, always visible
● MySQL 5.6
– mysql.[innodb_index_stats, innodb_table_stats]
– User updateable
– Only available if innodb_analyze_is_persistent=ON
● MariaDB 10.0
– Persistent updateable tables mysql.[index_stats, column_stats, table_stats]
– User updateable
– + current XtraDB mechanisms.
45 07:48:08 AM
Plan [in]stability
● Statistics may vary a lot (orders)
MariaDB [dbt3]> select * from information_schema.innodb_index_stats;
+------------+-----------------+--------------+ +---------------+
| table_name | index_name | rows_per_key | | rows_per_key | error (actual)
+------------+-----------------+--------------+ +---------------+
| partsupp | PRIMARY | 3, 1 | | 4, 1 | 25%
| partsupp | i_ps_partkey | 3, 0 | => | 4, 1 | 25% (4)
| partsupp | i_ps_suppkey | 64, 0 | | 91, 1 | 30% (80)
| orders | i_o_orderdate | 9597, 1 | | 1660956, 0 | 99% (6234)
| orders | i_o_custkey | 15, 1 | | 15, 0 | 0% (15)
| lineitem | i_l_receiptdate | 7425, 1, 1 | | 6665850, 1, 1 | 99.9% (23477)
+------------+-----------------+--------------+ +---------------+
MariaDB [dbt3]> select * from information_schema.innodb_table_stats;
+-----------------+----------+ +----------+
| table_name | rows | | rows |
+-----------------+----------+ +----------+
| partsupp | 6524766 | | 9101065 | 28% (8000000)
| orders | 15039855 | ==> | 14948612 | 0.6% (15000000)
| lineitem | 60062904 | | 59992655 | 0.1% (59986052)
+-----------------+----------+ +----------+
46 07:48:08 AM
Controlling statistics (MySQL 5.6)
● Persistent and user-updateable InnoDB statistics
– innodb_analyze_is_persistent = ON,
– updated manually by ANALYZE TABLE or
– automatically by innodb_stats_auto_recalc = ON
● Control the precision of sampling [default 8]
– innodb_stats_persistent_sample_pages,
– innodb_stats_transient_sample_pages
No new statistics compared to older versions.
47 07:48:08 AM
Controlling statistics (MariaDB 10.0)
Current XtraDB index statistics
● Engine-independent, persistent, user-updateable statistics
● Precise
● Additional statistics per column (even when there is no index)
– min_value, max_value: minimum/maximum value per column
– nulls_ratio: fraction of null values in a column
– avg_length: average size of values in a column
– avg_frequency: average number of rows with the same value
48 07:48:08 AM
Join condition pushdown
49 07:48:08 AM
Join condition pushdown
select *
from customer, orders
where c_custkey=o_custkey and c_acctbal < -500 and
o_orderpriority='1-URGENT';
|id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra |
|1 |SIMPLE |customer|ALL |PRIMARY |NULL |NULL |NULL |150081|Using where|
|1 |SIMPLE |orders |ref |i_o_custkey |i_o_custkey|5 |customer.c_custkey|7 |Using where|
● Conjunctive (ANDed) conditions are split into parts
● Each part is attached as early as possible
– Either as “Using where”
– Or as table access method.
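The "attach as early as possible" rule can be sketched in a few lines. This is a toy model, not the optimizer's code: each ANDed part is described by the set of tables it refers to, and it is attached to the first table in the join order at which all of those tables have been read. The three conditions are those of the example query in this deck.

```python
def attach_conditions(join_order, conditions):
    """conditions: list of (text, tables_used). Returns {table: [texts]}."""
    attached = {table: [] for table in join_order}
    for text, tables_used in conditions:
        available = set()
        for table in join_order:
            available.add(table)
            if tables_used <= available:   # all referenced tables are read
                attached[table].append(text)
                break
    return attached


plan = attach_conditions(
    ["customer", "orders"],
    [
        ("c_custkey=o_custkey", {"customer", "orders"}),
        ("c_acctbal < -500", {"customer"}),
        ("o_orderpriority='1-URGENT'", {"orders"}),
    ],
)
for table, conds in plan.items():
    print(table, conds)
```

Whether an attached part then becomes an access method (like the ref lookup on o_custkey) or stays in "Using where" is a separate decision that this sketch does not model.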
53 07:48:08 AM
Observing join condition pushdown
"query_block": {
"select_id": 1,
"nested_loop": [
"table": {
"table_name": "orders",
"access_type": "ALL",
"possible_keys": [
"rows": 1499715,
"filtered": 100,
"attached_condition": "((`dbt3sf1`.`orders`.`o_orderpriority` =
'1-URGENT') and (`dbt3sf1`.`orders`.`o_custkey` is not null))"
"table": {
"table_name": "customer",
"access_type": "eq_ref",
"possible_keys": [
"key": "PRIMARY",
"used_key_parts": [
"key_length": "4",
"ref": [
"rows": 1,
"filtered": 100,
"attached_condition": "(`dbt3sf1`.`customer`.`c_acctbal` <
● Before mysql-5.6: EXPLAIN shows only “Using where”
– The condition itself is only visible in debug output
● Starting from 5.6: EXPLAIN FORMAT=JSON shows attached conditions
54 07:48:08 AM
Reasoning about join plan efficiency
select *
from customer, orders
where c_custkey=o_custkey and c_acctbal < -500 and o_orderpriority='1-URGENT';
|id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra |
|1 |SIMPLE |customer|ALL |PRIMARY |NULL |NULL |NULL |150081|Using where|
|1 |SIMPLE |orders |ref |i_o_custkey |i_o_custkey|5 |customer.c_custkey|7 |Using where|
First table, “customer”
● type=ALL, 150 K rows
select count(*) from customer where c_acctbal < -500 gives 6804.
● alter table customer add index (c_acctbal).
55 07:48:08 AM
Reasoning about join plan efficiency
select *
from customer, orders
where c_custkey=o_custkey and c_acctbal < -500 and o_orderpriority='1-URGENT';
First table, “customer”
● type=ALL, 150 K rows
select count(*) from customer where c_acctbal < -500 gives 6804.
● alter table customer add index (c_acctbal)
|id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra |
|1 |SIMPLE |customer|ALL |PRIMARY |NULL |NULL |NULL |150081|Using where|
|1 |SIMPLE |orders |ref |i_o_custkey |i_o_custkey|5 |customer.c_custkey|7 |Using where|
|id|select_type|table |type |possible_keys|key |key_len|ref |rows|Extra |
|1 |SIMPLE |customer|range|PRIMARY,c_...|c_acctbal |9 |NULL |6802|Using index condition|
|1 |SIMPLE |orders |ref |i_o_custkey |i_o_custkey|5 |customer.c_custkey|7 |Using where |
Now, access to 'customer' is efficient.
56 07:48:08 AM
Reasoning about join plan efficiency
select *
from customer, orders
where c_custkey=o_custkey and c_acctbal < -500 and o_orderpriority='1-URGENT';
Second table, “orders”
● Attached condition: c_custkey=o_custkey and o_orderpriority='1-URGENT'
● ref access uses only c_custkey=o_custkey
● What about o_orderpriority='1-URGENT'?.
|id|select_type|table |type |possible_keys|key |key_len|ref |rows|Extra |
|1 |SIMPLE |customer|range|PRIMARY,c_...|c_acctbal |9 |NULL |6802|Using index condition|
|1 |SIMPLE |orders |ref |i_o_custkey |i_o_custkey|5 |customer.c_custkey|7 |Using where |
57 07:48:08 AM
● select count(*) from orders – 1.5M rows
● select count(*) from orders where o_orderpriority='1-URGENT' - 300K
● 300K / 1.5M = 0.2
58 07:48:08 AM
Reasoning about join plan efficiency
select *
from customer, orders
where c_custkey=o_custkey and c_acctbal < -500 and o_orderpriority='1-URGENT';
Second table, “orders”
● Attached condition: c_custkey=o_custkey and o_orderpriority='1-URGENT'
● ref access uses only c_custkey=o_custkey
● What about o_orderpriority='1-URGENT'? Selectivity= 0.2
– Can examine 7*0.2=1.4 rows, 6802 times if we add an index:
alter table orders add index (o_custkey, o_orderpriority)
alter table orders add index (o_orderpriority, o_custkey)
|id|select_type|table |type |possible_keys|key |key_len|ref |rows|Extra |
|1 |SIMPLE |customer|range|PRIMARY,c_...|c_acctbal |9 |NULL |6802|Using index condition|
|1 |SIMPLE |orders |ref |i_o_custkey |i_o_custkey|5 |customer.c_custkey|7 |Using where |
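The arithmetic behind "7*0.2=1.4 rows, 6802 times" can be written out. A Python sketch using the numbers from these slides; it assumes the o_orderpriority condition is independent of o_custkey, which is the same simplification the slide makes:

```python
customer_rows = 6802                 # rows read from customer (range access)
rows_per_ref_lookup = 7              # estimate for a lookup on i_o_custkey
selectivity = 300_000 / 1_500_000    # share of orders with '1-URGENT'

# Current plan: every lookup reads ~7 rows, then "Using where" filters them.
rows_current = customer_rows * rows_per_ref_lookup

# With an index covering both columns: only matching rows are read.
rows_per_lookup_indexed = rows_per_ref_lookup * selectivity
rows_indexed = customer_rows * rows_per_lookup_indexed

print(rows_current, round(rows_per_lookup_indexed, 1), round(rows_indexed))
```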
59 07:48:08 AM
Reasoning about join plan efficiency - summary
Basic* approach to evaluation of join plan efficiency:
for each table $T in the join order {
  Look at conditions attached to table $T (condition must
  use table $T, may also use previous tables)
  Does access method used with $T make a good use
  of attached conditions?
}
|id|select_type|table |type |possible_keys|key |key_len|ref |rows|Extra |
|1 |SIMPLE |customer|range|PRIMARY,c_...|c_acctbal |9 |NULL |6802|Using index condition|
|1 |SIMPLE |orders |ref |i_o_custkey |i_o_custkey|5 |customer.c_custkey|7 |Using where |
* some other details may also affect join performance
60 07:48:08 AM
Attached conditions
61 07:48:08 AM
Attached conditions
● Ideally, should be used for table access
● Not all conditions can be used [at the same time]
– Unused ones are still useful
– They reduce number of scans for subsequent tables
select *
from customer, orders
where c_custkey=o_custkey and c_acctbal < -500 and
o_orderpriority='1-URGENT';
|id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra |
|1 |SIMPLE |customer|ALL |PRIMARY |NULL |NULL |NULL |150081|Using where|
|1 |SIMPLE |orders |ref |i_o_custkey |i_o_custkey|5 |customer.c_custkey|7 |Using where|
62 07:48:08 AM
Informing optimizer about attached conditions
Currently: a range access that's too expensive to use
|id|select_type|table |type|possible_keys |key |key_len|ref |rows |filtered|Extra |
|1 |SIMPLE |customer|ALL |PRIMARY,c_acctbal|NULL |NULL |NULL |150081| 36.22 |Using where|
|1 |SIMPLE |orders |ref |i_o_custkey |i_o_custkey|5 |customer.c_custkey|7 | 100.00 |Using where|
explain extended
select *
from customer, orders
where
c_custkey=o_custkey and c_acctbal > 8000 and
● `orders` will be scanned 150081 * 36.22%= 54359 times
● This reduces the cost of join
– Has an effect when comparing potential join plans
● => Index c_acctbal is not used for table access, but it still helps the optimizer.
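The number of scans of `orders` follows directly from the rows and filtered columns of the first EXPLAIN row. A short Python check (variable names are ours):

```python
customer_rows = 150_081      # rows column for table customer
filtered_percent = 36.22     # filtered column for table customer

# Rows of customer expected to pass the attached condition; each one
# triggers one ref lookup into orders.
orders_scans = customer_rows * filtered_percent / 100

print(round(orders_scans))
```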
63 07:48:08 AM
Attached condition selectivity
● Unused indexes provide info about selectivity
– Works, but very expensive
● MariaDB 10.0 has engine-independent statistics
– Index statistics
– Non-indexed Column statistics
● Histograms
– Further info:
Tomorrow, 2:20 pm @ Ballroom D
Igor Babaev
Engine-independent persistent statistics with histograms
in MariaDB.
64 07:48:08 AM
How to check if the query plan
matches the reality
65 07:48:08 AM
Check if query plan is realistic
● EXPLAIN shows what optimizer
expects. It may be wrong
– Out-of-date index statistics
– Non-uniform data distribution
● MySQL: no equivalent. Instead, have
– Handler counters
– “User statistics” (Percona, MariaDB)
66 07:48:08 AM
Join analysis: example query (Q18, DBT3)
<reset counters>
select c_name, c_custkey, o_orderkey, o_orderdate,
o_totalprice, sum(l_quantity)
from customer, orders, lineitem
where o_totalprice > 500000
and c_custkey = o_custkey
and o_orderkey = l_orderkey
group by c_name, c_custkey, o_orderkey, o_orderdate,
o_totalprice
order by o_totalprice desc, o_orderdate
<collect statistics>
67 07:48:08 AM
Join analysis: handler counters (old)
| Handler_mrr_key_refills | 0 |
| Handler_mrr_rowid_refills | 0 |
| Handler_read_first | 0 |
| Handler_read_key | 1646 |
| Handler_read_last | 0 |
| Handler_read_next | 1462 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 10 |
| Handler_read_rnd_deleted | 0 |
| Handler_read_rnd_next | 184 |
| Handler_tmp_update | 1096 |
| Handler_tmp_write | 183 |
| Handler_update | 0 |
| Handler_write | 0 |
68 07:48:08 AM
Join analysis: USERSTAT by Facebook
MariaDB, Percona Server
| Table_schema | Table_name | Rows_read | Rows_changed | Rows_changed_x_#indexes |
| dbt3 | orders | 183 | 0 | 0 |
| dbt3 | lineitem | 1279 | 0 | 0 |
| dbt3 | customer | 183 | 0 | 0 |
| Table_schema | Table_name | Index_name | Rows_read |
| dbt3 | customer | PRIMARY | 183 |
| dbt3 | lineitem | i_l_orderkey_quantity | 1279 |
| dbt3 | orders | i_o_totalprice | 183 |
69 07:48:08 AM
[MySQL 5.6, MariaDB 10.0]
● summary tables with read/write statistics
– table_io_waits_summary_by_table
– table_io_waits_summary_by_index_usage
● Superset of the userstat tables
● More overhead
● Not possible to associate statistics with a query
=> truncate stats tables before running a query
● Possible bug
– performance schema not ignored
– Disable by
UPDATE setup_consumers SET ENABLED = 'NO'
where name = 'global_instrumentation';
70 07:48:08 AM
Analyze joins via PERFORMANCE SCHEMA:
select object_schema, object_name, count_read, count_write,
sum_timer_read, sum_timer_write, ...
from table_io_waits_summary_by_table
where object_schema = 'dbt3' and count_star > 0;
| object_schema | object_name | count_read | count_write |
| dbt3 | customer | 183 | 0 |
| dbt3 | lineitem | 1462 | 0 |
| dbt3 | orders | 184 | 0 |
| sum_timer_read | sum_timer_write | ...
| 8326528406 | 0 |
| 12117332778 | 0 |
| 7946312812 | 0 |
71 07:48:08 AM
Analyze joins via PERFORMANCE SCHEMA:
select object_schema, object_name, index_name, count_read,
sum_timer_read, sum_timer_write, ...
from table_io_waits_summary_by_index_usage
where object_schema = 'dbt3' and count_star > 0
and index_name is not null;
| object_schema | object_name | index_name | count_read |
| dbt3 | customer | PRIMARY | 183 |
| dbt3 | lineitem | i_l_orderkey_quantity | 1462 |
| dbt3 | orders | i_o_totalprice | 184 |
| sum_timer_read | sum_timer_write | ...
| 8326528406 | 0 |
| 12117332778 | 0 |
| 7946312812 | 0 |
72 07:48:08 AM
● Introduction
– What is an optimizer problem
– How to catch it
● old and new tools
● Single-table selects
– brief recap from 2012
– ref access
● index statistics
– join condition pushdown
– join plan efficiency
– query plan vs reality
● Big I/O bound JOINs
– Batched Key Access
● Aggregate functions
● Subqueries
73 07:48:08 AM
Batched joins
● Optimization for analytical queries
● Analytic queries shovel through lots of data
– e.g. “average size of order in the last month”
– or “pairs of goods purchased together”
● Indexes, etc. won't help when you really need to look at all data
● More data means greater chance of being I/O-bound
● Solution: batched joins
74 07:48:08 AM
Batched Key Access Idea
● Non-BKA join hits data at random
● Caches are not used efficiently
● Prefetching is not useful
81 07:48:08 AM
Batched Key Access Idea
● BKA implementation accesses data in order
● Takes advantage of caches and prefetching
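The idea can be illustrated with a toy model. The sketch below is not the server's implementation: it assumes lookup keys roughly correspond to positions on disk, collects them in batches of `buffer_size` (standing in for what fits into the join buffer), sorts each batch, and measures how far the "disk head" travels. All names and numbers are made up.

```python
def bka_lookup_order(outer_keys, buffer_size):
    """Yield lookup keys batch by batch, each batch in sorted order."""
    for start in range(0, len(outer_keys), buffer_size):
        batch = outer_keys[start:start + buffer_size]
        yield from sorted(batch)


def seek_distance(keys):
    """Total jump distance between consecutive lookups (proxy for random I/O)."""
    return sum(abs(b - a) for a, b in zip(keys, keys[1:]))


# Keys in the order the outer table produces them.
outer_keys = [90, 3, 57, 12, 88, 5, 61, 14]

plain = seek_distance(outer_keys)
small_buffer = seek_distance(list(bka_lookup_order(outer_keys, buffer_size=4)))
big_buffer = seek_distance(list(bka_lookup_order(outer_keys, buffer_size=8)))

print(plain, small_buffer, big_buffer)
```

A bigger batch gives a more sequential access pattern, which is consistent with the buffer size being the key tuning knob.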
82 07:48:08 AM
Batched Key access effect
set join_cache_level=6;
select max(l_extendedprice)
from orders, lineitem
where l_orderkey=o_orderkey and
o_orderdate between $DATE1 and $DATE2
The benchmark was run with
● Various BKA buffer size
● Various size of $DATE1...$DATE2 range
83 07:48:08 AM
Batched Key Access Performance
[Chart: BKA join performance depending on buffer size. X axis: Buffer size, bytes. Series: query_size=1, 2, 3, each as regular join and as BKA join. Callouts: "Performance without BKA"; "Performance with BKA, given sufficient buffer size".]
● 4x-10x speedup
● The more the data, the bigger the speedup
● Buffer size setting is very important.
84 07:48:08 AM
Batched Key Access settings
● Needs to be turned on
set join_buffer_size= 32*1024*1024;
set join_cache_level=6; -- MariaDB
set optimizer_switch='batched_key_access=on'; -- MySQL 5.6
set optimizer_switch='mrr=on';
set optimizer_switch='mrr_sort_keys=on'; -- MariaDB only
● To tune join_buffer_size further, keep increasing it while watching
– Query performance
– Handler_mrr_init counter
until either one stops changing.
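Why the counter levels off can be illustrated with a rough model. It assumes that every refill of the join buffer starts one MRR scan, so that a counter like Handler_mrr_init grows by one per refill; that assumption, the `KEY_BYTES` figure and the `mrr_scans` helper are illustrative and not taken from the server.

```python
import math

KEY_BYTES = 64  # hypothetical space one outer row takes in the join buffer


def mrr_scans(outer_rows, join_buffer_size):
    """Buffer refills needed to push all outer rows through the join buffer."""
    rows_per_buffer = max(1, join_buffer_size // KEY_BYTES)
    return math.ceil(outer_rows / rows_per_buffer)


outer_rows = 1_000_000
for megabytes in (1, 4, 16, 64, 256):
    size = megabytes * 1024 * 1024
    print(f"join_buffer_size={megabytes}M scans={mrr_scans(outer_rows, size)}")
```

Once the whole outer side fits into a single buffer the count stays at 1, and a larger buffer buys nothing more.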
85 07:48:08 AM
Batched Key Access - conclusions
● Targeted at big joins
● Needs to be enabled manually
● @@join_buffer_size is the most important setting
● MariaDB's implementation is a superset of MySQL's
86 07:48:08 AM
87 07:48:08 AM
88 07:48:08 AM
Aggregate functions, no GROUP BY
● COUNT, SUM, AVG, etc. need to examine all rows
select SUM(column) from tbl needs to examine the whole tbl.
● MIN and MAX can use an index for lookup
|id|select_type|table|type|possible_keys|key |key_len|ref |rows|Extra |
|1 |SIMPLE |NULL |NULL|NULL |NULL|NULL |NULL|NULL|Select tables optimized away|
With index (o_orderdate):
select max(o_orderdate) from orders
select min(o_orderdate) from orders where o_orderdate > '1995-05-01'
With index (o_orderpriority, o_orderdate):
select max(o_orderdate) from orders where o_orderpriority='1-URGENT'
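A minimal sketch of why these three queries need no table scan, modelling each index as a sorted Python list. The sample values, the variable names and the '9999-12-31' sentinel are invented for illustration; a real B-tree lookup is analogous but not identical.

```python
import bisect

# Hypothetical index on (o_orderdate): just the sorted column values.
idx_orderdate = sorted(
    ["1995-01-10", "1995-05-01", "1995-06-17", "1996-02-03", "1994-11-30"]
)

# select max(o_orderdate) from orders
# The answer is the last index entry.
max_date = idx_orderdate[-1]

# select min(o_orderdate) from orders where o_orderdate > '1995-05-01'
# The answer is the first entry to the right of the constant.
pos = bisect.bisect_right(idx_orderdate, "1995-05-01")
min_after = idx_orderdate[pos] if pos < len(idx_orderdate) else None

# Hypothetical index on (o_orderpriority, o_orderdate): sorted tuples.
idx_prio_date = sorted([
    ("1-URGENT", "1995-03-02"), ("1-URGENT", "1996-07-21"),
    ("2-HIGH", "1997-01-05"), ("5-LOW", "1994-08-14"),
])

# select max(o_orderdate) from orders where o_orderpriority='1-URGENT'
# The answer is the last entry whose first column is '1-URGENT'.
# The date sentinel sorts after every real date in this sample.
pos = bisect.bisect_right(idx_prio_date, ("1-URGENT", "9999-12-31"))
if pos > 0 and idx_prio_date[pos - 1][0] == "1-URGENT":
    max_urgent = idx_prio_date[pos - 1][1]
else:
    max_urgent = None

print(max_date, min_after, max_urgent)
```

Each answer comes from one positioned lookup in the index, which fits the "Select tables optimized away" note in the EXPLAIN output.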
89 07:48:08 AM
Three algorithms
● Use an index to read in order
● Read one table, sort, join - “Using filesort”
● Execute join into temporary table and then
sort - “Using temporary; Using filesort”
90 07:48:08 AM
Using index to read data in order
● No special indication
in EXPLAIN output
● LIMIT n: as soon as
we read n records,
we can stop!
91 07:48:08 AM
A problem with LIMIT N optimization
`orders` has 1.5 M rows
explain select * from orders order by o_orderdate desc limit 10;
|id|select_type|table |type |possible_keys|key |key_len|ref |rows|Extra|
|1 |SIMPLE |orders|index|NULL |i_o_orderdate|4 |NULL|10 | |
explain select * from orders where o_orderpriority='1-URGENT' order by o_orderdate desc limit 10;
|id|select_type|table |type |possible_keys|key |key_len|ref |rows|Extra |
|1 |SIMPLE |orders|index|NULL |i_o_orderdate|4 |NULL|10 |Using where|
● A problem:
– 1.5M rows, 300K of them 'URGENT'
– Scanning by date, when will we find 10 'URGENT' rows?
– No good solution so far.
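How far the real cost can drift from the estimate of 10 rows depends on where the matching rows sit in the index order. The sketch below uses the row counts from the slide; the two placements of 'URGENT' rows and the `rows_examined` helper are invented for illustration.

```python
TOTAL = 1_500_000   # rows in `orders`, as on the slide
URGENT = 300_000    # rows with o_orderpriority='1-URGENT', as on the slide


def rows_examined(is_match_at, limit):
    """Scan positions 0..TOTAL-1 in index order; stop after `limit` matches."""
    matched = 0
    for position in range(TOTAL):
        if is_match_at(position):
            matched += 1
            if matched == limit:
                return position + 1
    return TOTAL


# No WHERE clause: every row qualifies, so the scan stops after 10 rows.
no_where = rows_examined(lambda pos: True, 10)

# 'URGENT' rows spread evenly: every fifth row in date order qualifies.
evenly_spread = rows_examined(lambda pos: pos % 5 == 4, 10)

# Worst case: all 'URGENT' rows come last in the scan order.
all_at_the_end = rows_examined(lambda pos: pos >= TOTAL - URGENT, 10)

print(no_where, evenly_spread, all_at_the_end)
```

The plan is identical in all three cases; only the data distribution changes, and the optimizer does not know it in advance.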
92 07:48:08 AM
Using filesort strategy
● Have to read the entire
first table
● For the remaining tables, can apply LIMIT n
● ORDER BY can only use
columns of tbl1.
93 07:48:08 AM
Using temporary; Using filesort
● ORDER BY clause
can use columns of
any table
● LIMIT is applied only
after executing the
entire join and sorting
94 07:48:08 AM
ORDER BY - conclusions
● Resolving ORDER BY with index allows very
efficient handling for LIMIT
– Optimization for
WHERE unused_condition ORDER BY … LIMIT n
is challenging.
● Use sql_big_result, IGNORE INDEX FOR ORDER BY
● Using filesort
– Needs all ORDER BY columns in the first table
– Take advantage of LIMIT when doing join to non-first tables
● Using temporary; Using filesort is the least efficient.
95 07:48:08 AM
GROUP BY strategies
There are three strategies
● Ordered index scan
● Loose Index Scan (LooseScan)
● Groups table
(Using temporary; [Using filesort]).
96 07:48:08 AM
Ordered index scan
● Groups are
enumerated one after
another
● Can compute
aggregates on the fly
● Loose index scan is
also able to jump to
next group.
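A minimal sketch of the ordered-scan strategy, assuming the index delivers rows already sorted by the grouping column. The sample rows and the `group_sums` helper are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical index scan output: (group_key, value) pairs, sorted by key.
index_scan = [
    ("1-URGENT", 10.0), ("1-URGENT", 30.0),
    ("2-HIGH", 5.0),
    ("3-MEDIUM", 7.0), ("3-MEDIUM", 9.0),
]


def group_sums(sorted_rows):
    """Yield (key, SUM(value)) per group, holding one group at a time."""
    for key, rows in groupby(sorted_rows, key=itemgetter(0)):
        yield key, sum(value for _, value in rows)


result = list(group_sums(index_scan))
print(result)
```

Because a group is complete as soon as the key changes, each aggregate can be emitted immediately and no temporary table is needed.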
97 07:48:08 AM
Execution of GROUP BY with temptable
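A rough sketch of the temporary-table strategy, with a dictionary keyed by the GROUP BY value standing in for the groups table. The sample rows and the helper name are hypothetical, and the final sort stands in for the optional "Using filesort" step.

```python
from collections import defaultdict

# Hypothetical join output in arbitrary order: (group_key, value) pairs.
join_output = [
    ("3-MEDIUM", 7.0), ("1-URGENT", 10.0), ("2-HIGH", 5.0),
    ("1-URGENT", 30.0), ("3-MEDIUM", 9.0),
]


def group_sums_with_temp_table(rows):
    """Accumulate SUM(value) per key in a temp table, then sort the groups."""
    temp_table = defaultdict(float)
    for key, value in rows:
        temp_table[key] += value       # find or create the group row, update it
    return sorted(temp_table.items())  # optional sort of the finished groups


result = group_sums_with_temp_table(join_output)
print(result)
```

Unlike the ordered scan, nothing can be returned until the whole input has been read, since any later row may still belong to an existing group.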
98 07:48:08 AM
99 07:48:08 AM
Subquery optimizations
● Before MariaDB 5.3/MySQL 5.6 - “don't use subqueries”
● Queries that caused most of the pain
– SELECT … FROM tbl WHERE col IN (SELECT …) - semi-joins
– SELECT … FROM (SELECT …) - derived tables
● MariaDB 5.3 and MySQL 5.6
– Share a common ancestor, MySQL 6.0 alpha
– Huge (100x, 1000x) speedups for painful areas
– Other kinds of subqueries received a speedup, too
– MariaDB 5.3/5.5 has a superset of MySQL 5.6's optimizations
● MySQL 5.6 handles some previously unhandled edge cases, too
100 07:48:08 AM
Tuning for subqueries
● “Before”: one execution strategy
– No tuning possible
● “After”: similar to joins
– Reasonable execution strategies supported
– Need indexes
– Need selective conditions
– Support batching in most important cases
● Should be better 9x% of the time.
101 07:48:08 AM
What if it still picks a poor query plan?
For both MariaDB and MySQL:
● Check EXPLAIN [EXTENDED], find a keyword around a
subquery table
● Google “$subquery_keyword”
● Find which optimization it was
● set optimizer_switch='$subquery_optimization=off'
102 07:48:08 AM
Q & A

Dev Dives: Streamline document processing with UiPath Studio Web
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...

MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013

  • 1. Advanced query optimizer tuning and analysis Sergei Petrunia Timour Katchaounov Monty Program Ab MySQL Conference And Expo 2013
  • 2. 2 07:48:08 AM ● Introduction – What is an optimizer problem – How to catch it ● old and new tools ● Single-table selects – brief recap from 2012 ● JOINs – ref access ● index statistics – join condition pushdown – join plan efficiency – query plan vs reality ● Big I/O bound JOINs – Batched Key Access ● Aggregate functions ● ORDER BY ... LIMIT ● GROUP BY ● Subqueries
  • 3. 3 07:48:08 AM Is there a problem with query optimizer? • Database performance is affected by many factors • One of them is the query optimizer • Is my performance problem caused by the optimizer?
  • 4. Signs that there is a query optimizer problem • Some (not all) queries are slow • A query seems to run longer than it ought to – And examines more records than it ought to • Usually, the query remains slow regardless of other activity on the server
  • 5. Catching slow queries, the old ways
      ● Watch the slow query log
        – Percona Server/MariaDB: --log_slow_verbosity=query_plan
            # Thread_id: 1 Schema: dbt3sf10 QC_hit: No
            # Query_time: 2.452373 Lock_time: 0.000113 Rows_sent: 0 Rows_examined: 1500000
            # Full_scan: Yes Full_join: No Tmp_table: No Tmp_table_on_disk: No
            # Filesort: No Filesort_on_disk: No Merge_passes: 0
            SET timestamp=1333385770;
            select * from customer where c_acctbal < -1000;
        – Run pt-query-digest on the log
      ● Run SHOW PROCESSLIST periodically
  • 6. 6 07:48:08 AM The new way: SHOW PROCESSLIST + SHOW EXPLAIN • Available in MariaDB 10.0+ • Displays EXPLAIN of a running statement MariaDB> show processlist; +--+----+---------+-------+-------+----+------------+-------------------------... |Id|User|Host |db |Command|Time|State |Info +--+----+---------+-------+-------+----+------------+-------------------------... | 1|root|localhost|dbt3sf1|Query | 10|Sending data|select max(o_totalprice) ... | 2|root|localhost|dbt3sf1|Query | 0|init |show processlist +--+----+---------+-------+-------+----+------------+-------------------------... MariaDB> show explain for 1; +--+-----------+------+----+-------------+----+-------+----+-------+-----------+ |id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra | +--+-----------+------+----+-------------+----+-------+----+-------+-----------+ |1 |SIMPLE |orders|ALL |NULL |NULL|NULL |NULL|1498194|Using where| +--+-----------+------+----+-------------+----+-------+----+-------+-----------+ MariaDB [dbt3sf1]> show warnings; +-----+----+-----------------------------------------------------------------+ |Level|Code|Message | +-----+----+-----------------------------------------------------------------+ |Note |1003|select max(o_totalprice) from orders where year(o_orderDATE)=1995| +-----+----+-----------------------------------------------------------------+
  • 7. 7 07:48:08 AM SHOW EXPLAIN usage ● Intended usage – SHOW PROCESSLIST ... – SHOW EXPLAIN FOR ... ● Why not just run EXPLAIN again – Difficult to replicate setups ● Temporary tables ● Optimizer settings ● Storage engine's index statistics ● ... – No uncertainty about whether you're looking at the same query plan or not.
  • 8. Catching slow queries (NEW) PERFORMANCE SCHEMA [MySQL 5.6, MariaDB 10.0] ● use performance_schema ● Many ways to analyze via queries – events_statements_summary_by_digest ● count_star, sum_timer_wait, min_timer_wait, avg_timer_wait, max_timer_wait ● digest_text, digest ● sum_rows_examined, sum_created_tmp_disk_tables, sum_select_full_join – events_statements_history ● sql_text, digest_text, digest ● timer_start, timer_end, timer_wait ● rows_examined, created_tmp_disk_tables, select_full_join
  • 9. Catching slow queries (NEW) PERFORMANCE SCHEMA [MySQL 5.6, MariaDB 10.0] • Modified Q18 from DBT3 select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity) from customer, orders, lineitem where o_totalprice > ? and c_custkey = o_custkey and o_orderkey = l_orderkey group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice order by o_totalprice desc, o_orderdate LIMIT 10; • App executes Q18 many times with ? = 550000, 500000, 400000, ...
  • 10. Catching slow queries (NEW) PERFORMANCE SCHEMA [MySQL 5.6, MariaDB 10.0] ● Find candidate slow queries ● Simple tests: select_full_join > 0, created_tmp_disk_tables > 0, etc. ● Complex conditions: max execution time > X sec OR min/max time vary a lot: select max_timer_wait/avg_timer_wait as max_ratio, avg_timer_wait/min_timer_wait as min_ratio from events_statements_summary_by_digest where max_timer_wait > 1000000000000 or max_timer_wait / avg_timer_wait > 2 or avg_timer_wait / min_timer_wait > 2\G
  • 11. 11 07:48:08 AM Catching slow queries (NEW) PERFORMANCE SCHEMA [MySQL 5.6, MariaDB 10.0] *************************** 5. row *************************** DIGEST: 3cd7b881cbc0102f65fe8a290ec1bd6b DIGEST_TEXT: SELECT `c_name` , `c_custkey` , `o_orderkey` , `o_orderdate` , `o_totalprice` , SUM ( `l_quantity` ) FROM `customer` , `orders` , `lineitem` WHERE `o_totalprice` > ? AND `c_custkey` = `o_custkey` AND `o_orderkey` = `l_orderkey` GROUP BY `c_name` , `c_custkey` , `o_orderkey` , `o_orderdate` , `o_totalprice` ORDER BY `o_totalprice` DESC , `o_orderdate` LIMIT ? COUNT_STAR: 3 SUM_TIMER_WAIT: 3251758347000 MIN_TIMER_WAIT: 3914209000 → 0.0039 sec AVG_TIMER_WAIT: 1083919449000 MAX_TIMER_WAIT: 3204044053000 → 3.2 sec SUM_LOCK_TIME: 555000000 SUM_ROWS_SENT: 25 SUM_ROWS_EXAMINED: 0 SUM_CREATED_TMP_DISK_TABLES: 0 SUM_CREATED_TMP_TABLES: 3 SUM_SELECT_FULL_JOIN: 0 SUM_SELECT_RANGE: 3 SUM_SELECT_SCAN: 0 SUM_SORT_RANGE: 0 SUM_SORT_ROWS: 25 SUM_SORT_SCAN: 3 SUM_NO_INDEX_USED: 0 SUM_NO_GOOD_INDEX_USED: 0 FIRST_SEEN: 1970-01-01 03:38:27 LAST_SEEN: 1970-01-01 03:38:43 max_ratio: 2.9560 min_ratio: 276.9192 High variance of execution time
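The ratio test from the digest query can be replayed offline on the row shown above. The following is a minimal sketch, not part of the original deck: the dictionary copies the timer values printed on the slide, the function names are hypothetical, and it relies on performance_schema timers being reported in picoseconds.

```python
# Performance Schema statement timers are expressed in picoseconds.
PS_PER_SECOND = 10**12

# Timer values copied from the digest row printed on the slide.
digest_row = {
    "COUNT_STAR": 3,
    "SUM_TIMER_WAIT": 3251758347000,
    "MIN_TIMER_WAIT": 3914209000,
    "AVG_TIMER_WAIT": 1083919449000,
    "MAX_TIMER_WAIT": 3204044053000,
}


def variance_ratios(row):
    """Return (max/avg, avg/min), the two ratios selected by the slide's query."""
    max_ratio = row["MAX_TIMER_WAIT"] / row["AVG_TIMER_WAIT"]
    min_ratio = row["AVG_TIMER_WAIT"] / row["MIN_TIMER_WAIT"]
    return max_ratio, min_ratio


def is_candidate(row, slow_ps=PS_PER_SECOND, ratio_limit=2):
    """Mirror the WHERE clause: slower than 1 sec, or execution time varies a lot."""
    max_ratio, min_ratio = variance_ratios(row)
    return (
        row["MAX_TIMER_WAIT"] > slow_ps
        or max_ratio > ratio_limit
        or min_ratio > ratio_limit
    )


max_ratio, min_ratio = variance_ratios(digest_row)
print(f"max time  : {digest_row['MAX_TIMER_WAIT'] / PS_PER_SECOND:.4f} sec")
print(f"min time  : {digest_row['MIN_TIMER_WAIT'] / PS_PER_SECOND:.4f} sec")
print(f"max_ratio : {max_ratio:.4f}")
print(f"min_ratio : {min_ratio:.4f}")
print(f"candidate : {is_candidate(digest_row)}")
```

All three conditions of the WHERE clause hold for this digest, which is why it shows up as a candidate.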
  • 12. 12 07:48:08 AM Catching slow queries (NEW) PERFORMANCE SCHEMA [MySQL 5.6, MariaDB 10.0] ● Check the actual queries and constants ● The events_statements_history table select timer_wait/1000000000000 as exec_time, sql_text from events_statements_history where digest in (select digest from events_statements_summary_by_digest where max_timer_wait > 1000000000000 or max_timer_wait / avg_timer_wait > 2 or avg_timer_wait / min_timer_wait > 2) order by timer_wait;
  • 13. 13 07:48:08 AM Catching slow queries (NEW) PERFORMANCE SCHEMA [MySQL 5.6, MariaDB 10.0] +-----------+-----------------------------------------------------------------------------------+ | exec_time | sql_text | +-----------+-----------------------------------------------------------------------------------+ | 0.0039 | select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity) from customer, orders, lineitem where o_totalprice > 550000 and c_custkey = o_custkey ... LIMIT 10 | | 0.0438 | select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity) from customer, orders, lineitem where o_totalprice > 500000 and c_custkey = o_custkey ... LIMIT 10 | | 3.2040 | select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity) from customer, orders, lineitem where o_totalprice > 400000 and c_custkey = o_custkey ... LIMIT 10 | +-----------+-----------------------------------------------------------------------------------+ Observation: orders.o_totalprice > ? is less and less selective
  • 14. Actions after finding the slow query • Bad query plan – Rewrite the query – Force a good query plan • Bad optimizer settings – Do tuning • Query is inherently complex – Don't waste time with it – Look for other solutions.
  • 15. ● Introduction – What is an optimizer problem – How to catch it ● old and new tools ● Single-table selects – brief recap from 2012 ● JOINs – ref access ● index statistics – join condition pushdown – join plan efficiency – query plan vs reality ● Big I/O bound JOINs – Batched Key Access ● Aggregate functions ● ORDER BY ... LIMIT ● GROUP BY ● Subqueries
  • 16. Consider a simple select
      select * from orders where o_orderDate BETWEEN '1992-06-06' and '1992-07-06' and o_clerk='Clerk#000009506'
      ● Check the query plan:
        +----+-------------+--------+------+---------------+------+---------+------+----------+-------------+
        | id | select_type | table  | type | possible_keys | key  | key_len | ref  | rows     | Extra       |
        +----+-------------+--------+------+---------------+------+---------+------+----------+-------------+
        |  1 | SIMPLE      | orders | ALL  | NULL          | NULL | NULL    | NULL | 15084733 | Using where |
        +----+-------------+--------+------+---------------+------+---------+------+----------+-------------+
      ● Run the query:
        19 rows in set (7.65 sec)
      • 15M rows were scanned, 19 rows in output
      • Query plan seems inefficient
        – (note: this logic doesn't directly apply to group/order by queries).
  • 17. 17 07:48:08 AM Query plan analysis • Entire table is scanned • WHERE condition checked after records are read – Not used to limit #examined rows. +----+-------------+--------+------+---------------+------+---------+------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+--------+------+---------------+------+---------+------+----------+-------------+ | 1 | SIMPLE | orders | ALL | NULL | NULL | NULL | NULL | 15084733 | Using where | +----+-------------+--------+------+---------------+------+---------+------+----------+-------------+ select * from orders where o_orderDate BETWEEN '1992-06-06' and '1992-07-06' and o_clerk='Clerk#000009506'
  • 18. 18 07:48:08 AM Let's add an index • Outcome – Down to reading 300K rows – Still, 300K >> 19 rows. alter table orders add key i_o_orderdate (o_orderdate); select * from orders where o_orderDate BETWEEN '1992-06-06' and '1992-07-06' and o_clerk='Clerk#000009506' +--+-----------+------+-----+-------------+-------------+-------+----+------+-----------+ |id|select_type|table |type |possible_keys|key |key_len|ref |rows |Extra | +--+-----------+------+-----+-------------+-------------+-------+----+------+-----------+ |1 |SIMPLE |orders|range|i_o_orderdate|i_o_orderdate|4 |NULL|306322|Using where| +--+-----------+------+-----+-------------+-------------+-------+----+------+-----------+ 19 rows in set (0.76 sec) ● Query time:
  • 19. 19 07:48:08 AM Finding out which indexes to add ● index (o_orderdate) ● index (o_clerk) Check selectivity of conditions that will use the index select * from orders where o_orderDate BETWEEN '1992-06-06' and '1992-07-06' and o_clerk='Clerk#000009506' select count(*) from orders where o_orderDate BETWEEN '1992-06-06' and '1992-07-06'; 306322 rows select count(*) from orders where o_clerk='Clerk#000009506' 1507 rows.
  • 20. Try adding composite indexes
      ● index (o_clerk, o_orderdate): Bingo! 100% efficiency
        +--+-----------+------+-----+-------------+--------------+-------+----+----+-----------+
        |id|select_type|table |type |possible_keys|key           |key_len|ref |rows|Extra      |
        +--+-----------+------+-----+-------------+--------------+-------+----+----+-----------+
        |1 |SIMPLE     |orders|range|i_o_clerk_...|i_o_clerk_date|20     |NULL|19  |Using where|
        +--+-----------+------+-----+-------------+--------------+-------+----+----+-----------+
      ● index (o_orderdate, o_clerk): Much worse!
        +--+-----------+------+-----+-------------+--------------+-------+----+------+-----------+
        |id|select_type|table |type |possible_keys|key           |key_len|ref |rows  |Extra      |
        +--+-----------+------+-----+-------------+--------------+-------+----+------+-----------+
        |1 |SIMPLE     |orders|range|i_o_date_c...|i_o_date_clerk|20     |NULL|360354|Using where|
        +--+-----------+------+-----+-------------+--------------+-------+----+------+-----------+
      • If a condition uses multiple columns, a composite index will be most efficient
      • Order of columns matters
        – Explanation why is outside of scope of this tutorial. Covered in last year's tutorial
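Why column order matters can be illustrated with a toy model in which a composite index is just a sorted list of key tuples and a range scan reads one contiguous interval of it. This is a sketch under that simplifying assumption, with a made-up table of 50 clerks and 100 order days; it is not how InnoDB stores indexes.

```python
from bisect import bisect_left, bisect_right

# Hypothetical toy table: 50 clerks x 100 order days = 5000 rows of (clerk, day).
rows = [(clerk, day) for clerk in range(50) for day in range(100)]

# A composite index modeled as a sorted list of key tuples.
idx_clerk_date = sorted((clerk, day) for clerk, day in rows)
idx_date_clerk = sorted((day, clerk) for clerk, day in rows)


def entries_scanned(index, low, high):
    """Number of index entries inside the contiguous interval [low, high]."""
    return bisect_right(index, high) - bisect_left(index, low)


# Query: clerk = 7 AND day BETWEEN 10 AND 19
matching = sum(1 for clerk, day in rows if clerk == 7 and 10 <= day <= 19)

# (clerk, day): equality on the first key part, range on the second.
# The matching entries form one tight interval.
scanned_clerk_date = entries_scanned(idx_clerk_date, (7, 10), (7, 19))

# (day, clerk): range on the first key part. Every clerk within the date range
# lies inside the interval, so the clerk condition cannot narrow the scan.
scanned_date_clerk = entries_scanned(
    idx_date_clerk, (10, float("-inf")), (19, float("inf"))
)

print("matching rows          :", matching)
print("scanned via (clerk,day):", scanned_clerk_date)
print("scanned via (day,clerk):", scanned_date_clerk)
```

With the equality column first, the scanned interval contains exactly the matching rows; with the range column first, the scan covers every clerk in the date range.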
  • 21. Conditions must be in SARGable form
      • Condition must represent a range
      • It must have a form that is recognized by the optimizer
          o_orderDate BETWEEN '1992-06-01' and '1992-06-30' (usable)
          year(o_orderDate)=1992 and month(o_orderdate)=6 (not usable: column is wrapped in a function)
          TO_DAYS(o_orderDATE) between TO_DAYS('1992-06-06') and TO_DAYS('1992-07-06') (not usable: column is wrapped in a function)
          o_clerk='Clerk#000009506' (usable)
          o_clerk LIKE 'Clerk#000009506' (usable)
          o_clerk LIKE '%Clerk#000009506%' (not usable: leading wildcard)
          column IN (1,10,15,21, ...) (usable)
          (col1, col2) IN ( (1,1), (2,2), (3,3), …)
  • 22. New in MySQL-5.6: optimizer_trace ● Lets you see the ranges set optimizer_trace=1; explain select * from orders where o_orderDATE between '1992-06-01' and '1992-07-03' and o_orderdate not in ('1992-01-01', '1992-06-12','1992-07-04'); select * from information_schema.optimizer_trace\G ● Will print a big JSON struct ● Search for range_scan_alternatives.
  • 23. 23 07:48:08 AM New in MySQL-5.6: optimizer_trace ... "range_scan_alternatives": [ { "index": "i_o_orderdate", "ranges": [ "1992-06-01 <= o_orderDATE < 1992-06-12", "1992-06-12 < o_orderDATE <= 1992-07-03" ], "index_dives_for_eq_ranges": true, "rowid_ordered": false, "using_mrr": false, "index_only": false, "rows": 319082, "cost": 382900, "chosen": true }, { "index": "i_o_date_clerk", "ranges": [ "1992-06-01 <= o_orderDATE < 1992-06-12", "1992-06-12 < o_orderDATE <= 1992-07-03" ], "index_dives_for_eq_ranges": true, "rowid_ordered": false, "using_mrr": false, "index_only": false, "rows": 406336, "cost": 487605, "chosen": false, "cause": "cost" } ], ... ● Considered ranges are shown in range_scan_alternatives section ● This is actually original use case of optimizer_trace ● Alas, recent mysql-5.6 displays misleading info about ranges on multi-component keys (will file a bug) ● Still, very useful.
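Since the trace is one large JSON document, the range_scan_alternatives section can be pulled out programmatically rather than by eye. The sketch below assumes the TRACE column has already been fetched as a string; the outer nesting of `trace_text` is a simplified, hypothetical wrapper, and only the two alternatives (trimmed to a few fields) come from the slide.

```python
import json

# Simplified stand-in for the TRACE column; the outer keys are illustrative only.
trace_text = """
{
  "steps": [
    {
      "range_analysis": {
        "range_scan_alternatives": [
          {
            "index": "i_o_orderdate",
            "ranges": [
              "1992-06-01 <= o_orderDATE < 1992-06-12",
              "1992-06-12 < o_orderDATE <= 1992-07-03"
            ],
            "rows": 319082,
            "cost": 382900,
            "chosen": true
          },
          {
            "index": "i_o_date_clerk",
            "ranges": [
              "1992-06-01 <= o_orderDATE < 1992-06-12",
              "1992-06-12 < o_orderDATE <= 1992-07-03"
            ],
            "rows": 406336,
            "cost": 487605,
            "chosen": false,
            "cause": "cost"
          }
        ]
      }
    }
  ]
}
"""


def find_key(node, key):
    """Yield every value stored under `key`, at any depth of the JSON tree."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            else:
                yield from find_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


trace = json.loads(trace_text)
alternatives = [
    alt
    for section in find_key(trace, "range_scan_alternatives")
    for alt in section
]

for alt in alternatives:
    verdict = "chosen" if alt["chosen"] else "rejected ({})".format(alt.get("cause"))
    print(alt["index"], "rows =", alt["rows"], "cost =", alt["cost"], verdict)

chosen_index = next(alt["index"] for alt in alternatives if alt["chosen"])
```

Searching by key name rather than by fixed path keeps the helper independent of the exact nesting, which differs between server versions.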
  • 24. 24 07:48:08 AM Source of #rows estimates for range select * from orders where o_orderDate BETWEEN '1992-06-06' and '1992-07-06' +--+-----------+------+-----+-------------+-------------+-------+----+------+-----------+ |id|select_type|table |type |possible_keys|key |key_len|ref |rows |Extra | +--+-----------+------+-----+-------------+-------------+-------+----+------+-----------+ |1 |SIMPLE |orders|range|i_o_orderdate|i_o_orderdate|4 |NULL|306322|Using where| +--+-----------+------+-----+-------------+-------------+-------+----+------+-----------+ ? • “records_in_range” estimate • Done by diving into index • Usually is fairly accurate • Not affected by ANALYZE TABLE.
  • 25. Simple selects: conclusions • Efficiency == “#rows_scanned is close to #rows_returned” • Indexes and WHERE conditions reduce #rows scanned • Index estimates are usually accurate • Multi-column indexes – “handle” conditions on multiple columns – Order of columns in the index matters • optimizer_trace lets you view the ranges – But misrepresents ranges over multi-column indexes.
  • 26. Now, will skip some topics One can also speed up simple selects with ● index_merge access method ● index access method ● Index Condition Pushdown We don't have time for these now; check out last year's tutorial.
  • 27. ● Introduction – What is an optimizer problem – How to catch it ● old and new tools ● Single-table selects – brief recap from 2012 ● JOINs – ref access ● index statistics – join condition pushdown – join plan efficiency – query plan vs reality ● Big I/O bound JOINs – Batched Key Access ● Aggregate functions ● ORDER BY ... LIMIT ● GROUP BY ● Subqueries
  • 28. 28 07:48:08 AM A simple join select * from customer, orders where c_custkey=o_custkey • “Customers with their orders”
  • 29. 29 07:48:08 AM Execution: Nested Loops join select * from customer, orders where c_custkey=o_custkey for each customer C { for each order O { if (C.c_custkey == O.o_custkey) produce record(C, O); } } • Complexity: – Scans table customer – For each record in customer, scans table orders • Is this ok?
  • 30. 30 07:48:08 AM Execution: Nested loops join (2) select * from customer, orders where c_custkey=o_custkey for each customer C { for each order O { if (C.c_custkey == O.o_custkey) produce record(C, O); } } • EXPLAIN: +--+-----------+--------+----+-------------+----+-------+----+-------+-----------+ |id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra | +--+-----------+--------+----+-------------+----+-------+----+-------+-----------+ |1 |SIMPLE |customer|ALL |NULL |NULL|NULL |NULL|148749 | | |1 |SIMPLE |orders |ALL |NULL |NULL|NULL |NULL|1493631|Using where| +--+-----------+--------+----+-------------+----+-------+----+-------+-----------+
  • 31. 31 07:48:08 AM Execution: Nested loops join (3) select * from customer, orders where c_custkey=o_custkey for each customer C { for each order O { if (C.c_custkey == O.o_custkey) produce record(C, O); } } • EXPLAIN: +--+-----------+--------+----+-------------+----+-------+----+-------+-----------+ |id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra | +--+-----------+--------+----+-------------+----+-------+----+-------+-----------+ |1 |SIMPLE |customer|ALL |NULL |NULL|NULL |NULL|148749 | | |1 |SIMPLE |orders |ALL |NULL |NULL|NULL |NULL|1493631|Using where| +--+-----------+--------+----+-------------+----+-------+----+-------+-----------+ rows to read from customer rows to read from orders c_custkey=o_custkey
  • 32. Execution: Nested loops join (4) select * from customer, orders where c_custkey=o_custkey +--+-----------+--------+----+-------------+----+-------+----+-------+-----------+ |id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra | +--+-----------+--------+----+-------------+----+-------+----+-------+-----------+ |1 |SIMPLE |customer|ALL |NULL |NULL|NULL |NULL|148749 | | |1 |SIMPLE |orders |ALL |NULL |NULL|NULL |NULL|1493631|Using where| +--+-----------+--------+----+-------------+----+-------+----+-------+-----------+ • Scan a 1,493,631-row table 148,749 times – Consider 1,493,631 * 148,749 row combinations • Is this query inherently complex? – We know each customer has his own orders – size(customer x orders)= size(orders) – Lower bound is 1,493,631 + 148,749 + costs to match customer<->order.
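The pseudocode from the nested loops slides can be run as-is on a tiny data set to see how the number of examined row combinations compares with the size of the result. This is a minimal sketch with made-up rows; only the column names and the two EXPLAIN row estimates come from the slides.

```python
def nested_loop_join(customers, orders):
    """Join without any index: every customer row is compared with every order row."""
    combinations = 0
    result = []
    for c in customers:
        for o in orders:
            combinations += 1
            if c["c_custkey"] == o["o_custkey"]:
                result.append((c, o))
    return result, combinations


# Hypothetical toy data: 4 customers, 12 orders spread evenly among them.
customers = [{"c_custkey": key} for key in range(1, 5)]
orders = [{"o_orderkey": i, "o_custkey": i % 4 + 1} for i in range(12)]

result, combinations = nested_loop_join(customers, orders)
print("rows produced       :", len(result))
print("combinations checked:", combinations)

# The same arithmetic with the row estimates from the EXPLAIN output.
customer_rows, order_rows = 148_749, 1_493_631
print("combinations at slide scale:", customer_rows * order_rows)
print("lower bound (read each row once):", customer_rows + order_rows)
```

The join produces one row per order, yet the loop examines customers times orders combinations, which is the gap the slide points at.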
  • 33. 33 07:48:08 AM Using index for join: ref access alter table orders add index i_o_custkey(o_custkey) select * from customer, orders where c_custkey=o_custkey
  • 34. 34 07:48:08 AM ref access - analysis +--+-----------+--------+----+-------------+-----------+-------+------------------+------+-----+ |id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra| +--+-----------+--------+----+-------------+-----------+-------+------------------+------+-----+ |1 |SIMPLE |customer|ALL |PRIMARY |NULL |NULL |NULL |148749| | |1 |SIMPLE |orders |ref |i_o_custkey |i_o_custkey|5 |customer.c_custkey|7 | | +--+-----------+--------+----+-------------+-----------+-------+------------------+------+-----+ select * from customer, orders where c_custkey=o_custkey ● One ref lookup scans 7 rows. ● In total: 7 * 148,749=1,041,243 rows – `orders` has 1.4M rows – no redundant reads from `orders` ● The whole query plan – Reads all customers – Reads 1M orders (of 1.4M) ● Efficient!
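The effect of ref access can be sketched by replacing the inner scan with an index lookup. Here a Python dict stands in for the `i_o_custkey` index; that is an illustrative assumption (the real index is a B-tree), and the rows are made up.

```python
from collections import defaultdict


def build_index(rows, column):
    """Group rows by the value of `column`, as a stand-in for a secondary index."""
    index = defaultdict(list)
    for row in rows:
        index[row[column]].append(row)
    return index


def ref_join(customers, orders_by_custkey):
    """For each customer, read only the orders whose o_custkey matches."""
    rows_read = 0
    result = []
    for c in customers:
        rows_read += 1  # the customer row itself
        matches = orders_by_custkey.get(c["c_custkey"], [])
        rows_read += len(matches)  # one ref lookup reads just the matching orders
        result.extend((c, o) for o in matches)
    return result, rows_read


# Hypothetical toy data: 5 customers (one without orders), 12 orders.
customers = [{"c_custkey": key} for key in range(1, 6)]
orders = [{"o_orderkey": i, "o_custkey": i % 4 + 1} for i in range(12)]

orders_by_custkey = build_index(orders, "o_custkey")
result, rows_read = ref_join(customers, orders_by_custkey)

print("rows produced:", len(result))
print("rows read    :", rows_read)
```

Rows read now equals customers plus matching orders, so no order row is fetched more than once.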
  • 35. 35 07:48:08 AM Conditions that can be used for ref access ● Can use equalities – tbl.key=other_table.col – tbl.key=const – tbl.key IS NULL ● For multipart keys, will use largest prefix – keypart1=... AND keypart2= … AND keypartK=... .
  • 36. 36 07:48:08 AM Conditions that can't be used for ref access ● Doesn't work for non-equalities t1.key BETWEEN t2.col1 AND t2.col2 ● Doesn't work for OR-ed equalities t1.key=t2.col1 OR t1.key=t2.col2 – Except for ref_or_null t1.key=... OR t1.key IS NULL ● Doesn't “combine” ref and range access – t.keypart1 BETWEEN c1 AND c2 AND t.keypart2=t2.col – t.keypart2 BETWEEN c1 AND c2 AND t.keypart1=t2.col .
  • 37. Is ref always efficient? ● Efficient, if column has many different values (good) – Best case – unique index (eq_ref) ● A few different values – not useful (bad) ● Skewed distribution: depends on which part the join touches (depends)
  • 38. 38 07:48:08 AM ref access estimates - index statistics • How many rows will match tbl.key_column = $value for an arbitrary $value? • Index statistics show keys from orders where key_name='i_o_custkey' *************************** 1. row *************** Table: orders Non_unique: 1 Key_name: i_o_custkey Seq_in_index: 1 Column_name: o_custkey Collation: A Cardinality: 214462 Sub_part: NULL Packed: NULL Null: YES Index_type: BTREE show table status like 'orders' *************************** 1. row **** Name: orders Engine: InnoDB Version: 10 Row_format: Compact Rows: 1495152 Avg_row_length: 133 Data_length: 199966720 Max_data_length: 0 Index_length: 122421248 Data_free: 6291456 ... average = Rows /Cardinality = 1495152 / 214462 = 6.97.
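The estimate is a plain table-wide average, which is also why it can mislead on skewed data. The sketch below reproduces the arithmetic with the numbers from the slide and then applies the same formula to a made-up skewed column; the function name and the skewed data are illustrative only.

```python
from collections import Counter


def avg_rows_per_key(table_rows, cardinality):
    """Average number of rows per distinct key value, as used for ref estimates."""
    return table_rows / cardinality


# Numbers from SHOW TABLE STATUS (Rows) and SHOW KEYS (Cardinality) on the slide.
estimate = avg_rows_per_key(1495152, 214462)
print(round(estimate, 2))  # 6.97

# Hypothetical skewed column: one value owns 97 of 100 rows.
skewed = ["A"] * 97 + ["B", "C", "D"]
actual = Counter(skewed)
skewed_estimate = avg_rows_per_key(len(skewed), len(actual))

print("estimate for any value:", skewed_estimate)
print("actual rows for 'A'   :", actual["A"])
print("actual rows for 'B'   :", actual["B"])
```

For the skewed column the single average is far off for both the frequent value and the rare ones, matching the earlier point that efficiency depends on which part of the distribution the join touches.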
  • 39. 39 07:48:08 AM ref access – conclusions ● Based on t.key=... equality conditions ● Can make joins very efficient ● Relies on index statistics for estimates.
  • 40. 40 07:48:08 AM Optimizer statistics ● MySQL/Percona Server – Index statistics – Persistent/transient InnoDB stats ● MariaDB – Index statistics, persistent/transient ● Same as Percona Server (via XtraDB) – Persistent, engine-independent, index-independent statistics.
  • 41. Index statistics ● Cardinality makes it possible to calculate a table-wide average #rows-per-key-prefix ● It is a statistical value (inexact) ● Exact collection procedure depends on the storage engine – InnoDB – random sampling – MyISAM – index scan – Engine-independent – index scan.
  • 42. 42 07:48:08 AM Index statistics in MySQL 5.6 ● Sample [8] random index leaf pages ● Table statistics (stored) – rows - estimated number of rows in a table – Other stats not used by optimizer ● Index statistics (stored) – fields - #fields in the index – rows_per_key - rows per 1 key value, per prefix fields ([1 column value], [2 columns value], [3 columns value], …) – Other stats not used by optimizer.
  • 43. Index statistics updates ● Statistics updated when: – ANALYZE TABLE tbl_name [, tbl_name] … – SHOW TABLE STATUS, SHOW INDEX – Access to INFORMATION_SCHEMA.[TABLES|STATISTICS] – A table is opened for the first time (after server restart) – A table has changed >10% – When InnoDB Monitor is turned ON.
  • 44. Displaying optimizer statistics ● MySQL 5.5, MariaDB 5.3, and older – Issue SQL statements to count rows/keys – Indirectly, look at EXPLAIN for simple queries ● MariaDB 5.5, Percona Server 5.5 (using XtraDB) – information_schema.[innodb_index_stats, innodb_table_stats] – Read-only, always visible ● MySQL 5.6 – mysql.[innodb_index_stats, innodb_table_stats] – User updatable – Only available if innodb_analyze_is_persistent=ON ● MariaDB 10.0 – Persistent updatable tables mysql.[index_stats, column_stats, table_stats] – User updatable – + current XtraDB mechanisms.
  • 45. 45 07:48:08 AM Plan [in]stability ● Statistics may vary a lot (orders) MariaDB [dbt3]> select * from information_schema.innodb_index_stats; +------------+-----------------+--------------+ +---------------+ | table_name | index_name | rows_per_key | | rows_per_key | error (actual) +------------+-----------------+--------------+ +---------------+ | partsupp | PRIMARY | 3, 1 | | 4, 1 | 25% | partsupp | i_ps_partkey | 3, 0 | => | 4, 1 | 25% (4) | partsupp | i_ps_suppkey | 64, 0 | | 91, 1 | 30% (80) | orders | i_o_orderdate | 9597, 1 | | 1660956, 0 | 99% (6234) | orders | i_o_custkey | 15, 1 | | 15, 0 | 0% (15) | lineitem | i_l_receiptdate | 7425, 1, 1 | | 6665850, 1, 1 | 99.9% (23477) +------------+-----------------+--------------+ +---------------+ MariaDB [dbt3]> select * from information_schema.innodb_table_stats; +-----------------+----------+ +----------+ | table_name | rows | | rows | +-----------------+----------+ +----------+ | partsupp | 6524766 | | 9101065 | 28% (8000000) | orders | 15039855 | ==> | 14948612 | 0.6% (15000000) | lineitem | 60062904 | | 59992655 | 0.1% (59986052) +-----------------+----------+ +----------+ .
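The slide does not say how its error column is computed. One reading that reproduces several of the printed percentages is the relative difference between the two samples, measured against the larger one. The sketch below works under that assumption; the helper name is hypothetical and the input pairs are copied from the tables above.

```python
def relative_error(sample_a, sample_b):
    """Difference between two statistics samples, relative to the larger one."""
    larger = max(sample_a, sample_b)
    if larger == 0:
        return 0.0
    return abs(sample_a - sample_b) / larger


# (name, first sample, second sample) copied from the slide's tables.
samples = [
    ("partsupp.PRIMARY rows_per_key", 3, 4),
    ("partsupp.i_ps_suppkey rows_per_key", 64, 91),
    ("orders.i_o_orderdate rows_per_key", 9597, 1660956),
    ("orders.i_o_custkey rows_per_key", 15, 15),
    ("partsupp rows", 6524766, 9101065),
]

for name, first, second in samples:
    print(f"{name:36s} {100 * relative_error(first, second):6.1f}%")
```

An index whose rows_per_key swings from 9597 to 1660956 between two samplings can flip the optimizer between very different plans, which is the instability the slide warns about.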
  • 46. Controlling statistics (MySQL 5.6) ● Persistent and user-updatable InnoDB statistics – innodb_analyze_is_persistent = ON (named innodb_stats_persistent in MySQL 5.6.6 and later), – updated manually by ANALYZE TABLE or – automatically by innodb_stats_auto_recalc = ON ● Control the precision of sampling [default 8] – innodb_stats_persistent_sample_pages, – innodb_stats_transient_sample_pages ● No new statistics compared to older versions.
  • 47. 47 07:48:08 AM Controlling statistics (MariaDB 10.0) Current XtraDB index statistics + ● Engine-independent, persistent, user-updateable statistics ● Precise ● Additional statistics per column (even when there is no index): – min_value, max_value: minimum/maximum value per column – nulls_ratio: fraction of null values in a column – avg_length: average size of values in a column – avg_frequency: average number of rows with the same value.
  • 48. 48 07:48:08 AM Join condition pushdown
  • 49. 49 07:48:08 AM Join condition pushdown select * from customer, orders where c_custkey=o_custkey and c_acctbal < -500 and o_orderpriority='1-URGENT'; +--+-----------+--------+----+-------------+-----------+-------+------------------+------+-----------+ |id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra | +--+-----------+--------+----+-------------+-----------+-------+------------------+------+-----------+ |1 |SIMPLE |customer|ALL |PRIMARY |NULL |NULL |NULL |150081|Using where| |1 |SIMPLE |orders |ref |i_o_custkey |i_o_custkey|5 |customer.c_custkey|7 |Using where| +--+-----------+--------+----+-------------+-----------+-------+------------------+------+-----------+.
  • 52. 52 07:48:08 AM Join condition pushdown select * from customer, orders where c_custkey=o_custkey and c_acctbal < -500 and o_orderpriority='1-URGENT'; +--+-----------+--------+----+-------------+-----------+-------+------------------+------+-----------+ |id|select_type|table |type|possible_keys|key |key_len|ref |rows |Extra | +--+-----------+--------+----+-------------+-----------+-------+------------------+------+-----------+ |1 |SIMPLE |customer|ALL |PRIMARY |NULL |NULL |NULL |150081|Using where| |1 |SIMPLE |orders |ref |i_o_custkey |i_o_custkey|5 |customer.c_custkey|7 |Using where| +--+-----------+--------+----+-------------+-----------+-------+------------------+------+-----------+ ● Conjunctive (ANDed) conditions are split into parts ● Each part is attached as early as possible – Either as “Using where” – Or as table access method.
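The "attach as early as possible" rule can be sketched as follows: each ANDed part is attached to the first table in the join order at which all tables it mentions have been read. This is an illustrative model, not the server's code; the tables referenced by each condition are written out by hand rather than parsed from SQL.

```python
def attach_conditions(join_order, conditions):
    """Attach each conjunct to the earliest table where all its tables are available."""
    attached = {table: [] for table in join_order}
    for text, tables in conditions:
        # The conjunct can be checked once the last of its tables has been read.
        position = max(join_order.index(table) for table in tables)
        attached[join_order[position]].append(text)
    return attached


# The WHERE clause from the slide, split on AND.
conditions = [
    ("c_custkey = o_custkey", {"customer", "orders"}),
    ("c_acctbal < -500", {"customer"}),
    ("o_orderpriority = '1-URGENT'", {"orders"}),
]

plan_a = attach_conditions(["customer", "orders"], conditions)
plan_b = attach_conditions(["orders", "customer"], conditions)

for name, plan in (("customer, orders", plan_a), ("orders, customer", plan_b)):
    print("join order:", name)
    for table, parts in plan.items():
        print("   ", table, "<-", " AND ".join(parts))
```

Single-table conditions always land on their own table, while the join condition moves to whichever table comes second, where it can serve as the access method (ref or eq_ref) instead of a "Using where" check.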
  • 53. 53 07:48:08 AM Observing join condition pushdown EXPLAIN: { "query_block": { "select_id": 1, "nested_loop": [ { "table": { "table_name": "orders", "access_type": "ALL", "possible_keys": [ "i_o_custkey" ], "rows": 1499715, "filtered": 100, "attached_condition": "((`dbt3sf1`.`orders`.`o_orderpriority` = '1-URGENT') and (`dbt3sf1`.`orders`.`o_custkey` is not null))" } }, { "table": { "table_name": "customer", "access_type": "eq_ref", "possible_keys": [ "PRIMARY" ], "key": "PRIMARY", "used_key_parts": [ "c_custkey" ], "key_length": "4", "ref": [ "dbt3sf1.orders.o_custkey" ], "rows": 1, "filtered": 100, "attached_condition": "(`dbt3sf1`.`customer`.`c_acctbal` < <cache>(-(500)))" } ● Before mysql-5.6: EXPLAIN shows only “Using where” – The condition itself only visible in debug trace ● Starting from 5.6: EXPLAIN FORMAT=JSON shows attached conditions.
  • 55. Reasoning about join plan efficiency
    select * from customer, orders
    where c_custkey=o_custkey and c_acctbal < -500 and o_orderpriority='1-URGENT';
    First table, “customer”
    ● type=ALL, 150 K rows
    ● select count(*) from customer where c_acctbal < -500 gives 6804
    ● alter table customer add index (c_acctbal)
    Before adding the index:
    +--+-----------+--------+----+-------------+-----------+-------+------------------+------+-----------+
    |id|select_type|table   |type|possible_keys|key        |key_len|ref               |rows  |Extra      |
    +--+-----------+--------+----+-------------+-----------+-------+------------------+------+-----------+
    |1 |SIMPLE     |customer|ALL |PRIMARY      |NULL       |NULL   |NULL              |150081|Using where|
    |1 |SIMPLE     |orders  |ref |i_o_custkey  |i_o_custkey|5      |customer.c_custkey|7     |Using where|
    +--+-----------+--------+----+-------------+-----------+-------+------------------+------+-----------+
    After adding the index:
    +--+-----------+--------+-----+-------------+-----------+-------+------------------+----+---------------------+
    |id|select_type|table   |type |possible_keys|key        |key_len|ref               |rows|Extra                |
    +--+-----------+--------+-----+-------------+-----------+-------+------------------+----+---------------------+
    |1 |SIMPLE     |customer|range|PRIMARY,c_...|c_acctbal  |9      |NULL              |6802|Using index condition|
    |1 |SIMPLE     |orders  |ref  |i_o_custkey  |i_o_custkey|5      |customer.c_custkey|7   |Using where          |
    +--+-----------+--------+-----+-------------+-----------+-------+------------------+----+---------------------+
    Now, access to 'customer' is efficient.
  • 56. Reasoning about join plan efficiency
    select * from customer, orders
    where c_custkey=o_custkey and c_acctbal < -500 and o_orderpriority='1-URGENT';
    Second table, “orders”
    ● Attached condition: c_custkey=o_custkey and o_orderpriority='1-URGENT'
    ● ref access uses only c_custkey=o_custkey
    ● What about o_orderpriority='1-URGENT'?
    +--+-----------+--------+-----+-------------+-----------+-------+------------------+----+---------------------+
    |id|select_type|table   |type |possible_keys|key        |key_len|ref               |rows|Extra                |
    +--+-----------+--------+-----+-------------+-----------+-------+------------------+----+---------------------+
    |1 |SIMPLE     |customer|range|PRIMARY,c_...|c_acctbal  |9      |NULL              |6802|Using index condition|
    |1 |SIMPLE     |orders  |ref  |i_o_custkey  |i_o_custkey|5      |customer.c_custkey|7   |Using where          |
    +--+-----------+--------+-----+-------------+-----------+-------+------------------+----+---------------------+
  • 57. Selectivity of o_orderpriority='1-URGENT'
    ● select count(*) from orders: 1.5M rows
    ● select count(*) from orders where o_orderpriority='1-URGENT': 300K rows
    ● 300K / 1.5M = 0.2
  • 58. Reasoning about join plan efficiency
    select * from customer, orders
    where c_custkey=o_custkey and c_acctbal < -500 and o_orderpriority='1-URGENT';
    Second table, “orders”
    ● Attached condition: c_custkey=o_custkey and o_orderpriority='1-URGENT'
    ● ref access uses only c_custkey=o_custkey
    ● What about o_orderpriority='1-URGENT'? Selectivity = 0.2
      – If we add an index, each lookup can examine 7*0.2=1.4 rows instead of 7, 6802 times:
        alter table orders add index (o_custkey, o_orderpriority)
        or
        alter table orders add index (o_orderpriority, o_custkey)
    +--+-----------+--------+-----+-------------+-----------+-------+------------------+----+---------------------+
    |id|select_type|table   |type |possible_keys|key        |key_len|ref               |rows|Extra                |
    +--+-----------+--------+-----+-------------+-----------+-------+------------------+----+---------------------+
    |1 |SIMPLE     |customer|range|PRIMARY,c_...|c_acctbal  |9      |NULL              |6802|Using index condition|
    |1 |SIMPLE     |orders  |ref  |i_o_custkey  |i_o_custkey|5      |customer.c_custkey|7   |Using where          |
    +--+-----------+--------+-----+-------------+-----------+-------+------------------+----+---------------------+
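The arithmetic behind this estimate, as a small Python sketch. The numbers are the ones quoted on the slides; the variable names are made up here, and the result is a back-of-the-envelope estimate rather than anything the server reports.

```python
total_orders = 1_500_000       # select count(*) from orders
urgent_orders = 300_000        # ... where o_orderpriority='1-URGENT'
rows_per_ref_lookup = 7        # "rows" estimate for ref access on i_o_custkey
outer_rows = 6802              # rows read from customer via range access

# Fraction of orders rows that pass o_orderpriority='1-URGENT'.
selectivity = urgent_orders / total_orders

# Rows examined in orders with the current single-column index.
rows_now = outer_rows * rows_per_ref_lookup

# Rows examined if the index also covers o_orderpriority.
rows_with_index = rows_now * selectivity

print(f"selectivity            = {selectivity:.2f}")
print(f"rows per lookup        = {rows_per_ref_lookup * selectivity:.1f}")
print(f"orders rows, today     = {rows_now}")
print(f"orders rows, new index = {rows_with_index:.0f}")
```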
  • 59. Reasoning about join plan efficiency - summary
    Basic* approach to evaluation of join plan efficiency:
    for each table $T in the join order {
      Look at conditions attached to table $T
        (condition must use table $T, may also use previous tables)
      Does the access method used with $T make good use of the attached conditions?
    }
    +--+-----------+--------+-----+-------------+-----------+-------+------------------+----+---------------------+
    |id|select_type|table   |type |possible_keys|key        |key_len|ref               |rows|Extra                |
    +--+-----------+--------+-----+-------------+-----------+-------+------------------+----+---------------------+
    |1 |SIMPLE     |customer|range|PRIMARY,c_...|c_acctbal  |9      |NULL              |6802|Using index condition|
    |1 |SIMPLE     |orders  |ref  |i_o_custkey  |i_o_custkey|5      |customer.c_custkey|7   |Using where          |
    +--+-----------+--------+-----+-------------+-----------+-------+------------------+----+---------------------+
    * some other details may also affect join performance
  • 61. Attached conditions
    ● Ideally, should be used for table access
    ● Not all conditions can be used [at the same time]
      – Unused ones are still useful
      – They reduce the number of scans for subsequent tables
    select * from customer, orders
    where c_custkey=o_custkey and c_acctbal < -500 and o_orderpriority='1-URGENT';
    +--+-----------+--------+----+-------------+-----------+-------+------------------+------+-----------+
    |id|select_type|table   |type|possible_keys|key        |key_len|ref               |rows  |Extra      |
    +--+-----------+--------+----+-------------+-----------+-------+------------------+------+-----------+
    |1 |SIMPLE     |customer|ALL |PRIMARY      |NULL       |NULL   |NULL              |150081|Using where|
    |1 |SIMPLE     |orders  |ref |i_o_custkey  |i_o_custkey|5      |customer.c_custkey|7     |Using where|
    +--+-----------+--------+----+-------------+-----------+-------+------------------+------+-----------+
  • 62. Informing optimizer about attached conditions
    Currently: a range access that's too expensive to use
    explain extended select * from customer, orders
    where c_custkey=o_custkey and c_acctbal > 8000 and o_orderpriority='1-URGENT';
    +--+-----------+--------+----+-----------------+-----------+-------+------------------+------+--------+-----------+
    |id|select_type|table   |type|possible_keys    |key        |key_len|ref               |rows  |filtered|Extra      |
    +--+-----------+--------+----+-----------------+-----------+-------+------------------+------+--------+-----------+
    |1 |SIMPLE     |customer|ALL |PRIMARY,c_acctbal|NULL       |NULL   |NULL              |150081|   36.22|Using where|
    |1 |SIMPLE     |orders  |ref |i_o_custkey      |i_o_custkey|5      |customer.c_custkey|7     |  100.00|Using where|
    +--+-----------+--------+----+-----------------+-----------+-------+------------------+------+--------+-----------+
    ● `orders` will be scanned 150081 * 36.22% = 54359 times
    ● This reduces the cost of join
      – Has an effect when comparing potential join plans
    ● => The index on c_acctbal is not used for table access, but it still helps the optimizer
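A short Python sketch of how rows and filtered combine into the number of lookups for the next table. The function name and the tuple layout are assumptions for illustration; the input values are the two EXPLAIN rows from this slide.

```python
def join_fanout(plan):
    """Return, per table, the estimated number of times it is accessed.

    plan: list of (table, rows_per_access, filtered_percent) in join order.
    """
    accesses = {}
    incoming = 1.0  # row combinations arriving from the previous tables
    for table, rows, filtered in plan:
        accesses[table] = incoming
        # Only rows that pass the attached condition reach the next table.
        incoming *= rows * filtered / 100.0
    return accesses


plan = [("customer", 150081, 36.22), ("orders", 7, 100.00)]
accesses = join_fanout(plan)
print("lookups into orders:", int(accesses["orders"]))  # 54359
```

Without the filtered estimate the same plan would be costed at 150081 lookups into orders, which is why the unused range estimate matters when join plans are compared.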
  • 63. Attached condition selectivity
    ● Unused indexes provide info about selectivity
      – Works, but very expensive
    ● MariaDB 10.0 has engine-independent statistics
      – Index statistics
      – Non-indexed column statistics
        ● Histograms
      – Further info: Tomorrow, 2:20 pm @ Ballroom D
        Igor Babaev, “Engine-independent persistent statistics with histograms in MariaDB”
  • 64. How to check if the query plan matches reality
  • 65. Check if query plan is realistic
    ● EXPLAIN shows what the optimizer expects. It may be wrong
      – Out-of-date index statistics
      – Non-uniform data distribution
    ● Other DBMS: EXPLAIN ANALYZE
    ● MySQL: no equivalent. Instead, have
      – Handler counters
      – “User statistics” (Percona, MariaDB)
      – PERFORMANCE_SCHEMA
  • 66. Join analysis: example query (Q18, DBT3)
    <reset counters>
    select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
    from customer, orders, lineitem
    where o_totalprice > 500000
      and c_custkey = o_custkey
      and o_orderkey = l_orderkey
    group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
    order by o_totalprice desc, o_orderdate
    LIMIT 10;
    <collect statistics>
  • 67. Join analysis: handler counters (old)
    FLUSH STATUS;
    => RUN QUERY
    SHOW STATUS LIKE "Handler%";
    +----------------------------+-------+
    | Handler_mrr_key_refills    | 0     |
    | Handler_mrr_rowid_refills  | 0     |
    | Handler_read_first         | 0     |
    | Handler_read_key           | 1646  |
    | Handler_read_last          | 0     |
    | Handler_read_next          | 1462  |
    | Handler_read_prev          | 0     |
    | Handler_read_rnd           | 10    |
    | Handler_read_rnd_deleted   | 0     |
    | Handler_read_rnd_next      | 184   |
    | Handler_tmp_update         | 1096  |
    | Handler_tmp_write          | 183   |
    | Handler_update             | 0     |
    | Handler_write              | 0     |
    +----------------------------+-------+
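If resetting the counters is not convenient, one possible variation is to take a snapshot of SHOW STATUS LIKE "Handler%" before and after the query and compare the two. A minimal Python sketch of the comparison follows; the helper name is made up, the "after" values are taken from this slide, and the "before" snapshot is invented purely for illustration.

```python
def counter_delta(before, after):
    """Return the counters whose value changed between two snapshots."""
    delta = {}
    for name, value in after.items():
        change = value - before.get(name, 0)
        if change:
            delta[name] = change
    return delta


# 'after' uses values from the slide; 'before' is a hypothetical earlier snapshot.
before = {"Handler_read_key": 100, "Handler_read_next": 40, "Handler_read_rnd_next": 184}
after = {"Handler_read_key": 1646, "Handler_read_next": 1462, "Handler_read_rnd_next": 184}

delta = counter_delta(before, after)
for name, change in sorted(delta.items()):
    print(f"{name:<24} +{change}")
```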
  • 68. Join analysis: USERSTAT by Facebook
    MariaDB, Percona Server
    SET GLOBAL USERSTAT=1;
    FLUSH TABLE_STATISTICS;
    FLUSH INDEX_STATISTICS;
    => RUN QUERY
    SHOW TABLE_STATISTICS;
    +--------------+------------+-----------+--------------+-------------------------+
    | Table_schema | Table_name | Rows_read | Rows_changed | Rows_changed_x_#indexes |
    +--------------+------------+-----------+--------------+-------------------------+
    | dbt3         | orders     |       183 |            0 |                       0 |
    | dbt3         | lineitem   |      1279 |            0 |                       0 |
    | dbt3         | customer   |       183 |            0 |                       0 |
    +--------------+------------+-----------+--------------+-------------------------+
    SHOW INDEX_STATISTICS;
    +--------------+------------+-----------------------+-----------+
    | Table_schema | Table_name | Index_name            | Rows_read |
    +--------------+------------+-----------------------+-----------+
    | dbt3         | customer   | PRIMARY               |       183 |
    | dbt3         | lineitem   | i_l_orderkey_quantity |      1279 |
    | dbt3         | orders     | i_o_totalprice        |       183 |
    +--------------+------------+-----------------------+-----------+
  • 69. Join analysis: PERFORMANCE SCHEMA [MySQL 5.6, MariaDB 10.0]
    ● summary tables with read/write statistics
      – table_io_waits_summary_by_table
      – table_io_waits_summary_by_index_usage
    ● Superset of the userstat tables
    ● More overhead
    ● Not possible to associate statistics with a query
      => truncate stats tables before running a query
    ● Possible bug
      – performance schema not ignored
      – Disable by UPDATE setup_consumers SET ENABLED = 'NO' where name = 'global_instrumentation';
  • 70. Analyze joins via PERFORMANCE SCHEMA: SHOW TABLE_STATISTICS analogue
    select object_schema, object_name, count_read, count_write,
           sum_timer_read, sum_timer_write, ...
    from table_io_waits_summary_by_table
    where object_schema = 'dbt3' and count_star > 0;
    +---------------+-------------+------------+-------------+----------------+-----------------+
    | object_schema | object_name | count_read | count_write | sum_timer_read | sum_timer_write | ...
    +---------------+-------------+------------+-------------+----------------+-----------------+
    | dbt3          | customer    |        183 |           0 |     8326528406 |               0 |
    | dbt3          | lineitem    |       1462 |           0 |    12117332778 |               0 |
    | dbt3          | orders      |        184 |           0 |     7946312812 |               0 |
    +---------------+-------------+------------+-------------+----------------+-----------------+
  • 71. Analyze joins via PERFORMANCE SCHEMA: SHOW INDEX_STATISTICS analogue
    select object_schema, object_name, index_name, count_read,
           sum_timer_read, sum_timer_write, ...
    from table_io_waits_summary_by_index_usage
    where object_schema = 'dbt3' and count_star > 0 and index_name is not null;
    +---------------+-------------+-----------------------+------------+----------------+-----------------+
    | object_schema | object_name | index_name            | count_read | sum_timer_read | sum_timer_write | ...
    +---------------+-------------+-----------------------+------------+----------------+-----------------+
    | dbt3          | customer    | PRIMARY               |        183 |     8326528406 |               0 |
    | dbt3          | lineitem    | i_l_orderkey_quantity |       1462 |    12117332778 |               0 |
    | dbt3          | orders      | i_o_totalprice        |        184 |     7946312812 |               0 |
    +---------------+-------------+-----------------------+------------+----------------+-----------------+
  • 72. Agenda
    ● Introduction
      – What is an optimizer problem
      – How to catch it
        ● old and new tools
    ● Single-table selects
      – brief recap from 2012
    ● JOINs
      – ref access
        ● index statistics
      – join condition pushdown
      – join plan efficiency
      – query plan vs reality
    ● Big I/O bound JOINs
      – Batched Key Access
    ● Aggregate functions
    ● ORDER BY ... LIMIT
    ● GROUP BY
    ● Subqueries
  • 73. Batched joins
    ● Optimization for analytical queries
    ● Analytic queries shovel through lots of data
      – e.g. “average size of order in the last month”
      – or “pairs of goods purchased together”
    ● Indexes, etc. won't help when you really need to look at all data
    ● More data means greater chance of being I/O-bound
    ● Solution: batched joins
  • 74. Batched Key Access Idea
  • 80. Batched Key Access Idea
    ● Non-BKA join hits data at random
    ● Caches are not used efficiently
    ● Prefetching is not useful
  • 81. Batched Key Access Idea
    ● BKA implementation accesses data in order
    ● Takes advantage of caches and prefetching
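A toy Python model of why batching helps, not the server's implementation. It assumes rows are stored in key order on fixed-size pages and uses the total distance between consecutively visited pages as a stand-in for random I/O; the page size, key range, and function names are all invented for the illustration.

```python
import random


def seek_distance(keys, rows_per_page=100):
    """Sum of page-number jumps between consecutive lookups (a proxy for random I/O)."""
    pages = [key // rows_per_page for key in keys]
    return sum(abs(cur - prev) for prev, cur in zip(pages, pages[1:]))


def sorted_batches(keys, buffer_size):
    """Yield keys in buffer-sized batches, each batch sorted before it is looked up."""
    for start in range(0, len(keys), buffer_size):
        yield from sorted(keys[start:start + buffer_size])


random.seed(0)
lookup_keys = [random.randrange(1_000_000) for _ in range(10_000)]

unsorted = seek_distance(lookup_keys)                                  # regular join
small_buffer = seek_distance(list(sorted_batches(lookup_keys, 100)))   # BKA, small buffer
large_buffer = seek_distance(list(sorted_batches(lookup_keys, 10_000)))  # BKA, large buffer

print("regular join      :", unsorted)
print("BKA, small buffer :", small_buffer)
print("BKA, large buffer :", large_buffer)
```

In this model a larger buffer means fewer, longer sorted runs, so the jump distance keeps dropping as the buffer grows.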
  • 82. Batched Key Access effect
    set join_cache_level=6;
    select max(l_extendedprice) from orders, lineitem
    where l_orderkey=o_orderkey and o_orderdate between $DATE1 and $DATE2
    The benchmark was run with
    ● Various BKA buffer sizes
    ● Various sizes of the $DATE1...$DATE2 range
  • 83. Batched Key Access Performance
    [Chart: BKA join performance depending on buffer size. X axis: buffer size, bytes. Y axis: query time, sec. Series: query_size=1, 2 and 3, each as regular and BKA. Annotations: performance without BKA; performance with BKA, given sufficient buffer size.]
    ● 4x-10x speedup
    ● The more the data, the bigger the speedup
    ● Buffer size setting is very important
  • 84. Batched Key Access settings
    ● Needs to be turned on
      set join_buffer_size= 32*1024*1024;
      set join_cache_level=6; -- MariaDB
      set optimizer_switch='batched_key_access=on'; -- MySQL 5.6
      set optimizer_switch='mrr=on';
      set optimizer_switch='mrr_sort_keys=on'; -- MariaDB only
    ● Further join_buffer_size tuning: keep increasing join_buffer_size while watching
      – Query performance
      – Handler_mrr_init counter
      until either saturates
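One way to read the tuning advice is as a simple doubling loop: grow the buffer until the measured query time stops improving. The Python sketch below assumes a caller-supplied measure(size) function that runs the query with the given join_buffer_size and returns the time in seconds; that function, the 5% threshold, and the toy timing model are all hypothetical stand-ins, not measurements.

```python
MB = 1024 * 1024


def tune_buffer(measure, start, limit, min_gain=0.05):
    """Double the buffer size until the relative improvement drops below min_gain."""
    size = start
    best = measure(size)
    while size * 2 <= limit:
        candidate = measure(size * 2)
        if best - candidate < min_gain * best:
            break  # performance has saturated
        size, best = size * 2, candidate
    return size


def toy_query_time(size):
    """Made-up model: time halves with each doubling until it bottoms out at 20 s."""
    return max(20.0, 640.0 / (size / MB))


chosen = tune_buffer(toy_query_time, start=1 * MB, limit=1024 * MB)
print("chosen join_buffer_size:", chosen // MB, "MB")
```

The same loop could watch Handler_mrr_init instead of elapsed time by swapping the measure function.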
  • 85. Batched Key Access - conclusions
    ● Targeted at big joins
    ● Needs to be enabled manually
    ● @@join_buffer_size is the most important setting
    ● MariaDB's implementation is a superset of MySQL's
  • 87. ORDER BY, GROUP BY, aggregates
  • 88. Aggregate functions, no GROUP BY
    ● COUNT, SUM, AVG, etc. need to examine all rows
      select SUM(column) from tbl
      needs to examine the whole tbl.
    ● MIN and MAX can use index for lookup
      +--+-----------+-----+----+-------------+----+-------+----+----+----------------------------+
      |id|select_type|table|type|possible_keys|key |key_len|ref |rows|Extra                       |
      +--+-----------+-----+----+-------------+----+-------+----+----+----------------------------+
      |1 |SIMPLE     |NULL |NULL|NULL         |NULL|NULL   |NULL|NULL|Select tables optimized away|
      +--+-----------+-----+----+-------------+----+-------+----+----+----------------------------+
      With index (o_orderdate):
        select max(o_orderdate) from orders
        select min(o_orderdate) from orders where o_orderdate > '1995-05-01'
      With index (o_orderpriority, o_orderdate):
        select max(o_orderdate) from orders where o_orderpriority='1-URGENT'
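Why MIN and MAX need only a lookup can be illustrated with a sorted Python list standing in for the index on o_orderdate. This is an analogy rather than the server's code, and the dates and function names are invented sample values.

```python
from bisect import bisect_right

# A sorted list stands in for the B-tree index on o_orderdate (sample data).
index_o_orderdate = sorted(
    ["1994-11-02", "1995-03-17", "1995-05-01", "1995-06-09", "1996-01-23"]
)


def max_from_index(index):
    """select max(o_orderdate): read the last index entry."""
    return index[-1] if index else None


def min_greater_than(index, bound):
    """select min(o_orderdate) ... where o_orderdate > bound: one binary search."""
    pos = bisect_right(index, bound)
    return index[pos] if pos < len(index) else None


print(max_from_index(index_o_orderdate))
print(min_greater_than(index_o_orderdate, "1995-05-01"))
```

Neither function looks at more than a handful of entries, no matter how long the list is, which is the situation EXPLAIN reports as “Select tables optimized away”.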
  • 89. ORDER BY … LIMIT
    Three algorithms
    ● Use an index to read in order
    ● Read one table, sort, join - “Using filesort”
    ● Execute join into temporary table and then sort - “Using temporary; Using filesort”
  • 90. Using index to read data in order
    ● No special indication in EXPLAIN output
    ● LIMIT n: as soon as we read n records, we can stop!
  • 91. A problem with LIMIT N optimization
    `orders` has 1.5 M rows
    explain select * from orders order by o_orderdate desc limit 10;
    +--+-----------+------+-----+-------------+-------------+-------+----+----+-----+
    |id|select_type|table |type |possible_keys|key          |key_len|ref |rows|Extra|
    +--+-----------+------+-----+-------------+-------------+-------+----+----+-----+
    |1 |SIMPLE     |orders|index|NULL         |i_o_orderdate|4      |NULL|10  |     |
    +--+-----------+------+-----+-------------+-------------+-------+----+----+-----+
    explain select * from orders where o_orderpriority='1-URGENT' order by o_orderdate desc limit 10;
    +--+-----------+------+-----+-------------+-------------+-------+----+----+-----------+
    |id|select_type|table |type |possible_keys|key          |key_len|ref |rows|Extra      |
    +--+-----------+------+-----+-------------+-------------+-------+----+----+-----------+
    |1 |SIMPLE     |orders|index|NULL         |i_o_orderdate|4      |NULL|10  |Using where|
    +--+-----------+------+-----+-------------+-------------+-------+----+----+-----------+
    ● A problem:
      – 1.5M rows, 300K of them 'URGENT'
      – Scanning by date, when will we find 10 'URGENT' rows?
      – No good solution so far
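A rough Python estimate of how far the index scan may have to go before it finds 10 'URGENT' rows. The even-spread case assumes matching rows are distributed uniformly over the dates, and the worst case assumes every non-matching row comes first in scan order; both are assumptions made here, not figures from the slides.

```python
total_rows = 1_500_000      # rows in orders
matching_rows = 300_000     # rows with o_orderpriority='1-URGENT'
limit = 10

# Matching rows spread evenly over the index: one hit every total/matching rows.
expected_scan = limit * total_rows / matching_rows

# Matching rows all at the far end of the scan order.
worst_case_scan = (total_rows - matching_rows) + limit

print("EXPLAIN estimate     :", limit)
print("even spread          :", expected_scan)
print("worst case           :", worst_case_scan)
```

The gap between the EXPLAIN estimate of 10 rows and the worst case is what makes this pattern hard for the optimizer to cost.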
  • 92. Using filesort strategy
    ● Have to read the entire first table
    ● For the remaining tables, can apply LIMIT n
    ● ORDER BY can only use columns of the first table (tbl1)
  • 93. Using temporary; Using filesort
    ● ORDER BY clause can use columns of any table
    ● LIMIT is applied only after executing the entire join and sorting
  • 94. ORDER BY - conclusions
    ● Resolving ORDER BY with index allows very efficient handling for LIMIT
      – Optimization for WHERE unused_condition ORDER BY … LIMIT n is challenging
        ● Use sql_big_result, IGNORE INDEX FOR ORDER BY
    ● Using filesort
      – Needs all ORDER BY columns in the first table
      – Takes advantage of LIMIT when doing join to non-first tables
    ● Using temporary; Using filesort is least efficient
  • 95. GROUP BY strategies
    There are three strategies
    ● Ordered index scan
    ● Loose Index Scan (LooseScan)
    ● Groups table (Using temporary; [Using filesort])
  • 96. Ordered index scan
    ● Groups are enumerated one after another
    ● Can compute aggregates on the fly
    ● Loose index scan is also able to jump to next group
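Computing aggregates on the fly can be sketched in Python with itertools.groupby, which, like an ordered index scan, relies on rows of one group arriving next to each other. The rows and the function name below are invented sample data for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Rows as an ordered index scan would deliver them: already sorted by group key.
rows_in_index_order = [
    ("1-URGENT", 100.0),
    ("1-URGENT", 250.0),
    ("2-HIGH", 80.0),
    ("3-MEDIUM", 40.0),
    ("3-MEDIUM", 60.0),
]


def streaming_group_sum(rows):
    """Emit (group, SUM) as soon as a group ends; only one group is held at a time."""
    for key, group in groupby(rows, key=itemgetter(0)):
        yield key, sum(value for _, value in group)


for key, total in streaming_group_sum(rows_in_index_order):
    print(key, total)
```

If the input were not sorted by the group key, the same key would be emitted more than once, which is why the other strategy needs a temporary groups table.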
  • 97. Execution of GROUP BY with temptable
  • 99. Subquery optimizations
    ● Before MariaDB 5.3/MySQL 5.6 - “don't use subqueries”
    ● Queries that caused most of the pain
      – SELECT … FROM tbl WHERE col IN (SELECT …) - semi-joins
      – SELECT … FROM (SELECT …) - derived tables
    ● MariaDB 5.3 and MySQL 5.6
      – Share a common ancestor, the MySQL 6.0 alpha
      – Huge (100x, 1000x) speedups for painful areas
      – Other kinds of subqueries received a speedup, too
      – MariaDB 5.3/5.5 has a superset of MySQL 5.6's optimizations
        ● 5.6 handles some un-handled edge cases, too
  • 100. Tuning for subqueries
    ● “Before”: one execution strategy
      – No tuning possible
    ● “After”: similar to joins
      – Reasonable execution strategies supported
      – Need indexes
      – Need selective conditions
      – Support batching in most important cases
    ● Should be better 9x% of the time
  • 101. What if it still picks a poor query plan?
    For both MariaDB and MySQL:
    ● Check EXPLAIN [EXTENDED], find a keyword around a subquery table
    ● Google “$subquery_keyword”
    ● Find which optimization it was
    ● set optimizer_switch='$subquery_optimization=off'