SlideShare a Scribd company logo
1 of 61
Download to read offline
Explaining the Postgres Query Optimizer
BRUCE MOMJIAN
January, 2015
The optimizer is the "brain" of the database, interpreting SQL
queries and determining the fastest method of execution. This
talk uses the EXPLAIN command to show how the optimizer
interprets queries and determines optimal execution.
Creative Commons Attribution License http://momjian.us/presentations
1 / 61
PostgreSQL the database…
◮ Open Source Object Relational DBMS since 1996
◮ Distributed under the PostgreSQL License
◮ Similar technical heritage as Oracle, SQL Server & DB2
◮ However, a strong adherence to standards (ANSI-SQL 2008)
◮ Highly extensible and adaptable design
◮ Languages, indexing, data types, etc.
◮ E.g. PostGIS, JSONB, SQL/MED
◮ Extensive use throughout the world for applications and
organizations of all types
◮ Bundled into Red Hat Enterprise Linux, Ubuntu, CentOS
and Amazon Linux
Explaining the Postgres Query Optimizer 2 / 61
PostgreSQL the community…
◮ Independent community led by a Core Team of six
◮ Large, active and vibrant community
◮ www.postgresql.org
◮ Downloads, Mailing lists, Documentation
◮ Sponsors sampler:
◮ Google, Red Hat, VMWare, Skype, Salesforce, HP and
EnterpriseDB
◮ http://www.postgresql.org/community/
Explaining the Postgres Query Optimizer 3 / 61
EnterpriseDB the company…
◮ The worldwide leader of Postgres based products and
services founded in 2004
◮ Customers include 50 of the Fortune 500 and 98 of the
Forbes Global 2000
◮ Enterprise offerings:
◮ PostgreSQL Support, Services and Training
◮ Postgres Plus Advanced Server with Oracle Compatibility
◮ Tools for Monitoring, Replication, HA, Backup & Recovery
Community
◮ Citizenship
◮ Contributor of key features: Materialized Views, JSON, &
more
◮ Nine community members on staff
Explaining the Postgres Query Optimizer 4 / 61
EnterpriseDB the company…
Explaining the Postgres Query Optimizer 5 / 61
Postgres Query Execution
User
Terminal
Code
Database
Server
Application
Queries
Results
PostgreSQL
Libpq
Explaining the Postgres Query Optimizer 6 / 61
Postgres Query Execution
utility
Plan
Optimal Path
Query
Postmaster
Postgres Postgres
Libpq
Main
Generate Plan
Traffic Cop
Generate Paths
Execute Plan
e.g. CREATE TABLE, COPY
SELECT, INSERT, UPDATE, DELETE
Rewrite Query
Parse Statement
Utility
Command
Storage ManagersCatalogUtilities
Access Methods Nodes / Lists
Explaining the Postgres Query Optimizer 7 / 61
Postgres Query Execution
utility
Plan
Optimal Path
Query
Generate Plan
Traffic Cop
Generate Paths
Execute Plan
e.g. CREATE TABLE, COPY
SELECT, INSERT, UPDATE, DELETE
Rewrite Query
Parse Statement
Utility
Command
Explaining the Postgres Query Optimizer 8 / 61
The Optimizer Is the Brain
http://www.wsmanaging.com/
Explaining the Postgres Query Optimizer 9 / 61
What Decisions Does the Optimizer Have to Make?
◮ Scan Method
◮ Join Method
◮ Join Order
Explaining the Postgres Query Optimizer 10 / 61
Which Scan Method?
◮ Sequential Scan
◮ Bitmap Index Scan
◮ Index Scan
Explaining the Postgres Query Optimizer 11 / 61
A Simple Example Using pg_class.relname
SELECT relname
FROM pg_class
ORDER BY 1
LIMIT 8;
relname
-----------------------------------
_pg_foreign_data_wrappers
_pg_foreign_servers
_pg_user_mappings
administrable_role_authorizations
applicable_roles
attributes
check_constraint_routine_usage
check_constraints
(8 rows)
Explaining the Postgres Query Optimizer 12 / 61
Let’s Use Just the First Letter of pg_class.relname
SELECT substring(relname, 1, 1)
FROM pg_class
ORDER BY 1
LIMIT 8;
substring
-----------
_
_
_
a
a
a
c
c
(8 rows)
Explaining the Postgres Query Optimizer 13 / 61
Create a Temporary Table with an Index
CREATE TEMPORARY TABLE sample (letter, junk) AS
SELECT substring(relname, 1, 1), repeat(’x’, 250)
FROM pg_class
ORDER BY random(); -- add rows in random order
SELECT 253
CREATE INDEX i_sample on sample (letter);
CREATE INDEX
All the queries used in this presentation are available at
http://momjian.us/main/writings/pgsql/optimizer.sql.
Explaining the Postgres Query Optimizer 14 / 61
Create an EXPLAIN Function
CREATE OR REPLACE FUNCTION lookup_letter(text) RETURNS SETOF text AS $$
BEGIN
RETURN QUERY EXECUTE ’
EXPLAIN SELECT letter
FROM sample
WHERE letter = ’’’ || $1 || ’’’’;
END
$$ LANGUAGE plpgsql;
CREATE FUNCTION
Explaining the Postgres Query Optimizer 15 / 61
What is the Distribution of the sample Table?
WITH letters (letter, count) AS (
SELECT letter, COUNT(*)
FROM sample
GROUP BY 1
)
SELECT letter, count, (count * 100.0 / (SUM(count) OVER ()))::numeric(4,1) AS "%"
FROM letters
ORDER BY 2 DESC;
Explaining the Postgres Query Optimizer 16 / 61
What is the Distribution of the sample Table?
letter | count | %
--------+-------+------
p | 199 | 78.7
s | 9 | 3.6
c | 8 | 3.2
r | 7 | 2.8
t | 5 | 2.0
v | 4 | 1.6
f | 4 | 1.6
d | 4 | 1.6
u | 3 | 1.2
a | 3 | 1.2
_ | 3 | 1.2
e | 2 | 0.8
i | 1 | 0.4
k | 1 | 0.4
(14 rows)
Explaining the Postgres Query Optimizer 17 / 61
Is the Distribution Important?
EXPLAIN SELECT letter
FROM sample
WHERE letter = ’p’;
QUERY PLAN
------------------------------------------------------------------------
Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=32)
Index Cond: (letter = ’p’::text)
(2 rows)
Explaining the Postgres Query Optimizer 18 / 61
Is the Distribution Important?
EXPLAIN SELECT letter
FROM sample
WHERE letter = ’d’;
QUERY PLAN
------------------------------------------------------------------------
Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=32)
Index Cond: (letter = ’d’::text)
(2 rows)
Explaining the Postgres Query Optimizer 19 / 61
Is the Distribution Important?
EXPLAIN SELECT letter
FROM sample
WHERE letter = ’k’;
QUERY PLAN
------------------------------------------------------------------------
Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=32)
Index Cond: (letter = ’k’::text)
(2 rows)
Explaining the Postgres Query Optimizer 20 / 61
Running ANALYZE Causes
a Sequential Scan for a Common Value
ANALYZE sample;
ANALYZE
EXPLAIN SELECT letter
FROM sample
WHERE letter = ’p’;
QUERY PLAN
---------------------------------------------------------
Seq Scan on sample (cost=0.00..13.16 rows=199 width=2)
Filter: (letter = ’p’::text)
(2 rows)
Autovacuum cannot ANALYZE (or VACUUM) temporary tables because
these tables are only visible to the creating session.
Explaining the Postgres Query Optimizer 21 / 61
Sequential Scan
T
A
D
A
T
A
D
A
T
A
D
A
T
A
D
A
T
A
D
A
T
A
D
A
T
A
D
A
T
A
D
A
T
A
D
8K
Heap
A
A
D
T
A
T
A
D
A
T
A
D
A
Explaining the Postgres Query Optimizer 22 / 61
A Less Common Value Causes a Bitmap Index Scan
EXPLAIN SELECT letter
FROM sample
WHERE letter = ’d’;
QUERY PLAN
-----------------------------------------------------------------------
Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2)
Recheck Cond: (letter = ’d’::text)
-> Bitmap Index Scan on i_sample (cost=0.00..4.28 rows=4 width=0)
Index Cond: (letter = ’d’::text)
(4 rows)
Explaining the Postgres Query Optimizer 23 / 61
Bitmap Index Scan
=&
Combined
’A’ AND ’NS’
1
0
1
0
TableIndex 1
col1 = ’A’
Index 2
1
0
0
col2 = ’NS’
1 0
1
0
0
Index
Explaining the Postgres Query Optimizer 24 / 61
An Even Rarer Value Causes an Index Scan
EXPLAIN SELECT letter
FROM sample
WHERE letter = ’k’;
QUERY PLAN
-----------------------------------------------------------------------
Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2)
Index Cond: (letter = ’k’::text)
(2 rows)
Explaining the Postgres Query Optimizer 25 / 61
Index Scan
A
D
A
T
A
D
A
T
A
D
A
T
A
D
A
T
A
D
A
T
A
D
A
T
A
D
< >=Key
< >=Key
Index
Heap
< >=Key
A
T
A
D
A
T
A
D
A
T
A
D
A
T
A
D
A
T
A
D
A
T
Explaining the Postgres Query Optimizer 26 / 61
Let’s Look at All Values and their Effects
WITH letter (letter, count) AS (
SELECT letter, COUNT(*)
FROM sample
GROUP BY 1
)
SELECT letter AS l, count, lookup_letter(letter)
FROM letter
ORDER BY 2 DESC;
l | count | lookup_letter
---+-------+-----------------------------------------------------------------------
p | 199 | Seq Scan on sample (cost=0.00..13.16 rows=199 width=2)
p | 199 | Filter: (letter = ’p’::text)
s | 9 | Seq Scan on sample (cost=0.00..13.16 rows=9 width=2)
s | 9 | Filter: (letter = ’s’::text)
c | 8 | Seq Scan on sample (cost=0.00..13.16 rows=8 width=2)
c | 8 | Filter: (letter = ’c’::text)
r | 7 | Seq Scan on sample (cost=0.00..13.16 rows=7 width=2)
r | 7 | Filter: (letter = ’r’::text)
…
Explaining the Postgres Query Optimizer 27 / 61
OK, Just the First Lines
WITH letter (letter, count) AS (
SELECT letter, COUNT(*)
FROM sample
GROUP BY 1
)
SELECT letter AS l, count,
(SELECT *
FROM lookup_letter(letter) AS l2
LIMIT 1) AS lookup_letter
FROM letter
ORDER BY 2 DESC;
Explaining the Postgres Query Optimizer 28 / 61
Just the First EXPLAIN Lines
l | count | lookup_letter
---+-------+-----------------------------------------------------------------------
p | 199 | Seq Scan on sample (cost=0.00..13.16 rows=199 width=2)
s | 9 | Seq Scan on sample (cost=0.00..13.16 rows=9 width=2)
c | 8 | Seq Scan on sample (cost=0.00..13.16 rows=8 width=2)
r | 7 | Seq Scan on sample (cost=0.00..13.16 rows=7 width=2)
t | 5 | Bitmap Heap Scan on sample (cost=4.29..12.76 rows=5 width=2)
f | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2)
v | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2)
d | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2)
a | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2)
_ | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2)
u | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2)
e | 2 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2)
i | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2)
k | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2)
(14 rows)
Explaining the Postgres Query Optimizer 29 / 61
We Can Force an Index Scan
SET enable_seqscan = false;
SET enable_bitmapscan = false;
WITH letter (letter, count) AS (
SELECT letter, COUNT(*)
FROM sample
GROUP BY 1
)
SELECT letter AS l, count,
(SELECT *
FROM lookup_letter(letter) AS l2
LIMIT 1) AS lookup_letter
FROM letter
ORDER BY 2 DESC;
Explaining the Postgres Query Optimizer 30 / 61
Notice the High Cost for Common Values
l | count | lookup_letter
---+-------+-----------------------------------------------------------------------
p | 199 | Index Scan using i_sample on sample (cost=0.00..39.33 rows=199 width=
s | 9 | Index Scan using i_sample on sample (cost=0.00..22.14 rows=9 width=2)
c | 8 | Index Scan using i_sample on sample (cost=0.00..19.84 rows=8 width=2)
r | 7 | Index Scan using i_sample on sample (cost=0.00..19.82 rows=7 width=2)
t | 5 | Index Scan using i_sample on sample (cost=0.00..15.21 rows=5 width=2)
d | 4 | Index Scan using i_sample on sample (cost=0.00..15.19 rows=4 width=2)
v | 4 | Index Scan using i_sample on sample (cost=0.00..15.19 rows=4 width=2)
f | 4 | Index Scan using i_sample on sample (cost=0.00..15.19 rows=4 width=2)
_ | 3 | Index Scan using i_sample on sample (cost=0.00..12.88 rows=3 width=2)
a | 3 | Index Scan using i_sample on sample (cost=0.00..12.88 rows=3 width=2)
u | 3 | Index Scan using i_sample on sample (cost=0.00..12.88 rows=3 width=2)
e | 2 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2)
i | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2)
k | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2)
(14 rows)
RESET ALL;
RESET
Explaining the Postgres Query Optimizer 31 / 61
This Was the Optimizer’s Preference
l | count | lookup_letter
---+-------+-----------------------------------------------------------------------
p | 199 | Seq Scan on sample (cost=0.00..13.16 rows=199 width=2)
s | 9 | Seq Scan on sample (cost=0.00..13.16 rows=9 width=2)
c | 8 | Seq Scan on sample (cost=0.00..13.16 rows=8 width=2)
r | 7 | Seq Scan on sample (cost=0.00..13.16 rows=7 width=2)
t | 5 | Bitmap Heap Scan on sample (cost=4.29..12.76 rows=5 width=2)
f | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2)
v | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2)
d | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2)
a | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2)
_ | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2)
u | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2)
e | 2 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2)
i | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2)
k | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2)
(14 rows)
Explaining the Postgres Query Optimizer 32 / 61
Which Join Method?
◮ Nested Loop
◮ With Inner Sequential Scan
◮ With Inner Index Scan
◮ Hash Join
◮ Merge Join
Explaining the Postgres Query Optimizer 33 / 61
What Is in pg_proc.oid?
SELECT oid
FROM pg_proc
ORDER BY 1
LIMIT 8;
oid
-----
31
33
34
35
38
39
40
41
(8 rows)
Explaining the Postgres Query Optimizer 34 / 61
Create Temporary Tables
from pg_proc and pg_class
CREATE TEMPORARY TABLE sample1 (id, junk) AS
SELECT oid, repeat(’x’, 250)
FROM pg_proc
ORDER BY random(); -- add rows in random order
SELECT 2256
CREATE TEMPORARY TABLE sample2 (id, junk) AS
SELECT oid, repeat(’x’, 250)
FROM pg_class
ORDER BY random(); -- add rows in random order
SELECT 260
These tables have no indexes and no optimizer statistics.
Explaining the Postgres Query Optimizer 35 / 61
Join the Two Tables
with a Tight Restriction
EXPLAIN SELECT sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
WHERE sample1.id = 33;
QUERY PLAN
---------------------------------------------------------------------
Nested Loop (cost=0.00..234.68 rows=300 width=32)
-> Seq Scan on sample1 (cost=0.00..205.54 rows=50 width=4)
Filter: (id = 33::oid)
-> Materialize (cost=0.00..25.41 rows=6 width=36)
-> Seq Scan on sample2 (cost=0.00..25.38 rows=6 width=36)
Filter: (id = 33::oid)
(6 rows)
Explaining the Postgres Query Optimizer 36 / 61
Nested Loop Join
with Inner Sequential Scan
aag
aar
aay aag
aas
aar
aaa
aay
aai
aag
No Setup Required
aai
Used For Small Tables
Outer Inner
Explaining the Postgres Query Optimizer 37 / 61
Pseudocode for Nested Loop Join
with Inner Sequential Scan
for (i = 0; i < length(outer); i++)
for (j = 0; j < length(inner); j++)
if (outer[i] == inner[j])
output(outer[i], inner[j]);
Explaining the Postgres Query Optimizer 38 / 61
Join the Two Tables with a Looser Restriction
EXPLAIN SELECT sample1.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
WHERE sample2.id > 33;
QUERY PLAN
----------------------------------------------------------------------
Hash Join (cost=30.50..950.88 rows=20424 width=32)
Hash Cond: (sample1.id = sample2.id)
-> Seq Scan on sample1 (cost=0.00..180.63 rows=9963 width=36)
-> Hash (cost=25.38..25.38 rows=410 width=4)
-> Seq Scan on sample2 (cost=0.00..25.38 rows=410 width=4)
Filter: (id > 33::oid)
(6 rows)
Explaining the Postgres Query Optimizer 39 / 61
Hash Join
Hashed
Must fit in Main Memory
aak
aar
aak
aay aaraam
aao aaw
aay
aag
aas
Outer Inner
Explaining the Postgres Query Optimizer 40 / 61
Pseudocode for Hash Join
for (j = 0; j < length(inner); j++)
hash_key = hash(inner[j]);
append(hash_store[hash_key], inner[j]);
for (i = 0; i < length(outer); i++)
hash_key = hash(outer[i]);
for (j = 0; j < length(hash_store[hash_key]); j++)
if (outer[i] == hash_store[hash_key][j])
output(outer[i], inner[j]);
Explaining the Postgres Query Optimizer 41 / 61
Join the Two Tables with No Restriction
EXPLAIN SELECT sample1.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id);
QUERY PLAN
-------------------------------------------------------------------------
Merge Join (cost=927.72..1852.95 rows=61272 width=32)
Merge Cond: (sample2.id = sample1.id)
-> Sort (cost=85.43..88.50 rows=1230 width=4)
Sort Key: sample2.id
-> Seq Scan on sample2 (cost=0.00..22.30 rows=1230 width=4)
-> Sort (cost=842.29..867.20 rows=9963 width=36)
Sort Key: sample1.id
-> Seq Scan on sample1 (cost=0.00..180.63 rows=9963 width=36)
(8 rows)
Explaining the Postgres Query Optimizer 42 / 61
Merge Join
Sorted
Sorted
Ideal for Large Tables
An Index Can Be Used to Eliminate the Sort
aaa
aab
aac
aad
aaa
aab
aab
aaf
aaf
aac
aae
Outer Inner
Explaining the Postgres Query Optimizer 43 / 61
Pseudocode for Merge Join
sort(outer);
sort(inner);
i = 0;
j = 0;
save_j = 0;
while (i < length(outer))
if (outer[i] == inner[j])
output(outer[i], inner[j]);
if (outer[i] <= inner[j] && j < length(inner))
j++;
if (outer[i] < inner[j])
save_j = j;
else
i++;
j = save_j;
Explaining the Postgres Query Optimizer 44 / 61
Order of Joined Relations Is Insignificant
EXPLAIN SELECT sample2.junk
FROM sample2 JOIN sample1 ON (sample2.id = sample1.id);
QUERY PLAN
------------------------------------------------------------------------
Merge Join (cost=927.72..1852.95 rows=61272 width=32)
Merge Cond: (sample2.id = sample1.id)
-> Sort (cost=85.43..88.50 rows=1230 width=36)
Sort Key: sample2.id
-> Seq Scan on sample2 (cost=0.00..22.30 rows=1230 width=36)
-> Sort (cost=842.29..867.20 rows=9963 width=4)
Sort Key: sample1.id
-> Seq Scan on sample1 (cost=0.00..180.63 rows=9963 width=4)
(8 rows)
The most restrictive relation, e.g. sample2, is always on the outer side of
merge joins. All previous merge joins also had sample2 in outer position.
Explaining the Postgres Query Optimizer 45 / 61
Add Optimizer Statistics
ANALYZE sample1;
ANALYZE sample2;
Explaining the Postgres Query Optimizer 46 / 61
This Was a Merge Join without Optimizer Statistics
EXPLAIN SELECT sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id);
QUERY PLAN
------------------------------------------------------------------------
Hash Join (cost=15.85..130.47 rows=260 width=254)
Hash Cond: (sample1.id = sample2.id)
-> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4)
-> Hash (cost=12.60..12.60 rows=260 width=258)
-> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=258)
(5 rows)
Explaining the Postgres Query Optimizer 47 / 61
Outer Joins Can Affect Optimizer Join Usage
EXPLAIN SELECT sample1.junk
FROM sample1 RIGHT OUTER JOIN sample2 ON (sample1.id = sample2.id);
QUERY PLAN
--------------------------------------------------------------------------
Hash Left Join (cost=131.76..148.26 rows=260 width=254)
Hash Cond: (sample2.id = sample1.id)
-> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=4)
-> Hash (cost=103.56..103.56 rows=2256 width=258)
-> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=258)
(5 rows)
Use of hashes for outer joins was added in Postgres 9.1.
Explaining the Postgres Query Optimizer 48 / 61
Cross Joins Are Nested Loop Joins
without Join Restriction
EXPLAIN SELECT sample1.junk
FROM sample1 CROSS JOIN sample2;
QUERY PLAN
----------------------------------------------------------------------
Nested Loop (cost=0.00..7448.81 rows=586560 width=254)
-> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=254)
-> Materialize (cost=0.00..13.90 rows=260 width=0)
-> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=0)
(4 rows)
Explaining the Postgres Query Optimizer 49 / 61
Create Indexes
CREATE INDEX i_sample1 on sample1 (id);
CREATE INDEX i_sample2 on sample2 (id);
Explaining the Postgres Query Optimizer 50 / 61
Nested Loop with Inner Index Scan Now Possible
EXPLAIN SELECT sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
WHERE sample1.id = 33;
QUERY PLAN
---------------------------------------------------------------------------------
Nested Loop (cost=0.00..16.55 rows=1 width=254)
-> Index Scan using i_sample1 on sample1 (cost=0.00..8.27 rows=1 width=4)
Index Cond: (id = 33::oid)
-> Index Scan using i_sample2 on sample2 (cost=0.00..8.27 rows=1 width=258)
Index Cond: (sample2.id = 33::oid)
(5 rows)
Explaining the Postgres Query Optimizer 51 / 61
Nested Loop Join with Inner Index Scan
aag
aar
aai
aay aag
aas
aar
aaa
aay
aai
aag
No Setup Required
Index Lookup
Index Must Already Exist
Outer Inner
Explaining the Postgres Query Optimizer 52 / 61
Pseudocode for Nested Loop Join
with Inner Index Scan
for (i = 0; i < length(outer); i++)
index_entry = get_first_match(outer[j])
while (index_entry)
output(outer[i], inner[index_entry]);
index_entry = get_next_match(index_entry);
Explaining the Postgres Query Optimizer 53 / 61
Query Restrictions Affect Join Usage
EXPLAIN SELECT sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
WHERE sample2.junk ˜ ’^aaa’;
QUERY PLAN
-------------------------------------------------------------------------------
Nested Loop (cost=0.00..21.53 rows=1 width=254)
-> Seq Scan on sample2 (cost=0.00..13.25 rows=1 width=258)
Filter: (junk ˜ ’^aaa’::text)
-> Index Scan using i_sample1 on sample1 (cost=0.00..8.27 rows=1 width=4)
Index Cond: (sample1.id = sample2.id)
(5 rows)
No junk rows begin with ’aaa’.
Explaining the Postgres Query Optimizer 54 / 61
All ’junk’ Columns Begin with ’xxx’
EXPLAIN SELECT sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
WHERE sample2.junk ˜ ’^xxx’;
QUERY PLAN
------------------------------------------------------------------------
Hash Join (cost=16.50..131.12 rows=260 width=254)
Hash Cond: (sample1.id = sample2.id)
-> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4)
-> Hash (cost=13.25..13.25 rows=260 width=258)
-> Seq Scan on sample2 (cost=0.00..13.25 rows=260 width=258)
Filter: (junk ˜ ’^xxx’::text)
(6 rows)
Hash join was chosen because many more rows are expected. The
smaller table, e.g. sample2, is always hashed.
Explaining the Postgres Query Optimizer 55 / 61
Without LIMIT, Hash Is Used
for this Unrestricted Join
EXPLAIN SELECT sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id);
QUERY PLAN
------------------------------------------------------------------------
Hash Join (cost=15.85..130.47 rows=260 width=254)
Hash Cond: (sample1.id = sample2.id)
-> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4)
-> Hash (cost=12.60..12.60 rows=260 width=258)
-> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=258)
(5 rows)
Explaining the Postgres Query Optimizer 56 / 61
LIMIT Can Affect Join Usage
EXPLAIN SELECT sample2.id, sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
ORDER BY 1
LIMIT 1;
QUERY PLAN
------------------------------------------------------------------------------------------
Limit (cost=0.00..1.83 rows=1 width=258)
-> Nested Loop (cost=0.00..477.02 rows=260 width=258)
-> Index Scan using i_sample2 on sample2 (cost=0.00..52.15 rows=260 width=258)
-> Index Scan using i_sample1 on sample1 (cost=0.00..1.62 rows=1 width=4)
Index Cond: (sample1.id = sample2.id)
(5 rows)
Explaining the Postgres Query Optimizer 57 / 61
LIMIT 10
EXPLAIN SELECT sample2.id, sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
ORDER BY 1
LIMIT 10;
QUERY PLAN
------------------------------------------------------------------------------------------
Limit (cost=0.00..18.35 rows=10 width=258)
-> Nested Loop (cost=0.00..477.02 rows=260 width=258)
-> Index Scan using i_sample2 on sample2 (cost=0.00..52.15 rows=260 width=258)
-> Index Scan using i_sample1 on sample1 (cost=0.00..1.62 rows=1 width=4)
Index Cond: (sample1.id = sample2.id)
(5 rows)
Explaining the Postgres Query Optimizer 58 / 61
LIMIT 100 Switches to Hash Join
EXPLAIN SELECT sample2.id, sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
ORDER BY 1
LIMIT 100;
QUERY PLAN
------------------------------------------------------------------------------------
Limit (cost=140.41..140.66 rows=100 width=258)
-> Sort (cost=140.41..141.06 rows=260 width=258)
Sort Key: sample2.id
-> Hash Join (cost=15.85..130.47 rows=260 width=258)
Hash Cond: (sample1.id = sample2.id)
-> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4)
-> Hash (cost=12.60..12.60 rows=260 width=258)
-> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=258)
(8 rows)
Explaining the Postgres Query Optimizer 59 / 61
Additional Resources…
◮ Postgres Downloads:
◮ www.enterprisedb.com/downloads
◮ Product and Services information:
◮ info@enterprisedb.com
Explaining the Postgres Query Optimizer 60 / 61
Conclusion
http://momjian.us/presentations http://www.vivapixel.com/photo/14252
Explaining the Postgres Query Optimizer 61 / 61

More Related Content

What's hot

How to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better PerformanceHow to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better Performanceoysteing
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinDataStax Academy
 
PostgreSQL query planner's internals
PostgreSQL query planner's internalsPostgreSQL query planner's internals
PostgreSQL query planner's internalsAlexey Ermakov
 
[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouseVianney FOUCAULT
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfAltinity Ltd
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Alexey Lesovsky
 
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015PostgreSQL-Consulting
 
Query Optimizer – MySQL vs. PostgreSQL
Query Optimizer – MySQL vs. PostgreSQLQuery Optimizer – MySQL vs. PostgreSQL
Query Optimizer – MySQL vs. PostgreSQLChristian Antognini
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesDatabricks
 
PostgreSQL Extensions: A deeper look
PostgreSQL Extensions:  A deeper lookPostgreSQL Extensions:  A deeper look
PostgreSQL Extensions: A deeper lookJignesh Shah
 
PostgreSQL Performance Tuning
PostgreSQL Performance TuningPostgreSQL Performance Tuning
PostgreSQL Performance Tuningelliando dias
 
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...Altinity Ltd
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides Altinity Ltd
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & FeaturesDataStax Academy
 
Advanced Postgres Monitoring
Advanced Postgres MonitoringAdvanced Postgres Monitoring
Advanced Postgres MonitoringDenish Patel
 
Getting started with postgresql
Getting started with postgresqlGetting started with postgresql
Getting started with postgresqlbotsplash.com
 

What's hot (20)

How to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better PerformanceHow to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better Performance
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
 
PostgreSQL query planner's internals
PostgreSQL query planner's internalsPostgreSQL query planner's internals
PostgreSQL query planner's internals
 
[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.
 
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
 
Query Optimizer – MySQL vs. PostgreSQL
Query Optimizer – MySQL vs. PostgreSQLQuery Optimizer – MySQL vs. PostgreSQL
Query Optimizer – MySQL vs. PostgreSQL
 
How to Design Indexes, Really
How to Design Indexes, ReallyHow to Design Indexes, Really
How to Design Indexes, Really
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
PostgreSQL Extensions: A deeper look
PostgreSQL Extensions:  A deeper lookPostgreSQL Extensions:  A deeper look
PostgreSQL Extensions: A deeper look
 
Ash and awr deep dive hotsos
Ash and awr deep dive hotsosAsh and awr deep dive hotsos
Ash and awr deep dive hotsos
 
PostgreSQL Performance Tuning
PostgreSQL Performance TuningPostgreSQL Performance Tuning
PostgreSQL Performance Tuning
 
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
 
Advanced Postgres Monitoring
Advanced Postgres MonitoringAdvanced Postgres Monitoring
Advanced Postgres Monitoring
 
Getting started with postgresql
Getting started with postgresqlGetting started with postgresql
Getting started with postgresql
 
PostgreSQL.pptx
PostgreSQL.pptxPostgreSQL.pptx
PostgreSQL.pptx
 
Sql query patterns, optimized
Sql query patterns, optimizedSql query patterns, optimized
Sql query patterns, optimized
 

Viewers also liked

Postgres-XC as a Key Value Store Compared To MongoDB
Postgres-XC as a Key Value Store Compared To MongoDBPostgres-XC as a Key Value Store Compared To MongoDB
Postgres-XC as a Key Value Store Compared To MongoDBMason Sharp
 
variedades linguisticas por Paola Espinoza
variedades linguisticas por Paola Espinozavariedades linguisticas por Paola Espinoza
variedades linguisticas por Paola Espinozapaopeke26
 
Indexes: The neglected performance all rounder
Indexes: The neglected performance all rounderIndexes: The neglected performance all rounder
Indexes: The neglected performance all rounderMarkus Winand
 
Postgres-XC: Symmetric PostgreSQL Cluster
Postgres-XC: Symmetric PostgreSQL ClusterPostgres-XC: Symmetric PostgreSQL Cluster
Postgres-XC: Symmetric PostgreSQL ClusterPavan Deolasee
 
Distributed Postgres
Distributed PostgresDistributed Postgres
Distributed PostgresStas Kelvich
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)Kohei KaiGai
 
Postgres-XC Write Scalable PostgreSQL Cluster
Postgres-XC Write Scalable PostgreSQL ClusterPostgres-XC Write Scalable PostgreSQL Cluster
Postgres-XC Write Scalable PostgreSQL ClusterMason Sharp
 
SQL: Query optimization in practice
SQL: Query optimization in practiceSQL: Query optimization in practice
SQL: Query optimization in practiceJano Suchal
 
Modern SQL in Open Source and Commercial Databases
Modern SQL in Open Source and Commercial DatabasesModern SQL in Open Source and Commercial Databases
Modern SQL in Open Source and Commercial DatabasesMarkus Winand
 

Viewers also liked (12)

Postgres clusters
Postgres clustersPostgres clusters
Postgres clusters
 
Postgres-XC as a Key Value Store Compared To MongoDB
Postgres-XC as a Key Value Store Compared To MongoDBPostgres-XC as a Key Value Store Compared To MongoDB
Postgres-XC as a Key Value Store Compared To MongoDB
 
variedades linguisticas por Paola Espinoza
variedades linguisticas por Paola Espinozavariedades linguisticas por Paola Espinoza
variedades linguisticas por Paola Espinoza
 
1
11
1
 
Indexes: The neglected performance all rounder
Indexes: The neglected performance all rounderIndexes: The neglected performance all rounder
Indexes: The neglected performance all rounder
 
Postgres-XC: Symmetric PostgreSQL Cluster
Postgres-XC: Symmetric PostgreSQL ClusterPostgres-XC: Symmetric PostgreSQL Cluster
Postgres-XC: Symmetric PostgreSQL Cluster
 
Distributed Postgres
Distributed PostgresDistributed Postgres
Distributed Postgres
 
Multimaster
MultimasterMultimaster
Multimaster
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)
 
Postgres-XC Write Scalable PostgreSQL Cluster
Postgres-XC Write Scalable PostgreSQL ClusterPostgres-XC Write Scalable PostgreSQL Cluster
Postgres-XC Write Scalable PostgreSQL Cluster
 
SQL: Query optimization in practice
SQL: Query optimization in practiceSQL: Query optimization in practice
SQL: Query optimization in practice
 
Modern SQL in Open Source and Commercial Databases
Modern SQL in Open Source and Commercial DatabasesModern SQL in Open Source and Commercial Databases
Modern SQL in Open Source and Commercial Databases
 

Similar to How the Postgres Query Optimizer Works

Explaining the Postgres Query Optimizer - PGCon 2014
Explaining the Postgres Query Optimizer - PGCon 2014Explaining the Postgres Query Optimizer - PGCon 2014
Explaining the Postgres Query Optimizer - PGCon 2014EDB
 
Explaining the Postgres Query Optimizer
Explaining the Postgres Query OptimizerExplaining the Postgres Query Optimizer
Explaining the Postgres Query OptimizerEDB
 
Explaining the Postgres Query Optimizer (Bruce Momjian)
Explaining the Postgres Query Optimizer (Bruce Momjian)Explaining the Postgres Query Optimizer (Bruce Momjian)
Explaining the Postgres Query Optimizer (Bruce Momjian)Ontico
 
Overview of query evaluation
Overview of query evaluationOverview of query evaluation
Overview of query evaluationavniS
 
Correctness and Performance of Apache Spark SQL
Correctness and Performance of Apache Spark SQLCorrectness and Performance of Apache Spark SQL
Correctness and Performance of Apache Spark SQLNicolas Poggi
 
Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...
Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...
Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...Databricks
 
Fast and Reliable Apache Spark SQL Releases
Fast and Reliable Apache Spark SQL ReleasesFast and Reliable Apache Spark SQL Releases
Fast and Reliable Apache Spark SQL ReleasesDataWorks Summit
 
Fast and Reliable Apache Spark SQL Engine
Fast and Reliable Apache Spark SQL EngineFast and Reliable Apache Spark SQL Engine
Fast and Reliable Apache Spark SQL EngineDatabricks
 
Presentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cPresentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cRonald Francisco Vargas Quesada
 
SQL Performance Tuning and New Features in Oracle 19c
SQL Performance Tuning and New Features in Oracle 19cSQL Performance Tuning and New Features in Oracle 19c
SQL Performance Tuning and New Features in Oracle 19cRachelBarker26
 
Managing Statistics for Optimal Query Performance
Managing Statistics for Optimal Query PerformanceManaging Statistics for Optimal Query Performance
Managing Statistics for Optimal Query PerformanceKaren Morton
 
Cost-Based Optimizer in Apache Spark 2.2
Cost-Based Optimizer in Apache Spark 2.2 Cost-Based Optimizer in Apache Spark 2.2
Cost-Based Optimizer in Apache Spark 2.2 Databricks
 
Summarizing Software API Usage Examples Using Clustering Techniques
Summarizing Software API Usage Examples Using Clustering TechniquesSummarizing Software API Usage Examples Using Clustering Techniques
Summarizing Software API Usage Examples Using Clustering TechniquesNikos Katirtzis
 
Top 10 Oracle SQL tuning tips
Top 10 Oracle SQL tuning tipsTop 10 Oracle SQL tuning tips
Top 10 Oracle SQL tuning tipsNirav Shah
 
PostgreSQL 8.4 TriLUG 2009-11-12
PostgreSQL 8.4 TriLUG 2009-11-12PostgreSQL 8.4 TriLUG 2009-11-12
PostgreSQL 8.4 TriLUG 2009-11-12Andrew Dunstan
 
Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...
Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...
Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...Databricks
 
Advanced pg_stat_statements: Filtering, Regression Testing & more
Advanced pg_stat_statements: Filtering, Regression Testing & moreAdvanced pg_stat_statements: Filtering, Regression Testing & more
Advanced pg_stat_statements: Filtering, Regression Testing & moreLukas Fittl
 
10 Reasons to Start Your Analytics Project with PostgreSQL
10 Reasons to Start Your Analytics Project with PostgreSQL10 Reasons to Start Your Analytics Project with PostgreSQL
10 Reasons to Start Your Analytics Project with PostgreSQLSatoshi Nagayasu
 
Query optimizer vivek sharma
Query optimizer vivek sharmaQuery optimizer vivek sharma
Query optimizer vivek sharmaaioughydchapter
 
Sql and PL/SQL Best Practices I
Sql and PL/SQL Best Practices ISql and PL/SQL Best Practices I
Sql and PL/SQL Best Practices ICarlos Oliveira
 

Similar to How the Postgres Query Optimizer Works (20)

Explaining the Postgres Query Optimizer - PGCon 2014
Explaining the Postgres Query Optimizer - PGCon 2014Explaining the Postgres Query Optimizer - PGCon 2014
Explaining the Postgres Query Optimizer - PGCon 2014
 
Explaining the Postgres Query Optimizer
Explaining the Postgres Query OptimizerExplaining the Postgres Query Optimizer
Explaining the Postgres Query Optimizer
 
Explaining the Postgres Query Optimizer (Bruce Momjian)
Explaining the Postgres Query Optimizer (Bruce Momjian)Explaining the Postgres Query Optimizer (Bruce Momjian)
Explaining the Postgres Query Optimizer (Bruce Momjian)
 
Overview of query evaluation
Overview of query evaluationOverview of query evaluation
Overview of query evaluation
 
Correctness and Performance of Apache Spark SQL
Correctness and Performance of Apache Spark SQLCorrectness and Performance of Apache Spark SQL
Correctness and Performance of Apache Spark SQL
 
Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...
Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...
Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...
 
Fast and Reliable Apache Spark SQL Releases
Fast and Reliable Apache Spark SQL ReleasesFast and Reliable Apache Spark SQL Releases
Fast and Reliable Apache Spark SQL Releases
 
Fast and Reliable Apache Spark SQL Engine
Fast and Reliable Apache Spark SQL EngineFast and Reliable Apache Spark SQL Engine
Fast and Reliable Apache Spark SQL Engine
 
Presentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cPresentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12c
 
SQL Performance Tuning and New Features in Oracle 19c
SQL Performance Tuning and New Features in Oracle 19cSQL Performance Tuning and New Features in Oracle 19c
SQL Performance Tuning and New Features in Oracle 19c
 
Managing Statistics for Optimal Query Performance
Managing Statistics for Optimal Query PerformanceManaging Statistics for Optimal Query Performance
Managing Statistics for Optimal Query Performance
 
Cost-Based Optimizer in Apache Spark 2.2
Cost-Based Optimizer in Apache Spark 2.2 Cost-Based Optimizer in Apache Spark 2.2
Cost-Based Optimizer in Apache Spark 2.2
 
Summarizing Software API Usage Examples Using Clustering Techniques
Summarizing Software API Usage Examples Using Clustering TechniquesSummarizing Software API Usage Examples Using Clustering Techniques
Summarizing Software API Usage Examples Using Clustering Techniques
 
Top 10 Oracle SQL tuning tips
Top 10 Oracle SQL tuning tipsTop 10 Oracle SQL tuning tips
Top 10 Oracle SQL tuning tips
 
PostgreSQL 8.4 TriLUG 2009-11-12
PostgreSQL 8.4 TriLUG 2009-11-12PostgreSQL 8.4 TriLUG 2009-11-12
PostgreSQL 8.4 TriLUG 2009-11-12
 
Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...
Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...
Cost-Based Optimizer in Apache Spark 2.2 Ron Hu, Sameer Agarwal, Wenchen Fan ...
 
Advanced pg_stat_statements: Filtering, Regression Testing & more
Advanced pg_stat_statements: Filtering, Regression Testing & moreAdvanced pg_stat_statements: Filtering, Regression Testing & more
Advanced pg_stat_statements: Filtering, Regression Testing & more
 
10 Reasons to Start Your Analytics Project with PostgreSQL
10 Reasons to Start Your Analytics Project with PostgreSQL10 Reasons to Start Your Analytics Project with PostgreSQL
10 Reasons to Start Your Analytics Project with PostgreSQL
 
Query optimizer vivek sharma
Query optimizer vivek sharmaQuery optimizer vivek sharma
Query optimizer vivek sharma
 
Sql and PL/SQL Best Practices I
Sql and PL/SQL Best Practices ISql and PL/SQL Best Practices I
Sql and PL/SQL Best Practices I
 

More from EDB

Cloud Migration Paths: Kubernetes, IaaS, or DBaaS
Cloud Migration Paths: Kubernetes, IaaS, or DBaaSCloud Migration Paths: Kubernetes, IaaS, or DBaaS
Cloud Migration Paths: Kubernetes, IaaS, or DBaaSEDB
 
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr UnternehmenDie 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr UnternehmenEDB
 
Migre sus bases de datos Oracle a la nube
Migre sus bases de datos Oracle a la nube Migre sus bases de datos Oracle a la nube
Migre sus bases de datos Oracle a la nube EDB
 
EFM Office Hours - APJ - July 29, 2021
EFM Office Hours - APJ - July 29, 2021EFM Office Hours - APJ - July 29, 2021
EFM Office Hours - APJ - July 29, 2021EDB
 
Benchmarking Cloud Native PostgreSQL
Benchmarking Cloud Native PostgreSQLBenchmarking Cloud Native PostgreSQL
Benchmarking Cloud Native PostgreSQLEDB
 
Las Variaciones de la Replicación de PostgreSQL
Las Variaciones de la Replicación de PostgreSQLLas Variaciones de la Replicación de PostgreSQL
Las Variaciones de la Replicación de PostgreSQLEDB
 
NoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQLNoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQLEDB
 
Is There Anything PgBouncer Can’t Do?
Is There Anything PgBouncer Can’t Do?Is There Anything PgBouncer Can’t Do?
Is There Anything PgBouncer Can’t Do?EDB
 
Data Analysis with TensorFlow in PostgreSQL
Data Analysis with TensorFlow in PostgreSQLData Analysis with TensorFlow in PostgreSQL
Data Analysis with TensorFlow in PostgreSQLEDB
 
Practical Partitioning in Production with Postgres
Practical Partitioning in Production with PostgresPractical Partitioning in Production with Postgres
Practical Partitioning in Production with PostgresEDB
 
A Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAINA Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAINEDB
 
IOT with PostgreSQL
IOT with PostgreSQLIOT with PostgreSQL
IOT with PostgreSQLEDB
 
A Journey from Oracle to PostgreSQL
A Journey from Oracle to PostgreSQLA Journey from Oracle to PostgreSQL
A Journey from Oracle to PostgreSQLEDB
 
Psql is awesome!
Psql is awesome!Psql is awesome!
Psql is awesome!EDB
 
EDB 13 - New Enhancements for Security and Usability - APJ
EDB 13 - New Enhancements for Security and Usability - APJEDB 13 - New Enhancements for Security and Usability - APJ
EDB 13 - New Enhancements for Security and Usability - APJEDB
 
Comment sauvegarder correctement vos données
Comment sauvegarder correctement vos donnéesComment sauvegarder correctement vos données
Comment sauvegarder correctement vos donnéesEDB
 
Cloud Native PostgreSQL - Italiano
Cloud Native PostgreSQL - ItalianoCloud Native PostgreSQL - Italiano
Cloud Native PostgreSQL - ItalianoEDB
 
New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13EDB
 
Best Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQLBest Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQLEDB
 
Cloud Native PostgreSQL - APJ
Cloud Native PostgreSQL - APJCloud Native PostgreSQL - APJ
Cloud Native PostgreSQL - APJEDB
 

More from EDB (20)

Cloud Migration Paths: Kubernetes, IaaS, or DBaaS
Cloud Migration Paths: Kubernetes, IaaS, or DBaaSCloud Migration Paths: Kubernetes, IaaS, or DBaaS
Cloud Migration Paths: Kubernetes, IaaS, or DBaaS
 
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr UnternehmenDie 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
 
Migre sus bases de datos Oracle a la nube
Migre sus bases de datos Oracle a la nube Migre sus bases de datos Oracle a la nube
Migre sus bases de datos Oracle a la nube
 
EFM Office Hours - APJ - July 29, 2021
EFM Office Hours - APJ - July 29, 2021EFM Office Hours - APJ - July 29, 2021
EFM Office Hours - APJ - July 29, 2021
 
Benchmarking Cloud Native PostgreSQL
Benchmarking Cloud Native PostgreSQLBenchmarking Cloud Native PostgreSQL
Benchmarking Cloud Native PostgreSQL
 
Las Variaciones de la Replicación de PostgreSQL
Las Variaciones de la Replicación de PostgreSQLLas Variaciones de la Replicación de PostgreSQL
Las Variaciones de la Replicación de PostgreSQL
 
NoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQLNoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQL
 
Is There Anything PgBouncer Can’t Do?
Is There Anything PgBouncer Can’t Do?Is There Anything PgBouncer Can’t Do?
Is There Anything PgBouncer Can’t Do?
 
Data Analysis with TensorFlow in PostgreSQL
Data Analysis with TensorFlow in PostgreSQLData Analysis with TensorFlow in PostgreSQL
Data Analysis with TensorFlow in PostgreSQL
 
Practical Partitioning in Production with Postgres
Practical Partitioning in Production with PostgresPractical Partitioning in Production with Postgres
Practical Partitioning in Production with Postgres
 
A Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAINA Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAIN
 
IOT with PostgreSQL
IOT with PostgreSQLIOT with PostgreSQL
IOT with PostgreSQL
 
A Journey from Oracle to PostgreSQL
A Journey from Oracle to PostgreSQLA Journey from Oracle to PostgreSQL
A Journey from Oracle to PostgreSQL
 
Psql is awesome!
Psql is awesome!Psql is awesome!
Psql is awesome!
 
EDB 13 - New Enhancements for Security and Usability - APJ
EDB 13 - New Enhancements for Security and Usability - APJEDB 13 - New Enhancements for Security and Usability - APJ
EDB 13 - New Enhancements for Security and Usability - APJ
 
Comment sauvegarder correctement vos données
Comment sauvegarder correctement vos donnéesComment sauvegarder correctement vos données
Comment sauvegarder correctement vos données
 
Cloud Native PostgreSQL - Italiano
Cloud Native PostgreSQL - ItalianoCloud Native PostgreSQL - Italiano
Cloud Native PostgreSQL - Italiano
 
New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13
 
Best Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQLBest Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQL
 
Cloud Native PostgreSQL - APJ
Cloud Native PostgreSQL - APJCloud Native PostgreSQL - APJ
Cloud Native PostgreSQL - APJ
 

Recently uploaded

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 

Recently uploaded (20)

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 

How the Postgres Query Optimizer Works

  • 1. Explaining the Postgres Query Optimizer BRUCE MOMJIAN January, 2015 The optimizer is the "brain" of the database, interpreting SQL queries and determining the fastest method of execution. This talk uses the EXPLAIN command to show how the optimizer interprets queries and determines optimal execution. Creative Commons Attribution License http://momjian.us/presentations 1 / 61
  • 2. PostgreSQL the database… ◮ Open Source Object Relational DBMS since 1996 ◮ Distributed under the PostgreSQL License ◮ Similar technical heritage as Oracle, SQL Server & DB2 ◮ However, a strong adherence to standards (ANSI-SQL 2008) ◮ Highly extensible and adaptable design ◮ Languages, indexing, data types, etc. ◮ E.g. PostGIS, JSONB, SQL/MED ◮ Extensive use throughout the world for applications and organizations of all types ◮ Bundled into Red Hat Enterprise Linux, Ubuntu, CentOS and Amazon Linux Explaining the Postgres Query Optimizer 2 / 61
  • 3. PostgreSQL the community… ◮ Independent community led by a Core Team of six ◮ Large, active and vibrant community ◮ www.postgresql.org ◮ Downloads, Mailing lists, Documentation ◮ Sponsors sampler: ◮ Google, Red Hat, VMWare, Skype, Salesforce, HP and EnterpriseDB ◮ http://www.postgresql.org/community/ Explaining the Postgres Query Optimizer 3 / 61
  • 4. EnterpriseDB the company… ◮ The worldwide leader of Postgres based products and services founded in 2004 ◮ Customers include 50 of the Fortune 500 and 98 of the Forbes Global 2000 ◮ Enterprise offerings: ◮ PostgreSQL Support, Services and Training ◮ Postgres Plus Advanced Server with Oracle Compatibility ◮ Tools for Monitoring, Replication, HA, Backup & Recovery Community ◮ Citizenship ◮ Contributor of key features: Materialized Views, JSON, & more ◮ Nine community members on staff Explaining the Postgres Query Optimizer 4 / 61
  • 5. EnterpriseDB the company… Explaining the Postgres Query Optimizer 5 / 61
  • 7. Postgres Query Execution utility Plan Optimal Path Query Postmaster Postgres Postgres Libpq Main Generate Plan Traffic Cop Generate Paths Execute Plan e.g. CREATE TABLE, COPY SELECT, INSERT, UPDATE, DELETE Rewrite Query Parse Statement Utility Command Storage ManagersCatalogUtilities Access Methods Nodes / Lists Explaining the Postgres Query Optimizer 7 / 61
  • 8. Postgres Query Execution utility Plan Optimal Path Query Generate Plan Traffic Cop Generate Paths Execute Plan e.g. CREATE TABLE, COPY SELECT, INSERT, UPDATE, DELETE Rewrite Query Parse Statement Utility Command Explaining the Postgres Query Optimizer 8 / 61
  • 9. The Optimizer Is the Brain http://www.wsmanaging.com/ Explaining the Postgres Query Optimizer 9 / 61
  • 10. What Decisions Does the Optimizer Have to Make? ◮ Scan Method ◮ Join Method ◮ Join Order Explaining the Postgres Query Optimizer 10 / 61
  • 11. Which Scan Method? ◮ Sequential Scan ◮ Bitmap Index Scan ◮ Index Scan Explaining the Postgres Query Optimizer 11 / 61
  • 12. A Simple Example Using pg_class.relname SELECT relname FROM pg_class ORDER BY 1 LIMIT 8; relname ----------------------------------- _pg_foreign_data_wrappers _pg_foreign_servers _pg_user_mappings administrable_role_authorizations applicable_roles attributes check_constraint_routine_usage check_constraints (8 rows) Explaining the Postgres Query Optimizer 12 / 61
  • 13. Let’s Use Just the First Letter of pg_class.relname SELECT substring(relname, 1, 1) FROM pg_class ORDER BY 1 LIMIT 8; substring ----------- _ _ _ a a a c c (8 rows) Explaining the Postgres Query Optimizer 13 / 61
  • 14. Create a Temporary Table with an Index CREATE TEMPORARY TABLE sample (letter, junk) AS SELECT substring(relname, 1, 1), repeat(’x’, 250) FROM pg_class ORDER BY random(); -- add rows in random order SELECT 253 CREATE INDEX i_sample on sample (letter); CREATE INDEX All the queries used in this presentation are available at http://momjian.us/main/writings/pgsql/optimizer.sql. Explaining the Postgres Query Optimizer 14 / 61
  • 15. Create an EXPLAIN Function CREATE OR REPLACE FUNCTION lookup_letter(text) RETURNS SETOF text AS $$ BEGIN RETURN QUERY EXECUTE ’ EXPLAIN SELECT letter FROM sample WHERE letter = ’’’ || $1 || ’’’’; END $$ LANGUAGE plpgsql; CREATE FUNCTION Explaining the Postgres Query Optimizer 15 / 61
  • 16. What is the Distribution of the sample Table? WITH letters (letter, count) AS ( SELECT letter, COUNT(*) FROM sample GROUP BY 1 ) SELECT letter, count, (count * 100.0 / (SUM(count) OVER ()))::numeric(4,1) AS "%" FROM letters ORDER BY 2 DESC; Explaining the Postgres Query Optimizer 16 / 61
  • 17. What is the Distribution of the sample Table? letter | count | % --------+-------+------ p | 199 | 78.7 s | 9 | 3.6 c | 8 | 3.2 r | 7 | 2.8 t | 5 | 2.0 v | 4 | 1.6 f | 4 | 1.6 d | 4 | 1.6 u | 3 | 1.2 a | 3 | 1.2 _ | 3 | 1.2 e | 2 | 0.8 i | 1 | 0.4 k | 1 | 0.4 (14 rows) Explaining the Postgres Query Optimizer 17 / 61
  • 18. Is the Distribution Important? EXPLAIN SELECT letter FROM sample WHERE letter = ’p’; QUERY PLAN ------------------------------------------------------------------------ Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=32) Index Cond: (letter = ’p’::text) (2 rows) Explaining the Postgres Query Optimizer 18 / 61
  • 19. Is the Distribution Important? EXPLAIN SELECT letter FROM sample WHERE letter = ’d’; QUERY PLAN ------------------------------------------------------------------------ Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=32) Index Cond: (letter = ’d’::text) (2 rows) Explaining the Postgres Query Optimizer 19 / 61
  • 20. Is the Distribution Important? EXPLAIN SELECT letter FROM sample WHERE letter = ’k’; QUERY PLAN ------------------------------------------------------------------------ Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=32) Index Cond: (letter = ’k’::text) (2 rows) Explaining the Postgres Query Optimizer 20 / 61
  • 21. Running ANALYZE Causes a Sequential Scan for a Common Value ANALYZE sample; ANALYZE EXPLAIN SELECT letter FROM sample WHERE letter = ’p’; QUERY PLAN --------------------------------------------------------- Seq Scan on sample (cost=0.00..13.16 rows=199 width=2) Filter: (letter = ’p’::text) (2 rows) Autovacuum cannot ANALYZE (or VACUUM) temporary tables because these tables are only visible to the creating session. Explaining the Postgres Query Optimizer 21 / 61
  • 23. A Less Common Value Causes a Bitmap Index Scan EXPLAIN SELECT letter FROM sample WHERE letter = ’d’; QUERY PLAN ----------------------------------------------------------------------- Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) Recheck Cond: (letter = ’d’::text) -> Bitmap Index Scan on i_sample (cost=0.00..4.28 rows=4 width=0) Index Cond: (letter = ’d’::text) (4 rows) Explaining the Postgres Query Optimizer 23 / 61
  • 24. Bitmap Index Scan =& Combined ’A’ AND ’NS’ 1 0 1 0 TableIndex 1 col1 = ’A’ Index 2 1 0 0 col2 = ’NS’ 1 0 1 0 0 Index Explaining the Postgres Query Optimizer 24 / 61
  • 25. An Even Rarer Value Causes an Index Scan EXPLAIN SELECT letter FROM sample WHERE letter = ’k’; QUERY PLAN ----------------------------------------------------------------------- Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) Index Cond: (letter = ’k’::text) (2 rows) Explaining the Postgres Query Optimizer 25 / 61
  • 26. Index Scan A D A T A D A T A D A T A D A T A D A T A D A T A D < >=Key < >=Key Index Heap < >=Key A T A D A T A D A T A D A T A D A T A D A T Explaining the Postgres Query Optimizer 26 / 61
  • 27. Let’s Look at All Values and their Effects WITH letter (letter, count) AS ( SELECT letter, COUNT(*) FROM sample GROUP BY 1 ) SELECT letter AS l, count, lookup_letter(letter) FROM letter ORDER BY 2 DESC; l | count | lookup_letter ---+-------+----------------------------------------------------------------------- p | 199 | Seq Scan on sample (cost=0.00..13.16 rows=199 width=2) p | 199 | Filter: (letter = ’p’::text) s | 9 | Seq Scan on sample (cost=0.00..13.16 rows=9 width=2) s | 9 | Filter: (letter = ’s’::text) c | 8 | Seq Scan on sample (cost=0.00..13.16 rows=8 width=2) c | 8 | Filter: (letter = ’c’::text) r | 7 | Seq Scan on sample (cost=0.00..13.16 rows=7 width=2) r | 7 | Filter: (letter = ’r’::text) … Explaining the Postgres Query Optimizer 27 / 61
  • 28. OK, Just the First Lines WITH letter (letter, count) AS ( SELECT letter, COUNT(*) FROM sample GROUP BY 1 ) SELECT letter AS l, count, (SELECT * FROM lookup_letter(letter) AS l2 LIMIT 1) AS lookup_letter FROM letter ORDER BY 2 DESC; Explaining the Postgres Query Optimizer 28 / 61
  • 29. Just the First EXPLAIN Lines l | count | lookup_letter ---+-------+----------------------------------------------------------------------- p | 199 | Seq Scan on sample (cost=0.00..13.16 rows=199 width=2) s | 9 | Seq Scan on sample (cost=0.00..13.16 rows=9 width=2) c | 8 | Seq Scan on sample (cost=0.00..13.16 rows=8 width=2) r | 7 | Seq Scan on sample (cost=0.00..13.16 rows=7 width=2) t | 5 | Bitmap Heap Scan on sample (cost=4.29..12.76 rows=5 width=2) f | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) v | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) d | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) a | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) _ | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) u | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) e | 2 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) i | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) k | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) (14 rows) Explaining the Postgres Query Optimizer 29 / 61
  • 30. We Can Force an Index Scan SET enable_seqscan = false; SET enable_bitmapscan = false; WITH letter (letter, count) AS ( SELECT letter, COUNT(*) FROM sample GROUP BY 1 ) SELECT letter AS l, count, (SELECT * FROM lookup_letter(letter) AS l2 LIMIT 1) AS lookup_letter FROM letter ORDER BY 2 DESC; Explaining the Postgres Query Optimizer 30 / 61
  • 31. Notice the High Cost for Common Values l | count | lookup_letter ---+-------+----------------------------------------------------------------------- p | 199 | Index Scan using i_sample on sample (cost=0.00..39.33 rows=199 width= s | 9 | Index Scan using i_sample on sample (cost=0.00..22.14 rows=9 width=2) c | 8 | Index Scan using i_sample on sample (cost=0.00..19.84 rows=8 width=2) r | 7 | Index Scan using i_sample on sample (cost=0.00..19.82 rows=7 width=2) t | 5 | Index Scan using i_sample on sample (cost=0.00..15.21 rows=5 width=2) d | 4 | Index Scan using i_sample on sample (cost=0.00..15.19 rows=4 width=2) v | 4 | Index Scan using i_sample on sample (cost=0.00..15.19 rows=4 width=2) f | 4 | Index Scan using i_sample on sample (cost=0.00..15.19 rows=4 width=2) _ | 3 | Index Scan using i_sample on sample (cost=0.00..12.88 rows=3 width=2) a | 3 | Index Scan using i_sample on sample (cost=0.00..12.88 rows=3 width=2) u | 3 | Index Scan using i_sample on sample (cost=0.00..12.88 rows=3 width=2) e | 2 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) i | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) k | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) (14 rows) RESET ALL; RESET Explaining the Postgres Query Optimizer 31 / 61
  • 32. This Was the Optimizer’s Preference l | count | lookup_letter ---+-------+----------------------------------------------------------------------- p | 199 | Seq Scan on sample (cost=0.00..13.16 rows=199 width=2) s | 9 | Seq Scan on sample (cost=0.00..13.16 rows=9 width=2) c | 8 | Seq Scan on sample (cost=0.00..13.16 rows=8 width=2) r | 7 | Seq Scan on sample (cost=0.00..13.16 rows=7 width=2) t | 5 | Bitmap Heap Scan on sample (cost=4.29..12.76 rows=5 width=2) f | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) v | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) d | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) a | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) _ | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) u | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) e | 2 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) i | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) k | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) (14 rows) Explaining the Postgres Query Optimizer 32 / 61
  • 33. Which Join Method? ◮ Nested Loop ◮ With Inner Sequential Scan ◮ With Inner Index Scan ◮ Hash Join ◮ Merge Join Explaining the Postgres Query Optimizer 33 / 61
  • 34. What Is in pg_proc.oid? SELECT oid FROM pg_proc ORDER BY 1 LIMIT 8; oid ----- 31 33 34 35 38 39 40 41 (8 rows) Explaining the Postgres Query Optimizer 34 / 61
  • 35. Create Temporary Tables from pg_proc and pg_class CREATE TEMPORARY TABLE sample1 (id, junk) AS SELECT oid, repeat(’x’, 250) FROM pg_proc ORDER BY random(); -- add rows in random order SELECT 2256 CREATE TEMPORARY TABLE sample2 (id, junk) AS SELECT oid, repeat(’x’, 250) FROM pg_class ORDER BY random(); -- add rows in random order SELECT 260 These tables have no indexes and no optimizer statistics. Explaining the Postgres Query Optimizer 35 / 61
  • 36. Join the Two Tables with a Tight Restriction EXPLAIN SELECT sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) WHERE sample1.id = 33; QUERY PLAN --------------------------------------------------------------------- Nested Loop (cost=0.00..234.68 rows=300 width=32) -> Seq Scan on sample1 (cost=0.00..205.54 rows=50 width=4) Filter: (id = 33::oid) -> Materialize (cost=0.00..25.41 rows=6 width=36) -> Seq Scan on sample2 (cost=0.00..25.38 rows=6 width=36) Filter: (id = 33::oid) (6 rows) Explaining the Postgres Query Optimizer 36 / 61
  • 37. Nested Loop Join with Inner Sequential Scan aag aar aay aag aas aar aaa aay aai aag No Setup Required aai Used For Small Tables Outer Inner Explaining the Postgres Query Optimizer 37 / 61
  • 38. Pseudocode for Nested Loop Join with Inner Sequential Scan for (i = 0; i < length(outer); i++) for (j = 0; j < length(inner); j++) if (outer[i] == inner[j]) output(outer[i], inner[j]); Explaining the Postgres Query Optimizer 38 / 61
  • 39. Join the Two Tables with a Looser Restriction EXPLAIN SELECT sample1.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) WHERE sample2.id > 33; QUERY PLAN ---------------------------------------------------------------------- Hash Join (cost=30.50..950.88 rows=20424 width=32) Hash Cond: (sample1.id = sample2.id) -> Seq Scan on sample1 (cost=0.00..180.63 rows=9963 width=36) -> Hash (cost=25.38..25.38 rows=410 width=4) -> Seq Scan on sample2 (cost=0.00..25.38 rows=410 width=4) Filter: (id > 33::oid) (6 rows) Explaining the Postgres Query Optimizer 39 / 61
  • 40. Hash Join Hashed Must fit in Main Memory aak aar aak aay aaraam aao aaw aay aag aas Outer Inner Explaining the Postgres Query Optimizer 40 / 61
  • 41. Pseudocode for Hash Join for (j = 0; j < length(inner); j++) hash_key = hash(inner[j]); append(hash_store[hash_key], inner[j]); for (i = 0; i < length(outer); i++) hash_key = hash(outer[i]); for (j = 0; j < length(hash_store[hash_key]); j++) if (outer[i] == hash_store[hash_key][j]) output(outer[i], inner[j]); Explaining the Postgres Query Optimizer 41 / 61
  • 42. Join the Two Tables with No Restriction EXPLAIN SELECT sample1.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id); QUERY PLAN ------------------------------------------------------------------------- Merge Join (cost=927.72..1852.95 rows=61272 width=32) Merge Cond: (sample2.id = sample1.id) -> Sort (cost=85.43..88.50 rows=1230 width=4) Sort Key: sample2.id -> Seq Scan on sample2 (cost=0.00..22.30 rows=1230 width=4) -> Sort (cost=842.29..867.20 rows=9963 width=36) Sort Key: sample1.id -> Seq Scan on sample1 (cost=0.00..180.63 rows=9963 width=36) (8 rows) Explaining the Postgres Query Optimizer 42 / 61
  • 43. Merge Join Sorted Sorted Ideal for Large Tables An Index Can Be Used to Eliminate the Sort aaa aab aac aad aaa aab aab aaf aaf aac aae Outer Inner Explaining the Postgres Query Optimizer 43 / 61
  • 44. Pseudocode for Merge Join sort(outer); sort(inner); i = 0; j = 0; save_j = 0; while (i < length(outer)) if (outer[i] == inner[j]) output(outer[i], inner[j]); if (outer[i] <= inner[j] && j < length(inner)) j++; if (outer[i] < inner[j]) save_j = j; else i++; j = save_j; Explaining the Postgres Query Optimizer 44 / 61
  • 45. Order of Joined Relations Is Insignificant EXPLAIN SELECT sample2.junk FROM sample2 JOIN sample1 ON (sample2.id = sample1.id); QUERY PLAN ------------------------------------------------------------------------ Merge Join (cost=927.72..1852.95 rows=61272 width=32) Merge Cond: (sample2.id = sample1.id) -> Sort (cost=85.43..88.50 rows=1230 width=36) Sort Key: sample2.id -> Seq Scan on sample2 (cost=0.00..22.30 rows=1230 width=36) -> Sort (cost=842.29..867.20 rows=9963 width=4) Sort Key: sample1.id -> Seq Scan on sample1 (cost=0.00..180.63 rows=9963 width=4) (8 rows) The most restrictive relation, e.g. sample2, is always on the outer side of merge joins. All previous merge joins also had sample2 in outer position. Explaining the Postgres Query Optimizer 45 / 61
  • 46. Add Optimizer Statistics ANALYZE sample1; ANALYZE sample2; Explaining the Postgres Query Optimizer 46 / 61
  • 47. This Was a Merge Join without Optimizer Statistics EXPLAIN SELECT sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id); QUERY PLAN ------------------------------------------------------------------------ Hash Join (cost=15.85..130.47 rows=260 width=254) Hash Cond: (sample1.id = sample2.id) -> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4) -> Hash (cost=12.60..12.60 rows=260 width=258) -> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=258) (5 rows) Explaining the Postgres Query Optimizer 47 / 61
  • 48. Outer Joins Can Affect Optimizer Join Usage EXPLAIN SELECT sample1.junk FROM sample1 RIGHT OUTER JOIN sample2 ON (sample1.id = sample2.id); QUERY PLAN -------------------------------------------------------------------------- Hash Left Join (cost=131.76..148.26 rows=260 width=254) Hash Cond: (sample2.id = sample1.id) -> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=4) -> Hash (cost=103.56..103.56 rows=2256 width=258) -> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=258) (5 rows) Use of hashes for outer joins was added in Postgres 9.1. Explaining the Postgres Query Optimizer 48 / 61
  • 49. Cross Joins Are Nested Loop Joins without Join Restriction EXPLAIN SELECT sample1.junk FROM sample1 CROSS JOIN sample2; QUERY PLAN ---------------------------------------------------------------------- Nested Loop (cost=0.00..7448.81 rows=586560 width=254) -> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=254) -> Materialize (cost=0.00..13.90 rows=260 width=0) -> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=0) (4 rows) Explaining the Postgres Query Optimizer 49 / 61
  • 50. Create Indexes CREATE INDEX i_sample1 on sample1 (id); CREATE INDEX i_sample2 on sample2 (id); Explaining the Postgres Query Optimizer 50 / 61
  • 51. Nested Loop with Inner Index Scan Now Possible EXPLAIN SELECT sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) WHERE sample1.id = 33; QUERY PLAN --------------------------------------------------------------------------------- Nested Loop (cost=0.00..16.55 rows=1 width=254) -> Index Scan using i_sample1 on sample1 (cost=0.00..8.27 rows=1 width=4) Index Cond: (id = 33::oid) -> Index Scan using i_sample2 on sample2 (cost=0.00..8.27 rows=1 width=258) Index Cond: (sample2.id = 33::oid) (5 rows) Explaining the Postgres Query Optimizer 51 / 61
  • 52. Nested Loop Join with Inner Index Scan aag aar aai aay aag aas aar aaa aay aai aag No Setup Required Index Lookup Index Must Already Exist Outer Inner Explaining the Postgres Query Optimizer 52 / 61
  • 53. Pseudocode for Nested Loop Join with Inner Index Scan for (i = 0; i < length(outer); i++) index_entry = get_first_match(outer[j]) while (index_entry) output(outer[i], inner[index_entry]); index_entry = get_next_match(index_entry); Explaining the Postgres Query Optimizer 53 / 61
  • 54. Query Restrictions Affect Join Usage EXPLAIN SELECT sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) WHERE sample2.junk ˜ ’^aaa’; QUERY PLAN ------------------------------------------------------------------------------- Nested Loop (cost=0.00..21.53 rows=1 width=254) -> Seq Scan on sample2 (cost=0.00..13.25 rows=1 width=258) Filter: (junk ˜ ’^aaa’::text) -> Index Scan using i_sample1 on sample1 (cost=0.00..8.27 rows=1 width=4) Index Cond: (sample1.id = sample2.id) (5 rows) No junk rows begin with ’aaa’. Explaining the Postgres Query Optimizer 54 / 61
  • 55. All ’junk’ Columns Begin with ’xxx’ EXPLAIN SELECT sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) WHERE sample2.junk ˜ ’^xxx’; QUERY PLAN ------------------------------------------------------------------------ Hash Join (cost=16.50..131.12 rows=260 width=254) Hash Cond: (sample1.id = sample2.id) -> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4) -> Hash (cost=13.25..13.25 rows=260 width=258) -> Seq Scan on sample2 (cost=0.00..13.25 rows=260 width=258) Filter: (junk ˜ ’^xxx’::text) (6 rows) Hash join was chosen because many more rows are expected. The smaller table, e.g. sample2, is always hashed. Explaining the Postgres Query Optimizer 55 / 61
  • 56. Without LIMIT, Hash Is Used for this Unrestricted Join EXPLAIN SELECT sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id); QUERY PLAN ------------------------------------------------------------------------ Hash Join (cost=15.85..130.47 rows=260 width=254) Hash Cond: (sample1.id = sample2.id) -> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4) -> Hash (cost=12.60..12.60 rows=260 width=258) -> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=258) (5 rows) Explaining the Postgres Query Optimizer 56 / 61
  • 57. LIMIT Can Affect Join Usage EXPLAIN SELECT sample2.id, sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) ORDER BY 1 LIMIT 1; QUERY PLAN ------------------------------------------------------------------------------------------ Limit (cost=0.00..1.83 rows=1 width=258) -> Nested Loop (cost=0.00..477.02 rows=260 width=258) -> Index Scan using i_sample2 on sample2 (cost=0.00..52.15 rows=260 width=258) -> Index Scan using i_sample1 on sample1 (cost=0.00..1.62 rows=1 width=4) Index Cond: (sample1.id = sample2.id) (5 rows) Explaining the Postgres Query Optimizer 57 / 61
  • 58. LIMIT 10 EXPLAIN SELECT sample2.id, sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) ORDER BY 1 LIMIT 10; QUERY PLAN ------------------------------------------------------------------------------------------ Limit (cost=0.00..18.35 rows=10 width=258) -> Nested Loop (cost=0.00..477.02 rows=260 width=258) -> Index Scan using i_sample2 on sample2 (cost=0.00..52.15 rows=260 width=258) -> Index Scan using i_sample1 on sample1 (cost=0.00..1.62 rows=1 width=4) Index Cond: (sample1.id = sample2.id) (5 rows) Explaining the Postgres Query Optimizer 58 / 61
  • 59. LIMIT 100 Switches to Hash Join EXPLAIN SELECT sample2.id, sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) ORDER BY 1 LIMIT 100; QUERY PLAN ------------------------------------------------------------------------------------ Limit (cost=140.41..140.66 rows=100 width=258) -> Sort (cost=140.41..141.06 rows=260 width=258) Sort Key: sample2.id -> Hash Join (cost=15.85..130.47 rows=260 width=258) Hash Cond: (sample1.id = sample2.id) -> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4) -> Hash (cost=12.60..12.60 rows=260 width=258) -> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=258) (8 rows) Explaining the Postgres Query Optimizer 59 / 61
  • 60. Additional Resources… ◮ Postgres Downloads: ◮ www.enterprisedb.com/downloads ◮ Product and Services information: ◮ info@enterprisedb.com Explaining the Postgres Query Optimizer 60 / 61