PostgreSQL's is one of the finest database systems available.
The talk will cover the history, the basic concepts of the PostgreSQL's architecture and the how the community behind the "the most advanced open source database" works.
1. The hitchiker’s guide to PostgreSQL
Federico Campoli
Kamedata
Plone Conf, Ferrara, 24 October 2019
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 1 / 55
2. Few words about the speaker
Born in 1972
Passionate about IT since 1982
Joined the Oracle DBA secret society in 2004
In love with PostgreSQL since 2006
PostgreSQL tattoo on the right shoulder
Freelance devops and data engineer
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 2 / 55
3. Kamedata
Whether you need a simple audit a tailored training or support for your
infrastructure,
we can help you to improve.
Devops PostgreSQL Support
Training Audit Migrations
https://kamedata.com
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 3 / 55
4. Table of contents
1 Don’t panic!
2 The pan galactic gargle blaster
3 The Ravenous Bugblatter Beast of Traal
4 Time is an illusion. Lunchtime doubly so
5 Mostly harmless
6 Wrap up
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 4 / 55
5. Don’t panic!
Copyright by Kreg Steppe - https://www.flickr.com/photos/spyndle/
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 5 / 55
6. PostgreSQL, history
PostgreSQL is an open source RDBMS with a BSD-like or MIT-like license.
Started by Professor M. Stonebraker at Berkeley
Follow up of Ingres hence the chosen name was POSTinGRES
Managed by the PostgreSQL Global Development Group (PGDG)
A new release each year, latest version is 12
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 6 / 55
7. Versioning and support policy
Version Supported First release date End of life date
12 Yes 3 October 2019 14 November 2024
11 Yes 18 October 2018 9 November 2023
10 Yes 5 October 2017 10 November 2022
9.6 Yes 29 September 2016 11 November 2021
9.5 Yes 7 January 2016 11 February 2021
9.4 EOL 18 December 2014 13 February 2020
9.3 No 9 September 2013 8 November 2018
Table: End Of Life (EOL) dates
The PostgreSQL project aims to fully support a major release for five
years. The policy is applied on a best-effort basis.
The complete list of versions is available at
https://www.postgresql.org/support/versioning/
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 7 / 55
8. Minor version releases
Every year there are at least four minor releases for each supported version.
The target date for these releases are, unless otherwise stated, the second
Thursday of February, May, August, and November.
The current schedule for upcoming releases is:
7th November 2019
13th February 2020
14th May 2020
13th August 2020
The up to date release schedule is available here:
https://www.postgresql.org/developer/roadmap/
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 8 / 55
9. The pan galactic gargle blaster
Copyright by Federico Campoli
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 9 / 55
10. PostgreSQL, features
ACID compliant
Highly scalable with MVCC
Tablespaces
Runs on almost any unix...
And on *cough* MS Windows *cough*
Foreign tables
Procedural languages (pl/pgsql, pl/python, pl/perl...)
Supports for NOSQL data type (HSTORE, JSON)
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 10 / 55
11. PostgreSQL, features
Development
Licence PostgreSQL, BSD-like
Developed in C
Programmable interface with external libraries
Extension system
Partitioning
Parallel execution
Logical decoding
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 11 / 55
12. PostgreSQL 12 new features
Partitioning improvements, thousands of partitions managed.
Btree improvements, size, deduplication, vacuum efficiency
Statistics, most-common-value statistics for multiple columns
CTE inline, no materialisation slowness when using WITH
Prepared plan control, it’s possible to determine whether to use the
generic or custom execution plan
Just-in-time Compilation, rewrites the execution plan for complex
queries. Designed to improve analytics queries.
Checksum Control, allow to enable/disable the checksums while the
cluster is offline
REINDEX CONCURRENTLY, allows REINDEX with minimal locking
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 12 / 55
13. Community
There are several ways to interact and get help from the PostgreSQL
community.
Mailing lists organised by topics: https://www.postgresql.org/list/
IRC on freenode: channel #postgresql
Slack: https://postgres-slack.herokuapp.com/
Telegram: https://t.me/pg sql
The community is informal, friendly and respectful.
There is a Code of Conduct to follow
https://www.postgresql.org/about/policies/coc/
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 13 / 55
14. Some cool PostgreSQL features
As there’s no much time we’ll have a quick look to some of the cool
features included in PostgreSQL
JSON and JSONB
Transaction snapshot exports
Table inheritance and partitioning
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 14 / 55
15. The Ravenous Bugblatter Beast of Traal
Copyright by Federico Campoli
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 15 / 55
16. A multi flavour RDBMS
Alongside with the general purpose data types PostgreSQL have some
exotic types.
Range types
Geometric types
Network address types
XML type
JSON type
HSTORE type (extension)
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 16 / 55
17. JSON
JSON
JSON is the JavaScript Object Notation.
Supported as datatype in PostgreSQL with two flavours
JSON The data is stored as text, parsed and validated when the data
is accessed
JSONB parsed and converted in binary representation during the
insert and update
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 17 / 55
18. JSON
JSON - Examples
From record to JSON
postgres =# SELECT row_to_json(ROW(1,’foo ’));
row_to_json
-- -------------------
{"f1":1,"f2":"foo"}
(1 row)
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 18 / 55
19. JSON
Expanding JSON into key to value elements
postgres =# SELECT * from json_each(’{"a":" foo", "b":" bar "}’);
key | value
-- ---+-------
a | "foo"
b | "bar"
(2 rows)
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 19 / 55
20. JSONB
Things to consider when using JSON
JSON is parsed and validated on the fly
JSONB is parsed, validated and transformed during
INSERT/UPDATE
JSONB allows a single index on the field. All the keys are indexed
automatically.
If you don’t use the json functions in PostgreSQL then just use text
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 20 / 55
21. Some numbers
Let’s create three tables with text,json and jsonb type fields.
Each record contains the same json element generated on
https://www.json-generator.com/
[
{
"_id": "5da82999b1c11e2a5a3cd35e",
"index": 0,
"guid": "4320e8ee-e28f-435e-8bce-975e75b23f79",
"isActive": true,
"balance": "$2,211.00",
"picture": "http://placehold.it/32x32",
"age": 34,
"eyeColor": "blue",
"name": "Wall Shaw",
...
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 21 / 55
22. Some numbers
DROP TABLE IF EXISTS t_json ;
DROP TABLE IF EXISTS t_jsonb ;
DROP TABLE IF EXISTS t_text ;
timing
CREATE TABLE t_json as
SELECT
’<JSON ELEMENT >’:: json as js_value
FROM
generate_series (1 ,100000);
SELECT 100000
Time: 27350.758 ms (00:27.351)
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 22 / 55
23. Some numbers
CREATE TABLE t_text as
SELECT
’<JSON ELEMENT >’:: text as t_value
FROM
generate_series (1 ,100000);
SELECT 100000
Time: 29713.060 ms (00:29.713)
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 23 / 55
24. Some numbers
CREATE TABLE t_jsonb as
SELECT
’<JSON ELEMENT >’:: jsonb as jsb_value
FROM
generate_series (1 ,100000);
SELECT 100000
Time: 27109.003 ms (00:27.109)
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 24 / 55
25. Table size
SELECT
pg_size_pretty ( pg_total_relation_size (oid)),
relname
FROM
pg_class
WHERE
relname LIKE ’t_%’
;
pg_size_pretty | relname
-- --------------+---------
270 MB | t_json
322 MB | t_jsonb
270 MB | t_text
(3 rows)
JSONB use more space than the json and text1
1
Sizing may vary depending on the json size
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 25 / 55
26. Sequential scans
TEXT
EXPLAIN (BUFFERS , ANALYZE) SELECT * FROM t_text;
Seq Scan on t_text (cost =0.00..1637.00 rows =100000 width =18) (
actual time =0.013..11.533 rows =100000 loops =1)
Buffers: shared hit =637
Planning Time: 0.041 ms
Execution Time: 18.359 ms
(4 rows)
Time: 18.796 ms
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 26 / 55
27. Sequential scans
JSON
EXPLAIN (BUFFERS , ANALYZE) SELECT * FROM t_json;
Seq Scan on t_json (cost =0.00..1637.00 rows =100000 width =18) (
actual time =0.043..14.880 rows =100000 loops =1)
Buffers: shared read =637
Planning Time: 0.100 ms
Execution Time: 21.310 ms
(4 rows)
Time: 21.747 ms
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 27 / 55
28. Sequential scans
JSONB
EXPLAIN (BUFFERS , ANALYZE) SELECT * FROM t_jsonb;
Seq Scan on t_jsonb (cost =0.00..1637.00 rows =100000 width =18) (
actual time =0.060..12.409 rows =100000 loops =1)
Buffers: shared hit =91 read =546
Planning Time: 0.090 ms
Execution Time: 18.471 ms
(4 rows)
Time: 18.852 ms
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 28 / 55
29. Sequential scan with cast to json
TEXT
EXPLAIN (BUFFERS , ANALYZE) SELECT t_value ::json ->0->’index ’ FROM
t_text;
Seq Scan on t_text (cost =0.00..2387.00 rows =100000 width =32) (
actual time =0.260..5301.080 rows =100000 loops =1)
Buffers: shared hit =366212 read =34426
Planning Time: 0.311 ms
Execution Time: 5309.246 ms
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 29 / 55
30. Sequential scan with native json
JSON
EXPLAIN (BUFFERS , ANALYZE) SELECT js_value ->0->’index ’ FROM t_json
;
Seq Scan on t_json (cost =0.00..1887.00 rows =100000 width =32) (
actual time =0.234..4145.148 rows =100000 loops =1)
Buffers: shared hit =366116 read =34522
Planning Time: 0.043 ms
Execution Time: 4152.678 ms
(4 rows)
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 30 / 55
31. Sequential scan with jsonb
JSONB
EXPLAIN (BUFFERS , ANALYZE) SELECT jsb_value ->0->’index ’ FROM
t_jsonb;
Seq Scan on t_jsonb (cost =0.00..1887.00 rows =100000 width =32) (
actual time =0.154..1236.189 rows =100000 loops =1)
Buffers: shared hit =379450 read =41188
Planning Time: 0.048 ms
Execution Time: 1241.749 ms
(4 rows)
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 31 / 55
32. Jsonpath
PostgreSQL 12 adds the SQL specifications by introducing SQL/JSON
path language.
EXPLAIN (BUFFERS , ANALYZE)
SELECT
jsonb_path_query (jsb_value ,’$[0]. index ’)
FROM
t_jsonb;
ProjectSet (cost =0.00..502387.00 rows =100000000 width =32) (actual
time =0.116..1264.445 rows =100000 loops =1)
Buffers: shared hit =420637
-> Seq Scan on t_jsonb (cost =0.00..1637.00 rows =100000 width
=18) (actual time =0.013..8.009 rows =100000 loops =1)
Buffers: shared hit =637
Planning Time: 0.058 ms
Execution Time: 1269.603 ms
(6 rows)
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 32 / 55
33. Time is an illusion. Lunchtime doubly so
Copyright by Federico Campoli
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 33 / 55
34. Transaction’s snapshots
Depending from the isolation level, when a query or a transaction access
the database for the first time acquires a snapshot which includes the
transactions that have been committed at that moment.
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 34 / 55
35. Transaction’s snapshots
In PostgreSQL a transaction’s snapshot can be exported to other
sessions
A session which import an exported snapshot sees the same data of
the exporting session
Useful when multiple sessions need to see a consistent data set frozen
in time (e.g. parallel execution)
pg dump use this feature to backup the database in parallel
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 35 / 55
36. Example
Let’s create a table
CREATE TABLE t_data
(
i_id serial ,
t_content text
);
ALTER TABLE t_data
ADD CONSTRAINT pk_t_data PRIMARY KEY (i_id);
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 36 / 55
37. Example
INSERT INTO t_data
(
t_content
)
SELECT
md5(i_counter :: text)
FROM
(
SELECT
i_counter
FROM
generate_series (1 ,200) as i_counter
) t_series;
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 37 / 55
38. Exporting a snapshot
postgres =# BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN
postgres =# SELECT pg_export_snapshot ();
pg_export_snapshot
-- ------------------
00000004 -0000005A-1
(1 row)
postgres =# SELECT count (*) FROM t_data;
count
-------
200
(1 row)
Isolation levels explained here:
https://pgdba.org/post/2019/10/transactions/
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 38 / 55
39. Importing a snapshot
In a different session we remove all the table’s rows
postgres =# DELETE FROM t_data;
DELETE 200
postgres =# SELECT count (*) FROM t_data;
count
-------
0
(1 row)
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 39 / 55
40. Example
If we import the snapshot we’ll see the rows back again.
postgres =# BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN
postgres =# SET TRANSACTION SNAPSHOT ’00000004 -0000005A-1’;
SET
postgres =# SELECT count (*) FROM t_data;
count
-------
200
(1 row)
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 40 / 55
41. Mostly harmless
Copyright by Federico Campoli
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 41 / 55
42. Table inheritance
Before PostgreSQL 10 the partitioning were possible through the table
inheritance.
The table inheritance is a logical relationship between a parent table and
its child tables.
A child table inherits parent’s attributes. However the physical storage is
not shared.
Multiple level of inheritance are allowed.
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 42 / 55
43. Table inheritance
As physical storage is not shared between inherited tables, the unique
constraints can’t be globally enforced on the inheritance tree preventing
the creation of any global foreign key.
There is no built in mechanism for distributing the data across the child
tables.
Query optimisation is made via CHECK constraint exclusion which need to
be created and maintained.
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 43 / 55
44. Table partitioning
PostgreSQL 12 have the native support for table partitioning.
Global foreign keys are allowed PostgreSQL to and from the partitioned
table.
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 44 / 55
45. Table partitioning
PostgreSQL supports range, list and hash partitioning.
Range Partitioning: The table is partitioned into “ranges” defined by
a key column or set of columns, with no overlap between the ranges
of values assigned to different partitions.
List Partitioning: The table is partitioned by explicitly listing which
key values appear in each partition.
Hash Partitioning: The table is partitioned by specifying a modulus
and a remainder for each partition. Each partition will hold the rows
for which the hash value of the partition key divided by the specified
modulus will produce the specified remainder.
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 45 / 55
46. Partitioning
Define a table as partitioned
CREATE TABLE t_rng_part
(
id integer NOT NULL ,
value text ,
CONSTRAINT pk_t_rng_part primary key (id)
)
PARTITION BY RANGE (id) ;
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 46 / 55
47. Partitioning
Add the range partitions.
The range’s upper bound is not inclusive.
CREATE TABLE t_rng_part_0001_1000
PARTITION OF t_rng_part
FOR VALUES FROM (’0’) TO (’1000 ’);
CREATE TABLE t_rng_part_1000_2000
PARTITION OF t_rng_part
FOR VALUES FROM (’1000 ’) TO (’2000 ’);
CREATE TABLE t_rng_part_2000_3000
PARTITION OF t_rng_part
FOR VALUES FROM (’2000 ’) TO (’3000 ’);
CREATE TABLE t_rng_part_default
PARTITION OF t_rng_part
DEFAULT;
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 47 / 55
48. Partitioning
Insert data
INSERT INTO t_rng_part
(id)
SELECT
generate_series (1 ,50000)
;
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 48 / 55
49. Partitioning
Table size
dt+ t_rng_part*
List of relations
Schema | Name | Type | Owner | Size |
Description
--
------+----------------------+-------+----------+---------+-------
public | t_rng_part | table | postgres | 0 bytes |
public | t_rng_part_0001_1000 | table | postgres | 72 kB |
public | t_rng_part_1000_2000 | table | postgres | 72 kB |
public | t_rng_part_2000_3000 | table | postgres | 72 kB |
public | t_rng_part_default | table | postgres | 1696 kB |
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 49 / 55
50. Partitioning
PostgreSQL’s declarative partitioning relies on table inheritance system
which rules still apply. However there are some differences.
CHECK and NOT NULL constraints on a partitioned table are always
inherited by all the partitions. CHECK constraints marked as NO
INHERIT cannot be created on the partitioned tables.
A partitioned table does not have any data directly therefore using
TRUNCATE ONLY on a partitioned table will always generate an
error.
Differently from the table inheritance partitions cannot have columns
that are not present in the parent.
Is not possible to drop the NOT NULL constraint on a partition’s
column if the constraint is present in the parent table.
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 50 / 55
51. Wrap up
Copyright by Federico Campoli
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 51 / 55
52. License
This document is distributed under the terms of the Creative Commons
Attribution, Not Commercial, Share Alike
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 52 / 55
53. Contacts
Blog: https://pgdba.org
Twitter: @4thdoctor scarf
Github: https://github.com/the4thdoctor
Linkedin: https://www.linkedin.com/in/federicocampoli/
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 53 / 55
54. That’s all folks!
Thank you for listening!
Any questions?
Copyright by dan232323 http://dan232323.deviantart.com/art/Pinkie-Pie-Thats-All-Folks-454693000
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 54 / 55
55. The hitchiker’s guide to PostgreSQL
Federico Campoli
Kamedata
Plone Conf, Ferrara, 24 October 2019
Federico Campoli (Kamedata) The hitchiker’s guide to PostgreSQLPlone Conf, Ferrara, 24 October 2019 55 / 55