SlideShare a Scribd company logo
1 of 31
Download to read offline
The future is CSN
Alexander Korotkov
Postgres Professional
Alexander Korotkov The future is CSN 1 / 31
Russian developers of PostgreSQL:
Alexander Korotkov, Teodor Sigaev, Oleg Bartunov
▶ Speakers at PGCon, PGConf: 20+ talks
▶ GSoC mentors
▶ PostgreSQL commi ers (1+1 in progress)
▶ Conference organizers
▶ 50+ years of PostgreSQL expertship:
development, audit, consul ng
▶ Postgres Professional co-founders
PostgreSQL CORE
▶ Locale support
▶ PostgreSQL extendability:
GiST(KNN), GIN, SP-GiST
▶ Full Text Search (FTS)
▶ NoSQL (hstore, jsonb)
▶ Indexed regexp search
▶ Create AM & Generic WAL
▶ Table engines (WIP)
Extensions
▶ intarray
▶ pg_trgm
▶ ltree
▶ hstore
▶ plantuner
▶ jsquery
▶ RUM
Alexander Korotkov The future is CSN 2 / 31
Why do we need snapshots? (1/5)
Alexander Korotkov The future is CSN 3 / 31
Why do we need snapshots? (2/5)
Alexander Korotkov The future is CSN 4 / 31
Why do we need snapshots? (3/5)
Alexander Korotkov The future is CSN 5 / 31
Why do we need snapshots? (4/5)
Alexander Korotkov The future is CSN 6 / 31
Why do we need snapshots? (5/5)
Alexander Korotkov The future is CSN 7 / 31
Exis ng MVCC snapshots
Alexander Korotkov The future is CSN 8 / 31
How do snapshots work?
▶ Array of ac ve transac on ids is stored in shared
memory.
▶ GetSnapshotData() scans all the ac ve xids while
holding shared ProcArrayLock.
▶ Assigning of new xid doesn’t require ProcArrayLock.
▶ Clearing ac ve xid requires exclusive ProcArrayLock.
▶ 9.6 comes with “group clear xid” op miza on.
Mul ple xids of finished transac ons could be
cleared using single exclusive ProcArrayLock.
Alexander Korotkov The future is CSN 9 / 31
Problem with of snapshots
▶ Nowadays mul -core systems running can run
thousands of backends simultaneously. For short
queries GetSnapshotData() becomes just CPU
expensive.
▶ LWLock subsystem is just not designed for high
concurrency. In par cular, exclusive lock waits could
have infinite starva on. Therefore, it’s impossible to
connect while there is high flow of short readonly
queries.
▶ In the mixed read-write workload, ProcArrayLock
could become a bo leneck.
Alexander Korotkov The future is CSN 10 / 31
Commit sequence number (CSN) snapshots
Alexander Korotkov The future is CSN 11 / 31
CSN development history
▶ Jun 7, 2013 – proposal by Ants Aasma
▶ May 30, 2014 – first path by Heikki Linnakangas
▶ PGCon 2015 – talk by Dilip Kumar (no patch
published)
▶ Aug, 2016 – Heikki returned to this work
Alexander Korotkov The future is CSN 12 / 31
CSN snapshots proper es
Pro:
▶ Taking snapshots is cheaper. It’s even possible to
make it lockless.
▶ CSN snapshots are more friendly to distributed
systems. Distributed visibility techniques like
incremental snapshots or Clock-SI assumes that
snapshot is represented by single number.
Cons:
▶ Have to map XID ⇒ CSN while visibility check.
Alexander Korotkov The future is CSN 13 / 31
XID ⇒ CSN map worst case
1 M rows table, xmin is random in 10 M transac ons
version first scan, ms next scans, ms
master 2500 50
csn 4900 4900
Without CSN we have to lookup CLOG only during first
scan of the table. During first scan hint bits are set.
Second and subsequent scans use hint bits and don’t
lookup CLOG.
Alexander Korotkov The future is CSN 14 / 31
Could we hint XID ⇒ CSN map as well?
▶ In general, it’s possible. We could rewrite XID of
commi ed transac on into its CSN.
▶ Xmin and xmax are 32-bit. Usage of 32-bit CSN is
undesirable. We already have xid and mul xact
wraparounds. Yet another CSN wraparound would
be discouraging.
▶ Se ng hint bits is not WAL-logged. We need to
preserve this property.
Alexander Korotkov The future is CSN 15 / 31
Make both XID and CSN 64-bit
▶ Add 64-bit xid_epoch, mul xact_epoch and
csn_epoch to page header.
▶ Allocate high bit of xmin and xmax for CSN flag.
▶ Actual xid or csn stored in xmin or xmax should be
found as corresponding epoch plus xmin or xmax.
▶ We s ll can address 231
xids from xmin and xmax as
we did before.
▶ Wraparound is possible only inside single page. And
it could be resolved by single page freeze.
Alexander Korotkov The future is CSN 16 / 31
CSN-rewrite patch
▶ Use 64-bit XID and CSN as described before.
▶ Rewrite XID to CSN instead of se ng “commi ed”
hint bit.
▶ Lockless snapshot taking.
▶ WIP, not published yet.
Alexander Korotkov The future is CSN 17 / 31
XID ⇒ CSN map worst case
1 M rows table, xmin is random in 10 M transac ons
version first scan, ms next scans, ms
master 2500 50
csn 4900 4900
csn-rewrite 4900 50
Subsequent scans of table is as cheap as it was before.
First scan s ll have a room for op miza on.
Alexander Korotkov The future is CSN 18 / 31
Benchmarks
Alexander Korotkov The future is CSN 19 / 31
Taking snapshots (SELECT 1)
0 50 100 150 200 250
# Clients
0
500000
1000000
1500000
2000000
2500000
3000000
3500000
4000000
TPS
pgbench -s 1000 -j $n -c $n -M prepared -f sel1.sql on 4 x 18 cores Intel Xeon E7-8890 processors
median of 3 2-minute runs with shared_buffers = 32GB, max_connections = 300
master
xact-align
csn-rewrite
Alexander Korotkov The future is CSN 20 / 31
Read-only benchmark
0 50 100 150 200 250
# Clients
0
500000
1000000
1500000
2000000
TPS
pgbench -s 1000 -j $n -c $n -M prepared -S on 4 x 18 cores Intel Xeon E7-8890 processors
median of 3 2-minute runs with shared_buffers = 32GB, max_connections = 300
master
xact-align
csn-rewrite
Alexander Korotkov The future is CSN 21 / 31
Read-write benchmark
0 50 100 150 200 250
# Clients
0
20000
40000
60000
80000
100000
120000
140000
160000
TPS
pgbench -s 1000 -j $n -c $n -M prepared on 4 x 18 cores Intel Xeon E7-8890 processors
median of 3 2-minute runs with shared_buffers = 32GB, max_connections = 300
master
xact-align
csn-rewrite
Alexander Korotkov The future is CSN 22 / 31
Random: 78% read queries, 22% write
queries
0 50 100 150 200 250
# Clients
0
200000
400000
600000
800000
1000000
TPS
pgbench -s 1000 -j $n -c $n -M prepared -b select-only@9 -b tpcb-like@1
on 4 x 18 cores Intel Xeon E7-8890 processors
median of 3 2-minute runs with shared_buffers = 32GB, max_connections = 300
master
xact-align
csn-rewrite
Alexander Korotkov The future is CSN 23 / 31
Custom script with extra 20 read queries
set naccounts 100000 * :scale
set aid1 random(1, :naccounts)
...........................................................
set aid20 random(1, :naccounts)
set aid random(1, 100000 * :scale)
set bid random(1, 1 * :scale)
set tid random(1, 10 * :scale)
set delta random(-5000, 5000)
SELECT abalance FROM pgbench_accounts WHERE aid IN (:aid1);
...........................................................
SELECT abalance FROM pgbench_accounts WHERE aid IN (:aid20);
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :d
END;
Alexander Korotkov The future is CSN 24 / 31
Custom script with extra 20 read queries
0 50 100 150 200 250
# Clients
0
10000
20000
30000
40000
50000
60000
70000
TPS
pgbench -s 1000 -j $n -c $n -M prepared -f rrw.sql on 4 x 18 cores Intel Xeon E7-8890 processors
median of 3 2-minute runs with shared_buffers = 32GB, max_connections = 300
master
xact-align
csn-rewrite
Alexander Korotkov The future is CSN 25 / 31
Further PostgreSQL OLTP bo lenecks
Alexander Korotkov The future is CSN 26 / 31
Further PostgreSQL OLTP bo lenecks
▶ Buffer manager – slow hash-table, pin, locks etc.
▶ Synchronous protocol.
▶ Executor.
▶ Slow xid alloca on – a lot of locks.
Alexander Korotkov The future is CSN 27 / 31
Further PostgreSQL OLTP bo lenecks in
numbers
▶ SELECT val FROM t WHERE id IN (:id1, ... :id10) –
150K per second = 1.5M key-value pairs per second, no gain.
Bo leneck in buffer manager.
▶ SELECT 1 with CSN-rewrite patch – 3.9M queries per second.
Protocol and executor are bo lenecks.
▶ SELECT txid_current() – 390K per second. Bo leneck in
locks.
Alexander Korotkov The future is CSN 28 / 31
How can we improve PostgreSQL OLTP?
▶ True in-memory engine without buffer manager.
▶ Asynchronous binary protocol for processing more
short queries.
▶ Executor improvements including JIT-compila on.
▶ Lockless xid alloca on.
Alexander Korotkov The future is CSN 29 / 31
Conclusion
▶ Despite all the micro-op miza ons made, our
snapshot model could be a bo leneck on modern
mul core systems. And it would be even worse
bo leneck on future systems.
▶ CSN is the way to remove this bo leneck. It also
more friendly to distributed systems.
▶ It’s possible to minimize XID ⇒ CSN map in the
same way we minimize CLOG accesses.
Alexander Korotkov The future is CSN 30 / 31
Thank you for a en on!
Alexander Korotkov The future is CSN 31 / 31

More Related Content

What's hot

Memory Barriers in the Linux Kernel
Memory Barriers in the Linux KernelMemory Barriers in the Linux Kernel
Memory Barriers in the Linux KernelDavidlohr Bueso
 
pg / shardman: шардинг в PostgreSQL на основе postgres / fdw, pg / pathman и ...
pg / shardman: шардинг в PostgreSQL на основе postgres / fdw, pg / pathman и ...pg / shardman: шардинг в PostgreSQL на основе postgres / fdw, pg / pathman и ...
pg / shardman: шардинг в PostgreSQL на основе postgres / fdw, pg / pathman и ...Ontico
 
Shenandoah GC: Java Without The Garbage Collection Hiccups (Christine Flood)
Shenandoah GC: Java Without The Garbage Collection Hiccups (Christine Flood)Shenandoah GC: Java Without The Garbage Collection Hiccups (Christine Flood)
Shenandoah GC: Java Without The Garbage Collection Hiccups (Christine Flood)Red Hat Developers
 
Range reader/writer locking for the Linux kernel
Range reader/writer locking for the Linux kernelRange reader/writer locking for the Linux kernel
Range reader/writer locking for the Linux kernelDavidlohr Bueso
 
Managing PostgreSQL with PgCenter
Managing PostgreSQL with PgCenterManaging PostgreSQL with PgCenter
Managing PostgreSQL with PgCenterAlexey Lesovsky
 
Am I reading GC logs Correctly?
Am I reading GC logs Correctly?Am I reading GC logs Correctly?
Am I reading GC logs Correctly?Tier1 App
 
Streaming huge databases using logical decoding
Streaming huge databases using logical decodingStreaming huge databases using logical decoding
Streaming huge databases using logical decodingAlexander Shulgin
 
PGConf APAC 2018 - High performance json postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json  postgre-sql vs. mongodbPGConf APAC 2018 - High performance json  postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json postgre-sql vs. mongodbPGConf APAC
 
Percona XtraDB 集群安装与配置
Percona XtraDB 集群安装与配置Percona XtraDB 集群安装与配置
Percona XtraDB 集群安装与配置YUCHENG HU
 
FOSDEM2015: Live migration for containers is around the corner
FOSDEM2015: Live migration for containers is around the cornerFOSDEM2015: Live migration for containers is around the corner
FOSDEM2015: Live migration for containers is around the cornerAndrey Vagin
 
grsecurity and PaX
grsecurity and PaXgrsecurity and PaX
grsecurity and PaXKernel TLV
 
Go Programming Patterns
Go Programming PatternsGo Programming Patterns
Go Programming PatternsHao Chen
 
XtraDB 5.7: key performance algorithms
XtraDB 5.7: key performance algorithmsXtraDB 5.7: key performance algorithms
XtraDB 5.7: key performance algorithmsLaurynas Biveinis
 
How Scylla Make Adding and Removing Nodes Faster and Safer
How Scylla Make Adding and Removing Nodes Faster and SaferHow Scylla Make Adding and Removing Nodes Faster and Safer
How Scylla Make Adding and Removing Nodes Faster and SaferScyllaDB
 
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companionPGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companionPGConf APAC
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Alexey Lesovsky
 
Goroutine stack and local variable allocation in Go
Goroutine stack and local variable allocation in GoGoroutine stack and local variable allocation in Go
Goroutine stack and local variable allocation in GoYu-Shuan Hsieh
 

What's hot (20)

Memory Barriers in the Linux Kernel
Memory Barriers in the Linux KernelMemory Barriers in the Linux Kernel
Memory Barriers in the Linux Kernel
 
pg / shardman: шардинг в PostgreSQL на основе postgres / fdw, pg / pathman и ...
pg / shardman: шардинг в PostgreSQL на основе postgres / fdw, pg / pathman и ...pg / shardman: шардинг в PostgreSQL на основе postgres / fdw, pg / pathman и ...
pg / shardman: шардинг в PostgreSQL на основе postgres / fdw, pg / pathman и ...
 
Shenandoah GC: Java Without The Garbage Collection Hiccups (Christine Flood)
Shenandoah GC: Java Without The Garbage Collection Hiccups (Christine Flood)Shenandoah GC: Java Without The Garbage Collection Hiccups (Christine Flood)
Shenandoah GC: Java Without The Garbage Collection Hiccups (Christine Flood)
 
Range reader/writer locking for the Linux kernel
Range reader/writer locking for the Linux kernelRange reader/writer locking for the Linux kernel
Range reader/writer locking for the Linux kernel
 
Managing PostgreSQL with PgCenter
Managing PostgreSQL with PgCenterManaging PostgreSQL with PgCenter
Managing PostgreSQL with PgCenter
 
Am I reading GC logs Correctly?
Am I reading GC logs Correctly?Am I reading GC logs Correctly?
Am I reading GC logs Correctly?
 
Streaming huge databases using logical decoding
Streaming huge databases using logical decodingStreaming huge databases using logical decoding
Streaming huge databases using logical decoding
 
PGConf APAC 2018 - High performance json postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json  postgre-sql vs. mongodbPGConf APAC 2018 - High performance json  postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json postgre-sql vs. mongodb
 
Percona XtraDB 集群安装与配置
Percona XtraDB 集群安装与配置Percona XtraDB 集群安装与配置
Percona XtraDB 集群安装与配置
 
FOSDEM2015: Live migration for containers is around the corner
FOSDEM2015: Live migration for containers is around the cornerFOSDEM2015: Live migration for containers is around the corner
FOSDEM2015: Live migration for containers is around the corner
 
Pgcenter overview
Pgcenter overviewPgcenter overview
Pgcenter overview
 
grsecurity and PaX
grsecurity and PaXgrsecurity and PaX
grsecurity and PaX
 
Go Programming Patterns
Go Programming PatternsGo Programming Patterns
Go Programming Patterns
 
XtraDB 5.7: key performance algorithms
XtraDB 5.7: key performance algorithmsXtraDB 5.7: key performance algorithms
XtraDB 5.7: key performance algorithms
 
PostgreSQL Replication Tutorial
PostgreSQL Replication TutorialPostgreSQL Replication Tutorial
PostgreSQL Replication Tutorial
 
How Scylla Make Adding and Removing Nodes Faster and Safer
How Scylla Make Adding and Removing Nodes Faster and SaferHow Scylla Make Adding and Removing Nodes Faster and Safer
How Scylla Make Adding and Removing Nodes Faster and Safer
 
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companionPGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.
 
Goroutine stack and local variable allocation in Go
Goroutine stack and local variable allocation in GoGoroutine stack and local variable allocation in Go
Goroutine stack and local variable allocation in Go
 
Move from C to Go
Move from C to GoMove from C to Go
Move from C to Go
 

Viewers also liked

Managing thousands of databases
Managing thousands of databasesManaging thousands of databases
Managing thousands of databasesEmre Hasegeli
 
PGDay.Seoul 2016 lightingtalk
PGDay.Seoul 2016 lightingtalkPGDay.Seoul 2016 lightingtalk
PGDay.Seoul 2016 lightingtalkhyeongchae lee
 
Gbroccolo pgconfeu2016 pgnfs
Gbroccolo pgconfeu2016 pgnfsGbroccolo pgconfeu2016 pgnfs
Gbroccolo pgconfeu2016 pgnfsGiuseppe Broccolo
 
What's New in PostgreSQL 9.6
What's New in PostgreSQL 9.6What's New in PostgreSQL 9.6
What's New in PostgreSQL 9.6EDB
 
談 Uber 從 PostgreSQL 轉用 MySQL 的技術爭議
談 Uber 從 PostgreSQL 轉用 MySQL 的技術爭議談 Uber 從 PostgreSQL 轉用 MySQL 的技術爭議
談 Uber 從 PostgreSQL 轉用 MySQL 的技術爭議Yi-Feng Tzeng
 
淺入淺出 MySQL & PostgreSQL
淺入淺出 MySQL & PostgreSQL淺入淺出 MySQL & PostgreSQL
淺入淺出 MySQL & PostgreSQLYi-Feng Tzeng
 
PostgreSQL performance improvements in 9.5 and 9.6
PostgreSQL performance improvements in 9.5 and 9.6PostgreSQL performance improvements in 9.5 and 9.6
PostgreSQL performance improvements in 9.5 and 9.6Tomas Vondra
 
PostgreSQL 9.6 Performance-Scalability Improvements
PostgreSQL 9.6 Performance-Scalability ImprovementsPostgreSQL 9.6 Performance-Scalability Improvements
PostgreSQL 9.6 Performance-Scalability ImprovementsPGConf APAC
 
恰如其分的 MySQL 設計技巧 [Modern Web 2016]
恰如其分的 MySQL 設計技巧 [Modern Web 2016]恰如其分的 MySQL 設計技巧 [Modern Web 2016]
恰如其分的 MySQL 設計技巧 [Modern Web 2016]Yi-Feng Tzeng
 
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya Kosmodemiansky
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya KosmodemianskyPostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya Kosmodemiansky
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya KosmodemianskyPostgreSQL-Consulting
 
Modern SQL in Open Source and Commercial Databases
Modern SQL in Open Source and Commercial DatabasesModern SQL in Open Source and Commercial Databases
Modern SQL in Open Source and Commercial DatabasesMarkus Winand
 
Portal user guide
Portal user guidePortal user guide
Portal user guidePete_BEM
 
Technet profile
Technet profileTechnet profile
Technet profileBoaz Shani
 
Textayin xmbagrich
Textayin xmbagrichTextayin xmbagrich
Textayin xmbagrichArmine
 
Swimming sports.pptx george
Swimming sports.pptx georgeSwimming sports.pptx george
Swimming sports.pptx georgelesleymccardle
 
Multimedia06
Multimedia06Multimedia06
Multimedia06Les Davy
 
Introduction to CSS3
Introduction to CSS3Introduction to CSS3
Introduction to CSS3hstryk
 

Viewers also liked (20)

Managing thousands of databases
Managing thousands of databasesManaging thousands of databases
Managing thousands of databases
 
PGDay.Seoul 2016 lightingtalk
PGDay.Seoul 2016 lightingtalkPGDay.Seoul 2016 lightingtalk
PGDay.Seoul 2016 lightingtalk
 
Gbroccolo pgconfeu2016 pgnfs
Gbroccolo pgconfeu2016 pgnfsGbroccolo pgconfeu2016 pgnfs
Gbroccolo pgconfeu2016 pgnfs
 
What's New in PostgreSQL 9.6
What's New in PostgreSQL 9.6What's New in PostgreSQL 9.6
What's New in PostgreSQL 9.6
 
談 Uber 從 PostgreSQL 轉用 MySQL 的技術爭議
談 Uber 從 PostgreSQL 轉用 MySQL 的技術爭議談 Uber 從 PostgreSQL 轉用 MySQL 的技術爭議
談 Uber 從 PostgreSQL 轉用 MySQL 的技術爭議
 
淺入淺出 MySQL & PostgreSQL
淺入淺出 MySQL & PostgreSQL淺入淺出 MySQL & PostgreSQL
淺入淺出 MySQL & PostgreSQL
 
PostgreSQL performance improvements in 9.5 and 9.6
PostgreSQL performance improvements in 9.5 and 9.6PostgreSQL performance improvements in 9.5 and 9.6
PostgreSQL performance improvements in 9.5 and 9.6
 
PostgreSQL 9.6 Performance-Scalability Improvements
PostgreSQL 9.6 Performance-Scalability ImprovementsPostgreSQL 9.6 Performance-Scalability Improvements
PostgreSQL 9.6 Performance-Scalability Improvements
 
恰如其分的 MySQL 設計技巧 [Modern Web 2016]
恰如其分的 MySQL 設計技巧 [Modern Web 2016]恰如其分的 MySQL 設計技巧 [Modern Web 2016]
恰如其分的 MySQL 設計技巧 [Modern Web 2016]
 
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya Kosmodemiansky
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya KosmodemianskyPostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya Kosmodemiansky
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya Kosmodemiansky
 
Modern SQL in Open Source and Commercial Databases
Modern SQL in Open Source and Commercial DatabasesModern SQL in Open Source and Commercial Databases
Modern SQL in Open Source and Commercial Databases
 
Life on a_rollercoaster
Life on a_rollercoasterLife on a_rollercoaster
Life on a_rollercoaster
 
Parctica2
Parctica2Parctica2
Parctica2
 
Portal user guide
Portal user guidePortal user guide
Portal user guide
 
Technet profile
Technet profileTechnet profile
Technet profile
 
Textayin xmbagrich
Textayin xmbagrichTextayin xmbagrich
Textayin xmbagrich
 
My sister´s keeper
My sister´s keeperMy sister´s keeper
My sister´s keeper
 
Swimming sports.pptx george
Swimming sports.pptx georgeSwimming sports.pptx george
Swimming sports.pptx george
 
Multimedia06
Multimedia06Multimedia06
Multimedia06
 
Introduction to CSS3
Introduction to CSS3Introduction to CSS3
Introduction to CSS3
 

Similar to The future is CSN

Open Source SQL databases enters millions queries per second era
Open Source SQL databases enters millions queries per second eraOpen Source SQL databases enters millions queries per second era
Open Source SQL databases enters millions queries per second eraSveta Smirnova
 
Is your SQL Exadata-aware?
Is your SQL Exadata-aware?Is your SQL Exadata-aware?
Is your SQL Exadata-aware?Mauro Pagano
 
Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016Graham Wihlidal
 
SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK...
 SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK... SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK...
SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK...Chester Chen
 
Testing Persistent Storage Performance in Kubernetes with Sherlock
Testing Persistent Storage Performance in Kubernetes with SherlockTesting Persistent Storage Performance in Kubernetes with Sherlock
Testing Persistent Storage Performance in Kubernetes with SherlockScyllaDB
 
5035-Pipeline-Optimization-Techniques.pdf
5035-Pipeline-Optimization-Techniques.pdf5035-Pipeline-Optimization-Techniques.pdf
5035-Pipeline-Optimization-Techniques.pdfssmukherjee2013
 
A CGRA-based Approach for Accelerating Convolutional Neural Networks
A CGRA-based Approachfor Accelerating Convolutional Neural NetworksA CGRA-based Approachfor Accelerating Convolutional Neural Networks
A CGRA-based Approach for Accelerating Convolutional Neural NetworksShinya Takamaeda-Y
 
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...Edge AI and Vision Alliance
 
Intro to Machine Learning for GPUs
Intro to Machine Learning for GPUsIntro to Machine Learning for GPUs
Intro to Machine Learning for GPUsSri Ambati
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsKohei KaiGai
 
Kafka Summit SF 2017 - One Day, One Data Hub, 100 Billion Messages: Kafka at ...
Kafka Summit SF 2017 - One Day, One Data Hub, 100 Billion Messages: Kafka at ...Kafka Summit SF 2017 - One Day, One Data Hub, 100 Billion Messages: Kafka at ...
Kafka Summit SF 2017 - One Day, One Data Hub, 100 Billion Messages: Kafka at ...confluent
 
GTC Taiwan 2017 GPU 平台上導入深度學習於半導體產業之 EDA 應用
GTC Taiwan 2017 GPU 平台上導入深度學習於半導體產業之 EDA 應用GTC Taiwan 2017 GPU 平台上導入深度學習於半導體產業之 EDA 應用
GTC Taiwan 2017 GPU 平台上導入深度學習於半導體產業之 EDA 應用NVIDIA Taiwan
 
Accelerating HPC Applications on NVIDIA GPUs with OpenACC
Accelerating HPC Applications on NVIDIA GPUs with OpenACCAccelerating HPC Applications on NVIDIA GPUs with OpenACC
Accelerating HPC Applications on NVIDIA GPUs with OpenACCinside-BigData.com
 
Time Sensitive Networking in the Linux Kernel
Time Sensitive Networking in the Linux KernelTime Sensitive Networking in the Linux Kernel
Time Sensitive Networking in the Linux Kernelhenrikau
 
11 Synchoricity as the basis for going Beyond Moore
11 Synchoricity as the basis for going Beyond Moore11 Synchoricity as the basis for going Beyond Moore
11 Synchoricity as the basis for going Beyond MooreRCCSRENKEI
 
C++ CoreHard Autumn 2018. Знай свое "железо": иерархия памяти - Александр Титов
C++ CoreHard Autumn 2018. Знай свое "железо": иерархия памяти - Александр ТитовC++ CoreHard Autumn 2018. Знай свое "железо": иерархия памяти - Александр Титов
C++ CoreHard Autumn 2018. Знай свое "железо": иерархия памяти - Александр Титовcorehard_by
 
Speedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql LoaderSpeedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql LoaderGregSmith458515
 
Adaptive Linear Solvers and Eigensolvers
Adaptive Linear Solvers and EigensolversAdaptive Linear Solvers and Eigensolvers
Adaptive Linear Solvers and Eigensolversinside-BigData.com
 
Security Monitoring with eBPF
Security Monitoring with eBPFSecurity Monitoring with eBPF
Security Monitoring with eBPFAlex Maestretti
 

Similar to The future is CSN (20)

Open Source SQL databases enters millions queries per second era
Open Source SQL databases enters millions queries per second eraOpen Source SQL databases enters millions queries per second era
Open Source SQL databases enters millions queries per second era
 
Is your SQL Exadata-aware?
Is your SQL Exadata-aware?Is your SQL Exadata-aware?
Is your SQL Exadata-aware?
 
Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016
 
SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK...
 SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK... SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK...
SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK...
 
Testing Persistent Storage Performance in Kubernetes with Sherlock
Testing Persistent Storage Performance in Kubernetes with SherlockTesting Persistent Storage Performance in Kubernetes with Sherlock
Testing Persistent Storage Performance in Kubernetes with Sherlock
 
Ixgbe internals
Ixgbe internalsIxgbe internals
Ixgbe internals
 
5035-Pipeline-Optimization-Techniques.pdf
5035-Pipeline-Optimization-Techniques.pdf5035-Pipeline-Optimization-Techniques.pdf
5035-Pipeline-Optimization-Techniques.pdf
 
A CGRA-based Approach for Accelerating Convolutional Neural Networks
A CGRA-based Approachfor Accelerating Convolutional Neural NetworksA CGRA-based Approachfor Accelerating Convolutional Neural Networks
A CGRA-based Approach for Accelerating Convolutional Neural Networks
 
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
 
Intro to Machine Learning for GPUs
Intro to Machine Learning for GPUsIntro to Machine Learning for GPUs
Intro to Machine Learning for GPUs
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
 
Kafka Summit SF 2017 - One Day, One Data Hub, 100 Billion Messages: Kafka at ...
Kafka Summit SF 2017 - One Day, One Data Hub, 100 Billion Messages: Kafka at ...Kafka Summit SF 2017 - One Day, One Data Hub, 100 Billion Messages: Kafka at ...
Kafka Summit SF 2017 - One Day, One Data Hub, 100 Billion Messages: Kafka at ...
 
GTC Taiwan 2017 GPU 平台上導入深度學習於半導體產業之 EDA 應用
GTC Taiwan 2017 GPU 平台上導入深度學習於半導體產業之 EDA 應用GTC Taiwan 2017 GPU 平台上導入深度學習於半導體產業之 EDA 應用
GTC Taiwan 2017 GPU 平台上導入深度學習於半導體產業之 EDA 應用
 
Accelerating HPC Applications on NVIDIA GPUs with OpenACC
Accelerating HPC Applications on NVIDIA GPUs with OpenACCAccelerating HPC Applications on NVIDIA GPUs with OpenACC
Accelerating HPC Applications on NVIDIA GPUs with OpenACC
 
Time Sensitive Networking in the Linux Kernel
Time Sensitive Networking in the Linux KernelTime Sensitive Networking in the Linux Kernel
Time Sensitive Networking in the Linux Kernel
 
11 Synchoricity as the basis for going Beyond Moore
11 Synchoricity as the basis for going Beyond Moore11 Synchoricity as the basis for going Beyond Moore
11 Synchoricity as the basis for going Beyond Moore
 
C++ CoreHard Autumn 2018. Знай свое "железо": иерархия памяти - Александр Титов
C++ CoreHard Autumn 2018. Знай свое "железо": иерархия памяти - Александр ТитовC++ CoreHard Autumn 2018. Знай свое "железо": иерархия памяти - Александр Титов
C++ CoreHard Autumn 2018. Знай свое "железо": иерархия памяти - Александр Титов
 
Speedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql LoaderSpeedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql Loader
 
Adaptive Linear Solvers and Eigensolvers
Adaptive Linear Solvers and EigensolversAdaptive Linear Solvers and Eigensolvers
Adaptive Linear Solvers and Eigensolvers
 
Security Monitoring with eBPF
Security Monitoring with eBPFSecurity Monitoring with eBPF
Security Monitoring with eBPF
 

Recently uploaded

Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000Sapana Sha
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPirithiRaju
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencySheetal Arora
 

Recently uploaded (20)

Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 

The future is CSN

  • 1. The future is CSN Alexander Korotkov Postgres Professional Alexander Korotkov The future is CSN 1 / 31
  • 2. Russian developers of PostgreSQL: Alexander Korotkov, Teodor Sigaev, Oleg Bartunov ▶ Speakers at PGCon, PGConf: 20+ talks ▶ GSoC mentors ▶ PostgreSQL commi ers (1+1 in progress) ▶ Conference organizers ▶ 50+ years of PostgreSQL expertship: development, audit, consul ng ▶ Postgres Professional co-founders PostgreSQL CORE ▶ Locale support ▶ PostgreSQL extendability: GiST(KNN), GIN, SP-GiST ▶ Full Text Search (FTS) ▶ NoSQL (hstore, jsonb) ▶ Indexed regexp search ▶ Create AM & Generic WAL ▶ Table engines (WIP) Extensions ▶ intarray ▶ pg_trgm ▶ ltree ▶ hstore ▶ plantuner ▶ jsquery ▶ RUM Alexander Korotkov The future is CSN 2 / 31
  • 3. Why do we need snapshots? (1/5) Alexander Korotkov The future is CSN 3 / 31
  • 4. Why do we need snapshots? (2/5) Alexander Korotkov The future is CSN 4 / 31
  • 5. Why do we need snapshots? (3/5) Alexander Korotkov The future is CSN 5 / 31
  • 6. Why do we need snapshots? (4/5) Alexander Korotkov The future is CSN 6 / 31
  • 7. Why do we need snapshots? (5/5) Alexander Korotkov The future is CSN 7 / 31
  • 8. Exis ng MVCC snapshots Alexander Korotkov The future is CSN 8 / 31
  • 9. How do snapshots work? ▶ Array of ac ve transac on ids is stored in shared memory. ▶ GetSnapshotData() scans all the ac ve xids while holding shared ProcArrayLock. ▶ Assigning of new xid doesn’t require ProcArrayLock. ▶ Clearing ac ve xid requires exclusive ProcArrayLock. ▶ 9.6 comes with “group clear xid” op miza on. Mul ple xids of finished transac ons could be cleared using single exclusive ProcArrayLock. Alexander Korotkov The future is CSN 9 / 31
  • 10. Problem with of snapshots ▶ Nowadays mul -core systems running can run thousands of backends simultaneously. For short queries GetSnapshotData() becomes just CPU expensive. ▶ LWLock subsystem is just not designed for high concurrency. In par cular, exclusive lock waits could have infinite starva on. Therefore, it’s impossible to connect while there is high flow of short readonly queries. ▶ In the mixed read-write workload, ProcArrayLock could become a bo leneck. Alexander Korotkov The future is CSN 10 / 31
  • 11. Commit sequence number (CSN) snapshots Alexander Korotkov The future is CSN 11 / 31
  • 12. CSN development history ▶ Jun 7, 2013 – proposal by Ants Aasma ▶ May 30, 2014 – first path by Heikki Linnakangas ▶ PGCon 2015 – talk by Dilip Kumar (no patch published) ▶ Aug, 2016 – Heikki returned to this work Alexander Korotkov The future is CSN 12 / 31
  • 13. CSN snapshots proper es Pro: ▶ Taking snapshots is cheaper. It’s even possible to make it lockless. ▶ CSN snapshots are more friendly to distributed systems. Distributed visibility techniques like incremental snapshots or Clock-SI assumes that snapshot is represented by single number. Cons: ▶ Have to map XID ⇒ CSN while visibility check. Alexander Korotkov The future is CSN 13 / 31
  • 14. XID ⇒ CSN map worst case 1 M rows table, xmin is random in 10 M transac ons version first scan, ms next scans, ms master 2500 50 csn 4900 4900 Without CSN we have to lookup CLOG only during first scan of the table. During first scan hint bits are set. Second and subsequent scans use hint bits and don’t lookup CLOG. Alexander Korotkov The future is CSN 14 / 31
  • 15. Could we hint XID ⇒ CSN map as well? ▶ In general, it’s possible. We could rewrite XID of commi ed transac on into its CSN. ▶ Xmin and xmax are 32-bit. Usage of 32-bit CSN is undesirable. We already have xid and mul xact wraparounds. Yet another CSN wraparound would be discouraging. ▶ Se ng hint bits is not WAL-logged. We need to preserve this property. Alexander Korotkov The future is CSN 15 / 31
  • 16. Make both XID and CSN 64-bit ▶ Add 64-bit xid_epoch, mul xact_epoch and csn_epoch to page header. ▶ Allocate high bit of xmin and xmax for CSN flag. ▶ Actual xid or csn stored in xmin or xmax should be found as corresponding epoch plus xmin or xmax. ▶ We s ll can address 231 xids from xmin and xmax as we did before. ▶ Wraparound is possible only inside single page. And it could be resolved by single page freeze. Alexander Korotkov The future is CSN 16 / 31
  • 17. CSN-rewrite patch ▶ Use 64-bit XID and CSN as described before. ▶ Rewrite XID to CSN instead of se ng “commi ed” hint bit. ▶ Lockless snapshot taking. ▶ WIP, not published yet. Alexander Korotkov The future is CSN 17 / 31
  • 18. XID ⇒ CSN map worst case 1 M rows table, xmin is random in 10 M transac ons version first scan, ms next scans, ms master 2500 50 csn 4900 4900 csn-rewrite 4900 50 Subsequent scans of table is as cheap as it was before. First scan s ll have a room for op miza on. Alexander Korotkov The future is CSN 18 / 31
  • 19. Benchmarks Alexander Korotkov The future is CSN 19 / 31
  • 20. Taking snapshots (SELECT 1) 0 50 100 150 200 250 # Clients 0 500000 1000000 1500000 2000000 2500000 3000000 3500000 4000000 TPS pgbench -s 1000 -j $n -c $n -M prepared -f sel1.sql on 4 x 18 cores Intel Xeon E7-8890 processors median of 3 2-minute runs with shared_buffers = 32GB, max_connections = 300 master xact-align csn-rewrite Alexander Korotkov The future is CSN 20 / 31
  • 21. Read-only benchmark 0 50 100 150 200 250 # Clients 0 500000 1000000 1500000 2000000 TPS pgbench -s 1000 -j $n -c $n -M prepared -S on 4 x 18 cores Intel Xeon E7-8890 processors median of 3 2-minute runs with shared_buffers = 32GB, max_connections = 300 master xact-align csn-rewrite Alexander Korotkov The future is CSN 21 / 31
  • 22. Read-write benchmark 0 50 100 150 200 250 # Clients 0 20000 40000 60000 80000 100000 120000 140000 160000 TPS pgbench -s 1000 -j $n -c $n -M prepared on 4 x 18 cores Intel Xeon E7-8890 processors median of 3 2-minute runs with shared_buffers = 32GB, max_connections = 300 master xact-align csn-rewrite Alexander Korotkov The future is CSN 22 / 31
  • 23. Random: 78% read queries, 22% write queries 0 50 100 150 200 250 # Clients 0 200000 400000 600000 800000 1000000 TPS pgbench -s 1000 -j $n -c $n -M prepared -b select-only@9 -b tpcb-like@1 on 4 x 18 cores Intel Xeon E7-8890 processors median of 3 2-minute runs with shared_buffers = 32GB, max_connections = 300 master xact-align csn-rewrite Alexander Korotkov The future is CSN 23 / 31
  • 24. Custom script with extra 20 read queries set naccounts 100000 * :scale set aid1 random(1, :naccounts) ........................................................... set aid20 random(1, :naccounts) set aid random(1, 100000 * :scale) set bid random(1, 1 * :scale) set tid random(1, 10 * :scale) set delta random(-5000, 5000) SELECT abalance FROM pgbench_accounts WHERE aid IN (:aid1); ........................................................... SELECT abalance FROM pgbench_accounts WHERE aid IN (:aid20); BEGIN; UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; SELECT abalance FROM pgbench_accounts WHERE aid = :aid; UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :d END; Alexander Korotkov The future is CSN 24 / 31
  • 25. Custom script with extra 20 read queries 0 50 100 150 200 250 # Clients 0 10000 20000 30000 40000 50000 60000 70000 TPS pgbench -s 1000 -j $n -c $n -M prepared -f rrw.sql on 4 x 18 cores Intel Xeon E7-8890 processors median of 3 2-minute runs with shared_buffers = 32GB, max_connections = 300 master xact-align csn-rewrite Alexander Korotkov The future is CSN 25 / 31
  • 26. Further PostgreSQL OLTP bo lenecks Alexander Korotkov The future is CSN 26 / 31
  • 27. Further PostgreSQL OLTP bo lenecks ▶ Buffer manager – slow hash-table, pin, locks etc. ▶ Synchronous protocol. ▶ Executor. ▶ Slow xid alloca on – a lot of locks. Alexander Korotkov The future is CSN 27 / 31
  • 28. Further PostgreSQL OLTP bo lenecks in numbers ▶ SELECT val FROM t WHERE id IN (:id1, ... :id10) – 150K per second = 1.5M key-value pairs per second, no gain. Bo leneck in buffer manager. ▶ SELECT 1 with CSN-rewrite patch – 3.9M queries per second. Protocol and executor are bo lenecks. ▶ SELECT txid_current() – 390K per second. Bo leneck in locks. Alexander Korotkov The future is CSN 28 / 31
  • 29. How can we improve PostgreSQL OLTP? ▶ True in-memory engine without buffer manager. ▶ Asynchronous binary protocol for processing more short queries. ▶ Executor improvements including JIT-compila on. ▶ Lockless xid alloca on. Alexander Korotkov The future is CSN 29 / 31
  • 30. Conclusion ▶ Despite all the micro-op miza ons made, our snapshot model could be a bo leneck on modern mul core systems. And it would be even worse bo leneck on future systems. ▶ CSN is the way to remove this bo leneck. It also more friendly to distributed systems. ▶ It’s possible to minimize XID ⇒ CSN map in the same way we minimize CLOG accesses. Alexander Korotkov The future is CSN 30 / 31
  • 31. Thank you for a en on! Alexander Korotkov The future is CSN 31 / 31