Copyright©2018 NTT Corp. All Rights Reserved.
Architecture & Pitfalls of Logical Replication
NTT OSS Center
Atsushi Torikoshi
PGConf.US 2018
Who am I
➢Atsushi Torikoshi
➢@atorik_shi
➢torikoshi_atsushi_z2@lab.ntt.co.jp
➢NTT Open Source Software Center
➢PostgreSQL technical support
➢PostgreSQL performance verification
About NTT
• Who we are
– NTT(Nippon Telegraph and Telephone Corporation)
– Japanese telecommunications company
• What NTT OSS Center does
– Promotes the adoption of OSS by the group companies
• Total support
– Support desk, introduction support, product maintenance
• R&D
– developing OSS and related tools with the communities
• Deals with about 60 OSS products
INDEX
• Background of Logical Replication
• Architecture and Behavior
• Pitfalls
• Summary
BACKGROUND OF LOGICAL REPLICATION
Physical Replication
PostgreSQL has had built-in Physical Replication since 2010.
It replicates a whole DB by sending WAL.
It is suitable for load balancing and high availability.
[Figure: the upstream sends WAL to the downstream, which replays it into its tables.]
Logical Replication
Physical Replication cannot do things like:
• partial replication
• replication between different major versions of PostgreSQL
Logical Replication adds flexibility to built-in replication and makes these things possible.
[Figure: the upstream decodes WAL and sends the changes; the downstream applies them to its tables and writes its own WAL.]
Comparison between Logical and Physical Replication
 | Physical | Logical
Way of replication | Sending and replaying all WAL | Decoding WAL and extracting changes
Downstream DB | Copy of the upstream DB | Not necessarily the same as the upstream DB; upstream and downstream DBs can be different PostgreSQL versions
Manipulations for downstream DB | SELECT only | No restriction, but some manipulations may lead to conflict
What is replicated | ALL | Views, partition root tables, large objects and some manipulations including DDL are NOT replicated
Expected use cases of Logical Replication
Logical Replication enables flexible data replication.
1. Replicating partial data for analytical purposes
2. Consolidating multiple DBs into a single one
3. Online version upgrade
ARCHITECTURE AND BEHAVIOR
Basics of the architecture
• ‘walsender’ and ‘apply worker’ do most of the work for Logical Replication.
• ‘sync worker’ and the corresponding ‘walsender’ run only at initial table sync.
[Figure: on the publisher (upstream), backend processes write WAL and each ‘walsender’ reads and decodes it. On the subscriber (downstream), the ‘launcher’ launches the ‘apply worker’ and the ‘sync worker’.]
Basics of the architecture ~replication
• ‘walsender’ reads WAL, decodes it, and sends the changes to the subscriber.
• ‘apply worker’ applies those changes.
[Figure: on the publisher, a backend process writes WAL; ‘walsender’ reads it, decodes it and sends the changes. On the subscriber, the ‘apply worker’ writes them to the tables.]
Basics of the architecture ~replication
• ‘walsender’ reassembles changes by transaction.
• When a WAL record is an INSERT, UPDATE or DELETE, ‘walsender’ keeps the change in memory.
[Figure: on the publisher, ‘walsender’ (1) reads WAL, (2) decodes it and (3) reassembles the changes (INSERT, UPDATE, DELETE) by transaction.]
Basics of the architecture ~replication
• When a WAL record is a COMMIT, ‘walsender’ sends all the changes for that transaction to the subscriber.
[Figure: ‘walsender’ (1) reads WAL, (2) decodes it, (3) reassembles the changes by transaction and, on COMMIT, (4) sends the transaction to the ‘apply worker’.]
Basics of the architecture ~replication
• When a WAL record is a ROLLBACK, ‘walsender’ just throws away the changes for that transaction.
[Figure: ‘walsender’ (1) reads WAL, (2) decodes it, (3) reassembles the changes by transaction and, on ROLLBACK, (4) cleans up the transaction without sending anything.]
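The read, decode, reassemble, send or clean up sequence described on these slides can be sketched as a toy model. This is only an illustration under simplifying assumptions, not PostgreSQL's actual code: the class name `ToyWalsender`, the transaction ids 100 and 101 and the row labels are all hypothetical.

```python
from collections import defaultdict


class ToyWalsender:
    """Toy model of per-transaction reassembly (not PostgreSQL's real code)."""

    def __init__(self):
        self.pending = defaultdict(list)  # xid -> changes kept in memory
        self.sent = []                    # transactions handed to the apply worker

    def read(self, xid, kind, payload=None):
        if kind in ("INSERT", "UPDATE", "DELETE"):
            # 3. reassemble by transaction: buffer the change under its xid
            self.pending[xid].append((kind, payload))
        elif kind == "COMMIT":
            # 4. send: the whole transaction goes out at once
            self.sent.append((xid, self.pending.pop(xid, [])))
        elif kind == "ROLLBACK":
            # 4. cleanup: the buffered changes are thrown away
            self.pending.pop(xid, None)
        else:
            raise ValueError(f"unknown record kind: {kind}")


# Two interleaved transactions, in the order they might appear in WAL.
wal = [
    (100, "INSERT", "row a"),
    (101, "UPDATE", "row b"),
    (100, "UPDATE", "row a"),
    (101, "ROLLBACK", None),
    (100, "COMMIT", None),
]
sender = ToyWalsender()
for xid, kind, payload in wal:
    sender.read(xid, kind, payload)

print(sender.sent)
```

In this sketch only transaction 100 reaches the subscriber, and nothing is sent before its COMMIT is read.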
Initial table sync
• At initial table sync, COPY runs.
• COPY is done by a dedicated ‘walsender’ and ‘sync worker’. These processes exit after COPY is done.
[Figure: in addition to the usual ‘walsender’ and ‘apply worker’, a dedicated ‘walsender’ on the publisher and a ‘sync worker’ on the subscriber write the table data with COPY.]
(Not) Conflict
• PostgreSQL doesn’t have merge agents for conflict resolution. If there are multiple changes to the same data at about the same time, the last change is reflected.
Example (both nodes start with rows (1, 'A') and (2, 'B')):
1. Publisher: UPDATE table SET name = 'X' WHERE id = 2
2. Subscriber: UPDATE table SET name = 'Y' WHERE id = 2
3. The publisher's change is replicated, and the subscriber's row becomes (2, 'X').
Conflict
• If applying replicated data causes an error on the subscriber side, replication stops.
• Users must resolve the conflict manually.
• After the conflict is resolved, replication resumes.
Example (both nodes start with a row id = 1):
1, 2. Each node independently runs INSERT INTO table VALUES (2);
3. The publisher's INSERT is replicated.
4. It conflicts with the row that already exists on the subscriber.
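The stop-and-resume behavior can be imitated with two in-memory SQLite databases standing in for the publisher and the subscriber. This is a rough sketch under stated assumptions: SQLite is used only because it ships with Python, the table `t`, its primary key and the helper `apply_changes` are hypothetical, and real Logical Replication reports the error in the subscriber's log rather than returning a position.

```python
import sqlite3


def make_node():
    # Each node is an in-memory database with a primary key on id.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
    db.execute("INSERT INTO t VALUES (1)")
    return db


def apply_changes(subscriber, changes):
    """Apply changes in order and stop at the first error, like a stalled apply worker."""
    for position, value in enumerate(changes):
        try:
            subscriber.execute("INSERT INTO t VALUES (?)", (value,))
        except sqlite3.IntegrityError:
            return position  # index of the change that could not be applied
    return len(changes)


publisher, subscriber = make_node(), make_node()
subscriber.execute("INSERT INTO t VALUES (2)")  # local write on the subscriber
publisher.execute("INSERT INTO t VALUES (2)")   # same key on the publisher
publisher.execute("INSERT INTO t VALUES (3)")

# Changes made on the publisher after the initial state, in order.
to_replicate = [r[0] for r in publisher.execute("SELECT id FROM t WHERE id > 1 ORDER BY id")]
stopped_at = apply_changes(subscriber, to_replicate)

# Manual resolution: remove the conflicting row on the subscriber, then resume.
subscriber.execute("DELETE FROM t WHERE id = 2")
resumed_to = stopped_at + apply_changes(subscriber, to_replicate[stopped_at:])
rows = [r[0] for r in subscriber.execute("SELECT id FROM t ORDER BY id")]
print(stopped_at, resumed_to, rows)
```

Note that the change for id = 3 is not applied until the conflict on id = 2 has been resolved, which mirrors how one conflict blocks everything behind it.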
PITFALLS
Q1. How does ‘walsender’ deal with WAL
that is NOT a target of replication?
A1. ‘walsender’ decodes most of
the WAL.
1. ‘walsender’ decodes most of the WAL
• behavior: ‘walsender’ decodes *all* of the changes to the target database, NOT just the changes to subscribed tables.
• pitfall: Changes to non-subscribed tables also consume resources, such as CPU and memory.
[Figure: perf visualization of ‘walsender’ while only non-subscribed tables are updated; DecodeDelete, DecodeInsert and DecodeCommit appear in the profile.]
• lesson: ‘walsender’ consumes resources depending on the total amount of changes on the publisher database, NOT only on the amount of changes to subscribed tables.
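A minimal sketch of why the cost follows the whole database: in this toy model every change is decoded first and the subscription filter is applied afterwards. The function `run_walsender` and the table names `orders` and `audit_log` are hypothetical, and the counter is only a stand-in for real CPU and memory use.

```python
def run_walsender(changes, subscribed_tables):
    """Toy model: every change is decoded; only subscribed ones are sent."""
    decoded = 0
    sent = []
    for table, change in changes:
        decoded += 1                     # decoding cost is paid for every change
        if table in subscribed_tables:   # filtering happens after decoding
            sent.append((table, change))
    return decoded, sent


# One change to a subscribed table, 999 changes to a non-subscribed one.
changes = [("orders", "INSERT")] + [("audit_log", "INSERT")] * 999
decoded, sent = run_walsender(changes, subscribed_tables={"orders"})
print(decoded, len(sent))
```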
Q2. Does keeping changes on walsender
cause issues?
A2. Yes, it may consume a lot of memory.
2. ‘walsender’ may consume a lot of memory
• behavior: ‘walsender’ keeps each change of a transaction in memory until COMMIT or ROLLBACK.
• pitfall: This may cause ‘walsender’ to consume a lot of memory.
Type of manipulation | Measures to prevent memory use
many changes in one transaction | ‘walsender’ has a feature to spill changes to disk when the number of changes in one transaction exceeds 4096.
changes which modify much data; many transactions; many savepoints | There is no feature to avoid using memory.
※ Patches changing this behavior are under discussion.
• lesson: If possible, it’s better to avoid the manipulations which have no measures to prevent consuming a lot of memory. Monitoring memory usage on the publisher may be a good idea.
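The difference between the first row of the table and the others can be shown with a small counter model. It assumes, as the slide states, a per-transaction threshold of 4096 changes; the class `ToyBuffer` and the exact spill condition are illustrative assumptions, not PostgreSQL's implementation.

```python
SPILL_THRESHOLD = 4096  # per-transaction limit quoted on the slide


class ToyBuffer:
    def __init__(self):
        self.in_memory = {}  # xid -> number of changes held in memory
        self.on_disk = {}    # xid -> number of changes spilled to disk

    def add_change(self, xid):
        self.in_memory[xid] = self.in_memory.get(xid, 0) + 1
        if self.in_memory[xid] > SPILL_THRESHOLD:
            # The check is per transaction, so only this transaction is spilled.
            self.on_disk[xid] = self.on_disk.get(xid, 0) + self.in_memory[xid]
            self.in_memory[xid] = 0

    def total_in_memory(self):
        return sum(self.in_memory.values())


# Case 1: one transaction with many changes is spilled, so memory stays bounded.
big = ToyBuffer()
for _ in range(10_000):
    big.add_change(xid=1)

# Case 2: many open transactions with one change each are never spilled.
many = ToyBuffer()
for xid in range(10_000):
    many.add_change(xid)

print(big.total_in_memory(), many.total_in_memory())
```

Both cases buffer 10,000 changes, but only the single large transaction ever crosses the per-transaction threshold.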
Q3. Can we use synchronous replication in
Logical Replication?
A3. Yes, but the response time may
be quite long.
3. The response time may be quite long
• behavior: Under synchronous replication, before replying to the client, the publisher waits for the COMMIT responses from all the subscribers.
[Figure: (1) the client sends BEGIN; INSERT INTO Table1 VALUES ('a'); COMMIT; to the publisher, (2) the publisher sends the transaction to the subscriber, (3) the subscriber responds, (4) the publisher replies to the client.]
• pitfall: Under synchronous replication, the publisher waits for COMMIT responses from all the subscribers, even when there are no changes for those subscribers.
[Figure: Subscriber1 subscribes to table 1 and Subscriber2 to table 2. For an INSERT into Table1, the publisher sends only BEGIN and COMMIT to Subscriber2 but still waits for its response.]
• lesson: The response time to clients depends on the slowest subscriber.
• Also, as we saw in Q2, ‘walsender’ sends changes to the ‘apply worker’ only after COMMIT, which also tends to make the response time longer.
• It may also be worth confirming that you really need synchronous replication.
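As a back-of-the-envelope illustration of the lesson, the response time can be modeled as the local commit time plus the slowest subscriber's acknowledgement. This is a deliberately simplified, additive model; the function `commit_response_time` and all timings are hypothetical.

```python
def commit_response_time(local_commit_ms, subscriber_ack_ms):
    """Toy model: the client is answered only after every subscriber has acknowledged."""
    slowest = max(subscriber_ack_ms.values(), default=0)
    return local_commit_ms + slowest


# Hypothetical timings in milliseconds. subscriber2 receives only BEGIN and
# COMMIT for this transaction, but its acknowledgement is still waited for.
acks = {"subscriber1": 5, "subscriber2": 120}
total = commit_response_time(local_commit_ms=2, subscriber_ack_ms=acks)
print(total)
```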
Q4. Is the way to monitor the status of
replication the same as Physical
Replication?
A4. Only monitoring pg_stat_replication
might not be enough.
4. pg_stat_replication might not be enough
• behavior: Initial table sync is done by dedicated processes: a ‘sync worker’ and a ‘walsender’.
[Figure: a dedicated ‘walsender’ on the publisher and a ‘sync worker’ on the subscriber write the table data with COPY, next to the usual ‘walsender’ and ‘apply worker’.]
• pitfall: Even if the ‘sync worker’ failed to start and nothing has been replicated yet, pg_stat_replication.state is ‘streaming’.
• lesson: We should also monitor pg_subscription_rel and check that ‘srsubstate’ is ‘r’, meaning ready.
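A monitoring check along these lines could look like the sketch below. The query is shown only as a string to be run on the subscriber with whatever client you already use; no database connection is made here. The helper `tables_not_ready`, the table names and the sample rows are hypothetical, and the non-ready states used in the sample ('i' and 'd') are example values for tables that are still initializing or copying data.

```python
# Query to run on the subscriber; shown as a string only.
SYNC_STATE_QUERY = """
SELECT srrelid::regclass AS table_name, srsubstate
FROM pg_subscription_rel
"""


def tables_not_ready(rows):
    """Return the tables whose srsubstate is not 'r' (ready)."""
    return [table for table, state in rows if state != "r"]


# Hypothetical result rows, as (table_name, srsubstate) pairs.
rows = [("orders", "r"), ("customers", "d"), ("items", "i")]
pending = tables_not_ready(rows)
print(pending)
```

An alert on a non-empty list that persists over time would catch the case where pg_stat_replication alone looks healthy.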
Q5. How should we resolve the conflict?
A5. We can use
pg_replication_origin_advance(),
but it may skip some data.
5. pg_replication_origin_advance() may skip data
• behavior: pg_replication_origin_advance() lets us set the LSN up to which data has been replicated.
[Figure: a remote LSN axis from 10 to 20. Replication is stopped at the conflict point; pg_replication_origin_advance('node_name', 20) moves the replicated position to 20.]
• pitfall: If there are changes on the publisher after the conflict, advancing to the publisher's current WAL LSN with pg_replication_origin_advance() also skips applying those changes.
[Figure: an INSERT and an UPDATE between the conflict point and LSN 20 are skipped.]
• lesson: Changing the conflicting data on the subscriber is usually a better choice.
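The pitfall can be made concrete with a toy model in which LSNs are small integers, as on the slide. It assumes that every change located between the conflict point and the new position is treated as already replicated; the helper `changes_skipped` and the sample changes are hypothetical, and real LSNs and commit ordering are more involved.

```python
def changes_skipped(changes, conflict_lsn, advance_to_lsn):
    """Toy model: changes from the conflict point up to the new LSN are never applied."""
    return [desc for lsn, desc in changes if conflict_lsn <= lsn <= advance_to_lsn]


# Hypothetical changes on the publisher, as (lsn, description) pairs.
changes = [
    (12, "INSERT id=2 (conflicts)"),
    (15, "INSERT id=3"),
    (18, "UPDATE id=1"),
    (25, "INSERT id=4"),
]

# Advancing to the publisher's current LSN (20) skips more than the conflict.
skipped = changes_skipped(changes, conflict_lsn=12, advance_to_lsn=20)

# Advancing only past the conflicting change skips just that one.
only_conflict = changes_skipped(changes, conflict_lsn=12, advance_to_lsn=12)
print(skipped, only_conflict)
```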
Q6. Can backup be performed as usual?
A6. Backing up a DB under Logical Replication
may need additional procedures.
6. Logical Replication may need additional procedures
• behavior: pg_dump doesn't back up pg_subscription_rel, which keeps the state of initial table sync.
• pitfall: Restoring data backed up by pg_dump at a subscriber causes initial table sync to run again. It usually makes replication stop due to a key duplication error.
[Figure: (1) the pg_dump backup is restored on the subscriber, (2) replication starts again, (3) it conflicts with the restored tables.]
• lesson: We can avoid this resync by refreshing the subscription with ‘copy_data = false’. But if a subscription has tables which have not completed the initial sync, we need more work. It's better to consider carefully what data is really necessary and how to prevent data loss. In some cases it may be better to start replication from scratch.
SUMMARY
How should we manage Logical Replication?
Design
Take into account some counterintuitive behaviors which affect performance.
• ‘walsender’ keeps changes in memory.
• In synchronous replication, publishers wait for COMMIT from all the subscribers, even those which have no changes.
• Changes to non-subscribed tables are also decoded.
Monitoring
• Monitor memory usage on the publisher.
• Monitor not only pg_stat_replication but also pg_subscription_rel.
Operation
• pg_replication_origin_advance() may skip some data.
• Backup and restore need some extra procedures. It's better to consider carefully what data is really necessary and how to prevent data loss.
Thank you !
torikoshi_atsushi_z2@lab.ntt.co.jp
@atorik_shi

 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 

Architecture & Pitfalls of Logical Replication

  • 1. Architecture & Pitfalls of Logical Replication. NTT OSS Center, Atsushi Torikoshi. PGConf.US 2018. Copyright©2018 NTT Corp. All Rights Reserved.
  • 2. Who am I: Atsushi Torikoshi (@atorik_shi, torikoshi_atsushi_z2@lab.ntt.co.jp), NTT Open Source Software Center. PostgreSQL technical support and PostgreSQL performance verification.
  • 3. About NTT. Who we are: NTT (Nippon Telegraph and Telephone Corporation), a Japanese telecommunications company. What NTT OSS Center does: promotes the adoption of OSS by the group companies; total support (support desk, introduction support, product maintenance); R&D (developing OSS and related tools with the communities); deals with about 60 OSS products.
  • 4. INDEX: Background of Logical Replication; Architecture and Behavior; Pitfalls; Summary.
  • 5. BACKGROUND OF LOGICAL REPLICATION
  • 6. Physical Replication. PostgreSQL has had built-in Physical Replication since 2010. It replicates a whole DB by sending WAL, and is suitable for load balancing and high availability. [Diagram: the upstream sends WAL to the downstream, which replays it to reproduce every table.]
  • 7. Logical Replication. Physical Replication cannot do things like partial replication or replication between different major versions of PostgreSQL. Logical Replication added flexibility to built-in replication and made these things possible. [Diagram: the upstream decodes WAL and sends the changes; the downstream applies them to a subset of the tables and writes its own WAL.]
  • 8. Comparison between Logical and Physical Replication
      – Way of the replication. Physical: sending and replaying all WAL. Logical: decoding WAL and extracting changes.
      – Downstream DB. Physical: a copy of the upstream DB. Logical: not necessarily the same as the upstream DB; the upstream and downstream DBs can be different PostgreSQL versions.
      – Manipulations for the downstream DB. Physical: SELECT only. Logical: no restriction, but some manipulations may lead to conflict.
      – What is replicated. Physical: ALL. Logical: views, partition root tables, large objects and some manipulations including DDL are NOT replicated.
  • 9. Expected use cases of Logical Replication. Logical Replication enables flexible data replication: (1) replicating partial data for analytical purposes, (2) consolidating multiple DBs into a single one, (3) online version upgrade.
  • 10. ARCHITECTURE AND BEHAVIOR
  • 11. Basics of the architecture. ‘walsender’ and ‘apply worker’ do most of the work for Logical Replication. ‘sync worker’ and its corresponding ‘walsender’ run only at initial table sync. [Diagram: publisher (upstream) with backend processes that write WAL and walsenders that read and decode it; subscriber (downstream) with launcher, apply worker and sync worker.]
  • 12. Basics of the architecture ~replication. ‘walsender’ reads WAL, decodes it, and sends the changes to the subscriber. ‘apply worker’ applies those changes. [Diagram: on the publisher, a backend process writes WAL and walsender reads, decodes and sends each change; on the subscriber, apply worker writes it to the tables.]
  • 13. Basics of the architecture ~replication. ‘walsender’ reassembles the changes by transaction. When a WAL record is an INSERT, UPDATE or DELETE, ‘walsender’ keeps the change in memory. [Diagram: WAL on the publisher holding INSERT, UPDATE and DELETE records that belong to different transactions.]
  • 14. Basics of the architecture ~replication. Same text as slide 13. [Diagram, step 1: read WAL.]
  • 15. Basics of the architecture ~replication. Same text as slide 13. [Diagram, steps 1-2: read WAL, decode.]
  • 16. Basics of the architecture ~replication. Same text as slide 13. [Diagram, steps 1-3: read WAL, decode, reassemble by transaction.]
  • 17. Basics of the architecture ~replication. When the WAL record is a COMMIT, ‘walsender’ sends all the changes for that transaction to the subscriber. [Diagram, steps 1-4: read WAL, decode, reassemble by transaction, send.]
  • 18. Basics of the architecture ~replication. When the WAL record is a ROLLBACK, ‘walsender’ just throws away the changes for that transaction. [Diagram, steps 1-4: read WAL, decode, reassemble by transaction, cleanup.]
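
The buffering behavior described on slides 13 to 18 can be sketched as a small Python model. This is an illustration only, not PostgreSQL code: the record layout `(xid, kind, payload)` and the function name `reassemble` are assumptions made for the example.

```python
from collections import defaultdict


def reassemble(wal_records):
    """Group decoded changes by transaction and release them at COMMIT.

    wal_records is an iterable of (xid, kind, payload) tuples, a
    hypothetical stand-in for decoded WAL records.
    Returns a list of (xid, changes) in commit order.
    """
    pending = defaultdict(list)  # xid -> changes kept in memory
    sent = []
    for xid, kind, payload in wal_records:
        if kind in ("INSERT", "UPDATE", "DELETE"):
            # Data changes are only buffered, never sent immediately.
            pending[xid].append((kind, payload))
        elif kind == "COMMIT":
            # All changes of the transaction are sent together.
            sent.append((xid, pending.pop(xid, [])))
        elif kind == "ROLLBACK":
            # Changes of an aborted transaction are thrown away.
            pending.pop(xid, None)
    return sent


wal = [
    (100, "INSERT", "row a"),
    (101, "UPDATE", "row b"),
    (100, "UPDATE", "row a"),
    (101, "ROLLBACK", None),
    (100, "COMMIT", None),
]
print(reassemble(wal))
```

In this model, transaction 101 never reaches the subscriber, and the two changes of transaction 100 are held until its COMMIT record is read.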
  • 19. Initial table sync. At initial table sync, COPY runs. The COPY is done by a dedicated ‘walsender’ and ‘sync worker’; these processes exit after the COPY is done. [Diagram: the architecture of slide 12 plus a second walsender on the publisher and a sync worker on the subscriber, which writes the tables via COPY.]
  • 20. (Not) Conflict. PostgreSQL doesn’t have merge agents for conflict resolution. If there are multiple changes to the same data at one time, the last change is reflected. [Tables: publisher and subscriber both hold (id, name) = (1, ‘A’), (2, ‘B’).]
  • 21. (Not) Conflict. Same text as slide 20. [Tables: 1. on the publisher, UPDATE table SET name = ‘X’ WHERE id = 2 gives (2, ‘X’); 2. on the subscriber, UPDATE table SET name = ‘Y’ WHERE id = 2 gives (2, ‘Y’).]
  • 22. (Not) Conflict. Same text as slide 20. [Tables: 3. the publisher's change is replicated, so the subscriber row also becomes (2, ‘X’).]
  • 23. Conflict. If replicating data causes an error on the subscriber side, the replication stops. [Tables: publisher and subscriber each run INSERT INTO table VALUES (2); (steps 1 and 2), so both hold id 1 and 2.]
  • 24. Conflict. Same text as slide 23. [Tables: 3. the publisher's INSERT is replicated.]
  • 25. Conflict. Same text as slide 23. [Tables: 4. the replicated INSERT conflicts with the row already on the subscriber.]
  • 26. Conflict. Users must resolve the conflict manually. After the conflict is resolved, replication resumes. [Tables: same state as slide 25.]
  • 27. PITFALLS
  • 28. Q1. How does ‘walsender’ deal with WAL that is NOT a target of replication?
  • 29. A1. ‘walsender’ decodes most of the WAL.
  • 30. 1. ‘walsender’ decodes most of the WAL. Behavior: ‘walsender’ decodes *all* of the changes to the target database, NOT just the changes to subscribed tables.
  • 31. 1. ‘walsender’ decodes most of the WAL. Pitfall: even changes to non-subscribed tables consume resources such as CPU and memory. [Figure: perf visualization of walsender while only non-subscribed tables are updated, showing DecodeDelete, DecodeInsert and DecodeCommit.]
  • 32. 1. ‘walsender’ decodes most of the WAL. Lesson: ‘walsender’ consumes resources in proportion to the total amount of changes on the publisher database, NOT only the amount of changes to subscribed tables.
  • 33. Q2. Does keeping changes on walsender cause issues?
  • 34. A2. Yes, it may consume a lot of memory.
  • 35. 2. ‘walsender’ may consume a lot of memory. Behavior: ‘walsender’ keeps each change of a transaction in memory until COMMIT or ROLLBACK.
  • 36. 2. ‘walsender’ may consume a lot of memory. Pitfall: it may cause ‘walsender’ to consume a lot of memory. Measures to prevent memory use, by type of manipulation:
      – Many changes in one transaction: ‘walsender’ has a feature to spill changes to disk when the number of changes in one transaction exceeds 4096.
      – Changes which modify much data: there is no feature to avoid using memory.
      – Many transactions: there is no feature to avoid using memory.
      – Many savepoints: there is no feature to avoid using memory.
      ※ Patches changing this behavior are under discussion.
  • 37. 2. ‘walsender’ may consume a lot of memory. Lesson: if possible, avoid the manipulations for which there is no measure to prevent heavy memory use. Monitoring memory usage on the publisher may be a good idea.
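
A rough way to see why the spill feature helps with one large transaction but not with many open transactions is the toy model below. It is a sketch under stated assumptions, not the actual walsender logic: it counts changes rather than bytes, assumes a transaction's buffered changes are all written to disk each time the count reaches the 4096 threshold quoted on slide 36, and uses the hypothetical names `SPILL_THRESHOLD` and `memory_resident_changes`.

```python
SPILL_THRESHOLD = 4096  # number of changes per transaction, from slide 36


def memory_resident_changes(tx_sizes, threshold=SPILL_THRESHOLD):
    """Return how many changes stay in memory for a set of open transactions.

    tx_sizes lists the number of changes made so far by each open
    transaction. Assumption: whenever a transaction's in-memory count
    reaches the threshold, all of its buffered changes go to disk.
    """
    resident = 0
    for n in tx_sizes:
        # Only the remainder since the last spill is still in memory.
        resident += n % threshold
    return resident


# One transaction with 10,000 changes: most of it has been spilled.
print(memory_resident_changes([10_000]))

# 1,000 open transactions with 4,000 changes each: nothing is spilled,
# because no single transaction reaches the threshold.
print(memory_resident_changes([4_000] * 1_000))
```

Under these assumptions the second case keeps every change in memory, which matches the slide's point that many transactions have no protective measure.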
  • 38. Q3. Can we use synchronous replication in Logical Replication?
  • 39. A3. Yes, but the response time may be quite long.
  • 40. 3. The response time may be quite long. Behavior: under synchronous replication, before replying to the client, the publisher waits for the COMMIT responses from all the subscribers. [Diagram: (1) the client runs BEGIN; INSERT INTO Table1 VALUES (‘a’); COMMIT; on the publisher, which holds table 1 and table 2; (2) the transaction is sent to the subscriber of table 1; (3) the subscriber responds; (4) the publisher replies to the client.]
  • 41. 3. The response time may be quite long. Pitfall: under synchronous replication, the publisher waits for COMMIT responses from all the subscribers, even when there are no changes for those subscribers. [Diagram: as on slide 40, with an additional Subscriber2 that subscribes only to table 2; the publisher sends it only BEGIN and COMMIT and still waits for its response.]
  • 42. 3. The response time may be quite long. Lesson: the response time to clients depends on the slowest subscriber. Also, as we saw in Q2, ‘walsender’ sends changes to ‘apply worker’ only after COMMIT, which also tends to make the response time longer. It may be worth confirming that you really need synchronous replication.
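
The lesson on slide 42 can be put into a back-of-the-envelope model. The sketch below is only an approximation with made-up numbers: it assumes the publisher's local commit time and the wait for acknowledgements simply add up, and the function name `commit_response_time` and the subscriber names are hypothetical.

```python
def commit_response_time(local_commit_ms, subscriber_ack_ms):
    """Estimate the client-visible COMMIT latency in milliseconds.

    subscriber_ack_ms maps each synchronous subscriber to the time it
    needs to acknowledge the COMMIT. The publisher waits for all of
    them, so the slowest one determines the wait.
    """
    slowest_ack = max(subscriber_ack_ms.values(), default=0)
    return local_commit_ms + slowest_ack


# sub2 receives only BEGIN and COMMIT, yet its acknowledgement is
# still waited for, so its latency decides the response time.
print(commit_response_time(2, {"sub1": 5, "sub2": 40}))
```

Removing the slow subscriber from the synchronous set in this model drops the estimate to the next slowest acknowledgement, which is the trade-off the slide asks readers to weigh.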
  • 43. Q4. Is the way to monitor the status of replication the same as for Physical Replication?
  • 44. A4. Monitoring only pg_stat_replication might not be enough.
  • 45. 4. pg_stat_replication might not be enough. Behavior: initial table sync is done by dedicated processes, ‘sync worker’ and ‘walsender’. [Diagram: same as slide 19.]
  • 46. 4. pg_stat_replication might not be enough. Pitfall: even if ‘sync worker’ failed to start and nothing has been replicated yet, pg_stat_replication.state is ‘streaming’.
  • 47. 4. pg_stat_replication might not be enough. Lesson: we should also monitor pg_subscription_rel and check that ‘srsubstate’ is ‘r’, meaning ready.
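
A monitoring check along the lines of slide 47 might look like the sketch below. The query text is meant to be run on the subscriber with whatever client is in use; it is not executed here, and the helper `unsynced_tables`, the sample rows and the table names are hypothetical. The state letters are those documented for pg_subscription_rel in PostgreSQL 10; later versions may define more.

```python
# Query to run on the subscriber: lists tables whose initial sync is
# not yet in the 'r' (ready) state.
NOT_READY_QUERY = """
SELECT srrelid::regclass AS table_name, srsubstate
FROM pg_subscription_rel
WHERE srsubstate <> 'r'
"""

# Meaning of srsubstate values as of PostgreSQL 10.
STATE_NAMES = {
    "i": "initialize",
    "d": "data is being copied",
    "s": "synchronized",
    "r": "ready",
}


def unsynced_tables(rows):
    """Return (table, state description) for every table that is not ready.

    rows is a list of (table_name, srsubstate) pairs, for example the
    result of selecting both columns from pg_subscription_rel.
    """
    return [
        (table, STATE_NAMES.get(state, "unknown"))
        for table, state in rows
        if state != "r"
    ]


sample_rows = [("orders", "r"), ("customers", "d")]
print(unsynced_tables(sample_rows))
```

An alert could fire whenever this list stays non-empty for longer than an initial sync is expected to take.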
  • 48. Q5. How should we resolve the conflict?
  • 49. A5. We can use pg_replication_origin_advance(), but it may skip some data.
  • 50. 5. pg_replication_origin_advance() may skip data. Behavior: pg_replication_origin_advance() enables us to set the LSN up to which data has been replicated. [Diagram: a remote LSN axis from 10 to 20, with the current position marked at 10.]
  • 51. 5. pg_replication_origin_advance() may skip data. Same text as slide 50. [Diagram: pg_replication_origin_advance(‘node_name’, 20) moves the position to 20.]
  • 52. 5. pg_replication_origin_advance() may skip data. Same text as slide 50. [Diagram: the conflict point lies at 10, and the position is moved past it to 20.]
  • 53. 5. pg_replication_origin_advance() may skip data. Pitfall: if there are changes on the publisher after the conflict, advancing the origin to the publisher's current WAL LSN skips applying those changes. [Diagram: an INSERT and an UPDATE lie between the conflict point and 20, so pg_replication_origin_advance(‘node_name’, 20) skips them as well.]
  • 54. 5. pg_replication_origin_advance() may skip data. Lesson: changing the conflicting data on the subscriber is usually a better choice.
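
The pitfall on slide 53 can be illustrated with a simplified model in which LSNs are plain integers, as in the slide's diagram. The function `apply_after_advance` and the change list are assumptions made for this sketch; real LSNs and the apply logic are more involved.

```python
def apply_after_advance(changes, origin_lsn):
    """Split changes into those skipped and those still applied.

    changes is a list of (lsn, description) pairs from the publisher.
    After the origin is advanced to origin_lsn, the subscriber treats
    everything at or before that LSN as already replicated.
    """
    skipped = [c for c in changes if c[0] <= origin_lsn]
    applied = [c for c in changes if c[0] > origin_lsn]
    return skipped, applied


changes = [
    (10, "conflicting INSERT"),
    (15, "INSERT"),
    (18, "UPDATE"),
]

# Advancing to the publisher's current LSN (20) skips the conflicting
# change, but also the INSERT and UPDATE made after it.
skipped, applied = apply_after_advance(changes, 20)
print(skipped)
print(applied)
```

Advancing only as far as the conflicting change (10 in this model) would leave the later INSERT and UPDATE to be applied, but finding that exact LSN is the hard part, which is why the slide recommends fixing the data on the subscriber instead.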
  • 55. Q6. Can backup be performed as usual?
  • 56. A6. Backing up a DB under Logical Replication may need an additional procedure.
  • 57. 6. Logical Replication may need additional procedure. Behavior: pg_dump doesn't back up pg_subscription_rel, which keeps the state of initial table sync.
  • 58. 6. Logical Replication may need additional procedure. Pitfall: restoring data backed up by pg_dump on a subscriber causes initial table sync to run again. This usually makes the replication stop due to a key duplication error. [Diagram: pg_dump taken from the subscriber's tables.]
  • 59. 6. Logical Replication may need additional procedure. Same text as slide 58. [Diagram: (1) the dump is restored on the subscriber.]
  • 60. 6. Logical Replication may need additional procedure. Same text as slide 58. [Diagram: (2) replication from the publisher starts again.]
  • 61. 6. Logical Replication may need additional procedure. Same text as slide 58. [Diagram: (3) the re-copied rows conflict with the restored rows.]
  • 62. 6. Logical Replication may need additional procedure. Lesson: we can avoid this resync by refreshing the subscription with ‘copy_data = false’. But if a subscription has tables which have not completed the initial sync, more work is needed. It's better to consider carefully what data is really necessary and how to prevent data loss. In some cases it may be better to start replication from scratch.
  • 63. SUMMARY
  • 64. How should we manage Logical Replication? Design: take into account some counterintuitive behaviors which affect performance. ‘walsender’ keeps changes in memory. In synchronous replication, the publisher waits for COMMIT from all the subscribers, even those which have no changes. Changes on non-subscribed tables are also decoded.
  • 65. How should we manage Logical Replication? Monitoring: monitor memory usage on the publisher. Monitor not only pg_stat_replication but also pg_subscription_rel.
  • 66. How should we manage Logical Replication? Operation: pg_replication_origin_advance() may skip some data. Backup and restore need some extra procedures; it's better to consider carefully what data is really necessary and how to prevent data loss.
  • 67. Thank you! torikoshi_atsushi_z2@lab.ntt.co.jp @atorik_shi