SlideShare a Scribd company logo
1 of 39
Download to read offline
Binlog Servers
at
Booking.com
Jean-François Gagné
jeanfrancois DOT gagne AT booking.com
Presented at Oracle Open World 2015
Booking.com
2
Booking.com‟
● Based in Amsterdam since 1996
● Online Hotel and Accommodation Agent:
● 170 offices worldwide
● +819.000 properties in 221 countries
● 42 languages (website and customer service)
● Part of the Priceline Group
● And we use MySQL:
● Thousands (1000s) of servers, ~85% replicating
● >110 masters: ~25 >50 slaves & ~8 >100 slaves
3
Binlog Server: Session Summary
1. Replication and the Binlog Server
2. Extreme Read Scaling
3. Remote Site Replication (and Disaster Recovery)
4. Easy High Availability
5. Other Use-Cases
(Crash Safety, Parallel Replication and Backups)
6. Binlog Servers at Booking.com
7. New master without touching slaves
4
Binlog Server: Replication
● One master / one or more slaves
● The master records all writes in a journal: the binary logs
● Each slave:
● Downloads the journal and saves it locally (IO thread): relay logs
● Executes the relay logs on the local database (SQL thread)
● Could produce binary logs to be itself a master (log-slave-updates)
● Replication is:
● Asynchronous  lag
● Single threaded (in MySQL 5.6)  slower than the master
5
Binlog Server: Booking.com‟‟
● Typical replication deployment:
+---+
| M |
+---+
|
+------+-- ... --+---------------+-------- ...
| | | |
+---+ +---+ +---+ +---+
| S1| | S2| | Sn| | M1|
+---+ +---+ +---+ +---+
|
+-- ... --+
| |
+---+ +---+
| T1| | Tm|
+---+ +---+
● Si and Tj are for read scaling
● Mi are the DR master
6
Intermediate Master
:-(
Binlog Server: What
● Binlog Server (BLS): is a daemon that:
● Downloads the binary logs from the master
● Saves them identically as on the master
● Serves them to slaves
● A or X are the same for B and C:
● By design, the binary logs served by A and X are the same
7
+---+ / 
| A | ---> / X 
+---+ -----
| |
+---+ +---+
| B | | C |
+---+ +---+
Binlog Server: Read Scaling
● Typical replication topology for read scaling:
+---+
| M |
+---+
|
+--------+--------+--- ... ---+
| | | |
+---+ +---+ +---+ +---+
| S1| | S2| | S3| | Sn|
+---+ +---+ +---+ +---+
● When there are too many slaves, the network of M is overloaded:
● 100 slaves x 1Mbit/s: very close to 1Gbit/s
● OSC or purging data in RBR becomes hard
 Slave lag or unreachable master for writes
8
Binlog Server: Read Scaling‟
● Typical solution: fan-out with Intermediary Masters (IM):
+---+
| M |
+---+
|
+--------+- ... -+
| | |
+---+ +---+ +---+
| M1| | M2| | Mm|
+---+ +---+ +---+
| | |
+-- ... +- ... +-- ... --+
| | | |
+---+ +---+ +---+ +---+
| S1| | T1| | Z1| | Zi|
+---+ +---+ +---+ +---+
9
● But Intermediate Masters bring problems:
● log-slave-updates  IM are slower than slaves
● Lag of an IM  all its slaves are lagging
● Rogue transaction on IM  infection of all its slave
● Failure of an IM  all its slaves stop replicating
(and action must be taken fast)
Binlog Server: Read Scaling‟‟
● Solving IM problems with shared disk:
● Filers (expensive) or DRBD (doubling the number of servers)
● sync_binlog = 1 + trx_commit = 1  slower replication  lag
● After a crash of an Intermediate Master:
● we need InnoDB recovery  replication on slaves stalled  lag
● and the cache is cold  replication will be slow  lag
● Solving IM problems with GTIDs:
● They allow slave repointing at the cost of added complexity :-|
● But they do not completely solve the lag problem :-(
● And we cannot migrate online with MySQL 5.6 :-( :-(
10
Binlog Server: Read Scaling‟‟‟
● New Solution: replace IM by Binlog Servers
+---+
| M |
+---+
|
+----------------+----- ... -----+
| | |
/  /  / 
/ I1 / I2 / Im
----- ----- -----
| | |
+------+ ... +--- ... +--- ... ---+
| | | | |
+---+ +---+ +---+ +---+ +---+
| S1| | S2| | Si| | Sj| | Sn|
+---+ +---+ +---+ +---+ +---+
11
● BLS should not lag
● If a BLS fails, repoint its slaves
to other BLSs (easy by design)
Binlog Server: Remote Site
● Typical deployment for remote site:
+---+
| A |
+---+
|
+------+------+---------------+
| | | |
+---+ +---+ +---+ +---+
| B | | C | | D | | E |
+---+ +---+ +---+ +---+
|
+------+------+
| | |
+---+ +---+ +---+
| F | | G | | H |
+---+ +---+ +---+
12
Intermediate Master
:-(
E is an Intermediate Master
 same problems as read scaling
Binlog Server: Remote Site‟
● Ideally, we would like this:
+---+
| A |
+---+
|
+------+------+---------------+------+------+------+
| | | | | | |
+---+ +---+ +---+ +---+ +---+ +---+ +---+
| B | | C | | D | | E | | F | | G | | H |
+---+ +---+ +---+ +---+ +---+ +---+ +---+
● No lag and no Single Point of Failure (SPOF)
● But no master on remote site for writes (easy solvable problem)
● And expensive in WAN bandwidth (harder problem to solve)
13
Binlog Server: Remote Site‟‟
● New solution: a Binlog Server on the remote site:
+---+
| A |
+---+
|
+------+------+---------------+
| | | |
+---+ +---+ +---+ / 
| B | | C | | D | / X 
+---+ +---+ +---+ -----
|
+------+------+------+
| | | |
+---+ +---+ +---+ +---+
| E | | F | | G | | H |
+---+ +---+ +---+ +---+
14
Binlog Server: Remote Site‟‟‟
● Or deploy 2 Binlog Servers to get better resilience:
+---+
| A |
+---+
|
+------+------+---------------+
| | | |
+---+ +---+ +---+ /  / 
| B | | C | | D | / X  ------> / Y 
+---+ +---+ +---+ ----- -----
| |
+------+ +------+
| | | |
+---+ +---+ +---+ +---+
| E | | F | | G | | H |
+---+ +---+ +---+ +---+
15
● If Y fails, repoint G and H to X,
● If X fails, repoint Y to A and E and F to Y
Binlog Server: Remote Site‟‟‟ ‟
● Interesting property: if A fails, E, F, G & H converge to a common state
+---+
| A |
+---+
|
+------+------+---------------+
| | | |
+---+ +---+ +---+ / 
| B | | C | | D | / X 
+---+ +---+ +---+ -----
|
+------+------+------+
| | | |
+---+ +---+ +---+ +---+
| E | | F | | G | | H |
+---+ +---+ +---+ +---+
● New master promotion is easy on remote site
16
Binlog Server: Remote Site‟‟‟ ‟‟
● Step by step master promotion:
1. The 1st slave that is up to date can be the new master
2. “SHOW MASTER STATUS” or “RESET MASTER”,
and “RESET SLAVE ALL” on the new master
3. Writes can be pointed to the new master
4. Once a slave is up to date, repoint it
to the new master at the position of step # 2
5. Keep delayed/lagging slaves under X until up to date
6. Once no slaves is left under X, recycle it as
a Binlog Server for the new master
17
/ 
/ X 
-----
|
+-------------+
|
+---+
| G |
+---+ +---+
| F |
+---+
|
+------+
| |
+---+ +---+
| E | | H |
+---+ +---+
Binlog Server: High Availability
● This property can be used for high availability:
+---+
| A |
+---+
|
|
/ 
/ X 
-----
|
+------+------+------+------+------+------+------+
| | | | | | | |
+---+ +---+ +---+ +---+ +---+ +---+ +---+ +---+
| B | | C | | D | | E | | F | | G | | H | | I |
+---+ +---+ +---+ +---+ +---+ +---+ +---+ +---+
18
Binlog Server: Other Use-Cases
● Better Crash-Safe Replication
● http://blog.booking.com/better_crash_safe_replication_for_mysql.html
● Better Parallel Replication in MySQL 5.7 (LOGICAL_CLOCK)
● http://blog.booking.com/better_parallel_replication_for_mysql.html
● Easier Point in Time Recovery
● http://jfg-mysql.blogspot.com/
2015/10/binlog-servers-for-backups-and-point-in-time-recovery.html
19
Binlog Server: Better // Replication
● Four transactions on X, Y and Z:
+---+
| X |
+---+
|
V
+---+
| Y |
+---+
|
V
+---+
| Z |
+---+
● IM might stall the parallel replication pipeline
● To benefit from parallel replication, IM must disappear
● The Binlog Server allows exactly that
20
On Y:
----Time---->
B---C
B---C
B-------C
B-------C
On Z:
----Time--------->
B---C
B---C
B-------C
B-------C
On X:
----Time---->
T1 B---C
T2 B---C
T3 B-------C
T4 B-------C
Binlog Server: Point in Time Recovery
● Implementing Point in Time Recovery means:
● to regularly take a backup of the database,
● and to save the binary logs of that database.
● Executing Point in Time Recovery means:
● Restoring the backup,
● Applying the binary logs.
+---+ /  +---+
| M | ----> / X  ----> | S |
+---+ ----- +---+
21
BLS@Booking.com
● Reminder: typical deployment at Booking.com:
+---+
| M |
+---+
|
+--- ... ---+ ... +-------------+--- ...
| | | |
+---+ +---+ +---+ +---+
| S1| | Sj| | Sn| | M1|
+---+ +---+ +---+ +---+
|
+-- ... --+
| |
+---+ +---+
| T1| | To|
+---+ +---+
22
BLS@Booking.com‟
● We are deploying Binlog Server Clusters to offload masters:
+---+
| M |
+---+
|
+-----+-----+ ... +-------------+--- ...
| | | | |
/  /  +---+ +---+ +---+
/ A1 / A2 | Sj| | Sn| | M1|
----- ----- +---+ +---+ +---+
| | |
+-+-----+-+ ... -+-----+
| | | |
+---+ +---+ /  / 
| S1| ... | Si| / Z1 / Z2
+---+ +---+ ----- -----
23
● We have in production:
● >40 Binlog Servers
● >20 BLS Clusters
● >650 slaves replicating
from Binlog Servers
BLS@Booking.com‟‟
● What is a Binlog Server Cluster ?
● At least 2 Binlog Servers
● Replicating from the same master
● With independent failure mode (not same switch/rack/…)
● With a Service DNS entry resolving to all IP addresses
 Failure of a BLS transparent to slaves
● Thanks to DNS, the slaves connected to a failing
Binlog Server reconnect to the others
 Easy maintenance/upgrade of a Binlog Server
24
|
+-----+
| |
/  / 
/ A1 / A2
----- -----
| |
+-+-----+-+
| |
+---+ +---+
| S1| ... | Si|
+---+ +---+
BLS@Booking.com‟‟‟
● We are deploying BLS side-by-side with IM to reduce delay:
+---+
| M |
+---+
|
+-----+-----+ ... +-------------+-----+----------- ...
| | | | | |
/  /  +---+ +---+ +---+ / -->/ 
/ A1 / A2 | Sj| | Sn| | M1| / B1 / B2
----- ----- +---+ +---+ +---+ ----- -----
| | | | |
+-+-----+-+ ... +-+-----+-+
| | | |
+---+ +---+ +---+ +---+
| S1| ... | Si| | Tk| ... | To|
+---+ +---+ +---+ +---+
25
BLS@Booking.com‟‟‟ ‟
● We are deploying a new Data Center without IM:
+---+
| M |
+---+
|
+-----+-----+ ... +-------------+-----+---------------------+
| | | | | | |
/  /  +---+ +---+ +---+ / -->/  / -->/ 
/ A1 / A2 | Sj| | Sn| | M1| / B1 / B2 / C1 / C2
----- ----- +---+ +---+ +---+ ----- ----- ----- -----
| | | | | | |
+-+-----+-+ ... +-+-----+-+ +-+-----+-+
| | | | | |
+---+ +---+ +---+ +---+ +---+ +---+
| S1| ... | Si| | Tk| ... | To| | U1| ... | Up|
+---+ +---+ +---+ +---+ +---+ -----
26
HA with Binlog Servers
● Distributed Binlog Serving Service (DBSS):
+---+
| M |
+---+
|
+----+----------------------------------------------------------+
| |
+----+-------+--------------+-------+--------------+-------+----+
| | | | | |
+---+ +---+ +---+ +---+ +---+ +---+
| S1|...| Sn| | T1|...| Tm| | U1|...| Uo|
+---+ +---+ +---+ +---+ +---+ +---+
● Properties:
● A single Binlog Server failure does not disrupt the service (resilience)
● Minimise inter Data Center bandwidth requirements
● Allows to promote a new master without touching any slave
27
HA with Binlog Servers‟
● Zoom in DBSS:
|
+----|---------------------------------------------------------+
| | |
| +----------------------+---------------------+ |
| | | | |
| /  /  /  |
| / --->/  / --->/  / --->/  |
| ----- /  ----- /  ----- /  |
| | ----- | ----- | ----- |
+----|-------|--------------|-------|-------------|-------|----+
| | | | | |
28
HA with Binlog Servers‟‟
● Crash of the master:
+--------------------------------------------------------------+
| |
| |
| |
| /  /  /  |
| / --->/  / --->/  / --->/  |
| ----- /  ----- /  ----- /  |
| | ----- | ----- | ----- |
+----|-------|--------------|-------|-------------|-------|----+
| | | | | |
29
HA with Binlog Servers‟‟
● Crash of the master:
● Step # 1: level the Binlog Servers (the slaves will follow)
+--------------------------------------------------------------+
| |
| |
| |
| /  <----------------- /  ----------------> /  |
| / --->/  / --->/  / --->/  |
| ----- /  ----- /  ----- /  |
| | ----- | ----- | ----- |
+----|-------|--------------|-------|-------------|-------|----+
| | | | | |
30
HA with Binlog Servers‟‟‟
● Crash of the master:
● Step # 2: promote a slave as the new master (there is a trick)
|
+-------------------------------------------------|------------+
| | |
| +----------------------+---------------------+ |
| | | | |
| /  /  /  |
| / --->/  / --->/  / --->/  |
| ----- /  ----- /  ----- /  |
| | ----- | ----- | ----- |
+----|-------|--------------|-------|-------------|-------|----+
| | | | | |
31
HA with Binlog Servers‟‟‟ ‟
● Crash of the master - the trick:
● Needs the same binary log filename on master and slaves
1. “FLUSH BINARY LOGS” on candidate master
until its binary log filename follows the one available on the BLSs
2. On the new master:
● “PURGE BINARY LOGS TO „<latest binary log file>'”
● “RESET SLAVE ALL”
3. Point the writes to the new master
4. Make the Binlog Servers replicate from the new master
● From the point of view of the Binlog Server, the master only rebooted
with a new ServerID and a new UUID.
32
New Master wo Touching Slaves
+---+
| M |
+---+
|
+----+----------------------------------------------------+
| |
+----+-------+-----------+-------+-----------+-------+----+
| | | | | |
+---+ +---+ +---+ +---+ +---+ +---+
| S1|...| Sn| | T1|...| Tm| | U1|...| Uo|
+---+ +---+ +---+ +---+ +---+ +---+
33
New Master wo Touching Slaves
34
+-/+
| X |
+/-+
+---------------------------------------------------------+
| |
+----+-------+-----------+-------+-----------+-------+----+
| | | | | |
+---+ +---+ +---+ +---+ +---+ +---+
| S1|...| Sn| | T1|...| Tm| | U1|...| Uo|
+---+ +---+ +---+ +---+ +---+ +---+
New Master wo Touching Slaves
35
+-/+ +---+
| X | | T1|
+/-+ +---+
|
+------------------------+--------------------------------+
| |
+----+-------+-----------+-------+-----------+-------+----+
| | | | | |
+---+ +---+ +---+ +---+ +---+ +---+
| S1|...| Sn| | T2|...| Tm| | U1|...| Uo|
+---+ +---+ +---+ +---+ +---+ +---+
New Master wo Touching Slaves‟
36
● “FLUSH BINARY LOGS” in a loop is ugly (but it works)
● A “RESET MASTER at/to „binlog.00xxxx‟”
would be much nicer:
● https://bugs.mysql.com/bug.php?id=77438
Binlog Servers with Orchestrator
37
● Orchestrator is the tool we use for managing Binlog Servers
● https://github.com/orchestrator/orchestrator
Binlog Server: Links
● http://blog.booking.com/mysql_slave_scaling_and_more.html
● http://blog.booking.com/abstracting_binlog_servers_and_mysql_master_promotion_wo_reconfig
uring_slaves.html
● HOWTO Install and Configure Binlog Servers:
http://jfg-mysql.blogspot.com/2015/04/maxscale-binlog-server-howto-install-and-configure.html
● http://blog.booking.com/better_crash_safe_replication_for_mysql.html
● http://blog.booking.com/better_parallel_replication_for_mysql.html
(http://blog.booking.com/evaluating_mysql_parallel_replication_2-slave_group_commit.html)
● http://jfg-mysql.blogspot.nl/2015/10/binlog-servers-for-backups-and-point-in-time-recovery.html
● Note: the Binlog Servers concept should work with any version of
MySQL (5.7, 5.6, 5.5 and 5.1)
38
Questions
Jean-François Gagné
jeanfrancois DOT gagne AT booking.com

More Related Content

Viewers also liked

Avoiding the ring of death
Avoiding the ring of deathAvoiding the ring of death
Avoiding the ring of deathAishvarya Verma
 
Media magazine covers
Media magazine covers Media magazine covers
Media magazine covers rjenkinss
 
Basic Knowledge on MySql Replication
Basic Knowledge on MySql ReplicationBasic Knowledge on MySql Replication
Basic Knowledge on MySql ReplicationTasawr Interactive
 
MySQL Group Replication
MySQL Group ReplicationMySQL Group Replication
MySQL Group ReplicationBogdan Kecman
 
A/B Testing - In data we trust
A/B Testing - In data we trustA/B Testing - In data we trust
A/B Testing - In data we trustPedro Marques
 
Riding the Binlog: an in Deep Dissection of the Replication Stream
Riding the Binlog: an in Deep Dissection of the Replication StreamRiding the Binlog: an in Deep Dissection of the Replication Stream
Riding the Binlog: an in Deep Dissection of the Replication StreamJean-François Gagné
 
Technical Introduction to PostgreSQL and PPAS
Technical Introduction to PostgreSQL and PPASTechnical Introduction to PostgreSQL and PPAS
Technical Introduction to PostgreSQL and PPASAshnikbiz
 
Mix ‘n’ Match Async and Group Replication for Advanced Replication Setups
Mix ‘n’ Match Async and Group Replication for Advanced Replication SetupsMix ‘n’ Match Async and Group Replication for Advanced Replication Setups
Mix ‘n’ Match Async and Group Replication for Advanced Replication SetupsPedro Gomes
 
New awesome features in MySQL 5.7
New awesome features in MySQL 5.7New awesome features in MySQL 5.7
New awesome features in MySQL 5.7Zhaoyang Wang
 
MySQL Day Paris 2016 - State Of The Dolphin
MySQL Day Paris 2016 - State Of The DolphinMySQL Day Paris 2016 - State Of The Dolphin
MySQL Day Paris 2016 - State Of The DolphinOlivier DASINI
 

Viewers also liked (11)

Historia 2
Historia 2Historia 2
Historia 2
 
Avoiding the ring of death
Avoiding the ring of deathAvoiding the ring of death
Avoiding the ring of death
 
Media magazine covers
Media magazine covers Media magazine covers
Media magazine covers
 
Basic Knowledge on MySql Replication
Basic Knowledge on MySql ReplicationBasic Knowledge on MySql Replication
Basic Knowledge on MySql Replication
 
MySQL Group Replication
MySQL Group ReplicationMySQL Group Replication
MySQL Group Replication
 
A/B Testing - In data we trust
A/B Testing - In data we trustA/B Testing - In data we trust
A/B Testing - In data we trust
 
Riding the Binlog: an in Deep Dissection of the Replication Stream
Riding the Binlog: an in Deep Dissection of the Replication StreamRiding the Binlog: an in Deep Dissection of the Replication Stream
Riding the Binlog: an in Deep Dissection of the Replication Stream
 
Technical Introduction to PostgreSQL and PPAS
Technical Introduction to PostgreSQL and PPASTechnical Introduction to PostgreSQL and PPAS
Technical Introduction to PostgreSQL and PPAS
 
Mix ‘n’ Match Async and Group Replication for Advanced Replication Setups
Mix ‘n’ Match Async and Group Replication for Advanced Replication SetupsMix ‘n’ Match Async and Group Replication for Advanced Replication Setups
Mix ‘n’ Match Async and Group Replication for Advanced Replication Setups
 
New awesome features in MySQL 5.7
New awesome features in MySQL 5.7New awesome features in MySQL 5.7
New awesome features in MySQL 5.7
 
MySQL Day Paris 2016 - State Of The Dolphin
MySQL Day Paris 2016 - State Of The DolphinMySQL Day Paris 2016 - State Of The Dolphin
MySQL Day Paris 2016 - State Of The Dolphin
 

More from Jean-François Gagné

MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)
MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)
MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)Jean-François Gagné
 
Almost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
Almost Perfect Service Discovery and Failover with ProxySQL and OrchestratorAlmost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
Almost Perfect Service Discovery and Failover with ProxySQL and OrchestratorJean-François Gagné
 
Demystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyDemystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyJean-François Gagné
 
The consequences of sync_binlog != 1
The consequences of sync_binlog != 1The consequences of sync_binlog != 1
The consequences of sync_binlog != 1Jean-François Gagné
 
Autopsy of a MySQL Automation Disaster
Autopsy of a MySQL Automation DisasterAutopsy of a MySQL Automation Disaster
Autopsy of a MySQL Automation DisasterJean-François Gagné
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentJean-François Gagné
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentJean-François Gagné
 
Demystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyDemystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyJean-François Gagné
 
Demystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyDemystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyJean-François Gagné
 
The Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication TutorialThe Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication TutorialJean-François Gagné
 
MySQL Parallel Replication by Booking.com
MySQL Parallel Replication by Booking.comMySQL Parallel Replication by Booking.com
MySQL Parallel Replication by Booking.comJean-François Gagné
 
MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...
MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...
MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...Jean-François Gagné
 
MySQL/MariaDB Parallel Replication: inventory, use-case and limitations
MySQL/MariaDB Parallel Replication: inventory, use-case and limitationsMySQL/MariaDB Parallel Replication: inventory, use-case and limitations
MySQL/MariaDB Parallel Replication: inventory, use-case and limitationsJean-François Gagné
 
The two little bugs that almost brought down Booking.com
The two little bugs that almost brought down Booking.comThe two little bugs that almost brought down Booking.com
The two little bugs that almost brought down Booking.comJean-François Gagné
 
How Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lagHow Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lagJean-François Gagné
 
MySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitationsMySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitationsJean-François Gagné
 

More from Jean-François Gagné (17)

MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)
MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)
MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)
 
Almost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
Almost Perfect Service Discovery and Failover with ProxySQL and OrchestratorAlmost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
Almost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
 
Demystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyDemystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash Safety
 
The consequences of sync_binlog != 1
The consequences of sync_binlog != 1The consequences of sync_binlog != 1
The consequences of sync_binlog != 1
 
Autopsy of a MySQL Automation Disaster
Autopsy of a MySQL Automation DisasterAutopsy of a MySQL Automation Disaster
Autopsy of a MySQL Automation Disaster
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated Environment
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated Environment
 
Demystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyDemystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash Safety
 
Demystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyDemystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash Safety
 
The Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication TutorialThe Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication Tutorial
 
MySQL Parallel Replication by Booking.com
MySQL Parallel Replication by Booking.comMySQL Parallel Replication by Booking.com
MySQL Parallel Replication by Booking.com
 
MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...
MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...
MySQL Parallel Replication (LOGICAL_CLOCK): all the 5.7 (and some of the 8.0)...
 
MySQL/MariaDB Parallel Replication: inventory, use-case and limitations
MySQL/MariaDB Parallel Replication: inventory, use-case and limitationsMySQL/MariaDB Parallel Replication: inventory, use-case and limitations
MySQL/MariaDB Parallel Replication: inventory, use-case and limitations
 
The two little bugs that almost brought down Booking.com
The two little bugs that almost brought down Booking.comThe two little bugs that almost brought down Booking.com
The two little bugs that almost brought down Booking.com
 
Autopsy of an automation disaster
Autopsy of an automation disasterAutopsy of an automation disaster
Autopsy of an automation disaster
 
How Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lagHow Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lag
 
MySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitationsMySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitations
 

Recently uploaded

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 

Recently uploaded (20)

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

Binlog Servers at Booking.com

  • 1. Binlog Servers at Booking.com Jean-François Gagné jeanfrancois DOT gagne AT booking.com Presented at Oracle Open World 2015
  • 3. Booking.com‟ ● Based in Amsterdam since 1996 ● Online Hotel and Accommodation Agent: ● 170 offices worldwide ● +819.000 properties in 221 countries ● 42 languages (website and customer service) ● Part of the Priceline Group ● And we use MySQL: ● Thousands (1000s) of servers, ~85% replicating ● >110 masters: ~25 >50 slaves & ~8 >100 slaves 3
  • 4. Binlog Server: Session Summary 1. Replication and the Binlog Server 2. Extreme Read Scaling 3. Remote Site Replication (and Disaster Recovery) 4. Easy High Availability 5. Other Use-Cases (Crash Safety, Parallel Replication and Backups) 6. Binlog Servers at Booking.com 7. New master without touching slaves 4
  • 5. Binlog Server: Replication ● One master / one or more slaves ● The master records all writes in a journal: the binary logs ● Each slave: ● Downloads the journal and saves it locally (IO thread): relay logs ● Executes the relay logs on the local database (SQL thread) ● Could produce binary logs to be itself a master (log-slave-updates) ● Replication is: ● Asynchronous  lag ● Single threaded (in MySQL 5.6)  slower than the master 5
  • 6. Binlog Server: Booking.com‟‟ ● Typical replication deployment: +---+ | M | +---+ | +------+-- ... --+---------------+-------- ... | | | | +---+ +---+ +---+ +---+ | S1| | S2| | Sn| | M1| +---+ +---+ +---+ +---+ | +-- ... --+ | | +---+ +---+ | T1| | Tm| +---+ +---+ ● Si and Tj are for read scaling ● Mi are the DR master 6 Intermediate Master :-(
  • 7. Binlog Server: What ● Binlog Server (BLS): is a daemon that: ● Downloads the binary logs from the master ● Saves them identically as on the master ● Serves them to slaves ● A or X are the same for B and C: ● By design, the binary logs served by A and X are the same 7 +---+ / | A | ---> / X +---+ ----- | | +---+ +---+ | B | | C | +---+ +---+
  • 8. Binlog Server: Read Scaling ● Typical replication topology for read scaling: +---+ | M | +---+ | +--------+--------+--- ... ---+ | | | | +---+ +---+ +---+ +---+ | S1| | S2| | S3| | Sn| +---+ +---+ +---+ +---+ ● When there are too many slaves, the network of M is overloaded: ● 100 slaves x 1Mbit/s: very close to 1Gbit/s ● OSC or purging data in RBR becomes hard  Slave lag or unreachable master for writes 8
  • 9. Binlog Server: Read Scaling‟ ● Typical solution: fan-out with Intermediary Masters (IM): +---+ | M | +---+ | +--------+- ... -+ | | | +---+ +---+ +---+ | M1| | M2| | Mm| +---+ +---+ +---+ | | | +-- ... +- ... +-- ... --+ | | | | +---+ +---+ +---+ +---+ | S1| | T1| | Z1| | Zi| +---+ +---+ +---+ +---+ 9 ● But Intermediate Masters bring problems: ● log-slave-updates  IM are slower than slaves ● Lag of an IM  all its slaves are lagging ● Rogue transaction on IM  infection of all its slave ● Failure of an IM  all its slaves stop replicating (and action must be taken fast)
  • 10. Binlog Server: Read Scaling‟‟ ● Solving IM problems with shared disk: ● Filers (expensive) or DRBD (doubling the number of servers) ● sync_binlog = 1 + trx_commit = 1  slower replication  lag ● After a crash of an Intermediate Master: ● we need InnoDB recovery  replication on slaves stalled  lag ● and the cache is cold  replication will be slow  lag ● Solving IM problems with GTIDs: ● They allow slave repointing at the cost of added complexity :-| ● But they do not completely solve the lag problem :-( ● And we cannot migrate online with MySQL 5.6 :-( :-( 10
  • 11. Binlog Server: Read Scaling‟‟‟ ● New Solution: replace IM by Binlog Servers +---+ | M | +---+ | +----------------+----- ... -----+ | | | / / / / I1 / I2 / Im ----- ----- ----- | | | +------+ ... +--- ... +--- ... ---+ | | | | | +---+ +---+ +---+ +---+ +---+ | S1| | S2| | Si| | Sj| | Sn| +---+ +---+ +---+ +---+ +---+ 11 ● BLS should not lag ● If a BLS fails, repoint its slaves to other BLSs (easy by design)
  • 12. Binlog Server: Remote Site ● Typical deployment for remote site: +---+ | A | +---+ | +------+------+---------------+ | | | | +---+ +---+ +---+ +---+ | B | | C | | D | | E | +---+ +---+ +---+ +---+ | +------+------+ | | | +---+ +---+ +---+ | F | | G | | H | +---+ +---+ +---+ 12 Intermediate Master :-( E is an Intermediate Master  same problems as read scaling
  • 13. Binlog Server: Remote Site‟ ● Ideally, we would like this: +---+ | A | +---+ | +------+------+---------------+------+------+------+ | | | | | | | +---+ +---+ +---+ +---+ +---+ +---+ +---+ | B | | C | | D | | E | | F | | G | | H | +---+ +---+ +---+ +---+ +---+ +---+ +---+ ● No lag and no Single Point of Failure (SPOF) ● But no master on remote site for writes (easy solvable problem) ● And expensive in WAN bandwidth (harder problem to solve) 13
  • 14. Binlog Server: Remote Site‟‟ ● New solution: a Binlog Server on the remote site: +---+ | A | +---+ | +------+------+---------------+ | | | | +---+ +---+ +---+ / | B | | C | | D | / X +---+ +---+ +---+ ----- | +------+------+------+ | | | | +---+ +---+ +---+ +---+ | E | | F | | G | | H | +---+ +---+ +---+ +---+ 14
  • 15. Binlog Server: Remote Site‟‟‟ ● Or deploy 2 Binlog Servers to get better resilience: +---+ | A | +---+ | +------+------+---------------+ | | | | +---+ +---+ +---+ / / | B | | C | | D | / X ------> / Y +---+ +---+ +---+ ----- ----- | | +------+ +------+ | | | | +---+ +---+ +---+ +---+ | E | | F | | G | | H | +---+ +---+ +---+ +---+ 15 ● If Y fails, repoint G and H to X, ● If X fails, repoint Y to A and E and F to Y
  • 16. Binlog Server: Remote Site‟‟‟ ‟ ● Interesting property: if A fails, E, F, G & H converge to a common state +---+ | A | +---+ | +------+------+---------------+ | | | | +---+ +---+ +---+ / | B | | C | | D | / X +---+ +---+ +---+ ----- | +------+------+------+ | | | | +---+ +---+ +---+ +---+ | E | | F | | G | | H | +---+ +---+ +---+ +---+ ● New master promotion is easy on remote site 16
  • 17. Binlog Server: Remote Site‟‟‟ ‟‟ ● Step by step master promotion: 1. The 1st slave that is up to date can be the new master 2. “SHOW MASTER STATUS” or “RESET MASTER”, and “RESET SLAVE ALL” on the new master 3. Writes can be pointed to the new master 4. Once a slave is up to date, repoint it to the new master at the position of step # 2 5. Keep delayed/lagging slaves under X until up to date 6. Once no slaves is left under X, recycle it as a Binlog Server for the new master 17 / / X ----- | +-------------+ | +---+ | G | +---+ +---+ | F | +---+ | +------+ | | +---+ +---+ | E | | H | +---+ +---+
  • 18. Binlog Server: High Availability ● This property can be used for high availability: +---+ | A | +---+ | | / / X ----- | +------+------+------+------+------+------+------+ | | | | | | | | +---+ +---+ +---+ +---+ +---+ +---+ +---+ +---+ | B | | C | | D | | E | | F | | G | | H | | I | +---+ +---+ +---+ +---+ +---+ +---+ +---+ +---+ 18
  • 19. Binlog Server: Other Use-Cases ● Better Crash-Safe Replication ● http://blog.booking.com/better_crash_safe_replication_for_mysql.html ● Better Parallel Replication in MySQL 5.7 (LOGICAL_CLOCK) ● http://blog.booking.com/better_parallel_replication_for_mysql.html ● Easier Point in Time Recovery ● http://jfg-mysql.blogspot.com/ 2015/10/binlog-servers-for-backups-and-point-in-time-recovery.html 19
  • 20. Binlog Server: Better // Replication ● Four transactions on X, Y and Z: +---+ | X | +---+ | V +---+ | Y | +---+ | V +---+ | Z | +---+ ● IM might stall the parallel replication pipeline ● To benefit from parallel replication, IM must disappear ● The Binlog Server allows exactly that 20 On Y: ----Time----> B---C B---C B-------C B-------C On Z: ----Time---------> B---C B---C B-------C B-------C On X: ----Time----> T1 B---C T2 B---C T3 B-------C T4 B-------C
  • 21. Binlog Server: Point in Time Recovery ● Implementing Point in Time Recovery means: ● to regularly take a backup of the database, ● and to save the binary logs of that database. ● Executing Point in Time Recovery means: ● Restoring the backup, ● Applying the binary logs. +---+ / +---+ | M | ----> / X ----> | S | +---+ ----- +---+ 21
  • 22. BLS@Booking.com ● Reminder: typical deployment at Booking.com: +---+ | M | +---+ | +--- ... ---+ ... +-------------+--- ... | | | | +---+ +---+ +---+ +---+ | S1| | Sj| | Sn| | M1| +---+ +---+ +---+ +---+ | +-- ... --+ | | +---+ +---+ | T1| | To| +---+ +---+ 22
  • 23. BLS@Booking.com‟ ● We are deploying Binlog Server Clusters to offload masters: +---+ | M | +---+ | +-----+-----+ ... +-------------+--- ... | | | | | / / +---+ +---+ +---+ / A1 / A2 | Sj| | Sn| | M1| ----- ----- +---+ +---+ +---+ | | | +-+-----+-+ ... -+-----+ | | | | +---+ +---+ / / | S1| ... | Si| / Z1 / Z2 +---+ +---+ ----- ----- 23 ● We have in production: ● >40 Binlog Servers ● >20 BLS Clusters ● >650 slaves replicating from Binlog Servers
  • 24. BLS@Booking.com‟‟ ● What is a Binlog Server Cluster ? ● At least 2 Binlog Servers ● Replicating from the same master ● With independent failure mode (not same switch/rack/…) ● With a Service DNS entry resolving to all IP addresses  Failure of a BLS transparent to slaves ● Thanks to DNS, the slaves connected to a failing Binlog Server reconnect to the others  Easy maintenance/upgrade of a Binlog Server 24 | +-----+ | | / / / A1 / A2 ----- ----- | | +-+-----+-+ | | +---+ +---+ | S1| ... | Si| +---+ +---+
  • 25. BLS@Booking.com‟‟‟ ● We are deploying BLS side-by-side with IM to reduce delay: +---+ | M | +---+ | +-----+-----+ ... +-------------+-----+----------- ... | | | | | | / / +---+ +---+ +---+ / -->/ / A1 / A2 | Sj| | Sn| | M1| / B1 / B2 ----- ----- +---+ +---+ +---+ ----- ----- | | | | | +-+-----+-+ ... +-+-----+-+ | | | | +---+ +---+ +---+ +---+ | S1| ... | Si| | Tk| ... | To| +---+ +---+ +---+ +---+ 25
  • 26. BLS@Booking.com‟‟‟ ‟ ● We are deploying a new Data Center without IM: +---+ | M | +---+ | +-----+-----+ ... +-------------+-----+---------------------+ | | | | | | | / / +---+ +---+ +---+ / -->/ / -->/ / A1 / A2 | Sj| | Sn| | M1| / B1 / B2 / C1 / C2 ----- ----- +---+ +---+ +---+ ----- ----- ----- ----- | | | | | | | +-+-----+-+ ... +-+-----+-+ +-+-----+-+ | | | | | | +---+ +---+ +---+ +---+ +---+ +---+ | S1| ... | Si| | Tk| ... | To| | U1| ... | Up| +---+ +---+ +---+ +---+ +---+ ----- 26
  • 27. HA with Binlog Servers ● Distributed Binlog Serving Service (DBSS): +---+ | M | +---+ | +----+----------------------------------------------------------+ | | +----+-------+--------------+-------+--------------+-------+----+ | | | | | | +---+ +---+ +---+ +---+ +---+ +---+ | S1|...| Sn| | T1|...| Tm| | U1|...| Uo| +---+ +---+ +---+ +---+ +---+ +---+ ● Properties: ● A single Binlog Server failure does not disrupt the service (resilience) ● Minimise inter Data Center bandwidth requirements ● Allows to promote a new master without touching any slave 27
  • 28. HA with Binlog Servers‟ ● Zoom in DBSS: | +----|---------------------------------------------------------+ | | | | +----------------------+---------------------+ | | | | | | | / / / | | / --->/ / --->/ / --->/ | | ----- / ----- / ----- / | | | ----- | ----- | ----- | +----|-------|--------------|-------|-------------|-------|----+ | | | | | | 28
  • 29. HA with Binlog Servers‟‟ ● Crash of the master: +--------------------------------------------------------------+ | | | | | | | / / / | | / --->/ / --->/ / --->/ | | ----- / ----- / ----- / | | | ----- | ----- | ----- | +----|-------|--------------|-------|-------------|-------|----+ | | | | | | 29
  • 30. HA with Binlog Servers‟‟ ● Crash of the master: ● Step # 1: level the Binlog Servers (the slaves will follow) +--------------------------------------------------------------+ | | | | | | | / <----------------- / ----------------> / | | / --->/ / --->/ / --->/ | | ----- / ----- / ----- / | | | ----- | ----- | ----- | +----|-------|--------------|-------|-------------|-------|----+ | | | | | | 30
  • 31. HA with Binlog Servers‟‟‟ ● Crash of the master: ● Step # 2: promote a slave as the new master (there is a trick) | +-------------------------------------------------|------------+ | | | | +----------------------+---------------------+ | | | | | | | / / / | | / --->/ / --->/ / --->/ | | ----- / ----- / ----- / | | | ----- | ----- | ----- | +----|-------|--------------|-------|-------------|-------|----+ | | | | | | 31
  • 32. HA with Binlog Servers‟‟‟ ‟ ● Crash of the master - the trick: ● Needs the same binary log filename on master and slaves 1. “FLUSH BINARY LOGS” on candidate master until its binary log filename follows the one available on the BLSs 2. On the new master: ● “PURGE BINARY LOGS TO „<latest binary log file>'” ● “RESET SLAVE ALL” 3. Point the writes to the new master 4. Make the Binlog Servers replicate from the new master ● From the point of view of the Binlog Server, the master only rebooted with a new ServerID and a new UUID. 32
  • 33. New Master wo Touching Slaves +---+ | M | +---+ | +----+----------------------------------------------------+ | | +----+-------+-----------+-------+-----------+-------+----+ | | | | | | +---+ +---+ +---+ +---+ +---+ +---+ | S1|...| Sn| | T1|...| Tm| | U1|...| Uo| +---+ +---+ +---+ +---+ +---+ +---+ 33
  • 34. New Master wo Touching Slaves 34 +-/+ | X | +/-+ +---------------------------------------------------------+ | | +----+-------+-----------+-------+-----------+-------+----+ | | | | | | +---+ +---+ +---+ +---+ +---+ +---+ | S1|...| Sn| | T1|...| Tm| | U1|...| Uo| +---+ +---+ +---+ +---+ +---+ +---+
  • 35. New Master wo Touching Slaves 35 +-/+ +---+ | X | | T1| +/-+ +---+ | +------------------------+--------------------------------+ | | +----+-------+-----------+-------+-----------+-------+----+ | | | | | | +---+ +---+ +---+ +---+ +---+ +---+ | S1|...| Sn| | T2|...| Tm| | U1|...| Uo| +---+ +---+ +---+ +---+ +---+ +---+
  • 36. New Master wo Touching Slaves‟ 36 ● “FLUSH BINARY LOGS” in a loop is ugly (but it works) ● A “RESET MASTER at/to „binlog.00xxxx‟” would be much nicer: ● https://bugs.mysql.com/bug.php?id=77438
  • 37. Binlog Servers with Orchestrator 37 ● Orchestrator is the tool we use for managing Binlog Servers ● https://github.com/orchestrator/orchestrator
  • 38. Binlog Server: Links ● http://blog.booking.com/mysql_slave_scaling_and_more.html ● http://blog.booking.com/abstracting_binlog_servers_and_mysql_master_promotion_wo_reconfig uring_slaves.html ● HOWTO Install and Configure Binlog Servers: http://jfg-mysql.blogspot.com/2015/04/maxscale-binlog-server-howto-install-and-configure.html ● http://blog.booking.com/better_crash_safe_replication_for_mysql.html ● http://blog.booking.com/better_parallel_replication_for_mysql.html (http://blog.booking.com/evaluating_mysql_parallel_replication_2-slave_group_commit.html) ● http://jfg-mysql.blogspot.nl/2015/10/binlog-servers-for-backups-and-point-in-time-recovery.html ● Note: the Binlog Servers concept should work with any version of MySQL (5.7, 5.6, 5.5 and 5.1) 38