Slides for the webinar held on January 21st 2014
Repair & Recovery for your MySQL, MariaDB & MongoDB / TokuMX Clusters
Galera Cluster, NDB Cluster, VIP with HAProxy and Keepalived, MongoDB Sharded Cluster, etc. all have their own availability models. We are aware of these availability models and will demonstrate in this webinar how to take corrective action in case of failures via our cluster management tool, ClusterControl.
In this webinar, Severalnines CTO Johan Andersson will show you how to leverage ClusterControl to detect failures in your database cluster and automatically repair them to maximize the availability of your database services. And Codership CEO Seppo Jaakola will be joining Johan to provide a deep-dive into Galera recovery internals.
Agenda:
Redundancy models for Galera, NDB and MongoDB/TokuMX
Failover & Recovery (Automatic vs Manual)
Zooming into Galera recovery procedures
Split brains in multi-datacenter setups
3. Node Recovery Scenarios
Node drops from cluster gracefully and joins back
● Replication state is stored in grastate.dat file
● Joining happens by Incremental State Transfer (IST)
Joining after node crash
● Node has either known or unknown state
● Joining can happen by IST, or full State Snapshot Transfer (SST) is required
Full Cluster recovery
● e.g. data center power down
● All nodes with known or unknown states
● The node with latest known state must be identified
● New cluster needs to be bootstrapped
www.codership.com
6. Automatic Node Joining
Cluster selects donor to help the joiner to join
[Diagram: the Donor sends state to the joiner by IST or SST. Labels: MySQL, Galera Replication]
10. Incremental State Transfer
● Every node in Galera Cluster has a log of replicated write sets: gcache
● Gcache is an mmap file; available disk space is the upper limit for size allocation
● If the joining node has past history in the cluster and the donor has a long enough gcache containing the joiner's seqno position, then IST can be used for synchronization
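The IST condition above can be sketched as a small check. This is a simplified model for illustration, not Galera's actual implementation: the names NodeState and can_use_ist, and the idea of describing the donor's gcache by its first and last seqno, are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """State a joiner would read from its grastate.dat (hypothetical type)."""
    group_uuid: str  # cluster (group) ID
    seqno: int       # last applied seqno, -1 if unknown


def can_use_ist(joiner: NodeState, donor_group_uuid: str,
                gcache_first_seqno: int, gcache_last_seqno: int) -> bool:
    """Return True if the donor's gcache covers everything the joiner missed."""
    # The joiner must have past history in this same cluster.
    if joiner.group_uuid != donor_group_uuid:
        return False
    # An unknown position (-1) cannot be used as a starting point.
    if joiner.seqno < 0:
        return False
    # The first write set the joiner needs is seqno + 1; the donor must still
    # hold it, and the joiner must not be ahead of the donor.
    return (gcache_first_seqno <= joiner.seqno + 1
            and joiner.seqno <= gcache_last_seqno)
```

If the check fails, the model corresponds to the case where a full SST is needed.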
11. Incremental State Transfer
[Diagram: the Joiner sends a request to join with its GTID (Group ID:seqno) at seqno-n, read from grastate.dat; the Donor is at seqno-n+m; each node has a gcache. Node labels: Node-1, Node-n]
13. Incremental State Transfer
● Node synchronization by IST is very effective and the least intrusive method for the donor
● gcache.size parameter defines how big a cache will be maintained
● Use database size and write rate to optimize gcache:
➢ gcache < database size
➢ Write rate defines how long a tail is available in cache
● If joiner node had crashed and IST was used to synchronize it back, then it is essential that InnoDB recovery works (innodb_doublewrite)
● If IST is not possible, donor will switch automatically to SST method
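The relation between gcache size and write rate can be made concrete with back-of-the-envelope arithmetic. The sketch below assumes the write rate is measured as replicated write-set bytes per second (for example, sampled from the wsrep_replicated_bytes and wsrep_received_bytes status counters) and ignores any per-write-set overhead in the cache. The function names are hypothetical.

```python
def gcache_window_seconds(gcache_size_bytes: int,
                          write_rate_bytes_per_sec: float) -> float:
    """Roughly how long a node can be away and still rejoin by IST."""
    if write_rate_bytes_per_sec <= 0:
        return float("inf")  # no writes: the cache never wraps
    return gcache_size_bytes / write_rate_bytes_per_sec


def gcache_size_for_outage(write_rate_bytes_per_sec: float,
                           outage_seconds: float) -> int:
    """Rough gcache.size (bytes) needed to cover an outage of this length."""
    return int(write_rate_bytes_per_sec * outage_seconds)
```

For example, at 1 MiB/s of write sets, a 1 GiB gcache holds roughly 1024 seconds of history under these assumptions. Per the slide, the result should stay below the database size, since beyond that point a full SST transfers less data.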
17. State Snapshot Transfer
● To send full database state
● wsrep_sst_method to choose the method:
➢ mysqldump
➢ rsync
➢ Xtrabackup
● Open API for creating new SST methods
● All SST methods cause at least some service break in donor node
● If node has crashed, InnoDB recovery will happen during startup. But with SST, this InnoDB recovery is more or less useless
19. Full Cluster Recovery
All nodes dropped from cluster:
1. Find the node which has latest changes
2. Bootstrap new cluster from the latest node
20. Node With Latest Changes
Check grastate.dat files:
1. File has valid seqno
# GALERA saved state
version: 2.1
uuid: 5ee99582-bb8d-11e2-b8e3-23de375c1d30
seqno: 8204503945773
● Graceful shutdown
● Find node which has biggest seqno
2. No seqno, but group ID is there
# GALERA saved state
version: 2.1
uuid: 5ee99582-bb8d-11e2-b8e3-23de375c1d30
seqno: -1
● Crash during transaction processing
● Use --wsrep-recover to dig out the last seqno
3. No seqno, no group ID
# GALERA saved state
version: 2.1
uuid: 00000000-0000-0000-0000-000000000000
seqno: -1
● Crash during DDL
http://www.codership.com/wiki/doku.php?id=mysql_galera_restart
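The three grastate.dat cases above can be told apart mechanically. The following is a minimal sketch, assuming the file contents have already been collected from each node as text. The function names, the case labels, and the hostname keys are hypothetical, and the sketch does not verify that all nodes share the same group ID.

```python
NIL_UUID = "00000000-0000-0000-0000-000000000000"


def parse_grastate(text: str) -> dict:
    """Parse 'key: value' lines of a grastate.dat file, skipping comments."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


def classify(state: dict) -> str:
    """Map a parsed state to one of the three cases on the slide."""
    if state.get("uuid", NIL_UUID) == NIL_UUID:
        return "no state"        # case 3: no seqno, no group ID
    if int(state.get("seqno", "-1")) < 0:
        return "group ID only"   # case 2: needs --wsrep-recover
    return "valid seqno"         # case 1: graceful shutdown


def node_with_latest_changes(states: dict) -> str:
    """Pick the host with the biggest seqno; states maps host -> parsed file."""
    candidates = {}
    for host, state in states.items():
        kind = classify(state)
        if kind == "group ID only":
            # Its real position is unknown until recovered, so refuse to guess.
            raise ValueError(f"{host}: recover the seqno with --wsrep-recover first")
        if kind == "valid seqno":
            candidates[host] = int(state["seqno"])
    if not candidates:
        raise ValueError("no node has a usable seqno")
    return max(candidates, key=candidates.get)
```

Refusing to choose while any node still has seqno -1 is a deliberate choice in this sketch: a crashed node may hold the latest changes.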
21. --wsrep-recover
● MySQL stores the last committed GTID in the InnoDB data header, transactionally
● This GTID can be read by starting mysqld with the --wsrep-recover option
➢ <path to bin>/mysqld --defaults-file=<path to my.cnf> --wsrep-recover
➢ Mysqld will read InnoDB header files and shut down immediately
➢ Last wsrep position is printed in the mysql error file
130514 18:39:13 [Note] WSREP: Recovered position: 5ee99582-bb8d-11e2-b8e3-23de375c1d30:8204503945771
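The recovered position can be pulled out of the error log text with a regular expression. This is a minimal sketch that assumes the log line has the same format as the sample above. The function name recovered_position is hypothetical, and reading the error file from disk is left to the caller.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "WSREP: Recovered position: <group uuid>:<seqno>"
RECOVERED = re.compile(
    r"WSREP: Recovered position:\s*([0-9a-fA-F-]{36}):(-?\d+)"
)


def recovered_position(error_log_text: str) -> Optional[Tuple[str, int]]:
    """Return (group uuid, seqno) from the last matching log line, or None."""
    matches = RECOVERED.findall(error_log_text)
    if not matches:
        return None
    uuid, seqno = matches[-1]  # the most recent recovery run wins
    return uuid, int(seqno)
```

The seqno obtained this way can then be compared with the seqno values of the other nodes to find the node with the latest changes.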
22. Bootstrapping New Cluster
● When the latest node has been identified, start this node as the first node in the cluster, using either of:
➢ service mysql start --wsrep_new_cluster
➢ service mysql start --wsrep_cluster_address=gcomm://
● Start all other nodes. my.cnf should have wsrep_cluster_address pointing to all other nodes
➢ service mysql start
● Don't fire up all nodes at once; rather, start them one by one
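The start order described above can be written down as a plan before anything is run. The sketch below only builds a list of (host, command) pairs from the commands shown on the slide and does not execute them. The function name start_plan and the host names are hypothetical.

```python
from typing import List, Tuple


def start_plan(bootstrap_node: str, nodes: List[str]) -> List[Tuple[str, str]]:
    """Return (host, command) steps: bootstrap node first, then the rest in order."""
    if bootstrap_node not in nodes:
        raise ValueError("bootstrap node must be one of the cluster nodes")
    # The node with the latest changes bootstraps the new cluster.
    plan = [(bootstrap_node, "service mysql start --wsrep_new_cluster")]
    # Every other node joins with a plain start, one at a time.
    for node in nodes:
        if node != bootstrap_node:
            plan.append((node, "service mysql start"))
    return plan
```

One way to follow the "one by one" advice is to wait until each started node reports wsrep_local_state_comment as Synced before running the next step. That check is not part of this sketch.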