Geographically Distributed Multi-Master MySQL Clusters

© 2014 VMware Inc. All rights reserved.
Featuring VMware Continuent
Jeff Mace
2/10/2015

VMware Continuent Quick Introduction
Products
•  VMware Continuent: Industry-leading clustering and replication for open source DBMS
•  Clustering: Commercial-grade HA, performance scaling and data management for MySQL
•  Replication: Flexible, high-performance data movement
History
•  2004: Continuent established in USA
•  2009: 3rd Generation Continuent Tungsten (aka VMware Continuent) ships
•  2014: 100+ customers running business critical applications
•  Oct 2014: Acquisition by VMware; now part of the vCloud Air Business Unit
•  Oct 2015: Continuent solutions available through VMware sales

VMware Continuent Facts
Business Critical Deployment Examples
•  High Availability for MySQL: Largest cluster deployment performs 800M+ transactions/day on 275 TB of relational data
•  Business Continuity: Cross-site cluster topologies widely deployed, including Primary/DR and Multi-Master
•  High Performance Replication: Largest installations transfer billions of transactions daily using high speed, parallel replication
•  Heterogeneous Integration: Customers replicate from MySQL to Oracle, Hadoop, Redshift, Vertica and others
•  Real-time Analytics: Optimized data loading for data warehouses, with deployments of up to 200 MySQL masters feeding Hadoop

Select VMware Continuent Customers

Registering and Updating Distributed Devices
•  There are many handheld and portable devices that move between
cities and regions.
•  These devices need constant access to pull data or updates.
•  DNS routes the user to an available site with the lowest latency
•  Each site has full access to modify data for the user
•  Benefit: Transactions are always processed at the fastest location

[Figure: map of the Europe, Asia/Pacific and New Jersey sites, with latency labels of 1 ms, 2 ms and 1.5 ms.]
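
As a rough illustration of the routing rule on this slide, the sketch below picks the reachable site with the lowest measured latency. The site names and latency values are hypothetical, and a real deployment would let a global DNS service make this choice rather than application code.

```python
from typing import Mapping, Optional


def pick_site(latency_ms: Mapping[str, Optional[float]]) -> str:
    """Return the reachable site with the lowest measured latency.

    A value of None marks a site that did not answer its health check.
    """
    reachable = {site: ms for site, ms in latency_ms.items() if ms is not None}
    if not reachable:
        raise RuntimeError("no site is currently available")
    return min(reachable, key=reachable.get)


# Hypothetical measurements taken from one client's point of view.
measurements = {"new-jersey": 12.0, "europe": 85.0, "asia-pacific": None}
print(pick_site(measurements))
```

Unavailable sites are filtered out before the comparison, so a site that is down can never win on latency.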

Credit Card Transaction Processing
•  Application connectivity to a particular location isn’t guaranteed
•  Processing apps use a list of potential sites
•  Connectivity to each site is attempted until successful
•  The application proceeds as normal with the successful site
•  Benefit: Automatic failover to any data center by the application

[Figure: map of the Europe, Asia/Pacific and New Jersey sites, with connection attempts numbered 1, 2 and 3.]
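
The retry behaviour described above could be sketched as follows. The host names and the `fake_connect` helper are hypothetical stand-ins; in a real application `connect` would be the database driver's own connect call.

```python
from typing import Callable, Sequence, Tuple, TypeVar

Conn = TypeVar("Conn")


def connect_first_available(
    sites: Sequence[str],
    connect: Callable[[str], Conn],
) -> Tuple[str, Conn]:
    """Try each site in order and return the first successful connection."""
    errors = []
    for site in sites:
        try:
            return site, connect(site)
        except OSError as exc:
            # Remember why this site failed, then move on to the next one.
            errors.append(f"{site}: {exc}")
    raise ConnectionError("all sites failed: " + "; ".join(errors))


def fake_connect(site: str) -> str:
    # Stand-in for a real driver call; pretends New Jersey is unreachable.
    if site == "nj.example.com":
        raise OSError("connection refused")
    return f"connection to {site}"


site, conn = connect_first_available(
    ["nj.example.com", "eu.example.com", "apac.example.com"], fake_connect
)
print(site)
```

Once a connection succeeds the application carries on against that site, which is only safe because every site accepts writes.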

Remote Office Connectivity
•  Local sites potentially subject to prolonged outages
•  Business processing continues at other sites
•  Replication resumes when connectivity is restored
•  Benefit: Each data center is able to run independently

[Figure: map of the Europe, Asia/Pacific and New Jersey sites.]
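
To make "replication resumes when connectivity is restored" concrete, here is a toy model in which each site keeps a log of its own writes and remembers how far it has read in every remote log. It is a simplified sketch for illustration only, not how Tungsten Replicator is implemented; the `Site` class and its methods are invented for this example.

```python
class Site:
    """Toy model of one data center: a local log plus per-remote read positions."""

    def __init__(self, name):
        self.name = name
        self.log = []        # transactions that originated at this site
        self.rows = []       # everything applied here, local or remote
        self.position = {}   # remote site name -> next log index to read

    def write(self, txn):
        # Local writes are applied and logged so other sites can pull them.
        self.log.append(txn)
        self.rows.append(txn)

    def pull_from(self, remote):
        # Resume from the last recorded position. Remote transactions are
        # applied but not re-logged, so they are never replicated back.
        start = self.position.get(remote.name, 0)
        pending = remote.log[start:]
        self.rows.extend(pending)
        self.position[remote.name] = start + len(pending)
        return len(pending)


nj, eu = Site("nj"), Site("eu")
nj.write("nj-1")
eu.pull_from(nj)          # link is up: eu catches up to nj-1

# Outage: both sites keep accepting writes, nothing is pulled.
nj.write("nj-2")
nj.write("nj-3")
eu.write("eu-1")

# Connectivity restored: each side pulls only what it missed.
print(eu.pull_from(nj), nj.pull_from(eu))
```

After the final pulls both sites hold the same four transactions, although not necessarily in the same order.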

Introducing Multi-Site Multi-Master (MSMM)

MSMM Topology
[Figure: two sites joined by asynchronous multi-master cross-region replication over the public Internet or a VMware NSX Secure Gateway.
On-premises data center: DB1.CA (master), DB2.CA (slave), DB3.CA (slave), behind a Continuent Connector.
vCloud Air virtual data center: DB1.NJ (master), DB2.NJ (slave), DB3.NJ (slave), behind a Continuent Connector.]

MSMM Components
•  Each location runs an independent master/slave cluster.
•  The cluster is responsible for ensuring local availability.
•  An additional replicator is installed on every server.
•  The additional replicator replicates data from one or more remote
locations to the local server.
•  Replication data is never written to the slave binary log.

MSMM Replication
•  Each node runs an additional replicator to apply data
from remote data centers.
•  Each cluster should run with 3 nodes (one is hidden in the diagram).
[Figure: vCloud Air virtual data center with DB1.NJ and DB2.NJ (NJ master) and on-premises data center with DB1.CA and DB2.CA (CA master); each node is shown with CA and NJ replicator labels.]
Local Failover
[Figure: on-premises data center with DB1.CA (master), DB2.CA (slave) and DB3.CA (slave); an X marks the failure.]
•  The master has failed.
•  The manager on a remaining server identifies the failure.

Local Failover
[Figure: on-premises data center with DB2.CA (master), DB1.CA (shunned) and DB3.CA (slave); an X marks the failure.]
•  The failed server is shunned.
•  One of the slaves is promoted to master.
•  The connectors send writes to the new master.

Local Failover
[Figure: on-premises data center with DB2.CA (master), DB1.CA (slave) and DB3.CA (slave).]
•  An administrator checks the failed server and corrects the issue.
•  They run a recover command to reintroduce the server.
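
The three failover slides can be read as a small state machine. The sketch below models it with a plain dictionary of roles; the function names are hypothetical and do not correspond to actual Continuent commands, and promotion simply takes the first available slave.

```python
def fail_over(cluster, failed):
    """Shun a failed master and promote the first available slave.

    `cluster` maps host name -> role ("master", "slave" or "shunned").
    """
    if cluster[failed] != "master":
        raise ValueError(f"{failed} is not the current master")
    slaves = [host for host, role in cluster.items() if role == "slave"]
    if not slaves:
        raise RuntimeError("no slave available for promotion")
    cluster[failed] = "shunned"
    cluster[slaves[0]] = "master"
    return slaves[0]


def recover(cluster, host):
    """Reintroduce a repaired, shunned host as a slave."""
    if cluster[host] != "shunned":
        raise ValueError(f"{host} is not shunned")
    cluster[host] = "slave"


def write_target(cluster):
    """What a connector would do: route writes to the current master."""
    return next(host for host, role in cluster.items() if role == "master")


ca = {"db1.ca": "master", "db2.ca": "slave", "db3.ca": "slave"}
new_master = fail_over(ca, "db1.ca")
print(new_master, ca)

recover(ca, "db1.ca")
print(write_target(ca), ca)
```

Note that the recovered host comes back as a slave; the promoted server stays master, as in the last failover slide.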

MSMM Replication After Failover
•  Each node runs an additional replicator to apply data
from remote data centers.
•  Each cluster should run with 3 nodes (one is hidden in the diagram).
[Figure: vCloud Air virtual data center with DB1.NJ and DB2.NJ (NJ master) and on-premises data center with DB1.CA and DB2.CA (CA master); each node is shown with CA and NJ replicator labels, and an X marks the failure.]

The Case Against Many Sites
•  Multi-site multi-master is difficult to build and operate.
•  The application, database and operations teams must all build with
multi-site multi-master in mind.
•  The topology implies that no site will ever be fully consistent with all
locations. There will always be pending transactions.

Application Design
•  Shard users/locations into different schemas if possible
•  Design the application around INSERT-heavy workloads
•  Avoid UPDATE/DELETE statements of the same data from different
locations
•  Move batch operations to a single location
•  Limit the size of commits for batch operations
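
One way to limit commit size for a batch job is to commit every fixed number of rows. The sketch below uses Python's built-in `sqlite3` module purely so that it runs without a server; the `events` table and the chunk size of 500 are hypothetical, and a MySQL driver would typically use `%s` placeholders instead of `?`.

```python
import sqlite3
from itertools import islice


def insert_in_chunks(conn, rows, chunk_size=500):
    """Insert rows, committing every `chunk_size` rows; return the commit count."""
    it = iter(rows)
    commits = 0
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return commits
        conn.executemany("INSERT INTO events (id, payload) VALUES (?, ?)", chunk)
        conn.commit()  # each commit is one bounded transaction to replicate
        commits += 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
rows = ((i, f"event-{i}") for i in range(1, 1201))
print(insert_in_chunks(conn, rows, chunk_size=500))
```

Smaller transactions reach the other sites sooner and keep a single large batch from holding up everything queued behind it.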

Database Design
•  Use InnoDB or another ACID compliant table engine
•  Use truly unique primary keys like auto_increment or UUID
•  Set auto_increment_increment and auto_increment_offset on each site to prevent key conflicts between sites
•  Limit the use of triggers or prepare them for row-based replication with
Tungsten Replicator.
•  Use NTP and consistent time zones on every server
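
A short worked example shows why the two auto_increment settings prevent collisions. MySQL generates values of the form offset + N × increment, so giving every site the same increment (the number of sites) and a distinct offset yields disjoint key sequences. The three site names below are hypothetical.

```python
def site_ids(offset, increment, count):
    """First `count` ids a site generates: offset, offset + increment, ..."""
    return [offset + n * increment for n in range(count)]


SITES = ["nj", "ca", "eu"]   # hypothetical site names
increment = len(SITES)       # auto_increment_increment: same on every site
ids = {
    site: site_ids(offset, increment, 4)   # auto_increment_offset: 1, 2, 3
    for offset, site in enumerate(SITES, start=1)
}
print(ids)
```

Choosing an increment larger than the current number of sites leaves room to add a site later without renumbering.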

Operations Plan
•  Handle DNS management with a global DNS system
•  Configure 24/7 monitoring of all sites
•  Schedule regular runs of pt-table-checksum
–  Expect some level of differences due to the nature of MSMM
–  Monitor results for an increasing number of differences
•  Schedule regular backups in each location
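
Watching for an increasing number of differences can be as simple as comparing the last few checksum runs. The sketch below assumes the per-run difference counts have already been collected into a list; how they are extracted from pt-table-checksum output is left out, and the window of three runs is an arbitrary choice.

```python
def is_increasing(diff_counts, window=3):
    """True if each of the last `window` runs reported more differences than the run before it."""
    recent = diff_counts[-(window + 1):]
    if len(recent) < window + 1:
        return False  # not enough history to judge a trend
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


history = [4, 2, 5, 3, 6, 9, 15]   # hypothetical difference counts per run
print(is_increasing(history))
```

A steady low count is expected with MSMM; a sustained rise is the signal worth alerting on.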

Synchronous Clustering Fails Over Distance
•  Synchronous replication adds server performance drag (Daniel Abadi,
Yale University)
•  Global locks create exploding deadlock problems (Jim Gray, Microsoft)
•  Strong consistency between DBMS requires some/all regions to stop
when the network fails (CAP proof by Nancy Lynch, MIT)
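
A back-of-the-envelope calculation illustrates the performance drag. Assuming each synchronous commit must wait for at least one network round trip to a remote site, a single session cannot commit faster than 1000 / RTT times per second. The round-trip times below are hypothetical.

```python
def max_serial_commits_per_second(round_trip_ms):
    """Upper bound on back-to-back commits when each one waits a full round trip."""
    return 1000.0 / round_trip_ms


for label, rtt in [("same data center", 0.5), ("cross-continent", 80.0)]:
    rate = max_serial_commits_per_second(rtt)
    print(f"{label}: {rtt} ms RTT -> at most {rate:.1f} commits/s per session")
```

Asynchronous replication avoids this bound because a commit returns once the local site has accepted it.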

Multi-Master is Best for Multi-Site Operation
[Figure: map of the Europe, Asia/Pacific and New Jersey sites with the following callouts.]
•  Optimized performance for users
•  SQL transaction processing in any region
•  Local high-availability in any region
•  Continuous updates across regions

For more information:
Eero Teerikorpi
Sr. Director, Strategic Alliance
eteerikorpi@vmware.com
+1 (408) 431 3305
Robert Noyes
Alliance Manager, USA & Canada
rnoyes@vmware.com
+1 (650) 575-0958
Philippe Bernard
Alliance Manager, EMEA & APAC
pbernard@vmware.com
+41 79 347 1385
