Disaster Recovery Planning for MySQL & MariaDB

Bart Oles
Severalnines

Free to download
Initial 30 days Enterprise trial
Converts into free Community Edition
Enterprise / paid versions available

Automation & Management
Deployment (Free Community)
● Deploy a Cluster in Minutes
○ On-Prem
○ Cloud (AWS/Azure/Google) - paid
Monitoring (Free Community)
● Systems View with 1 sec Resolution
● DB / OS stats & Performance Advisors
● Configurable Dashboards
● Query Analyzer
● Real-time / historical
Management (Paid Features)
● Backup Management
● Upgrades & Patching
● Security & Compliance
● Operational Reports
● Automatic Recovery & Repair
● Performance Management
● Automatic Performance Advisors

Supported Databases

Our Customers

Agenda
- Business Considerations for Disaster Recovery
- Defining Disaster Recovery
- Disaster Recovery Tiers
- Matching Disaster Recovery Plans to the Business
- Concluding Remarks

Business Considerations for Disaster Recovery

What is Disaster Recovery?
● Failures
○ Operational (power, network, IT systems)
○ Natural (hurricane, flood, fire, earthquake)
○ Human-caused (operator error, malicious activity, terrorism)
● Drivers
○ How fast can we get up and running?
○ What data have we lost?
○ How can we reduce risk?
Disaster recovery: the policies, tools and procedures that ensure your data is secure and protected in case of an outage or serious catastrophe

Uptime Guarantees - Why Compromise?

“We Offer 100% Availability, But We Exclude…”
● Planned outages
○ e.g., server or network maintenance
● Failure of network, power or facilities delivered by an upstream provider
● DoS attacks, hacker activity or other malicious events
● Acts of God
○ e.g., weather-related: hurricane, flood

Defining Disaster Recovery

Stockholm to Oslo Train Breaks Down After One Hour

Recovery Point Objective (RPO)

RTO + RPO = 0 ?
● Database Replication
● Storage Clustering
● Data Integrity
● Security
● Load Balancing
● Network Bonding
● File Replication
● DB Clustering

“Everything Fails, All the Time”
Werner Vogels

Disaster Recovery Tiers

Matching Disaster Recovery Plans to the Business

Backup with No Hot Site
● Physical vs Logical backup
○ High impact on RTO
● Combine Full & Incremental
○ PITR-compatible to reduce RPO
● Schrödinger’s backup
○ “The condition of any backup is unknown until a restore is attempted”
● Encryption
● Keep a copy of the latest backup at the active site
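
To make the "Combine Full & Incremental" point concrete, here is a minimal Python sketch of picking the restore chain for a point-in-time recovery. It assumes that incrementals chain onto the most recent full backup and must all be restored in order; the `Backup` class and both function names are hypothetical, not part of any backup tool. Whatever time remains between the last backup in the chain and the target has to be covered by replaying binary logs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Backup:
    kind: str              # "full" or "incremental"
    finished_at: datetime  # point in time the backup is consistent to


def restore_chain(backups: List[Backup], target: datetime) -> Optional[List[Backup]]:
    """Return the latest full backup at or before `target`, followed by
    every later incremental up to `target`. None if no full backup fits."""
    usable = sorted(
        (b for b in backups if b.finished_at <= target),
        key=lambda b: b.finished_at,
    )
    full_positions = [i for i, b in enumerate(usable) if b.kind == "full"]
    if not full_positions:
        return None
    return usable[full_positions[-1]:]


def log_replay_window(chain: List[Backup], target: datetime) -> timedelta:
    """Time span that binary logs must cover after the chain is restored."""
    return target - chain[-1].finished_at


# Illustrative schedule: weekly full, daily incrementals at 01:00.
backups = [
    Backup("full", datetime(2024, 3, 3, 1, 0)),
    Backup("incremental", datetime(2024, 3, 4, 1, 0)),
    Backup("incremental", datetime(2024, 3, 5, 1, 0)),
    Backup("incremental", datetime(2024, 3, 6, 1, 0)),
]
target = datetime(2024, 3, 5, 14, 30)
chain = restore_chain(backups, target)
print([b.kind for b in chain], log_replay_window(chain, target))
```

The longer the chain and the replay window, the longer the restore takes, which is one reason the restore should be rehearsed rather than assumed to work.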

Backup Retention
● Local Server
○ Up to 1 week
● Local Datacenter
○ Up to 2 weeks
● Remote Datacenter
○ Up to 4 weeks
○ Plus keep monthly backups & annual backups as required
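
The retention windows above can be expressed as a small pruning rule. This Python sketch uses the 1, 2 and 4 week limits from the slide; the tier names, the choice of "taken on the 1st of the month" for monthly backups, "taken on 1 January" for annual backups, and the 365-day monthly window are all assumptions made for illustration.

```python
from datetime import date

# Retention windows from the slide, in days. Tier names are illustrative.
RETENTION_DAYS = {
    "local_server": 7,
    "local_datacenter": 14,
    "remote_datacenter": 28,
}


def keep_backup(tier: str, taken_on: date, today: date,
                monthly_keep_days: int = 365) -> bool:
    """Decide whether a backup stored in `tier` should still be kept."""
    age_days = (today - taken_on).days
    if age_days < 0:
        raise ValueError("backup date is in the future")
    if age_days <= RETENTION_DAYS[tier]:
        return True
    if tier != "remote_datacenter":
        return False
    # Assumption: long-term copies live only in the remote datacenter.
    is_annual = taken_on.month == 1 and taken_on.day == 1
    is_monthly = taken_on.day == 1
    return is_annual or (is_monthly and age_days <= monthly_keep_days)


today = date(2024, 3, 15)
print(keep_backup("local_server", date(2024, 3, 1), today))       # older than 1 week
print(keep_backup("remote_datacenter", date(2024, 2, 1), today))  # monthly copy
```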

Backup with Hot Site
● We can reinstall DBs and apps from scratch and restore data
● Recovery time is predictable
● In the case of AWS, pre-configured AMIs can be used to quickly provision the application environment

Asynchronous Replication to Hot Site
● Low RTO
○ ‘Almost current’ data enables fast failover
● Low RPO
● Add a delayed slave to guard against operator error
● Backups are still important
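
A delayed slave only helps if the operator error is noticed before the slave applies it. The Python sketch below shows that arithmetic; the function name and the one-hour delay are hypothetical, and the delay itself would be configured on the replica (for example through the `MASTER_DELAY` replication option).

```python
from datetime import datetime, timedelta


def time_left_to_stop_replica(error_at: datetime, now: datetime,
                              replica_delay: timedelta) -> timedelta:
    """Time remaining before the delayed replica applies the bad statement.
    Returns zero if the statement has already been applied."""
    remaining = (error_at + replica_delay) - now
    return max(remaining, timedelta(0))


error_at = datetime(2024, 3, 15, 10, 0)   # e.g. an accidental DROP TABLE
delay = timedelta(hours=1)                # illustrative replica delay

# Noticed after 20 minutes: there is still time to stop replication.
print(time_left_to_stop_replica(error_at, datetime(2024, 3, 15, 10, 20), delay))
# Noticed after 90 minutes: the replica has replayed the error, so restore from backup.
print(time_left_to_stop_replica(error_at, datetime(2024, 3, 15, 11, 30), delay))
```

A longer delay widens the detection window but also means more events to replay when the delayed slave is used for recovery.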

Synchronous Replication to Hot Site
● Highest tier of DR
○ Minimal RPO and RTO
● Data on the primary site and the hot sites has the same transactional state
○ Failover is instantaneous and automatic
● Failure detection time is the main contributor to RTO
● Use 3 sites to guard against network partitioning
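
Why three sites rather than two: under a majority-based quorum rule (an assumption here, in the style of clusters such as Galera), a partition may continue only if it can reach more than half of the sites. The Python sketch below illustrates this, plus the point that detection time adds to RTO. Function names and timing values are hypothetical.

```python
def has_quorum(reachable_sites: int, total_sites: int) -> bool:
    """A partition may keep accepting writes only with a strict majority."""
    return 2 * reachable_sites > total_sites


def estimated_rto_seconds(detection_seconds: float, failover_seconds: float) -> float:
    """RTO is the time to notice the failure plus the time to fail over."""
    return detection_seconds + failover_seconds


# Two sites, link between them fails: each side sees 1 of 2, nobody has quorum.
print(has_quorum(1, 2))
# Three sites, one site isolated: the remaining pair sees 2 of 3 and carries on,
# while the isolated site sees 1 of 3 and stops.
print(has_quorum(2, 3), has_quorum(1, 3))

# Illustrative numbers: even a near-instant failover is dominated by detection.
print(estimated_rto_seconds(detection_seconds=10.0, failover_seconds=1.0))
```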

Synchronous Replication to Hot Site - Database Proxy

Concluding Remarks

Reality Check
Disaster Recovery Planning & Testing
Source: https://www.zetta.net/resource/state-disaster-recovery-2016

Geographically Distributed Datacenters on the increase
Uptime Institute Data Center Industry Survey (2017)
Source: https://uptimeinstitute.com/about-ui/press-releases/uptime-institute-annual-survey-results-enterprise-owned-data-centers-still-primary-compute-venue

Failover the New Normal
● Failover used to be a complex procedure
○ Required a lot of staff
○ Required availability of VPs / technology heads
● In modern distributed infrastructure, design for failure
● Considerations
○ How many sites?
○ How to route users to sites?
○ What goes into a failover?

Source: https://severalnines.com/blog/database-tco-calculating-total-cost-ownership-mysql-management
Calculating Database TCO - Colocation

Source: https://severalnines.com/blog/database-tco-calculating-total-cost-ownership-mysql-management
Calculating Database TCO - Cloud

Additional Resources
Free Disaster Recovery Whitepaper
severalnines.com/resources/whitepapers

Free Database Backup Whitepaper
severalnines.com/resources/whitepapers

● Calculating Database TCO
○ https://severalnines.com/blog/database-tco-calculating-total-cost-ownership-mysql-management
● Multi-DC setups for MySQL & MariaDB
○ https://severalnines.com/blog/multiple-data-center-setups-using-galera-cluster-mysql-or-mariadb
● Contact us: info@severalnines.com
