MySQL and MariaDB Backups

MySQL backups overview. Characteristics of every backup type, including dumps, Xtrabackup and snapshots. Planning proper backup strategies. Why and how to test backups.


  1. 1. MySQL Backups Federico Razzoli
  2. 2. € whoami ● Federico Razzoli ● Freelance consultant ● Working with databases since 2000 hello@federico-razzoli.com federico-razzoli.com ● I worked as a consultant for Percona and Ibuildings (mainly MySQL and MariaDB) ● I worked as a DBA for fast-growing companies like Catawiki, HumanState, TransferWise
  3. 3. Agenda ● We’ll talk about different MySQL/MariaDB backup types and their characteristics ● Not many details about each solution ● We’ll also talk about planning backup strategies and testing backups
  4. 4. Planning Backups
  5. 5. Before Planning... ● Evaluate costs ● Evaluate risks ● Define Backup strategies
  6. 6. Costs ● Everything that increases your reliability: ○ Costs money. ○ Produces no money. ● Managers understand this very well and keep it in mind when you talk about backups
  7. 7. Costs ● Disasters: ○ Cost money. ○ Produce no money. ● Reduce disaster probability ○ Avoiding them is impossible ● Limit the damage they cause ○ Serious Backup Strategies ● Then what is the “right” amount of money to spend on backups?
  8. 8. Data Loss Costs ● Cost to re-acquire data (if possible) ● Cost of services/products that cannot be sold ● Lifetime value of customers that are lost or not acquired ● Reputation cost, disappointed candidates…
  9. 9. Downtime Costs: the cost of the time elapsed until a backup is restored and the services are back online ● Cost of unsold services/products ● Lifetime value of customers that won’t return ● Reputation cost, Google rank, disappointed candidates...
  10. 10. Evaluate risks ● “If data is lost” is not a risk, it’s a generic concern ● Risks are, for example: ○ We drop a wrong table by mistake ○ Application deletes useful data because of a bug ○ A table gets corrupted during a write ○ Disk gets corrupted after a hardware failure ○ Cloud vendor is down ○ … ● Different risks are prevented in different ways
  11. 11. Defining Backup Strategies ● For each strategy, decide: ○ How the backup is taken ○ How often ○ Where the backup is stored ○ For how long it is kept ○ How much time the backup can take ○ How much space the backup can take ○ How much time the restore can take ● You can easily see how some of these questions are inter-dependent ● You can have multiple backup levels: ○ Latest backup stored locally, uncompressed ○ 7 daily backups in S3
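The "how often" and "for how long" decisions above translate directly into a retention rule. As a minimal sketch (the function name `backups_to_prune` and the idea of identifying each backup by its date are assumptions made for illustration, not part of the talk), a "keep 7 daily backups" policy could be expressed like this:

```python
from datetime import date, timedelta


def backups_to_prune(backup_dates, today, keep_days=7):
    """Return the backup dates that fall outside the retention window, oldest first.

    With keep_days=7 and one backup per day, the 7 most recent daily
    backups (including today's) are kept and everything older is returned.
    """
    cutoff = today - timedelta(days=keep_days)
    return sorted(d for d in backup_dates if d <= cutoff)


if __name__ == "__main__":
    today = date(2020, 1, 10)
    dates = [date(2020, 1, day) for day in range(1, 11)]  # Jan 1 .. Jan 10
    for old in backups_to_prune(dates, today):
        print("would delete backup taken on", old)
```

The function only decides what to delete; actually removing files from local disk or S3 is left to whatever tooling the strategy uses.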
  12. 12. Backup Types
  13. 13. Cold vs Hot ● Cold: stop MySQL and copy data ○ Downtime ○ Faster ● Hot: take backups while MySQL is running ○ Slow down ○ More complex
  14. 14. Physical vs Logical ● Physical: Set of files and directories ○ Usually takes more space (indexes are included) ○ Doing it while the server is running requires tools ● Logical: SQL statements to recreate a dataset ○ Can only be Hot ○ Can potentially be restored on any MySQL/MariaDB version ■ ...and even different SQL databases ○ Takes more time to take the backup and to restore it ○ Either inconsistent, or requires a long transaction
  15. 15. Complete vs Partial ● Complete: All data ● Partial: A selected part of data ○ One or more databases ○ One or more tables from a database ○ Result of a SELECT (only Logical backup)
  16. 16. Full vs Incremental ● Full: A backup of data at a certain point in time ○ A Full backup can take a long time ○ Suppose you take it every 24 hours. In case of disaster, you could lose up to 24 hours of changes ● Incremental ○ Taking an incremental backup normally takes much less time ○ It includes the changes that happened after the last full backup ○ If there is a hole between the full backup and the oldest changes we have, our incremental backups are useless
  17. 17. Backups and Replication
  18. 18. Galera and Group Replication ● Often it makes sense to leave one node unused, and leave it available for failover ● Backups can be taken from this node
  19. 19. Async Slaves ● It is a common practice to take backups from slaves ● But remember that slaves can lag ● If a slave is unused, it can be stopped to take cold backups ● MySQL and MariaDB support delayed replication ● Useful in case of human mistakes
  20. 20. Logical Backups
  21. 21. mysqldump mysqldump [options] > dump.sql mysql < dump.sql ● The dump file contains CREATE and INSERT statements ● Part of the syntax is enclosed in executable comments: /*!080000 ... */ /*M!100400 ... */
  22. 22. mysqldump ● Single-threaded - slow! ● CREATE TABLEs include indexes - restore is slow! ● Many options to include/exclude tables/databases ● --where ● --no-data ● --triggers, --routines, --events ● Inconsistent by default. Use --single-transaction ○ MariaDB uses savepoints to unlock backed-up rows ● For big tables use --quick
  23. 23. mysqlpump ● Comes with MySQL 5.7+, not in MariaDB ● Very similar to mysqldump ● Multi-threaded, produces multiple files reloadable as normal dumps ● Option to compress the resulting files (cannot mysqlpump | gzip) ● Indexes are created after INSERT ● Users are dumped as CREATE USER
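To show how the options listed above combine, here is a small sketch that assembles a mysqldump command line. The helper name `mysqldump_command` and its parameters are hypothetical; only the mysqldump flags themselves come from the slide. Connection options (host, user, password) are deliberately left out.

```python
def mysqldump_command(database, tables=(), consistent=True, quick=True,
                      with_code=True, where=None):
    """Build an argument list for mysqldump using the options from the slide."""
    cmd = ["mysqldump"]
    if consistent:
        # Dump inside one transaction so InnoDB tables are consistent.
        cmd.append("--single-transaction")
    if quick:
        # Stream rows instead of buffering whole tables in memory.
        cmd.append("--quick")
    if with_code:
        # Include triggers, stored routines and scheduled events.
        cmd += ["--triggers", "--routines", "--events"]
    if where is not None:
        # Partial dump: only rows matching this condition.
        cmd.append(f"--where={where}")
    cmd.append(database)
    cmd += list(tables)
    return cmd


if __name__ == "__main__":
    print(" ".join(mysqldump_command("db1", ["orders"])))
```

A caller could pass the list to `subprocess.run(cmd, stdout=dump_file, check=True)` so that a non-zero exit status raises an error instead of silently producing a truncated dump.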
  24. 24. mydumper, myloader ● Multiple files per table: --threads 4 --rows 100000 How it works: ● Master thread connects MySQL/MariaDB and runs FLUSH TABLES WITH READ LOCK; ● Worker threads connect and run START TRANSACTION WITH CONSISTENT SNAPSHOT; ● Master thread: UNLOCK TABLES; ● Worker threads copy all data
  25. 25. mydumper, myloader Other important options: ● --no-data ● --triggers, --routines, --events ● --compress ● --trx-consistency-only if you only dump InnoDB tables ● Use myloader to restore ○ --threads 4
  26. 26. Non-Transactional Tables ● (MyISAM, Aria, ARCHIVE…) ● To get a consistent dump, tables must be locked with LOCK TABLES
  27. 27. Temporal Tables ● MariaDB feature to track how data change over time ● Temporal tables have timestamp columns that define when a row version validity started and when it ended ● If we create those columns explicitly, they are visible ● If those columns are visible and included in the dumps, dumps cannot be restored ● Otherwise, the original timestamps will be lost
  28. 28. SHOW CREATE ● To get more flexibility, you can also write a script that uses SHOW CREATE statements ● They return the CREATE statements to create an identical object (without data) ● SHOW DATABASES; SHOW CREATE DATABASE; ● SHOW TABLES; SHOW CREATE TABLE; ● SHOW FULL TABLES WHERE Table_type = 'VIEW'; SHOW CREATE VIEW; ● SHOW TRIGGERS; SHOW CREATE TRIGGER; ● ...
  29. 29. Physical Backups
  30. 30. Cold Backups ● Copy the files to somewhere else ● This includes configuration files, etc ● MySQL must not be running ○ OR, you can make sure it is not writing to files: ○ STOP SLAVE; ○ FLUSH TABLES WITH READ LOCK; ○ Copy ○ UNLOCK TABLES; ○ START SLAVE; ● The copy can be done incrementally, by using rsync
  31. 31. Snapshots ● Snapshots are not a MySQL/MariaDB feature ● They can be implemented in an underlying technology: volume manager (lvm), filesystem (zfs), Virtual Machine, Container ○ Your cloud provider most probably provides snapshots ○ but check documentation ● Existing files are frozen. They are the snapshots ● Everything written to disk afterwards is written separately (CoW), leaving the current files intact ● Snapshots can be incremental - only contains changes since the previous snapshot ● Snapshots can be sent to other servers ● Windows has Shadow Copies
  32. 32. Restoring Snapshots ● When mysqld or the filesystem suddenly crashes, it leaves inconsistent files ○ InnoDB tables (depending on configuration) don’t lose data, but tables must be repaired on restart using the information stored in redo log and undo log ○ MyISAM tables lose changes not flushed at the time of the crash ● When you take a snapshot it’s the same: you take a frozen copy of inconsistent files (if mysqld was running) ● Restoring a backup implies that when mysqld is restarted tables will be repaired
  33. 33. InnoDB Transportable Tablespaces
  34. 34. Transportable Tablespaces ● Works for InnoDB tables contained in a dedicated file ● Allows copying a table from another server that runs the same MySQL version Source server: ● Run: FLUSH TABLES my_table FOR EXPORT; ● Copy the table file (.ibd) or take a snapshot Target server: ● Run: ALTER TABLE my_table DISCARD TABLESPACE; ● Copy the table ● Run: ALTER TABLE my_table IMPORT TABLESPACE;
  35. 35. xtrabackup
  36. 36. xtrabackup ● xtrabackup is a tool to copy files while the server is running, without locking InnoDB tables ● Produced by Percona ● Focus on MySQL ● Only works on Linux
  37. 37. xtrabackup ● xtrabackup is a tool to copy files while the server is running, without locking InnoDB tables ● Produced by Percona ● Focus on MySQL, no support for MariaDB where it is incompatible with MySQL ● Only works on Linux, no Windows support
  38. 38. mariabackup ● Introduced in MariaDB 10.1 ● Fork of Xtrabackup 2.3 ● Supports all MariaDB features ● Run on both Linux and Windows
  39. 39. Xtrabackup 8 ● MySQL 8.0.20 (April 2020) introduced a Redo Log format change (despite being a GA version) ● This broke Xtrabackup compatibility ● Percona is working on a new release that understands the new Redo Log: PXB-2167 - Pending Release. You can subscribe to the issue
  40. 40. Taking a Full Backup ● Take a full backup: xtrabackup --backup --target-dir=/data/backups/ ● Check that the last line ends with: completed OK! ● Second last line contains numbers you need to note: xtrabackup: Transaction log of lsn (26970807) to (137343534) was copied.
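The two manual checks above (last line ends with "completed OK!", and the LSN range on the "Transaction log of lsn" line) can be automated. The sketch below assumes the xtrabackup output was captured into a string and that the LSN line has the wording shown on the slide; the function name `check_backup_log` is hypothetical.

```python
import re

# Matches the line quoted on the slide, e.g.
# "xtrabackup: Transaction log of lsn (26970807) to (137343534) was copied."
LSN_LINE = re.compile(r"Transaction log of lsn \((\d+)\) to \((\d+)\) was copied")


def check_backup_log(log_text):
    """Return (start_lsn, end_lsn) if the log looks like a successful backup.

    Raises RuntimeError if the log does not end with 'completed OK!'
    or if no LSN range can be found.
    """
    lines = [line.strip() for line in log_text.splitlines() if line.strip()]
    if not lines or not lines[-1].endswith("completed OK!"):
        raise RuntimeError("backup did not finish with 'completed OK!'")
    # Search from the end, since the LSN line is near the bottom of the log.
    for line in reversed(lines):
        match = LSN_LINE.search(line)
        if match:
            return int(match.group(1)), int(match.group(2))
    raise RuntimeError("LSN range not found in the backup log")
```

Storing the returned pair next to the backup makes it easier to tell later which changes a given backup contains.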
  41. 41. Restoring a Full Backup Make a copy of the backup! Percona does not guarantee that it remains usable if preparation is interrupted. (though, in my experience, it is likely to remain usable; so it’s still worth a try) ● Prepare the backup: xtrabackup --prepare --target-dir=/data/backups/ ● Again, the last line should be: completed OK! ● Copy the files to the correct place: xtrabackup --copy-back --target-dir=/data/backups/
  42. 42. Taking Incremental Backups ● Take a full backup: xtrabackup --backup --target-dir=/data/backups/full ● Take an incremental backup: xtrabackup --backup --target-dir=/data/backups/inc1 --incremental-basedir=/data/backups/full ● Each directory contains a file called xtrabackup_checkpoints
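The xtrabackup_checkpoints file mentioned above can be used to verify that a chain of incremental backups has no gap. The sketch below assumes the file contains `key = value` lines including `from_lsn` and `to_lsn`, which is how Percona's documentation shows it; verify this against the files your xtrabackup version actually writes. Both function names are hypothetical.

```python
def parse_checkpoints(text):
    """Parse the 'key = value' lines of an xtrabackup_checkpoints file into a dict."""
    info = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            info[key.strip()] = value.strip()
    return info


def chain_is_continuous(checkpoints):
    """Check a list of parsed checkpoint dicts, full backup first.

    Each incremental must start exactly where the previous backup ended;
    otherwise there is a gap and the later incrementals cannot be applied.
    """
    for previous, current in zip(checkpoints, checkpoints[1:]):
        if int(current["from_lsn"]) != int(previous["to_lsn"]):
            return False
    return True
```

Running such a check right after each incremental backup catches a broken chain while the missing changes can still be captured by a new full backup.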
  43. 43. Restoring Incremental Backups ● Prepare the full backup: xtrabackup --prepare --apply-log-only --target-dir=/data/backups/full ● Prepare incremental backups (update the full backup): xtrabackup --prepare --apply-log-only --target-dir=/data/backups/full --incremental-dir=/data/backups/inc1 ● Restore the full backup: xtrabackup --copy-back --target-dir=/data/backups/full Make a copy of all backups before preparation! A preparation failure could ruin the full backup. Also, an incremental backup cannot be prepared twice.
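With several incremental backups, the prepare step has to be repeated in order, one incremental at a time. The following sketch generates that sequence. The function name `prepare_commands` is hypothetical, and it follows the convention described in Percona's documentation of omitting --apply-log-only for the last incremental; the slide keeps the flag on its single incremental, in which case the server is expected to perform the rollback phase on startup instead.

```python
def prepare_commands(full_dir, incremental_dirs):
    """Return the xtrabackup --prepare commands, in the order they must run."""
    if not incremental_dirs:
        # A plain full backup is prepared in a single step.
        return [["xtrabackup", "--prepare", f"--target-dir={full_dir}"]]

    commands = [
        ["xtrabackup", "--prepare", "--apply-log-only", f"--target-dir={full_dir}"]
    ]
    last = len(incremental_dirs) - 1
    for index, inc_dir in enumerate(incremental_dirs):
        cmd = ["xtrabackup", "--prepare"]
        if index != last:
            # Skip the rollback phase so later incrementals can still be applied.
            cmd.append("--apply-log-only")
        cmd += [f"--target-dir={full_dir}", f"--incremental-dir={inc_dir}"]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    for cmd in prepare_commands("/data/backups/full",
                                ["/data/backups/inc1", "/data/backups/inc2"]):
        print(" ".join(cmd))
```

Since a failed preparation can ruin the full backup, a script built on this would copy all directories first and stop at the first command that does not end with "completed OK!".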
  44. 44. Other Features ● Compressed backups ● Stream backup to another server ● Choose databases / tables to backup
  45. 45. Performance ● Always make it use a reasonable amount of memory: --use-memory=8G ● Use enough threads: --threads=4 ● If I/O could be saturated: --throttle=1
  46. 46. Binary Log (binlog)
  47. 47. The Binary Log ● The Binary Log contains all changes to data ● It is used for replication and incremental backups
  48. 48. The Binary Log ● Every change has coordinates ● When you make a full backup, you can record the coordinates of the last change ● If you ever restore the backup, you can also re-apply the binlog after those coordinates mysqlbinlog --start-position=46183 /mysql-bin.000039 | mysql mysqlbinlog --start-position=46183 --database=db1 /mysql-bin.000039 | mysql
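The replay commands above can be generated from the coordinates recorded at backup time. This is a minimal sketch: the helper name `binlog_replay_command` is hypothetical, and the file name in the usage example is only an illustration. When several binlog files are passed, mysqlbinlog applies --start-position to the first one.

```python
def binlog_replay_command(binlog_files, start_position, database=None):
    """Build the mysqlbinlog argument list for replaying changes after a backup.

    binlog_files must be in order, starting with the file that contains
    the coordinates recorded when the full backup was taken.
    """
    cmd = ["mysqlbinlog", f"--start-position={start_position}"]
    if database is not None:
        # Only replay changes made while this database was the default one.
        cmd.append(f"--database={database}")
    cmd += list(binlog_files)
    return cmd


if __name__ == "__main__":
    # Hypothetical file name; the output would be piped into the mysql client.
    print(" ".join(binlog_replay_command(["mysql-bin.000039"], 46183, database="db1")), "| mysql")
```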
  49. 49. Binlog Formats ● binlog_format determines how changes are logged ○ ROW: primary key / UNIQUE index + new values ■ binlog_row_image = MINIMAL ○ STATEMENT: original SQL statement ○ MIXED: use STATEMENT when it is safe to do so
  50. 50. Binlog Reliability Other binlog settings to make the binlog reliable: ● binlog_checksum = CRC32 ● sync_binlog = 1
  51. 51. Testing Backups
  52. 52. To Test or Not To Test Poll on my website: Do you test your backups? ● 25% No ● 60% I don’t see why
  53. 53. Why to Test? ● Something may go wrong and the backups may be unusable ● Restore procedure may become wrong at some point ● The person who restores the backup may not know how to do it
  54. 54. Can something really go wrong??? ● Google for "GitLab.com database incident", happened in 2017 ● 3 backup strategies in place ● 0 usable backups ● They recovered data up to 6 hours before, because someone took a backup manually for some random reason
  55. 55. Can something really go wrong??? ● Think about Percona Xtrabackup problem ● You update MySQL, you don’t test your backups, they simply stop working ● You will find out when you need them
  56. 56. Can something really go wrong??? ● Disk full ● Versions mismatch after an update ● Network outage ● ...
  57. 57. How to test? ● Have multiple backup strategies ○ Tests are not perfect ○ Even if a test tells you your backup failed, you need to have another working backup ● Automate your backups ● Automate tests
  58. 58. How to test? ● A good test would be to use backups to feed staging databases, nightly ● Different sets of staging DBs can be fed by different backup types ● The script that restores backups in staging can be used to restore them in production
  59. 59. What to test? Early tests - if a backup obviously failed, you may know it immediately ● Exit status ● Backup exists ● Backup size reasonable (not 1 byte…) ● Time taken by the backup procedure (not too short, not too long)
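The early tests listed above are cheap enough to run after every backup. A minimal sketch, assuming the backup is a single file (for example a dump) and that the thresholds are chosen by you; the function name `early_checks` and all parameter names are hypothetical:

```python
import os


def early_checks(exit_status, backup_path, duration_seconds,
                 min_bytes, min_seconds, max_seconds):
    """Return a list of problems found; an empty list means the backup passed."""
    problems = []
    if exit_status != 0:
        problems.append(f"exit status {exit_status}")
    if not os.path.exists(backup_path):
        problems.append("backup file is missing")
    elif os.path.getsize(backup_path) < min_bytes:
        problems.append("backup is suspiciously small")
    if not (min_seconds <= duration_seconds <= max_seconds):
        problems.append("unexpected duration")
    return problems
```

Returning every problem instead of stopping at the first one gives a more useful alert when several things went wrong at once.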
  60. 60. What to test? Late tests - if the backup procedure apparently succeeded, let’s make more tests ● Restore backup automatically ● Number of tables looks right ● Number of columns looks right ○ If migrations may happen during the night, a small difference must not trigger an alert ● information_schema tables can be queried and don’t generate errors/warnings ● Regular tables can be queried ● Check a small sample of rows that are not expected to change ○ It’s easy for read-only and append-only tables
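For the "looks right" checks above, a small tolerance avoids false alerts when migrations run during the night. The sketch below compares a count from the restored copy with the expected one; the function name `count_looks_right`, the 2% default tolerance, and the schema name in the query are assumptions chosen for illustration.

```python
# Example query to count tables in a restored schema (schema name is hypothetical).
TABLE_COUNT_SQL = (
    "SELECT COUNT(*) FROM information_schema.tables WHERE table_schema = 'db1'"
)


def count_looks_right(expected, actual, tolerance=0.02):
    """Return True if actual is within the relative tolerance of expected."""
    if expected == 0:
        return actual == 0
    return abs(actual - expected) / expected <= tolerance


if __name__ == "__main__":
    # 505 columns restored versus 500 expected is a 1% difference: no alert.
    print(count_looks_right(500, 505))
```

The same comparison works for the number of columns or for row counts in tables that are not expected to change.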
  61. 61. Thank you for listening! federico-razzoli.com/services Telegram channel: open_source_databases
