Alibaba Patches in MariaDB
Lixun Peng
Topic
• Time Machine / Flashback (Developing)
• Double Sync Replication (Will Contribute)
• Multi-Source Replication
• Thread Memory Monitor
What’s a Time Machine
• Rolls back instances, databases, or tables to a snapshot.
• Implemented at the server level, so it supports all storage engines.
• Relies on binary logs in full row-image format.
• Currently a feature of the mysqlbinlog tool (the --flashback option).
Why Time Machine
• Everyone makes mistakes, including DBAs.
• After users damage their data by mistake, we can of course recover it from the last full backup set plus the binary logs.
• But if the database is huge, that takes a long time. A mistaken operation usually modifies only a small amount of data, yet we have to recover the whole database.
How Time Machine Works
• If binlog_format is ROW (with binlog-row-image=FULL in 5.6 and later), every column's value is stored in the row event, so we can get the data as it was before the mistaken operation.
• Just do the following:
• Change the event type: INSERT->DELETE, DELETE->INSERT.
• For an Update event, swap the SET part and the WHERE part.
• Apply those events in reverse order, from the last one back to the first event of the mistaken operation.
• All the data is recovered by applying the inverse of the mistaken operations.
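The inversion steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the mysqlbinlog implementation: `RowEvent`, `invert`, `flashback`, and `apply` are hypothetical names, and a table is modeled as a plain list of dictionaries.

```python
from dataclasses import dataclass
from typing import Optional

Row = dict  # column name -> value


@dataclass(frozen=True)
class RowEvent:
    """Hypothetical, simplified stand-in for a full-image row event."""
    kind: str                     # "INSERT", "DELETE" or "UPDATE"
    before: Optional[Row] = None  # row image before the change (WHERE part)
    after: Optional[Row] = None   # row image after the change (SET part)


def invert(event: RowEvent) -> RowEvent:
    """Return the event that undoes `event`."""
    if event.kind == "INSERT":
        return RowEvent("DELETE", before=event.after)
    if event.kind == "DELETE":
        return RowEvent("INSERT", after=event.before)
    if event.kind == "UPDATE":
        # Swap the SET part and the WHERE part.
        return RowEvent("UPDATE", before=event.after, after=event.before)
    raise ValueError(f"unsupported event type: {event.kind}")


def flashback(events: list[RowEvent]) -> list[RowEvent]:
    """Invert every event and replay them from the last one to the first."""
    return [invert(e) for e in reversed(events)]


def apply(table: list[Row], event: RowEvent) -> None:
    """Apply one event to an in-memory table (a list of row dictionaries)."""
    if event.kind == "INSERT":
        table.append(dict(event.after))
    elif event.kind == "DELETE":
        table.remove(event.before)
    else:
        table[table.index(event.before)] = dict(event.after)
```

Applying a list of events and then the result of `flashback` on the same list should leave the table as it started, which is why the full row image (all columns before and after) is required.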
Done List
• Full DML support
• Review table support
• Because users may want to check which part of the data was flashed back.
• GTID support (MariaDB)
• We added GTID event support for MariaDB 10.1.
• Support for MySQL 5.6 GTID events is still in progress.
ToDo List
• Add DDL support
• For an ADD INDEX/COLUMN or CREATE TABLE query, just drop the index, column, or table when running Flashback.
• For a DROP INDEX/COLUMN or DROP TABLE query, copy or rename the old table to a reserved database. When Flashback runs, I can drop the new table and rename the saved old one back to the original database.
• For TRUNCATE TABLE, I just rename the old table to a reserved database and create a new empty table.
• Add a script for the time machine.
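As a rough sketch of the planned DDL handling, the function below maps each DDL kind to the statements that would save the old state and the statements Flashback would run later. Everything here is an assumption for illustration: the function name `ddl_plan`, the reserved database name `flashback_reserved`, the copy-by-`CREATE TABLE ... LIKE` approach, and the restore steps for TRUNCATE, which the slide does not spell out. Name collisions in the reserved database are ignored.

```python
RESERVED_DB = "flashback_reserved"  # hypothetical name for the reserved database


def ddl_plan(kind: str, db: str, table: str, name: str = "") -> tuple[list[str], list[str]]:
    """Return (statements to save the old state, statements to run at Flashback)."""
    live = f"{db}.{table}"
    saved = f"{RESERVED_DB}.{table}"
    # Drop whatever lives at the original name, then move the saved table back.
    restore = [f"DROP TABLE IF EXISTS {live}", f"RENAME TABLE {saved} TO {live}"]

    if kind == "CREATE TABLE":
        return [], [f"DROP TABLE {live}"]
    if kind == "ADD INDEX":
        return [], [f"ALTER TABLE {live} DROP INDEX {name}"]
    if kind == "ADD COLUMN":
        return [], [f"ALTER TABLE {live} DROP COLUMN {name}"]
    if kind in ("DROP INDEX", "DROP COLUMN"):
        # Keep a copy of the old table before the ALTER runs.
        save = [
            f"CREATE TABLE {saved} LIKE {live}",
            f"INSERT INTO {saved} SELECT * FROM {live}",
        ]
        return save, restore
    if kind == "DROP TABLE":
        # Rename instead of dropping, so the data survives.
        return [f"RENAME TABLE {live} TO {saved}"], restore
    if kind == "TRUNCATE":
        # Move the old table away and leave a new empty one in its place.
        save = [f"RENAME TABLE {live} TO {saved}", f"CREATE TABLE {live} LIKE {saved}"]
        return save, restore
    raise ValueError(f"unsupported DDL kind: {kind}")
```

For example, `ddl_plan("TRUNCATE", "shop", "orders")` yields a rename plus an empty re-create on the way in, and a drop plus rename on the way back.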
Flashback command
Double Sync Replication
Enhancing the data security guarantee
Lixun Peng @ Alibaba Cloud Compute
Problem of Async Replication
• The Master does not wait for an ACK from the Slave.
• The Slave doesn't know whether it has dumped the latest binary logs from the Master.
• When the Master crashes, the Slave can't check on its own whether it is the same as the Master.
• So the main problem is that the Slave doesn't know the status of the Master.
Semi-Sync Replication
Problem of SemiSync
• The Master must wait for an ACK from the Slave.
• The Slave downgrades to Async when a timeout happens.
• If the timeout is too small, timeouts happen frequently.
• If the timeout is too large, the Master is often blocked.
• After the network recovers, the Slave must dump the binary logs generated during the timeout. During that time, the Slave is still Async.
• When the Master crashes, the Slave doesn't know whether the Master was in Async or SemiSync mode.
• So the Slave still doesn't know whether it is the same as the Master when the Master crashes.
• So SemiSync doesn't solve the main problem of Async Replication.
Problem of Async/SemiSync
Background & Target
• Background
• SA guarantees server availability: 99.999%
• NA guarantees network availability: 99.999%
• So we can assume that when the Master crashes, the network does not time out at that same moment.
• Target
• The Slave can determine its status by itself (the same as the Master or not).
• If the data isn't the same as the Master's, notify the app and dev teams to fix the data, and show the range of lost data.
• Key point: avoid the Slave's status being unknown!
Solve the weak point of SemiSync
• Once SemiSync times out, even after the network recovers, the Slave still has to dump the binary logs generated during the timeout, under Async.
• What happens if, after a SemiSync timeout, we give up the binary logs generated during the timeout and the Master sends only the latest position and logs?
• Even after a network outage, the Slave always knows the latest position on the Master.
• So the Slave can tell whether its data is the same as the Master's.
• But if the Slave dumps only the latest data, how does it get the data from the period when the network was down?
• Async replication can dump the continuous binary logs.
• So we can use Async replication to apply the full log.
Combine the Async and SemiSync
• Async Replication (Async_Channel)
• Dumps continuous binary logs to guarantee that the Slave's logs are continuous.
• Applies logs immediately after receiving them.
• SemiSync Replication (Sync_Channel)
• Dumps the latest binary logs to guarantee that the Slave knows the latest position of the Master.
• Does not apply logs after receiving them; it just saves the logs and position, and outdated logs are purged automatically.
• Analyzing consistency
• Compare the log positions received on the two channels.
Combine the Async and SemiSync
How to create two channels (1)
• Multi-Source replication can create N channels in one Slave.
• Problem: When the Master receives two dump requests from servers with the same Server-ID, it disconnects the previous one.
• Solution: We give the Sync Channel a special Server-ID (0xFFFFFF).
How to create two channels (2)
• Problem: One Slave has both a SemiSync and a non-SemiSync Channel, but the SemiSync settings are global.
• Solution: We moved the SemiSyncSlave class into Master_info.
Analyzing consistency
• Using the GTID
• Using the Log_file_name and Log_file_pos
• How to judge: see the following pictures.
Analyzing consistency
CASE 1: No Fix Needed
• The GTIDs of the Sync and Async Channels are the same.
CASE 2: Can't Fix
• There is a gap between the logs of the Sync and Async Channels.
CASE 3: Can Repair
• Combined, the two channels' logs are continuous.
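The three cases can be expressed as a small decision function. This is a sketch under simplifying assumptions, not the patch's logic: GTIDs are reduced to integer sequence numbers in a single domain, the Async Channel is assumed to hold everything up to `async_last` without holes, and the Sync Channel is assumed to retain one continuous window from `sync_first` to `sync_last`. The function name and return labels are hypothetical.

```python
def classify(async_last: int, sync_first: int, sync_last: int) -> str:
    """Compare what the two channels received and pick one of the three cases.

    async_last: last sequence number received by the Async Channel.
    sync_first, sync_last: the window of sequence numbers the Sync Channel kept.
    """
    if sync_last <= async_last:
        # The Async Channel already has everything the Sync Channel knows about.
        return "NO_FIX_NEEDED"
    if sync_first > async_last + 1:
        # Events between async_last and sync_first exist on neither channel.
        return "CANNOT_FIX"
    # The windows touch or overlap, so together the logs are continuous.
    return "CAN_REPAIR"
```

For instance, with the Async Channel at 10, a Sync window of 9 to 15 can be repaired, while a Sync window of 13 to 15 leaves events 11 and 12 missing.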
How to Repair
• Wait until the Async Channel has applied all the logs it received, then start the SQL thread of the Sync Channel.
• GTID filters out the events already applied by the Async Channel.
• We provide the REPAIR SLAVE command to do all of this automatically.
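The GTID filtering step can be illustrated as follows. It is a minimal sketch, assuming the Async Channel has already finished applying its logs; `repair`, the event tuples, and the payload strings are hypothetical, and this is not how REPAIR SLAVE is implemented internally.

```python
def repair(applied_gtids: set[str], sync_events: list[tuple[str, str]]) -> list[str]:
    """Apply Sync Channel events that the Async Channel has not applied yet.

    applied_gtids: GTIDs already applied through the Async Channel (updated in place).
    sync_events: (gtid, payload) pairs saved by the Sync Channel, in log order.
    Returns the payloads that were newly applied.
    """
    newly_applied = []
    for gtid, payload in sync_events:
        if gtid in applied_gtids:
            continue  # already applied by the Async Channel, skip it
        applied_gtids.add(gtid)
        newly_applied.append(payload)
    return newly_applied
```

Because the filter is keyed on GTID, overlapping events between the two channels are applied only once, and running the repair a second time applies nothing.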
Multi-Source Replication
N Masters and 1 Slave
Lixun Peng @ Alibaba Cloud Compute
Why we need multi-source
• OLAP
• Most users run MySQL with data sharding.
• Multi-Source helps users combine the data from their sharded instances.
• If you use Master-Slave for backup, Multi-Source lets you back up many instances into one, which is easy to maintain.
How Multi-Source Is Implemented
What changes in the code
• Move Rpl_filter/skip_slave_counters into Master_info.
• Every channel creates a new Master_info.
• Every replication-related function uses the corresponding Master_info.
• We created a Master_info_index class to maintain all Master_info objects.
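The structure described above, one per-channel object holding its own filter and skip counter plus an index that looks channels up by connection name, can be mocked in Python. This is only a model of the idea: the real classes are C++ inside the server, and the fields and methods below (`host`, `ignored_tables`, `add`, `get`, `all`) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MasterInfo:
    """Per-channel replication state (hypothetical, simplified fields)."""
    connection_name: str
    host: str
    skip_counter: int = 0                              # per channel, no longer global
    ignored_tables: set = field(default_factory=set)   # stands in for Rpl_filter


class MasterInfoIndex:
    """Keeps every MasterInfo, keyed by connection name."""

    def __init__(self) -> None:
        self._by_name: dict[str, MasterInfo] = {}

    def add(self, connection_name: str, host: str) -> MasterInfo:
        if connection_name in self._by_name:
            raise ValueError(f"connection {connection_name!r} already exists")
        info = MasterInfo(connection_name, host)
        self._by_name[connection_name] = info
        return info

    def get(self, connection_name: str = "") -> MasterInfo:
        # The empty name plays the role of the default connection.
        return self._by_name[connection_name]

    def all(self) -> list[MasterInfo]:
        return list(self._by_name.values())
```

Since each channel owns its filter and skip counter, changing one channel's settings leaves the others untouched.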
The Syntax
• CHANGE MASTER ["connection_name"] ...
• FLUSH RELAY LOGS ["connection_name"]
• MASTER_POS_WAIT(....,["connection_name"])
• RESET SLAVE ["connection_name"]
• SHOW RELAYLOG ["connection_name"] EVENTS
• SHOW SLAVE ["connection_name"] STATUS
• SHOW ALL SLAVES STATUS
• START SLAVE ["connection_name"...]
• START ALL SLAVES ...
• STOP SLAVE ["connection_name"] ...
• STOP ALL SLAVES ...
The Syntax
• set @@default_master_connection='';
• show status like 'Slave_running';
• set @@default_master_connection='connection';
• show status like 'Slave_running';
How it runs
Thread Memory Monitor
Knowing how MySQL uses memory
Lixun Peng @ Alibaba Cloud Compute
Why we need TMM
• MySQL's memory limits work well only in the storage engines.
• For example, in InnoDB: innodb_buffer_pool_size
• In the Server layer we can limit memory for only some features, such as sort_buffer_size and join_buffer_size.
• But for a big query, most of the memory cost comes from MEM_ROOT, and there is no option to limit it.
• So when the mysqld process uses too much memory, we don't know which thread is responsible.
• And then we don't know which thread to kill to release the memory.
How to solve it
• Add a hack in my_malloc.
• Record the size of each malloc and which thread requested the memory.
• Calculate the total memory size across all threads.
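The accounting idea can be sketched outside the server as a small Python class that records each allocation against the calling thread. It is an illustration only: the real change lives in the C allocator wrapper, and `ThreadMemoryMonitor`, its integer handles, and its method names are hypothetical.

```python
import threading
from collections import defaultdict


class ThreadMemoryMonitor:
    """Tracks how many bytes each thread currently holds."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._bytes_by_thread = defaultdict(int)  # thread id -> bytes in use
        self._owner = {}                           # handle -> (thread id, size)
        self._next_handle = 0

    def malloc(self, size: int) -> int:
        """Record an allocation of `size` bytes for the calling thread."""
        tid = threading.get_ident()
        with self._lock:
            self._next_handle += 1
            handle = self._next_handle
            self._owner[handle] = (tid, size)
            self._bytes_by_thread[tid] += size
        return handle

    def free(self, handle: int) -> None:
        """Release an allocation and credit it back to the thread that made it."""
        with self._lock:
            tid, size = self._owner.pop(handle)
            self._bytes_by_thread[tid] -= size

    def per_thread(self) -> dict:
        with self._lock:
            return dict(self._bytes_by_thread)

    def total(self) -> int:
        with self._lock:
            return sum(self._bytes_by_thread.values())

    def heaviest(self) -> int:
        """Return the id of the thread holding the most memory."""
        with self._lock:
            return max(self._bytes_by_thread, key=self._bytes_by_thread.get)
```

With per-thread totals available, the thread returned by `heaviest()` is the natural candidate to inspect or kill when the process uses too much memory.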
THANKS!
