The ninja elephant
Scaling the analytics database in Transferwise
Federico Campoli
Transferwise
3rd February 2017
Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 1 / 56
First rule about talks, don’t talk about the speaker
Born in 1972
Passionate about IT since 1982 mostly because of the TRON movie
Joined the Oracle DBA secret society in 2004
In love with PostgreSQL since 2006
Currently runs the Brighton PostgreSQL User group
Works at Transferwise as Data Engineer
Table of contents
1 We have an appointment, and we are late!
2 The eye of the storm
3 MySQL Replica in a nutshell
4 How we did it
5 Maximum effort
6 Lessons learned
7 Wrap up
We have an appointment, and we are late!
The Gordian Knot of analytics db
I started the data engineer job in July 2016
I was assigned a task that was not customer facing
However, the task was critical to the business
I had to fix the performance issues on the MySQL analytics database,
which performed badly despite the considerable resources assigned to the VM
Tactical assessment
The existing database had the following configuration
MySQL 5.6
Innodb buffer size 60 GB
70 GB RAM
20 CPU
database size 600 GB
Looker and Tableau for running the analytic queries
The main live database replicated into the analytics database
Several schemas from the service databases imported on a regular basis
One schema used for obfuscating PII and denormalising the heavy queries
The frog effect
If you drop a frog in a pot of boiling water, it will of course frantically try to
clamber out. But if you place it gently in a pot of tepid water and turn the heat
on low, it will be slowly boiled to death.
The performance issues worsened over a two-year span
The obfuscation was done via custom views
The data size on the MySQL master increased over time,
causing the optimiser to switch to materialisation when accessing the views
The analytics tools struggled even under normal load
In busy periods the database became almost unusable
Analysts were busy tuning existing queries rather than writing new ones
A new solution was needed
The eye of the storm
One size doesn’t fit all
It was clear that MySQL was no longer a good fit.
However, the new solution had to meet some specific requirements.
Data updated in almost real time from the live database
PII obfuscated for the analysts
PII available in clear for the power users
The system should be able to scale out for several years
Modern SQL for better analytics queries
May the best database win
The analysts team shortlisted a few solutions.
Each solution partially covered the requirements.
Google BigQuery
Amazon RedShift
Snowflake
PostgreSQL
Google BigQuery and Amazon RedShift did not meet the analytics requirements
and were removed from the list.
Both PostgreSQL and Snowflake offered very good performance and modern SQL.
Neither of them offered a replication system from MySQL.
Straight into the cloud
Snowflake is a cloud-based data warehouse service. It’s built on Amazon S3 and
comes in different sizes.
Their pricing system is very appealing and the preliminary tests showed Snowflake
outperforming PostgreSQL1.
1 PostgreSQL on a single machine vs cloud-based parallel processing
Streaming copy
Using FiveTran, an impressive multi-technology data pipeline, the data would flow
in near real time from our production server to Snowflake.
Unfortunately there was just one little catch.
There was no support for obfuscation.
Customer comes first
At Transferwise we really care about our customers’ data security.
Our policy for PII data is that any personal information moving outside our
perimeter shall be obfuscated.
In order to be compliant, the database accessible by FiveTran would contain only
obfuscated data.
Proactive development
My DBA sense tingled. I foresaw the requirement, and in my spare time I built
a proof of concept based on the replica tool pg_chameleon.
The tool uses a Python library to replicate a MySQL database into
PostgreSQL.
The initial tests on a reduced dataset were successful.
It was simple to add real-time obfuscation with minimal changes.
And the winner is...
The initial idea was to use PostgreSQL to obfuscate the data consumed by FiveTran.
However, because the performance of PostgreSQL was quite good, and the
system had a good margin for scaling up, the decision was to keep the
analytics data behind our perimeter.
MySQL Replica in a nutshell
A quick look at the replication system
Let’s have a quick overview of how MySQL replication works and how the
replicator interacts with it.
The following slides explain how pg_chameleon works, because the custom
obfuscator tool shares most concepts and code with pg_chameleon.
Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 20 / 56
MySQL Replica
The MySQL replication protocol is logical
When MySQL is configured properly the RDBMS saves the changed data
into binary log files
The slave connects to the master and gets the replication data
The replication data is saved into the slave’s local relay logs
The local relay logs are replayed on the slave
A chameleon in the middle
pg_chameleon mimics a MySQL slave’s behaviour
It connects to the master and reads the data changes
It stores the row images into a PostgreSQL table using the jsonb format
A PL/pgSQL function decodes the rows and replays the changes
PostgreSQL acts as relay log and replication slave
With an extra cool feature:
it initialises the PostgreSQL replica schema in just one command
MySQL replica + pg_chameleon
Log formats
MySQL supports different formats for the binary logs.
The STATEMENT format logs the statements, which are replayed on the
slave.
It seems the best solution for performance.
However, replaying queries with non-deterministic elements generates
inconsistent slaves (e.g. inserts with uuid).
The ROW format is deterministic. It logs the row images and the DDL statements.
This is the format required by pg_chameleon.
MIXED takes the best of both worlds. The master logs the statements unless
a non-deterministic element is used. In that case it logs the row image.
How we did it
Replica and obfuscation
I built a minimum viable product for pg_chameleon.
The project was forked into a Transferwise-owned repository for the customisation.
The obfuscation capabilities and other specific procedures, such as the daily
data aggregation, were added there.
Mighty morphing power elephant
The replica initialisation locks the MySQL tables in read-only mode.
To avoid locking the main database for several hours, a secondary MySQL
replica was set up with local query logging enabled.
The cascading replica also allowed the use of the ROW binlog format, as the
master uses MIXED for performance reasons.
This is what awesome looks like!
A MySQL master is replicated into a MySQL slave
The slave logs the row changes locally in ROW format
PostgreSQL reads the slave’s binary logs and obfuscates the data in real time!
Replica initialisation
The replica initialisation follows the same rules as any MySQL replica setup:
Flush the tables with read lock
Get the master’s coordinates
Copy the data
Release the locks
The procedure pulls the data out of MySQL in CSV format for a fast load
into PostgreSQL with the COPY command.
This approach requires a tricky SQL statement.
First generate the select list
SELECT
    CASE
        WHEN data_type="enum"
        THEN
            SUBSTRING(COLUMN_TYPE,5)
    END AS enum_list,
    CASE
        WHEN
            data_type IN ('"""+"','".join(self.hexify)+"""')
        THEN
            concat('hex(',column_name,')')
        WHEN
            data_type IN ('bit')
        THEN
            concat('cast(`',column_name,'` AS unsigned)')
        ELSE
            concat('`',column_name,'`')
    END
    AS column_csv
FROM
    information_schema.COLUMNS
WHERE
        table_schema=%s
    AND table_name=%s
ORDER BY
    ordinal_position
;
Then use it in the MySQL query
csv_data = ""
sql_out = "SELECT " + columns_csv + " as data FROM " + table_name + ";"
self.mysql_con.connect_db_ubf()
try:
    self.logger.debug("Executing query for table %s" % (table_name, ))
    self.mysql_con.my_cursor_ubf.execute(sql_out)
except:
    self.logger.debug("an error occurred when pulling out the data from the table %s - sql executed: %s" % (table_name, sql_out))
Fallback on failure
The CSV data is pulled out in slices in order to avoid memory overload.
The file is then pushed into PostgreSQL using the COPY command.
However...
COPY is fast but runs in a single transaction
One failure and the entire batch is rolled back
If this happens the procedure loads the same data using INSERT
statements
Which can be very slow
But at least discards only the problematic rows
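The fallback strategy above can be sketched as plain control flow. This is an illustration, not the replicator's actual code: `copy_batch` and `insert_row` are hypothetical stand-ins for the real COPY and INSERT calls, so the logic can be shown without a database.

```python
# Sketch of the COPY-with-INSERT-fallback strategy. The callables are
# hypothetical stand-ins for the real psycopg2 COPY / INSERT operations.

def load_batch(rows, copy_batch, insert_row):
    """Try the fast path first; on any failure replay row by row,
    discarding only the rows that actually fail. Returns the discarded rows."""
    try:
        copy_batch(rows)               # fast: one transaction for the whole slice
        return []                      # nothing discarded
    except Exception:
        discarded = []
        for row in rows:
            try:
                insert_row(row)        # slow path: one INSERT per row
            except Exception:
                discarded.append(row)  # only the problematic row is lost
        return discarded
```

The point of returning the discarded rows is that they can then be logged for further inspection, as the replicator does.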
Obfuscation setup
A simple yaml file is used to list table, column and obfuscation strategy
user_details:
  last_name:
    mode: normal
    nonhash_start: 0
    nonhash_length: 0
  phone_number:
    mode: normal
    nonhash_start: 1
    nonhash_length: 2
  date_of_birth:
    mode: date
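A sketch of how a configuration like this might be applied. The semantics here are assumptions rather than the tool's documented behaviour: `normal` is taken to mean "hash with sha256, keeping `nonhash_length` characters from `nonhash_start` in clear", and `date` is assumed to truncate the date to the month.

```python
import hashlib

# Illustrative sketch only; the mode semantics are assumptions, not
# pg_chameleon's actual rules.
def obfuscate(value, mode="normal", nonhash_start=0, nonhash_length=0):
    if mode == "date":
        # assumed: blank the day component, keep year and month
        year, month, _day = str(value).split("-")
        return "%s-%s-01" % (year, month)
    # "normal": keep an optional slice in clear, append the sha256 digest
    clear = str(value)[nonhash_start:nonhash_start + nonhash_length]
    hashed = hashlib.sha256(str(value).encode()).hexdigest()
    return clear + hashed
```

With the config above, `last_name` becomes a pure hash, while `phone_number` keeps two characters in clear so analysts can still group by country prefix.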
Obfuscation when initialising
The obfuscation process is quite simple and uses the pgcrypto extension for
hashing in sha256.
When the replica is initialised the data is copied into the schema in clear
The table locks are released
The tables with PII are copied, obfuscated, into a separate schema
The process builds the indices on both the clear and the obfuscated
schemas
The tables without PII are exposed to the normal users using simple
views
All the varchar fields in the obfuscated schema are converted to text fields
Obfuscation on the fly
The obfuscation is also applied while the data is replicated.
The approach is very simple:
When a row image is captured, the process checks whether the table contains
PII data
If so, the process generates a second jsonb element with the PII data
obfuscated
Obfuscation on the fly
{’global_data’:
{
’binlog’: u’mysql-bin.000227’,
’logpos’: 1543,
’action’: ’update’,
’batch_id’: 2L,
’table’: u’user’,
’log_table’: ’t_log_replica_2’,
’schema’: ’sch_clear’
},
’event_data’:
{
u’email’: u’foo@bar.com’
}
}
{’global_data’:
{
’binlog’: u’mysql-bin.000227’,
’logpos’: 1543,
’action’: ’update’,
’batch_id’: 2L,
’table’: u’user’,
’log_table’: ’t_log_replica_2’,
’schema’: ’sch_obf’
},
’event_data’:
{
u’email’: u’2bc5aa7720b6a3462cdf8c1ae25ed8dc45b1d9e1b0cd960aa15ac72acfe20433’
}
}
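The clear/obfuscated pair above can be produced by a function along these lines. The dict layout mirrors the example row images; the function itself is an illustrative sketch, not the replicator's code, and `pii_columns` is a hypothetical parameter.

```python
import copy
import hashlib

def obfuscate_row_image(row_image, pii_columns):
    """Given a captured row image bound for sch_clear, return a copy
    bound for sch_obf with the PII columns hashed in sha256."""
    obf = copy.deepcopy(row_image)
    obf["global_data"]["schema"] = "sch_obf"   # obfuscated schema, as in the example
    for column in pii_columns:
        value = obf["event_data"].get(column)
        if value is not None:
            obf["event_data"][column] = hashlib.sha256(value.encode()).hexdigest()
    return obf
```

The deep copy keeps the clear row image untouched, so both versions can be queued for replay into their respective schemas.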
The DDL. A real pain in the back
The DDL replica is possible with a little trick.
MySQL, even in ROW format, emits the DDL as statements
A regular expression traps DDL statements like CREATE/DROP TABLE or
ALTER TABLE
The mysql library gets the table’s metadata from the information schema
The metadata is used to build the DDL in the PostgreSQL dialect
This approach may not be elegant but it is quite robust.
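A minimal sketch of the regular-expression trap described above. The real pattern is necessarily richer (IF EXISTS clauses, schema-qualified names and so on); this one only shows the idea.

```python
import re

# Simplified trap for the DDL statements mentioned in the slides.
DDL_PATTERN = re.compile(
    r"^\s*(CREATE\s+TABLE|DROP\s+TABLE|ALTER\s+TABLE)\s+`?(\w+)`?",
    re.IGNORECASE,
)

def trap_ddl(statement):
    """Return (ddl_kind, table_name) when the statement is a trapped DDL,
    otherwise None."""
    match = DDL_PATTERN.match(statement)
    if not match:
        return None
    kind = re.sub(r"\s+", " ", match.group(1).upper())
    return kind, match.group(2)
```

Once a DDL is trapped, the table name is what the replicator needs to look up the new metadata in information_schema and rebuild the statement in the PostgreSQL dialect.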
Maximum effort
Timing
Query                       MySQL            PostgreSQL   PostgreSQL cached
Master procedure            20 hours         4 hours      N/A
Extracting sharing ibans 2  didn't complete  3 minutes    1 minute
Adyen notification 3        6 minutes        2 minutes    6 seconds

2 small table with complex aggregations
3 big table scan with simple filters
Resource comparison
Resource         MySQL   PostgreSQL
Storage Size     940 GB  664 GB
Server CPUs      18      8
Server Memory    68 GB   48 GB
Shared Memory    50 GB   5 GB
Max connections  500     100
Advantages using PostgreSQL
Stronger security model
Better resource optimisation (see previous slide)
No invalid views
No performance issues with views
Complex analytics functions
Partitioning (thanks pg_pathman!)
BRIN indices
“Some code was optimised inside, but actually very little - maybe 10-20% was
improved. We’ll do more of that in the future, but not yet. The good thing is that
the performance gains we have can mostly be attributed just to PG vs MySQL. So
there’s a lot of scope to improve further.”
Jeff McClelland - Growth Analyst, data guru
Lessons learned
init replica tune
The replica initialisation required several improvements.
The first init replica implementation didn’t complete:
The OOM killer killed the process when the memory usage was too high
In order to speed up the replica, some large tables not required in the
analytics db were excluded from the init replica
Some tables required a custom slice size because their row length triggered
the OOM killer again
Estimating the total rows for the user’s feedback is faster, but the output
can be odd
Using unbuffered cursors improves both the speed and the memory usage
However... even after fixing the memory issues the initial copy took 6 days.
Tuning the copy speed with the unbuffered cursors and the row number estimates
brought the initial copy down to 30 hours,
including the time required for the index build.
Strictness is an illusion. MySQL doubly so
MySQL’s lack of strictness is not a mystery.
The replica broke down several times because of the funny way NOT NULL is
managed by MySQL.
To prevent further replica breakdowns, fields added as NOT NULL with
ALTER TABLE are always created as NULLable in PostgreSQL.
MySQL automatically truncates character strings to the varchar size. This
is a problem if the field is obfuscated on PostgreSQL, because the hashed string
might not fit into the corresponding varchar field. Therefore all the character
varying fields in the obfuscated schema are converted to text.
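The varchar-to-text conversion follows directly from the digest length: a sha256 hex digest is always 64 characters, regardless of the input size.

```python
import hashlib

# A sha256 hex digest is always 64 hex characters, so a hashed value can
# overflow a short varchar even when the original value fitted easily.
digest = hashlib.sha256("Bob".encode()).hexdigest()
assert len(digest) == 64   # would not fit into e.g. varchar(30)
```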
I feel your lack of constraint disturbing
MySQL can store rubbish data without raising any error.
When this happens, the replicator traps the error as the change is replayed on
PostgreSQL and discards the problematic row.
The value is logged in the replica’s log, available for further action.
Wrap up
Did you say hire?
WE ARE HIRING!
https://transferwise.com/jobs/
That’s all folks!
QUESTIONS?
Contacts and license
Twitter: 4thdoctor_scarf
Transferwise: https://transferwise.com/
Blog: http://www.pgdba.co.uk
Meetup: http://www.meetup.com/Brighton-PostgreSQL-Meetup/
This document is distributed under the terms of the Creative Commons
Boring legal stuff
The 4th doctor meme - source memecrunch.com
The eye, phantom playground, light end tunnel - Copyright Federico Campoli
The dolphin picture - Copyright artnoose
It could work. Young Frankenstein - source quickmeme
Deadpool Clap - source memegenerator
Deadpool Maximum Effort - source Deadpool Zoeiro
Automating Google Workspace (GWS) & more with Apps Script
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

The ninja elephant, scaling the analytics database in Transwerwise

  • 1. The ninja elephant Scaling the analytics database in Transferwise Federico Campoli Transferwise 3rd February 2017 Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 1 / 56
  • 2. First rule about talks, don’t talk about the speaker Born in 1972 Passionate about IT since 1982 mostly because of the TRON movie Joined the Oracle DBA secret society in 2004 In love with PostgreSQL since 2006 Currently runs the Brighton PostgreSQL User group Works at Transferwise as Data Engineer Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 2 / 56
  • 3. Table of contents 1 We have an appointment, and we are late! 2 The eye of the storm 3 MySQL Replica in a nutshell 4 How we did it 5 Maximum effort 6 Lessons learned 7 Wrap up Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 3 / 56
  • 4. Table of contents 1 We have an appointment, and we are late! 2 The eye of the storm 3 MySQL Replica in a nutshell 4 How we did it 5 Maximum effort 6 Lessons learned 7 Wrap up Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 4 / 56
  • 5. We have an appointment, and we are late! Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 5 / 56
  • 6. The Gordian Knot of analytics db I started the data engineer job in July 2016 I was involved in a task not customer facing However the task was very critical to the business Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 6 / 56
  • 7. The Gordian Knot of analytics db I started the data engineer job in July 2016 I was involved in a task not customer facing However the task was very critical to the business I had to fix the performance issues on the MySQL analytics database Which performed bad, despite the considerable resources assigned to the VM Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 6 / 56
  • 8. Tactical assessment The existing database had the following configuration MySQL 5.6 Innodb buffer size 60 GB 70 GB RAM 20 CPU database size 600 GB Looker and Tableau for running the analytic queries The main live database replicated into the analytics database Several schema from the service database imported on a regular basis One schema used for obfuscating PII and denormalising the heavy queries Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 7 / 56
  • 9. The frog effect If you drop a frog in a pot of boiling water, it will of course frantically try to clamber out. But if you place it gently in a pot of tepid water and turn the heat will be slowly boiled to death. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 8 / 56
  • 10. The frog effect If you drop a frog in a pot of boiling water, it will of course frantically try to clamber out. But if you place it gently in a pot of tepid water and turn the heat will be slowly boiled to death. The performance issues worsened over a two years span The obfuscation was made via custom views The data size on the MySQL master increased over time Causing the optimiser to switch on materialise when accessing the views The analytics tools struggled just under normal load In busy periods the database became almost unusable Analysts were busy to tune existing queries rather writing new A new solution was needed Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 8 / 56
  • 11. Table of contents 1 We have an appointment, and we are late! 2 The eye of the storm 3 MySQL Replica in a nutshell 4 How we did it 5 Maximum effort 6 Lessons learned 7 Wrap up Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 9 / 56
  • 12. The eye of the storm Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 10 / 56
  • 13. One size doesn’t fits all It was clear that MySQL was no longer a good fit. However the new solution’s requirements had to meet some specific needs. Data updated in almost real time from the live database PII obfuscated for the analysts PII available in clear for the power users The system should be able to scale out for several years Modern SQL for better analytics queries Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 11 / 56
  • 14. May the best database win The analysts team shortlisted few solutions. Each solution covered partially the requirements. Google BigQuery Amazon RedShift Snowflake PostgreSQL Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 12 / 56
  • 15. May the best database win The analysts team shortlisted few solutions. Each solution covered partially the requirements. Google BigQuery Amazon RedShift Snowflake PostgreSQL Google BigQuery and Amazon RedShift did not suffice the analytics requirements and were removed from the list. Both PostgreSQL and Snowflake offered very good performance and modern SQL. Neither of them offered a replication system from the MySQL system. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 12 / 56
  • 16. Straight into the cloud Snowflake is a cloud based data warehouse service. It’s based on Amazon S3 and comes with different sizing. Their pricing system is very appealing and the preliminary tests shown Snowflake outperforming PostgreSQL1 . 1PostgreSQL single machine vs cloud based parallel processing Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 13 / 56
  • 17. Streaming copy Using FiveTran, an impressive multi technology data pipeline, the data would flow in real time from our production server to Snowflake. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 14 / 56
  • 18. Streaming copy Using FiveTran, an impressive multi technology data pipeline, the data would flow in real time from our production server to Snowflake. Unfortunately there was just one little catch. There was no support for obfuscation. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 14 / 56
  • 19. Customer comes first In Transferwise we really care about the customer’s data security. Our policy for the PII data is that any personal information moving outside our perimeter shall be obfuscated. In order to be compliant the database accessible by Fivetran would have only obfuscated data. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 15 / 56
  • 20. Proactive development The sense of DBA tingled. I foresaw the requirement and in my spare time I built a proof of concept based on the replica tool pg chameleon. The tool which using a python library can replicate a MySQL database into PostgreSQL. The initial tests on a reduced dataset were successful. It was simple to add the obfuscation in real time with minimal changes. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 16 / 56
  • 21. And the winner is... The initial idea was to use PostgreSQL for obfuscate the data used by FiveTran. However, because the performance on PostgreSQL were quite good, and the system have good margin for scaling up, the decision was to keep the data analytics data behind our perimeter. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 17 / 56
  • 22. And the winner is... The initial idea was to use PostgreSQL for obfuscate the data used by FiveTran. However, because the performance on PostgreSQL were quite good, and the system have good margin for scaling up, the decision was to keep the data analytics data behind our perimeter. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 17 / 56
  • 23. Table of contents 1 We have an appointment, and we are late! 2 The eye of the storm 3 MySQL Replica in a nutshell 4 How we did it 5 Maximum effort 6 Lessons learned 7 Wrap up Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 18 / 56
  • 24. MySQL Replica in a nutshell Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 19 / 56
  • 25. A quick look to the replication system Let’s have a quick overview on how the MySQL replica works and how the replicator interacts with it. The following slides explain how pg chameleon works because the custom obfuscator tool shares with pg chameleon most concepts concepts and code. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 20 / 56
  • 26. MySQL Replica The MySQL replica protocol is logical When MySQL is configured properly the RDBMS saves the data changed into binary log files The slave connects to the master and gets the replication data The replication’s data are saved into the slave’s local relay logs The local relay logs are replayed into the slave Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 21 / 56
  • 27. MySQL Replica Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 22 / 56
  • 28. A chameleon in the middle pg chameleon mimics a mysql slave’s behaviour Connects to the master and reads data changes It stores the row images into a PostgreSQL table using the jsonb format A plpgSQL function decodes the rows and replay the changes Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 23 / 56
  • 29. A chameleon in the middle pg chameleon mimics a mysql slave’s behaviour Connects to the master and reads data changes It stores the row images into a PostgreSQL table using the jsonb format A plpgSQL function decodes the rows and replay the changes PostgreSQL acts as relay log and replication slave With an extra cool feature. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 23 / 56
  • 30. A chameleon in the middle pg chameleon mimics a mysql slave’s behaviour Connects to the master and reads data changes It stores the row images into a PostgreSQL table using the jsonb format A plpgSQL function decodes the rows and replay the changes PostgreSQL acts as relay log and replication slave With an extra cool feature. Initialises the PostgreSQL replica schema in just one command Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 23 / 56
  • 31. MySQL replica + pg chameleon Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 24 / 56
  • 32. Log formats MySQL supports different formats for the binary logs. The STATEMENT format. It logs the statements which are replayed on the slave. It seems the best solution for performance. However replaying queries with not deterministic elements generate inconsistent slaves (e.g. insert with uuid). The ROW format is deterministic. It logs the row image and the DDL queries. This is the format required for pg chameleon to work. MIXED takes the best of both worlds. The master logs the statements unless a not deterministic element is used. In that case it logs the row image. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 25 / 56
  • 33. Table of contents 1 We have an appointment, and we are late! 2 The eye of the storm 3 MySQL Replica in a nutshell 4 How we did it 5 Maximum effort 6 Lessons learned 7 Wrap up Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 26 / 56
  • 34. How we did it Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 27 / 56
  • 35. Replica and obfuscation I built a minimum viable product for pg chameleon. The project was forked into a transferwise owned repository for the customisation. It were added the the obfuscation capabilities and other specific procedures like the daily data aggregation. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 28 / 56
  • 36. Mighty morphing power elephant The replica initialisation locks the mysql tables in read only mode. To avoid the main database to be locked for several hours a secondary MySQL replica is setup with the local query logging enabled. The cascading replica also allowed to use the ROW binlog format as the master uses MIXED for performance reasons. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 29 / 56
  • 37. This is what awesome looks like! A MySQL master is replicated into a MySQL slave Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 30 / 56
  • 38. This is what awesome looks like! A MySQL master is replicated into a MySQL slave The slave logs the row changes locally in ROW format PostgreSQL reads the slave’s replica and obfuscates the data in realtime! Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 30 / 56
  • 39. This is what awesome looks like! A MySQL master is replicated into a MySQL slave The slave logs the row changes locally in ROW format PostgreSQL reads the slave’s replica and obfuscates the data in realtime! Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 30 / 56
  • 40. Replica initialisation The replica initialisation follows the same rules of any mysql replica setup Flush the tables with read lock Get the master’s coordinates Copy the data Release the locks The procedure pulls the data out from mysql using the CSV format for a fast load in PostgreSQL with the COPY command. This approach requires with a tricky SQL statement. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 31 / 56
  • 41. First generate the select list SELECT CASE WHEN data_type="enum" THEN SUBSTRING(COLUMN_TYPE ,5) END AS enum_list , CASE WHEN data_type IN (’"""+" ’,’". join(self.hexify)+""" ’) THEN concat(’hex(’,column_name ,’)’) WHEN data_type IN (’bit ’) THEN concat(’cast(‘’,column_name ,’‘ AS unsigned)’) ELSE concat(’‘’,column_name ,’‘’) END AS column_csv FROM information_schema .COLUMNS WHERE table_schema =%s AND table_name =%s ORDER BY ordinal_position ; Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 32 / 56
  • 42. Then use it into mysql query csv_data="" sql_out="SELECT "+columns_csv+" as data FROM "+table_name+";" self.mysql_con.connect_db_ubf() try: self.logger.debug("Executing query for table %s" % (table_name, )) self.mysql_con.my_cursor_ubf.execute(sql_out) except: self.logger.debug("an error occurred when pulling out the data from the table %s - sql executed: %s" % (table_name, sql_out)) Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 33 / 56
  • 43. Fallback on failure The CSV data is pulled out in slices in order to avoid memory overload. The file is then pushed into PostgreSQL using the COPY command. However... COPY is fast but is single transaction One failure and the entire batch is rolled back If this happens the procedure loads the same data using the INSERT statements Which can be very slow But at least discards only the problematic rows Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 34 / 56
  • 44. obfuscation setup A simple yaml file is used to list table, column and obfuscation strategy u s e r d e t a i l s : last name : mode : normal nonhash start : 0 nonhash length : 0 phone number : mode : normal nonhash start : 1 nonhash length : 2 d a t e o f b i r t h : mode : date Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 35 / 56
  • 45. Obfuscation when initialising The obfuscation process is quite simple and uses the extension pgcrypt for hashing in sha256. When the replica is initialised the data is copied into the schema in clear The table locks are released The tables with PII are copied and obfuscated in a separate schema The process builds the indices on the schemas with data in clear and obfuscated The tables without PII data are exposed to the normal users using simple views All the varchar fields in the obfuscated schema are converted in text fields Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 36 / 56
  • 46. Obfuscation on the fly The obfuscation is also applied when the data is replicated. The approach is very simple. When a row image is captured the process checks if the table contains PII data In that case the process generates a second jsonb element with the PII data obfuscated Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 37 / 56
  • 47. Obfuscation on the fly {’global_data’: { ’binlog’: u’mysql-bin.000227’, ’logpos’: 1543, ’action’: ’update’, ’batch_id’: 2L, ’table’: u’user’, ’log_table’: ’t_log_replica_2’, ’schema’: ’sch_clear’ }, ’event_data’: { u’email’: u’foo@bar.com’ } } {’global_data’: { ’binlog’: u’mysql-bin.000227’, ’logpos’: 1543, ’action’: ’update’, ’batch_id’: 2L, ’table’: u’user’, ’log_table’: ’t_log_replica_2’, ’schema’: ’sch_obf’ }, ’event_data’: { u’email’: u’2bc5aa7720b6a3462cdf8c1ae25ed8dc45b1d9e1b0cd960aa15ac72acfe20433’ } } Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 38 / 56
  • 48. The DDL. A real pain in the back The DDL replica is possible with a little trick. MySQL even in ROW format emits the DDL as statements A regular expression traps the DDL like CREATE/DROP TABLE or ALTER TABLE. The mysql library gets the table’s metadata from the information schema The metadata is used to build the DDL in the PostgreSQL dialect This approach may not be elegant but is quite robust. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 39 / 56
  • 49. Table of contents 1 We have an appointment, and we are late! 2 The eye of the storm 3 MySQL Replica in a nutshell 4 How we did it 5 Maximum effort 6 Lessons learned 7 Wrap up Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 40 / 56
  • 50. Maximum effort Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 41 / 56
  • 51. Timing Query MySQL PostgreSQL PostgreSQL cached Master procedure 20 hours 4 hours N/A Extracting sharing ibans2 didn’t complete 3 minutes 1 minute Adyen notification3 6 minutes 2 minutes 6 seconds 2small table with complex aggregations 3big table scan with simple filters Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 42 / 56
  • 52. Resource comparison Resource MySQL PostgreSQL Storage Size 940 GB 664 GB Server CPUs 18 8 Server Memory 68 GB 48 GB Shared Memory 50 GB 5 GB Max connections 500 100 Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 43 / 56
  • 53. Advantages using PostgreSQL Stronger security model Better resource optimisation (See previous slide) No invalid views No performance issues with views Complex analytics functions partitioning (thanks pg pathman!) BRIN indices Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 44 / 56
  • 54. Advantages using PostgreSQL Stronger security model Better resource optimisation (See previous slide) No invalid views No performance issues with views Complex analytics functions partitioning (thanks pg pathman!) BRIN indices some code was optimised inside, but actually very little - maybe 10-20% was improved. We’ll do more of that in the future, but not yet. The good thing is that the performance gains we have can mostly be attributed just to PG vs MySQL. So there’s a lot of scope to improve further. Jeff McClelland - Growth Analyst, data guru Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 44 / 56
  • 55. Table of contents 1 We have an appointment, and we are late! 2 The eye of the storm 3 MySQL Replica in a nutshell 4 How we did it 5 Maximum effort 6 Lessons learned 7 Wrap up Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 45 / 56
  • 56. Lessons learned Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 46 / 56
  • 57. init replica tune The replica initialisation required several improvements. The first init replica implementation didn’t complete. The OOM killer killed the process when the memory usage was too high. In order to speed up the replica, some large tables not required in the analytics db were excluded from the init replica. Some tables required a custom slice size because the row length triggered again the OOM killer. Estimating the total rows for user’s feedback is faster but the output can be odd. Using not buffered cursors improves the speed and the memory usage. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 47 / 56
  • 58. init replica tune The replica initialisation required several improvements. The first init replica implementation didn’t complete. The OOM killer killed the process when the memory usage was too high. In order to speed up the replica, some large tables not required in the analytics db were excluded from the init replica. Some tables required a custom slice size because the row length triggered again the OOM killer. Estimating the total rows for user’s feedback is faster but the output can be odd. Using not buffered cursors improves the speed and the memory usage. However.... even after fixing the memory issues the initial copy took 6 days. Tuning the copy speed with the unbuffered cursors and the row number estimates improved the initial copy speed which now completes in 30 hours. Including the time required for the index build. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 47 / 56
  • 59. Strictness is an illusion. MySQL doubly so MySQL’s lack of strictness is not a mystery. The replica broke down several times because of the funny way the NOT NULL is managed by MySQL. To prevent any further replica breakdown the fields with NOT NULL added with ALTER TABLE, in PostgreSQL are always as NULLable. MySQL truncates the strings of characters at the varchar size automatically. This is a problem if the field is obfuscated on PostgreSQL because the hashed string could not fit into the corresponding varchar field. Therefore all the character varying on the obfuscated schema are converted to text. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 48 / 56
  • 60. I feel your lack of constraint disturbing Rubbish data in MySQL can be stored without errors raised by the DBMS. When this happens the replicator traps the error when the change is replayed on PostgreSQL and discards the problematic row. The value is logged on the replica’s log, available for further actions. Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 49 / 56
  • 61. Table of contents 1 We have an appointment, and we are late! 2 The eye of the storm 3 MySQL Replica in a nutshell 4 How we did it 5 Maximum effort 6 Lessons learned 7 Wrap up Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 50 / 56
  • 62. Wrap up Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 51 / 56
  • 63. Did you say hire? WE ARE HIRING! https://transferwise.com/jobs/ Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 52 / 56
  • 64. That’s all folks! QUESTIONS? Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 53 / 56
  • 65. Contacts and license Twitter: 4thdoctor scarf Transferwise: https://transferwise.com/ Blog:http://www.pgdba.co.uk Meetup: http://www.meetup.com/Brighton-PostgreSQL-Meetup/ This document is distributed under the terms of the Creative Commons Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 54 / 56
  • 66. Boring legal stuff The 4th doctor meme - source memecrunch.com The eye, phantom playground, light end tunnel - Copyright Federico Campoli The dolphin picture - Copyright artnoose It could work. Young Frankenstein - source quickmeme Deadpool Clap - source memegenerator Deadpool Maximum Effort - source Deadpool Zoeiro Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 55 / 56
  • 67. The ninja elephant Scaling the analytics database in Transferwise Federico Campoli Transferwise 3rd February 2017 Federico Campoli (Transferwise) The ninja elephant 3rd February 2017 56 / 56