This presentation goes through performance tuning basics in MySQL Cluster.
It also covers the new parameters and status variables of MySQL Cluster 7.2 that help diagnose issues with, e.g., disk data performance and query (join) performance.
Conference slides: MySQL Cluster Performance Tuning
1. Performance Tuning of MySQL Cluster
Santa Clara, April 2012
Johan Andersson
Severalnines AB
johan@severalnines.com
Cell +46 73 073 60 99
Copyright Severalnines 2012
2. Agenda
Scaling and Partitioning
Designing a Scalable System
Insert Performance Tuning
Query Tuning
Random tricks
Disk Data Tuning
3. Here is ...
[Diagram: an access layer with two application servers, each with a MySQL server, on top of a storage layer with two data nodes (partitions P0 and P1) forming node group 0.]
4. It can scale linearly ...
[Diagram: an access layer with eight application servers (five using MySQL servers, three using the NDB API) on top of a storage layer with six data nodes (partitions P0 to P5) in node groups 0, 1 and 2.]
5. if you find the bottlenecks
A lot of CPU is used on the data nodes
Probably a lot of large index scans and full table scans are used.
A lot of CPU is used on the mysql servers
Probably a lot of GROUP BY/DISTINCT or aggregate functions.
Hardly any CPU is used on either the mysql servers or the data nodes
Probably low load
Time is spent on network (a lot of “ping pong” to satisfy a request).
System is running slow in general
Disks (io util), queries, swap (should never happen)
6. and if you know how
Adding mysqlds – trivial – if the mysqld is the bottleneck
BUT! Adding data nodes
More data nodes do not automatically give better performance
● Latency may increase for a single query
● Total throughput will be improved
How to get both?
7. Designing a Scalable System
Define the most typical Use Cases
List all my friends, session management etc etc.
Optimize everything for the typical use case
Keep it simple
Complex access patterns do not scale
Simple access patterns do (Primary Key lookups and Partitioned Index Scans)
Note! There is no parameter in config.ini that affects performance – only availability.
Everything is about the Schema and the Queries.
Tune the mysql servers (sort buffers etc) as you would for innodb.
8. Simple Access
PRIMARY KEY lookups are hash lookups: O(1).
INDEX searches traverse a T-tree and take O(log n) time.
In 7.2 JOINs are OK, but in 7.1 you should try to avoid them.
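As a rough illustration of the two access paths, the sketch below uses a Python dict as a stand-in for the hash index and a sorted list with binary search as a stand-in for the ordered index. Both are assumptions made for illustration only: NDB's real hash index and T-tree are different structures, and every name here is hypothetical.

```python
import bisect

# Hypothetical in-memory "table": primary key -> row.
rows_by_pk = {uid: "row-%d" % uid for uid in range(1000)}

# Stand-in for an ordered index: the keys kept in sorted order.
ordered_keys = sorted(rows_by_pk)


def pk_lookup(uid):
    """Hash-style lookup: one probe, independent of table size."""
    return rows_by_pk.get(uid)


def range_scan(low, high):
    """Ordered-index style scan: binary search for the bounds, then a slice."""
    start = bisect.bisect_left(ordered_keys, low)
    end = bisect.bisect_right(ordered_keys, high)
    return ordered_keys[start:end]
```

The lookup cost stays flat as the table grows, while the scan pays a logarithmic search plus the cost of every key it returns.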
9. Data Access
[Diagram: access layer (eight application servers, five using MySQL servers and three using the NDB API) and storage layer (six data nodes, partitions P0 to P5, node groups 0 to 2).]
10. Data Access
One Request can hit all Partitions
Sub-optimal and won't scale
11. Data Access
[Diagram: access layer (eight application servers, five using MySQL servers and three using the NDB API) and storage layer (six data nodes, partitions P0 to P5, node groups 0 to 2).]
12. Data Access
One Request hits one partition
Scales!
The number of Partitions (data nodes) does not matter!
Partitioning is important!
13. Partitioning
MySQL Cluster auto-partitions based on the Primary Key
Data is spread randomly
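If possible, it is better to partition on a part of the Primary Key
CREATE TABLE user_friends (
  userid INT NOT NULL,        -- column types are illustrative
  friendid INT NOT NULL,
  somedata VARCHAR(255),
  PRIMARY KEY (userid, friendid)
) ENGINE=ndbcluster PARTITION BY KEY (userid);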
All records with userid=X will be stored in the same partition!
Ultra important for MySQL Cluster 7.2 and Fast JOINs.
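A minimal sketch of why partitioning on userid alone keeps a user's rows together. It hashes the chosen key columns and takes the result modulo the partition count. The hash function (MD5 via hashlib), the six partitions, and the sample rows are all assumptions for illustration and do not reproduce NDB's actual distribution algorithm.

```python
import hashlib


def partition_for(key_parts, num_partitions):
    """Map a tuple of key column values to a partition number (illustrative hash)."""
    raw = "|".join(str(part) for part in key_parts).encode("utf-8")
    digest = hashlib.md5(raw).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


# Hypothetical (userid, friendid) rows.
rows = [(1, 10), (1, 11), (1, 12), (2, 10)]

# Default: hash the full primary key, so one user's rows may land anywhere.
by_full_pk = {partition_for((u, f), 6) for (u, f) in rows if u == 1}

# PARTITION BY KEY(userid): hash only userid, so one user maps to one partition.
by_userid = {partition_for((u,), 6) for (u, f) in rows if u == 1}
```

With the userid-only key, by_userid always contains exactly one partition, which is what allows a query for userid=X to be pruned to a single partition.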
14. Partitioning
EXPLAIN PARTITIONS <query>
Tells you what partitions you touch.
Also verify with:
mysql> show global status like 'ndb_pruned_scan_count';
+-----------------------+-------+
| Variable_name | Value |
+-----------------------+-------+
| Ndb_pruned_scan_count | 1 |
+-----------------------+-------+
Increases when Partition Pruning could be used.
15. Insert Performance
Scaling Inserts
Option 1) Batch INSERTS if you can
● An insert batch of 10 records will perform 10x faster than 10 single inserts!
● INSERT INTO t1 VALUES (<record1>), (<record2>), …,(<recordN>)
Option 2) Many threads (parallelism)
Or a combo of both
Dumpfiles or LOAD DATA INFILEs
Chunk them up and load in parallel on several mysqlds
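The two options above can be sketched as a pair of helpers: one builds a single multi-row INSERT with placeholders, the other splits a large row set into chunks that separate threads or mysqlds could load in parallel. The function names are hypothetical, and the %s placeholder style is an assumption that matches common Python MySQL drivers. Nothing here talks to a database.

```python
def build_batch_insert(table, columns, rows):
    """Return (sql, params) for one multi-row INSERT statement."""
    if not rows:
        raise ValueError("need at least one row")
    # One "(%s, %s, ...)" group per record.
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table,
        ", ".join(columns),
        ", ".join([row_placeholder] * len(rows)),
    )
    # Flatten the rows so the parameters line up with the placeholders.
    params = [value for row in rows for value in row]
    return sql, params


def chunked(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]
```

Each chunk would be passed to build_batch_insert and executed on its own connection, combining batching with parallelism.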
16. Insert Performance
INSERTs in a table with AUTO_INCREMENT
The MySQL Server queries the data nodes for an auto_increment value
– The mysqld can hold a range of autoincs (cache)
– Before an INSERT, an autoinc must be fetched either from the data nodes (slow) or from the cache (fast)
ndb_autoincrement_prefetch_sz sets the cache size and it affects insert performance:
– ndb_autoincrement_prefetch_sz=1: 1211.91 TPS
– ndb_autoincrement_prefetch_sz=256: 3471.71 TPS
– ndb_autoincrement_prefetch_sz=1024: 3659.52 TPS
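A simplified model of why the cache size matters: if each fetch from the data nodes reserves prefetch_sz values, a run of inserts from one mysqld needs roughly one round trip per prefetch_sz rows. This is an assumption-laden sketch with a hypothetical function name; the server's real prefetch behaviour has more detail than this.

```python
def autoinc_round_trips(num_inserts, prefetch_sz):
    """Estimated fetches from the data nodes for num_inserts single-row INSERTs."""
    if num_inserts < 0 or prefetch_sz < 1:
        raise ValueError("num_inserts must be >= 0 and prefetch_sz >= 1")
    # Ceiling division: every fetch covers up to prefetch_sz inserts.
    return -(-num_inserts // prefetch_sz)
```

Under this model, 1024 inserts cost 1024 round trips with a cache of 1 but only 4 with a cache of 256, which is consistent with the direction of the TPS figures above.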
17. Insert Performance
ndb_batch_size can also be important with LOAD DATA INFILE or dumps
SET GLOBAL|SESSION ndb_batch_size = 16*1024*1024 (16M; SET does not accept the M suffix)
You may get LongMessageBuffer overload
● Increase it in config.ini to 32M or 48M
Also REDO logs may get overloaded if your disks are too slow and/or the REDO log is too small.
18. Query Performance
Queries need to be tuned as “usual”:
Slow query / general log
From a monitoring system (like ClusterControl)
+ a methodology
19. Query Performance
Slow query log
set global slow_query_log=1;
set global long_query_time=0.01;
set global log_queries_not_using_indexes=1;
General log (if you don’t get enough info in the Slow Query Log)
Activate for a very short period of time (30-60 seconds) – intrusive
Can fill up disk very fast – make sure you turn it off.
set global general_log=1;
Use Severalnines ClusterControl
Includes a Cluster-wide Query Monitor.
Query frequency, EXPLAINs, lock time etc.
Performance Monitor and Manager.
20. Data Types
BLOB/TEXT columns are stored in an external hidden table.
First 255B are stored inline in main table
Reading a BLOB/TEXT requires two reads
One for reading the Main table + reading from hidden table
Change to VARBINARY/VARCHAR if:
Your BLOB/TEXTs can fit within an 8052B record
(record size is currently 8052 Bytes)
Reading/writing VARCHAR/VARBINARY is less expensive
Note 1: BLOB/TEXT are also more expensive in InnoDB, as BLOB/TEXT data is not inlined with the table. Thus, two disk seeks are needed to read a BLOB.
Note 2: Store images, movies etc outside the database on the filesystem.
21. Data Types
Example
CREATE TABLE `t1_blob` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`data1` blob,
`data2` blob,
PRIMARY KEY (`id`)
)ENGINE=ndbcluster
Performance (8 threads, one mysqld, two data nodes):
data1 and data2 as BLOBs: 5844 TPS
data1 and data2 as VARBINARY: 19206 TPS
~3x improvement
22. Denormalize
Tables sharing the same PRIMARY KEY can be denormalized.
Table T1: <UID, SOME_DATA>
Table T2: <UID, SOME_OTHER_DATA>
SELECT * from T1,T2 WHERE T1.UID=T2.UID and T2.UID=1 requires two roundtrips.
Starting with MySQL Cluster 7.2 only one roundtrip is needed.
Denormalize
Table T12: <UID,SOME_DATA, SOME_OTHER_DATA>
Improvement: 2X in throughput
23. Query Tuning < 7.2
Don't trust the OPTIMIZER in MySQL Cluster 7.1 and earlier
Statistics gathering is non-existent
Optimizer thinks there are only 10 rows to examine in each table!
You have to do a lot of FORCE INDEX / STRAIGHT_JOIN to get queries to run the way you want.
24. Query Tuning < 7.2
Classic example: if you have two similar indexes:
index(a)
index(a,ts)
on the following table
CREATE TABLE `t1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`a` bigint(20) DEFAULT NULL,
`ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `idx_t1_a` (`a`),
KEY `idx_t1_a_ts` (`a`,`ts`)) ENGINE=ndbcluster DEFAULT CHARSET=latin1
25. Query Tuning < 7.2
mysql> explain select * from t1 where a=2 and ts='2011-10-05 15:32:11';
+----+-------------+-------+------+----------------------+----------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------+----------+---------+-------+------+-------------+
| 1 | SIMPLE | t1 | ref | idx_t1_a,idx_t1_a_ts | idx_t1_a | 9 | const | 10 | Using where |
+----+-------------+-------+------+----------------------+----------+---------+-------+------+-------------+
Use FORCE INDEX(..) ...
mysql> explain select * from t1 FORCE INDEX (idx_t1_a_ts) where a=2 and ts='2011-10-05 15:32:11';
| 1 | SIMPLE | t1 | ref | idx_t1_a_ts | idx_t1_a_ts | 13 | const,const | 10 | Using where |
1 row in set (0.00 sec)
..to ensure the correct index is picked!
The difference can be 1 record read instead of any number of records!
26. Query Tuning in 7.2
ANALYZE TABLE
Must be performed periodically to rebuild index stats
EXPLAIN EXTENDED/PARTITIONS
Make sure the EXPLAIN shows “Child of JOIN pushed down”
This means that the Fast JOIN of NDB could be used
SHOW WARNINGS;
● Shows why a Query was not pushed down.
27. Ndb_cluster_connection_pool
Problem:
A SendBuffer on the connection between mysqld and the data nodes is protected by a mutex.
Connection threads in MySQL must acquire the mutex and then put data in the SendBuffer.
Many threads give more contention on the mutex
Must scale out with many MySQL Servers.
Workaround:
Ndb_cluster_connection_pool (in my.cnf) creates more connections from one mysqld to the data nodes
Threads are load balanced across the connections, which gives less contention on the mutex, which in turn gives increased scalability
Fewer MySQL Servers needed to drive load!
www.severalnines.com/cluster-configurator allows you to specify the connection pool.
>70 % improvement.
28. Ndb_cluster_connection_pool
Gives at least 70% better performance and a MySQL Server that can scale beyond four database connections.
Set Ndb_cluster_connection_pool=2x<CPU cores>
It is a good starting point
One free [mysqld] slot is required in config.ini for each Ndb_cluster_connection.
4 mysql servers, each with Ndb_cluster_connection_pool=8, require 32 [mysqld] slots in config.ini
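The slot arithmetic above can be sketched as follows. The helper names are hypothetical, and the generated text is only a bare list of empty [mysqld] sections, not a complete config.ini.

```python
def required_mysqld_slots(num_mysql_servers, connection_pool_size):
    """Each connection in each server's pool needs its own [mysqld] slot."""
    if num_mysql_servers < 0 or connection_pool_size < 1:
        raise ValueError("server count must be >= 0 and pool size >= 1")
    return num_mysql_servers * connection_pool_size


def mysqld_sections(slot_count):
    """Render slot_count empty [mysqld] sections, one per line."""
    return "\n".join(["[mysqld]"] * slot_count)
```

For the example on the slide, required_mysqld_slots(4, 8) gives the 32 slots that config.ini must provide.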
29. Disk Data Tuning
Disk Data Tables
Un-indexed columns → tablespace on disk
Indexed columns → DataMemory
DiskPageBufferMemory (DPBM) – LRU page cache
Like innodb_buffer_pool
Should be as big as possible
If data is not in the DPBM
Go to the tablespace (TS) and fetch (slow)
If data is in the DPBM
Return page (faster)
[Diagram: IndexMemory, DataMemory, DiskPageBufferMemory, REDO LOG, UNDO LOG, Tablespace]
30. Disk Data Tuning
DiskPageBufferMemory
– Hit ratio (derived from ndbinfo.diskpagebuffer):
– 1000*page_requests_direct_return/
(page_requests_direct_return +
page_requests_wait_io+
page_requests_wait_queue)
– 998 is good (like in innodb).
– DiskPageBufferMemory=2048M is a good start
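The hit ratio formula above can be written as a small helper. This is a sketch: the function name is hypothetical, the three arguments are assumed to be the counters of the same names read from ndbinfo.diskpagebuffer, and how they are summed across data nodes is left to the caller.

```python
def dpbm_hit_ratio(page_requests_direct_return,
                   page_requests_wait_io,
                   page_requests_wait_queue):
    """Hit ratio on a 0-1000 scale; None when no page requests were made."""
    total = (page_requests_direct_return
             + page_requests_wait_io
             + page_requests_wait_queue)
    if total == 0:
        return None
    return 1000.0 * page_requests_direct_return / total
```

A result near 998 would match the "good" level mentioned above; a much lower value suggests DiskPageBufferMemory is too small for the working set.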
31. Disk Data Tuning
UNDO LOG
– Often overlooked, but it can be extended over time
– Set it to 50% of the REDO log size:
● 0.5 x NoOfFragmentLogFiles x FragmentLogFileSize
Undo buffer (specified in CREATE LOGFILE GROUP)
– 32M to 64M (like the RedoBuffer)
– SharedGlobalMemory=512M
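A worked version of the sizing rule, taken as written on the slide. The function name is hypothetical, and the example values (16 fragment log files of 256 MB each) are assumptions chosen only to show the arithmetic.

```python
def undo_log_size_mb(no_of_fragment_log_files, fragment_log_file_size_mb):
    """UNDO log size in MB using the slide's rule: 0.5 x files x file size."""
    if no_of_fragment_log_files < 0 or fragment_log_file_size_mb < 0:
        raise ValueError("sizes must be non-negative")
    return 0.5 * no_of_fragment_log_files * fragment_log_file_size_mb


# Assumed example: NoOfFragmentLogFiles=16, FragmentLogFileSize=256M.
example_undo_mb = undo_log_size_mb(16, 256)
```

With these assumed values the rule suggests an UNDO log of about 2048 MB.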
32. More on Cluster
Severalnines Forum
– http://support.severalnines.com/forums/20323398-mysql-cluster
Johan Andersson @ blogspot
– http://johanandersson.blogspot.com
Configuration and Deployment
– http://www.severalnines.com/cluster-configurator
– ~20 min to deploy a 4 node cluster (288 seconds is the World Record)
Self-training
– http://severalnines.com/mysql-cluster-training