1. HandlerSocket plugin for
MySQL
Jun 29, 2010 DeNA Technology Seminar @ Yoyogi
IT Platform Dept., System Management Division
DeNA Co.,Ltd.
Akira Higuchi <higuchi dot akira at dena dot jp>
2. Who am I?
Akira Higuchi, Ph.D. in science
IT Platform Dept., DeNA Co.,Ltd.
system-wide performance optimization
middleware development
The creator of HandlerSocket plugin
Using GNU/Linux since 1993
Fedora: yum install KoboDeluxe
Debian: apt-get install kobodeluxe
5. What HandlerSocket aims
Executes simple CRUD operations fast
Omit SQL parsing
Combine multiple requests on the server
side
Allows SQL on the same database
Only simple operations can be faster
Seamless migration from SQL queries
6. HandlerSocket plugin
Offers a direct and non-SQL interface to MySQL
storage engines
Own TCP/IP listener
Talks a text protocol
There is a C++ and a Perl client libraries
Only works with Linux
The source code is here:
https://github.com/ahiguti/HandlerSocket-Plugin-for-
MySQL
More infos on the DeNA Tech Blog
http://engineer.dena.jp/ (in Japanese)
8. Other NoSQL interfaces to
MySQL
mycached
http://developer.cybozu.co.jp/kazuho/2009
/08/mycached-memcac.html
Works with any storage engines
Talks the memcached protocol
NDB API
http://dev.mysql.com/doc/ndbapi/en/index
.html
Dedicated for the ndbcluster engine
10. Performance
241009
159407
60191
15771
0 50000 100000 150000 200000 250000 300000
1 column
50 columns
(requests/sec)
handlersocket
libmysql
Handlersocket executes simple read queries 4x
faster than mysqld/libmysql
Very effective when many columns are retrieved
The reason is described later
11. Commands supported by
HandlerSocket (for reading data)
In pseudo-SQL...
SELECT f1, .. , fn FROM db.table
WHERE k1, ... , km = v1, ... , vm
ORDER BY index_i LIMIT offset, limit
(k1, ... , km) are the key fields (or a
prefix) of the index_i
=, >=, >, <=, and < can be used for a
comparator
12. Commands supported by
HandlerSocket (for modifying
data)
UPDATE, DELETE, and INSERT
Does not support transactions
Modifications are recorded to the
binary log in the row-based format
Modifications are durable
13. Command example
create table db1.table1 (k int key, v char(20))
insert into db1.table1 values (234, 'foo'), (678, ‘bar’)
$ telnet localhost 9998
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
P 0 db1 table1 PRIMARY k,v
0 1
0 = 1 234
0 2 234 foo
0 = 1 678
0 2 678 bar
opens the PK
find k = 234
find k = 678
14. Why fast?
No SQL parsing
low CPU usage
Executes multiple requests in bulk
low CPU/Disk usage
Own client/server protocol
small network transmission size
22. oprofile results –
HandlerSocket
samples % symbol name
129038 6.7241 tcp_sendmsg
80080 4.1729 tcp_v4_rcv
69658 3.6298 dev_queue_xmit
66171 3.4481 .text.skb_release_data
63316 3.2994 __qdisc_run
60279 3.1411 tcp_recvmsg
59703 3.1111 ip_output
58462 3.0464 .text.skb_release_head_state
48876 2.5469 tcp_ack
48733 2.5394 __alloc_skb
45660 2.3793 ip_queue_xmit
44671 2.3278 tcp_transmit_skb
CPU consumption in the Linux kernel
23. oprofile results –
HandlerSocket
HandlerSocket
Most CPU time is consumed in the kernel
schedule() is not called frequently
Inside mysqld, innodb eats most CPU
time
26. Threading model
HandlerSocket:
Small number of threads
Many connections per thread
Uses epoll()
Virtually unlimited number of concurrent
connections
Small memory footprint
27. HandlerSocket reader thread
reads requests from many clients
locks the DB, gets a read view
executes many requests
unlocks the DB
returns responses to clients
locks/unlocks
(1/#conns)
times per
request
handlersocket reader thread
28. HandlerSocket writer thread
reads requests from many clients
locks the DB, begins a transaction
executes multiple requests
commits, and unlocks the DB
returns responses to clients
handlersocket writer thread
executes multiple ops
in a single transaction
29. Write throughput
Condition:
Durable write
sync_binlog = 1
innodb_flush_log_at_trx_commit = 1
innodb_support_xa = 1
Write-back cache with BBU, or SSD
Throughput:
MySQL: up to 1000 qps
HandlerSocket: up to 30000 qps
30. How HandlerSocket locks tables
MyISAM:
Shared-exclusive lock
InnoDB:
Reader threads don’t block
Only one writer thread can be executed at the
same time
HandlerSocket requests are deadlock-free
Only simple operations are supported
32. MySQL C/S protocol
write(3, "L0003select column0,column1,column2,column3,column4 from
db_1.table_1 where k=15", 80) = 80
read(3,
"100100560023def4db_17table_17table_17column07column0fr
0<00037520000000060033def4db_17table_17table_17column
17column1fr0<00037520000000060043def4db_17table_17t
able_17column27column2fr0<00037520000000060053def4d
b_17table_17table_17column37column3fr0<0003752000000006
0063def4db_17table_17table_17column47column4fr0<000375
2000000500737600"0n0010001000110012001300145
00t37600"0", 16384) = 327
when the above query is executed...
SELECT column0, column1, column2, column3, column4
FROM db_1.table_1 where k = 15
33. HandlerSocket C/S protocol
write(3, "1t=t1t15n", 9) = 9
read(3, "0t5t0t1t2t3t4n", 8192) = 14
when an equivalent query is executed using
handlersocket...
libmysql handlersocket
request 80 bytes 9 bytes
response 327 bytes 14 bytes
34. MySQL C/S protocol
The strace result shows that MySQL C/S
protocol is verbose
Result-set metadata
http://forge.mysql.com/wiki/MySQL_Internals_ClientServe
r_Protocol#Field_Packet
Result-set metadata become very large if a
result-set has many columns
Neither a HANDLER statement nor a server-
side prepared statement does not help to
avoid this problem
39. HandlerSocket configuration
options
handlersocket_threads = 16
Number of reader threads
Recommended value is the number of logical
CPU
handlersocket_thread_wr = 1
Number of writer threads
Recommended value is ... 1
handlersocket_port = 9998
Listening port for reader requests
handlersocket_port_wr = 9999
Listening port for writer requests
40. Other configuration options
innodb_buffer_pool_size
As large as possible
innodb_log_file_size,
innodb_log_files_in_group
As large as possible
innodb_thread_concurrency = 0
open_files_limit = 65535
Number of file descriptors mysqld can open
HandlerSocket can handle up to 65000
concurrent connections