HandlerSocket plugin for MySQL
Jun 29, 2010 DeNA Technology Seminar @ Yoyogi
IT Platform Dept., System Management Division
DeNA Co.,Ltd.
Akira Higuchi <higuchi dot akira at dena dot jp>
Who am I?
• Akira Higuchi, Ph.D. in science
• IT Platform Dept., DeNA Co.,Ltd.
• system-wide performance optimization
• middleware development
• The creator of HandlerSocket plugin
• Using GNU/Linux since 1993
• Fedora: yum install KoboDeluxe
• Debian: apt-get install kobodeluxe
About HandlerSocket plugin
What is HandlerSocket?
• Non-SQL interface for MySQL
What HandlerSocket aims for
• Executes simple CRUD operations fast
  • Omits SQL parsing
  • Combines multiple requests on the server side
• Allows SQL on the same database
  • Only simple operations can be faster
  • Seamless migration from SQL queries
HandlerSocket plugin
• Offers a direct, non-SQL interface to MySQL storage engines
• Has its own TCP/IP listener
• Speaks a text protocol
• C++ and Perl client libraries are available
• Works only on Linux
• The source code is here:
  • https://github.com/ahiguti/HandlerSocket-Plugin-for-MySQL
• More information on the DeNA Tech Blog:
  • http://engineer.dena.jp/ (in Japanese)
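Construction
[Diagram: inside mysqld, the SQL Layer (behind the listener for libmysql) and the HandlerSocket plugin both sit on top of the Handler Interface, which fronts InnoDB, MyISAM, and other storage engines. In the client app, applications reach the SQL Layer through libmysql and the HandlerSocket plugin through libhsclient.]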
Other NoSQL interfaces to MySQL
• mycached
  • http://developer.cybozu.co.jp/kazuho/2009/08/mycached-memcac.html
  • Works with any storage engine
  • Speaks the memcached protocol
• NDB API
  • http://dev.mysql.com/doc/ndbapi/en/index.html
  • Dedicated to the ndbcluster engine
Performance
Performance (requests/sec)
                 1 column    50 columns
handlersocket      241009        159407
libmysql            60191         15771
• HandlerSocket executes simple read queries 4x faster than mysqld/libmysql
• Very effective when many columns are retrieved
  • The reason is described later
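As a quick check of the "4x" claim, the ratios can be computed from the chart values. This sketch assumes the four numbers pair up as handlersocket 241009 / 159407 and libmysql 60191 / 15771 for 1 and 50 columns respectively; the dictionary layout and the `speedup` helper are illustrative only.

```python
# Requests/sec read off the chart; the pairing of values to series is an assumption.
throughput = {
    "handlersocket": {"1 column": 241009, "50 columns": 159407},
    "libmysql": {"1 column": 60191, "50 columns": 15771},
}


def speedup(columns):
    """Ratio of HandlerSocket throughput to libmysql throughput."""
    return throughput["handlersocket"][columns] / throughput["libmysql"][columns]


for columns in ("1 column", "50 columns"):
    print(f"{columns}: {speedup(columns):.1f}x")
```

Under that pairing the gap is about 4x for a single column and grows to roughly 10x when 50 columns are retrieved.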
Commands supported by HandlerSocket (for reading data)
• In pseudo-SQL...
    SELECT f1, ..., fn FROM db.table
    WHERE (k1, ..., km) = (v1, ..., vm)
    ORDER BY index_i LIMIT offset, limit
• (k1, ..., km) are the key fields (or a prefix) of index_i
• =, >=, >, <=, and < can be used as the comparator
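The read command amounts to positioning on an index and walking it. The sketch below mimics that on an in-memory sorted list of (key, row) pairs. The `find` function, its parameters, and the choice to walk backwards from the match point for `<` and `<=` are assumptions made for illustration, not the plugin's code; only full-key comparisons are handled, and prefix matching is left out.

```python
import bisect


def find(index, op, key, limit=1, offset=0):
    """Return rows from a sorted list of (key_tuple, row) pairs.

    index -- list of (key_tuple, row), sorted by key_tuple
    op    -- one of '=', '>=', '>', '<=', '<'
    key   -- full key tuple to compare against
    """
    keys = [k for k, _ in index]
    lo = bisect.bisect_left(keys, key)    # first position with k >= key
    hi = bisect.bisect_right(keys, key)   # first position with k > key
    if op == "=":
        rows = index[lo:hi]
    elif op == ">=":
        rows = index[lo:]
    elif op == ">":
        rows = index[hi:]
    elif op == "<=":
        rows = index[:hi][::-1]           # walk backwards from the match point
    elif op == "<":
        rows = index[:lo][::-1]
    else:
        raise ValueError(f"unsupported comparator: {op}")
    return [row for _, row in rows[offset:offset + limit]]


# Hypothetical two-row table keyed on a single integer column.
table1 = [((234,), (234, "foo")), ((678,), (678, "bar"))]
print(find(table1, ">=", (234,), limit=10))
```

The `limit` and `offset` arguments play the role of `LIMIT offset, limit` in the pseudo-SQL: rows are skipped and counted in index order after the starting position has been found.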
Commands supported by HandlerSocket (for modifying data)
• UPDATE, DELETE, and INSERT
• Transactions are not supported
• Modifications are recorded in the binary log in row-based format
• Modifications are durable
Command example
• create table db1.table1 (k int key, v char(20))
• insert into db1.table1 values (234, 'foo'), (678, 'bar')
$ telnet localhost 9998
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
P 0 db1 table1 PRIMARY k,v        (opens the PK)
0 1
0 = 1 234                         (find k = 234)
0 2 234 foo
0 = 1 678                         (find k = 678)
0 2 678 bar
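The session above can be reproduced from a program by writing the same lines to the socket. A minimal sketch, assuming that fields are separated by tabs and each message ends with a newline (which is what the strace output in the protocol section indicates), and that values contain no tabs, newlines, or other control bytes, so escaping is ignored. The helper names are hypothetical.

```python
def open_index_request(index_id, db, table, index, columns):
    """Build the 'P' line that opens an index, e.g. P 0 db1 table1 PRIMARY k,v."""
    fields = ["P", str(index_id), db, table, index, ",".join(columns)]
    return "\t".join(fields).encode() + b"\n"


def find_request(index_id, op, key_values):
    """Build a find line: index id, comparator, number of key values, the values."""
    fields = [str(index_id), op, str(len(key_values)), *map(str, key_values)]
    return "\t".join(fields).encode() + b"\n"


def parse_response(line):
    """Split a response line into (first field, second field, remaining fields).

    In the session above the first field is 0 for each successful call and the
    second is the number of columns that follow.
    """
    fields = line.rstrip(b"\n").decode().split("\t")
    return int(fields[0]), int(fields[1]), fields[2:]


print(open_index_request(0, "db1", "table1", "PRIMARY", ["k", "v"]))
print(find_request(0, "=", [234]))
print(parse_response(b"0\t2\t234\tfoo\n"))
```

Sending these byte strings over a connection opened with `socket.create_connection(("localhost", 9998))` would mirror the telnet exchange.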
Why fast?
• No SQL parsing -> low CPU usage
• Executes multiple requests in bulk -> low CPU/disk usage
• Own client/server protocol -> small network transmission size
Eliminating CPU consumption
oprofile results – libmysql/mysqld
• Executes "SELECT v from table where k = ?" many times
samples| %|
------------------
9669940 53.1574 mysqld
4438098 24.3970 vmlinux
1835976 10.0927 libpthread-2.5.so
1680656 9.2389 libc-2.5.so
397970 2.1877 e1000e
89136 0.4900 oprofiled
42881 0.2357 oprofile
oprofile results – libmysql/mysqld
samples % symbol name
748022 7.7355 MYSQLparse(void*)
219702 2.2720 my_pthread_fastmutex_lock
205606 2.1262 make_join_statistics(JOIN*, TABLE_LIST*,
198234 2.0500 btr_search_guess_on_hash
180731 1.8690 JOIN::optimize()
177120 1.8317 row_search_for_mysql
171185 1.7703 lex_one_token(void*, void*)
162683 1.6824 alloc_root
131823 1.3632 read_view_open_now
122795 1.2699 mysql_select(THD*, Item***, TABLE_LIST*,
100276 1.0370 open_table(THD*, TABLE_LIST*, st_mem_root*,
99575 1.0297 mem_pool_fill_free_list
96434 0.9973 build_template(row_prebuilt_struct*, THD*,
86349 0.8930 get_hash_symbol(char const*, unsigned int,
• CPU usage inside mysqld
oprofile results – libmysql/mysqld
samples % symbol name
204393 4.6054 schedule
118648 2.6734 tcp_sendmsg
115832 2.6099 tcp_recvmsg
106537 2.4005 tcp_v4_rcv
103915 2.3414 tcp_ack
103534 2.3328 system_call
93864 2.1150 dev_queue_xmit
86831 1.9565 __mod_timer
85891 1.9353 tcp_rcv_established
84083 1.8946 .text.task_rq_lock
• CPU usage inside the Linux kernel
oprofile results – libmysql/mysqld
• libmysql/mysqld
  • Much CPU time is spent in mysqld
  • Parsing SQL is slow
  • schedule() is called frequently
oprofile results – HandlerSocket
samples| %|
------------------
1919039 51.0453 vmlinux
811998 21.5987 mysqld
421215 11.2041 libpthread-2.5.so
207166 5.5105 e1000e
191566 5.0955 handlersocket.so
188618 5.0171 libc-2.5.so
13622 0.3623 oprofiled
5707 0.1518 oprofile
• CPU usage inside MySQL with HandlerSocket
oprofile results – HandlerSocket
samples % symbol name
119684 14.7394 btr_search_guess_on_hash
58202 7.1678 row_search_for_mysql
46946 5.7815 mutex_delay
38617 4.7558 my_pthread_fastmutex_lock
37707 4.6437 buf_page_get_known_nowait
36528 4.4985 rec_get_offsets_func
34625 4.2642 build_template(row_prebuilt_struct*, THD*, TABLE*,
20024 2.4660 row_sel_store_mysql_rec
19347 2.3826 btr_cur_search_to_nth_level
16701 2.0568 row_sel_convert_mysql_key_to_innobase
13343 1.6432 cmp_dtuple_rec_with_match
11381 1.4016 ha_innobase::index_read(unsigned char*,
11176 1.3764 dict_index_copy_types
10762 1.3254 mtr_memo_slot_release
10734 1.3219 ha_innobase::init_table_handle_for_HANDLER()
• CPU consumption in mysqld
oprofile results – HandlerSocket
samples % symbol name
129038 6.7241 tcp_sendmsg
80080 4.1729 tcp_v4_rcv
69658 3.6298 dev_queue_xmit
66171 3.4481 .text.skb_release_data
63316 3.2994 __qdisc_run
60279 3.1411 tcp_recvmsg
59703 3.1111 ip_output
58462 3.0464 .text.skb_release_head_state
48876 2.5469 tcp_ack
48733 2.5394 __alloc_skb
45660 2.3793 ip_queue_xmit
44671 2.3278 tcp_transmit_skb
• CPU consumption in the Linux kernel
oprofile results – HandlerSocket
• HandlerSocket
  • Most CPU time is consumed in the kernel
  • schedule() is not called frequently
  • Inside mysqld, InnoDB consumes most of the CPU time
Executing multiple requests in bulk
Threading model
mysqld:
• Thread per connection (MySQL 5)
• Thread pooling (MySQL 6?)
Threading model
HandlerSocket:
• Small number of threads
  • Many connections per thread
  • Uses epoll()
• Virtually unlimited number of concurrent connections
• Small memory footprint
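The "many connections per thread" model can be illustrated with Python's `selectors` module, which uses epoll on Linux. This is a sketch of the general technique rather than of the plugin's C++ code: `serve_ready` and `demo` are hypothetical names, and local socket pairs stand in for real client connections.

```python
import selectors
import socket


def serve_ready(sel, handle, timeout=1.0):
    """One pass of the loop: answer every registered connection that has data."""
    served = 0
    for key, _events in sel.select(timeout):
        conn = key.fileobj
        data = conn.recv(4096)
        if not data:                      # peer closed the connection
            sel.unregister(conn)
            conn.close()
            continue
        conn.sendall(handle(data))
        served += 1
    return served


def demo(n_clients=3):
    """Serve n_clients connections from the calling thread alone."""
    sel = selectors.DefaultSelector()     # epoll-backed on Linux
    clients = []
    for _ in range(n_clients):
        server_end, client_end = socket.socketpair()
        server_end.setblocking(False)
        sel.register(server_end, selectors.EVENT_READ)
        clients.append(client_end)
    for i, client in enumerate(clients):
        client.sendall(f"ping {i}\n".encode())
    served = 0
    while served < n_clients:             # keep polling until everyone is answered
        served += serve_ready(sel, lambda data: data.upper())
    replies = [client.recv(4096) for client in clients]
    for client in clients:
        client.close()
    for key in list(sel.get_map().values()):
        sel.unregister(key.fileobj)
        key.fileobj.close()
    sel.close()
    return served, replies


print(demo())
```

No thread is created per connection; the memory cost of an extra client is one registered file descriptor, which is why the number of concurrent connections is bounded mainly by the file descriptor limit.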
HandlerSocket reader thread
1. reads requests from many clients
2. locks the DB, gets a read view
3. executes many requests
4. unlocks the DB
5. returns responses to clients
• Locks/unlocks (1/#conns) times per request
HandlerSocket writer thread
1. reads requests from many clients
2. locks the DB, begins a transaction
3. executes multiple requests
4. commits, and unlocks the DB
5. returns responses to clients
• Executes multiple ops in a single transaction
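The reader and writer loops above share one idea: the cost of locking (and, for writes, of committing) is paid once per batch rather than once per request. The following toy simulation shows the bookkeeping. `FakeEngine` is a hypothetical stand-in that merely counts locks and commits; it says nothing about the plugin's real internals.

```python
class FakeEngine:
    """Stand-in for the storage engine; it only counts expensive operations."""

    def __init__(self):
        self.rows = {}
        self.locks = 0
        self.commits = 0

    def lock(self):
        self.locks += 1

    def commit_and_unlock(self):
        # With durable settings, this is the step that waits for the disk.
        self.commits += 1


def run_writer_batch(engine, requests):
    """Apply every pending (key, value) update under one lock and one commit."""
    engine.lock()
    results = []
    for key, value in requests:
        engine.rows[key] = value
        results.append(0)                 # 0 stands for success in this sketch
    engine.commit_and_unlock()
    return results


engine = FakeEngine()
pending = [(k, f"v{k}") for k in range(30)]   # requests gathered from many clients
run_writer_batch(engine, pending)
print(engine.commits / len(pending))          # commits per request
```

With 30 requests gathered from different connections, the batch costs a single commit, so the per-request share of the commit shrinks as more clients are served by the same thread.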
Write throughput
• Condition:
  • Durable write
    • sync_binlog = 1
    • innodb_flush_log_at_trx_commit = 1
    • innodb_support_xa = 1
  • Write-back cache with BBU, or SSD
• Throughput:
  • MySQL: up to 1000 qps
  • HandlerSocket: up to 30000 qps
How HandlerSocket locks tables
• MyISAM:
  • Shared-exclusive lock
• InnoDB:
  • Reader threads don't block
  • Only one writer thread can run at a time
• HandlerSocket requests are deadlock-free
  • Only simple operations are supported
Client/server protocol
MySQL C/S protocol
write(3, "L\0\0\0\3select column0,column1,column2,column3,column4 from db_1.table_1 where k=15", 80) = 80
read(3,
"100100560023def4db_17table_17table_17column07column0fr
0<00037520000000060033def4db_17table_17table_17column
17column1fr0<00037520000000060043def4db_17table_17t
able_17column27column2fr0<00037520000000060053def4d
b_17table_17table_17column37column3fr0<0003752000000006
0063def4db_17table_17table_17column47column4fr0<000375
2000000500737600"0n0010001000110012001300145
00t37600"0", 16384) = 327
when the above query is executed...
SELECT column0, column1, column2, column3, column4
FROM db_1.table_1 where k = 15
HandlerSocket C/S protocol
write(3, "1\t=\t1\t15\n", 9) = 9
read(3, "0\t5\t0\t1\t2\t3\t4\n", 8192) = 14
when an equivalent query is executed using handlersocket...
             libmysql      handlersocket
request      80 bytes      9 bytes
response     327 bytes     14 bytes
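The size difference is easy to quantify from the byte counts reported by strace. A short sketch of the arithmetic follows; the byte string for the HandlerSocket request assumes tab separators and a trailing newline, which is consistent with the reported length of 9 bytes.

```python
# Byte counts taken from the two strace excerpts.
sizes = {
    "libmysql": {"request": 80, "response": 327},
    "handlersocket": {"request": 9, "response": 14},
}


def ratio(direction):
    """How many times larger the libmysql message is than the HandlerSocket one."""
    return sizes["libmysql"][direction] / sizes["handlersocket"][direction]


# The HandlerSocket request, assuming tab separators and a trailing newline.
hs_request = b"1\t=\t1\t15\n"
assert len(hs_request) == sizes["handlersocket"]["request"]

for direction in ("request", "response"):
    print(f"{direction}: {ratio(direction):.1f}x")
```

The response shrinks far more than the request because the MySQL response carries per-column metadata in addition to the row itself.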
MySQL C/S protocol
• The strace result shows that the MySQL C/S protocol is verbose
  • Result-set metadata: http://forge.mysql.com/wiki/MySQL_Internals_ClientServer_Protocol#Field_Packet
• Result-set metadata becomes very large if a result set has many columns
• Neither a HANDLER statement nor a server-side prepared statement helps to avoid this problem
Client libraries
libhsclient
• Client library for C++
Net::HandlerSocket
• Client library for Perl
• Invokes libhsclient via XS
use Net::HandlerSocket;
my $cli = Net::HandlerSocket->new(
  { host => 'localhost', port => 9999 });
$cli->open_index(1, 'db1', 'table1', 'PRIMARY', 'k,v');
my $res = $cli->exec_multi([
  [ 1, '=', [ '33' ], 1, 0 ],                         # find k = 33
  [ 1, '=', [ '44' ], 1, 0, 'U', [ '44', 'hoge' ] ],  # update the row with k = 44
  [ 1, '>=', [ '55' ], 10, 20 ],                      # up to 10 rows with k >= 55, skipping 20
]);
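For readers not using Perl, the same three requests can be serialised by hand. This is a sketch under assumptions: tab-separated fields, newline-terminated lines, limit and offset following the key values, and the modify operation ('U') with its new values at the end of the line. That layout is inferred from the Perl argument order and is not confirmed by the slides; `encode_request` and `encode_batch` are hypothetical names, and escaping of special bytes is ignored.

```python
def encode_request(index_id, op, keys, limit, offset, mod_op=None, mod_values=()):
    """Serialise one request; argument order follows the Perl exec_multi entries."""
    fields = [str(index_id), op, str(len(keys)), *keys, str(limit), str(offset)]
    if mod_op is not None:
        fields += [mod_op, *mod_values]
    return "\t".join(fields).encode() + b"\n"


def encode_batch(requests):
    """Concatenate several requests so they can go out in a single write."""
    return b"".join(encode_request(*request) for request in requests)


batch = encode_batch([
    (1, "=", ["33"], 1, 0),
    (1, "=", ["44"], 1, 0, "U", ["44", "hoge"]),
    (1, ">=", ["55"], 10, 20),
])
print(batch)
```

Sending the whole batch in one write is what lets the server pick up several requests in a single pass and execute them together.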
Configuration hints
HandlerSocket configuration options
• handlersocket_threads = 16
  • Number of reader threads
  • Recommended value is the number of logical CPUs
• handlersocket_thread_wr = 1
  • Number of writer threads
  • Recommended value is ... 1
• handlersocket_port = 9998
  • Listening port for reader requests
• handlersocket_port_wr = 9999
  • Listening port for writer requests
Other configuration options
• innodb_buffer_pool_size
  • As large as possible
• innodb_log_file_size, innodb_log_files_in_group
  • As large as possible
• innodb_thread_concurrency = 0
• open_files_limit = 65535
  • Number of file descriptors mysqld can open
  • HandlerSocket can handle up to 65000 concurrent connections
Other configuration options
• innodb_adaptive_hash_index = 1
  • The adaptive hash index is fast, but consumes memory
Options related to durability
• sync_binlog = 1
• innodb_flush_log_at_trx_commit = 1
• innodb_support_xa = 1
Benchmark results
Benchmark
• Server:
  • Core2Quad Q6600
  • CentOS 5.4
  • Single EXPI9301CT (e1000e)
  • Single Intel X25-E (write-back cache disabled)
• Schema:
  • CREATE TABLE table1 (k varchar(32) KEY, v varchar(32)) engine = INNODB;
• Read benchmark:
  • 10000000 records
  • SELECT v from table1 where k = ?
  • Random access
• Write benchmark:
  • 10000000 records
  • UPDATE table1 SET v = ? where k = ?
  • Random access
  • Durable write
    • sync_binlog = 1
    • innodb_flush_log_at_trx_commit = 1
    • innodb_support_xa = 1
Throughput (reads)
[Figure: x-axis: # of concurrent connections (1 to 16384); y-axis: queries per sec (0 to 300000); series: handlersocket read, mysql read, handlersocket write, mysql write]
Throughput (writes)
[Figure: x-axis: # of concurrent connections (1 to 16384); y-axis: queries per sec (1 to 100000, logarithmic); series: handlersocket write, mysql write]
Maximum response time
[Figure: x-axis: # of concurrent connections (1 to 16384); y-axis: max response time in seconds (0 to 60); series: handlersocket read, mysql read, handlersocket write, mysql write]
Average response time
[Figure: x-axis: # of concurrent connections (1 to 16384); y-axis: average response time in seconds (0 to 6); series: handlersocket read, mysql read, handlersocket write, mysql write]
Issues and future plans
Issues
• Difficult to build
  • Requires the MySQL source
  • MySQL binary compatibility?
Future plans
• 'where' clause
• Atomic read-modify-write operations
• SQL support?
• More language bindings

HandlerSocket plugin for MySQL (English)

  • 1. HandlerSocket plugin for MySQL Jun 29, 2010 DeNA Technology Seminar @ Yoyogi IT Platform Dept., System Management Division DeNA Co.,Ltd. Akira Higuchi <higuchi dot akira at dena dot jp>
  • 2. Who am I?  Akira Higuchi, Ph.D. in science  IT Platform Dept., DeNA Co.,Ltd.  system-wide performance optimization  middleware development  The creator of HandlerSocket plugin  Using GNU/Linux since 1993  Fedora: yum install KoboDeluxe  Debian: apt-get install kobodeluxe
  • 4. What is HandlerSocket?  Non-SQL interface for MySQL
  • 5. What HandlerSocket aims  Executes simple CRUD operations fast  Omit SQL parsing  Combine multiple requests on the server side  Allows SQL on the same database  Only simple operations can be faster  Seamless migration from SQL queries
  • 6. HandlerSocket plugin  Offers a direct and non-SQL interface to MySQL storage engines  Own TCP/IP listener  Talks a text protocol  There is a C++ and a Perl client libraries  Only works with Linux  The source code is here:  https://github.com/ahiguti/HandlerSocket-Plugin-for- MySQL  More infos on the DeNA Tech Blog  http://engineer.dena.jp/ (in Japanese)
  • 7. Construction Handler Interface Innodb MyISAM Other storage engines … SQL Layer Handlersocket Plugin Listener for libmysql libmysql libhsclient Applications mysqld client app
  • 8. Other NoSQL interfaces to MySQL  mycached  http://developer.cybozu.co.jp/kazuho/2009 /08/mycached-memcac.html  Works with any storage engines  Talks the memcached protocol  NDB API  http://dev.mysql.com/doc/ndbapi/en/index .html  Dedicated for the ndbcluster engine
  • 10. Performance 241009 159407 60191 15771 0 50000 100000 150000 200000 250000 300000 1 column 50 columns (requests/sec) handlersocket libmysql  Handlersocket executes simple read queries 4x faster than mysqld/libmysql  Very effective when many columns are retrieved  The reason is described later
  • 11. Commands supported by HandlerSocket (for reading data)  In pseudo-SQL... SELECT f1, .. , fn FROM db.table WHERE k1, ... , km = v1, ... , vm ORDER BY index_i LIMIT offset, limit  (k1, ... , km) are the key fields (or a prefix) of the index_i  =, >=, >, <=, and < can be used for a comparator
  • 12. Commands supported by HandlerSocket (for modifying data)  UPDATE, DELETE, and INSERT  Does not support transactions  Modifications are recorded to the binary log in the row-based format  Modifications are durable
  • 13. Command example  create table db1.table1 (k int key, v char(20))  insert into db1.table1 values (234, 'foo'), (678, ‘bar’) $ telnet localhost 9998 Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. P 0 db1 table1 PRIMARY k,v 0 1 0 = 1 234 0 2 234 foo 0 = 1 678 0 2 678 bar opens the PK find k = 234 find k = 678
  • 14. Why fast?  No SQL parsing low CPU usage  Executes multiple requests in bulk low CPU/Disk usage  Own client/server protocol small network transmission size
  • 16. oprofile results – libmysql/mysqld  Executes “SELECT v from table where k = ?” many times
      samples|       %|
      ------------------
      9669940  53.1574  mysqld
      4438098  24.3970  vmlinux
      1835976  10.0927  libpthread-2.5.so
      1680656   9.2389  libc-2.5.so
       397970   2.1877  e1000e
        89136   0.4900  oprofiled
        42881   0.2357  oprofile
  • 17. oprofile results – libmysql/mysqld  CPU usage inside mysqld
      samples  %        symbol name
      748022   7.7355   MYSQLparse(void*)
      219702   2.2720   my_pthread_fastmutex_lock
      205606   2.1262   make_join_statistics(JOIN*, TABLE_LIST*,
      198234   2.0500   btr_search_guess_on_hash
      180731   1.8690   JOIN::optimize()
      177120   1.8317   row_search_for_mysql
      171185   1.7703   lex_one_token(void*, void*)
      162683   1.6824   alloc_root
      131823   1.3632   read_view_open_now
      122795   1.2699   mysql_select(THD*, Item***, TABLE_LIST*,
      100276   1.0370   open_table(THD*, TABLE_LIST*, st_mem_root*,
       99575   1.0297   mem_pool_fill_free_list
       96434   0.9973   build_template(row_prebuilt_struct*, THD*,
       86349   0.8930   get_hash_symbol(char const*, unsigned int,
  • 18. oprofile results – libmysql/mysqld  CPU usage inside the Linux kernel
      samples  %        symbol name
      204393   4.6054   schedule
      118648   2.6734   tcp_sendmsg
      115832   2.6099   tcp_recvmsg
      106537   2.4005   tcp_v4_rcv
      103915   2.3414   tcp_ack
      103534   2.3328   system_call
       93864   2.1150   dev_queue_xmit
       86831   1.9565   __mod_timer
       85891   1.9353   tcp_rcv_established
       84083   1.8946   .text.task_rq_lock
  • 19. oprofile results – libmysql/mysqld  libmysql/mysqld  Much of the CPU time is spent in mysqld  Parsing SQL is slow  schedule() is called frequently
  • 20. oprofile results – HandlerSocket  CPU usage inside MySQL with HandlerSocket
      samples|       %|
      ------------------
      1919039  51.0453  vmlinux
       811998  21.5987  mysqld
       421215  11.2041  libpthread-2.5.so
       207166   5.5105  e1000e
       191566   5.0955  handlersocket.so
       188618   5.0171  libc-2.5.so
        13622   0.3623  oprofiled
         5707   0.1518  oprofile
  • 21. oprofile results – HandlerSocket  CPU consumption in mysqld
      samples  %        symbol name
      119684   14.7394  btr_search_guess_on_hash
       58202    7.1678  row_search_for_mysql
       46946    5.7815  mutex_delay
       38617    4.7558  my_pthread_fastmutex_lock
       37707    4.6437  buf_page_get_known_nowait
       36528    4.4985  rec_get_offsets_func
       34625    4.2642  build_template(row_prebuilt_struct*, THD*, TABLE*,
       20024    2.4660  row_sel_store_mysql_rec
       19347    2.3826  btr_cur_search_to_nth_level
       16701    2.0568  row_sel_convert_mysql_key_to_innobase
       13343    1.6432  cmp_dtuple_rec_with_match
       11381    1.4016  ha_innobase::index_read(unsigned char*,
       11176    1.3764  dict_index_copy_types
       10762    1.3254  mtr_memo_slot_release
       10734    1.3219  ha_innobase::init_table_handle_for_HANDLER()
  • 22. oprofile results – HandlerSocket  CPU consumption in the Linux kernel
      samples  %        symbol name
      129038   6.7241   tcp_sendmsg
       80080   4.1729   tcp_v4_rcv
       69658   3.6298   dev_queue_xmit
       66171   3.4481   .text.skb_release_data
       63316   3.2994   __qdisc_run
       60279   3.1411   tcp_recvmsg
       59703   3.1111   ip_output
       58462   3.0464   .text.skb_release_head_state
       48876   2.5469   tcp_ack
       48733   2.5394   __alloc_skb
       45660   2.3793   ip_queue_xmit
       44671   2.3278   tcp_transmit_skb
  • 23. oprofile results – HandlerSocket  HandlerSocket  Most CPU time is consumed in the kernel  schedule() is not called frequently  Inside mysqld, InnoDB consumes most of the CPU time
  • 25. Threading model mysqld:  Thread per connection (MySQL 5)  Thread pooling (MySQL 6?)
  • 26. Threading model HandlerSocket:  Small number of threads  Many connections per thread  Uses epoll()  Virtually unlimited number of concurrent connections  Small memory footprint
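The many-connections-per-thread model can be illustrated with Python's standard `selectors` module, which is backed by epoll on Linux. This is only a sketch of the general technique, not the plugin's C++ implementation: `serve_ready` is a hypothetical helper, and local socket pairs stand in for real client connections.

```python
import selectors
import socket


def serve_ready(selector, handle, timeout=1.0):
    """Serve every connection that has data ready; return how many were served."""
    served = 0
    for key, _events in selector.select(timeout):
        conn = key.fileobj
        data = conn.recv(4096)
        if data:
            conn.sendall(handle(data))
            served += 1
        else:  # the peer closed the connection
            selector.unregister(conn)
            conn.close()
    return served


selector = selectors.DefaultSelector()  # epoll-based on Linux
clients = []
for _ in range(50):
    server_side, client_side = socket.socketpair()
    server_side.setblocking(False)
    selector.register(server_side, selectors.EVENT_READ)
    clients.append(client_side)

# Every client sends a request before the single server thread wakes up.
for i, client in enumerate(clients):
    client.sendall(f"ping {i}\n".encode())

served = serve_ready(selector, handle=bytes.upper)
replies = [client.recv(4096) for client in clients]
print(served, replies[3])
```

One thread answers all 50 connections in a single pass over the ready set, so per-connection cost is a registered file descriptor rather than a thread stack.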
  • 27. HandlerSocket reader thread reads requests from many clients locks the DB, gets a read view executes many requests unlocks the DB returns responses to clients locks/unlocks (1/#conns) times per request handlersocket reader thread
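The "(1/#conns) lock/unlock operations per request" point can be shown with a small sketch. Everything here is a stand-in: a Python dict plays the table, `CountingLock` is a hypothetical wrapper that counts acquisitions, and `execute_batch` mimics one pass of a reader thread over the requests that arrived from several connections.

```python
import threading


class CountingLock:
    """A lock that records how many times it has been acquired."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1
        return self

    def __exit__(self, *exc_info):
        self._lock.release()


def execute_batch(db_lock, table, pending):
    """pending is a list of (connection_id, key); returns {connection_id: value}."""
    with db_lock:  # lock once for the whole batch, then unlock
        return {conn_id: table.get(key) for conn_id, key in pending}


db_lock = CountingLock()
table = {234: "foo", 678: "bar"}
# One request from each of 8 connections, collected before the lock is taken.
pending = [(conn_id, 234 if conn_id % 2 else 678) for conn_id in range(8)]
responses = execute_batch(db_lock, table, pending)
locks_per_request = db_lock.acquisitions / len(pending)
print(locks_per_request)
```

With 8 pending connections the lock is taken once, so the cost per request is 1/8 of a lock/unlock pair.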
  • 28. HandlerSocket writer thread reads requests from many clients locks the DB, begins a transaction executes multiple requests commits, and unlocks the DB returns responses to clients handlersocket writer thread executes multiple ops in a single transaction
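The writer thread's batching can be sketched the same way. The example below uses Python's built-in `sqlite3` module purely as a stand-in for the storage engine (an assumption for illustration; the plugin talks to MySQL storage engines directly), and `apply_batch` is a hypothetical helper that runs every pending update inside one transaction.

```python
import sqlite3


def apply_batch(conn, updates):
    """Apply all (key, value) updates in one transaction.

    Returns the number of rows each update modified, in request order.
    """
    results = []
    with conn:  # a single commit covers the whole batch
        for key, value in updates:
            cursor = conn.execute(
                "UPDATE table1 SET v = ? WHERE k = ?", (value, key)
            )
            results.append(cursor.rowcount)
    return results


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (k INTEGER PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(234, "foo"), (678, "bar")])
conn.commit()

# Three requests from different clients, committed together.
batch_results = apply_batch(conn, [(234, "hoge"), (678, "fuga"), (999, "none")])
print(batch_results)
```

Because the commit is the expensive, durable step, paying for it once per batch rather than once per request is what lets throughput grow with the number of concurrent writers.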
  • 29. Write throughput  Condition:  Durable write  sync_binlog = 1  innodb_flush_log_at_trx_commit = 1  innodb_support_xa = 1  Write-back cache with BBU, or SSD  Throughput:  MySQL: up to 1000 qps  HandlerSocket: up to 30000 qps
  • 30. How HandlerSocket locks tables  MyISAM:  Shared-exclusive lock  InnoDB:  Reader threads don’t block  Only one writer thread can run at a time  HandlerSocket requests are deadlock-free  Only simple operations are supported
  • 32. MySQL C/S protocol write(3, "L0003select column0,column1,column2,column3,column4 from db_1.table_1 where k=15", 80) = 80 read(3, "100100560023def4db_17table_17table_17column07column0fr 0<00037520000000060033def4db_17table_17table_17column 17column1fr0<00037520000000060043def4db_17table_17t able_17column27column2fr0<00037520000000060053def4d b_17table_17table_17column37column3fr0<0003752000000006 0063def4db_17table_17table_17column47column4fr0<000375 2000000500737600"0n0010001000110012001300145 00t37600"0", 16384) = 327 when the above query is executed... SELECT column0, column1, column2, column3, column4 FROM db_1.table_1 where k = 15
  • 33. HandlerSocket C/S protocol
      write(3, "1\t=\t1\t15\n", 9) = 9
      read(3, "0\t5\t0\t1\t2\t3\t4\n", 8192) = 14
    when an equivalent query is executed using handlersocket...
                  libmysql     handlersocket
      request     80 bytes     9 bytes
      response    327 bytes    14 bytes
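The two request sizes can be reproduced by hand. The sketch below frames the SQL text as a MySQL command packet (3-byte little-endian payload length, 1-byte sequence number, then the COM_QUERY command byte 0x03 and the query) and builds the equivalent HandlerSocket line. The helper names are hypothetical, and index id 1 is taken from the strace capture.

```python
import struct

COM_QUERY = 0x03  # MySQL command byte for a text query


def mysql_query_packet(sql, sequence_id=0):
    """Frame a query as one MySQL client/server protocol packet."""
    payload = bytes([COM_QUERY]) + sql.encode("utf-8")
    # Header: payload length as a 3-byte little-endian integer, then sequence id.
    header = struct.pack("<I", len(payload))[:3] + bytes([sequence_id])
    return header + payload


def handlersocket_find(index_id, key):
    """Build a single-key equality lookup in the tab-separated text protocol."""
    return f"{index_id}\t=\t1\t{key}\n".encode("utf-8")


sql = "select column0,column1,column2,column3,column4 from db_1.table_1 where k=15"
mysql_request = mysql_query_packet(sql)
hs_request = handlersocket_find(1, 15)
print(len(mysql_request), len(hs_request))  # 80 9
```

The request-side difference is almost entirely the SQL text itself; the larger gap on the response side comes from the per-column result-set metadata that the MySQL protocol sends.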
  • 34. MySQL C/S protocol  The strace result shows that the MySQL C/S protocol is verbose  Result-set metadata http://forge.mysql.com/wiki/MySQL_Internals_ClientServer_Protocol#Field_Packet  Result-set metadata becomes very large if a result set has many columns  Neither a HANDLER statement nor a server-side prepared statement helps to avoid this problem
  • 37. Net::HandlerSocket  Client library for Perl  Invokes libhsclient via XS
      use Net::HandlerSocket;
      my $cli = new Net::HandlerSocket(
        {host => 'localhost', port => 9999});
      $cli->open_index(1, 'db1', 'table1', 'PRIMARY', 'k,v');
      my $res = $cli->exec_multi([
        [ 1, '=', [ '33' ], 1, 0 ],                         # find k = 33
        [ 1, '=', [ '44' ], 1, 0, 'U', [ '44', 'hoge' ] ],  # find k = 44 and update the row
        [ 1, '>=', [ '55' ], 10, 20 ],                      # find k >= 55, limit 10, offset 20
      ]);
  • 39. HandlerSocket configuration options  handlersocket_threads = 16  Number of reader threads  Recommended value is the number of logical CPUs  handlersocket_threads_wr = 1  Number of writer threads  Recommended value is ... 1  handlersocket_port = 9998  Listening port for reader requests  handlersocket_port_wr = 9999  Listening port for writer requests
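As an illustration, the recommended values can be turned into a configuration fragment with a few lines of Python. `render_handlersocket_options` is a hypothetical helper; placing the options in the `[mysqld]` section and spelling the writer-thread option `handlersocket_threads_wr` are assumptions based on the plugin's documentation as I recall it, so check them against the installed version.

```python
import os


def render_handlersocket_options(logical_cpus):
    """Render the slide's recommended settings as a my.cnf-style fragment."""
    options = {
        "handlersocket_threads": logical_cpus,  # reader threads: one per logical CPU
        "handlersocket_threads_wr": 1,          # writer threads: 1 is recommended
        "handlersocket_port": 9998,             # reader requests
        "handlersocket_port_wr": 9999,          # writer requests
    }
    lines = ["[mysqld]"]  # assumption: plugin options live in the server section
    lines += [f"{name} = {value}" for name, value in options.items()]
    return "\n".join(lines) + "\n"


config_text = render_handlersocket_options(os.cpu_count() or 1)
print(config_text)
```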
  • 40. Other configuration options  innodb_buffer_pool_size  As large as possible  innodb_log_file_size, innodb_log_files_in_group  As large as possible  innodb_thread_concurrency = 0  open_files_limit = 65535  Number of file descriptors mysqld can open  HandlerSocket can handle up to 65000 concurrent connections
  • 41. Other configuration options  innodb_adaptive_hash_index = 1  The adaptive hash index is fast, but consumes memory
  • 42. Options related to durability  sync_binlog = 1  innodb_flush_log_at_trx_commit = 1  innodb_support_xa = 1
  • 44. Benchmark  Server:  Core2Quad Q6600  CentOS 5.4  Single EXPI9301CT(e1000e)  Single Intel X25-E (write-back cache disabled)  Schema:  CREATE TABLE table1 (k varchar(32) KEY, v varchar(32)) engine = INNODB;  Read benchmark:  10000000 records  SELECT v from table1 where k = ?  Random access  Write benchmark:  10000000 records  UPDATE table1 SET v = ? where k = ?  Random access  Durable write  sync_binlog = 1  innodb_flush_log_at_trx_commit = 1  innodb_support_xa = 1
  • 45. Throughput (reads) [Line chart: queries per sec vs. # of concurrent connections (1 to 16384); legend: handlersocket read, mysql read, handlersocket write, mysql write]
  • 46. Throughput (writes) [Line chart, logarithmic y-axis: queries per sec vs. # of concurrent connections (1 to 16384); legend: handlersocket write, mysql write]
  • 47. Maximum response time [Line chart: max response time (sec) vs. # of concurrent connections (1 to 16384); legend: handlersocket read, mysql read, handlersocket write, mysql write]
  • 48. Average response time [Line chart: average response time (sec) vs. # of concurrent connections (1 to 16384); legend: handlersocket read, mysql read, handlersocket write, mysql write]
  • 50. Issues  Difficult to build  Requires the source of mysql  MySQL binary compatibility?
  • 51. Future plans  ‘where’ clause  Atomic read-modify-write operations  SQL support?  More language bindings