Performance Optimization Techniques of
MessagePack-Ruby
Sadayuki Furuhashi
RubyKaigi 2019
Tweet your msgpack usage with #MyMessagePack
About me
A founder of Treasure Data, Inc.
Located in Silicon Valley, USA.
OSS Hacker. GitHub: @frsyuki
Basics of MessagePack
What's MessagePack?
It's like JSON, but fast and small.
JSON (27 bytes)
{"compact":true,"schema":0}
MessagePack (18 bytes, 33% smaller)
82 A7 compact C3 A6 schema 00
82: 2-element map
A7: 7-byte string ("compact")
C3: true
A6: 6-byte string ("schema")
00: integer 0
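The byte counts on this slide can be checked with a few lines of Ruby. This is only a sketch: it assembles the MessagePack bytes by hand from the hex dump above rather than calling the msgpack gem, and the variable names are illustrative.

```ruby
require "json"

doc = { "compact" => true, "schema" => 0 }

# JSON text without extra whitespace.
json = JSON.generate(doc)

# The same document, assembled byte by byte from the slide's hex dump.
msgpack = [
  0x82,                     # fixmap header: map with 2 entries
  0xA7, *"compact".bytes,   # fixstr header (7 bytes) followed by the key
  0xC3,                     # true
  0xA6, *"schema".bytes,    # fixstr header (6 bytes) followed by the key
  0x00                      # positive fixint 0
].pack("C*")

puts "JSON:        #{json.bytesize} bytes"     # 27
puts "MessagePack: #{msgpack.bytesize} bytes"  # 18
```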
What's MessagePack?
It's like JSON, but fast and small.
JSON
> Self-descriptive, Schema-on-Read semantics
> Everyone knows
> De facto standard data format
> Human-readable
MessagePack
> Self-descriptive, Schema-on-Read semantics
> Everyone who uses JSON can use
> Drop-in improvement of JSON
> Machine-readable
Language Agnostic Type System
MessagePack Format
String: fixstr, str 8, str 16, str 32
Timestamp: timestamp 32, timestamp 64, timestamp 96
MessagePack Type System (JSON compatible)
String
Timestamp
Language Types (Ruby, Swift, Java, Go, …)
String
Timestamp
Convert: Format <-> Type System <-> Language Types
Supported by Super Skilled Engineers All Over The World
Ruby /msgpack
MessagePack-Ruby Major Committers
Real World MessagePack
"We import over 2,000,000 records/sec and store 30PB of data in MessagePack format. 15 trillion rows processed every day."
Sada Furuhashi, Arm Treasure Data
"MessagePack is an essential component of Fluentd to achieve high performance and flexibility at the same time."
Sada Furuhashi, Initial Creator of Fluentd
Adoption of MessagePack Today
Mobile Apps
Microprocessors
Automotive Telematics
Sensors
Cloud Infrastructure
Middleware
Machine Learning
Games
Analytical Databases
MessagePack: zero-overhead, heterogeneous data exchange
MessagePack-Ruby implementation
MessagePack::Packer
Object
[
  1,
  2,
  {"k" => "v"},
  nil
]
Buffer (bytes appended in order)
94      4-element array
01      Integer 1
02      Integer 2
81      1-element map
A1 'k'  1-byte string
A1 'v'  1-byte string
C0      nil
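The packing walk-through above can be sketched in plain Ruby. This is a minimal illustration, not the gem's C implementation: `pack_subset` is a hypothetical helper that only knows the single-byte "fix" headers used in this example and raises for anything else.

```ruby
# Packs a small subset of Ruby objects into MessagePack bytes.
# Only single-byte headers are handled; larger values raise ArgumentError.
def pack_subset(obj, buffer = String.new(encoding: Encoding::BINARY))
  case obj
  when nil   then buffer << 0xC0
  when false then buffer << 0xC2
  when true  then buffer << 0xC3
  when Integer
    raise ArgumentError, "only 0..127 is supported here" unless (0..127).cover?(obj)
    buffer << obj                          # positive fixint
  when String
    raise ArgumentError, "string too long for fixstr" if obj.bytesize > 31
    buffer << (0xA0 | obj.bytesize)        # fixstr header
    buffer << obj.b                        # raw bytes of the string
  when Array
    raise ArgumentError, "array too long for fixarray" if obj.size > 15
    buffer << (0x90 | obj.size)            # fixarray header
    obj.each { |element| pack_subset(element, buffer) }
  when Hash
    raise ArgumentError, "map too large for fixmap" if obj.size > 15
    buffer << (0x80 | obj.size)            # fixmap header
    obj.each do |key, value|
      pack_subset(key, buffer)
      pack_subset(value, buffer)
    end
  else
    raise ArgumentError, "unsupported type: #{obj.class}"
  end
  buffer
end

packed = pack_subset([1, 2, { "k" => "v" }, nil])
puts packed.unpack("C*").map { |byte| format("%02X", byte) }.join(" ")
# prints: 94 01 02 81 A1 6B A1 76 C0
```

The container header is written first and the elements follow, so the packer never needs to look back at bytes it has already emitted.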
MessagePack::Unpacker
Buffer
94 01 02 81 A1 'k' A1 'v' C0
Stack (state after each byte sequence is read)
94      4-element array               [ □, □, □, □ ]
01      Integer 1 (complete object)   [ 1, □, □, □ ]
02      Integer 2 (complete object)   [ 1, 2, □, □ ]
81      1-element map                 [ 1, 2, □, □ ]  { □ => □ }
A1 'k'  1-byte string (complete)      [ 1, 2, □, □ ]  { "k" => □ }
A1 'v'  1-byte string (complete)      [ 1, 2, □, □ ]  { "k" => "v" }
        map is a complete object      [ 1, 2, {"k"=>"v"}, □ ]
C0      nil (complete object)         [ 1, 2, {"k"=>"v"}, nil ]
        array is a complete object    [1, 2, {"k"=>"v"}, nil]
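The stack-based decoding shown above can be sketched as follows. The names `Frame` and `unpack_subset` are hypothetical, and the sketch only understands the formats that appear in this example (positive fixint, nil, booleans, fixstr, fixarray, fixmap); it is not the gem's actual unpacker.

```ruby
# One stack entry per container that is still being filled.
Frame = Struct.new(:container, :remaining, :key)

def unpack_subset(bytes)
  data = bytes.b
  pos = 0
  stack = []

  loop do
    raise EOFError, "truncated input" if pos >= data.bytesize
    head = data.getbyte(pos)
    pos += 1

    complete = true
    obj = nil
    case head
    when 0x00..0x7F then obj = head                    # positive fixint
    when 0xC0 then obj = nil
    when 0xC2 then obj = false
    when 0xC3 then obj = true
    when 0xA0..0xBF                                    # fixstr
      len = head & 0x1F
      raise EOFError, "truncated string" if pos + len > data.bytesize
      obj = data.byteslice(pos, len).force_encoding(Encoding::UTF_8)
      pos += len
    when 0x90..0x9F                                    # fixarray
      count = head & 0x0F
      obj = []
      if count > 0
        stack.push(Frame.new(obj, count, nil))
        complete = false
      end
    when 0x80..0x8F                                    # fixmap
      count = head & 0x0F
      obj = {}
      if count > 0
        stack.push(Frame.new(obj, count * 2, nil))     # keys and values
        complete = false
      end
    else
      raise ArgumentError, format("unsupported header 0x%02X", head)
    end
    next unless complete

    # A complete object: store it in the container on top of the stack and
    # pop every container that becomes complete as a result.
    loop do
      return obj if stack.empty?
      frame = stack.last
      if frame.container.is_a?(Array)
        frame.container << obj
      elsif frame.remaining.even?
        frame.key = obj                                # first half of a pair
      else
        frame.container[frame.key] = obj               # second half of a pair
      end
      frame.remaining -= 1
      break if frame.remaining > 0
      obj = stack.pop.container
    end
  end
end

input = [0x94, 0x01, 0x02, 0x81, 0xA1, 0x6B, 0xA1, 0x76, 0xC0].pack("C*")
p unpack_subset(input)
```

Keeping an explicit stack instead of recursing is what lets a streaming unpacker stop in the middle of a value and resume when more bytes arrive; this sketch does not implement resumption.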
Optimization
MessagePack::Buffer
Object
{
  "count" => 1,
  "page" => 0,
  "body" => [
    "LONG TEXT"
  ]
}
MessagePack::Buffer
Chunk 1 (next, mapped_string, mem)
83 A5 'c' 'o' 'u' 'n' 't' 01 A4 'p' 'a' 'g' 'e' 00 A4 'b' 'o' 'd' 'y'
Chunk 2 (next, mapped_string, mem)
91 A9 'L' 'O' 'N' 'G' ' ' 'T' 'E' 'X' 'T'
Add more buffer chunks instead of realloc()
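The idea of growing by adding chunks can be illustrated in Ruby. This is a sketch under assumptions: `ChunkedBuffer` and its fixed `chunk_capacity` are hypothetical, and the real buffer is a linked list of chunks implemented in C.

```ruby
# A write buffer that grows by appending fixed-capacity chunks, so bytes
# that were already written are never moved to a larger allocation.
class ChunkedBuffer
  attr_reader :chunks

  def initialize(chunk_capacity)
    @chunk_capacity = chunk_capacity
    @chunks = [new_chunk]
  end

  def write(bytes)
    bytes = bytes.b
    offset = 0
    while offset < bytes.bytesize
      tail = @chunks.last
      room = @chunk_capacity - tail.bytesize
      if room.zero?
        tail = new_chunk          # tail is full: add a chunk, keep the old one
        @chunks << tail
        room = @chunk_capacity
      end
      part = bytes.byteslice(offset, room)
      tail << part
      offset += part.bytesize
    end
    self
  end

  # Concatenates the chunks only when the caller asks for one string.
  def to_s
    @chunks.join
  end

  private

  def new_chunk
    String.new(capacity: @chunk_capacity, encoding: Encoding::BINARY)
  end
end

buffer = ChunkedBuffer.new(8)
buffer.write("\x83\xA5count\x01".b).write("\xA4page\x00\xA4body".b)
p buffer.chunks.map(&:bytesize)
```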
Zero-copy write optimization
Object
{
  "count" => 1,
  "page" => 0,
  "body" => [
    "LONG TEXT"
  ]
}
MessagePack::Buffer
Chunk 1 (next, mapped_string, mem)
83 A5 'c' 'o' 'u' 'n' 't' 01 A4 'p' 'a' 'g' 'e' 00 A4 'b' 'o' 'd' 'y'
Chunk 2 (next, mapped_string, mem)
91 A9
Chunk 3 (next, mapped_string, mem)
"LONG TEXT", referenced through mapped_string instead of being copied
rb_str_dup()
Fast copy-on-write
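A Ruby-level sketch of the zero-copy write decision follows. The names `append_bytes` and `REFERENCE_THRESHOLD` are hypothetical and the threshold value is arbitrary. The sketch relies on CRuby's behavior that `String#dup` of a long string shares the underlying memory until one side is modified.

```ruby
# Hypothetical cut-off: payloads at least this long are referenced, not copied.
REFERENCE_THRESHOLD = 64

# Appends payload to a list of chunks. Short payloads are copied into the
# writable tail chunk; long payloads become a chunk of their own that only
# references the original string's memory (copy-on-write in CRuby).
def append_bytes(chunks, payload)
  if payload.bytesize >= REFERENCE_THRESHOLD
    chunks << payload.dup.force_encoding(Encoding::BINARY).freeze
  else
    # Frozen chunks are references to caller data and must not be appended to.
    chunks << String.new(encoding: Encoding::BINARY) if chunks.empty? || chunks.last.frozen?
    chunks.last << payload.b
  end
  chunks
end

long_text = "LONG TEXT " * 10                     # 100 bytes
chunks = []
append_bytes(chunks, [0x91, 0xD9, 100].pack("C*")) # array header + str 8 header
append_bytes(chunks, long_text)                    # referenced, not copied
append_bytes(chunks, [0xC0].pack("C"))             # next small write
p chunks.map(&:bytesize)
```

Freezing the referenced chunk is a design choice of this sketch: it marks the chunk as read-only so later small writes open a fresh chunk instead of forcing a copy.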
Zero-copy read optimization
Source String
"\x83\xA5count\x01\xA4page\x00\xA4body\x91\xA9LONG TEXT"
rb_str_dup()
Fast copy-on-write
MessagePack::Buffer
Chunk (next, mapped_string, mem): mapped_string shares memory with the source string
"\x83\xA5count\x01\xA4page\x00\xA4body\x91\xA9LONG TEXT"
rb_str_substr()
Fast copy-on-write
※ if SHARABLE_SUBSTRING_P() returns true
Object
{
  "count" => 1,
  "page" => 0,
  "body" => [
    "LONG TEXT"
  ]
}
Reserved memory pool
MessagePack::Buffer
Chunk 1 (next, mapped_string, mem)
83 A5 'c' 'o' 'u' 'n' 't' 01 A4 'p' 'a' 'g' 'e' 00 A4 'b' 'o' 'd' 'y'
Chunk 2 (next, mapped_string, mem)
91 A9
Global memory pool
4KB 4KB 4KB 4KB
4KB 4KB 4KB 4KB
Chunk memory (mem) comes from the global pool of reserved 4KB pages.
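The free-list idea behind a reserved page pool can be sketched in Ruby. `PagePool` and its methods are hypothetical names; the real pool manages raw C memory, which plain Ruby strings only approximate.

```ruby
# Keeps released 4KB pages on a free list so that the next buffer can reuse
# them instead of asking the allocator again.
class PagePool
  PAGE_SIZE = 4096

  def initialize(max_free_pages)
    @max_free_pages = max_free_pages
    @free = []
  end

  # Returns a pooled page if one is available, otherwise allocates a new one.
  # Pages are fixed-size; callers track how much of a page they have used.
  def acquire
    @free.pop || ("\0".b * PAGE_SIZE)
  end

  # Gives a page back. Pages beyond the pool limit are left to the GC.
  def release(page)
    @free.push(page) if @free.size < @max_free_pages
    nil
  end

  def free_count
    @free.size
  end
end

pool = PagePool.new(8)
page = pool.acquire
pool.release(page)
reused = pool.acquire
puts reused.equal?(page)   # the released page is handed out again
```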
Benchmark & further optimization
Benchmark Data Sets
DB
http://data.consumerfinance.gov/api/views.json
[{
  "id" : "gfmg-6ppu",
  "name" : "Mortgage Complaints",
  "averageRating" : 0,
  "createdAt" : 1433953219,
  "moderationStatus" : true,
  "numberOfComments" : 0,
  "description" : "Each week we …",
  …
Integers
100_000.times.map {|i| r.rand(i+1) }
[
  0,0,0,1,3,0,0,
  …,
  42991,26906,18655,7015
]
Blogs
https://www.justice.gov/api/v1/blog_entries.json
{
  "results": [
    {
      "attachments":[],
      "body": "Dear Friends and Colleagues,\n\nI always look forward to the…
Benchmark code available at
https://gist.github.com/frsyuki/9777c4adba2b5c957695b64f17b64ba1
MessagePack vs JSON
[Bar charts: serialization time and deserialization time for the DB, Integers, and Blogs data sets. Series: MessagePack, JSON (Oj), JSON (JSON). Horizontal axis from 0 to 3000, with a 100% marker.]
MessagePack without optimization
[Bar charts: serialization time and deserialization time for the DB, Integers, and Blogs data sets. Series: Default, No memory pool, No zero-copy read. Horizontal axis from 0 to 150, with a 100% marker.]
Copy-on-write substring
SHARABLE_SUBSTRING_P() returns true only when the substring shares the last \0 termination with the original string.
s = "a" * 1_000_100
10_000.times do
  s.slice(100, 1_000_000)
end
#=> 0.002 sec
s = "a" * 1_000_100
10_000.times do
  s.slice(100, 1_000_000 - 1)
end
#=> 2.7 sec
Not including the last character disables Copy-on-Write.
Copy-on-write substring
SHARABLE_SUBSTRING_P() returns true only when the substring shares the last \0 termination with the original string OR Ruby is compiled with the SHARABLE_MIDDLE_SUBSTRING flag (not enabled by default).
s = "a" * 1_000_100
10_000.times do
  s.slice(100, 1_000_000)
end
#=> 0.002 sec
s = "a" * 1_000_100
10_000.times do
  s.slice(100, 1_000_000 - 1)
end
#=> 0.002 sec (instead of 2.7 sec) using a ruby binary compiled with SHARABLE_MIDDLE_SUBSTRING
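The two measurements can be reproduced with the standard `benchmark` library. This is a sketch: the iteration count is reduced from the slide's 10,000 to keep the run short, and the absolute timings (and even the size of the gap) depend on the Ruby version and build flags.

```ruby
require "benchmark"

SIZE = 1_000_000
ITERATIONS = 200   # reduced from the slide's 10_000 to keep the run short

source = "a" * (SIZE + 100)

# Substring that reaches the end of the source: it can share the source's
# memory, including the terminating NUL byte.
to_end = Benchmark.realtime do
  ITERATIONS.times { source.slice(100, SIZE) }
end

# Substring that stops one byte early: on a default build this is expected
# to copy the bytes on every call.
middle = Benchmark.realtime do
  ITERATIONS.times { source.slice(100, SIZE - 1) }
end

printf("slice to end of string: %.4f sec\n", to_end)
printf("slice ending 1 early:   %.4f sec\n", middle)
```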
Deserialization with copy-on-write substring
[Bar charts: serialization time and deserialization time for the DB, Integers, and Blogs data sets. Series: Default, With SHARABLE_MIDDLE_SUBSTRING. Horizontal axis from 0 to 150, with a 100% marker.]
What is the deserialization bottleneck?
[Bar chart: deserialization objects/sec for each data set. Horizontal axis from 0 to 70,000,000.]
DB: On-Heap
Integers: Immediate
Blogs: On-Heap
Boolean: Immediate
(2^62)-1: Immediate
(2^62): On-Heap
=> Object allocation is slow
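The difference between immediate and on-heap values can be observed by counting allocations. This sketch assumes 64-bit CRuby, where Integers up to (2^62)-1 are immediate values and larger ones are heap objects; `allocations` is a hypothetical helper name.

```ruby
# Counts how many objects Ruby allocates while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

max_immediate = 2**62 - 1   # largest immediate Integer on 64-bit CRuby

# Results stay within the immediate range: no object is created.
immediate_allocs = allocations { 1_000.times { max_immediate - 1 } }

# Every result is 2**62, which needs an on-heap Integer object.
heap_allocs = allocations { 1_000.times { max_immediate + 1 } }

puts "immediate results: #{immediate_allocs} allocations"
puts "on-heap results:   #{heap_allocs} allocations"
```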
Reusing Hash key objects
data: [
  {
    "id": "s6ew-h6mp",
    "name": "Consumer Complaints",
    …
  },
  {
    "id": "nsyy-je5y",
    "name": "Beta Consumers",
    …
  },
  {
    "id": "wkue-ycpk",
    "name": "Survey",
    …
  }
]
Same key repeats.
String keys of a Hash are always frozen.
=> We can reuse objects!
How to reuse Hash key objects?
Use fstring. Ruby uses it to reuse the same object for identical immutable strings (but the C API is not available…yet):
# Without the magic comment:
p "a".object_id == "a".object_id
#=> false
# With "# frozen_string_literal: true" at the top of the file:
p "a".object_id == "a".object_id
#=> true
# With String#-@:
p (-"a").object_id == (-"a").object_id
#=> true
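At the Ruby level, the same effect can be sketched with `String#-@`, which returns a frozen, deduplicated string. `decode_row` is a hypothetical stand-in for a deserializer that receives every key as a freshly allocated String; newer Ruby versions may already deduplicate String keys on `Hash#[]=`, so the non-interned case can also print `true`.

```ruby
# Builds a Hash from [key, value] pairs, optionally interning the keys first.
def decode_row(pairs, intern_keys:)
  pairs.each_with_object({}) do |(key, value), row|
    key = -key if intern_keys   # frozen, deduplicated copy of the key
    row[key] = value
  end
end

# Simulated decoder input: each row carries its own copies of the key strings.
raw_rows = Array.new(3) do |i|
  [[String.new("id"), i], [String.new("name"), "row #{i}"]]
end

plain    = raw_rows.map { |pairs| decode_row(pairs, intern_keys: false) }
interned = raw_rows.map { |pairs| decode_row(pairs, intern_keys: true) }

# Do the first two rows use the very same String object for their first key?
puts plain[0].keys[0].equal?(plain[1].keys[0])
puts interned[0].keys[0].equal?(interned[1].keys[0])
```

With interning, a data set with many rows allocates each distinct key once instead of once per row.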
Reusing Hash key objects using fstring
[Bar charts: serialization time and deserialization time for the DB, Integers, and Blogs data sets. Series: Default, Hash key fstring using hacked Ruby binary. Horizontal axis from 0 to 150, with a 100% marker.]
Link Time Optimization
[Bar charts: serialization time and deserialization time for the DB, Integers, and Blogs data sets. Series: All optimization, And -flto=thin. Horizontal axis from 0 to 150, with a 100% marker.]
Reading other MessagePack implementations
msgpack-java: TypeProfile bypassing
This code has significant overhead:
interface MessageBuffer {
    public void putInt(int index, int value);
}
class MessageBufferBE implements MessageBuffer {
    public void putInt(int index, int value)
    {
        byteBuffer.putInt(index, value);
    }
}
Dynamic method lookup (slow even after JIT)
msgpack-java: TypeProfile bypassing
Much faster code:
class MessageBuffer {
    public void putInt(int index, int v)
    {
        v = Integer.reverseBytes(v);
        unsafe.putInt(base, address + index, v);
    }
    public static MessageBuffer newInstance() {
        // …
    }
}
JVM intrinsics (1-to-1 mapping to CPU instruction, no function call)
Load inherited class lazily. No override on little-endian machine.
No override. JVM JIT inlines them.
MessagePack-CSharp: Lookup cache of optimized Ser/De
Mapping between class and semi-structured data
class User {
    public int Age { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}
[
  31,
  "Sadayuki",
  "Furuhashi"
]
Java: Jackson databind, JAXB, …
C#: MessagePack-CSharp, System.Runtime.Serialization, …
Swift: Codable, SwiftMsgPack, …
…
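The idea of a per-class lookup cache can be sketched in Ruby. This is an illustration only, not how MessagePack-CSharp is implemented: `SerializerCache` is a hypothetical name, and the "optimized" serializer is simply a lambda that is built once per class and reused afterwards.

```ruby
User = Struct.new(:age, :first_name, :last_name)

# Builds an object-to-array serializer once per class and caches it, so the
# member lookup is not repeated for every object.
class SerializerCache
  attr_reader :builds

  def initialize
    @builds = 0
    @cache = Hash.new do |cache, klass|
      @builds += 1
      members = klass.members              # resolved once per class
      cache[klass] = ->(obj) { members.map { |name| obj[name] } }
    end
  end

  def to_array(obj)
    @cache[obj.class].call(obj)
  end
end

cache = SerializerCache.new
p cache.to_array(User.new(31, "Sadayuki", "Furuhashi"))
p cache.to_array(User.new(42, "Another", "User"))
puts "serializers built: #{cache.builds}"
```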
MessagePack-CSharp: Native LZ4 integration
It's like JSON, but fast and small.
Sadayuki Furuhashi
Tweet your msgpack usage with #MyMessagePack

More Related Content

What's hot

Probabilistic Logic Programming with Beta-Distributed Random Variables
Probabilistic Logic Programming with Beta-Distributed Random VariablesProbabilistic Logic Programming with Beta-Distributed Random Variables
Probabilistic Logic Programming with Beta-Distributed Random VariablesFederico Cerutti
 
Jsr107 come, code, cache, compute!
Jsr107 come, code, cache, compute!Jsr107 come, code, cache, compute!
Jsr107 come, code, cache, compute!C2B2 Consulting
 
Image classification with Deeplearning4j
Image classification with Deeplearning4jImage classification with Deeplearning4j
Image classification with Deeplearning4jEric Jain
 
The Ring programming language version 1.6 book - Part 80 of 189
The Ring programming language version 1.6 book - Part 80 of 189The Ring programming language version 1.6 book - Part 80 of 189
The Ring programming language version 1.6 book - Part 80 of 189Mahmoud Samir Fayed
 
Caching and tuning fun for high scalability
Caching and tuning fun for high scalabilityCaching and tuning fun for high scalability
Caching and tuning fun for high scalabilityWim Godden
 
CCM Escape Case Study - SkySQL Paris Meetup 17.12.2013
CCM Escape Case Study - SkySQL Paris Meetup 17.12.2013CCM Escape Case Study - SkySQL Paris Meetup 17.12.2013
CCM Escape Case Study - SkySQL Paris Meetup 17.12.2013MariaDB Corporation
 
Parallel Computing With Dask - PyDays 2017
Parallel Computing With Dask - PyDays 2017Parallel Computing With Dask - PyDays 2017
Parallel Computing With Dask - PyDays 2017Christian Aichinger
 
Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !Microsoft
 

What's hot (11)

Probabilistic Logic Programming with Beta-Distributed Random Variables
Probabilistic Logic Programming with Beta-Distributed Random VariablesProbabilistic Logic Programming with Beta-Distributed Random Variables
Probabilistic Logic Programming with Beta-Distributed Random Variables
 
Jsr107 come, code, cache, compute!
Jsr107 come, code, cache, compute!Jsr107 come, code, cache, compute!
Jsr107 come, code, cache, compute!
 
Image classification with Deeplearning4j
Image classification with Deeplearning4jImage classification with Deeplearning4j
Image classification with Deeplearning4j
 
Spark Streaming and Suicidal Tendencies
Spark Streaming and Suicidal Tendencies Spark Streaming and Suicidal Tendencies
Spark Streaming and Suicidal Tendencies
 
The Ring programming language version 1.6 book - Part 80 of 189
The Ring programming language version 1.6 book - Part 80 of 189The Ring programming language version 1.6 book - Part 80 of 189
The Ring programming language version 1.6 book - Part 80 of 189
 
Caching and tuning fun for high scalability
Caching and tuning fun for high scalabilityCaching and tuning fun for high scalability
Caching and tuning fun for high scalability
 
CCM Escape Case Study - SkySQL Paris Meetup 17.12.2013
CCM Escape Case Study - SkySQL Paris Meetup 17.12.2013CCM Escape Case Study - SkySQL Paris Meetup 17.12.2013
CCM Escape Case Study - SkySQL Paris Meetup 17.12.2013
 
Common scenarios in vcl
Common scenarios in vclCommon scenarios in vcl
Common scenarios in vcl
 
Parallel Computing With Dask - PyDays 2017
Parallel Computing With Dask - PyDays 2017Parallel Computing With Dask - PyDays 2017
Parallel Computing With Dask - PyDays 2017
 
Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !
 
Logs
LogsLogs
Logs
 

Similar to Performance Optimization Techniques of MessagePack-Ruby - RubyKaigi 2019

Transformers ASR.pdf
Transformers ASR.pdfTransformers ASR.pdf
Transformers ASR.pdfssuser8025b21
 
Reliable multimedia transmission under noisy condition
Reliable multimedia transmission under noisy conditionReliable multimedia transmission under noisy condition
Reliable multimedia transmission under noisy conditionShahrukh Ali Khan
 
Exact Real Arithmetic for Tcl
Exact Real Arithmetic for TclExact Real Arithmetic for Tcl
Exact Real Arithmetic for Tclke9tv
 
Credit Risk Assessment using Machine Learning Techniques with WEKA
Credit Risk Assessment using Machine Learning Techniques with WEKACredit Risk Assessment using Machine Learning Techniques with WEKA
Credit Risk Assessment using Machine Learning Techniques with WEKAMehnaz Newaz
 
Number System | Types of Number System | Binary Number System | Octal Number ...
Number System | Types of Number System | Binary Number System | Octal Number ...Number System | Types of Number System | Binary Number System | Octal Number ...
Number System | Types of Number System | Binary Number System | Octal Number ...Get & Spread Knowledge
 
Conway's Game of Life with Repa
Conway's Game of Life with RepaConway's Game of Life with Repa
Conway's Game of Life with Repakizzx2
 

Similar to Performance Optimization Techniques of MessagePack-Ruby - RubyKaigi 2019 (6)

Transformers ASR.pdf
Transformers ASR.pdfTransformers ASR.pdf
Transformers ASR.pdf
 
Reliable multimedia transmission under noisy condition
Reliable multimedia transmission under noisy conditionReliable multimedia transmission under noisy condition
Reliable multimedia transmission under noisy condition
 
Exact Real Arithmetic for Tcl
Exact Real Arithmetic for TclExact Real Arithmetic for Tcl
Exact Real Arithmetic for Tcl
 
Credit Risk Assessment using Machine Learning Techniques with WEKA
Credit Risk Assessment using Machine Learning Techniques with WEKACredit Risk Assessment using Machine Learning Techniques with WEKA
Credit Risk Assessment using Machine Learning Techniques with WEKA
 
Number System | Types of Number System | Binary Number System | Octal Number ...
Number System | Types of Number System | Binary Number System | Octal Number ...Number System | Types of Number System | Binary Number System | Octal Number ...
Number System | Types of Number System | Binary Number System | Octal Number ...
 
Conway's Game of Life with Repa
Conway's Game of Life with RepaConway's Game of Life with Repa
Conway's Game of Life with Repa
 

More from Sadayuki Furuhashi

Automating Workflows for Analytics Pipelines
Automating Workflows for Analytics PipelinesAutomating Workflows for Analytics Pipelines
Automating Workflows for Analytics PipelinesSadayuki Furuhashi
 
Digdagによる大規模データ処理の自動化とエラー処理
Digdagによる大規模データ処理の自動化とエラー処理Digdagによる大規模データ処理の自動化とエラー処理
Digdagによる大規模データ処理の自動化とエラー処理Sadayuki Furuhashi
 
Fluentd at Bay Area Kubernetes Meetup
Fluentd at Bay Area Kubernetes MeetupFluentd at Bay Area Kubernetes Meetup
Fluentd at Bay Area Kubernetes MeetupSadayuki Furuhashi
 
DigdagはなぜYAMLなのか?
DigdagはなぜYAMLなのか?DigdagはなぜYAMLなのか?
DigdagはなぜYAMLなのか?Sadayuki Furuhashi
 
Logging for Production Systems in The Container Era
Logging for Production Systems in The Container EraLogging for Production Systems in The Container Era
Logging for Production Systems in The Container EraSadayuki Furuhashi
 
分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11
分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11
分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11Sadayuki Furuhashi
 
Fighting Against Chaotically Separated Values with Embulk
Fighting Against Chaotically Separated Values with EmbulkFighting Against Chaotically Separated Values with Embulk
Fighting Against Chaotically Separated Values with EmbulkSadayuki Furuhashi
 
Embulk - 進化するバルクデータローダ
Embulk - 進化するバルクデータローダEmbulk - 進化するバルクデータローダ
Embulk - 進化するバルクデータローダSadayuki Furuhashi
 
Plugin-based software design with Ruby and RubyGems
Plugin-based software design with Ruby and RubyGemsPlugin-based software design with Ruby and RubyGems
Plugin-based software design with Ruby and RubyGemsSadayuki Furuhashi
 
Embulk, an open-source plugin-based parallel bulk data loader
Embulk, an open-source plugin-based parallel bulk data loaderEmbulk, an open-source plugin-based parallel bulk data loader
Embulk, an open-source plugin-based parallel bulk data loaderSadayuki Furuhashi
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Sadayuki Furuhashi
 
Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014Sadayuki Furuhashi
 
Performance Optimization Techniques of MessagePack-Ruby - RubyKaigi 2019

  • 1. Performance Optimization Techniques of MessagePack-Ruby. Sadayuki Furuhashi, RubyKaigi 2019. Tweet your msgpack usage with #MyMessagePack.
  • 2. About me: A founder of Treasure Data, Inc. Located in Silicon Valley, USA. OSS hacker. GitHub: @frsyuki
  • 3. Basics of MessagePack
  • 4. What's MessagePack? It's like JSON, but fast and small. JSON: {"compact":true,"schema":0} (27 bytes). MessagePack: 82 (2-element map) A7 "compact" (7-byte string) C3 (true) A6 "schema" (6-byte string) 00 (0), 18 bytes (about 33% smaller).
  • 5. What's MessagePack? It's like JSON, but fast and small. JSON: self-descriptive, schema-on-read semantics; everyone knows it; de facto standard data format; human-readable. MessagePack: self-descriptive, schema-on-read semantics; everyone who uses JSON can use it; drop-in improvement of JSON; machine-readable.
  • 6. Language Agnostic Type System. Language types (Ruby, Swift, Java, Go, …): String, Timestamp. These convert to the MessagePack type system (JSON compatible): String, Timestamp. The MessagePack format then encodes String as fixstr, str 8, str 16, or str 32, and Timestamp as timestamp 32, timestamp 64, or timestamp 96.
  • 7. Supported by super skilled engineers all over the world. Ruby /msgpack
  • 8. MessagePack-Ruby Major Committers
  • 9. Real World MessagePack. "We import over 2,000,000 records/sec and store 30PB of data in MessagePack format. 15 trillion rows processed every day." (Sada Furuhashi, Arm Treasure Data). "MessagePack is an essential component of Fluentd to achieve high performance and flexibility at the same time." (Sada Furuhashi, Initial Creator of Fluentd).
  • 10. Adoption of MessagePack Today: Mobile Apps, Microprocessors, Automotive Telematics, Sensors, Cloud Infrastructure, Middleware, Machine Learning, Games, Analytical Databases. MessagePack: zero-overhead, heterogeneous data exchange.
  • 11. MessagePack-Ruby implementation
  • 12. MessagePack::Packer. Object: [ 1, 2, {"k" => "v"}, nil ]. Buffer: 94 (4-element array)
  • 13. MessagePack::Packer. Object: [ 1, 2, {"k" => "v"}, nil ]. Buffer: 94 (4-element array) 01 (Integer 1)
  • 14. MessagePack::Packer. Object: [ 1, 2, {"k" => "v"}, nil ]. Buffer: 94 (4-element array) 01 (Integer 1) 02 (Integer 2)
  • 15. MessagePack::Packer. Object: [ 1, 2, {"k" => "v"}, nil ]. Buffer: 94 (4-element array) 01 (Integer 1) 02 (Integer 2) 81 (1-element map)
  • 16. MessagePack::Packer. Object: [ 1, 2, {"k" => "v"}, nil ]. Buffer: 94 (4-element array) 01 (Integer 1) 02 (Integer 2) 81 (1-element map) A1 (1-byte string)
  • 17. MessagePack::Packer. Object: [ 1, 2, {"k" => "v"}, nil ]. Buffer: 94 (4-element array) 01 (Integer 1) 02 (Integer 2) 81 (1-element map) A1 'k' (1-byte string "k")
  • 18. MessagePack::Packer. Object: [ 1, 2, {"k" => "v"}, nil ]. Buffer: 94 (4-element array) 01 (Integer 1) 02 (Integer 2) 81 (1-element map) A1 'k' (1-byte string "k") A1 'v' (1-byte string "v")
  • 19. MessagePack::Packer. Object: [ 1, 2, {"k" => "v"}, nil ]. Buffer: 94 (4-element array) 01 (Integer 1) 02 (Integer 2) 81 (1-element map) A1 'k' (1-byte string "k") A1 'v' (1-byte string "v") C0 (nil)
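The packing walkthrough on slides 12 to 19 can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the msgpack gem's implementation: the method name mini_pack is hypothetical, and it covers just nil, booleans, integers 0..127, strings up to 31 bytes, and arrays or maps with up to 15 entries (the single-byte "fix" headers used on the slides).

```ruby
# Minimal MessagePack encoder sketch (hypothetical helper, not the msgpack gem).
# Each object is written as a one-byte header followed by its payload.
def mini_pack(obj, buffer = String.new(encoding: Encoding::BINARY))
  case obj
  when nil   then buffer << [0xC0].pack("C")          # nil
  when false then buffer << [0xC2].pack("C")          # false
  when true  then buffer << [0xC3].pack("C")          # true
  when Integer
    raise ArgumentError, "only 0..127 supported here" unless (0..127).cover?(obj)
    buffer << [obj].pack("C")                         # positive fixint
  when String
    raise ArgumentError, "string too long for this sketch" if obj.bytesize > 31
    buffer << [0xA0 | obj.bytesize].pack("C") << obj.b # fixstr header + bytes
  when Array
    raise ArgumentError, "array too long for this sketch" if obj.size > 15
    buffer << [0x90 | obj.size].pack("C")             # fixarray header
    obj.each { |item| mini_pack(item, buffer) }
  when Hash
    raise ArgumentError, "map too large for this sketch" if obj.size > 15
    buffer << [0x80 | obj.size].pack("C")             # fixmap header
    obj.each do |key, value|
      mini_pack(key, buffer)
      mini_pack(value, buffer)
    end
  else
    raise ArgumentError, "unsupported type: #{obj.class}"
  end
  buffer
end

# The object from the slides, shown as hex.
puts mini_pack([1, 2, { "k" => "v" }, nil]).unpack1("H*")
# prints 94010281a16ba176c0
```

The same helper reproduces the size comparison from slide 4: packing {"compact" => true, "schema" => 0} yields 18 bytes.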
  • 20. MessagePack::Unpacker. Buffer: 94 01 02 81 A1 'k' A1 'v' C0. Read 94 (4-element array). Stack: [ □, □, □, □ ]
  • 21. MessagePack::Unpacker. Read 01 (Integer 1). Stack: [ □, □, □, □ ], 1
  • 22. MessagePack::Unpacker. 1 is a complete object. Stack: [ 1, □, □, □ ]
  • 23. MessagePack::Unpacker. Read 02 (Integer 2). Stack: [ 1, □, □, □ ], 2
  • 24. MessagePack::Unpacker. 2 is a complete object. Stack: [ 1, 2, □, □ ]
  • 25. MessagePack::Unpacker. Read 81 (1-element map). Stack: [ 1, 2, □, □ ], { □ => □ }
  • 26. MessagePack::Unpacker. Read A1 'k' (1-byte string). Stack: [ 1, 2, □, □ ], { □ => □ }, "k"
  • 27. MessagePack::Unpacker. "k" is a complete object. Stack: [ 1, 2, □, □ ], { "k" => □ }
  • 28. MessagePack::Unpacker. Read A1 'v' (1-byte string). Stack: [ 1, 2, □, □ ], { "k" => □ }, "v"
  • 29. MessagePack::Unpacker. "v" is a complete object. Stack: [ 1, 2, □, □ ], { "k" => "v" }
  • 30. MessagePack::Unpacker. { "k" => "v" } is a complete object. Stack: [ 1, 2, {"k"=>"v"}, □ ]
  • 31. MessagePack::Unpacker. Read C0 (nil). Stack: [ 1, 2, {"k"=>"v"}, □ ], nil
  • 32. MessagePack::Unpacker. nil is a complete object. Stack: [ 1, 2, {"k"=>"v"}, nil ]
  • 33. MessagePack::Unpacker. [ 1, 2, {"k"=>"v"}, nil ] is a complete object.
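The stack-based walkthrough on slides 20 to 33 can be sketched as follows. This is a simplified illustration, not the gem's C unpacker: mini_unpack and NO_KEY are hypothetical names, and only the header bytes that appear on the slides (positive fixint, fixstr, fixarray, fixmap, nil, true, false) are handled.

```ruby
# Minimal stack-based MessagePack decoder sketch (hypothetical helper).
NO_KEY = Object.new.freeze # marks "this map frame is waiting for a key"

def mini_unpack(bytes)
  data = bytes.b
  pos = 0
  stack = [] # each frame: [container, children still missing, pending map key]

  loop do
    head = data.getbyte(pos)
    raise EOFError, "truncated input" if head.nil?
    pos += 1

    value = nil
    missing = 0
    case head
    when 0x00..0x7F            # positive fixint
      value = head
    when 0x80..0x8F            # fixmap: low 4 bits are the pair count
      value = {}
      missing = head & 0x0F
    when 0x90..0x9F            # fixarray: low 4 bits are the element count
      value = []
      missing = head & 0x0F
    when 0xA0..0xBF            # fixstr: low 5 bits are the byte length
      size = head & 0x1F
      raise EOFError, "truncated string" if pos + size > data.bytesize
      value = data.byteslice(pos, size).force_encoding(Encoding::UTF_8)
      pos += size
    when 0xC0 then value = nil
    when 0xC2 then value = false
    when 0xC3 then value = true
    else
      raise ArgumentError, format("unsupported header byte 0x%02X", head)
    end

    if missing > 0
      stack.push([value, missing, NO_KEY]) # container with empty slots
      next
    end

    # value is a complete object: fill the slot of the enclosing container,
    # and keep going upward while containers become complete themselves.
    loop do
      return value if stack.empty?
      frame = stack.last
      container, _, pending_key = frame
      if container.is_a?(Array)
        container << value
        frame[1] -= 1
      elsif pending_key.equal?(NO_KEY)
        frame[2] = value                   # got the key, wait for its value
      else
        container[pending_key] = value
        frame[2] = NO_KEY
        frame[1] -= 1
      end
      break if frame[1] > 0
      value = stack.pop[0]                 # container is now complete
    end
  end
end

p mini_unpack(["94010281a16ba176c0"].pack("H*"))
```

An explicit stack, rather than recursion, is what lets a streaming unpacker pause when the buffer runs out and resume when more bytes arrive; this sketch simply raises EOFError in that case.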
  • 34. Optimization
  • 35. MessagePack::Buffer. Object: { "count" => 1, "page" => 0, "body" => [ "LONG TEXT" ] }. The buffer is a linked list of chunks (next, mapped_string, mem). First chunk: 83 A5 'c' 'o' 'u' 'n' 't' 01 A4 'p' 'a' 'g' 'e' 00 A4 'b' 'o' 'd' 'y'
  • 36. MessagePack::Buffer. Object: { "count" => 1, "page" => 0, "body" => [ "LONG TEXT" ] }. First chunk: 83 A5 'c' 'o' 'u' 'n' 't' 01 A4 'p' 'a' 'g' 'e' 00 A4 'b' 'o' 'd' 'y'. Second chunk: 91 A9 'L' 'O' 'N' 'G' ' ' 'T' 'E' 'X' 'T'. Add more buffer chunks instead of realloc().
  • 37. Zero-copy write optimization. Object: { "count" => 1, "page" => 0, "body" => [ "LONG TEXT" ] }. The chunks hold 83 A5 'c' 'o' 'u' 'n' 't' 01 A4 'p' 'a' 'g' 'e' 00 A4 'b' 'o' 'd' 'y' 91 A9, and the string "LONG TEXT" is attached as a chunk of its own with rb_str_dup() (fast copy-on-write) instead of being copied.
  • 38. Zero-copy read optimization. Source String: "\x83\xA5count\x01\xA4page\x00\xA4body\x91\xA9LONG TEXT". The MessagePack::Buffer chunk (next, mapped_string, mem) maps the source string with rb_str_dup() (fast copy-on-write). The "LONG TEXT" in the resulting Object { "count" => 1, "page" => 0, "body" => [ "LONG TEXT" ] } is taken with rb_str_substr() (fast copy-on-write) ※ if SHARABLE_SUBSTRING_P() returns true.
  • 39. Reserved memory pool. MessagePack::Buffer chunks (next, mapped_string, mem) take their memory from a global memory pool of 4KB blocks. Buffer contents: 83 A5 'c' 'o' 'u' 'n' 't' 01 A4 'p' 'a' 'g' 'e' 00 A4 'b' 'o' 'd' 'y' 91 A9
  • 40. Reserved memory pool. MessagePack::Buffer chunks (next, mapped_string, mem) and the global memory pool of 4KB blocks.
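The chunked buffer and zero-copy write from slides 35 to 37 can be approximated in Ruby. This is a rough sketch under assumptions, since the real MessagePack::Buffer is written in C: the class name ChunkedBuffer and the REFERENCE_THRESHOLD cutoff are hypothetical, and CHUNK_SIZE merely mirrors the 4KB blocks shown on the slides. The memory pool itself is not modeled.

```ruby
# Sketch of a chunked write buffer (hypothetical class, not MessagePack::Buffer).
class ChunkedBuffer
  CHUNK_SIZE = 4096          # assumed, mirrors the 4KB blocks on the slides
  REFERENCE_THRESHOLD = 512  # hypothetical cutoff for "long" strings

  def initialize
    @chunks = [new_chunk]
  end

  def write(string)
    if string.bytesize >= REFERENCE_THRESHOLD
      # Long payload: keep a reference as its own chunk instead of copying
      # the bytes. In CRuby, dup of a long string can share the underlying
      # memory copy-on-write, which is the idea behind rb_str_dup().
      @chunks << string.dup.force_encoding(Encoding::BINARY)
      @chunks << new_chunk
    else
      # Short payload: append to the tail chunk, starting a new chunk
      # (rather than growing the old one) when it would overflow.
      @chunks << new_chunk if @chunks.last.bytesize + string.bytesize > CHUNK_SIZE
      @chunks.last << string.b
    end
    self
  end

  def chunk_count
    @chunks.size
  end

  # Concatenate all chunks; a real buffer would rather write them out one by one.
  def to_s
    @chunks.each_with_object(String.new(encoding: Encoding::BINARY)) do |chunk, out|
      out << chunk
    end
  end

  private

  def new_chunk
    String.new(capacity: CHUNK_SIZE, encoding: Encoding::BINARY)
  end
end

buffer = ChunkedBuffer.new
buffer.write("header").write("x" * 1000).write("footer")
puts buffer.chunk_count
puts buffer.to_s.bytesize
```

Because existing chunks are never resized, bytes already written are not moved again, which is the point of adding chunks instead of calling realloc().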
  • 41. Benchmark & further optimization
  • 42. Benchmark Data Sets. DB: [{ "id" : "gfmg-6ppu", "name" : "Mortgage Complaints", "averageRating" : 0, "createdAt" : 1433953219, "moderationStatus" : true, "numberOfComments" : 0, "description" : "Each week we …", … (http://data.consumerfinance.gov/api/views.json). Blogs: { "results": [ { "attachments":[], "body": "Dear Friends and Colleagues,\n\nI always look forward to the… (https://www.justice.gov/api/v1/blog_entries.json). Integers: 100_000.times.map {|i| r.rand(i+1) }, for example [ 0,0,0,1,3,0,0, …, 42991,26906,18655,7015 ]. Benchmark code available at https://gist.github.com/frsyuki/9777c4adba2b5c957695b64f17b64ba1
  • 43. MessagePack vs JSON. [Two bar charts: serialization time and deserialization time for the DB, Integers, and Blogs data sets, comparing MessagePack, JSON (Oj), and JSON (JSON); axis 0 to 3000 with a 100% reference mark.]
  • 44. MessagePack without optimization. [Two bar charts: serialization time and deserialization time for the DB, Integers, and Blogs data sets, comparing Default, No memory pool, and No zero-copy read; axis 0 to 150 with a 100% reference mark.]
  • 45. Zero-copy read optimization. Source String: "\x83\xA5count\x01\xA4page\x00\xA4body\x91\xA9LONG TEXT". The MessagePack::Buffer chunk (next, mapped_string, mem) maps the source string with rb_str_dup() (fast copy-on-write). The "LONG TEXT" in the resulting Object { "count" => 1, "page" => 0, "body" => [ "LONG TEXT" ] } is taken with rb_str_substr() (fast copy-on-write) ※ if SHARABLE_SUBSTRING_P() returns true.
  • 46. Copy-on-write substring. SHARABLE_SUBSTRING_P() returns true only when the substring shares the terminating 0 byte with the original string. s = "a" * 1_000_100; 10_000.times do s.slice(100, 1_000_000) end #=> 0.002 sec. s = "a" * 1_000_100; 10_000.times do s.slice(100, 1_000_000 - 1) end #=> 2.7 sec. Not including the last character disables copy-on-write.
  • 47. Copy-on-write substring. SHARABLE_SUBSTRING_P() returns true only when the substring shares the terminating 0 byte with the original string OR Ruby is compiled with the SHARABLE_MIDDLE_SUBSTRING flag (not enabled by default). s = "a" * 1_000_100; 10_000.times do s.slice(100, 1_000_000) end #=> 0.002 sec. s = "a" * 1_000_100; 10_000.times do s.slice(100, 1_000_000 - 1) end #=> 2.7 sec, or 0.002 sec using a ruby binary compiled with SHARABLE_MIDDLE_SUBSTRING.
  • 48. Deserialization with copy-on-write substring. [Two bar charts: serialization time and deserialization time for the DB, Integers, and Blogs data sets, comparing Default and With SHARABLE_MIDDLE_SUBSTRING; axis 0 to 150 with a 100% reference mark.]
  • 49. What is deserialization bottleneck? [Bar chart: deserialization objects/sec, axis 0 to 70,000,000, for DB, Integers, Blogs, Boolean, (2^62)-1, and (2^62); bars are labeled Immediate or On-Heap.]
  • 50. What is deserialization bottleneck? [Same chart: deserialization objects/sec, axis 0 to 70,000,000, for DB, Integers, Blogs, Boolean, (2^62)-1, and (2^62); three bars are labeled Immediate and three On-Heap.] => Object allocation is slow.
  • 51. Reusing Hash key objects. data: [ { "id": "s6ew-h6mp", "name": "Consumer Complaints", … }, { "id": "nsyy-je5y", "name": "Beta Consumers", … }, { "id": "wkue-ycpk", "name": "Survey", … } ]. Same key repeats. String keys of a Hash are always frozen. => We can reuse objects!
  • 52. How to reuse Hash key objects? Use fstring. Ruby uses it to reuse the same object for identical immutable strings (but the C API is not available…yet): p "a".object_id == "a".object_id #=> false. With # frozen_string_literal: true, p "a".object_id == "a".object_id #=> true. p (-"a").object_id == (-"a").object_id #=> true
  • 53. Reusing Hash key objects using fstring. [Two bar charts: serialization time and deserialization time for the DB, Integers, and Blogs data sets, comparing Default and Hash key fstring using a hacked Ruby binary; axis 0 to 150 with a 100% reference mark.]
  • 54. Link Time Optimization. [Two bar charts: serialization time and deserialization time for the DB, Integers, and Blogs data sets, comparing All optimization and the same with -flto=thin added; axis 0 to 150 with a 100% reference mark.]
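The key-reuse idea from slides 51 and 52 can be tried from plain Ruby with String#-@, which returns a frozen, deduplicated string. The sketch below is illustrative only: build_rows is a hypothetical helper, and String.new stands in for key strings freshly allocated by a deserializer. The slides describe doing this inside the C extension with a patched Ruby binary, which this sketch does not reproduce.

```ruby
# Sketch: reuse one frozen String object per distinct Hash key (hypothetical helper).
def build_rows(raw_rows)
  raw_rows.map do |pairs|
    pairs.each_with_object({}) do |(key, value), row|
      # -key returns the interned (fstring) copy, so equal keys in
      # different rows end up being the very same object.
      row[-key] = value
    end
  end
end

# String.new(...) simulates separately allocated key strings per record.
rows = build_rows([
  [[String.new("id"), "s6ew-h6mp"], [String.new("name"), "Consumer Complaints"]],
  [[String.new("id"), "nsyy-je5y"], [String.new("name"), "Beta Consumers"]],
  [[String.new("id"), "wkue-ycpk"], [String.new("name"), "Survey"]]
])

first_keys = rows.map { |row| row.keys.first }
puts first_keys.all? { |key| key.equal?(first_keys.first) }
puts first_keys.first.frozen?
```

Without the unary minus, each row would keep its own frozen copy of "id" and "name", which is the per-record allocation the slides identify as the bottleneck.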
  • 55. Reading other MessagePack implementations
  • 56. msgpack-java: TypeProfile bypassing. This code has significant overhead: interface MessageBuffer { public void putInt(int index, int value); } class MessageBufferBE implements MessageBuffer { public void putInt(int index, int value) { byteBuffer.putInt(index, value); } }
  • 57. msgpack-java: TypeProfile bypassing. This code has significant overhead: interface MessageBuffer { public void putInt(int index, int value); } class MessageBufferBE implements MessageBuffer { public void putInt(int index, int value) { byteBuffer.putInt(index, value); } } Both the interface call and the byteBuffer.putInt call are dynamic method lookups (slow even after JIT).
  • 58. msgpack-java: TypeProfile bypassing. Much faster code: class MessageBuffer { public void putInt(int index, int value) { int v = Integer.reverseBytes(value); unsafe.putInt(base, address + index, v); } public static MessageBuffer newInstance() { // … } } No override. JVM JIT inlines them.
  • 59. msgpack-java: TypeProfile bypassing. Much faster code: class MessageBuffer { public void putInt(int index, int value) { int v = Integer.reverseBytes(value); unsafe.putInt(base, address + index, v); } public static MessageBuffer newInstance() { // … } } Integer.reverseBytes and unsafe.putInt are JVM intrinsics (1-to-1 mapping to a CPU instruction, no function call). newInstance loads the inherited class lazily, so there is no override on a little-endian machine. No override. JVM JIT inlines them.
  • 60. MessagePack-CSharp: Lookup cache of optimized Ser/De. class User { public int Age { get; set; } public string FirstName { get; set; } public string LastName { get; set; } } maps to [ 31, "Sadayuki", "Furuhashi" ]. Mapping between class and semi-structured data. Java: Jackson databind, JAXB, … C#: MessagePack-CSharp, System.Runtime.Serialization, … Swift: Codable, SwiftMsgPack, …
  • 61. MessagePack-CSharp: Native LZ4 integration
  • 62. MessagePack-CSharp: Native LZ4 integration
  • 63. It's like JSON, but fast and small. Sadayuki Furuhashi. Tweet your msgpack usage with #MyMessagePack.