Glauber Costa, a Principal Architect at ScyllaDB, discusses techniques for achieving low latency database operations. He identifies three main sources of latency: speed mismatch between disk and CPU, lack of respect for task quotas, and imperfect isolation. Glauber describes how ScyllaDB addresses these issues through techniques like the I/O scheduler, CPU scheduler, task quotas, block detector, and controllers that regulate operations like memtable flushes. The goal is to make high percentile latencies low and bounded by treating them as bugs rather than nice-to-haves. ScyllaDB users can already benefit from these latency improvements in many situations, with more fixes coming in future releases.
Scylla Summit 2017: How We Got to 1 Millisecond Latency in 99% Under Repair, Compaction, and Flushes
1. PRESENTATION TITLE ON ONE LINE
AND ON TWO LINES
First and last name
Position, company
Chasing the 99th
How we got to 1ms latency for
repairs, compactions, and flushes
Principal Architect, ScyllaDB
Glauber Costa
Glauber Costa
Glauber Costa is a Principal Architect at ScyllaDB.
He splits his time between the engineering
department, working on upcoming Scylla features,
and helping customers succeed.
Before ScyllaDB, Glauber worked on virtualization
in the Linux kernel for 10 years, with contributions
ranging from the Xen hypervisor to all sorts of guest
functionality and containers.
Dear Scylla,
What do you call a latency distribution for which the high
percentiles are much higher than the average?
Three main sources of latencies - Act 1
(Speed mismatch)
How fast is my system?
▪ There are two speeds:
o Disk Speed
o CPU/memory speed
▪ What happens when they are not in sync?
latency mean : 51.9
latency median : 9.8
latency 95th percentile : 125.6
latency 99th percentile : 1184.0 (x 22)
latency 99.9th percentile : 1991.2 (x 38)
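The "(x 22)" and "(x 38)" annotations appear to be each percentile divided by the mean, rounded down. A minimal sketch that reproduces them from the figures on the slide; the function name `tail_ratios` is illustrative and not part of any Scylla tool:

```python
import math

def tail_ratios(mean, percentiles):
    """Return each percentile divided by the mean, rounded down to a whole number."""
    return {name: math.floor(value / mean) for name, value in percentiles.items()}

# Figures copied from the slide above (the slide does not state the unit).
ratios = tail_ratios(51.9, {"p99": 1184.0, "p99.9": 1991.2})
print(ratios)
```

A ratio this large means the tail, not the average, dominates what the slowest requests experience.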
The Wall
The Wall - Results
latency mean : 54.9
latency median : 43.5
latency 95th percentile : 126.9
latency 99th percentile : 253.9 (x 4.6)
latency 99.9th percentile : 364.6 (x 6.6)
The Wall - where is it relevant?
▪ Disk speed slower than CPU speed
o plain slow disk, large payloads
▪ In the North
Three main sources of latencies - Act 2
(Lack of respect for limits)
Tasks in Scylla
[Diagram: side-by-side comparison of a traditional stack and Scylla’s stack, built from Thread, Stack, Promise, Task, Scheduler, and CPU boxes]
▪ Traditional stack
o Thread is a function pointer
o Stack is a byte array from 64k to megabytes
▪ Scylla’s stack
o Promise is a pointer to an eventually computed value
o Task is a pointer to a lambda function
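The two definitions on the Scylla side of the diagram can be illustrated with a toy model: a promise stands for a value that will be computed later, and a task is just a callable placed on a run queue. This is a sketch in Python, not Scylla's or Seastar's actual API; `Promise`, `then`, `set_value`, and `run_all` are hypothetical names chosen for the illustration.

```python
from collections import deque

class Promise:
    """Holds a value that will be computed later, plus tasks waiting for it."""
    def __init__(self, run_queue):
        self._run_queue = run_queue
        self._value = None
        self._ready = False
        self._continuations = []

    def then(self, fn):
        """Register fn to run with the value; returns a promise for fn's result."""
        nxt = Promise(self._run_queue)
        def task():
            nxt.set_value(fn(self._value))
        if self._ready:
            self._run_queue.append(task)
        else:
            self._continuations.append(task)
        return nxt

    def set_value(self, value):
        """Resolve the promise and schedule every waiting task."""
        self._value = value
        self._ready = True
        self._run_queue.extend(self._continuations)
        self._continuations.clear()

def run_all(run_queue):
    """A single CPU's loop in this sketch: run queued tasks until none remain."""
    while run_queue:
        task = run_queue.popleft()
        task()

queue = deque()          # one run queue, standing in for one CPU
results = []
p = Promise(queue)
p.then(lambda v: v * 2).then(results.append)
p.set_value(21)          # e.g. an I/O completion arriving later
run_all(queue)
print(results)
```

No thread or dedicated stack is allocated per pending operation here; the only state kept is the small promise object and the queued callables, which is the contrast the diagram draws.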
The task quota
▪ How often do we check the work queues?
▪ Pre-2.0 defaults too high for latency bound systems
▪ Tasks not respecting it will cause spikes
[Diagram: timeline of tasks and the poller along a time axis. Labels: task, poller, "self-centered millennial teenager"; follow-up builds add "poll more often" and "report"]
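To see why a task that ignores the quota causes a spike, consider a loop that only gets a chance to poll between tasks. The sketch below is an illustration under assumptions, not Scylla's reactor: the quota value, the task costs, and the names `FakeClock` and `run_with_quota` are all hypothetical, and a fake clock is used so the result does not depend on real time.

```python
from collections import deque

class FakeClock:
    """Deterministic clock so the example does not depend on wall time."""
    def __init__(self):
        self.now = 0.0

    def advance(self, amount):
        self.now += amount

def run_with_quota(tasks, poll, clock, quota):
    """Run tasks in order; call poll() once at least `quota` has elapsed since the last poll.

    The check happens only between tasks, so a task that runs longer than
    the quota delays the poll by however long it overruns.
    Returns the gaps between consecutive polls.
    """
    gaps = []
    last_poll = clock.now
    while tasks:
        task = tasks.popleft()
        task()
        if clock.now - last_poll >= quota:
            poll()
            gaps.append(clock.now - last_poll)
            last_poll = clock.now
    return gaps

clock = FakeClock()
polls = []
# Hypothetical task costs: several well-behaved tasks and one that hogs the CPU.
costs = [0.25, 0.25, 0.25, 0.25, 5.0, 0.25, 0.25]
tasks = deque((lambda c=c: clock.advance(c)) for c in costs)
gaps = run_with_quota(tasks, lambda: polls.append(clock.now), clock, quota=0.5)
print(gaps)
```

Every request that became ready during the long gap waits for the whole overrun, which is the kind of spike the slide describes. Lowering the quota only helps if each task is short enough to honor it.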
The block detector
▪ Warns developers about violations of the task quota
▪ If you see something, say something!
▪ Close to enabling it by default everywhere (low thresholds)
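The idea of warning developers about quota violations can be sketched as a wrapper that times each task and reports the ones that exceed a threshold. This is an after-the-fact measurement written for illustration only; how Scylla's block detector actually observes stalls is not described on the slide, and `FakeClock`, `run_and_detect`, the task names, and the threshold are hypothetical.

```python
class FakeClock:
    """Deterministic clock so the example does not depend on wall time."""
    def __init__(self):
        self.now = 0.0

    def advance(self, amount):
        self.now += amount

def run_and_detect(named_tasks, clock, threshold):
    """Run each task and return a warning for every task that ran longer than threshold."""
    warnings = []
    for name, task in named_tasks:
        start = clock.now
        task()
        elapsed = clock.now - start
        if elapsed > threshold:
            warnings.append(f"task {name!r} blocked for {elapsed} (threshold {threshold})")
    return warnings

clock = FakeClock()
named_tasks = [
    ("read", lambda: clock.advance(0.25)),           # stays under the threshold
    ("compact_chunk", lambda: clock.advance(5.0)),   # hypothetical offender
]
warnings = run_and_detect(named_tasks, clock, threshold=0.5)
print(warnings)
```

A lower threshold catches more offenders but also produces more reports, which is presumably why the slide singles out low thresholds when talking about enabling the detector everywhere.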
Three main sources of latencies - Act 3
(Imperfect Isolation)
The I/O Scheduler
[Diagram: Query, Commitlog, and Compaction each feed a queue in the userspace I/O scheduler, which sits in front of the disk. Labels on the disk side: "Max useful disk concurrency", "I/O queued in FS/device", "No queues"]
The I/O Scheduler
▪ Really good for disk-bound workloads
o But fails isolation sometimes: request sizes
o Problem is well understood, real fix a bit harder
• Good results with some manual intervention
▪ Major component of Scylla since early versions
o Central component in The Wall
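A minimal sketch of the idea behind a userspace I/O scheduler: requests wait in per-class queues, and only as many as the disk can usefully handle are in flight at once, so nothing piles up in the filesystem or device. The round-robin choice between classes is an assumption of this sketch (the slides do not say how classes are prioritized), and `IOScheduler`, `submit`, `dispatch`, and `complete` are hypothetical names.

```python
from collections import deque

class IOScheduler:
    """Keeps requests in per-class userspace queues and caps in-flight disk requests."""
    def __init__(self, classes, max_concurrency):
        self.queues = {name: deque() for name in classes}
        self.max_concurrency = max_concurrency   # "max useful disk concurrency"
        self.in_flight = 0
        self._order = deque(classes)             # round-robin order (assumption)

    def submit(self, io_class, request):
        """Queue a request in userspace; nothing reaches the disk yet."""
        self.queues[io_class].append(request)

    def dispatch(self):
        """Send queued requests to the disk until the concurrency cap is reached."""
        sent = []
        while self.in_flight < self.max_concurrency and any(self.queues.values()):
            io_class = self._order[0]
            self._order.rotate(-1)
            if self.queues[io_class]:
                sent.append((io_class, self.queues[io_class].popleft()))
                self.in_flight += 1
        return sent

    def complete(self):
        """Called when the disk finishes one request, freeing a slot."""
        self.in_flight -= 1

sched = IOScheduler(["query", "commitlog", "compaction"], max_concurrency=2)
for i in range(4):
    sched.submit("compaction", f"c{i}")   # compaction floods the scheduler first
sched.submit("query", "q0")
first = sched.dispatch()
sched.complete()
second = sched.dispatch()
print(first, second)
```

Even though compaction submitted four requests before the query arrived, the query is dispatched in the first batch because the backlog sits in the scheduler's own queues rather than in the device. Note that this sketch counts requests, not bytes, so it would show the same request-size weakness the slide mentions.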
The CPU Scheduler
▪ 2.0 ships with an initial version
o Only isolates compactions and memtable flushes
▪ 2.1 will ship with the full solution
o Will in general isolate better
o And will also isolate repairs
The controllers
The controllers - memtable
▪ This is the CPU percentage needed (50%) to keep the buffers at a stable level
▪ Throughput barely oscillates
▪ Total system CPU usage barely oscillates
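The behavior described above, a CPU percentage that settles where the buffers stay level, is what a simple feedback controller produces. The simulation below is a sketch under stated assumptions and not Scylla's controller: it uses a plain proportional rule on the buffer backlog, and the inflow, flush rate, and gain are hypothetical numbers picked so that the steady state lands on 50%.

```python
def simulate_flush_controller(inflow, max_flush_rate, gain, steps):
    """Simulate a proportional controller that sets the flush CPU share from the backlog.

    inflow:         units written into memtable buffers per tick
    max_flush_rate: units flushed per tick if flushing had 100% of the CPU
    gain:           CPU share granted per unit of backlog
    Returns a list of (backlog, share) pairs, one per tick.
    """
    backlog = 0.0
    history = []
    for _ in range(steps):
        backlog += inflow
        share = min(1.0, gain * backlog)              # more backlog -> more CPU for flushing
        flushed = min(backlog, share * max_flush_rate)
        backlog -= flushed
        history.append((backlog, share))
    return history

# Hypothetical workload: writes arrive at half the rate a full CPU could flush.
history = simulate_flush_controller(inflow=50.0, max_flush_rate=100.0, gain=0.005, steps=60)
final_backlog, final_share = history[-1]
print(round(final_backlog, 3), round(final_share, 3))
```

With these numbers the share rises from 25% on the first tick and converges on 50%, while the backlog converges on a constant level instead of swinging between empty and full. A static limit would have to guess that 50% in advance and would be wrong as soon as the write rate changed.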
The controllers - memtable
[Charts: without controller vs. with controller]
The controllers - memtable
Before:
latency mean : 0.6
latency median : 0.5
latency 95th percentile : 0.8
latency 99th percentile : 3.6 (x 6.0)
latency 99.9th percentile : 4.5 (x 7.5)
After:
latency mean : 0.4
latency median : 0.4
latency 95th percentile : 0.6
latency 99th percentile : 0.8 (x 2.0)
latency 99.9th percentile : 1.9 (x 4.7)
The controllers - coming soon
▪ Cache updates
▪ Compactions
o Both of the above can already have their impact limited statically
but not controlled
▪ Repairs
o Repairs already respect latencies very well, but are not as fast as
they could be. Controllers will help unleash their full potential
Summary and what to expect
▪ We believe high percentile latencies should be low and bounded
▪ We don’t view this as a nice to have, but as a bug instead
▪ We have fixed many of those bugs over the past year and are in a
very good position to fix the remaining ones
▪ A Scylla user can already expect to profit from that in the majority
of situations
THANK YOU
glauber@scylladb.com
@glcst
Please stay in touch
Any questions?