In this talk we report on our experience with Redis-on-Flash (RoF)—a recently introduced product that uses SSDs as a RAM extension to dramatically increase the effective dataset capacity that can be stored on a single server. This talk provides the first in-depth RoF system performance characterization: we consider different use cases (varying both RAM-to-disk access ratio and object size), and compare SATA-based RoF, NVMe-based RoF, and all-RAM Redis deployments. We show that the superior performance of NVMe drives in terms of both latency and peak bandwidth makes them a particularly good fit for RoF use cases. Specifically, we show that backing RoF with NVMe drives can deliver more than 2 million operations per second with sub-millisecond latency on a single server.
Redis on NVMe SSD - Zvika Guz, Samsung
1. Zvika Guz and Vijay Balakrishnan
Memory Solutions Lab, Samsung Semiconductor Inc.
Redis on NVMe SSD
2. 2
Redis-on-Flash
Closed-source (RLEC Flash), 100% compatible with open-source Redis
Uses Flash as RAM extension to increase effective node capacity
Tiering memory into “fast” and “slow”:
RAM saves keys and hot values
Flash saves cold values
Dynamic configuration of RAM/Flash usage
Uses RocksDB as the storage engine to optimize access to block storage
Multi-threaded and asynchronous Redis used to access Flash
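The tiering scheme above can be sketched in a few lines. This is a minimal illustration of the idea (keys and hot values in fast memory, cold values demoted to a slower tier, promotion on access), not the RoF implementation; the `TieredStore` class and its two plain dicts standing in for RAM and Flash are invented for this example.

```python
# Minimal sketch of RAM/Flash tiering (illustrative only, not RoF code):
# an LRU-ordered dict stands in for RAM, a plain dict stands in for Flash.
from collections import OrderedDict

class TieredStore:
    def __init__(self, ram_capacity):
        self.ram_capacity = ram_capacity
        self.ram = OrderedDict()   # keys + hot values, in LRU order
        self.flash = {}            # cold values only

    def set(self, key, value):
        self.ram[key] = value
        self.ram.move_to_end(key)  # mark as most recently used
        self._evict()

    def get(self, key):
        if key in self.ram:                  # RAM hit
            self.ram.move_to_end(key)
            return self.ram[key]
        value = self.flash.pop(key)          # RAM miss: promote from Flash
        self.set(key, value)
        return value

    def _evict(self):
        # Demote least-recently-used values to Flash once RAM is full.
        while len(self.ram) > self.ram_capacity:
            cold_key, cold_value = self.ram.popitem(last=False)
            self.flash[cold_key] = cold_value

store = TieredStore(ram_capacity=2)
for k in "abc":
    store.set(k, k.upper())
assert "a" in store.flash          # "a" was demoted to the Flash tier
assert store.get("a") == "A"       # access promotes it back to RAM
```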
3. 3
Why Redis-on-Flash?
Optimize price-to-performance for a given workload
DRAM is more performant than flash, but $/GB is higher
Limited DRAM capacity per server
Tiering dramatically reduces $/GB, while preserving good performance ($/ops)
Enables orders-of-magnitude more capacity per server
RoF is particularly suitable for large datasets with skewed access distribution
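Why skewed access matters can be shown with back-of-the-envelope arithmetic. Assuming a Zipf(1) popularity distribution over keys (an assumption for this illustration, not a number from the talk), keeping only a small fraction of keys in RAM still serves the large majority of requests from memory:

```python
# Illustrative only: under a Zipf(1) popularity distribution, a small
# fraction of keys receives the bulk of all accesses, which is exactly
# the regime where RAM/Flash tiering preserves performance.

def zipf_hit_fraction(num_keys, ram_fraction):
    """Fraction of accesses that land on the top `ram_fraction` of keys."""
    weights = [1.0 / rank for rank in range(1, num_keys + 1)]
    hot = int(num_keys * ram_fraction)
    return sum(weights[:hot]) / sum(weights)

# With 1M keys, caching just 10% of them captures ~84% of accesses.
print(round(zipf_hit_fraction(1_000_000, 0.10), 2))  # -> 0.84
```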
4. 4
Workload
Models real-world Redis Labs customer workloads
Benchmark: memtier_benchmark (open source)
GET/SET requests, varying:
1. Object size
2. Write-to-read ratio
3. Redis RAM hit ratio
Performance target:
Maximize operations per second on a single server, while maintaining sub-millisecond latency
Compared three system configurations:
1. All-RAM: In-memory RLEC
2. Redis-on-NVMe: 4xSamsung PM1725 NVMe SSDs
3. Redis-on-SATA: 16xSamsung 850 Pro SATA SSDs
https://github.com/RedisLabs/memtier_benchmark
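An invocation along these lines drives one point of the sweep. The host name and the thread/connection/duration values below are examples, not the exact parameters from the talk; `--ratio` is the SET:GET mix and `-d` the object size in bytes, while the RAM hit ratio is controlled by dataset sizing on the RLEC side rather than by memtier.

```shell
# Illustrative memtier_benchmark run for use case #1
# (100B objects, write-to-read ratio 1:1); values are examples.
memtier_benchmark -s redis-server.example -p 6379 \
    --ratio=1:1 -d 100 \
    -t 8 -c 50 --test-time=300
```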
5. 5
Consistent sub-millisecond latencies favor NVMe
NVMe SSDs are designed for consistent high performance at ultra-low latency
Modest incremental cost over SATA, with much better performance
Samsung PM1725 is the fastest NVMe drive on the market
Redis-on-NVMe
Samsung PM1725 Specification*
Form Factor: 2.5”
Host Interface: PCIe Gen3 x4
Capacities: 800GB, 1.6TB, 3.2TB
Sequential Read: 3300 MB/s
Sequential Write: 1900 MB/s
Random Read: 840 KIOPS
Random Write: 130 KIOPS
Read Latency: 95 usec
Write Latency: 60 usec
>6X over SATA
>8.5X over SATA
*PM1725 HHHL version (PCIe Gen3 x8) provides ~double the performance and capacity, but we did not use it here
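The ">6X" and ">8.5X over SATA" callouts can be sanity-checked against published 850 Pro figures. The SATA numbers below (~550 MB/s sequential read, ~100K random-read IOPS) are taken from public datasheets, not from the slides, so treat this as an approximate cross-check:

```python
# Rough cross-check of the "over SATA" callouts; the 850 Pro figures are
# datasheet assumptions, not numbers from the talk.
pm1725_seq_read_mbps = 3300
pm1725_rand_read_kiops = 840
sata_seq_read_mbps = 550       # assumed 850 Pro sequential read
sata_rand_read_kiops = 100     # assumed 850 Pro random read

print(round(pm1725_seq_read_mbps / sata_seq_read_mbps, 1))    # -> 6.0
print(round(pm1725_rand_read_kiops / sata_rand_read_kiops, 1)) # -> 8.4
```

These ratios line up with the sequential-read and random-read advantages the slide highlights.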
6. 7
System Configuration
Single client, single server
Industry-standard components, all available today
Server: Dell PowerEdge R730xd, dual-socket
Processor: 2 x Xeon E5-2690 v3 @ 2.6GHz (12 cores / 24 logical processors per CPU; 24 cores / 48 logical processors total)
Memory: 256GB ECC DDR4
Network: 10GbE
Storage: 4 x Samsung PM1725 NVMe, or 16 x Samsung 850 Pro SATA SSD
memtier_benchmark: 1.2.6
RLEC version: 4.3.0
Operating System: Ubuntu 14.04
Linux Kernel: 3.19.8
7. 8
Use case #1: Small Objects
100B objects, write-to-read ratio: 1:1
50% RAM-to-Flash hit ratio: Perf = 750 KOPS, Latency = 0.75 msec, Disk BW = 1.7 GB/s
85% RAM-to-Flash hit ratio: Perf = 1.8 MOPS, Latency = 0.9 msec, Disk BW = 602 MB/s
100% of requests served with <1 msec latency
8. 9
Disk Bandwidth Spike
Spikes in disk bandwidth align with RocksDB compaction phase
Can reach 2-3x the average BW
Drives must be able to sustain these spikes, otherwise tail latency suffers
Object Size=100B, write-to-read ratio=1:1, RAM-to-Flash hit ratio=85%
Disk BW=602 MB/s
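The headroom the drives must provide for these spikes follows directly from the numbers above: with a 602 MB/s average and spikes of 2-3x the average, the storage must absorb roughly 1.2-1.8 GB/s without latency degradation. A quick calculation:

```python
# Compaction-spike headroom estimate from the slide's numbers.
avg_bw_mbps = 602
spike_low = 2 * avg_bw_mbps    # 1204 MB/s
spike_high = 3 * avg_bw_mbps   # 1806 MB/s
print(spike_low, spike_high)

# The 4 x PM1725 configuration (1900 MB/s sequential write each) offers
# an aggregate 7600 MB/s, comfortably above the spike range.
aggregate_write_mbps = 4 * 1900
print(aggregate_write_mbps > spike_high)  # -> True
```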
9. 10
Use case #2: Large Objects
1KB objects, write-to-read ratio: 1:4
100% of requests served with <1 msec latency
50% RAM-to-Flash hit ratio: Perf = 270 KOPS, Latency = 0.75 msec, Disk BW = 4.3 GB/s
85% RAM-to-Flash hit ratio: Perf = 816 KOPS, Latency = 0.78 msec, Disk BW = 3.9 GB/s
11. 12
The Problem with SATA
Need 4X the drives to reach ~half the performance of NVMe
Performance is much noisier:
99th-percentile latency > 1 msec
These latency spikes are very difficult to eliminate and appear in almost all our SATA runs
Perf = 132 KOPS, Latency = 0.65 msec
Object Size = 1000B, write-to-read ratio = 1:4, RAM-to-Flash hit ratio = 50%
12. 13
DRAM or Flash?
Optimize performance/$ for each use case
Affected by dataset size, access pattern, and access locality
Redis in Memory
Redis-on-NVMe
Redis-on-SATA
$/GB DRAM:NVMe:SATA = 15:2.5:1
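The price ratio above translates into a blended $/GB for a tiered deployment. The 20%/80% DRAM/Flash capacity split below is a hypothetical example (RoF lets you tune this split per workload), not a number from the talk:

```python
# Blended $/GB sketch using the slide's DRAM:NVMe:SATA = 15:2.5:1 ratio.
# The 20/80 capacity split is an assumed example, not from the talk.
price = {"dram": 15.0, "nvme": 2.5, "sata": 1.0}   # relative $/GB
dram_frac, flash_frac = 0.20, 0.80

all_ram = price["dram"]
rof_nvme = dram_frac * price["dram"] + flash_frac * price["nvme"]  # 5.0
print(round(all_ram / rof_nvme, 1))  # -> 3.0 (RoF-on-NVMe ~3x cheaper per GB)
```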
13. 14
Summary
Redis-on-Flash enables:
Order-of-magnitude more capacity per node
High performance at significantly lower cost
Samsung PM1725 NVMe:
Enables breakthrough performance @ sub-millisecond latency
Consistent performance reduces tail latency
Industry-standard components, available today
Thank You!
zvika.guz@samsung.com