  -‐‑‒
+)*. *+ *1 9@ 7
•
– d
TCP/IP
• *
• mTCP v memcached
– 35%
– v
2
*)4 u
v
• B3 k6
z 2 l
• mTCP +4Intel4DPDK wi
• github mTCP+4DPDK
orz
• Key4Value4Store w k
Linux l
• RADIS →
• d v orz
• Memcached →
• d
3
A G 7 LNPMXXT
4
[Figure: three CDF panels, each with one curve per pool (USR, APP, ETC, VAR, SYS). Panels: key size CDF by appearance (x-axis: key size in bytes, 0 to 100); value size CDF by appearance (x-axis: value size in bytes, log scale, 1 to 1e+06); value size CDF by total (x-axis: value size in bytes, log scale).]
Figure 2: Key and value size distributions for all traces. The leftmost CDF shows the sizes o
B. Atikoglu, et al., “Workload Analysis of a Large-Scale Key-Value Store,” ACM SIGMETRICS 2012.
here. It is important to note, however, that all Memcached instances in this study ran on identical hardware.
2.3 Tracing Methodology
Our analysis called for complete traces of traffic passing through Memcached servers for at least a week. This task is particularly challenging because it requires nonintrusive instrumentation of high-traffic volume production servers. Standard packet sniffers such as tcpdump have too much overhead to run under heavy load. We therefore implemented an efficient packet sniffer called mcap. Implemented as a Linux kernel module, mcap has several advantages over standard packet sniffers: it accesses packet data in kernel space directly and avoids additional memory copying; it introduces only 3% performance overhead (as opposed to tcpdump’s 30%); and unlike standard sniffers, it handles out-of-order packets correctly by capturing incoming traffic after all TCP processing is done. Consequently, mcap has a complete view of what the Memcached server sees, which eliminates the need for further processing of out-of-order packets. On the other hand, its packet parsing is optimized for Memcached packets, and would require adaptations for other applications.
The captured traces vary in size from 3TB to 7TB each. This data is too large to store locally on disk, adding another challenge: how to offload this much data (at an average rate of more than 80,000 samples per second) without interfering with production traffic. We addressed this challenge by combining local disk buffering and dynamic offload throttling to take advantage of low-activity periods in the servers.
Finally, another challenge is this: how to effectively process these large data sets? We used Apache HIVE to analyze Memcached traces. HIVE is part of the Hadoop framework that translates SQL-like queries into MapReduce jobs. We also used the Memcached “stats” command, as well as Facebook’s production logs, to verify that the statistics we computed, such as hit rates, are consistent with the aggregated operational metrics collected by these tools.
3. WORKLOAD CHARACTERISTICS
This section describes the observed properties of each trace
[Figure: stacked bar chart of requests (millions, 0 to 70,000) per pool (USR, APP, ETC, VAR, SYS), broken down into DELETE, UPDATE, and GET.]
Figure 1: Distribution of request types per pool, over exactly 7 days. UPDATE commands aggregate all non-DELETE writing operations, such as SET, REPLACE, etc.
operations. DELETE operations occur when a cached database entry is modified (but not required to be set again in the cache). SET operations occur when the Web servers add a value to the cache. The relatively high number of DELETE operations shows that this pool represents database-backed values that are affected by frequent user modifications.
ETC has similar characteristics to APP, but with an even higher rate of DELETE requests (of which some may not be currently cached). ETC is the largest and least specific of the pools, so its workloads might be the most representative to emulate. Because it is such a large and heterogeneous workload, we pay special attention to this workload throughout the paper.
VAR is the only pool sampled that is write-dominated. It stores short-term values such as browser-window size
rformance metrics over
ekly patterns (Sec. 3.3,
be used to generate more
We found that the salient
r-law distributions, sim-
serving systems (Sec. 5).
d deployment that can
-scale production usage
as follows. We begin by
cached, its deployment
d its workload. Sec. 3
properties of the trace
), while Sec. 4 describes
he server point of view).
model of the most rep-
tion brings the data to-
s, followed by a section
zing cache behavior and
RIPTION
ource software package
s over the network. As
more RAM can be added
added to the network.
mmunicate with clients.
o select a unique server
ge of the total number of
Table 1: Memcached pools sampled (in one cluster). These pools do not match their UNIX namesakes, but are used for illustrative purposes here instead of their internal names.
Pool  Size      Description
USR   few       user-account status information
APP   dozens    object metadata of one application
ETC   hundreds  nonspecific, general-purpose
VAR   dozens    server-side browser information
SYS   few       system data on service location
A new item arriving after the heap is exhausted requires the eviction of an older item in the appropriate slab. Memcached uses the Least-Recently-Used (LRU) algorithm to select the items for eviction. To this end, each slab class has an LRU queue maintaining access history on its items. Although LRU decrees that any accessed item be moved to the top of the queue, this version of Memcached coalesces repeated accesses of the same item within a short period (one minute by default) and only moves this item to the top the first time, to reduce overhead.
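The coalescing behaviour described above can be sketched for a single slab class as follows. This is a minimal illustration, not Memcached's actual implementation: the class name `CoalescingLRU`, its methods, and the explicit `now` argument are hypothetical, and a `set` is treated as the first access of an item.

```python
from collections import OrderedDict


class CoalescingLRU:
    """LRU queue that moves an item to the top at most once per interval."""

    def __init__(self, capacity, bump_interval=60.0):
        self.capacity = capacity
        self.bump_interval = bump_interval  # seconds; one minute as in the text
        # key -> (value, time of last move to the top); the last entry is the top
        self._items = OrderedDict()

    def get(self, key, now):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, last_bump = entry
        # Repeated accesses inside the interval are coalesced: no reordering.
        if now - last_bump >= self.bump_interval:
            self._items[key] = (value, now)
            self._items.move_to_end(key)
        return value

    def set(self, key, value, now):
        if key not in self._items and len(self._items) >= self.capacity:
            self._items.popitem(last=False)  # evict the item at the bottom
        self._items[key] = (value, now)
        self._items.move_to_end(key)

    def keys_bottom_to_top(self):
        return list(self._items)
```

Passing the clock in as `now` keeps the sketch deterministic; a real server would read a monotonic clock instead.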
2.2 Deployment
Facebook relies on Memcached for fast access to frequently-accessed values. Web servers typically try to read persistent values from Memcached before trying the slower backend databases. In many cases, the caches are demand-filled, meaning that generally, data is added to the cache after a client has requested it and failed.
Modifications to persistent data in the database often propagate as deletions (invalidations) to the Memcached tier. Some cached data, however, is transient and not backed by persistent storage, requiring no invalidations.
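The demand-fill and invalidation pattern described above can be sketched as follows. The class `DemandFilledCache` and the use of plain dictionaries as stand-ins for the Memcached tier and the backend database are assumptions made for illustration only.

```python
class DemandFilledCache:
    """Read-through cache that is filled on misses and invalidated on writes."""

    def __init__(self, database):
        self.database = database  # stand-in for the slower backend database
        self.cache = {}           # stand-in for the Memcached tier
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        # Miss: fall back to the database, then fill the cache on demand.
        self.misses += 1
        value = self.database[key]
        self.cache[key] = value
        return value

    def write(self, key, value):
        # The modification goes to the database; the cached copy is deleted
        # (invalidated) rather than updated in place.
        self.database[key] = value
        self.cache.pop(key, None)
```

Under this pattern a write shows up at the cache as a DELETE, and the next read of that key is a miss followed by a SET, which is consistent with the request mix discussed for the database-backed pools.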
. VPVNLNRP
USR keys are 16B or 21B
90% of VAR keys are 31B
USR values are only 2B
90% of values are smaller than 500B
vw
c *)>M g
b*)*) ( 1%/-‐‑‒%*+ 1 t *.C
c EI +> ag
b+ *)2 ( *. *)/ t * NUXNT ( LNTP
*
* ii
5
6
[Diagram: CPU memory hierarchy. Each CPU core has its own L1 (64KB) and L2 (256KB) caches; the cores share an LLC (12MB), backed by memory (xx GB).]
Access latency: L1 4 cycles, L2 12 cycles, LLC 44 cycles, memory 300 cycles
Matching Tables (> xx MB)
Copyright 2014 NTT Corporation
m x86 v n 7
2010 Sep.
Per-Packet CPU Cycles for 10G
8
Cycles needed
  IPv4:  Packet I/O 1,200 + IPv4 lookup 600 = 1,800 cycles
  IPv6:  Packet I/O 1,200 + IPv6 lookup 1,600 = 2,800 cycles
  IPsec: Packet I/O 1,200 + Encryption and hashing 5,400 … = 6,600 cycles
Your budget: 1,400 cycles
10G, min-sized packets, dual quad-core 2.66GHz CPUs
(in x86, cycle numbers are from RouteBricks [Dobrescu09] and ours)
S. Han, et al., “PacketShader: a GPU-accelerated Software Router,” SIGCOMM 2010.
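The per-packet budget of roughly 1,400 cycles can be approximately reproduced from the configuration stated on the slide. The sketch below rests on two assumptions that the slide does not spell out: a minimum-sized Ethernet frame of 64 bytes plus 20 bytes of wire overhead (preamble, start delimiter, and inter-frame gap), and all 8 cores of the dual quad-core machine being available for packet processing.

```python
LINK_BITS_PER_SECOND = 10e9   # 10G link
FRAME_BYTES = 64              # assumed minimum Ethernet frame size
WIRE_OVERHEAD_BYTES = 20      # assumed preamble + start delimiter (8) + inter-frame gap (12)
CORES = 8                     # dual quad-core
CLOCK_HZ = 2.66e9             # 2.66 GHz per core

# Packets per second at line rate with minimum-sized packets.
packets_per_second = LINK_BITS_PER_SECOND / ((FRAME_BYTES + WIRE_OVERHEAD_BYTES) * 8)

# Cycles available per packet across all cores.
budget_cycles = CORES * CLOCK_HZ / packets_per_second

# Cycles needed per packet, taken from the slide (packet I/O + processing).
needed_cycles = {
    "IPv4": 1200 + 600,
    "IPv6": 1200 + 1600,
    "IPsec": 1200 + 5400,
}

over_budget = {name: cycles > budget_cycles for name, cycles in needed_cycles.items()}
```

Under these assumptions the budget comes out at about 1,430 cycles, which matches the slide's rounded figure, and all three workloads exceed it.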
2010 Sep.
PacketShader: psio I/O Optimization
9
• Packet I/O: 1,200 reduced to 200 cycles per packet
• Main ideas
  – Huge packet buffer
  – Batch processing
Cycles needed
  IPv4:  Packet I/O 1,200 + IPv4 lookup 600 = 1,800 cycles
  IPv6:  Packet I/O 1,200 + IPv6 lookup 1,600 = 2,800 cycles
  IPsec: Packet I/O 1,200 + Encryption and hashing 5,400 … = 6,600 cycles
S. Han, et al., “PacketShader: a GPU-accelerated Software Router,” SIGCOMM 2010.
2010 Sep.
PacketShader: GPU Offloading
10
• GPU offloading for
  – Memory-intensive or
  – Compute-intensive operations
• Main topic of this talk
Cycles needed
  IPv4:  Packet I/O + IPv4 lookup 600
  IPv6:  Packet I/O + IPv6 lookup 1,600
  IPsec: Packet I/O + Encryption and hashing 5,400 …
S. Han, et al., “PacketShader: a GPU-accelerated Software Router,” SIGCOMM 2010.
Kernel Uses the Most CPU Cycles
4
CPU Usage Breakdown of Web Server
Web server (Lighttpd) serving a 64 byte file, Linux-3.10
  Kernel (without TCP/IP)  45%
  Packet I/O                4%
  TCP/IP                   34%
  Application              17%
83% of CPU usage spent inside kernel!
Performance bottlenecks
1. Shared resources
2. Broken locality
3. Per packet processing
Bottleneck removed by mTCP
1) Efficient use of CPU cycles for TCP/IP processing: 2.35x more CPU cycles for app
2) 3x ~ 25x better performance
11
E. Jeong, et al., “mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems,” NSDI 2014.
12
Inefficiencies in Kernel from Shared FD
1. Shared resources
– Shared listening queue
– Shared file descriptor space
5
[Diagram: Receive-Side Scaling (H/W) distributes packets to a per-core packet queue on each of Core 0 to Core 3. All cores share one listening queue, protected by a lock, and one file descriptor space, which requires a linear search for finding an empty slot.]
E. Jeong, et al., “mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems,” NSDI 2014.
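One kernel-side way to avoid a single shared listening queue is the SO_REUSEPORT socket option (Linux 3.9 or later), which the benchmark later in this deck also compares against. The sketch below is an illustration under assumptions, not code from the deck: the helper name `open_reuseport_listeners` is hypothetical, and it presumes a platform where Python exposes `socket.SO_REUSEPORT`.

```python
import socket


def open_reuseport_listeners(count, host="127.0.0.1", port=0):
    """Open `count` listening sockets bound to the same address and port.

    Each socket gets its own listening queue, so each worker (for example
    one per core) can call accept() on its own socket without contending
    on a shared queue.
    """
    listeners = []
    for _ in range(count):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # The option must be set on every socket before bind().
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind((host, port))
        # With port=0 the first bind picks a free port; reuse it afterwards.
        port = s.getsockname()[1]
        s.listen(128)
        listeners.append(s)
    return listeners
```

This addresses only the shared listening queue; the shared file descriptor space named on the slide is a separate bottleneck that the option does not change.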
13
Inefficiencies in Kernel from Broken Locality
2. Broken locality
6
[Diagram: Receive-Side Scaling (H/W) distributes packets to a per-core packet queue on each of Core 0 to Core 3. The interrupt is handled on one core, while accept(), read(), and write() run on another.]
Interrupt handling core != accepting core
E. Jeong, et al., “mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems,” NSDI 2014.
14
Inefficiencies in Kernel from Lack of Support for Batching
3. Per packet, per system call processing
Inefficient per packet processing
Frequent mode switching
Cache pollution
Per packet memory allocation
Inefficient per system call processing
7
[Diagram: an application thread at user level calls accept(), read(), and write() through the BSD socket and Linux epoll interfaces into the kernel, which contains Kernel TCP and Packet I/O.]
E. Jeong, et al., “mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems,” NSDI 2014.
15
Overview of mTCP Architecture
10
1. Thread model: Pairwise, per-core threading
2. Batching from packet I/O to application
3. mTCP API: Easily portable API (BSD-like)
[Diagram: on each core (Core 0, Core 1), an application thread is paired with an mTCP thread through the mTCP socket and mTCP epoll interfaces. The mTCP threads run at user level on top of a user-level packet I/O library (PSIO), above the kernel-level NIC device driver. The labels 1, 2, and 3 in the diagram mark parts of this stack.]
• [SIGCOMM’10] PacketShader: A GPU-accelerated software router, http://shader.kaist.edu/packetshader/io_engine/index.html
E. Jeong, et al., “mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems,” NSDI 2014.
Intel DPDK
VPVNLNRP VH E
•
– k u •l • z
z h c.f.4SeaStar
16
[Diagram: two threading models side by side. In the first, a main thread and three worker threads run on top of the kernel. In the second, a main thread and three worker threads run alongside three mTCP threads, with a pipe between threads. The threads are annotated with the accept(), read(), and write() calls they issue.]
c
b + E MLNT X MLNT
b VH E
c
b US R * -‐‑‒ + % 8 LNRP MP NRVL T
b CPVNLNRP * -‐‑‒ + -‐‑‒ % VN MP NRVL T
17
Hardware
  CPU         Intel Xeon E5-22430L/2.0GHz (6 core) x 2 sockets
  Memory      48 GB PC3-12800
  Ethernet    Intel X520-SR1 (10 GbE)
Software
  OS          Debian GNU/Linux 8.1
  kernel      Linux 3.16.0-4-amd64
  Intel DPDK  2.0.0
  mTCP        (4603a1a, June 7 2015)
US R
[Chart: throughput (1000 requests/second, 0 to 180) versus number of cores (0 to 12) for Linux, SO_REUSEPORT, and mTCP; higher is better. Annotations on the chart: 3.3x and 5.5x.]
• Apache benchmark
• 64B message
• 1000 concurrency
• 100K requests
18
VPVNLNRP
c VN MP NRVL T FL S MP NRVL T
l
c VH E v d
G<H .  d><H +  
c u d
19
      TCP w/ 1 thread   TCP w/ 3 threads   mTCP w/ 1 thread
SET   85,404            146,351 (1.71)     115,166 (1.35)
GET   115,079           139,575 (1.21)     116,838 (1.02)
• mc-benchmark
• 64B message
• 500 concurrency
• 100K requests
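The figures in parentheses appear to be each result relative to the single-threaded TCP baseline. The short sketch below recomputes them from the table values; the dictionary layout and the helper name `relative_to_baseline` are illustrative choices, not part of the original slides.

```python
# Results copied from the table above.
results = {
    "SET": {"TCP w/ 1 thread": 85404, "TCP w/ 3 threads": 146351, "mTCP w/ 1 thread": 115166},
    "GET": {"TCP w/ 1 thread": 115079, "TCP w/ 3 threads": 139575, "mTCP w/ 1 thread": 116838},
}


def relative_to_baseline(row, baseline="TCP w/ 1 thread"):
    """Return each configuration's result divided by the baseline, to 2 decimals."""
    return {name: round(value / row[baseline], 2) for name, value in row.items()}


ratios = {op: relative_to_baseline(row) for op, row in results.items()}
```

The recomputed ratios agree with the table: with one thread, mTCP improves SET by about 35% but leaves GET nearly unchanged (about 2%).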
VH E g
c – v d
u k u
z ls E A u
c 9G qP XUU 8E@ v d
k P NX P US P S
P P l u d
c
c v
c D@ E A w d
c v v d
dH E(@E v z e
E A e
20
•
• X86
w
• vzh
– z
• cpufreqUinfo(1) v v1/2
– cgroups CPU4throttling z
– kXeon4Phil w z
• r FLARE Tilera v
21
o p
Supachai Thongprasit
e
[1] S. Thongprasit, V. Visoottiviseh, and R. Takano, “Toward Fast and Scalable Key-Value Stores Based on User Space TCP/IP Stack,” AINTEC 2015.
d d
k d l e
u d ve
22

More Related Content

What's hot

クラウド時代の半導体メモリー技術
クラウド時代の半導体メモリー技術クラウド時代の半導体メモリー技術
クラウド時代の半導体メモリー技術Ryousei Takano
 
An introduction to the Design of Warehouse-Scale Computers
An introduction to the Design of Warehouse-Scale ComputersAn introduction to the Design of Warehouse-Scale Computers
An introduction to the Design of Warehouse-Scale ComputersAlessio Villardita
 
HPC Cloud: Clouds on supercomputers for HPC
HPC Cloud: Clouds on supercomputers for HPCHPC Cloud: Clouds on supercomputers for HPC
HPC Cloud: Clouds on supercomputers for HPCRyousei Takano
 
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...Linaro
 
Warehouse scale computer
Warehouse scale computerWarehouse scale computer
Warehouse scale computerHassan A-j
 
Stig Telfer - OpenStack and the Software-Defined SuperComputer
Stig Telfer - OpenStack and the Software-Defined SuperComputerStig Telfer - OpenStack and the Software-Defined SuperComputer
Stig Telfer - OpenStack and the Software-Defined SuperComputerDanny Abukalam
 
Programmable Exascale Supercomputer
Programmable Exascale SupercomputerProgrammable Exascale Supercomputer
Programmable Exascale SupercomputerSagar Dolas
 
Vector processor : Notes
Vector processor : NotesVector processor : Notes
Vector processor : NotesSubhajit Sahu
 
MIT's experience on OpenPOWER/POWER 9 platform
MIT's experience on OpenPOWER/POWER 9 platformMIT's experience on OpenPOWER/POWER 9 platform
MIT's experience on OpenPOWER/POWER 9 platformGanesan Narayanasamy
 
Bruno Silva - eMedLab: Merging HPC and Cloud for Biomedical Research
Bruno Silva - eMedLab: Merging HPC and Cloud for Biomedical ResearchBruno Silva - eMedLab: Merging HPC and Cloud for Biomedical Research
Bruno Silva - eMedLab: Merging HPC and Cloud for Biomedical ResearchDanny Abukalam
 
Mellanox Announces HDR 200 Gb/s InfiniBand Solutions
Mellanox Announces HDR 200 Gb/s InfiniBand SolutionsMellanox Announces HDR 200 Gb/s InfiniBand Solutions
Mellanox Announces HDR 200 Gb/s InfiniBand Solutionsinside-BigData.com
 
High performance computing - building blocks, production & perspective
High performance computing - building blocks, production & perspectiveHigh performance computing - building blocks, production & perspective
High performance computing - building blocks, production & perspectiveJason Shih
 
Hardware architecture of Summit Supercomputer
 Hardware architecture of Summit Supercomputer Hardware architecture of Summit Supercomputer
Hardware architecture of Summit SupercomputerVigneshwarRamaswamy
 
Scale-out AI Training on Massive Core System from HPC to Fabric-based SOC
Scale-out AI Training on Massive Core System from HPC to Fabric-based SOCScale-out AI Training on Massive Core System from HPC to Fabric-based SOC
Scale-out AI Training on Massive Core System from HPC to Fabric-based SOCinside-BigData.com
 
Introduction to High-Performance Computing (HPC) Containers and Singularity*
Introduction to High-Performance Computing (HPC) Containers and Singularity*Introduction to High-Performance Computing (HPC) Containers and Singularity*
Introduction to High-Performance Computing (HPC) Containers and Singularity*Intel® Software
 
Microsoft Project Olympus AI Accelerator Chassis (HGX-1)
Microsoft Project Olympus AI Accelerator Chassis (HGX-1)Microsoft Project Olympus AI Accelerator Chassis (HGX-1)
Microsoft Project Olympus AI Accelerator Chassis (HGX-1)inside-BigData.com
 
Slides for In-Datacenter Performance Analysis of a Tensor Processing Unit
Slides for In-Datacenter Performance Analysis of a Tensor Processing UnitSlides for In-Datacenter Performance Analysis of a Tensor Processing Unit
Slides for In-Datacenter Performance Analysis of a Tensor Processing UnitCarlo C. del Mundo
 
XNAT Tuning & Monitoring
XNAT Tuning & MonitoringXNAT Tuning & Monitoring
XNAT Tuning & MonitoringJohn Paulett
 

What's hot (20)

クラウド時代の半導体メモリー技術
クラウド時代の半導体メモリー技術クラウド時代の半導体メモリー技術
クラウド時代の半導体メモリー技術
 
An introduction to the Design of Warehouse-Scale Computers
An introduction to the Design of Warehouse-Scale ComputersAn introduction to the Design of Warehouse-Scale Computers
An introduction to the Design of Warehouse-Scale Computers
 
HPC Cloud: Clouds on supercomputers for HPC
HPC Cloud: Clouds on supercomputers for HPCHPC Cloud: Clouds on supercomputers for HPC
HPC Cloud: Clouds on supercomputers for HPC
 
Exascale Capabl
Exascale CapablExascale Capabl
Exascale Capabl
 
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...
 
Warehouse scale computer
Warehouse scale computerWarehouse scale computer
Warehouse scale computer
 
Stig Telfer - OpenStack and the Software-Defined SuperComputer
Stig Telfer - OpenStack and the Software-Defined SuperComputerStig Telfer - OpenStack and the Software-Defined SuperComputer
Stig Telfer - OpenStack and the Software-Defined SuperComputer
 
Programmable Exascale Supercomputer
Programmable Exascale SupercomputerProgrammable Exascale Supercomputer
Programmable Exascale Supercomputer
 
Vector processor : Notes
Vector processor : NotesVector processor : Notes
Vector processor : Notes
 
MIT's experience on OpenPOWER/POWER 9 platform
MIT's experience on OpenPOWER/POWER 9 platformMIT's experience on OpenPOWER/POWER 9 platform
MIT's experience on OpenPOWER/POWER 9 platform
 
Bruno Silva - eMedLab: Merging HPC and Cloud for Biomedical Research
Bruno Silva - eMedLab: Merging HPC and Cloud for Biomedical ResearchBruno Silva - eMedLab: Merging HPC and Cloud for Biomedical Research
Bruno Silva - eMedLab: Merging HPC and Cloud for Biomedical Research
 
Mellanox Announces HDR 200 Gb/s InfiniBand Solutions
Mellanox Announces HDR 200 Gb/s InfiniBand SolutionsMellanox Announces HDR 200 Gb/s InfiniBand Solutions
Mellanox Announces HDR 200 Gb/s InfiniBand Solutions
 
High performance computing - building blocks, production & perspective
High performance computing - building blocks, production & perspectiveHigh performance computing - building blocks, production & perspective
High performance computing - building blocks, production & perspective
 
Hardware architecture of Summit Supercomputer
 Hardware architecture of Summit Supercomputer Hardware architecture of Summit Supercomputer
Hardware architecture of Summit Supercomputer
 
Scale-out AI Training on Massive Core System from HPC to Fabric-based SOC
Scale-out AI Training on Massive Core System from HPC to Fabric-based SOCScale-out AI Training on Massive Core System from HPC to Fabric-based SOC
Scale-out AI Training on Massive Core System from HPC to Fabric-based SOC
 
Introduction to High-Performance Computing (HPC) Containers and Singularity*
Introduction to High-Performance Computing (HPC) Containers and Singularity*Introduction to High-Performance Computing (HPC) Containers and Singularity*
Introduction to High-Performance Computing (HPC) Containers and Singularity*
 
Google warehouse scale computer
Google warehouse scale computerGoogle warehouse scale computer
Google warehouse scale computer
 
Microsoft Project Olympus AI Accelerator Chassis (HGX-1)
Microsoft Project Olympus AI Accelerator Chassis (HGX-1)Microsoft Project Olympus AI Accelerator Chassis (HGX-1)
Microsoft Project Olympus AI Accelerator Chassis (HGX-1)
 
Slides for In-Datacenter Performance Analysis of a Tensor Processing Unit
Slides for In-Datacenter Performance Analysis of a Tensor Processing UnitSlides for In-Datacenter Performance Analysis of a Tensor Processing Unit
Slides for In-Datacenter Performance Analysis of a Tensor Processing Unit
 
XNAT Tuning & Monitoring
XNAT Tuning & MonitoringXNAT Tuning & Monitoring
XNAT Tuning & Monitoring
 

Viewers also liked

xv6から始めるSPIN入門
xv6から始めるSPIN入門xv6から始めるSPIN入門
xv6から始めるSPIN入門Ryousei Takano
 
とある帽子の大蛇料理Ⅱ
とある帽子の大蛇料理Ⅱとある帽子の大蛇料理Ⅱ
とある帽子の大蛇料理ⅡMasami Ichikawa
 
あなたの知らないネットワークプログラミングの世界
あなたの知らないネットワークプログラミングの世界あなたの知らないネットワークプログラミングの世界
あなたの知らないネットワークプログラミングの世界Ryousei Takano
 
100Gbpsソフトウェアルータの実現可能性に関する論文
100Gbpsソフトウェアルータの実現可能性に関する論文100Gbpsソフトウェアルータの実現可能性に関する論文
100Gbpsソフトウェアルータの実現可能性に関する論文y_uuki
 
xv6のコンテキストスイッチを読む
xv6のコンテキストスイッチを読むxv6のコンテキストスイッチを読む
xv6のコンテキストスイッチを読むmfumi
 
デバドラを書いてみよう!
デバドラを書いてみよう!デバドラを書いてみよう!
デバドラを書いてみよう!Masami Ichikawa
 
I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜Ryousei Takano
 
Disruptive IP Networking with Intel DPDK on Linux
Disruptive IP Networking with Intel DPDK on LinuxDisruptive IP Networking with Intel DPDK on Linux
Disruptive IP Networking with Intel DPDK on LinuxNaoto MATSUMOTO
 
x86とコンテキストスイッチ
x86とコンテキストスイッチx86とコンテキストスイッチ
x86とコンテキストスイッチMasami Ichikawa
 
Network processing by pid
Network processing by pidNetwork processing by pid
Network processing by pidNuno Martins
 
クラウド環境におけるキャッシュメモリQoS制御の評価
クラウド環境におけるキャッシュメモリQoS制御の評価クラウド環境におけるキャッシュメモリQoS制御の評価
クラウド環境におけるキャッシュメモリQoS制御の評価Ryousei Takano
 
Xeon dとlagopusと、pktgen dpdk
Xeon dとlagopusと、pktgen dpdkXeon dとlagopusと、pktgen dpdk
Xeon dとlagopusと、pktgen dpdkMasaru Oki
 
Dpdk環境の話
Dpdk環境の話Dpdk環境の話
Dpdk環境の話Masaru Oki
 
Intel 82599 10GbE Controllerで遊ぼう
Intel 82599 10GbE Controllerで遊ぼうIntel 82599 10GbE Controllerで遊ぼう
Intel 82599 10GbE Controllerで遊ぼうTakuya ASADA
 
10GbE時代のネットワークI/O高速化
10GbE時代のネットワークI/O高速化10GbE時代のネットワークI/O高速化
10GbE時代のネットワークI/O高速化Takuya ASADA
 

Viewers also liked (20)

xv6から始めるSPIN入門
xv6から始めるSPIN入門xv6から始めるSPIN入門
xv6から始めるSPIN入門
 
MSDOS
MSDOSMSDOS
MSDOS
 
Bish Bash Bosh & Co
Bish Bash Bosh & Co Bish Bash Bosh & Co
Bish Bash Bosh & Co
 
とある帽子の大蛇料理Ⅱ
とある帽子の大蛇料理Ⅱとある帽子の大蛇料理Ⅱ
とある帽子の大蛇料理Ⅱ
 
あなたの知らないネットワークプログラミングの世界
あなたの知らないネットワークプログラミングの世界あなたの知らないネットワークプログラミングの世界
あなたの知らないネットワークプログラミングの世界
 
πολλαπλασιασμοι ενοτητα 11
πολλαπλασιασμοι ενοτητα 11πολλαπλασιασμοι ενοτητα 11
πολλαπλασιασμοι ενοτητα 11
 
100Gbpsソフトウェアルータの実現可能性に関する論文
100Gbpsソフトウェアルータの実現可能性に関する論文100Gbpsソフトウェアルータの実現可能性に関する論文
100Gbpsソフトウェアルータの実現可能性に関する論文
 
xv6のコンテキストスイッチを読む
xv6のコンテキストスイッチを読むxv6のコンテキストスイッチを読む
xv6のコンテキストスイッチを読む
 
デバドラを書いてみよう!
デバドラを書いてみよう!デバドラを書いてみよう!
デバドラを書いてみよう!
 
I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜
 
Disruptive IP Networking with Intel DPDK on Linux
Disruptive IP Networking with Intel DPDK on LinuxDisruptive IP Networking with Intel DPDK on Linux
Disruptive IP Networking with Intel DPDK on Linux
 
x86とコンテキストスイッチ
x86とコンテキストスイッチx86とコンテキストスイッチ
x86とコンテキストスイッチ
 
Network processing by pid
Network processing by pidNetwork processing by pid
Network processing by pid
 
クラウド環境におけるキャッシュメモリQoS制御の評価
クラウド環境におけるキャッシュメモリQoS制御の評価クラウド環境におけるキャッシュメモリQoS制御の評価
クラウド環境におけるキャッシュメモリQoS制御の評価
 
DPDKを拡張してみた話し
DPDKを拡張してみた話しDPDKを拡張してみた話し
DPDKを拡張してみた話し
 
Xeon dとlagopusと、pktgen dpdk
Xeon dとlagopusと、pktgen dpdkXeon dとlagopusと、pktgen dpdk
Xeon dとlagopusと、pktgen dpdk
 
Dpdk環境の話
Dpdk環境の話Dpdk環境の話
Dpdk環境の話
 
Msdos
MsdosMsdos
Msdos
 
Intel 82599 10GbE Controllerで遊ぼう
Intel 82599 10GbE Controllerで遊ぼうIntel 82599 10GbE Controllerで遊ぼう
Intel 82599 10GbE Controllerで遊ぼう
 
10GbE時代のネットワークI/O高速化
10GbE時代のネットワークI/O高速化10GbE時代のネットワークI/O高速化
10GbE時代のネットワークI/O高速化
 

Similar to Understanding Memcached Workloads Through Traces

Migrating the elastic stack to the cloud, or application logging @ travix
 Migrating the elastic stack to the cloud, or application logging @ travix Migrating the elastic stack to the cloud, or application logging @ travix
Migrating the elastic stack to the cloud, or application logging @ travixRuslan Lutsenko
 
Big Data, Mob Scale.
Big Data, Mob Scale.Big Data, Mob Scale.
Big Data, Mob Scale.darach
 
Big Events, Mob Scale - Darach Ennis (Push Technology)
Big Events, Mob Scale - Darach Ennis (Push Technology)Big Events, Mob Scale - Darach Ennis (Push Technology)
Big Events, Mob Scale - Darach Ennis (Push Technology)jaxLondonConference
 
Troubleshooting SQL Server
Troubleshooting SQL ServerTroubleshooting SQL Server
Troubleshooting SQL ServerStephen Rose
 
Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudRick Bilodeau
 
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco IntercloudStreamsets Inc.
 
Near Real time Indexing Kafka Messages to Apache Blur using Spark Streaming
Near Real time Indexing Kafka Messages to Apache Blur using Spark StreamingNear Real time Indexing Kafka Messages to Apache Blur using Spark Streaming
Near Real time Indexing Kafka Messages to Apache Blur using Spark StreamingDibyendu Bhattacharya
 
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...Alluxio, Inc.
 
Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...
Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...
Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...Cloudera, Inc.
 
Kafka streams decoupling with stores
Kafka streams decoupling with storesKafka streams decoupling with stores
Kafka streams decoupling with storesYoni Farin
 
Exploiting Multi Core Architectures for Process Speed Up
Exploiting Multi Core Architectures for Process Speed UpExploiting Multi Core Architectures for Process Speed Up
Exploiting Multi Core Architectures for Process Speed UpIJERD Editor
 
Clug 2011 March web server optimisation
Clug 2011 March  web server optimisationClug 2011 March  web server optimisation
Clug 2011 March web server optimisationgrooverdan
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware ProvisioningMongoDB
 
Revisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerRevisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerYongseok Oh
 
[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf
[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf
[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdfMukundThakur22
 
DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...
DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...
DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...DataStax
 
CPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCoburn Watson
 
Low latency in java 8 v5
Low latency in java 8 v5Low latency in java 8 v5
Low latency in java 8 v5Peter Lawrey
 

Similar to Understanding Memcached Workloads Through Traces (20)

Migrating the elastic stack to the cloud, or application logging @ travix
 Migrating the elastic stack to the cloud, or application logging @ travix Migrating the elastic stack to the cloud, or application logging @ travix
Migrating the elastic stack to the cloud, or application logging @ travix
 
Big Data, Mob Scale.
Big Data, Mob Scale.Big Data, Mob Scale.
Big Data, Mob Scale.
 
Big Events, Mob Scale - Darach Ennis (Push Technology)
Big Events, Mob Scale - Darach Ennis (Push Technology)Big Events, Mob Scale - Darach Ennis (Push Technology)
Big Events, Mob Scale - Darach Ennis (Push Technology)
 
11g R2
11g R211g R2
11g R2
 
Troubleshooting SQL Server
Troubleshooting SQL ServerTroubleshooting SQL Server
Troubleshooting SQL Server
 
Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
 
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
 
Near Real time Indexing Kafka Messages to Apache Blur using Spark Streaming
Near Real time Indexing Kafka Messages to Apache Blur using Spark StreamingNear Real time Indexing Kafka Messages to Apache Blur using Spark Streaming
Near Real time Indexing Kafka Messages to Apache Blur using Spark Streaming
 
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
 
Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...
Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...
Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...
 
Kafka streams decoupling with stores
Kafka streams decoupling with storesKafka streams decoupling with stores
Kafka streams decoupling with stores
 
Exploiting Multi Core Architectures for Process Speed Up
Exploiting Multi Core Architectures for Process Speed UpExploiting Multi Core Architectures for Process Speed Up
Exploiting Multi Core Architectures for Process Speed Up
 
Clug 2011 March web server optimisation
Clug 2011 March  web server optimisationClug 2011 March  web server optimisation
Clug 2011 March web server optimisation
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware Provisioning
 
Revisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerRevisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS Scheduler
 
[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf
[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf
[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf
 
DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...
DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...
DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...
 
CPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performance
 
Low latency in java 8 v5
Low latency in java 8 v5Low latency in java 8 v5
Low latency in java 8 v5
 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computing
 





Understanding Memcached Workloads Through Traces

  • 2. • – d TCP/IP • * • mTCP v memcached – 35% – v 2 *)4 u v
  • 3. • B3 k6 z 2 l • mTCP + Intel DPDK wi • github mTCP + DPDK orz • Key Value Store w k Linux l • RADIS → • d v orz • Memcached → • d
  • 4. A G 7 LNPMXXT

[Figure 2: Key and value size distributions for all traces. Panels: key size CDF by appearance; value size CDF by appearance; value size CDF by total. Axes: key size (bytes), value size (bytes). Pools: USR, APP, ETC, VAR, SYS. The leftmost CDF shows the sizes o]

B. Atikoglu, et al., “Workload Analysis of a Large-Scale Key-Value Store,” ACM SIGMETRICS 2012.

here. It is important to note, however, that all Memcached instances in this study ran on identical hardware.

2.3 Tracing Methodology

Our analysis called for complete traces of traffic passing through Memcached servers for at least a week. This task is particularly challenging because it requires nonintrusive instrumentation of high-traffic volume production servers. Standard packet sniffers such as tcpdump have too much overhead to run under heavy load. We therefore implemented an efficient packet sniffer called mcap. Implemented as a Linux kernel module, mcap has several advantages over standard packet sniffers: it accesses packet data in kernel space directly and avoids additional memory copying; it introduces only 3% performance overhead (as opposed to tcpdump’s 30%); and unlike standard sniffers, it handles out-of-order packets correctly by capturing incoming traffic after all TCP processing is done. Consequently, mcap has a complete view of what the Memcached server sees, which eliminates the need for further processing of out-of-order packets. On the other hand, its packet parsing is optimized for Memcached packets, and would require adaptations for other applications.

The captured traces vary in size from 3 TB to 7 TB each. This data is too large to store locally on disk, adding another challenge: how to offload this much data (at an average rate of more than 80,000 samples per second) without interfering with production traffic. We addressed this challenge by combining local disk buffering and dynamic offload throttling to take advantage of low-activity periods in the servers.

Finally, another challenge is this: how to effectively process these large data sets? We used Apache HIVE to analyze Memcached traces. HIVE is part of the Hadoop framework that translates SQL-like queries into MapReduce jobs. We also used the Memcached “stats” command, as well as Facebook’s production logs, to verify that the statistics we computed, such as hit rates, are consistent with the aggregated operational metrics collected by these tools.

3. WORKLOAD CHARACTERISTICS

This section describes the observed properties of each trace

[Figure 1: Distribution of request types per pool, over exactly 7 days. UPDATE commands aggregate all non-DELETE writing operations, such as SET, REPLACE, etc. Axes: pool (USR, APP, ETC, VAR, SYS) versus requests (millions). Series: DELETE, UPDATE, GET.]

operations. DELETE operations occur when a cached database entry is modified (but not required to be set again in the cache). SET operations occur when the Web servers add a value to the cache. The relatively high number of DELETE operations shows that this pool represents database-backed values that are affected by frequent user modifications.

ETC has similar characteristics to APP, but with an even higher rate of DELETE requests (of which some may not be currently cached). ETC is the largest and least specific of the pools, so its workloads might be the most representative to emulate. Because it is such a large and heterogeneous workload, we pay special attention to this workload throughout the paper.

VAR is the only pool sampled that is write-dominated. It stores short-term values such as browser-window size

rformance metrics over ekly patterns (Sec. 3.3, be used to generate more We found that the salient r-law distributions, sim- serving systems (Sec. 5). d deployment that can -scale production usage as follows. We begin by cached, its deployment d its workload. Sec. 3 properties of the trace ), while Sec. 4 describes he server point of view). model of the most rep- tion brings the data to- s, followed by a section zing cache behavior and RIPTION ource software package s over the network. As more RAM can be added added to the network. mmunicate with clients. o select a unique server ge of the total number of

Table 1: Memcached pools sampled (in one cluster). These pools do not match their UNIX namesakes, but are used for illustrative purposes here instead of their internal names.

Pool  Size      Description
USR   few       user-account status information
APP   dozens    object metadata of one application
ETC   hundreds  nonspecific, general-purpose
VAR   dozens    server-side browser information
SYS   few       system data on service location

A new item arriving after the heap is exhausted requires the eviction of an older item in the appropriate slab. Memcached uses the Least-Recently-Used (LRU) algorithm to select the items for eviction. To this end, each slab class has an LRU queue maintaining access history on its items. Although LRU decrees that any accessed item be moved to the top of the queue, this version of Memcached coalesces repeated accesses of the same item within a short period (one minute by default) and only moves this item to the top the first time, to reduce overhead.

2.2 Deployment

Facebook relies on Memcached for fast access to frequently-accessed values. Web servers typically try to read persistent values from Memcached before trying the slower backend databases. In many cases, the caches are demand-filled, meaning that generally, data is added to the cache after a client has requested it and failed. Modifications to persistent data in the database often propagate as deletions (invalidations) to the Memcached tier. Some cached data, however, is transient and not backed by persistent storage, requiring no invalidations.
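The coalesced LRU update described in the excerpt can be illustrated with a short sketch. This is not Memcached's implementation: it keeps a single queue rather than one per slab class, and the class name `CoalescingLRU`, the `bump_interval` parameter, and the injectable `clock` are all hypothetical names chosen for illustration. It only shows the idea that an access moves an item to the top at most once per interval.

```python
from collections import OrderedDict
import time


class CoalescingLRU:
    """Toy LRU cache that skips queue updates for recently bumped items.

    Assumption: one queue for all items (real Memcached keeps one LRU
    queue per slab class). The most recently used item is at the end.
    """

    def __init__(self, capacity, bump_interval=60.0, clock=time.monotonic):
        self.capacity = capacity
        self.bump_interval = bump_interval  # seconds between allowed bumps
        self.clock = clock                  # injectable for testing
        self._items = OrderedDict()         # key -> (value, last_bump_time)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, last_bump = entry
        now = self.clock()
        # Only reorder the queue if the item was not bumped recently.
        if now - last_bump >= self.bump_interval:
            self._items[key] = (value, now)
            self._items.move_to_end(key)
        return value

    def set(self, key, value):
        now = self.clock()
        if key in self._items:
            self._items[key] = (value, now)
            self._items.move_to_end(key)
            return
        if len(self._items) >= self.capacity:
            # Evict the least recently used item (front of the queue).
            self._items.popitem(last=False)
        self._items[key] = (value, now)
```

The trade-off is that eviction order is only approximately least-recently-used within one interval, in exchange for fewer queue updates on hot items.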
VPVNLNRP USR keys are 16B or 21B. 90% of VAR keys are 31B. USR values are only 2B. 90% of values are smaller than 500B.
  • 5. vw c *)>M g b*)*) ( 1%/-‐‑‒%*+ 1 t *.C c EI +> ag b+ *)2 ( *. *)/ t * NUXNT ( LNTP * * ii 5
  • 8. Per-Packet CPU Cycles for 10G (2010 Sep.)
Cycles needed (10G, min-sized packets, dual quad-core 2.66GHz CPUs). Your budget: 1,400 cycles.
IPv4: Packet I/O 1,200 + IPv4 lookup 600 = 1,800 cycles
IPv6: Packet I/O 1,200 + IPv6 lookup 1,600 = 2,800 cycles
IPsec: Packet I/O 1,200 + Encryption and hashing 5,400 = 6,600 cycles
(in x86, cycle numbers are from RouteBricks [Dobrescu09] and ours)
S. Han, et al., “PacketShader: a GPU-accelerated Software Router,” SIGCOMM 2010.
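The 1,400-cycle budget can be checked with a little arithmetic. The slide gives only the link speed, packet size class, and CPU configuration, so the derivation below is an assumption: it takes a min-sized Ethernet frame as 64 bytes plus 20 bytes of preamble and inter-frame gap on the wire, and divides the total cycles of all eight cores by the resulting packet rate. The function names are illustrative.

```python
# Assumed inputs: only "10G", "min-sized packets" and
# "dual quad-core 2.66GHz" come from the slide.
LINK_BPS = 10e9            # 10 Gbit/s link
MIN_FRAME_BYTES = 64       # minimum Ethernet frame
WIRE_OVERHEAD_BYTES = 20   # preamble + SFD (8) and inter-frame gap (12)
CORES = 2 * 4              # dual quad-core
CLOCK_HZ = 2.66e9

# Per-packet costs as listed on the slide (cycles).
COSTS = {
    "IPv4": 1200 + 600,
    "IPv6": 1200 + 1600,
    "IPsec": 1200 + 5400,
}


def packets_per_second(link_bps=LINK_BPS,
                       frame_bytes=MIN_FRAME_BYTES,
                       overhead_bytes=WIRE_OVERHEAD_BYTES):
    """Packet rate at line rate for a given on-wire packet size."""
    return link_bps / ((frame_bytes + overhead_bytes) * 8)


def cycle_budget(cores=CORES, clock_hz=CLOCK_HZ, pps=None):
    """CPU cycles available per packet across all cores."""
    if pps is None:
        pps = packets_per_second()
    return cores * clock_hz / pps


def overshoot(costs=COSTS):
    """Ratio of each workload's per-packet cost to the budget."""
    budget = cycle_budget()
    return {name: cost / budget for name, cost in costs.items()}
```

Under these assumptions the budget comes out at roughly 1,430 cycles per packet, which matches the slide's rounded 1,400, and every listed workload exceeds it.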
  • 9. PacketShader: psio I/O Optimization (2010 Sep.)
Packet I/O: 1,200 reduced to 200 cycles per packet.
Main ideas: huge packet buffer; batch processing.
(Figure: Packet I/O 1,200 + IPv4 lookup 600 = 1,800 cycles; Packet I/O 1,200 + IPv6 lookup 1,600 = 2,800; Packet I/O 1,200 + Encryption and hashing 5,400 = 6,600.)
S. Han, et al., “PacketShader: a GPU-accelerated Software Router,” SIGCOMM 2010.
  • 10. PacketShader: GPU Offloading (2010 Sep.)
GPU offloading for memory-intensive or compute-intensive operations. Main topic of this talk.
(Figure: Packet I/O + IPv4 lookup 600; Packet I/O + IPv6 lookup 1,600; Packet I/O + Encryption and hashing 5,400.)
S. Han, et al., “PacketShader: a GPU-accelerated Software Router,” SIGCOMM 2010.
  • 11. Kernel Uses the Most CPU Cycles
83% of CPU usage spent inside kernel!
CPU Usage Breakdown of Web Server (Lighttpd, serving a 64 byte file, Linux-3.10): Kernel (without TCP/IP) 45%, Packet I/O 4%, TCP/IP 34%, Application 17%.
Performance bottlenecks: 1. Shared resources 2. Broken locality 3. Per packet processing
Bottleneck removed by mTCP: 1) Efficient use of CPU cycles for TCP/IP processing (2.35x more CPU cycles for app) 2) 3x ~ 25x better performance
E. Jeong, et al., “mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems,” NSDI 2014.
  • 12. Inefficiencies in Kernel from Shared FD
1. Shared resources – Shared listening queue – Shared file descriptor space
(Figure: Receive-Side Scaling (H/W) feeding per-core packet queues on Core 0 to Core 3; a lock on the listening queue; linear search of the file descriptor space for finding an empty slot.)
E. Jeong, et al., “mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems,” NSDI 2014.
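One kernel-side way to avoid a single shared listening queue is the Linux `SO_REUSEPORT` socket option, which appears as a comparison point in the benchmark later in the deck. The sketch below is an illustration of that option only, not of mTCP: it opens several listening sockets on the same address and port so that each worker can own its own accept queue. The helper name `open_reuseport_listeners` is hypothetical, and the code assumes a platform where Python exposes `socket.SO_REUSEPORT` (Linux 3.9 or later).

```python
import socket


def open_reuseport_listeners(host, port, count, backlog=128):
    """Open `count` listening sockets bound to the same (host, port).

    Every socket must set SO_REUSEPORT before bind(). If `port` is 0,
    the kernel picks a free port for the first socket and the rest
    reuse it.
    """
    listeners = []
    for _ in range(count):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind((host, port))
        if port == 0:
            port = sock.getsockname()[1]  # reuse the kernel-chosen port
        sock.listen(backlog)
        listeners.append(sock)
    return listeners
```

In a real server each listener would be handed to a separate worker thread or process, ideally pinned to one core, so that incoming connections are distributed across separate queues instead of contending on one lock.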
  • 13. Inefficiencies in Kernel from Broken Locality
2. Broken locality: interrupt handling core != accepting core
(Figure: Receive-Side Scaling (H/W) feeding per-core packet queues on Core 0 to Core 3; interrupt handling on one core, accept(), read(), write() on another.)
E. Jeong, et al., “mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems,” NSDI 2014.
  • 14. Inefficiencies in Kernel from Lack of Support for Batching
3. Per packet, per system call processing
Inefficient per system call processing: frequent mode switching; cache pollution.
Inefficient per packet processing: per packet memory allocation.
(Figure: application thread at user level calling accept(), read(), write() through BSD socket and Linux epoll; kernel TCP and packet I/O at kernel level.)
E. Jeong, et al., “mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems,” NSDI 2014.
  • 15. Overview of mTCP Architecture
1. Thread model: Pairwise, per-core threading
2. Batching from packet I/O to application
3. mTCP API: Easily portable API (BSD-like)
(Figure: on Core 0 and Core 1, application thread 0/1 paired with mTCP thread 0/1 through mTCP socket and mTCP epoll at user level; user-level packet I/O library (PSIO), annotated "Intel DPDK"; NIC device driver at kernel level.)
• [SIGCOMM’10] PacketShader: A GPU-accelerated software router, http://shader.kaist.edu/packetshader/io_engine/index.html
E. Jeong, et al., “mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems,” NSDI 2014.
  • 16. VPVNLNRP VH E • – k u •l • z z h c.f. SeaStar
(Figure: two thread diagrams. One shows a main thread and three worker threads above the kernel; the other shows a main thread and three worker threads, each with an mTCP thread. Labels: pipe, accept(), read(), write().)
  • 17. c b + E MLNT X MLNT b VH E c b US R * -‐‑‒ + % 8 LNRP MP NRVL T b CPVNLNRP * -‐‑‒ + -‐‑‒ % VN MP NRVL T
Hardware
CPU: Intel Xeon E5-22430L/2.0GHz (6 core) x 2 sockets
Memory: 48 GB PC3-12800
Ethernet: Intel X520-SR1 (10 GbE)
Software
OS: Debian GNU/Linux 8.1
kernel: Linux 3.16.0-4-amd64
Intel DPDK: 2.0.0
mTCP: (4603a1a, June 7 2015)
  • 18. US R
(Figure: throughput in 1000 requests/second versus number of cores for Linux, SO_REUSEPORT, and mTCP; higher is better; annotations: 3.3x, 5.5x.)
• Apache benchmark • 64B message • 1000 concurrency • 100K requests
  • 19. VPVNLNRP c VN MP NRVL T FL S MP NRVL T l c VH E v d G<H .  d><H +   c u d
       TCP w/ 1 thread   TCP w/ 3 threads   mTCP w/ 1 thread
SET    85,404            146,351 (1.71)     115,166 (1.35)
GET    115,079           139,575 (1.21)     116,838 (1.02)
• mc-benchmark • 64B message • 500 concurrency • 100K requests
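The parenthesized figures in the table appear to be ratios against the "TCP w/ 1 thread" column. A short sketch that recomputes them from the slide's raw numbers is given below; the dictionary layout and the function name `relative_to_baseline` are illustrative, and the values are used as given on the slide without assuming a unit.

```python
# Raw values copied from the slide's table.
THROUGHPUT = {
    "SET": {
        "TCP w/ 1 thread": 85404,
        "TCP w/ 3 threads": 146351,
        "mTCP w/ 1 thread": 115166,
    },
    "GET": {
        "TCP w/ 1 thread": 115079,
        "TCP w/ 3 threads": 139575,
        "mTCP w/ 1 thread": 116838,
    },
}

BASELINE = "TCP w/ 1 thread"


def relative_to_baseline(row, baseline=BASELINE):
    """Return each configuration's value divided by the baseline value."""
    base = row[baseline]
    return {name: round(value / base, 2) for name, value in row.items()}


if __name__ == "__main__":
    for op, row in THROUGHPUT.items():
        print(op, relative_to_baseline(row))
```

Recomputing the ratios this way reproduces the slide's 1.71, 1.35, 1.21, and 1.02.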
  • 20. VH E g c – v d u k u z ls E A u c 9G qP XUU 8E@ v d k P NX P US P S P P l u d c c v c D@ E A w d c v v d dH E(@E v z e E A e 20
  • 21. • • X86 w • vzh – z • cpufreq-info(1) v v1/2 – cgroups CPU throttling z – kXeon Phil w z • r FLARE Tilera v