Network Support
for Resource Disaggregation
in Next-Generation Datacenters	

Sangjin Han Norbert Egi Aurojit Panda
Sylvia Ratnasamy Guangyu Shi Scott Shenker

UC Berkeley Futurewei Technologies ICSI	

1
Future datacenters will look
fundamentally different.	

	


There will be no “servers”.	

	


Like it or not.	

2
Outline:	

	


1.  What will happen?	

2.  Why will it happen?	

3.  How will it happen?	

3
Today: Server-Centric Architecture	


Intra-server
Network	


Datacenter
Network	


4
Tomorrow: Resource-Centric Architecture	


All resources are 
individually addressable	


Datacenter
Network	


5
The Trends: Disaggregation	

1.  HP MoonShot	

–  Shared cooling/casing/power/mgmt for server blades	


6
The Trends: Disaggregation	

1.  HP MoonShot	

2.  AMD SeaMicro	

–  Virtualized I/O	


Datacenter
Network	


7
The Trends: Disaggregation	

1.  HP MoonShot	

2.  AMD SeaMicro	

3.  Intel Rack Scale Architecture	


8
The Trends: Disaggregation	

1.  HP MoonShot

2.  AMD SeaMicro

3.  Intel Rack Scale Architecture

4.  Open Compute Project


9
Disaggregated Datacenter	


Resource as a
standalone blade	


Datacenter
network	


10
Why will it happen?	

	


: Extreme resource modularity	


11
Benefits of Resource Modularity	

1.  Easier to build & evolve

–  Resources have different cycles/trends/constraints.

•  Tight integration in a server is a huge pain

•  E.g., “Memory capacity per core drops 30% every 2 years” [Lim et al., ISCA ’09]


–  Disaggregation enables independent evolution	

–  The biggest driving force from the vendors’ viewpoint
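The quoted trend compounds quickly. As a rough worked example (assuming the 30% drop applies uniformly to every 2-year period, which is a simplification of the cited figure), the remaining memory capacity per core can be computed as follows; the function name and defaults are illustrative only.

```python
def relative_capacity_per_core(years: float,
                               drop_per_period: float = 0.30,
                               period_years: float = 2.0) -> float:
    """Fraction of the starting memory-per-core that remains after `years`.

    Assumes a constant compound decline: each `period_years` the capacity
    per core is multiplied by (1 - drop_per_period).
    """
    return (1.0 - drop_per_period) ** (years / period_years)


if __name__ == "__main__":
    for years in (2, 4, 10):
        remaining = relative_capacity_per_core(years)
        print(f"after {years:2d} years: {remaining:.2f}x of today's capacity per core")
```

Under this assumption, a decade of the trend leaves roughly a sixth of today's memory per core, which is the kind of mismatch that tight CPU/memory integration makes hard to correct.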


12
Benefits of Resource Modularity	

1.  Easier to build & evolve

2.  Fine-grained resource provisioning	

–  Current practice: replace/buy an entire server, rack, or even datacenter.

–  e.g., “I want just more processors, not servers!”

•  Go buy some CPU blades at Best Buy® and plug them in.

–  e.g., “I want to try the new NVRAM technology!”

•  Again, go for it.


13
Benefits of Resource Modularity	

1.  Easier to build & evolve

2.  Fine-grained resource provisioning	

3.  Operational efficiency	

–  Datacenter as a single giant computer	

–  Higher utilization with statistical multiplexing	

–  (I will get back to this)	


14
How will it happen?	

	


: Incrementally and radically.	


15
THIS IS NOT A CRAZY IDEA	


Unified
Interconnect	


•  Do we need to change everything? NO.	

–  HW change is minimal.	

–  SW change is minimal, too.	


16
HW Requires Minimal Modification	


Resource as a standalone blade	


•  The internals don’t need to change.	

•  All we need is an embedded network controller.

–  They already have: QPI, HT, PCIe, SATA, …	

–  Can be very cheap	

•  E.g., a whole graphics card w/ 128Gbps for only $50	

17
How About SW?	

•  Existing SW infrastructure relies heavily on the concept of a “server”

–  We don’t want to rewrite it from scratch.	

–  How to utilize the “giant computer”?	


[Figure: VM vs. Disaggregated VM]

No modification for App/OS

Minor changes in VMM.

Much higher utilization!

18
Elastic VMs Achieve High Utilization!	

[Figure: Task 1, Task 2, Task 3, and Task 4 on servers]

19
Elastic VMs Achieve High Utilization!	


Servers: 40% of resources are wasted

20
Elastic VMs Achieve High Utilization!	

Disaggregated datacenter

1.  No “server boundary”

2.  Statistical multiplexing at a larger scale

3.  Higher utilization!

vs.

Servers: 40% of resources are wasted

21
(tasks vary greatly)	
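The utilization argument can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the experiment behind the slides: tasks are random (CPU, memory) demands, servers are packed with a simple first-fit heuristic, and in the disaggregated case a task is assumed to be able to draw CPU and memory from any blade in the pool. All names and parameters are hypothetical.

```python
import math
import random


def first_fit_servers(tasks, cpu_cap=1.0, mem_cap=1.0):
    """Pack (cpu, mem) tasks into fixed-size servers with first-fit.

    A task must fit entirely inside one server (the "server boundary").
    Returns the number of servers used.
    """
    servers = []  # each entry is [cpu_used, mem_used]
    for cpu, mem in tasks:
        for used in servers:
            if used[0] + cpu <= cpu_cap and used[1] + mem <= mem_cap:
                used[0] += cpu
                used[1] += mem
                break
        else:
            servers.append([cpu, mem])
    return len(servers)


def pooled_blades(tasks, cpu_cap=1.0, mem_cap=1.0):
    """Blades needed when CPU and memory are pooled independently.

    Assumes demand can be split freely across blades of the same type.
    Returns (cpu_blades, mem_blades).
    """
    total_cpu = sum(cpu for cpu, _ in tasks)
    total_mem = sum(mem for _, mem in tasks)
    return math.ceil(total_cpu / cpu_cap), math.ceil(total_mem / mem_cap)


def demo(num_tasks=200, seed=0):
    rng = random.Random(seed)
    # Independent CPU and memory demands, so that tasks vary greatly.
    tasks = [(rng.uniform(0.05, 0.9), rng.uniform(0.05, 0.9))
             for _ in range(num_tasks)]
    servers = first_fit_servers(tasks)
    cpu_blades, mem_blades = pooled_blades(tasks)
    total_cpu = sum(cpu for cpu, _ in tasks)
    print(f"servers needed: {servers} "
          f"(CPU utilization {total_cpu / servers:.0%})")
    print(f"pooled CPU blades: {cpu_blades} "
          f"(CPU utilization {total_cpu / cpu_blades:.0%})")
    print(f"pooled memory blades: {mem_blades}")
    return servers, cpu_blades, mem_blades


if __name__ == "__main__":
    demo()
```

Because every server can hold at most one server's worth of CPU and memory, the server count can never be lower than either pooled blade count; the gap between the two is the capacity stranded by server boundaries in this model.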


22
THIS IS NOT A CRAZY IDEA, part 2	

•  We don’t need to change everything.	

–  HW change is minimal.	

–  SW change is minimal, too.	


•  A unified network is plausible	

–  The intra-/inter-server networks can be unified.	

–  Bandwidth/latency requirements are within reach.	


23
Two Different(?) Types of Network	


Intra-server
Network	


Inter-server
Network	


24
Intra- vs. Inter-Server Networking	

E.g., PCIe and 10GbE	

Aren’t they two different things? Not really.

•  Serial

•  Point-to-point

•  Full duplex

•  Packet-switched

•  Variable packet size

•  Supports both message and read/write semantics


	


No fundamental™ difference!	

25
Making Memory Traffic Manageable	

Registers

↕  10,000 Gbps, 1 ns

CPU cache

↕  500 Gbps, 50 ns

Memory

↕  1-10 Gbps, 50,000 ns

SSD / HDD

26
Making Memory Traffic Manageable	

Registers

↕  10,000 Gbps, 1 ns

CPU cache

↕  500 Gbps, 50 ns

Local memory

↕  ?? Gbps, ?? ns

Remote memory

↕  1-10 Gbps, 50,000 ns

SSD / HDD

A small amount of local memory as a “cache”
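Treating local memory as a cache in front of remote memory suggests the usual weighted-average estimate of access latency. The sketch below is illustrative only: the hit rate and the remote latency are hypothetical inputs (the slide leaves the remote figures as “??”), and the model ignores queueing and bandwidth limits.

```python
def average_access_ns(local_hit_rate: float,
                      local_ns: float,
                      remote_ns: float) -> float:
    """Expected memory access latency when local memory caches remote memory.

    local_hit_rate: fraction of accesses served from local memory (0..1).
    local_ns:       latency of a local memory access, in nanoseconds.
    remote_ns:      latency of a remote memory access, in nanoseconds.
    """
    if not 0.0 <= local_hit_rate <= 1.0:
        raise ValueError("local_hit_rate must be between 0 and 1")
    return local_hit_rate * local_ns + (1.0 - local_hit_rate) * remote_ns


if __name__ == "__main__":
    # 50 ns local access as on the slide; 10 µs remote latency is an assumption.
    for hit_rate in (0.90, 0.99, 0.999):
        latency = average_access_ns(hit_rate, local_ns=50, remote_ns=10_000)
        print(f"hit rate {hit_rate:.1%}: about {latency:.1f} ns per access")
```

The point of the sketch is the sensitivity: with a remote access hundreds of times slower than a local one, even a small miss rate dominates the average, so the size of the local “cache” and the network latency have to be chosen together.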
  
27
Desirable Network Speed?	

•  A quick-and-dirty experiment	

[Figure: a server with four cores and local memory; “remote” memory is emulated by adding an artificial delay for bandwidth/latency]

28
Desirable Network Speed?	

•  4 CPU cores, 8GB working set size	

•  GraphLab, memcached, Pig	

•  Findings (read the paper)	

1.  10-40 Gbps is enough.

–  Feasible even today!

–  Average link utilization: 1-5 Gbps

2.  Latency matters.
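A back-of-the-envelope calculation shows why latency, rather than bandwidth, tends to dominate at these link speeds. The sketch assumes remote memory is fetched one 4 KB page at a time (the page size and the function name are assumptions, not taken from the slides) and adds a fixed network latency to the time needed to push the page onto the wire.

```python
def page_fetch_us(page_bytes: int, link_gbps: float, latency_us: float) -> float:
    """Time to fetch one page: fixed latency plus serialization time, in µs."""
    # bits / (Gbit/s) gives nanoseconds; dividing by 1e3 more gives microseconds.
    serialization_us = page_bytes * 8 / (link_gbps * 1e3)
    return latency_us + serialization_us


if __name__ == "__main__":
    PAGE_BYTES = 4096  # assumed page size
    for gbps in (10, 40):
        for latency_us in (1, 10):
            total = page_fetch_us(PAGE_BYTES, gbps, latency_us)
            print(f"{gbps:2d} Gbps, {latency_us:2d} µs latency: {total:.2f} µs per page")
```

In this model a 4 KB page takes about 3.3 µs to serialize at 10 Gbps and under 1 µs at 40 Gbps, so once the link is in the 10-40 Gbps range a latency of several microseconds accounts for most of the fetch time.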

29
WANTED: Low Latency	

[Figure: memcached with varying latency]

10µs latency, 20% overhead


30
Research Questions	


31
Questions	

•  Answered:	

–  How fast should it be? 10-40Gbps, 1-10µs.	


•  Unanswered:	

–  Scalability?

–  Reliable transfer?

–  QoS?

–  Packet? Circuit?

–  …

32
1. “Right” Scale of Disaggregation	

•  Disaggregation scale: where is the sweet spot?	

[Figure labels: traditional inter-server network; datacenter-scale (flat); chassis/rack/pod-scale (two-tier)]

33
2. Realizing Low Latency	

•  It’s time for low latency [Rumble et al., HotOS ’11]	

–  “5-10µs latency is possible in the short term”	

–  “1µs round-trip times cannot be achieved off-processor”	


•  Congestion avoidance/control should be 
“close to the metal”.	

–  A lot of research efforts are ongoing.	

–  Will they still be valid in disaggregated datacenters?

34
3. Unified Scheduler	

•  We will need a unified scheduler:	

–  Job scheduler + network controller	

[Figure: resources assigned to VM1 and VM2]

35
Closing Remarks	

•  Disaggregated datacenter will be “the next big thing”	

–  Already happening. We need to catch up!	


•  We are working on a small-scale prototype.	

–  Disaggregated resource blades on existing SW/HW	

•  40 CPUs	

•  6 remote memory blades	

•  8 GPU/NIC/storage blades	


–  PCIe as the “unified interconnect”	

38

More Related Content

What's hot

Farming hadoop in_the_cloud
Farming hadoop in_the_cloudFarming hadoop in_the_cloud
Farming hadoop in_the_cloudSteve Loughran
 
Software Defined Storage Appliance Power by ARM based Microserver
Software Defined Storage Appliance Power by ARM based MicroserverSoftware Defined Storage Appliance Power by ARM based Microserver
Software Defined Storage Appliance Power by ARM based MicroserverAaron Joue
 
Cy7 introduction
Cy7 introductionCy7 introduction
Cy7 introductionKunhui Wu
 
24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUs
24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUs24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUs
24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUsDavid Klee
 
Energy Saving ARM Server Cluster Born for Distributed Storage & Computing
Energy Saving ARM Server Cluster Born for Distributed Storage & ComputingEnergy Saving ARM Server Cluster Born for Distributed Storage & Computing
Energy Saving ARM Server Cluster Born for Distributed Storage & ComputingAaron Joue
 
GWAVACon 2015: Microsoft MVP - Exchange Architecture & Sizing
GWAVACon 2015: Microsoft MVP - Exchange Architecture & SizingGWAVACon 2015: Microsoft MVP - Exchange Architecture & Sizing
GWAVACon 2015: Microsoft MVP - Exchange Architecture & SizingGWAVA
 
NGENSTOR_ODA_HPDA
NGENSTOR_ODA_HPDANGENSTOR_ODA_HPDA
NGENSTOR_ODA_HPDAUniFabric
 
Why new hardware may not make Oracle databases faster
Why new hardware may not make Oracle databases fasterWhy new hardware may not make Oracle databases faster
Why new hardware may not make Oracle databases fasterSolarWinds
 
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance Ceph Community
 
VMworld 2013: How SRP Delivers More Than Power to Their Customers
VMworld 2013: How SRP Delivers More Than Power to Their Customers VMworld 2013: How SRP Delivers More Than Power to Their Customers
VMworld 2013: How SRP Delivers More Than Power to Their Customers VMworld
 
Answering the Database Scale Out Problem with PCI SSDs
Answering the Database Scale Out Problem with PCI SSDsAnswering the Database Scale Out Problem with PCI SSDs
Answering the Database Scale Out Problem with PCI SSDsanswers
 
Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...
Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...
Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...Red_Hat_Storage
 
OpenNebula TechDay Boston 2015 - Future of Information Storage with ISS Super...
OpenNebula TechDay Boston 2015 - Future of Information Storage with ISS Super...OpenNebula TechDay Boston 2015 - Future of Information Storage with ISS Super...
OpenNebula TechDay Boston 2015 - Future of Information Storage with ISS Super...OpenNebula Project
 
Hadoop Operations at LinkedIn
Hadoop Operations at LinkedInHadoop Operations at LinkedIn
Hadoop Operations at LinkedInDataWorks Summit
 
SanDisk: Persistent Memory and Cassandra
SanDisk: Persistent Memory and CassandraSanDisk: Persistent Memory and Cassandra
SanDisk: Persistent Memory and CassandraDataStax Academy
 
DockerCon14 Cluster Management and Containerization
DockerCon14 Cluster Management and ContainerizationDockerCon14 Cluster Management and Containerization
DockerCon14 Cluster Management and ContainerizationDocker, Inc.
 
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...DataStax Academy
 
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix BarbeiraBackup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix BarbeiraCeph Community
 
Azure en Nutanix: your journey to the hybrid cloud
Azure en Nutanix: your journey to the hybrid cloudAzure en Nutanix: your journey to the hybrid cloud
Azure en Nutanix: your journey to the hybrid cloudICT-Partners
 
MySQL Performance Tuning
MySQL Performance TuningMySQL Performance Tuning
MySQL Performance TuningFromDual GmbH
 

What's hot (20)

Farming hadoop in_the_cloud
Farming hadoop in_the_cloudFarming hadoop in_the_cloud
Farming hadoop in_the_cloud
 
Software Defined Storage Appliance Power by ARM based Microserver
Software Defined Storage Appliance Power by ARM based MicroserverSoftware Defined Storage Appliance Power by ARM based Microserver
Software Defined Storage Appliance Power by ARM based Microserver
 
Cy7 introduction
Cy7 introductionCy7 introduction
Cy7 introduction
 
24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUs
24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUs24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUs
24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUs
 
Energy Saving ARM Server Cluster Born for Distributed Storage & Computing
Energy Saving ARM Server Cluster Born for Distributed Storage & ComputingEnergy Saving ARM Server Cluster Born for Distributed Storage & Computing
Energy Saving ARM Server Cluster Born for Distributed Storage & Computing
 
GWAVACon 2015: Microsoft MVP - Exchange Architecture & Sizing
GWAVACon 2015: Microsoft MVP - Exchange Architecture & SizingGWAVACon 2015: Microsoft MVP - Exchange Architecture & Sizing
GWAVACon 2015: Microsoft MVP - Exchange Architecture & Sizing
 
NGENSTOR_ODA_HPDA
NGENSTOR_ODA_HPDANGENSTOR_ODA_HPDA
NGENSTOR_ODA_HPDA
 
Why new hardware may not make Oracle databases faster
Why new hardware may not make Oracle databases fasterWhy new hardware may not make Oracle databases faster
Why new hardware may not make Oracle databases faster
 
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
 
VMworld 2013: How SRP Delivers More Than Power to Their Customers
VMworld 2013: How SRP Delivers More Than Power to Their Customers VMworld 2013: How SRP Delivers More Than Power to Their Customers
VMworld 2013: How SRP Delivers More Than Power to Their Customers
 
Answering the Database Scale Out Problem with PCI SSDs
Answering the Database Scale Out Problem with PCI SSDsAnswering the Database Scale Out Problem with PCI SSDs
Answering the Database Scale Out Problem with PCI SSDs
 
Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...
Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...
Red Hat Storage Day Seattle: Stabilizing Petabyte Ceph Cluster in OpenStack C...
 
OpenNebula TechDay Boston 2015 - Future of Information Storage with ISS Super...
OpenNebula TechDay Boston 2015 - Future of Information Storage with ISS Super...OpenNebula TechDay Boston 2015 - Future of Information Storage with ISS Super...
OpenNebula TechDay Boston 2015 - Future of Information Storage with ISS Super...
 
Hadoop Operations at LinkedIn
Hadoop Operations at LinkedInHadoop Operations at LinkedIn
Hadoop Operations at LinkedIn
 
SanDisk: Persistent Memory and Cassandra
SanDisk: Persistent Memory and CassandraSanDisk: Persistent Memory and Cassandra
SanDisk: Persistent Memory and Cassandra
 
DockerCon14 Cluster Management and Containerization
DockerCon14 Cluster Management and ContainerizationDockerCon14 Cluster Management and Containerization
DockerCon14 Cluster Management and Containerization
 
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...
 
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix BarbeiraBackup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
 
Azure en Nutanix: your journey to the hybrid cloud
Azure en Nutanix: your journey to the hybrid cloudAzure en Nutanix: your journey to the hybrid cloud
Azure en Nutanix: your journey to the hybrid cloud
 
MySQL Performance Tuning
MySQL Performance TuningMySQL Performance Tuning
MySQL Performance Tuning
 

Similar to Network support for resource disaggregation in next-generation datacenters

Building a High Performance Analytics Platform
Building a High Performance Analytics PlatformBuilding a High Performance Analytics Platform
Building a High Performance Analytics PlatformSantanu Dey
 
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...Linaro
 
Data storage solutions for sns game
Data storage solutions for sns gameData storage solutions for sns game
Data storage solutions for sns gameSon Aris
 
Data Storage Solutions for SNS game
Data Storage Solutions for SNS gameData Storage Solutions for SNS game
Data Storage Solutions for SNS gamewe20
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudRose Toomey
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudDatabricks
 
Presentation architecting a cloud infrastructure
Presentation   architecting a cloud infrastructurePresentation   architecting a cloud infrastructure
Presentation architecting a cloud infrastructurexKinAnx
 
Presentation architecting a cloud infrastructure
Presentation   architecting a cloud infrastructurePresentation   architecting a cloud infrastructure
Presentation architecting a cloud infrastructuresolarisyourep
 
KIISE:SIGDB Workshop presentation.
KIISE:SIGDB Workshop presentation.KIISE:SIGDB Workshop presentation.
KIISE:SIGDB Workshop presentation.Kyong-Ha Lee
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentationEdward Capriolo
 
Webinar: StorPool and WHIR - better storage, better business
Webinar: StorPool and WHIR - better storage, better businessWebinar: StorPool and WHIR - better storage, better business
Webinar: StorPool and WHIR - better storage, better businessStorPool Storage
 
Challenges and Opportunities of Big Data Genomics
Challenges and Opportunities of Big Data GenomicsChallenges and Opportunities of Big Data Genomics
Challenges and Opportunities of Big Data GenomicsYasin Memari
 
Global Azure Virtual 2020 What's new on Azure IaaS for SQL VMs
Global Azure Virtual 2020 What's new on Azure IaaS for SQL VMsGlobal Azure Virtual 2020 What's new on Azure IaaS for SQL VMs
Global Azure Virtual 2020 What's new on Azure IaaS for SQL VMsMarco Obinu
 
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisMike Pittaro
 
Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis PyData
 
network ram parallel computing
network ram parallel computingnetwork ram parallel computing
network ram parallel computingNiranjana Ambadi
 
ABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big Data
ABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big DataABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big Data
ABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big DataHitoshi Sato
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisMike Pittaro
 

Similar to Network support for resource disaggregation in next-generation datacenters (20)

Building a High Performance Analytics Platform
Building a High Performance Analytics PlatformBuilding a High Performance Analytics Platform
Building a High Performance Analytics Platform
 
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...
LCA13: Jason Taylor Keynote - ARM & Disaggregated Rack - LCA13-Hong - 6 March...
 
Data storage solutions for sns game
Data storage solutions for sns gameData storage solutions for sns game
Data storage solutions for sns game
 
Data Storage Solutions for SNS game
Data Storage Solutions for SNS gameData Storage Solutions for SNS game
Data Storage Solutions for SNS game
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the Cloud
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the Cloud
 
Presentation architecting a cloud infrastructure
Presentation   architecting a cloud infrastructurePresentation   architecting a cloud infrastructure
Presentation architecting a cloud infrastructure
 
Presentation architecting a cloud infrastructure
Presentation   architecting a cloud infrastructurePresentation   architecting a cloud infrastructure
Presentation architecting a cloud infrastructure
 
KIISE:SIGDB Workshop presentation.
KIISE:SIGDB Workshop presentation.KIISE:SIGDB Workshop presentation.
KIISE:SIGDB Workshop presentation.
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentation
 
Webinar: StorPool and WHIR - better storage, better business
Webinar: StorPool and WHIR - better storage, better businessWebinar: StorPool and WHIR - better storage, better business
Webinar: StorPool and WHIR - better storage, better business
 
Challenges and Opportunities of Big Data Genomics
Challenges and Opportunities of Big Data GenomicsChallenges and Opportunities of Big Data Genomics
Challenges and Opportunities of Big Data Genomics
 
Global Azure Virtual 2020 What's new on Azure IaaS for SQL VMs
Global Azure Virtual 2020 What's new on Azure IaaS for SQL VMsGlobal Azure Virtual 2020 What's new on Azure IaaS for SQL VMs
Global Azure Virtual 2020 What's new on Azure IaaS for SQL VMs
 
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data Analysis
 
Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis
 
CPU Caches
CPU CachesCPU Caches
CPU Caches
 
network ram parallel computing
network ram parallel computingnetwork ram parallel computing
network ram parallel computing
 
ABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big Data
ABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big DataABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big Data
ABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big Data
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data Analysis
 

Recently uploaded

My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 

Recently uploaded (20)

My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven

Network support for resource disaggregation in next-generation datacenters

  • 1. Network Support for Resource Disaggregation in Next-Generation Datacenters. Sangjin Han, Norbert Egi, Aurojit Panda, Sylvia Ratnasamy, Guangyu Shi, Scott Shenker. UC Berkeley, Futurewei Technologies, ICSI.
  • 2. Future datacenters will look fundamentally different. There will be no “servers”. Like it or not.
  • 3. Outline: 1. What will happen? 2. Why will it happen? 3. How will it happen?
  • 5. Tomorrow: Resource-Centric Architecture. All resources are individually addressable. [Figure label: Datacenter Network]
  • 6. The Trends: Disaggregation 1. HP MoonShot – Shared cooling/casing/power/mgmt for server blades
  • 7. The Trends: Disaggregation 1. HP MoonShot 2. AMD SeaMicro – Virtualized I/O [Figure label: Datacenter Network]
  • 8. The Trends: Disaggregation 1. HP MoonShot 2. AMD SeaMicro 3. Intel Rack Scale Architecture
  • 9. The Trends: Disaggregation 1. HP MoonShot 2. AMD SeaMicro 3. Intel Rack Scale Architecture 4. Open Compute Project
  • 10. Disaggregated Datacenter. [Figure labels: Resource as a standalone blade; Datacenter network]
  • 11. Why will it happen? Extreme resource modularity.
  • 12. Benefits of Resource Modularity 1. Easier to build and evolve – Resources have different cycles/trends/constraints. • Tight integration in a server is a huge pain • E.g., “Memory capacity per core drops 30% for every 2 years” [Lim et al., ISCA ’09] – Disaggregation enables independent evolution – The biggest driving force from the vendor’s viewpoint
  • 13. Benefits of Resource Modularity 1. Easier to build and evolve 2. Fine-grained resource provisioning – Current practice: replace/buy an entire server, rack, or even datacenter. – e.g., “I want just more processors, not servers!” • Go buy some CPU blades at Best Buy® and plug them in. – e.g., “I want to try the new NVRAM technology!” • Again, go for it.
  • 14. Benefits of Resource Modularity 1. Easier to build and evolve 2. Fine-grained resource provisioning 3. Operational efficiency – Datacenter as a single giant computer – Higher utilization with statistical multiplexing – (I will get back to this)
  • 15. How will it happen? Incrementally and radically.
  • 16. THIS IS NOT A CRAZY IDEA [Figure label: Unified Interconnect] • Do we need to change everything? NO. – HW change is minimal. – SW change is minimal, too.
  • 17. HW Requires Minimal Modification [Figure label: Resource as a standalone blade] • The internals don’t need to change. • All we need is an embedded network controller. – They already have: QPI, HT, PCIe, SATA, … – Can be very cheap • E.g., a whole graphics card w/ 128 Gbps for only $50
  • 18. How About SW? • Existing SW infrastructure heavily relies on the concept of “server” – We don’t want to rewrite it from scratch. – How to utilize the “giant computer”? [Figure labels: VM; Disaggregated VM] No modification for App/OS. Minor changes in VMM. Much higher utilization!
  • 19. Elastic VMs Achieve High Utilization! [Figure: Tasks 1-4 placed on servers]
  • 20. Elastic VMs Achieve High Utilization! [Figure: servers] 40% of resources are wasted
  • 21. Elastic VMs Achieve High Utilization! [Figure: servers, where 40% of resources are wasted, vs. disaggregated datacenter] 1. No “server boundary” 2. Statistical multiplexing at a larger scale 3. Higher utilization!
  • 23. THIS IS NOT A CRAZY IDEA, part 2 • We don’t need to change everything. – HW change is minimal. – SW change is minimal, too. • A unified network is plausible – The intra-/inter-server networks can be unified. – Bandwidth/latency requirements are within reach.
  • 24. Two Different(?) Types of Network [Figure labels: Intra-server Network; Inter-server Network]
  • 25. Intra- vs. Inter-Server Networking. E.g., PCIe and 10GbE. Aren’t they two different things? Not really. • Serial • Point-to-point • Full duplex • Packet-switched • Variable packet size • Supports both message and read/write semantics. No fundamental™ difference!
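  • 26. Making Memory Traffic Manageable [Figure: storage hierarchy, top to bottom, with the bandwidth and latency shown between adjacent tiers] Registers; 10,000 Gbps, 1 ns; CPU cache; 500 Gbps, 50 ns; memory; 1-10 Gbps, 50,000 ns; SSD/HDD
  • 27. Making Memory Traffic Manageable [Figure: storage hierarchy, top to bottom, with the bandwidth and latency shown between adjacent tiers] Registers; 10,000 Gbps, 1 ns; CPU cache; 500 Gbps, 50 ns; local memory; ?? Gbps, ?? ns; remote memory; 1-10 Gbps, 50,000 ns; SSD/HDD. A small amount of local memory as a “cache”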
  • 28. Desirable Network Speed? • A quick-and-dirty experiment [Figure: a server with four cores, local memory, and emulated “remote” memory, with an artificial delay for bandwidth/latency]
  • 29. Desirable Network Speed? • 4 CPU cores, 8 GB working set size • GraphLab, memcached, Pig • Findings (read the paper) 1. 10-40 Gbps is enough. – Feasible even today! – Average link utilization: 1-5 Gbps 2. Latency matters.
  • 30. WANTED: Low Latency [Figure: memcached with varying latency] 10 µs latency, 20% overhead
  • 32. Questions • Answered: – How fast should it be? 10-40 Gbps, 1-10 µs. • Unanswered: – Scalability? – Reliable transfer? – QoS? – Packet? Circuit? – …
  • 33. 1. “Right” Scale of Disaggregation • Disaggregation scale: where is the sweet spot? [Figure labels: traditional inter-server network; datacenter-scale (flat); chassis/rack/pod-scale (two-tier)]
  • 34. 2. Realizing Low Latency • It’s time for low latency [Rumble et al., HotOS ’11] – “5-10µs latency is possible in the short term” – “1µs round-trip times cannot be achieved off-processor” • Congestion avoidance/control should be “close to the metal”. – A lot of research efforts are ongoing. – Will they still be valid in disaggregated datacenters?
  • 35-37. 3. Unified Scheduler • We will need a unified scheduler: – Job scheduler + network controller [Figure across three slides: placement of VM1 and VM2]
  • 38. Closing Remarks • The disaggregated datacenter will be “the next big thing” – Already happening. We need to catch up! • We are working on a small-scale prototype. – Disaggregated resource blades on existing SW/HW • 40 CPUs • 6 remote memory blades • 8 GPU/NIC/storage blades – PCIe as the “unified interconnect”
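
Slides 26-27 propose keeping a small amount of local memory as a “cache” in front of remote memory, and slide 32 gives 1-10 µs as the target network latency. One rough way to see why latency matters is the standard weighted-average access-time model, sketched below in Python. The function name, the hit ratios, and the use of 50 ns as the local access latency are illustrative assumptions, not code or measurements from the talk.

```python
def effective_latency_ns(hit_ratio, local_ns, remote_ns):
    """Average access latency when local memory acts as a cache for remote memory.

    hit_ratio: fraction of accesses served from local memory (0.0 to 1.0).
    local_ns:  latency of a local memory access, in nanoseconds.
    remote_ns: latency of a remote memory access over the network, in nanoseconds.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Weighted average of the hit path and the miss path.
    return hit_ratio * local_ns + (1.0 - hit_ratio) * remote_ns


# Assumed inputs: 50 ns is the cache-to-memory latency shown on slide 26;
# 10 µs is the upper end of the 1-10 µs range on slide 32.
LOCAL_NS = 50
REMOTE_NS = 10_000

if __name__ == "__main__":
    for hit_ratio in (0.90, 0.99, 0.999):  # hypothetical hit ratios
        avg = effective_latency_ns(hit_ratio, LOCAL_NS, REMOTE_NS)
        print(f"hit ratio {hit_ratio:.3f}: {avg:.1f} ns on average")
```

Under this simple model, the remote term dominates the average unless nearly every access hits local memory, which is in line with the slides' finding that latency matters.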