SlideShare a Scribd company logo
1 of 32
Download to read offline
CloudStack Scalability
Testing, Development, Results, and Futures
Apache CloudStack: a project in incubation

                       • Secure, multi-tenant cloud
                         orchestration platform
                         – Turnkey platform for delivering IaaS clouds
                         – Hypervisor agnostic
                         – Highly scalable, secure and open
                         – Complete Self-service portal
                         – Open source, open standards
                         – Deploys on premise
Manage hosts, create VMs, virtual disks, virtual
            Admin          networks, meter usage, ….
                                                    Internet
                    Management Server
                    Cluster

                               Primary
                                                          Router
                               MySQL
                                   Backup                 Load Balancer
                                   MySQL
                                                           L3 Core Switch
Top of Rack
     Switch



                                                                               Object Storage
  Servers
               …              …                …       …              …
                                                                            Availability Zone 1
            Pod 1          Pod 2            Pod 3                  Pod N
Thinking about cloud orchestration at scale
 • Host management
 • Capacity management
 • What host to use to deploy a
   new VM
 • Failure handling
 • Security group propagation
 • Set a goal
We can’t afford this as our QA lab
Simulator enables scale testing


                       Mgmt.
                       Server       Zone
 User API                                     MySQL
                                  Simulator
              Load     Mgmt.
            Balancer   Server
Admin API

                       Mgmt.    MySQL
                       Server

                       Mgmt.
                       Server
Environment
                         2 cores, 4 with Hyper
                       Threading. 2.2 GHz Xeon.
                           Mgmt.
                       16 GB RAM. 12 GB JVM
                           Server
                                Heap.
                                                  Zone
                                          Single spinning disk, later
                                                                  MySQL
 User API
                                          singleSimulator GB RAM.
                                                 SSD. 32
              Load         Mgmt.                 MySQL 5.5.
            Balancer       Server
Admin API

                           Mgmt.              MySQL
                           Server

                           Mgmt.
                           Server
Allocator performance is awful with 1000 hosts
 • Two minutes to decide which host to use for a new VM!
    • Computing capacity for every pod repeatedly
 • Fixed that, but still 12 seconds to decide
 • Use host tags, down to 2 seconds
 • Major changes required to improve further
    • In 2.2.0, store capacity info in DB, skip pod altogether
 • Harness the power of SQL select and all is well
Polling doesn’t scale

 TRUE?              FALSE?
 Sometimes, it is good enough
Host management
• Check host state via TCP connection
• Check every minute
   • 30,000 checks per minute, 500 per second
   • But they take 10 seconds, so 5000 in parallel
   • Not using async I/O so 5000 threads required…
   • Single JVM can support 2000+ threads so this is
     concerning but may not be the limiting factor
Host management
• What is the maximum feasible JVM heap size?
   • Some people use heaps with hundreds of GB
   • Commercial tools can help, but cost
   • We decided to stay below 20 GB (GC concerns)
• How much CPU is required for background processing?
CPU utilization while deploying 30,000 VMs on 30,000 hosts
    CPU Utilization. 400% is maximum
                                       20,000
                                                  5000   5000



                                                                Idle




                                                Time
Deploy time from 25,000 to 30,000 VMs
  Seconds to deploy




                      VM number: 25,000 plus X
Problem: agent load balancing
  Mgmt        Mgmt      • Management servers
 Server 1    Server 2     start/stop/fail/crash
                        • How do newly started
                          Management Servers get
                          agents / work?
                        • When a Management Server
                          exits, how do others pick up its
                          load?
                        • When new hosts are added
                          how is the load distributed?
Common use case timings at scale
• 30,000 hosts and 4 Management Servers
• 4 Management Servers running, 1 fails: 10 minutes to
  redistribute 7500 agents
• 3 Management Servers running, add a fourth: 40 minutes to
  redistribute load evenly  IMPORTANT
• 0 Management Servers running, start all 4 simultaneously: 16
  minutes to connect to all 30,000 hosts
Understanding security groups



               Web                                 DB                                Web
               VM                                  VM                                VM
                          Web                                 DB
                        Security                            Security
              Web        Group                    Web        Group                    DB
              VM                                  VM                                  VM

          …                                   …                                  …
              Web                                 Web
              VM                                  VM


    Ingress Rule: Allow VMs in Web Security Group access to VMs in DB Security Group on Port 3306
L3 isolation with distributed firewalls
Public     Public IP address                                     Tenant   10.1.0.2
Internet   65.37.141.11                                          1 VM 1
           65.37.141.24                          10.1.0.1
                                      Pod 1 L2                   Tenant   10.1.0.3
           65.37.141.36
                                       Switch                    2 VM 1
           65.37.141.80
                                                                 Tenant   10.1.0.4
                                                                 1 VM 2
                           L3 Core
                                      Pod 2 L2
                                       Switch
                                                 10.1.8.1
                                                             …
                             Load     Pod 3 L2   10.1.16.1
                           Balancer    Switch



                                       …
L3 isolation with distributed firewalls
Public     Public IP address                                     Tenant   10.1.0.2
Internet   65.37.141.11                                          1 VM 1
           65.37.141.24                          10.1.0.1
                                      Pod 1 L2                   Tenant   10.1.0.3
           65.37.141.36
                                       Switch                    2 VM 1
           65.37.141.80
                                                                 Tenant   10.1.0.4
                                                                 1 VM 2
                           L3 Core
                                      Pod 2 L2
                                       Switch
                                                 10.1.8.1
                                                             …
                             Load     Pod 3 L2   10.1.16.1
                           Balancer    Switch



                                       …                         Tenant
                                                                 1 VM 3
                                                                          10.1.16.47

                                                                 Tenant
                                                                          10.1.16.85
                                                                 1 VM 4
L3 isolation with distributed firewalls
Public     Public IP address                                     Tenant   10.1.0.2
Internet   65.37.141.11                                          1 VM 1
           65.37.141.24                          10.1.0.1
                                      Pod 1 L2                   Tenant   10.1.0.3
           65.37.141.36
                                       Switch                    2 VM 1
           65.37.141.80
                                                                 Tenant   10.1.0.4
                                                                 1 VM 2
                           L3 Core
                                      Pod 2 L2
                                       Switch
                                                 10.1.8.1
                                                             …
                                                                 Tenant   10.1.16.12
                             Load     Pod 3 L2   10.1.16.1       2 VM 2
                           Balancer    Switch
                                                                 Tenant
                                                                          10.1.16.21
                                                                 2 VM 3


                                       …                         Tenant
                                                                 1 VM 3
                                                                          10.1.16.47

                                                                 Tenant
                                                                          10.1.16.85
                                                                 1 VM 4
One firewall per
Virtual Machine
One million firewalls?
       VM     VM         VM   VM
       …      …          …             VM
       VM     VM              …        …
                         VM   VM       VM
       VM     VM         VM   VM       VM
       VM     VM         VM   VM
       …      …          …             VM
       VM     VM              …        …
                         VM   VM       VM
       VM     VM         VM   VM       VM
       VM     VM         VM   VM
       …      …          …             VM
       VM     VM              …        …
                         VM   VM       VM
       VM     VM         VM   VM       VM
       VM     VM         VM   VM
       …      …          …             VM
       VM     VM              …        …
                         VM   VM       VM
       VM     VM         VM   VM       VM
       VM     VM         VM   VM
       …      …          …             VM
       VM     VM              …        …
                         VM   VM       VM
       VM     VM         VM   VM       VM
       VM
       …
       VM
       VM
              VM
              …
              VM
              VM
                         VM
                         …
                         VM
                         VM
                              VM
                              …
                              VM
                                   …   VM
                                       …
                                       VM
                              VM       VM
       VM     VM         VM   VM
       …      …          …             VM
       VM     VM              …        …
                         VM   VM       VM
       VM     VM         VM   VM       VM
       VM     VM         VM   VM
       …      …          …             VM
       VM     VM              …        …
                         VM   VM       VM
       VM     VM         VM   VM       VM
Orchestrating hundreds of thousands of firewalls

Well-known software scaling techniques
• Message queues
• Consistency tradeoffs
• Idempotent configuration & retries

CloudStack uses
• Special purpose queues
• Optimized for large security groups
• Eventual consistency for rule updates
Problem: firewall rules explosion in dom0

     Allow Security Group {Web} on TCP port 3060


     -A FORWARD -m tcp –p tcp –dport 3060 –src 10.1.16.31 – j ACCEPT
     -A FORWARD -m tcp –p tcp –dport 3060 –src 10.1.45.112 – j ACCEPT
     -A FORWARD -m tcp –p tcp –dport 3060 –src 10.1.189.5 – j ACCEPT
                               …
     -A FORWARD -m tcp –p tcp –dport 3060 –src 10.21.9.77 – j ACCEPT


      Performance suffers for large security groups
Problem: firewall rules explosion in dom0
Fix with ipsets:
   ipset –N web_sg iptreemap
   ipset –A web_sg 10.1.16.31
   ipset –A web_sg 10.1.16.112
   ipset –A web_sg 10.1.189.5

   ipset –A web_sg 10.21.9.77
              …
   -A FORWARD –p tcp –m tcp –dport 3060 –m set –match-set web_sg src -j ACCEPT



See also http://daemonkeeper.net/781/mass-blocking-ip-addresses-with-ipset/
Security group propagation time
   Seconds to fully synced




                             Number of VMs in security group
Problem: database connection management
• Scale testing resulted in several “too many open
  connections” errors from MySQL
• Common problem: holding open connections while
  doing long-running operations
• Took some code clean up and refactoring
• No longer an issue
   • MySQL supports 10,000 connections
   • CloudStack is far below that
DB connections per MS while deploying 30,000 VMs
                                     5,000
                                             5,000
 Number of DB connections



                            20,000




                                                 Time
Other considerations (beyond control plane)
• Network design and devices
• Object store scalability
• Per-host and cluster scalability
• Storage
• Understand your workload
Future work
• Improve simulator accuracy
• Publish results of advanced network (VLAN) testing
• Verify assumption of VM density not impacting scale
More information and joining the project

Project web site:
http://incubator.apache.org/projects/cloudstack.html

Mailing lists:
cloudstack-dev-subscribe@incubator.apache.org
cloudstack-users-subscribe@incubator.apache.org

Scalability study:
http://wiki.cloudstack.org/pages/viewpage.action?pageId=14320020
Q&A

More Related Content

What's hot

21.10.09 Microsoft Event, Microsoft Presentation
21.10.09 Microsoft Event, Microsoft Presentation21.10.09 Microsoft Event, Microsoft Presentation
21.10.09 Microsoft Event, Microsoft Presentationdataplex systems limited
 
Oracle VM – the coolest virtualizator you’ve ever had
Oracle VM – the coolest virtualizator you’ve ever had Oracle VM – the coolest virtualizator you’ve ever had
Oracle VM – the coolest virtualizator you’ve ever had ORACLE USER GROUP ESTONIA
 
Scalable networking in Apache CloudStack
Scalable networking in Apache CloudStackScalable networking in Apache CloudStack
Scalable networking in Apache CloudStackChiradeep Vittal
 
Scvmm 2012 (maarten wijsman)
Scvmm 2012 (maarten wijsman)Scvmm 2012 (maarten wijsman)
Scvmm 2012 (maarten wijsman)hypervnu
 
Avnet & Rorke Data - Open Compute Summit '13
Avnet & Rorke Data - Open Compute Summit '13Avnet & Rorke Data - Open Compute Summit '13
Avnet & Rorke Data - Open Compute Summit '13DaWane Wanek
 
LinuxCon NA 2012: Virtualization in the cloud featuring xen
LinuxCon NA 2012: Virtualization in the cloud featuring xenLinuxCon NA 2012: Virtualization in the cloud featuring xen
LinuxCon NA 2012: Virtualization in the cloud featuring xenThe Linux Foundation
 
16 August 2012 - SWUG - Hyper-V in Windows 2012
16 August 2012 - SWUG - Hyper-V in Windows 201216 August 2012 - SWUG - Hyper-V in Windows 2012
16 August 2012 - SWUG - Hyper-V in Windows 2012Daniel Mar
 
Scale11x : Virtualization with Xen and XCP
Scale11x : Virtualization with Xen and XCPScale11x : Virtualization with Xen and XCP
Scale11x : Virtualization with Xen and XCPLars Kurth
 
Decisions behind hypervisor selection in CloudStack 4.3
Decisions behind hypervisor selection in CloudStack 4.3Decisions behind hypervisor selection in CloudStack 4.3
Decisions behind hypervisor selection in CloudStack 4.3Tim Mackey
 
Prairie DevCon-What's New in Hyper-V in Windows Server "8" Beta - Part 2
Prairie DevCon-What's New in Hyper-V in Windows Server "8" Beta - Part 2Prairie DevCon-What's New in Hyper-V in Windows Server "8" Beta - Part 2
Prairie DevCon-What's New in Hyper-V in Windows Server "8" Beta - Part 2Damir Bersinic
 
Xen server 6.1 customer presentation
Xen server 6.1 customer presentationXen server 6.1 customer presentation
Xen server 6.1 customer presentationNuno Alves
 
What’s New in vCloud Director 5.1?
What’s New in vCloud Director 5.1?What’s New in vCloud Director 5.1?
What’s New in vCloud Director 5.1?Eric Sloof
 
Tudor Damian - Hyper-V 3.0 overview
Tudor Damian - Hyper-V 3.0 overviewTudor Damian - Hyper-V 3.0 overview
Tudor Damian - Hyper-V 3.0 overviewITCamp
 
Mythbusting goes virtual What's new in vSphere 5.1
Mythbusting goes virtual   What's new in vSphere 5.1Mythbusting goes virtual   What's new in vSphere 5.1
Mythbusting goes virtual What's new in vSphere 5.1Eric Sloof
 
Windows Server 2012 RC Hyper V
Windows Server 2012 RC Hyper VWindows Server 2012 RC Hyper V
Windows Server 2012 RC Hyper VLai Yoong Seng
 

What's hot (20)

21.10.09 Microsoft Event, Microsoft Presentation
21.10.09 Microsoft Event, Microsoft Presentation21.10.09 Microsoft Event, Microsoft Presentation
21.10.09 Microsoft Event, Microsoft Presentation
 
Oracle VM – the coolest virtualizator you’ve ever had
Oracle VM – the coolest virtualizator you’ve ever had Oracle VM – the coolest virtualizator you’ve ever had
Oracle VM – the coolest virtualizator you’ve ever had
 
Scalable networking in Apache CloudStack
Scalable networking in Apache CloudStackScalable networking in Apache CloudStack
Scalable networking in Apache CloudStack
 
Scvmm 2012 (maarten wijsman)
Scvmm 2012 (maarten wijsman)Scvmm 2012 (maarten wijsman)
Scvmm 2012 (maarten wijsman)
 
Avnet & Rorke Data - Open Compute Summit '13
Avnet & Rorke Data - Open Compute Summit '13Avnet & Rorke Data - Open Compute Summit '13
Avnet & Rorke Data - Open Compute Summit '13
 
LinuxCon NA 2012: Virtualization in the cloud featuring xen
LinuxCon NA 2012: Virtualization in the cloud featuring xenLinuxCon NA 2012: Virtualization in the cloud featuring xen
LinuxCon NA 2012: Virtualization in the cloud featuring xen
 
16 August 2012 - SWUG - Hyper-V in Windows 2012
16 August 2012 - SWUG - Hyper-V in Windows 201216 August 2012 - SWUG - Hyper-V in Windows 2012
16 August 2012 - SWUG - Hyper-V in Windows 2012
 
Vmware
VmwareVmware
Vmware
 
Scale11x : Virtualization with Xen and XCP
Scale11x : Virtualization with Xen and XCPScale11x : Virtualization with Xen and XCP
Scale11x : Virtualization with Xen and XCP
 
Decisions behind hypervisor selection in CloudStack 4.3
Decisions behind hypervisor selection in CloudStack 4.3Decisions behind hypervisor selection in CloudStack 4.3
Decisions behind hypervisor selection in CloudStack 4.3
 
CloudStack Hyderabad Meetup: Using CloudStack to build IaaS clouds
CloudStack Hyderabad Meetup: Using CloudStack to build IaaS cloudsCloudStack Hyderabad Meetup: Using CloudStack to build IaaS clouds
CloudStack Hyderabad Meetup: Using CloudStack to build IaaS clouds
 
Prairie DevCon-What's New in Hyper-V in Windows Server "8" Beta - Part 2
Prairie DevCon-What's New in Hyper-V in Windows Server "8" Beta - Part 2Prairie DevCon-What's New in Hyper-V in Windows Server "8" Beta - Part 2
Prairie DevCon-What's New in Hyper-V in Windows Server "8" Beta - Part 2
 
DevCloud and CloudMonkey
DevCloud and CloudMonkeyDevCloud and CloudMonkey
DevCloud and CloudMonkey
 
Xen server 6.1 customer presentation
Xen server 6.1 customer presentationXen server 6.1 customer presentation
Xen server 6.1 customer presentation
 
What’s New in vCloud Director 5.1?
What’s New in vCloud Director 5.1?What’s New in vCloud Director 5.1?
What’s New in vCloud Director 5.1?
 
Tudor Damian - Hyper-V 3.0 overview
Tudor Damian - Hyper-V 3.0 overviewTudor Damian - Hyper-V 3.0 overview
Tudor Damian - Hyper-V 3.0 overview
 
CloudStack and SDN
CloudStack and SDNCloudStack and SDN
CloudStack and SDN
 
Windows Azure
Windows AzureWindows Azure
Windows Azure
 
Mythbusting goes virtual What's new in vSphere 5.1
Mythbusting goes virtual   What's new in vSphere 5.1Mythbusting goes virtual   What's new in vSphere 5.1
Mythbusting goes virtual What's new in vSphere 5.1
 
Windows Server 2012 RC Hyper V
Windows Server 2012 RC Hyper VWindows Server 2012 RC Hyper V
Windows Server 2012 RC Hyper V
 

Viewers also liked

Cello saas scalability architecture
Cello saas scalability architectureCello saas scalability architecture
Cello saas scalability architectureTechcello
 
Webinar How to Achieve True Scalability in SaaS Applications
Webinar How to Achieve True Scalability in SaaS ApplicationsWebinar How to Achieve True Scalability in SaaS Applications
Webinar How to Achieve True Scalability in SaaS ApplicationsTechcello
 
Webinar Series Part 2 -Recipe for a Successful SaaS Company - Migrating Sing...
Webinar Series Part 2 -Recipe for a Successful SaaS Company -  Migrating Sing...Webinar Series Part 2 -Recipe for a Successful SaaS Company -  Migrating Sing...
Webinar Series Part 2 -Recipe for a Successful SaaS Company - Migrating Sing...Techcello
 
Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...Impetus Technologies
 
SaaS Introduction-May2014
SaaS Introduction-May2014SaaS Introduction-May2014
SaaS Introduction-May2014Nguyen Tung
 
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarFuture-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarImpetus Technologies
 

Viewers also liked (6)

Cello saas scalability architecture
Cello saas scalability architectureCello saas scalability architecture
Cello saas scalability architecture
 
Webinar How to Achieve True Scalability in SaaS Applications
Webinar How to Achieve True Scalability in SaaS ApplicationsWebinar How to Achieve True Scalability in SaaS Applications
Webinar How to Achieve True Scalability in SaaS Applications
 
Webinar Series Part 2 -Recipe for a Successful SaaS Company - Migrating Sing...
Webinar Series Part 2 -Recipe for a Successful SaaS Company -  Migrating Sing...Webinar Series Part 2 -Recipe for a Successful SaaS Company -  Migrating Sing...
Webinar Series Part 2 -Recipe for a Successful SaaS Company - Migrating Sing...
 
Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
 
SaaS Introduction-May2014
SaaS Introduction-May2014SaaS Introduction-May2014
SaaS Introduction-May2014
 
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarFuture-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
 

Similar to 5 scalability Cloudstack Developer Day

CloudStack Architecture Future
CloudStack Architecture FutureCloudStack Architecture Future
CloudStack Architecture FutureKimihiko Kitase
 
Cloud stack overview
Cloud stack overviewCloud stack overview
Cloud stack overviewgavin_lee
 
CloudStack Intro NYC
CloudStack Intro NYCCloudStack Intro NYC
CloudStack Intro NYCke4qqq
 
10 Minute Overview of Apache CloudStack
10 Minute Overview of Apache CloudStack10 Minute Overview of Apache CloudStack
10 Minute Overview of Apache CloudStackke4qqq
 
1 Introduction at CloudStack Developer Day
1 Introduction at CloudStack Developer Day 1 Introduction at CloudStack Developer Day
1 Introduction at CloudStack Developer Day Kimihiko Kitase
 
What is cloud computing
What is cloud computingWhat is cloud computing
What is cloud computingBrian Bullard
 
2012 CloudStack Design Camp in Taiwan--- CloudStack Overview-1
2012 CloudStack Design Camp in Taiwan--- CloudStack Overview-12012 CloudStack Design Camp in Taiwan--- CloudStack Overview-1
2012 CloudStack Design Camp in Taiwan--- CloudStack Overview-1tcloudcomputing-tw
 
CloudStack vs OpenStack vs Eucalyptus: IaaS Private Cloud Brief Comparison
CloudStack vs OpenStack vs Eucalyptus: IaaS Private Cloud Brief ComparisonCloudStack vs OpenStack vs Eucalyptus: IaaS Private Cloud Brief Comparison
CloudStack vs OpenStack vs Eucalyptus: IaaS Private Cloud Brief Comparisonbizalgo
 
System Center Virtual Machine Manager 2008 R2
System Center Virtual Machine Manager 2008 R2System Center Virtual Machine Manager 2008 R2
System Center Virtual Machine Manager 2008 R2aralves
 
Virtualization: Hyper-V, VMM, App-V and MED-V.
Virtualization: Hyper-V, VMM, App-V and MED-V.Virtualization: Hyper-V, VMM, App-V and MED-V.
Virtualization: Hyper-V, VMM, App-V and MED-V.Microsoft Iceland
 
OpenStack Boston User Group, OpenStack overview
OpenStack Boston User Group, OpenStack overviewOpenStack Boston User Group, OpenStack overview
OpenStack Boston User Group, OpenStack overviewOpen Stack
 
EMEA OpenStack Day Intro, July 13th 2011 in London
EMEA OpenStack Day Intro, July 13th 2011 in LondonEMEA OpenStack Day Intro, July 13th 2011 in London
EMEA OpenStack Day Intro, July 13th 2011 in LondonMark Collier
 
Integrate 3rd party security solution into CloudStack
Integrate 3rd party security solution into CloudStackIntegrate 3rd party security solution into CloudStack
Integrate 3rd party security solution into CloudStackmice_xia
 

Similar to 5 scalability Cloudstack Developer Day (20)

CloudStack Architecture Future
CloudStack Architecture FutureCloudStack Architecture Future
CloudStack Architecture Future
 
CloudStack Architecture
CloudStack ArchitectureCloudStack Architecture
CloudStack Architecture
 
Cloud stack overview
Cloud stack overviewCloud stack overview
Cloud stack overview
 
Xen and Apache cloudstack
Xen and Apache cloudstack  Xen and Apache cloudstack
Xen and Apache cloudstack
 
CloudStack Intro NYC
CloudStack Intro NYCCloudStack Intro NYC
CloudStack Intro NYC
 
10 Minute Overview of Apache CloudStack
10 Minute Overview of Apache CloudStack10 Minute Overview of Apache CloudStack
10 Minute Overview of Apache CloudStack
 
1 Introduction at CloudStack Developer Day
1 Introduction at CloudStack Developer Day 1 Introduction at CloudStack Developer Day
1 Introduction at CloudStack Developer Day
 
CloudStack technical overview
CloudStack technical overviewCloudStack technical overview
CloudStack technical overview
 
Eucalyptus 3 Product Overview
Eucalyptus 3 Product OverviewEucalyptus 3 Product Overview
Eucalyptus 3 Product Overview
 
What is cloud computing
What is cloud computingWhat is cloud computing
What is cloud computing
 
2012 CloudStack Design Camp in Taiwan--- CloudStack Overview-1
2012 CloudStack Design Camp in Taiwan--- CloudStack Overview-12012 CloudStack Design Camp in Taiwan--- CloudStack Overview-1
2012 CloudStack Design Camp in Taiwan--- CloudStack Overview-1
 
Eucalyptus 3 Product Overview
Eucalyptus 3 Product OverviewEucalyptus 3 Product Overview
Eucalyptus 3 Product Overview
 
CloudStack vs OpenStack vs Eucalyptus: IaaS Private Cloud Brief Comparison
CloudStack vs OpenStack vs Eucalyptus: IaaS Private Cloud Brief ComparisonCloudStack vs OpenStack vs Eucalyptus: IaaS Private Cloud Brief Comparison
CloudStack vs OpenStack vs Eucalyptus: IaaS Private Cloud Brief Comparison
 
System Center Virtual Machine Manager 2008 R2
System Center Virtual Machine Manager 2008 R2System Center Virtual Machine Manager 2008 R2
System Center Virtual Machine Manager 2008 R2
 
Improvements in Failover Clustering in Windows Server 2012
Improvements in Failover Clustering in Windows Server 2012Improvements in Failover Clustering in Windows Server 2012
Improvements in Failover Clustering in Windows Server 2012
 
Virtualization: Hyper-V, VMM, App-V and MED-V.
Virtualization: Hyper-V, VMM, App-V and MED-V.Virtualization: Hyper-V, VMM, App-V and MED-V.
Virtualization: Hyper-V, VMM, App-V and MED-V.
 
Network Management in System Center 2012 SP1 - VMM
Network Management in System Center 2012  SP1 - VMM Network Management in System Center 2012  SP1 - VMM
Network Management in System Center 2012 SP1 - VMM
 
OpenStack Boston User Group, OpenStack overview
OpenStack Boston User Group, OpenStack overviewOpenStack Boston User Group, OpenStack overview
OpenStack Boston User Group, OpenStack overview
 
EMEA OpenStack Day Intro, July 13th 2011 in London
EMEA OpenStack Day Intro, July 13th 2011 in LondonEMEA OpenStack Day Intro, July 13th 2011 in London
EMEA OpenStack Day Intro, July 13th 2011 in London
 
Integrate 3rd party security solution into CloudStack
Integrate 3rd party security solution into CloudStackIntegrate 3rd party security solution into CloudStack
Integrate 3rd party security solution into CloudStack
 

More from Kimihiko Kitase

ライトプランで利用可能な分析基盤「IBM Analytics Engine (IAE)」とは
ライトプランで利用可能な分析基盤「IBM Analytics Engine (IAE)」とはライトプランで利用可能な分析基盤「IBM Analytics Engine (IAE)」とは
ライトプランで利用可能な分析基盤「IBM Analytics Engine (IAE)」とはKimihiko Kitase
 
クラウドにおけるビッグデータ分析環境
クラウドにおけるビッグデータ分析環境クラウドにおけるビッグデータ分析環境
クラウドにおけるビッグデータ分析環境Kimihiko Kitase
 
最新事例から学ぶビッグデータの活用法 #ocif16 #hortonworks
最新事例から学ぶビッグデータの活用法 #ocif16 #hortonworks最新事例から学ぶビッグデータの活用法 #ocif16 #hortonworks
最新事例から学ぶビッグデータの活用法 #ocif16 #hortonworksKimihiko Kitase
 
Hortonworksが提供する データ活用方法の紹介
Hortonworksが提供する データ活用方法の紹介Hortonworksが提供する データ活用方法の紹介
Hortonworksが提供する データ活用方法の紹介Kimihiko Kitase
 
Hadoop Summit 2016 San Jose レポート
Hadoop Summit 2016  San Jose レポートHadoop Summit 2016  San Jose レポート
Hadoop Summit 2016 San Jose レポートKimihiko Kitase
 
SoftLayer Bluemix Community Festa 2016 Program Guide
SoftLayer Bluemix Community Festa 2016 Program GuideSoftLayer Bluemix Community Festa 2016 Program Guide
SoftLayer Bluemix Community Festa 2016 Program GuideKimihiko Kitase
 
2016年冬 IBMクラウド最新動向と概要
2016年冬 IBMクラウド最新動向と概要2016年冬 IBMクラウド最新動向と概要
2016年冬 IBMクラウド最新動向と概要Kimihiko Kitase
 
2016年冬 IBMクラウド最新動向
2016年冬 IBMクラウド最新動向2016年冬 IBMクラウド最新動向
2016年冬 IBMクラウド最新動向Kimihiko Kitase
 
クラウドを活用した システム開発は適材適所
クラウドを活用したシステム開発は適材適所クラウドを活用したシステム開発は適材適所
クラウドを活用した システム開発は適材適所Kimihiko Kitase
 
ホスティッドプライベートクラウド勉強会 ~Azure Pack on SoftLayer ~
ホスティッドプライベートクラウド勉強会 ~Azure Pack on SoftLayer ~ホスティッドプライベートクラウド勉強会 ~Azure Pack on SoftLayer ~
ホスティッドプライベートクラウド勉強会 ~Azure Pack on SoftLayer ~Kimihiko Kitase
 
話題のNode-REDでIoTアプリを作ってみよう
話題のNode-REDでIoTアプリを作ってみよう話題のNode-REDでIoTアプリを作ってみよう
話題のNode-REDでIoTアプリを作ってみようKimihiko Kitase
 
SoftLayer最新動向と賢い利用方法
SoftLayer最新動向と賢い利用方法 SoftLayer最新動向と賢い利用方法
SoftLayer最新動向と賢い利用方法 Kimihiko Kitase
 
SoftLayer Bluemix Summit 2015 Flyer
SoftLayer Bluemix Summit 2015 FlyerSoftLayer Bluemix Summit 2015 Flyer
SoftLayer Bluemix Summit 2015 FlyerKimihiko Kitase
 
OSC15 Okinawa Intro SoftLayer and Bluemix
OSC15 Okinawa Intro SoftLayer and BluemixOSC15 Okinawa Intro SoftLayer and Bluemix
OSC15 Okinawa Intro SoftLayer and BluemixKimihiko Kitase
 
Introduction of public cloud softlayer and bluemix
Introduction of public cloud softlayer and bluemixIntroduction of public cloud softlayer and bluemix
Introduction of public cloud softlayer and bluemixKimihiko Kitase
 
SoftLayer Bluemix Summit 2015
SoftLayer Bluemix Summit 2015SoftLayer Bluemix Summit 2015
SoftLayer Bluemix Summit 2015Kimihiko Kitase
 
クラウドに構築したWebサイトのセキュリティ対策やグローバル展開について
クラウドに構築したWebサイトのセキュリティ対策やグローバル展開についてクラウドに構築したWebサイトのセキュリティ対策やグローバル展開について
クラウドに構築したWebサイトのセキュリティ対策やグローバル展開についてKimihiko Kitase
 
Introduction softlayer and bluemix
Introduction softlayer and bluemixIntroduction softlayer and bluemix
Introduction softlayer and bluemixKimihiko Kitase
 

More from Kimihiko Kitase (20)

ライトプランで利用可能な分析基盤「IBM Analytics Engine (IAE)」とは
ライトプランで利用可能な分析基盤「IBM Analytics Engine (IAE)」とはライトプランで利用可能な分析基盤「IBM Analytics Engine (IAE)」とは
ライトプランで利用可能な分析基盤「IBM Analytics Engine (IAE)」とは
 
クラウドにおけるビッグデータ分析環境
クラウドにおけるビッグデータ分析環境クラウドにおけるビッグデータ分析環境
クラウドにおけるビッグデータ分析環境
 
最新事例から学ぶビッグデータの活用法 #ocif16 #hortonworks
最新事例から学ぶビッグデータの活用法 #ocif16 #hortonworks最新事例から学ぶビッグデータの活用法 #ocif16 #hortonworks
最新事例から学ぶビッグデータの活用法 #ocif16 #hortonworks
 
Hortonworksが提供する データ活用方法の紹介
Hortonworksが提供する データ活用方法の紹介Hortonworksが提供する データ活用方法の紹介
Hortonworksが提供する データ活用方法の紹介
 
Hadoop Summit 2016 San Jose レポート
Hadoop Summit 2016  San Jose レポートHadoop Summit 2016  San Jose レポート
Hadoop Summit 2016 San Jose レポート
 
SoftLayer Bluemix Community Festa 2016 Program Guide
SoftLayer Bluemix Community Festa 2016 Program GuideSoftLayer Bluemix Community Festa 2016 Program Guide
SoftLayer Bluemix Community Festa 2016 Program Guide
 
2016年冬 IBMクラウド最新動向と概要
2016年冬 IBMクラウド最新動向と概要2016年冬 IBMクラウド最新動向と概要
2016年冬 IBMクラウド最新動向と概要
 
2016年冬 IBMクラウド最新動向
2016年冬 IBMクラウド最新動向2016年冬 IBMクラウド最新動向
2016年冬 IBMクラウド最新動向
 
クラウドを活用した システム開発は適材適所
クラウドを活用したシステム開発は適材適所クラウドを活用したシステム開発は適材適所
クラウドを活用した システム開発は適材適所
 
Try IoT with Node-RED
Try IoT with Node-REDTry IoT with Node-RED
Try IoT with Node-RED
 
ホスティッドプライベートクラウド勉強会 ~Azure Pack on SoftLayer ~
ホスティッドプライベートクラウド勉強会 ~Azure Pack on SoftLayer ~ホスティッドプライベートクラウド勉強会 ~Azure Pack on SoftLayer ~
ホスティッドプライベートクラウド勉強会 ~Azure Pack on SoftLayer ~
 
話題のNode-REDでIoTアプリを作ってみよう
話題のNode-REDでIoTアプリを作ってみよう話題のNode-REDでIoTアプリを作ってみよう
話題のNode-REDでIoTアプリを作ってみよう
 
SoftLayer最新動向と賢い利用方法
SoftLayer最新動向と賢い利用方法 SoftLayer最新動向と賢い利用方法
SoftLayer最新動向と賢い利用方法
 
SoftLayer Bluemix Intro
SoftLayer Bluemix IntroSoftLayer Bluemix Intro
SoftLayer Bluemix Intro
 
SoftLayer Bluemix Summit 2015 Flyer
SoftLayer Bluemix Summit 2015 FlyerSoftLayer Bluemix Summit 2015 Flyer
SoftLayer Bluemix Summit 2015 Flyer
 
OSC15 Okinawa Intro SoftLayer and Bluemix
OSC15 Okinawa Intro SoftLayer and BluemixOSC15 Okinawa Intro SoftLayer and Bluemix
OSC15 Okinawa Intro SoftLayer and Bluemix
 
Introduction of public cloud softlayer and bluemix
Introduction of public cloud softlayer and bluemixIntroduction of public cloud softlayer and bluemix
Introduction of public cloud softlayer and bluemix
 
SoftLayer Bluemix Summit 2015
SoftLayer Bluemix Summit 2015SoftLayer Bluemix Summit 2015
SoftLayer Bluemix Summit 2015
 
クラウドに構築したWebサイトのセキュリティ対策やグローバル展開について
クラウドに構築したWebサイトのセキュリティ対策やグローバル展開についてクラウドに構築したWebサイトのセキュリティ対策やグローバル展開について
クラウドに構築したWebサイトのセキュリティ対策やグローバル展開について
 
Introduction softlayer and bluemix
Introduction softlayer and bluemixIntroduction softlayer and bluemix
Introduction softlayer and bluemix
 

Recently uploaded

SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 

Recently uploaded (20)

SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 

5 scalability Cloudstack Developer Day

  • 2. Apache CloudStack: a project in incubation • Secure, multi-tenant cloud orchestration platform – Turnkey platform for delivering IaaS clouds – Hypervisor agnostic – Highly scalable, secure and open – Complete Self-service portal – Open source, open standards – Deploys on premise
  • 3. Manage hosts, create VMs, virtual disks, virtual Admin networks, meter usage, …. Internet Management Server Cluster Primary Router MySQL Backup Load Balancer MySQL L3 Core Switch Top of Rack Switch Object Storage Servers … … … … … Availability Zone 1 Pod 1 Pod 2 Pod 3 Pod N
  • 4. Thinking about cloud orchestration at scale • Host management • Capacity management • What host to use to deploy a new VM • Failure handling • Security group propagation • Set a goal
  • 5. We can’t afford this as our QA lab
  • 6. Simulator enables scale testing Mgmt. Server Zone User API MySQL Simulator Load Mgmt. Balancer Server Admin API Mgmt. MySQL Server Mgmt. Server
  • 7. Environment 2 cores, 4 with Hyper Threading. 2.2 GHz Xeon. Mgmt. 16 GB RAM. 12 GB JVM Server Heap. Zone Single spinning disk, later MySQL User API singleSimulator GB RAM. SSD. 32 Load Mgmt. MySQL 5.5. Balancer Server Admin API Mgmt. MySQL Server Mgmt. Server
  • 8.
  • 9. Allocator performance is awful with 1000 hosts • Two minutes to decide which host to use for a new VM! • Computing capacity for every pod repeatedly • Fixed that, but still 12 seconds to decide • Use host tags, down to 2 seconds • Major changes required to improve further • In 2.2.0, store capacity info in DB, skip pod altogether • Harness the power of SQL select and all is well
  • 10. Polling doesn’t scale TRUE? FALSE? Sometimes, it is good enough
  • 11. Host management • Check host state via TCP connection • Check every minute • 30,000 checks per minute, 500 per second • But they take 10 seconds, so 5000 in parallel • Not using async I/O so 5000 threads required… • Single JVM can support 2000+ threads so this is concerning but may not be the limiting factor
  • 12. Host management • What is the maximum feasible JVM heap size? • Some people use heaps with hundreds of GB • Commercial tools can help, but cost • We decided to stay below 20 GB (GC concerns) • How much CPU is required for background processing?
  • 13. CPU utilization while deploying 30,000 VMs on 30,000 hosts CPU Utilization. 400% is maximum 20,000 5000 5000 Idle Time
  • 14. Deploy time from 25,000 to 30,000 VMs Seconds to deploy VM number: 25,000 plus X
  • 15. Problem: agent load balancing Mgmt Mgmt • Management servers Server 1 Server 2 start/stop/fail/crash • How do newly started Management Servers get agents / work? • When a Management Server exits, how do others pick up its load? • When new hosts are added how is the load distributed?
  • 16. Common use case timings at scale • 30,000 hosts and 4 Management Servers • 4 Management Servers running, 1 fails: 10 minutes to redistribute 7500 agents • 3 Management Servers running, add a fourth: 40 minutes to redistribute load evenly IMPORTANT • 0 Management Servers running, start all 4 simultaneously: 16 minutes to connect to all 30,000 hosts
  • 17. Understanding security groups Web DB Web VM VM VM Web DB Security Security Web Group Web Group DB VM VM VM … … … Web Web VM VM Ingress Rule: Allow VMs in Web Security Group access to VMs in DB Security Group on Port 3306
  • 18. L3 isolation with distributed firewalls Public Public IP address Tenant 10.1.0.2 Internet 65.37.141.11 1 VM 1 65.37.141.24 10.1.0.1 Pod 1 L2 Tenant 10.1.0.3 65.37.141.36 Switch 2 VM 1 65.37.141.80 Tenant 10.1.0.4 1 VM 2 L3 Core Pod 2 L2 Switch 10.1.8.1 … Load Pod 3 L2 10.1.16.1 Balancer Switch …
  • 19. L3 isolation with distributed firewalls Public Public IP address Tenant 10.1.0.2 Internet 65.37.141.11 1 VM 1 65.37.141.24 10.1.0.1 Pod 1 L2 Tenant 10.1.0.3 65.37.141.36 Switch 2 VM 1 65.37.141.80 Tenant 10.1.0.4 1 VM 2 L3 Core Pod 2 L2 Switch 10.1.8.1 … Load Pod 3 L2 10.1.16.1 Balancer Switch … Tenant 1 VM 3 10.1.16.47 Tenant 10.1.16.85 1 VM 4
  • 20. L3 isolation with distributed firewalls Public Public IP address Tenant 10.1.0.2 Internet 65.37.141.11 1 VM 1 65.37.141.24 10.1.0.1 Pod 1 L2 Tenant 10.1.0.3 65.37.141.36 Switch 2 VM 1 65.37.141.80 Tenant 10.1.0.4 1 VM 2 L3 Core Pod 2 L2 Switch 10.1.8.1 … Tenant 10.1.16.12 Load Pod 3 L2 10.1.16.1 2 VM 2 Balancer Switch Tenant 10.1.16.21 2 VM 3 … Tenant 1 VM 3 10.1.16.47 Tenant 10.1.16.85 1 VM 4
  • 22. One million firewalls? VM VM VM VM … … … VM VM VM … … VM VM VM VM VM VM VM VM VM VM VM VM … … … VM VM VM … … VM VM VM VM VM VM VM VM VM VM VM VM … … … VM VM VM … … VM VM VM VM VM VM VM VM VM VM VM VM … … … VM VM VM … … VM VM VM VM VM VM VM VM VM VM VM VM … … … VM VM VM … … VM VM VM VM VM VM VM VM VM … VM VM VM … VM VM VM … VM VM VM … VM … VM … VM VM VM VM VM VM VM … … … VM VM VM … … VM VM VM VM VM VM VM VM VM VM VM VM … … … VM VM VM … … VM VM VM VM VM VM VM VM
  • 23. Orchestrating hundreds of thousands of firewalls Well-known software scaling techniques • Message queues • Consistency tradeoffs • Idempotent configuration & retries CloudStack uses • Special purpose queues • Optimized for large security groups • Eventual consistency for rule updates
  • 24. Problem: firewall rules explosion in dom0 Allow Security Group {Web} on TCP port 3060 -A FORWARD -m tcp –p tcp –dport 3060 –src 10.1.16.31 – j ACCEPT -A FORWARD -m tcp –p tcp –dport 3060 –src 10.1.45.112 – j ACCEPT -A FORWARD -m tcp –p tcp –dport 3060 –src 10.1.189.5 – j ACCEPT … -A FORWARD -m tcp –p tcp –dport 3060 –src 10.21.9.77 – j ACCEPT Performance suffers for large security groups
  • 25. Problem: firewall rules explosion in dom0 Fix with ipsets: ipset –N web_sg iptreemap ipset –A web_sg 10.1.16.31 ipset –A web_sg 10.1.16.112 ipset –A web_sg 10.1.189.5 ipset –A web_sg 10.21.9.77 … -A FORWARD –p tcp –m tcp –dport 3060 –m set –match-set web_sg src -j ACCEPT See also http://daemonkeeper.net/781/mass-blocking-ip-addresses-with-ipset/
  • 26. Security group propagation time Seconds to fully synced Number of VMs in security group
  • 27. Problem: database connection management • Scale testing resulted in several “too many open connections” errors from MySQL • Common problem: holding open connections while doing long-running operations • Took some code clean up and refactoring • No longer an issue • MySQL supports 10,000 connections • CloudStack is far below that
  • 28. DB connections per MS while deploying 30,000 VMs 5,000 5,000 Number of DB connections 20,000 Time
  • 29. Other considerations (beyond control plane) • Network design and devices • Object store scalability • Per-host and cluster scalability • Storage • Understand your workload
  • 30. Future work • Improve simulator accuracy • Publish results of advanced network (VLAN) testing • Verify assumption of VM density not impacting scale
  • 31. More information and joining the project Project web site: http://incubator.apache.org/projects/cloudstack.html Mailing lists: cloudstack-dev-subscribe@incubator.apache.org cloudstack-users-subscribe@incubator.apache.org Scalability study: http://wiki.cloudstack.org/pages/viewpage.action?pageId=14320020
  • 32. Q&A