Contrail at AllegroGroup
Plan / Prepare / Production (3P)
History
➢ What we had:
○ Two separate cloud environments (Essex and Havana)
○ Floating IPs in Essex and a flat network with VLANs in Havana
○ Network complexity in Havana
○ Network performance problems in Havana
Goals (Plan)
➢ Easy to maintain and grow
➢ Network simplicity
➢ Network isolation for tenants
➢ Floating IP and flat network
➢ New region in new DC
Fabric (Prepare - 2 weeks)
➢ Easy and fast deployment (a couple of corrections in the Fabric scripts); we used version 1.20 at that time
➢ Environment ready for testing (adding a new HV from “any” location in the server room)
➢ Basic performance tests, LBaaS
Where to find quick implementation tools:
http://www.opencontrail.org/opencontrail-quick-start-guide/
Our way
➢ Our own Puppet manifests, based on the available ones
➢ Reasons:
○ Existing infrastructure
○ Customized deployment
○ More work at the beginning, fewer problems later
○ Easy procedure to add hosts (compute nodes, controllers)
○ Building a new region in the near future
Implementation
➢ We had everything prepared for version 1.20, and then we got the 2.01 production version (what to do?!)
➢ Environment deployment (OpenStack with Contrail, one region: 2 CC, 3 CoC, 50 HV); during the DC migration the number of compute nodes keeps increasing, target 250
➢ Move tenants/users/quotas from the old environment to the new one
○ we used a Keystone server built from scratch, upgraded it, then pumped the data into Icehouse/Contrail (issue: missing users); quotas were migrated as SQL tables; exporter.py (a script to merge users/tenants/quotas) between regions (target for the future: one Keystone)
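The slides do not show how exporter.py merges the data. A minimal sketch of one way such a merge could work, assuming each region's export is a plain dict of tenants, users, and quotas keyed by name; the function name `merge_regions` and the data layout are hypothetical and are not taken from the actual script:

```python
from copy import deepcopy


def merge_regions(old, new):
    """Merge tenants, users and quotas from `old` into a copy of `new`.

    Hypothetical layout: {"tenants": {name: {...}}, "users": {name: {...}},
    "quotas": {tenant_name: {resource: limit}}}.
    Entries already present in `new` win; missing ones are copied from `old`.
    Returns (merged, missing_users) so the copied users can be audited.
    """
    merged = deepcopy(new)
    for section in ("tenants", "users", "quotas"):
        merged.setdefault(section, {})
        for name, record in old.get(section, {}).items():
            merged[section].setdefault(name, deepcopy(record))
    # Users that existed only in the old region: the "missing users" case.
    missing_users = sorted(set(old.get("users", {})) - set(new.get("users", {})))
    return merged, missing_users


old_region = {
    "tenants": {"shop": {"enabled": True}},
    "users": {"alice": {"tenant": "shop"}, "bob": {"tenant": "shop"}},
    "quotas": {"shop": {"instances": 20}},
}
new_region = {
    "tenants": {"shop": {"enabled": True}},
    "users": {"alice": {"tenant": "shop"}},
    "quotas": {},
}

merged, missing = merge_regions(old_region, new_region)
print(missing)  # users that had to be copied from the old region
```

Returning the list of users found only in the old region makes the "missing users" issue visible instead of silently patching it.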
Implementation
➢ DNS: we use Designate with two handlers (one for floating IPs, a second one for fixed routable IPs)
➢ Required image modifications (target: automated builds with Ansible)
➢ Two days before production we updated to the latest available packages from release 2.01
➢ Breaking the environment
➢ Clients on the new environment
Results
➢ 500 VMs spawned simultaneously
➢ Network performance
Problems
➢ Cassandra (we increased the number of nodes from 3 to 5), configuration tuning (TTLs in contrail-collector.conf), compaction throughput, migration of Cassandra data to RAID 0 SSD disks
➢ Open files issue (user, supervisor, init)
➢ Collector was flooded by data from compute nodes
iptables -A OUTPUT -p tcp --dport 8086 -m string --algo bm --string "flowuuid" -j DROP
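For the open files issue, the limit has to be raised at every layer that starts the process (user, supervisor, init), so it helps to check what a running process actually received. A minimal sketch, assuming a Linux host where `/proc/<pid>/limits` is readable; the helper name `max_open_files` is hypothetical:

```python
def max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    Values are ints, or None when the kernel reports 'unlimited'.
    """
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            soft, hard = line[len(prefix):].split()[:2]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' row found")


# Sample text in the format Linux uses for /proc/<pid>/limits.
sample = (
    "Limit                     Soft Limit           Hard Limit           Units\n"
    "Max processes             63450                63450                processes\n"
    "Max open files            1024                 4096                 files\n"
)
print(max_open_files(sample))
```

On a live host, the same function can be fed the contents of `/proc/<pid>/limits` for the process of interest, which shows whether a limit set at one layer was overridden by another.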
Problems
➢ When 500K flows are not enough (vr_flow_entries, vr_oflow_entries)
➢ Flow on Hold issue
➢ vRouter CPU consumption too high compared to the VM (TBB_THREAD_COUNT in /etc/contrail/supervisord_vrouter.conf)
Problems
➢ Rebuild instance: the interface was deleted after the VM was respawned
➢ Lack of support for Ironic; we will build a region for Ironic
➢ Disabled tenant: not able to log in to the Contrail UI (Keystone 2.0)
➢ Tuning of configuration files required
➢ Metadata packages not sent in one session
➢ RBAC for the Contrail UI
Environment expansion and further plans
➢ 1350 VMs on 150 HVs in one DC at the moment
➢ Second region on its way
➢ 250-300 HVs per region
➢ Migration from Essex and Havana
➢ OpenStack and Contrail upgrades
Q/A?
Thank you!
Check us: allegrotech.io
Join us: kariera.allegro.pl
Twitter: allegrotechblog
e-commerce full of technology
