Half a year with Contrail at Production
krzysztof.kowalik@allegrogroup.com
michal.dopierala@allegrogroup.com
Who are we?
➢ 57% of Polish e-commerce
➢ 14,000,000 users
TEAM - scope
➢ We have provided cloud/IaaS services to internal customers since 2011
➢ We support dev/test/prod environments for 50 developer teams in private
and public cloud
➢ As part of another team, we provide a deployment platform (Mesos/Marathon)
for the Allegro platform
What have we done since the last OCUG?
➢ New region PL-KRA with ~100 computes
➢ Test environment with MX10 as gateway
➢ Migration of client VMs from Essex to Icehouse with Contrail
➢ ~100 new computes in PL-POZ
➢ Problem solving
Numbers in Contrail
[Table with columns: PL-POZ | PL-KRA | TEST]
Issues
Issues - memory leak
Fixed at build 44
SQL, nova.instance_faults (message):
internal error: process exited while connecting to monitor: Cannot set up guest memory 'pc.ram': Cannot allocate memory
/etc/qemu-ifdown: could not launch network script
At the hypervisor:
Jun 14 15:42:31 host kernel: [8026284.365930] lowmem_reserve[]: 0 0 0 0
Jun 14 15:42:31 host kernel: [8026284.365988] 0 pages HighMem/MovableOnly
Jun 14 15:42:31 host kernel: [8026284.366178] Out of memory: Kill process 140217 (qemu-system-x86) score 16 or sacrifice child
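To spot hypervisors hit by this symptom, the kernel log can be scanned for OOM kills. The following is a minimal sketch, not part of the original deck: the function name `parse_oom_kills` and the regular expression are assumptions, written only against the log format shown on this slide.

```python
import re

# Pattern derived from the kernel message shown on the slide (assumed stable
# for this kernel version only).
OOM_KILL = re.compile(
    r"Out of memory: Kill process (?P<pid>\d+) \((?P<name>[^)]+)\) score (?P<score>\d+)"
)


def parse_oom_kills(lines):
    """Return a list of (pid, process name, score) for every OOM kill found."""
    kills = []
    for line in lines:
        match = OOM_KILL.search(line)
        if match:
            kills.append(
                (int(match.group("pid")), match.group("name"), int(match.group("score")))
            )
    return kills


sample = [
    "Jun 14 15:42:31 host kernel: [8026284.365930] lowmem_reserve[]: 0 0 0 0",
    "Jun 14 15:42:31 host kernel: [8026284.366178] Out of memory: Kill process "
    "140217 (qemu-system-x86) score 16 or sacrifice child",
]
print(parse_oom_kills(sample))  # [(140217, 'qemu-system-x86', 16)]
```

A killed `qemu-system-x86` process means a guest VM was sacrificed, which matches the faults recorded in nova.instance_faults.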
Issues - Munin can generate latency
Debug: Exec[munin-node-configure](provider=posix): Executing check 'munin-node-configure --shell | grep ln'
Debug: Executing 'munin-node-configure --shell | grep ln'
Debug: /Stage[main]/Munin::Common/Exec[munin-node-configure]/onlyif: # The following plugins caused errors:
Debug: /Stage[main]/Munin::Common/Exec[munin-node-configure]/onlyif: # ip_:
Debug: /Stage[main]/Munin::Common/Exec[munin-node-configure]/onlyif: # Nothing printed to stdout
Debug: /Stage[main]/Munin::Common/Exec[munin-node-configure]/onlyif: # No valid suggestions
Debug: /Stage[main]/Munin::Common/Exec[munin-node-configure]/onlyif: # mysql_:
Debug: /Stage[main]/Munin::Common/Exec[munin-node-configure]/onlyif: # Non-zero exit during autoconf (2)
Issues - flows re-evaluation
Problem
➢ When a flow is created, Contrail keeps track of the routes that can
potentially modify the flow action. During VM spawning, contrail-vrouter-agent
also re-evaluates flows that are not affected by the new route
Symptom
➢ high latency to all VMs within the affected subnet
Solution
➢ custom build (build 70 of R2.02, or commit
6992575a03a08f703edb8f0a7622a457dbdbdeee) in which the agent re-evaluates
a flow only if the flow is affected by the new route
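The idea behind the fix can be illustrated with a simplified longest-prefix-match model. This is only a sketch under stated assumptions, not the contrail-vrouter-agent code: the `Flow` class, its fields and `flows_to_reevaluate` are hypothetical names, and a new route is treated as relevant only when it covers the flow's destination and is at least as specific as the route the flow already uses.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Flow:
    dst_ip: str        # destination address of the flow
    route_prefix: str  # route currently used by the flow (hypothetical field)


def flows_to_reevaluate(flows, new_route):
    """Return only the flows whose action could change because of new_route."""
    new_net = ipaddress.ip_network(new_route)
    affected = []
    for flow in flows:
        current = ipaddress.ip_network(flow.route_prefix)
        covers_dst = ipaddress.ip_address(flow.dst_ip) in new_net
        # A new route matters only if it covers the destination and is at
        # least as specific as the route already in use.
        if covers_dst and new_net.prefixlen >= current.prefixlen:
            affected.append(flow)
    return affected


flows = [
    Flow("10.0.1.5", "10.0.0.0/16"),
    Flow("10.0.2.7", "10.0.0.0/16"),
    Flow("192.168.1.1", "192.168.1.0/24"),
]
# Only the first flow is covered by the more specific 10.0.1.0/24 route.
print(flows_to_reevaluate(flows, "10.0.1.0/24"))
```

Re-evaluating every flow on every new route, as in the problem case, would walk the whole list each time a VM is spawned.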
Issues - duplicated IP
➢ A duplicated IP triggers multi-NH (next hop) updates on many computes,
which causes latency
A script that helps us:
#!/usr/bin/env python3
# List every VM interface known to the analytics API, one per line,
# so that duplicated IP addresses can be spotted.
import sys

import requests

cc_ip = sys.argv[1]  # first argument: IP of the node serving the analytics API
r = requests.get('http://%s:9081/analytics/virtual-machines' % cc_ip)
for href_vm in r.json():
    r_vm = requests.get(href_vm['href'])
    vm = r_vm.json()['UveVirtualMachineAgent']
    try:
        for iface in vm['interface_list']:
            print(vm['uuid'], iface['active'], iface['ip_address'],
                  iface['mac_address'], iface['virtual_network'], iface['uuid'])
    except Exception as e:
        # Some VMs report incomplete data; print what is available and continue
        print(e, vm.get('uuid'), vm.get('interface_list'))
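The listing produced by the script still has to be checked for duplicates. A minimal sketch of that step, assuming the interface records have been collected into dictionaries: `find_duplicated_ips` and the `vm_uuid` key are hypothetical names, while `virtual_network` and `ip_address` are the fields the script prints.

```python
from collections import defaultdict


def find_duplicated_ips(interfaces):
    """Return {(virtual_network, ip_address): [vm uuids]} for reused addresses."""
    owners = defaultdict(list)
    for iface in interfaces:
        key = (iface["virtual_network"], iface["ip_address"])
        owners[key].append(iface["vm_uuid"])
    # The same IP in different virtual networks is not a conflict,
    # so only addresses repeated within one network are reported.
    return {key: vms for key, vms in owners.items() if len(vms) > 1}


sample = [
    {"vm_uuid": "vm-a", "virtual_network": "net1", "ip_address": "10.0.0.5"},
    {"vm_uuid": "vm-b", "virtual_network": "net1", "ip_address": "10.0.0.5"},
    {"vm_uuid": "vm-c", "virtual_network": "net2", "ip_address": "10.0.0.5"},
]
print(find_duplicated_ips(sample))  # {('net1', '10.0.0.5'): ['vm-a', 'vm-b']}
```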
Issues - getting introspect data can produce latency
Examples:
➢ http://127.0.0.1:8085/Snh_SandeshTraceRequest?x=Flow
➢ http://127.0.0.1:8085/Snh_SandeshTraceRequest?x=KSync
➢ http://127.0.0.1:8085/Snh_SandeshTraceRequest?x=Oper%20DB
➢ http://127.0.0.1:8085/Snh_SandeshTraceRequest?x=XmppMessageTrace
➢ http://127.0.0.1:8085/Snh_ItfReq?
➢ http://127.0.0.1:8085/Snh_VnListReq?
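Since each introspect request can itself add latency, it may help to query one trace buffer at a time, with a timeout, and to measure how long the request takes. The sketch below is an assumption about how that could be scripted: `trace_request_url` and `timed_fetch` are hypothetical helper names, and only the URL pattern and port come from the examples on this slide.

```python
import time
from urllib.parse import quote
from urllib.request import urlopen

INTROSPECT_PORT = 8085  # port used in the example URLs


def trace_request_url(host, buffer_name, port=INTROSPECT_PORT):
    """Build a Snh_SandeshTraceRequest URL, escaping names such as 'Oper DB'."""
    return "http://%s:%d/Snh_SandeshTraceRequest?x=%s" % (host, port, quote(buffer_name))


def timed_fetch(url, timeout=5.0):
    """Fetch one introspect page and return (elapsed seconds, response body)."""
    start = time.monotonic()
    with urlopen(url, timeout=timeout) as response:
        body = response.read()
    return time.monotonic() - start, body


print(trace_request_url("127.0.0.1", "Oper DB"))
```

Calling `timed_fetch(trace_request_url("127.0.0.1", "Flow"))` on a compute node would then show how expensive a single request is before it is put into any periodic monitoring.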
Issues - flow_cache_timeout
Experimenting with flow_cache_timeout:
➢ mesh environment with different client needs
➢ experimental hypervisors
➢ parameter set to 30
Since commit 507fda3d5deb22c6549d4fd253624bea44534b73 there is no need to
worry about flow_cache_timeout ;-)
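Issues - defunct
#!/bin/bash
# Reload the vrouter kernel module and restart the agent
/usr/sbin/service contrail-vrouter-agent stop
# Force-remove the loaded module, then insert the DKMS build for the running kernel
/sbin/modprobe -r -f vrouter
/sbin/insmod "/lib/modules/$(uname -r)/updates/dkms/vrouter.ko"
/sbin/ifup vhost0
/usr/sbin/service contrail-vrouter-agent start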
Issues - interfaces pkt1 and pkt2 down after reboot
Since kernel 3.13.0-63
Modified script: /opt/contrail/bin/if-vhost0
#!/bin/bash
source /opt/contrail/bin/vrouter-functions.sh
# Insert the vrouter module only if vhost0 does not exist yet
if [ ! -L /sys/class/net/vhost0 ]; then
    insert_vrouter &>> "$LOG"
fi
# Bring the pkt1 and pkt2 interfaces up explicitly
/sbin/ip link set pkt1 up
/sbin/ip link set pkt2 up
How to debug
➢ network performance issues: hping (look out for the 1k bug)
hping3 -c 100000 -p 8080 -i u100000 -S -n 169.254.0.17
➢ introspects
➢ tcpdump
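The hping3 flags used here are: `-c` packet count, `-p` destination port, `-i u100000` an interval of 100000 microseconds between packets, `-S` TCP SYN, `-n` numeric output. A small sketch that rebuilds the same command and estimates how long a run takes; the helper names `hping3_command` and `expected_duration_s` are hypothetical.

```python
def hping3_command(target, port, count, interval_us):
    """Build the hping3 argument list for a SYN probe with a fixed interval."""
    return [
        "hping3",
        "-c", str(count),            # number of packets to send
        "-p", str(port),             # destination port
        "-i", "u%d" % interval_us,   # interval between packets in microseconds
        "-S",                        # set the TCP SYN flag
        "-n",                        # numeric output, no name resolution
        target,
    ]


def expected_duration_s(count, interval_us):
    """Approximate run time in seconds: one packet per interval."""
    return count * interval_us / 1000000.0


print(" ".join(hping3_command("169.254.0.17", 8080, 100000, 100000)))
print(expected_duration_s(100000, 100000))  # 10000.0
```

With the values from the slide, the run sends 10 packets per second and lasts about 10000 seconds, roughly 2 hours 47 minutes, so a smaller count may be preferable for a quick check.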
Q/A?
Thank you!
Find us:
Blog: allegro.tech
Twitter: @allegrotechblog

Half a year with contrail at production

  • 1. Half a year with Contrail at Production krzysztof.kowalik@allegrogroup.com michal.dopierala@allegrogroup.com
  • 2. Who we are?
       ➢ 57% of Polish e-commerce
       ➢ 14.000.000 users
  • 3. TEAM - scope
       ➢ Provide CLOUD/IAAS services to internal customers since 2011
       ➢ We support dev/test/prod environments for 50 developer teams in private and public cloud
       ➢ As part of another TEAM, we provide a deployment platform (Mesos/Marathon) for the Allegro platform
  • 4. What have we done since last OCUG?
       ➢ New region PL-KRA with ~100 computes
       ➢ Test environment with MX10 as gateway
       ➢ Migration of client VMs from Essex to Icehouse with Contrail
       ➢ ~100 new computes in PL-POZ
       ➢ Problem solving
  • 7. Issues - memory leak
       Fixed at build 44
  • 8. Issues - memory leak
       Fixed at build 44
       SQL:
       nova.instance_faults (message):
         internal error: process exited while connecting to monitor: Cannot set up guest memory 'pc.ram': Cannot allocate memory
         /etc/qemu-ifdown: could not launch network script
       At Hypervisor
         Jun 14 15:42:31 host kernel: [8026284.365930] lowmem_reserve[]: 0 0 0 0
         Jun 14 15:42:31 host kernel: [8026284.365988] 0 pages HighMem/MovableOnly
         Jun 14 15:42:31 host kernel: [8026284.366178] Out of memory: Kill process 140217 (qemu-system-x86) score 16 or sacrifice child
  • 9. Issues - munin can generate latency
         Debug: Exec[munin-node-configure](provider=posix): Executing check 'munin-node-configure --shell | grep ln'
         Debug: Executing 'munin-node-configure --shell | grep ln'
         Debug: /Stage[main]/Munin::Common/Exec[munin-node-configure]/onlyif: # The following plugins caused errors:
         Debug: /Stage[main]/Munin::Common/Exec[munin-node-configure]/onlyif: # ip_:
         Debug: /Stage[main]/Munin::Common/Exec[munin-node-configure]/onlyif: # Nothing printed to stdout
         Debug: /Stage[main]/Munin::Common/Exec[munin-node-configure]/onlyif: # No valid suggestions
         Debug: /Stage[main]/Munin::Common/Exec[munin-node-configure]/onlyif: # mysql_:
         Debug: /Stage[main]/Munin::Common/Exec[munin-node-configure]/onlyif: # Non-zero exit during autoconf (2)
  • 10. Issues - flows re-evaluation
  • 11. Issues - flows re-evaluation
       Problem
       ➢ When a flow is created, Contrail keeps track of the routes that can potentially modify the flow action. During VM spawning, contrail-vrouter-agent re-evaluates even flows that are not affected by the new route.
       Symptom
       ➢ Large latency to all VMs within the affected subnet
       Solution
       ➢ Custom build (70 of R2.02, or commit 6992575a03a08f703edb8f0a7622a457dbdbdeee) in which the agent re-evaluates a flow only if the flow is affected by the new route
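  • 12. Issues - duplicated IP
       ➢ A duplicated IP generates multi-NH updates on many computes, which shows up as latency
       Script which helps us (ported here to Python 3 print syntax):
         #!/usr/bin/env python3
         import sys

         import requests

         # Address of the Contrail analytics node, passed as the first argument.
         cc_ip = sys.argv[1]
         r = requests.get('http://%s:9081/analytics/virtual-machines' % cc_ip)
         for href_vm in r.json():
             # Each entry links to the UVE of a single virtual machine.
             r_vm = requests.get(href_vm['href'])
             vm = r_vm.json()['UveVirtualMachineAgent']
             try:
                 # One output row per interface, so duplicated IPs can be spotted.
                 for ifaces in vm['interface_list']:
                     print(vm['uuid'], ifaces['active'], ifaces['ip_address'],
                           ifaces['mac_address'], ifaces['virtual_network'], ifaces['uuid'])
             except Exception as e:
                 # .get() avoids a second KeyError when a field is missing.
                 print(e, vm.get('uuid'), vm.get('interface_list'))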
  • 13. Issues - getting introspect data can produce latency
       Examples:
       ➢ http://127.0.0.1:8085/Snh_SandeshTraceRequest?x=Flow
       ➢ http://127.0.0.1:8085/Snh_SandeshTraceRequest?x=KSync
       ➢ http://127.0.0.1:8085/Snh_SandeshTraceRequest?x=Oper%20DB
       ➢ http://127.0.0.1:8085/Snh_SandeshTraceRequest?x=XmppMessageTrace
       ➢ http://127.0.0.1:8085/Snh_ItfReq?
       ➢ http://127.0.0.1:8085/Snh_VnListReq?
  • 14. Issues - flow_cache_timeout
       Experimenting with flow_cache_timeout:
       ➢ mesh environment with different client needs
       ➢ experimental hypervisors
       ➢ parameters set to 30
       Since commit 507fda3d5deb22c6549d4fd253624bea44534b73 there is no need to take care of flow_cache_timeout ;-)
  • 15. Issues - defunct
         #!/bin/bash
         /usr/sbin/service contrail-vrouter-agent stop
         /sbin/modprobe -r -f vrouter
         /sbin/insmod /lib/modules/`uname -r`/updates/dkms/vrouter.ko
         /sbin/ifup vhost0
         /usr/sbin/service contrail-vrouter-agent start
  • 16. Issues - interfaces pkt1 and pkt2 down after reboot
       Since Kernel 3.13.0-63
       Modified script: /opt/contrail/bin/if-vhost0
         #!/bin/bash
         source /opt/contrail/bin/vrouter-functions.sh
         if [ ! -L /sys/class/net/vhost0 ]; then
             insert_vrouter &>> $LOG
         fi
         /sbin/ip l s pkt1 up
         /sbin/ip l s pkt2 up
  • 17. How to debug
       ➢ network performance issues - hping (look out for the 1k bug)
           hping3 -c 100000 -p 8080 -i u100000 -S -n 169.254.0.17
       ➢ introspects
       ➢ tcpdump
  • 18. Q/A?
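
The script on slide 12 prints one row per interface and leaves spotting the duplicates to the reader. As a minimal sketch, the same rows could be grouped by virtual network and IP address so that only addresses claimed by more than one interface are reported. This assumes interface records carrying the fields that script prints (`virtual_network`, `ip_address`, `uuid`); the function name `find_duplicate_ips` and the sample data are hypothetical and not part of Contrail.

```python
from collections import defaultdict


def find_duplicate_ips(interfaces):
    """Return {(virtual_network, ip_address): [interface uuids]} for reused IPs.

    `interfaces` is an iterable of dicts with at least the keys
    'virtual_network', 'ip_address' and 'uuid' (an assumed shape that mirrors
    the fields printed by the slide-12 script).
    """
    seen = defaultdict(list)
    for iface in interfaces:
        # The same address in two different virtual networks is not a conflict,
        # so the network name is part of the grouping key.
        key = (iface['virtual_network'], iface['ip_address'])
        seen[key].append(iface['uuid'])
    # Keep only addresses claimed by more than one interface.
    return {key: uuids for key, uuids in seen.items() if len(uuids) > 1}


if __name__ == '__main__':
    # Hypothetical sample data, not real output from the analytics API.
    sample = [
        {'virtual_network': 'net1', 'ip_address': '10.0.0.5', 'uuid': 'if-a'},
        {'virtual_network': 'net1', 'ip_address': '10.0.0.5', 'uuid': 'if-b'},
        {'virtual_network': 'net1', 'ip_address': '10.0.0.6', 'uuid': 'if-c'},
        {'virtual_network': 'net2', 'ip_address': '10.0.0.5', 'uuid': 'if-d'},
    ]
    for (network, ip), uuids in find_duplicate_ips(sample).items():
        print(network, ip, uuids)
```

Grouping per virtual network is a design choice of this sketch: overlapping address ranges in separate networks are legitimate, so only repeats inside one network are flagged.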