2. History
➢ What we had:
○ Two separate cloud environments (Essex and Havana)
○ Floating IPs in Essex and a flat network with VLANs in
Havana
○ Network complexity in Havana
○ Network performance problems in Havana
3. Goals (Plan)
➢ Easy to maintain and grow
➢ Network simplicity
➢ Network isolation for tenants
➢ Floating IP and flat network
➢ New region in new DC
4. Fabric (Prepare - 2 weeks)
➢ Easy and fast deployment (a couple of corrections in the fabric
scripts); we used version 1.20 at that time
➢ Environment ready for testing (adding a new HV from “any”
location in the server room)
➢ Basic performance tests, LBaaS
Where to find quick implementation tools:
http://www.opencontrail.org/opencontrail-quick-start-guide/
5. Our way
➢ Own puppet manifests based on available ones
➢ Reasons:
○ Existing infrastructure
○ Customized deployment
○ More work at the beginning, fewer problems later
○ Easy procedure to add hosts (compute nodes,
controllers)
○ Building a new region in the near future
6. Implementation
➢ We had everything prepared for version 1.20, and then we
got the 2.01 production version (what to do?!)
➢ Environment deployment (OpenStack with Contrail,
one region: 2 CC; 3 CoC; 50 HV); during the DC migration the
number of compute nodes keeps increasing, target 250
➢ Move tenants/users/quotas from the old environment to the new one
○ we built a Keystone server from scratch, upgraded it,
then pumped the data into Icehouse/Contrail (issue:
missing users); quotas were migrated as SQL tables;
exporter.py (script to merge users/tenants/quotas)
between regions (target for the future: one Keystone)
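
The merge step described above can be illustrated with a minimal sketch. This is not the actual exporter.py, whose contents are not shown in these slides; the function names and the plain-dict data layout are assumptions made for illustration. It merges tenants, users, and quotas from one region into another without overwriting existing entries, and reports users that are absent from the target (the "missing users" issue).

```python
def merge_identity(old, new):
    """Merge tenants, users, and quotas from `old` into a copy of `new`.

    Hypothetical layout (not the real Keystone schema):
      tenants: {tenant_name: {...attributes...}}
      users:   {user_name: {"tenant": tenant_name, ...}}
      quotas:  {tenant_name: {resource: limit}}
    Entries that already exist in `new` are kept unchanged.
    """
    merged = {
        "tenants": dict(new.get("tenants", {})),
        "users": dict(new.get("users", {})),
        "quotas": {t: dict(q) for t, q in new.get("quotas", {}).items()},
    }
    for name, tenant in old.get("tenants", {}).items():
        merged["tenants"].setdefault(name, tenant)
    for name, user in old.get("users", {}).items():
        merged["users"].setdefault(name, user)
    for tenant, limits in old.get("quotas", {}).items():
        target = merged["quotas"].setdefault(tenant, {})
        for resource, limit in limits.items():
            # Only fill in quota values the target does not define yet.
            target.setdefault(resource, limit)
    return merged


def missing_users(source, target):
    """Return the sorted user names present in `source` but not in `target`."""
    return sorted(set(source.get("users", {})) - set(target.get("users", {})))
```

Running `missing_users(old, new)` before the merge and `missing_users(old, merged)` after it gives a quick check that no account was lost on the way.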
7. Implementation
➢ DNS - we are using Designate with two handlers (one for
floating IPs, a second one for fixed routable IPs)
➢ Required image modifications (target: automated builds
with Ansible)
➢ Two days before production we updated to the latest
available packages from release 2.01
➢ Breaking environment
➢ Clients in the new environment
9. Problems
➢ Cassandra: increased the number of nodes (3 => 5),
configuration tuning (TTLs in contrail-collector.conf),
compaction throughput, migration of Cassandra data
to RAID 0 SSD disks
➢ Open files limit issue (user, supervisor, init)
➢ Collector was flooded by data from computes
iptables -A OUTPUT -p tcp --dport 8086 -m string --algo bm --string "flowuuid" -j DROP
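
For the open files issue listed above, the limit can be set in several places (user, supervisor, init), so it helps to check what a running process actually received. The following is a minimal sketch, assuming a Linux host where `/proc/<pid>/limits` is available; the helper names are hypothetical.

```python
def parse_max_open_files(limits_text):
    """Extract (soft, hard) open-file limits from /proc/<pid>/limits text.

    A limit reported as "unlimited" is returned as None.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # Fields: "Max", "open", "files", <soft>, <hard>, "files"
            soft, hard = fields[3], fields[4]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' line found")


def read_max_open_files(pid="self"):
    """Read the effective open-file limits of a process (Linux only)."""
    with open(f"/proc/{pid}/limits") as handle:
        return parse_max_open_files(handle.read())
```

Calling `read_max_open_files(pid)` with the PID of a supervised service shows whether a raised limit has reached that process or only the login shell.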
10. Problems
➢ When 500K flows are not enough (vr_flow_entries, vr_oflow_entries)
➢ Flow on Hold issue
➢ vRouter CPU consumption too high compared to the VM
(TBB_THREAD_COUNT in /etc/contrail/supervisord_vrouter.conf)
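
As a rough illustration of the flow table tuning above: the sketch below renders a kernel module options line for the two parameters named on the slide. It assumes they are passed to the vrouter module through a modprobe.d file, which should be verified against the installed Contrail release. The helper name and the example values are hypothetical, not the values used in this deployment.

```python
def vrouter_modprobe_options(flow_entries, oflow_entries):
    """Render a modprobe 'options' line for the vrouter kernel module.

    Assumption: vr_flow_entries and vr_oflow_entries are module parameters
    read at module load time, so a module reload is needed to apply them.
    """
    if flow_entries <= 0 or oflow_entries <= 0:
        raise ValueError("table sizes must be positive")
    return (
        f"options vrouter vr_flow_entries={flow_entries} "
        f"vr_oflow_entries={oflow_entries}"
    )


# Example values only; size the tables from measured flow counts.
print(vrouter_modprobe_options(1000000, 200000))
```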
11. Problems
➢ Rebuild instance - the interface was deleted after the VM was
respawned
➢ Lack of support for Ironic - we will build a region for Ironic
➢ Disabled tenant - not able to log in to the Contrail UI (Keystone
2.0)
➢ Tuning of configuration files required
➢ Metadata packets not sent in one session
➢ RBAC for the Contrail UI
12. Environment expansion and further plans
➢ 1350 VMs on 150 HV in one DC at the moment
➢ Second region on its way
➢ 250-300 HV per region
➢ Migration from Essex and Havana
➢ OpenStack and Contrail upgrades
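
The figures above allow a quick capacity estimate. This is a back-of-the-envelope sketch that assumes VM density per hypervisor stays at today's level as the region grows, which the slides do not state; the function names are hypothetical.

```python
def vms_per_hv(vms, hypervisors):
    """Average number of VMs per hypervisor."""
    return vms / hypervisors


def projected_vms(density, hv_low, hv_high):
    """Projected VM count range for a region of hv_low..hv_high hypervisors."""
    return density * hv_low, density * hv_high


density = vms_per_hv(1350, 150)           # current density from the slide
low, high = projected_vms(density, 250, 300)
print(f"{density:.0f} VMs/HV, {low:.0f} to {high:.0f} VMs per region")
```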