Shifting the Curve Towards Reliable and Cost Effective Cloud Operations
1. Spinnaker
Bay Area AWS User Group
July 26th, 2016
2. Assumed Knowledge
* Cloud Deployment (doesn't have to be AWS)
* Continuous Delivery and the value of repeatable deployment pipelines
* Immutable Infrastructure
* Red/Black (or Blue/Green) Deployments
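For readers less familiar with the last two items, the following is a minimal sketch of a red/black rollout over immutable server groups. It is an illustration only, not Spinnaker's implementation: the `ServerGroup` and `Cluster` classes and the `red_black_deploy` and `rollback` methods are hypothetical names, and the health check is reduced to a boolean flag.

```python
from dataclasses import dataclass, field


@dataclass
class ServerGroup:
    """Hypothetical immutable server group: one version, never patched in place."""
    version: str
    healthy: bool = True
    enabled: bool = False  # enabled groups receive live traffic


@dataclass
class Cluster:
    """Hypothetical cluster holding every server group deployed so far."""
    groups: list = field(default_factory=list)

    def live_versions(self):
        return [g.version for g in self.groups if g.enabled]

    def red_black_deploy(self, new_version, healthy=True):
        """Bring up a new group beside the old one, then switch traffic."""
        old = [g for g in self.groups if g.enabled]
        new = ServerGroup(version=new_version, healthy=healthy)
        self.groups.append(new)
        if not new.healthy:
            # Failed health check: the old group never stopped serving.
            return False
        new.enabled = True
        for g in old:
            g.enabled = False  # disabled, not destroyed, so rollback is fast
        return True

    def rollback(self):
        """Re-enable the latest disabled healthy group and disable the live one."""
        current = [g for g in self.groups if g.enabled]
        previous = [g for g in self.groups if not g.enabled and g.healthy]
        if not previous:
            return False
        previous[-1].enabled = True
        for g in current:
            g.enabled = False
        return True


cluster = Cluster()
cluster.red_black_deploy("v1")
cluster.red_black_deploy("v2")
cluster.red_black_deploy("v3", healthy=False)  # v2 keeps serving
```

Because the previous group is only disabled, rolling back is a traffic switch rather than a rebuild, which is the main appeal of the strategy.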
22. https://flic.kr/p/56suBd
Tools
* Asgard
* Mimir
* Jenkins
* Spinnaker
Culture
* Freedom and Responsibility
* Context over Control
* Microservices
* Run what you build
* No dedicated DevOps
Contributors
* 8 engineers from Netflix (Delivery Engineering)
* 6 engineers from Google
* 3 engineers from Microsoft
* 1 engineer from Pivotal
+ numerous open-source committers (Target, Veritas, Full Contact, Stitch Fix, etc.)
23. Running @ Netflix
* Layer custom components and configuration over open source JARs (Bintray)
* No forking
* Dedicated cluster for every Spinnaker service
* Dedicated datastore for every Spinnaker service
* Authentication via SAML or x509
Supporting Systems
* Atlas
* Eureka (aka Discovery)
* Automated Canary Analysis (aka ACA)
* Chronos (event tracking)
* Lemur (x509 Certificate Manager)
Extensions
* Additional Cloud Provider (Titus)
* Internal Spot Market (Reservation Reports)
* Canaries
* Fast/Dynamic Properties
* Application-specific IAM roles
* Service Migration (EC2 Classic -> VPC)
https://flic.kr/p/cpijTm
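The canary items above rest on one idea: compare metrics from a canary group against a baseline group and turn the comparison into a promote or roll back decision. The sketch below shows that idea in its simplest form. It is not Netflix's ACA algorithm; the `canary_score` and `judge` functions, the 10% tolerance, and the 90-point threshold are all assumptions made for illustration.

```python
def canary_score(baseline, canary, tolerance=0.10):
    """Percentage of shared metrics where the canary stays within
    `tolerance` (relative) of the baseline value. Hypothetical scoring rule."""
    shared = sorted(set(baseline) & set(canary))
    if not shared:
        raise ValueError("no metrics in common")
    passed = 0
    for name in shared:
        base, cand = baseline[name], canary[name]
        allowed = abs(base) * tolerance
        if abs(cand - base) <= allowed:
            passed += 1
    return 100.0 * passed / len(shared)


def judge(score, pass_threshold=90.0):
    """Map a score to a pipeline decision (threshold is an assumed value)."""
    return "promote" if score >= pass_threshold else "rollback"


# Made-up sample values, not real measurements.
baseline = {"error_rate": 1.0, "latency_ms": 100.0, "cpu": 50.0}
canary = {"error_rate": 1.05, "latency_ms": 150.0, "cpu": 52.0}

score = canary_score(baseline, canary)
decision = judge(score)
```

Here latency drifts well outside the tolerance, so only two of the three metrics pass and the decision is to roll back. A production system would compare time series statistically rather than single values.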
26. What does a Netflix engineer really care about?
39. Lessons Learned
* Adoption does not come for free!
* Spinnaker Office Hours
* Operational Metrics and Dashboards
* Ask yourself ... What could Spinnaker have done to prevent this outage?
* Deploy Spinnaker with Spinnaker
* Teams with embedded QA have much tighter integrations with Spinnaker