2. Tweet @garethbowles with feedback!
• How would your organization be different if all
of your engineers could build, test and deploy
their own code ...
• ... and were responsible for fixing what they
broke at 3am ?
Monday, August 5, 13
5. Tweet @garethbowles with feedback!
Netflix is the world’s leading Internet television
network with more than 36 million members in 40
countries enjoying more than one billion hours ofTV
shows and movies per month, including original
series.
Source: http://ir.netflix.com
Monday, August 5, 13
6. Tweet @garethbowles with feedback!
The Challenge
• We need to innovate rapidly, driven by:
• Global competition
• New connected devices
• Continuous customer feedback
• And we need fast rollback
Monday, August 5, 13
7. Tweet @garethbowles with feedback!
The Challenge
• We need to scale to cope with:
• Growing customer base
• Peaks in demand:
• Special events: holidays, Oscars
• Daily fluctuations (weekdays vs.
weekends, daytime vs. evening)
Monday, August 5, 13
8. Tweet @garethbowles with feedback!
Things That Help
• We can push out updates whenever we like
• Company culture
Monday, August 5, 13
10. Tweet @garethbowles with feedback!
A Few ShortYears Ago ...
• Monolithic web app
• Single points of failure
• Releases were done by following runbooks
• DC-based infrastructure
• Different teams used different tools
Monday, August 5, 13
12. Tweet @garethbowles with feedback!
http://www.slideshare.net/reed2001/culture-1798664
Monday, August 5, 13
13. Tweet @garethbowles with feedback!
Freedom and Responsibility
• Hire mature people who work well with
others
• Give them the context for company success
• Then get out of their way
• But hold them responsible for results
Monday, August 5, 13
14. Tweet @garethbowles with feedback!
Context, not Control
• Be transparent about what the company needs
to succeed
• Minimize the processes people need to go
through to achieve success
• Value results, not planning and process
Monday, August 5, 13
15. Tweet @garethbowles with feedback!
Highly Aligned, Loosely Coupled
• Clear strategy and goals
• Team interactions focus on strategy, not tactics
• Minimal cross-functional meetings
• Occasional post-mortems to increase
alignment
Monday, August 5, 13
16. Tweet @garethbowles with feedback!
What This Helped Us Achieve
• DVD to Streaming
• DC to cloud
• US-only to 40-plus countries
Monday, August 5, 13
18. Tweet @garethbowles with feedback!
Key Changes
• Service oriented architecture
• Many small teams, each providing their own
interconnected service
• Deploy on Amazon Web Services
• Increased reliance on open source
Monday, August 5, 13
19. Tweet @garethbowles with feedback!
Highly aligned, loosely coupled
• Services are built by different
teams who work together to
figure out what each service
will provide.
• The service owner publishes
an API that anyone can use.
Monday, August 5, 13
20. Tweet @garethbowles with feedback!
What AWS Provides
• Machine Images (AMI)
• Instances (EC2)
• Elastic IPs
• Load Balancers
• Security groups / Autoscaling groups
Monday, August 5, 13
21. Tweet @garethbowles with feedback!
Freedom and Responsibility
• Developers deploy when
they want
• They also manage their own
capacity and autoscaling
• And fix anything that breaks
at 3am!
Monday, August 5, 13
22. Tweet @garethbowles with feedback!
Personaliza-‐
Eon
Engine
User
Info
Movie
Metadata
Movie
RaEngs
Similar
Movies
API
Reviews
A/B
Test
Engine
2B
requests
per
day
into
the
Ne3lix
API
12B
outbound
requests
per
day
to
API
dependencies
Monday, August 5, 13
23. Tweet @garethbowles with feedback!
Personaliza-‐
Eon
Engine
User
Info
Movie
Metadata
Movie
RaEngs
Similar
Movies
API
Reviews
A/B
Test
Engine
2B
requests
per
day
into
the
Ne3lix
API
12B
outbound
requests
per
day
to
API
dependencies
Monday, August 5, 13
25. Tweet @garethbowles with feedback!
The Audience
• ~700 engineers
• Large majority are developers
• Test engineers
• Delivery teams
• Operations & reliability engineering
Monday, August 5, 13
26. Tweet @garethbowles with feedback!
Our Goal
• Lower the barriers to build, test and
deployment until the entire process is
accessible to every developer.
Monday, August 5, 13
27. Tweet @garethbowles with feedback!
The Team
• 11 engineers and 1 director (but we’re hiring !)
• Developers, build / release engineers, DevOps
• Specialize, but understand the full stack
• Service oriented
Monday, August 5, 13
28. Tweet @garethbowles with feedback!
Self-Service Build & Deployment
• Channel best practices
Monday, August 5, 13
29. Tweet @garethbowles with feedback!
Self-Service Build & Deployment
• Channel best practices
• Promote, don’t dictate
Monday, August 5, 13
30. Tweet @garethbowles with feedback!
Self-Service Build & Deployment
• Channel best practices
• Promote, don’t dictate
• Make adoption easy
Monday, August 5, 13
31. Tweet @garethbowles with feedback!
Self-Service Build & Deployment
• Channel best practices
• Promote, don’t dictate
• Make adoption easy
• Make tools flexible
Monday, August 5, 13
32. Tweet @garethbowles with feedback!
Building and Deploying
Perforce / Git
libraries
source
Ant targets
Ivy
Groovy all over
snapshot / release
libraries / apps
Jenkins
sync
resolve
buildcompile report
publishtest
Artifactory yum
Aminator
Asgard
rpms
Monday, August 5, 13
33. Tweet @garethbowles with feedback!
Building and Deploying
Perforce / Git
libraries
source
Ant targets
Ivy
Groovy all over
snapshot / release
libraries / apps
Jenkins
sync
resolve
buildcompile report
publishtest
Artifactory yum
Aminator
Asgard
rpms
Monday, August 5, 13
35. Tweet @garethbowles with feedback!
Common Build Framework
• Define a build with just a few lines of Ant code
• Templates for libraries and webapps
• Override standard targets if you need to
Monday, August 5, 13
36. Tweet @garethbowles with feedback!
Jenkins Job DSL
• Define Jenkins build jobs using a domain
specific language (based on Groovy)
• Loop to create multiple jobs (e.g. for building
different branches)
• Make one change and rerun to update all jobs
• The code is the configuration
• https://wiki.jenkins-ci.org/display/JENKINS/Job
Monday, August 5, 13
37. Tweet @garethbowles with feedback!
Jenkins Dynaslaves
• Create build slaves in AWS
• Dedicated slave pools for teams
• Scale slave pools up and down on demand
• https://github.com/Netflix-Skunkworks/
dynaslave-plugin
Monday, August 5, 13
39. Tweet @garethbowles with feedback!
Aminator
• Create (“bake”) AMIs
• Image contains a service and everything needed
to run it
• Can be automatically triggered as a build step
• https://github.com/Netflix/aminator
Monday, August 5, 13
40. Tweet @garethbowles with feedback!
How Baking is Different
https://github.com/Netflix/aminator
Monday, August 5, 13
41. Tweet @garethbowles with feedback!
How Baking is Different
Traditional:
•launch OS
•install packages
•install app
https://github.com/Netflix/aminator
Monday, August 5, 13
42. Tweet @garethbowles with feedback!
How Baking is Different
Generic AMI
Instance
Traditional:
•launch OS
•install packages
•install app
https://github.com/Netflix/aminator
Monday, August 5, 13
43. Tweet @garethbowles with feedback!
How Baking is Different
Generic AMI
Instance
Traditional:
•launch OS
•install packages
•install app
https://github.com/Netflix/aminator
Monday, August 5, 13
44. Tweet @garethbowles with feedback!
How Baking is Different
Generic AMI
Instance
Traditional:
•launch OS
•install packages
•install app
https://github.com/Netflix/aminator
Monday, August 5, 13
45. Tweet @garethbowles with feedback!
How Baking is Different
Generic AMI
Instance
Traditional:
•launch OS
•install packages
•install app
Netflix:
•launch OS+app
https://github.com/Netflix/aminator
Monday, August 5, 13
46. Tweet @garethbowles with feedback!
How Baking is Different
Generic AMI
Instance
Traditional:
•launch OS
•install packages
•install app
Netflix:
•launch OS+app
App AMI Instance
https://github.com/Netflix/aminator
Monday, August 5, 13
47. Tweet @garethbowles with feedback!
Linux Base AMI (CentOS or Ubuntu)
Java (JDK 6 or 7)
Tomcat
Optional
Apache
Monitoring
Log Rotation
to S3
Appdynamics
Machine
Agent
Appdynamics
App Agent
monitoring
Application war file, base
servlet, platform, interface
jars for dependent
services
GC and
thread
dump
logging
Healthcheck, status
servlets, JMX interface,
Servo autoscale
Monday, August 5, 13
48. Tweet @garethbowles with feedback!
At Netflix, the AMI is the unit of
deployment.
Monday, August 5, 13
49. Tweet @garethbowles with feedback!
Asgard
• Web UI and REST API for service deployment
and management
• Manage ASGs, ELBs, security groups, ...
• Application -> cluster -> ASG
• Rapid deployment and rollback
• Available to all engineers
• https://github.com/Netflix/asgard
Monday, August 5, 13
51. Tweet @garethbowles with feedback!
Netflix has moved the
granularity from the
instance to the cluster.
Monday, August 5, 13
52. Tweet @garethbowles with feedback!
Simple Service Setup Effort
• Write the code (variable :-))
• 15 minutes to write a build file and define
dependencies
• 15 mins to create a Jenkins build, 2 to 10 mins
to run it
• 5 mins to bake an AMI
• 10 mins to deploy in test, another 10 for prod
Monday, August 5, 13
53. Tweet @garethbowles with feedback!
Just a quick reminder...
(Some of) Netflix is open source:
https://github.com/netflix
Monday, August 5, 13
54. Tweet @garethbowles with feedback!
Why We Open Source
• Give back to Apache license OSS community
• Motivate, retain, hire top engineers
• Benefit from a shared ecosystem
• Make Netflix solutions into common standards
Monday, August 5, 13
55. Tweet @garethbowles with feedback!
The Netflix Platform
Discovery (Eureka)
Entrypoints (Edda)
Configuration (Archaius)
Zookeeper (Exhibitor)
logging (Blitz4j & Honu)
NIWS (Ribbon)
Geo
Base
Hystrix
Circuit Breakers (Hystrix)
Cassandra (Priam &
Astyanax & CassJMeter)
Cryptex
AKMS
EvCache
Zuul
i18n
L10n
Open Source
Monday, August 5, 13
56. Tweet @garethbowles with feedback!
https://github.com/Netflix/Cloud-Prize/wiki
Monday, August 5, 13
57. Tweet @garethbowles with feedback!
ThankYou !
Email: gbowles@{gmail,netflix}.com
Twitter: @garethbowles
Linkedin: www.linkedin.com/in/garethbowles
Monday, August 5, 13