SlideShare a Scribd company logo
1 of 87
Daniel Jacobson
• @daniel_jacobson
• http://www.linkedin.com/in/danieljacobson
• http://www.slideshare.net/danieljacobson
Sangeeta Narayanan
• @sangeetan
• http://www.linkedin.com/in/sangeetanarayanan/
I have added more detail in the
notes field for each slide to provide
additional context
Strategy
Lessons
Implementation
Lessons
Know Your Audience
Target Audience Dictates
Everything Else
The target audience should be
the single biggest influence on
your API design
Small Set of Known Developers
SSKDs
Large Set of Unknown Developers
LSUDs
Both
SSKDs and LSUDs
No matter what…
Figure this out first!
Target Audience Influence
• Team Identity
• Staffing Decisions
• System Architecture
• SLAs
• Development Velocity
• Security Needs
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Netflix API : Key Responsibilities
2008
• Broker data between internal services and
public developers
• Grow community of public developers
• Optimize design for reusability
Evangelists
Partner
Engagement
and Support
API
Engineers
Technical
Writer
QA
Specialists
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Private API Public API
< 0.3% of total
API traffic *
* 11 years worth of public API requests = one day of private API requests
Netflix API : Key Responsibilities
Today
• Broker data between services and devices
• System resiliency
• Scaling the system
• High velocity development
• Insights
The consumers of the API are now
Netflix subscribers
We are now responsible for ensuring
subscribers can stream
Application
Engineers
Platform
Engineers
Technical
Writer
Tools and
Automation
Engineers
Team is now 6x its
size from 2010
Separation of Concerns
Primary Responsibilities of APIs
• Data Gathering
– Retrieving the requested data from one or many local
or remote data sources
• Data Formatting
– Preparing a structured payload to the requesting
agent
• Data Delivery
– Delivering the structured payload to the requesting
agent
There are two players in APIs
API Provider API Consumer
API Provider
PROVIDES
API Consumer
CONSUMES
Traditional API Interactions
API Provider
PROVIDES
EVERYTHING
API Consumer
CONSUMES
Everything means, API Provider does:
• Data Gathering
• Data Formatting
• Data Delivery
• (among other things)
Traditional API Interactions
Why do most API providers provide
everything?
• Many APIs have a large set of unknown and
external developers
• Generic API design tends to be easier for
teams closer to the source
• Centralized API functions makes them easier
to support
Why do most API providers provide
everything?
• Many APIs have a large set of unknown and
external developers
• Generic API design tends to be easier for
teams closer to the source
• Centralized API functions makes them easier
to support
Data Gathering Data Formatting Data Delivery
API Consumer
Don’t care how data is
gathered, as long as it
is gathered
Each consumer cares a
lot about the format
for that specific use
Each consumer cares a
lot about how payload
is delivered
API Provider
Care a lot about how
the data is gathered
Only cares about the
format to the extent it
is easy to support
Only cares that the
delivery method is
easy to support
Separation of Concerns
To be a better provider, the API should address the
separation of concerns of the three core functions
One Size Doesn’t Fit All
Embrace the Differences
Data Gathering Data Formatting Data Delivery
API Consumer
Don’t care how data is
gathered, as long as it
is gathered
Each consumer cares a
lot about the format
for that specific use
Each consumer cares a
lot about how payload
is delivered
API Provider
Care a lot about how
the data is gathered
Only cares about the
format to the extent it
is easy to support
Only cares that the
delivery method is
easy to support
Separation of Concerns
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Screen Real Estate
Controllers
Technical Capabilities
Resource-Based API
vs.
Experience-Based API
Resource-Based Requests
• /users/<id>/ratings/title
• /users/<id>/queues
• /users/<id>/queues/instant
• /users/<id>/recommendations
• /catalog/titles/movie
• /catalog/titles/series
• /catalog/people
REST API
RECOMME
NDATIONS
MOVIE
DATA
SIMILAR
MOVIES
AUTH
MEMBER
DATA
A/B
TESTS
START-
UP
RATINGS
Network Border Network Border
Experience-Based Requests
• /ps3/homescreen
JAVA API
Network Border Network Border
RECOMME
NDATIONS
MOVIE
DATA
SIMILAR
MOVIES
AUTH
MEMBER
DATA
A/B
TESTS
START-
UP
RATINGS
Client Adapter Code
Be Pragmatic, Not Dogmatic
Common API Debates
• XML / JSON
• REST / SOAP
• OAuth / Other
• Versioning
• Hypermedia
Who Cares!?!?
Just Solve Problems for your
Audience
Embrace Change
Impermanence and Versionless APIs
v1.0
v1.5
v2.0
Versioning for APIs
1.0
1.5
2.0
3.0?
4.0?
5.0?
2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
Eliminate Versioning?
1.0
1.5
2.0
New Architecture
2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
JAVA API
Network Border Network Border
RECOMME
NDATIONS
MOVIE
DATA
SIMILAR
MOVIES
AUTH
MEMBER
DATA
A/B
TESTS
START-
UP
RATINGS
Act Fast, React Fast
Favor Velocity Over Completeness
Delivery Using Buckets
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Testing
Production Traffic
Old Code (Baseline) New Code (Canary)
~1% Traffic
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Deployments
Old Code New Code
Production Traffic
Enable Others to
Act Fast, React Fast
JAVA API
Network Border Network Border
RECOMME
NDATIONS
MOVIE
DATA
SIMILAR
MOVIES
AUTH
MEMBER
DATA
A/B
TESTS
START-
UP
RATINGS
Dynamically deployed
endpoints
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Statically deployed
libraries
Dynamically deployed
endpoints
Dependency Canaries
Personalization
Service
API
• Build
• Test
• Deploy Service
• Release Lib
Pers.
Lib
• Integrate Lib
• Build
• Test
• Deploy Service
UI Script
Iterations in Hours or Days
Access
Data
Personalization
Service
API
• Build
• Test
• Deploy Service
• Release Lib
• Publish to API
Pers.
Lib
UI Script
Iterations in Minutes?
Access
Data
• Integrate Lib
• Build
• Test
• Deploy Service
Internal Developers Need
Engagement Too
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Documentation
Tools
REPL:
Trainings
Failure is Inevitable
~5,000,000,000
Requests per day
~35
Dependencies
~600
Libraries
Things will break!
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Scale at All Costs
-
10
20
30
40
50
60
June, 2010 June, 2011 June, 2012
RequestsinBillions
API Requests Per Month
Incoming Traffic
Predictive Auto Scaling
Predicted vs. Actual RPS
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Reactive + Predictive Autoscaling
No. of instances
1. Know Your Audience
2. Separation of Concerns
3. One Size Doesn’t Fit All
4. Be Pragmatic, Not
Dogmatic
5. Embrace Change
1. Act Fast, React Fast
2. Enable Others to Act
Fast, React Fast
3. Internal Developers
Need Engagement Too
4. Failure is Inevitable
5. Scale at All Costs
http://github.com/Netflix
Daniel Jacobson
• @daniel_jacobson
• http://www.linkedin.com/in/danieljacobson
• http://www.slideshare.net/danieljacobson
Sangeeta Narayanan
• @sangeetan
• http://www.linkedin.com/in/sangeetanarayanan/

More Related Content

What's hot

Building an API Security Strategy
Building an API Security StrategyBuilding an API Security Strategy
Building an API Security StrategySmartBear
 
Tracking and business intelligence
Tracking and business intelligenceTracking and business intelligence
Tracking and business intelligenceSebastian Schleicher
 
CQRS recipes or how to cook your architecture
CQRS recipes or how to cook your architectureCQRS recipes or how to cook your architecture
CQRS recipes or how to cook your architectureThomas Jaskula
 
Connect SAP Business One using Service Layer (HANA)
Connect SAP Business One using Service Layer (HANA)Connect SAP Business One using Service Layer (HANA)
Connect SAP Business One using Service Layer (HANA)APPSeCONNECT
 
MicroService Architecture
MicroService ArchitectureMicroService Architecture
MicroService ArchitectureFred George
 
The Architecture of an API Platform
The Architecture of an API PlatformThe Architecture of an API Platform
The Architecture of an API PlatformJohannes Ridderstedt
 
Amazon Aurora 신규 서비스 알아보기::최유정::AWS Summit Seoul 2018
Amazon Aurora 신규 서비스 알아보기::최유정::AWS Summit Seoul 2018Amazon Aurora 신규 서비스 알아보기::최유정::AWS Summit Seoul 2018
Amazon Aurora 신규 서비스 알아보기::최유정::AWS Summit Seoul 2018Amazon Web Services Korea
 
Designing APIs with OpenAPI Spec
Designing APIs with OpenAPI SpecDesigning APIs with OpenAPI Spec
Designing APIs with OpenAPI SpecAdam Paxton
 
Implement API Gateway using Azure API Management
Implement API Gateway using Azure API ManagementImplement API Gateway using Azure API Management
Implement API Gateway using Azure API ManagementAlexander Laysha
 
Continuous Integration
Continuous IntegrationContinuous Integration
Continuous Integrationdrluckyspin
 
apidays Paris 2022 - Event-Driven API Management – why REST isn't enough, Ben...
apidays Paris 2022 - Event-Driven API Management – why REST isn't enough, Ben...apidays Paris 2022 - Event-Driven API Management – why REST isn't enough, Ben...
apidays Paris 2022 - Event-Driven API Management – why REST isn't enough, Ben...apidays
 
학교에선 알려주지 않는 오픈소스이야기 - 박치완님
학교에선 알려주지 않는 오픈소스이야기 - 박치완님학교에선 알려주지 않는 오픈소스이야기 - 박치완님
학교에선 알려주지 않는 오픈소스이야기 - 박치완님NAVER D2
 
Open API and API Management - Introduction and Comparison of Products: TIBCO ...
Open API and API Management - Introduction and Comparison of Products: TIBCO ...Open API and API Management - Introduction and Comparison of Products: TIBCO ...
Open API and API Management - Introduction and Comparison of Products: TIBCO ...Kai Wähner
 

What's hot (20)

Effective API Design
Effective API DesignEffective API Design
Effective API Design
 
How Secure Are Your APIs?
How Secure Are Your APIs?How Secure Are Your APIs?
How Secure Are Your APIs?
 
Building an API Security Strategy
Building an API Security StrategyBuilding an API Security Strategy
Building an API Security Strategy
 
An Introduction To REST API
An Introduction To REST APIAn Introduction To REST API
An Introduction To REST API
 
Api Gateway
Api GatewayApi Gateway
Api Gateway
 
Tracking and business intelligence
Tracking and business intelligenceTracking and business intelligence
Tracking and business intelligence
 
CQRS recipes or how to cook your architecture
CQRS recipes or how to cook your architectureCQRS recipes or how to cook your architecture
CQRS recipes or how to cook your architecture
 
Connect SAP Business One using Service Layer (HANA)
Connect SAP Business One using Service Layer (HANA)Connect SAP Business One using Service Layer (HANA)
Connect SAP Business One using Service Layer (HANA)
 
MicroService Architecture
MicroService ArchitectureMicroService Architecture
MicroService Architecture
 
Algunos Conceptos Claves de DevOps
Algunos Conceptos Claves de DevOpsAlgunos Conceptos Claves de DevOps
Algunos Conceptos Claves de DevOps
 
The Architecture of an API Platform
The Architecture of an API PlatformThe Architecture of an API Platform
The Architecture of an API Platform
 
Amazon Aurora 신규 서비스 알아보기::최유정::AWS Summit Seoul 2018
Amazon Aurora 신규 서비스 알아보기::최유정::AWS Summit Seoul 2018Amazon Aurora 신규 서비스 알아보기::최유정::AWS Summit Seoul 2018
Amazon Aurora 신규 서비스 알아보기::최유정::AWS Summit Seoul 2018
 
Designing APIs with OpenAPI Spec
Designing APIs with OpenAPI SpecDesigning APIs with OpenAPI Spec
Designing APIs with OpenAPI Spec
 
Implement API Gateway using Azure API Management
Implement API Gateway using Azure API ManagementImplement API Gateway using Azure API Management
Implement API Gateway using Azure API Management
 
Continuous Integration
Continuous IntegrationContinuous Integration
Continuous Integration
 
apidays Paris 2022 - Event-Driven API Management – why REST isn't enough, Ben...
apidays Paris 2022 - Event-Driven API Management – why REST isn't enough, Ben...apidays Paris 2022 - Event-Driven API Management – why REST isn't enough, Ben...
apidays Paris 2022 - Event-Driven API Management – why REST isn't enough, Ben...
 
학교에선 알려주지 않는 오픈소스이야기 - 박치완님
학교에선 알려주지 않는 오픈소스이야기 - 박치완님학교에선 알려주지 않는 오픈소스이야기 - 박치완님
학교에선 알려주지 않는 오픈소스이야기 - 박치완님
 
Open API and API Management - Introduction and Comparison of Products: TIBCO ...
Open API and API Management - Introduction and Comparison of Products: TIBCO ...Open API and API Management - Introduction and Comparison of Products: TIBCO ...
Open API and API Management - Introduction and Comparison of Products: TIBCO ...
 
Scaling the Netflix API
Scaling the Netflix APIScaling the Netflix API
Scaling the Netflix API
 
Developer Experience on AWS
Developer Experience on AWSDeveloper Experience on AWS
Developer Experience on AWS
 

Viewers also liked

Netflix Edge Engineering Open House Presentations - June 9, 2016
Netflix Edge Engineering Open House Presentations - June 9, 2016Netflix Edge Engineering Open House Presentations - June 9, 2016
Netflix Edge Engineering Open House Presentations - June 9, 2016Daniel Jacobson
 
Redesigning the Netflix API - OSCON
Redesigning the Netflix API - OSCONRedesigning the Netflix API - OSCON
Redesigning the Netflix API - OSCONDaniel Jacobson
 
Maintaining the Netflix Front Door - Presentation at Intuit Meetup
Maintaining the Netflix Front Door - Presentation at Intuit MeetupMaintaining the Netflix Front Door - Presentation at Intuit Meetup
Maintaining the Netflix Front Door - Presentation at Intuit MeetupDaniel Jacobson
 
Set Your Content Free! : Case Studies from Netflix and NPR
Set Your Content Free! : Case Studies from Netflix and NPRSet Your Content Free! : Case Studies from Netflix and NPR
Set Your Content Free! : Case Studies from Netflix and NPRDaniel Jacobson
 
API Revolutions : Netflix's API Redesign
API Revolutions : Netflix's API RedesignAPI Revolutions : Netflix's API Redesign
API Revolutions : Netflix's API RedesignDaniel Jacobson
 
Culture (Original 2009 version)
Culture (Original 2009 version)Culture (Original 2009 version)
Culture (Original 2009 version)Reed Hastings
 
How Do Developers React to API Deprecation? The Case of a Smalltalk Ecosystem
How Do Developers React to API Deprecation? The Case of a Smalltalk EcosystemHow Do Developers React to API Deprecation? The Case of a Smalltalk Ecosystem
How Do Developers React to API Deprecation? The Case of a Smalltalk Ecosystemmircea.lungu
 
API Business Models
API Business ModelsAPI Business Models
API Business ModelsJohn Musser
 
Scaling the Netflix API - From Atlassian Dev Den
Scaling the Netflix API - From Atlassian Dev DenScaling the Netflix API - From Atlassian Dev Den
Scaling the Netflix API - From Atlassian Dev DenDaniel Jacobson
 
Versioning schemes and branching models for Continuous Delivery - Continuous ...
Versioning schemes and branching models for Continuous Delivery - Continuous ...Versioning schemes and branching models for Continuous Delivery - Continuous ...
Versioning schemes and branching models for Continuous Delivery - Continuous ...Pavel Chunyayev
 
History and Future of the Netflix API - Mashery Evolution of Distribution
History and Future of the Netflix API - Mashery Evolution of DistributionHistory and Future of the Netflix API - Mashery Evolution of Distribution
History and Future of the Netflix API - Mashery Evolution of DistributionDaniel Jacobson
 
Netflix API - Presentation to PayPal
Netflix API - Presentation to PayPalNetflix API - Presentation to PayPal
Netflix API - Presentation to PayPalDaniel Jacobson
 
Automotive Grade APIs – designing for longevity
Automotive Grade APIs – designing for longevityAutomotive Grade APIs – designing for longevity
Automotive Grade APIs – designing for longevityNordic APIs
 
Why should C-Level care about APIs? It's the new economy, stupid.
Why should C-Level care about APIs? It's the new economy, stupid.Why should C-Level care about APIs? It's the new economy, stupid.
Why should C-Level care about APIs? It's the new economy, stupid.Fabernovel
 
NuGet package CI and CD
NuGet package CI and CDNuGet package CI and CD
NuGet package CI and CDYu GUAN
 

Viewers also liked (20)

Netflix Edge Engineering Open House Presentations - June 9, 2016
Netflix Edge Engineering Open House Presentations - June 9, 2016Netflix Edge Engineering Open House Presentations - June 9, 2016
Netflix Edge Engineering Open House Presentations - June 9, 2016
 
Culture
CultureCulture
Culture
 
Redesigning the Netflix API - OSCON
Redesigning the Netflix API - OSCONRedesigning the Netflix API - OSCON
Redesigning the Netflix API - OSCON
 
Maintaining the Netflix Front Door - Presentation at Intuit Meetup
Maintaining the Netflix Front Door - Presentation at Intuit MeetupMaintaining the Netflix Front Door - Presentation at Intuit Meetup
Maintaining the Netflix Front Door - Presentation at Intuit Meetup
 
Netflix API
Netflix APINetflix API
Netflix API
 
Set Your Content Free! : Case Studies from Netflix and NPR
Set Your Content Free! : Case Studies from Netflix and NPRSet Your Content Free! : Case Studies from Netflix and NPR
Set Your Content Free! : Case Studies from Netflix and NPR
 
API Revolutions : Netflix's API Redesign
API Revolutions : Netflix's API RedesignAPI Revolutions : Netflix's API Redesign
API Revolutions : Netflix's API Redesign
 
Culture (Original 2009 version)
Culture (Original 2009 version)Culture (Original 2009 version)
Culture (Original 2009 version)
 
Sulle tracce di Tullo Morgagni
Sulle tracce di Tullo MorgagniSulle tracce di Tullo Morgagni
Sulle tracce di Tullo Morgagni
 
How Do Developers React to API Deprecation? The Case of a Smalltalk Ecosystem
How Do Developers React to API Deprecation? The Case of a Smalltalk EcosystemHow Do Developers React to API Deprecation? The Case of a Smalltalk Ecosystem
How Do Developers React to API Deprecation? The Case of a Smalltalk Ecosystem
 
API Business Models
API Business ModelsAPI Business Models
API Business Models
 
Scaling the Netflix API - From Atlassian Dev Den
Scaling the Netflix API - From Atlassian Dev DenScaling the Netflix API - From Atlassian Dev Den
Scaling the Netflix API - From Atlassian Dev Den
 
Versioning schemes and branching models for Continuous Delivery - Continuous ...
Versioning schemes and branching models for Continuous Delivery - Continuous ...Versioning schemes and branching models for Continuous Delivery - Continuous ...
Versioning schemes and branching models for Continuous Delivery - Continuous ...
 
History and Future of the Netflix API - Mashery Evolution of Distribution
History and Future of the Netflix API - Mashery Evolution of DistributionHistory and Future of the Netflix API - Mashery Evolution of Distribution
History and Future of the Netflix API - Mashery Evolution of Distribution
 
Netflix API - Presentation to PayPal
Netflix API - Presentation to PayPalNetflix API - Presentation to PayPal
Netflix API - Presentation to PayPal
 
Automotive Grade APIs – designing for longevity
Automotive Grade APIs – designing for longevityAutomotive Grade APIs – designing for longevity
Automotive Grade APIs – designing for longevity
 
Why should C-Level care about APIs? It's the new economy, stupid.
Why should C-Level care about APIs? It's the new economy, stupid.Why should C-Level care about APIs? It's the new economy, stupid.
Why should C-Level care about APIs? It's the new economy, stupid.
 
Zuul @ Netflix SpringOne Platform
Zuul @ Netflix SpringOne PlatformZuul @ Netflix SpringOne Platform
Zuul @ Netflix SpringOne Platform
 
NuGet package CI and CD
NuGet package CI and CDNuGet package CI and CD
NuGet package CI and CD
 
Separation of concerns - DPC12
Separation of concerns - DPC12Separation of concerns - DPC12
Separation of concerns - DPC12
 

Similar to Top 10 Lessons Learned from the Netflix API - OSCON 2014

Oscon2014 Netflix API - Top 10 Lessons Learned
Oscon2014 Netflix API - Top 10 Lessons LearnedOscon2014 Netflix API - Top 10 Lessons Learned
Oscon2014 Netflix API - Top 10 Lessons LearnedSangeeta Narayanan
 
Maintaining the Front Door to Netflix
Maintaining the Front Door to NetflixMaintaining the Front Door to Netflix
Maintaining the Front Door to NetflixBenjamin Schmaus
 
Lessons learned on the Azure API Stewardship Journey.pptx
Lessons learned on the Azure API Stewardship Journey.pptxLessons learned on the Azure API Stewardship Journey.pptx
Lessons learned on the Azure API Stewardship Journey.pptxapidays
 
Building the Eventbrite API Ecosystem
Building the Eventbrite API EcosystemBuilding the Eventbrite API Ecosystem
Building the Eventbrite API EcosystemMitch Colleran
 
Building a REST API for Longevity
Building a REST API for LongevityBuilding a REST API for Longevity
Building a REST API for LongevityMuleSoft
 
APIs as a Product Strategy
APIs as a Product StrategyAPIs as a Product Strategy
APIs as a Product StrategyRavi Kumar
 
The Ultimate API Publisher's Guide
The Ultimate API Publisher's GuideThe Ultimate API Publisher's Guide
The Ultimate API Publisher's GuidePronovix
 
Best Practices for Salesforce Data Access
Best Practices for Salesforce Data AccessBest Practices for Salesforce Data Access
Best Practices for Salesforce Data AccessSalesforce Developers
 
Service api design validation & collaboration
Service api design validation & collaborationService api design validation & collaboration
Service api design validation & collaborationUchit Vyas ☁
 
apidays LIVE New York 2021 - Service API design validation by Uchit Vyas, KPMG
apidays LIVE New York 2021 - Service API design validation by Uchit Vyas, KPMGapidays LIVE New York 2021 - Service API design validation by Uchit Vyas, KPMG
apidays LIVE New York 2021 - Service API design validation by Uchit Vyas, KPMGapidays
 
Pain Points In API Development? They’re Everywhere
Pain Points In API Development? They’re EverywherePain Points In API Development? They’re Everywhere
Pain Points In API Development? They’re EverywhereNordic APIs
 
Recipes for API Ninjas
Recipes for API NinjasRecipes for API Ninjas
Recipes for API NinjasNordic APIs
 
API Strategy Introduction
API Strategy IntroductionAPI Strategy Introduction
API Strategy IntroductionDoug Gregory
 
DataHero / Eventbrite - API Best Practices
DataHero / Eventbrite - API Best PracticesDataHero / Eventbrite - API Best Practices
DataHero / Eventbrite - API Best PracticesJeff Zabel
 
Practical Application of API-First in microservices development
Practical Application of API-First in microservices developmentPractical Application of API-First in microservices development
Practical Application of API-First in microservices developmentChavdar Baikov
 
Portal and Intranets
Portal and Intranets Portal and Intranets
Portal and Intranets Redar Ismail
 
Presentation to ESPN about the Netflix API
Presentation to ESPN about the Netflix APIPresentation to ESPN about the Netflix API
Presentation to ESPN about the Netflix APIDaniel Jacobson
 
Why API? - Business of APIs Conference
Why API? - Business of APIs ConferenceWhy API? - Business of APIs Conference
Why API? - Business of APIs ConferenceDaniel Jacobson
 

Similar to Top 10 Lessons Learned from the Netflix API - OSCON 2014 (20)

Oscon2014 Netflix API - Top 10 Lessons Learned
Oscon2014 Netflix API - Top 10 Lessons LearnedOscon2014 Netflix API - Top 10 Lessons Learned
Oscon2014 Netflix API - Top 10 Lessons Learned
 
Maintaining the Front Door to Netflix
Maintaining the Front Door to NetflixMaintaining the Front Door to Netflix
Maintaining the Front Door to Netflix
 
Lessons learned on the Azure API Stewardship Journey.pptx
Lessons learned on the Azure API Stewardship Journey.pptxLessons learned on the Azure API Stewardship Journey.pptx
Lessons learned on the Azure API Stewardship Journey.pptx
 
Building the Eventbrite API Ecosystem
Building the Eventbrite API EcosystemBuilding the Eventbrite API Ecosystem
Building the Eventbrite API Ecosystem
 
API ARU-ARU
API ARU-ARUAPI ARU-ARU
API ARU-ARU
 
Building a REST API for Longevity
Building a REST API for LongevityBuilding a REST API for Longevity
Building a REST API for Longevity
 
Flavours of APIs
Flavours of APIs Flavours of APIs
Flavours of APIs
 
APIs as a Product Strategy
APIs as a Product StrategyAPIs as a Product Strategy
APIs as a Product Strategy
 
The Ultimate API Publisher's Guide
The Ultimate API Publisher's GuideThe Ultimate API Publisher's Guide
The Ultimate API Publisher's Guide
 
Best Practices for Salesforce Data Access
Best Practices for Salesforce Data AccessBest Practices for Salesforce Data Access
Best Practices for Salesforce Data Access
 
Service api design validation & collaboration
Service api design validation & collaborationService api design validation & collaboration
Service api design validation & collaboration
 
apidays LIVE New York 2021 - Service API design validation by Uchit Vyas, KPMG
apidays LIVE New York 2021 - Service API design validation by Uchit Vyas, KPMGapidays LIVE New York 2021 - Service API design validation by Uchit Vyas, KPMG
apidays LIVE New York 2021 - Service API design validation by Uchit Vyas, KPMG
 
Pain Points In API Development? They’re Everywhere
Pain Points In API Development? They’re EverywherePain Points In API Development? They’re Everywhere
Pain Points In API Development? They’re Everywhere
 
Recipes for API Ninjas
Recipes for API NinjasRecipes for API Ninjas
Recipes for API Ninjas
 
API Strategy Introduction
API Strategy IntroductionAPI Strategy Introduction
API Strategy Introduction
 
DataHero / Eventbrite - API Best Practices
DataHero / Eventbrite - API Best PracticesDataHero / Eventbrite - API Best Practices
DataHero / Eventbrite - API Best Practices
 
Practical Application of API-First in microservices development
Practical Application of API-First in microservices developmentPractical Application of API-First in microservices development
Practical Application of API-First in microservices development
 
Portal and Intranets
Portal and Intranets Portal and Intranets
Portal and Intranets
 
Presentation to ESPN about the Netflix API
Presentation to ESPN about the Netflix APIPresentation to ESPN about the Netflix API
Presentation to ESPN about the Netflix API
 
Why API? - Business of APIs Conference
Why API? - Business of APIs ConferenceWhy API? - Business of APIs Conference
Why API? - Business of APIs Conference
 

More from Daniel Jacobson

Maintaining the Front Door to Netflix : The Netflix API
Maintaining the Front Door to Netflix : The Netflix APIMaintaining the Front Door to Netflix : The Netflix API
Maintaining the Front Door to Netflix : The Netflix APIDaniel Jacobson
 
Scaling the Netflix API - OSCON
Scaling the Netflix API - OSCONScaling the Netflix API - OSCON
Scaling the Netflix API - OSCONDaniel Jacobson
 
Netflix API: Keynote at Disney Tech Conference
Netflix API: Keynote at Disney Tech ConferenceNetflix API: Keynote at Disney Tech Conference
Netflix API: Keynote at Disney Tech ConferenceDaniel Jacobson
 
Techniques for Scaling the Netflix API - QCon SF
Techniques for Scaling the Netflix API - QCon SFTechniques for Scaling the Netflix API - QCon SF
Techniques for Scaling the Netflix API - QCon SFDaniel Jacobson
 
APIs for Internal Audiences - Netflix - App Dev Conference
APIs for Internal Audiences - Netflix - App Dev ConferenceAPIs for Internal Audiences - Netflix - App Dev Conference
APIs for Internal Audiences - Netflix - App Dev ConferenceDaniel Jacobson
 
Netflix API : BAPI 2011 Presentation : SF
Netflix API : BAPI 2011 Presentation : SFNetflix API : BAPI 2011 Presentation : SF
Netflix API : BAPI 2011 Presentation : SFDaniel Jacobson
 
The future-of-netflix-api
The future-of-netflix-apiThe future-of-netflix-api
The future-of-netflix-apiDaniel Jacobson
 
NPR Presentation at Wolfram Data Summit 2010
NPR Presentation at Wolfram Data Summit 2010NPR Presentation at Wolfram Data Summit 2010
NPR Presentation at Wolfram Data Summit 2010Daniel Jacobson
 
NPR: Digital Distribution Strategy: OSCON2010
NPR: Digital Distribution Strategy: OSCON2010NPR: Digital Distribution Strategy: OSCON2010
NPR: Digital Distribution Strategy: OSCON2010Daniel Jacobson
 
NPR's Digital Distribution and Mobile Strategy
NPR's Digital Distribution and Mobile StrategyNPR's Digital Distribution and Mobile Strategy
NPR's Digital Distribution and Mobile StrategyDaniel Jacobson
 
NPR API Usage and Metrics
NPR API Usage and MetricsNPR API Usage and Metrics
NPR API Usage and MetricsDaniel Jacobson
 
OpenID Adoption UX Summit
OpenID Adoption UX SummitOpenID Adoption UX Summit
OpenID Adoption UX SummitDaniel Jacobson
 

More from Daniel Jacobson (13)

Maintaining the Front Door to Netflix : The Netflix API
Maintaining the Front Door to Netflix : The Netflix APIMaintaining the Front Door to Netflix : The Netflix API
Maintaining the Front Door to Netflix : The Netflix API
 
Scaling the Netflix API - OSCON
Scaling the Netflix API - OSCONScaling the Netflix API - OSCON
Scaling the Netflix API - OSCON
 
Netflix API: Keynote at Disney Tech Conference
Netflix API: Keynote at Disney Tech ConferenceNetflix API: Keynote at Disney Tech Conference
Netflix API: Keynote at Disney Tech Conference
 
Techniques for Scaling the Netflix API - QCon SF
Techniques for Scaling the Netflix API - QCon SFTechniques for Scaling the Netflix API - QCon SF
Techniques for Scaling the Netflix API - QCon SF
 
APIs for Internal Audiences - Netflix - App Dev Conference
APIs for Internal Audiences - Netflix - App Dev ConferenceAPIs for Internal Audiences - Netflix - App Dev Conference
APIs for Internal Audiences - Netflix - App Dev Conference
 
Netflix API : BAPI 2011 Presentation : SF
Netflix API : BAPI 2011 Presentation : SFNetflix API : BAPI 2011 Presentation : SF
Netflix API : BAPI 2011 Presentation : SF
 
The future-of-netflix-api
The future-of-netflix-apiThe future-of-netflix-api
The future-of-netflix-api
 
NPR Presentation at Wolfram Data Summit 2010
NPR Presentation at Wolfram Data Summit 2010NPR Presentation at Wolfram Data Summit 2010
NPR Presentation at Wolfram Data Summit 2010
 
NPR: Digital Distribution Strategy: OSCON2010
NPR: Digital Distribution Strategy: OSCON2010NPR: Digital Distribution Strategy: OSCON2010
NPR: Digital Distribution Strategy: OSCON2010
 
NPR's Digital Distribution and Mobile Strategy
NPR's Digital Distribution and Mobile StrategyNPR's Digital Distribution and Mobile Strategy
NPR's Digital Distribution and Mobile Strategy
 
NPR API Usage and Metrics
NPR API Usage and MetricsNPR API Usage and Metrics
NPR API Usage and Metrics
 
OpenID Adoption UX Summit
OpenID Adoption UX SummitOpenID Adoption UX Summit
OpenID Adoption UX Summit
 
NPR : Examples of COPE
NPR : Examples of COPENPR : Examples of COPE
NPR : Examples of COPE
 

Recently uploaded

UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?IES VE
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-pyJamie (Taka) Wang
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfJamie (Taka) Wang
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureEric D. Schabell
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDELiveplex
 

Recently uploaded (20)

UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
 

Top 10 Lessons Learned from the Netflix API - OSCON 2014

Editor's Notes

  1. The lessons that we discuss in these slides fall into two buckets: strategy and implementation.
  2. In some cases, the audience will be a small set of known developers (SSKDs). These developers are generally engineers within your company or one with whom you are partnering.
  3. In other cases, the audience may be a large set of unknown developers. This audience is typically associated with public APIs.
  4. And in some cases, the API will target both audience types.
  5. This is a short list of the things that the target audience will influence.
  6. For Netflix, we started out with a public API, with the audience being a large set of unknown developers. There were no internal use cases at launch.
  7. Based on the target audience of unknown developers, we staffed accordingly. The team was relatively small, with skills around development, evangelism, partnering, testing and documentation.
  8. As streaming became more critical to the company, we started having devices use the API. Our first mistake was that we were probably too late to pivot our architecture based on our change in target audience. At the time, we had many devices call into our REST API, the same one that we used for the unknown developers.
  9. But eventually, the data demonstrated that the architectural change was needed. This chart shows that the private API completely drarfs the public API in terms of requests. The private API does about five billion requests per day while the public API does between one and two million. This disparity clearly demonstrates the need for us to target the API to the small set of known developers – Netflix’s UI engineers – who build the vast majority of the experiences on Netflix devices.
  10. Given the shift in responsibilities, we positioned the team accordingly, hiring for skills mostly around engineering.
  11. And the team size grew by about 6x in the last few years. If the target audience was still the public API, it is likely that the team size would have grown, but less significantly (perhaps 2x) in that time frame.
  12. API consumers care a lot about data formatting and delivery, but each consumer, in such a diverse ecosystem, cares about them differently. For some devices, they may want an XML payload delivered as a complete document, while others may need JSON, protobuffer or some other format, potentially delivered as streamed bits. Because of these diverse needs, we need to separate out the concerns to better enable the consumers to get what they need.
  13. Most companies focus on a small handful of device implementations, most notably Android and iOS devices.
  14. At Netflix, we have more than 1,000 different device types that we support. Across those devices, there is a high degree of variability. As a result, we have seen inefficiencies and problems emerge across our implementations. Those issues also translate into issues with the API interaction.
  15. For example, screen size could significantly affect what the API should deliver to the UI. TVs with bigger screens that can potentially fit more titles and more metadata per title than a mobile phone. Do we need to send all of the extra bits for fields or items that are not needed, requiring the device itself to drop items on the floor? Or can we optimize the deliver of those bits on a per-device basis? Different devices have different controllers as well. Some, like the iPad, allow for fast swipe interactions so the content needs to be there for the entire row. Other devices, like smart TVs or game some game consoles have LRUD controllers, so it at least gives the opportunity to fetch the data as the row gets navigated. And the technical capabilities of the devices will influence the interactions as well. Some have more computing power or memory which will influence how much data you can process on the device vs. how much needs to be gathered in real-time.
  16. We evolved our discussion towards what ultimately became a discussion between resource-based APIs and experience-based APIs.
  17. The original one-size-fits-all API was very resource oriented with granular requests for specific data, delivering specific documents in specific formats.
  18. The interaction model looked basically like this, with (in this example) the PS3 making many calls across the network to the OSFA API. The API ultimately called back to dependent services to get the corresponding data needed to satisfy the requests.
  19. We have decided to pursue an experience-based approach instead. Rather than making many API requests to assemble the PS3 home screen, the PS3 will potentially make a single request to a custom, optimized endpoint.
  20. In an experience-based interaction, the PS3 can potentially make a single request across the network border to a scripting layer (currently Groovy), in this example to provide the data for the PS3 home screen. The call goes to a very specific, custom endpoint for the PS3 or for a shared UI. The Groovy script then interprets what is needed for the PS3 home screen and triggers a series of calls to the Java API running in the same JVM as the Groovy scripts. The Java API is essentially a series of methods that individually know how to gather the corresponding data from the dependent services. The Java API then returns the data to the Groovy script who then formats and delivers the very specific data back to the PS3.
  21. Our original REST API had granular endpoints and generic interaction models. This leads to different versions when significant changes are made. The REST API had three primary version before our move to the experience-based API.
  22. If we persisted in the REST API, we very likely could have continued to add versions while needing to support the old ones. The need to support prior versions stems from older device implementations that may not be able to updated or retired, thus forcing us to maintain these endpoints for a long time (perhaps as long as 10 years).
  23. Our target with the experience-based API was to build an architecture that allowed us to be versionless. Through SSKDs, separation of concerns, abstraction layers, and interaction optimizations, we are able move to a deprecation model.
  24. The primary goal is to limit versioning in the device-to-server interaction. Ideally, we can deprecate effectively in the server interactions as well, but that is sometimes more difficult. Back to our architecture view, the data can now flow from the services into the Java APIs. We expose granular methods (think data elements rather than resources) to the scripting tier. If a method needs to change, we can add a new method and then work closely with the SSKDs to migrate the calling scripts, enabling us to deprecate the old method. If we are not able to move the scripts, we can insulate the devices from the change either in the Java layer or in the scripting tier.
  25. Several years ago, we were deploying changes roughly every two weeks. We would accumulate changes over that time and then drop them into production all at once. Think of it as gathering water in a bucket.
  26. What we found was that our releases were unpredictable, sometimes resulting in outages, broken functionality, or incomplete work. Accordingly, we decided to slow down, changing our release cycles to three weeks. We figured that would give us more time to test our work. In other words, we got a larger bucket.
  27. Over time, however, we learned that the longer release cycle didn’t improve predictability or quality. Instead, it just slowed us down. In response, we moved aggressively towards continuous delivery. Instead of delivering water in buckets, we had a steady stream of water from a hose. This enabled us to have smaller changes, more isolated and testable, pushed to production instead of having bigger releases with more complexity.
  28. This is how code flows through the system. We have multiple canary releases per day. Internal envs are deployed ~8 times/day in 3 AWS regions. Prod deployments happen 2-3 times/week and can be triggered on demand.
  29. This dashboard lets us track the status of our master branch at any time. Builds that fail at any step in the pipeline are stopped from going further.
  30. A quick word on Testing. We follow the ‘Operate what you Build’ model where developers are responsible for shepherding their changes all the way through to production. We provide them with the tools necessary to help them gain confidence in the quality of their code. One such tool is the automated Canary Analyzer.
  31. Canary Analysis is the process wherein a small percent of traffic is routed to the new code and its performance is compared against the old code based on 1000s of metrics.
  32. A detailed report gives further insight into potential problem areas. In this case, our canary gives a score of 87%, which means it is likely not ready for release.
  33. In tandem with canaries, we use Red/Black deployments as well.
  34. The Red/Black process allows us to run production code in one cluster while we spin up the new code in a second one. As the new code proves itself, we can route all traffic to it and eventually shut down the old. It also allows us to have a fast, automated rollback in the even that the new code is seeing problems.
  35. Our architecture enables us to move faster because of the scripting tier. But this also put us in position to help our consuming teams and dependency teams to move faster as well.
  36. Let’s peek under the hood of the API Server. Client teams deploy endpoints dynamically based on their own schedule. Their cycles are completely asynchronous of server deployments. Newly deployed endpoints are live and ready to take traffic within minutes.
  37. Endpoint Activity Dashboard shows recent deployment activity. Rollbacks can be performed in a matter of minutes as well.
  38. Our dependent services provide to us client libraries that get compiled into our JVM upon deployment. These libraries typically expose static interfaces, which means changes to the interfaces require coding and deployments without our contain. Similar to the dynamic endpoints, we also have opportunity to improve the nimbleness and velocity around these libraries.
  39. One such improvement is dependency canaries, where we are evaluating our new code against the dependencies. This is a dashboard the provides insights into these canaries.
  40. Making the interaction with the consumers of the API dynamic has led to increased agility on the UI side. We are also exploring ways to increase the speed of iteration on the dependencies side. The current interaction model uses static domain models and client libraries to handle the data flow through the API. This results in long iteration cycles for even the simplest of use cases. We are actively pursuing an approach where our dependencies will be able to expose new data by using dynamic pass-through model using a Dictionary of key values.
  41. The idea is that this model will avoid the static update cycle on the API end, thereby resulting in shorter iteration cycles. This will require investment in things like safety checks and discoverability of the API. We are instrumenting the API layer to inspect traffic at runtime and provide insights into API usage.
  42. One of the early mistakes that we made in this new architecture was not treating internal developers like we did public developers. We don’t need the same degree of evangelism, but we do need to maintain strong communications with the client teams while providing robust tools and systems to help them be better developers in our system. An example of us being late to this is represented by our endpoint dashboard. One of our teams went from having about 30 scripts to about 500 in a matter of weeks. Each of these scripts are dynamically compiled into the JVM, occupying permgen space. As the script count shot up, we hit limits in our permgen which resulted in an outage. And an outage in our layer means people cannot stream Netflix. Of course, there is nothing like an outage to kickstart new behaviors. As a result, we immediately set up alerts and then focused more heavily on building tools to support the developers.
  43. Included in that effort is comprehensive documentation.
  44. We built an array of tools as well, including this REPL.
  45. And prepared frequent trainings and videos.
  46. Nobody has a 100% SLA, so things will fail
  47. In fact, a few years ago, we have many failures on a routine basis.
  48. Many of those failures were a result of failures in a dependent service that we did a poor job of protecting against. Because we are the last step before delivering content to the customers, we have a unique opportunity to help protect customers from such failures.
  49. Hystrix allows us to be resilient to failure by implementing the bulk-heading and circuit breaker patterns. Hystrix is open source and available at our github repository.
  50. Failure Simulation and Game Day exercises are a key part of the overall story. The Simian army is a fleet of monkeys who are simulate failures and alert us to non-conformities in an automated manner. Chaos Monkey periodically terminates AWS instances in production to see how the system responds to the instance disappearing. Latency Monkey introduces latencies and errors into a service to see how it responds and lets us assess the customer quality of experience. Conformity monkey alerts us to variations in versions of application across regions. The monkeys are also available in our open source github repository
  51. Because of our pivot to the private API and the explosion of devices consuming it, our traffic grew tremendously in a few years (and continues to grow at very fast rates). Scaling our systems to support this growth is absolutely critical to the success of the company. Techniques, such as throttling are not an option because that only serves to limit the interactions from our streaming subscribers. Instead, we need to be able to handle any load that our devices throw at us. This manifests in many ways, but the following is a detail on one of them – instance scaling.
  52. Let’s go back to the traffic chart. The pattern is predictable with higher peaks on the weekends
  53. To offset these limitations, we created Scryer (not yet open sourced, but in production at Netflix). Scryer evaluates needs based on historical data (week over week, month over month metrics), adjusts instance minimums based on algorithms, and relies on Amazon Auto Scaling for unpredicted events
  54. This graph shows that Scryer’s predictions are in line with actual RPS. In production, Scryer allows us to get instances into production prior to the need (which is different than Amazon’s reactive autoscaling engine which triggers the ramp up based on immediate need, only needing to wait until server start-up is complete). Because the instances are there in advance, Scryer smooths out load averages and response times, which in turn improves the customer experience.
  55. This is an example of what Scryer looks like during an outage. When actual traffic dropped because of an outage, the reactive autoscaling engine would have downsized the farm. In this case, Scryer kept the farm sized correctly so that we were able to deal with the traffic spike after the recovery.
  56. As a side benefit (not the initial intent), Scryer also allows us to be more precise with our instance counts, reducing inefficiencies.