The human side of services
John Billings, Joseph Lynch
{billings,jlynch}@yelp.com
Yelp’s Mission:
Connecting people with great
local businesses.
Yelp Stats:
As of Q3 2015
89M mobile unique visitors · 90M reviews · 71% of searches from mobile · 32 countries
What’s Important?
What’s Really Important?
Getting Started
[Chart: yelp-main lines of code (LoC), from 0 in 2005 to 3,000,000 in 2016]
2011: A service is born
2011 - 2016: A Cambrian
Explosion of services
2011 - 2016: New practices
Education
Why?
“There are only two hard problems in distributed
systems:
2. Exactly-once delivery
1. Guaranteed order of messages
2. Exactly once delivery”
● Mathias Verraes
How?
Consistency
Tooling
"/pets": {
  "get": {
    "description": "Returns all pets from the system",
    "responses": {
      "200": {
        "description": "A list of pets.",
        "schema": {
          "type": "array",
          "items": {
            "$ref": "#/definitions/Pet"
          }
        }
      }
    }
  }
}
Interface design
Organizational Objectives
Org Objectives
Distributed System Objectives
● Performance
● Security
● Stability
● Cost
These are cross-cutting objectives that are crucial in an SOA.
Org Objectives
No problem …
● Performance is important!
● Security is important!
● Stability is important!
● Cost is important!
… all done right?
Org Objectives
What gets measured gets managed
- Drucker
Good Objective Metric: Performance
Good Objective Metric: Security
Good Objective Metric: Reliability
Good Objective Metric: Cost
Service level objectives
● Contracts are cool
● Performance is cool
● Uptime is cool
● Keeping cost low is cool
… So write these things down in a Contract and page
owners when we violate them?
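One way to picture "write it down and page the owner" is the minimal sketch below. It is an illustration only, not how any particular monitoring stack works: the `ServiceLevelObjective` fields, the thresholds, the service and team names, and the `page_owner` stand-in are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevelObjective:
    """A written-down contract for one service (all values hypothetical)."""
    service: str
    owner: str
    max_p99_latency_ms: float
    min_availability: float  # fraction of successful requests, 0.0 to 1.0


def p99(latencies_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    rank = (99 * len(ordered) + 99) // 100  # integer ceil(0.99 * n)
    return ordered[rank - 1]


def violations(slo, latencies_ms, successes, total):
    """Return a human-readable list of the objectives that were missed."""
    missed = []
    observed_p99 = p99(latencies_ms)
    if observed_p99 > slo.max_p99_latency_ms:
        missed.append(
            f"p99 latency {observed_p99:.0f}ms > {slo.max_p99_latency_ms:.0f}ms"
        )
    availability = successes / total
    if availability < slo.min_availability:
        missed.append(
            f"availability {availability:.4f} < {slo.min_availability:.4f}"
        )
    return missed


def page_owner(slo, missed):
    """Stand-in for a real paging integration: it only builds the message."""
    return f"PAGE {slo.owner}: {slo.service} violated SLO: " + "; ".join(missed)
```

A caller would evaluate `violations(...)` over a recent window of requests and call `page_owner` only when the returned list is non-empty, so the contract, the measurement, and the escalation all live in one place.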
Operating Services
Ownership
Dealing with Failure
Microservice Pitfalls
(from those that have fallen)
Lots of Flowers
“Let 1000 flowers
bloom. Then rip 999 of them
out by the roots”
- Peter Seibel
Append Only Services
[Diagram: a growing grid of Service boxes]
Ditching Libraries
● Libraries can be pretty terrible
– Break tests
– Deploy 20 versions
– Function calls work only in ${language}
– Bugs can take weeks to fix
– Tests can take a long time
Service
Library
That’s not a SOA
… that’s just RPC
Ditching Libraries
● Libraries can be pretty awesome
– Break tests not websites
– Deploy 20 versions
– Function calls are wicked fast
– Have weeks to fix a bug
– Unit tests are fast
Everybody does Ops?
Dev vs Ops = Dev frustration, Ops burnout
Dev + Ops = ?
● Not all Devs can (want to) do Ops
● Not all Ops can (want to) do Dev
● What about?
○ DBAs
○ Security Engineers
○ Designers
Everybody does Ops?
What to Aim For Instead?
1. Encourage cooperation
2. Acknowledge your engineers have varied skills: {Ops, Dev, Security, Databases, Design, Frontend, API design, Performance, etc …}
3. Try to build teams that have a wide range of skills
Any questions?
Image Citations
● Sailing Ship: https://en.wikipedia.org/wiki/Sailing_ship
● Programmer with Laptop: https://commons.wikimedia.org/wiki/File:Typing_computer_screen_reflection.jpg
● Moore’s Law: https://en.wikipedia.org/wiki/Moore%27s_law#/media/File:Transistor_Count_and_Moore%27s_Law_-_2011.svg
● Cambrian explosion: https://en.wikipedia.org/wiki/Devonian#/media/File:Fish_evolution.png
● Bus queue: https://en.wikipedia.org/wiki/File:Bus_Queue.jpg
● Deputy badge: https://commons.wikimedia.org/wiki/File:Badge_of_the_San_Diego_County_Sheriff%27s_Department.png
● Deep dive: https://en.wikipedia.org/wiki/Deep_diving#/media/File:Trevor_Jackson_returns_from_SS_Kyogle.jpg
● Map: https://commons.wikimedia.org/wiki/File:Carta_Marina_AB_stitched.jpg
● AWS Total Cost of Ownership: https://aws.amazon.com/blogs/aws/the-new-aws-tco-calculator/
● Sharing milkshake: https://commons.wikimedia.org/wiki/File:Children_sharing_a_milkshake.jpg
● Field of flowers: TODO

Microservices Summit - The Human Side of Services

Editor's Notes

  1. Approx. 89 million UMVs via mobile. More than 90 million reviews contributed since inception. Approx. 71% of all searches on Yelp came from mobile (mobile web & app). Yelp is present across 32 countries.
  2. Here’s a cool picture showing an exponential increase in complexity over time. What does this have to do with services, you might ask?
  3. In the beginning, there were zero lines of code in yelp-main. In 2016, there are about three million. The problem is that it’s hard to scale up our release process as we keep adding code and developers. What is our release process? Once your branch is code reviewed, you submit it as a push request. Three times a day, a push master grabs around 20 branches and pushes that code out to production. So at most around 60 branches get released per day. We needed an alternative approach...
  4. This was our first production service. It didn’t do very much :) But it was a very useful testing ground for service technologies, as well as deployment, monitoring, etc. We generalized it to become v1 of our service template, which then begot PaaSTA, our Platform as a Service.
  5. In five years we saw an explosion of over 150 services. Maybe we overshot the mark a little? :) Joey is going to talk more about this in a bit.
  6. In order to get good at deploying services, we’ve had to make lots of changes to the org. It used to take several weeks to deploy a service; now it takes an hour or two. We spread out operations responsibilities to minimize queuing. This is a specific case of the more general practice of distributing knowledge.
  7. Programming the monolith is hard. Programming a service-oriented architecture is very hard. A few weeks ago we had an issue in our task queues caused by a Kafka problem. This caused massive duplication of some tasks, e.g. 50x for some. These duplicate tasks caused duplicate photos to appear in timelines :( A great example of why knowing about idempotency is important.
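The idempotency point can be sketched in a few lines. This is a toy illustration under stated assumptions, not Yelp's actual task or timeline code: `PhotoTimeline`, `handle_add_photo`, and the task and photo IDs are hypothetical, and the in-memory set stands in for what would have to be a durable store updated atomically with the write.

```python
class PhotoTimeline:
    """Toy stand-in for a timeline store (hypothetical, in-memory only)."""

    def __init__(self):
        self.photos = []
        self._seen_task_ids = set()

    def handle_add_photo(self, task_id, photo_id):
        """Apply the task once; redelivered copies with the same task_id are no-ops."""
        if task_id in self._seen_task_ids:
            return False  # duplicate delivery, nothing to do
        self._seen_task_ids.add(task_id)
        self.photos.append(photo_id)
        return True


timeline = PhotoTimeline()
# Simulate at-least-once delivery: the same task arrives 50 times.
results = [timeline.handle_add_photo("task-123", "photo-9") for _ in range(50)]
```

Because the handler keys on the task ID rather than on how many times it was called, a 50x redelivery leaves exactly one photo in the timeline.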
  8. Service principles document: outlines what we think are the important things wrt design and operations. Technology agnostic. Service tutorial: we use a cool program called dexy to script incremental service creation and display the output. “Here’s the diff, here’s the output of the service when you apply the diff.”
  9. Deputy programs: there are some processes where you can cause a lot of damage if you do them wrong, e.g. making Puppet changes or setting up new services. So we really don’t want to hand the keys to new developers. Solution: take one or two more senior engineers from each team and train them to do these things. Every week we hold office hours. Anyone from across the org can drop in and ask questions about services.
  10. Deep dives: every Monday we have an engineering all-hands meeting. As part of this, we have a deep dive where an engineer discusses something they’ve been working on. We periodically use this to talk about some aspect of services. The Service Creation Form (SCF) documents the basics of your service. It is reviewed by a small group of more experienced engineers. It’s a balancing act wrt process (Goldilocks). In general, we’ve tried to disperse knowledge across the organisation instead. Examples of areas covered by the SCF: load balancing, failure modes, caching. Review process?
  11. In the monolith, you usually have just one language, one ORM, one database technology, one caching technology. When we first went to services, everybody did their own thing: Clojure, Redis, Thrift, CouchDB. Person-SPOF. This is today’s map of the world. Yours will probably look different. A common set of ‘safe’, well-supported technologies. You don’t *have* to use these, but if you don’t then you’re on your own...
  12. One thing that we have standardized on is HTTP/JSON. Interface definition! Many (not all) services are using Swagger to define their interfaces. Here’s an example of a partial Swagger definition. Especially successful for our internalapi service. Previously: anything goes anywhere. Now: a Swagger spec for every new endpoint, and all spec changes go out to a reviewboard group.
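To show what a written interface definition buys you, here is a minimal sketch that checks a response body against the `/pets` fragment from the slide. Several things are assumptions: the slide does not show the `Pet` definition, so the `id` and `name` fields are hypothetical, and `resolve` and `conforms` are illustrative helpers that cover only a tiny subset of the schema language. A real project would more likely rely on a dedicated validation library.

```python
SPEC = {
    "paths": {
        "/pets": {
            "get": {
                "description": "Returns all pets from the system",
                "responses": {
                    "200": {
                        "description": "A list of pets.",
                        "schema": {
                            "type": "array",
                            "items": {"$ref": "#/definitions/Pet"},
                        },
                    }
                },
            }
        }
    },
    # Hypothetical: the slide's fragment does not include the Pet definition.
    "definitions": {
        "Pet": {
            "type": "object",
            "required": ["id", "name"],
            "properties": {
                "id": {"type": "integer"},
                "name": {"type": "string"},
            },
        }
    },
}

# Only the handful of types this sketch understands.
PYTHON_TYPES = {"array": list, "object": dict, "integer": int, "string": str}


def resolve(schema, spec):
    """Follow local '#/a/b' references until a concrete schema is reached."""
    while "$ref" in schema:
        node = spec
        for part in schema["$ref"].lstrip("#/").split("/"):
            node = node[part]
        schema = node
    return schema


def conforms(value, schema, spec):
    """Return True if value matches the (tiny subset of the) schema."""
    schema = resolve(schema, spec)
    expected = PYTHON_TYPES[schema["type"]]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, expected):
        return False
    if schema["type"] == "array":
        return all(conforms(item, schema["items"], spec) for item in value)
    if schema["type"] == "object":
        if any(key not in value for key in schema.get("required", [])):
            return False
        return all(
            conforms(value[key], sub, spec)
            for key, sub in schema.get("properties", {}).items()
            if key in value
        )
    return True


pets_schema = SPEC["paths"]["/pets"]["get"]["responses"]["200"]["schema"]
```

With the spec in hand, `conforms(response_body, pets_schema, SPEC)` can run in a test suite or in review tooling, which is the kind of check that "anything goes anywhere" never allowed.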
  13. Every single service has a per-service endpoint. Not just the website.
  14. This is a service’s uptime + reliability, not the website’s.
  15. Each team owns their own services. Why? It’s a lot easier to assign responsibility if ownership is clear, e.g. “upgrade this library”. Ideally >= 2 people on a team know about a service. Some services do effectively become unowned.
  16. We use a JIRA project to track ongoing incidents. Once resolved, an incident enters the postmortem status. All postmortems go to all developers. I like postmortems, but they do take quite a lot of work. Luckily Yelp is very supportive of these efforts. Initially some of this was a bit of a struggle for teams not used to operations, so we had to spread some of the operations best practices across the org.
  17. Oncall: not everyone wants to be oncall. Teams need Ops.
  18. No dedicated DevOps teams; rather, empower your existing developers to become a DevOps or a SecOps or a DevSec, but don’t expect them to be everything.