Decomposing Twitter
QConNY 2013
Watch the video with slide synchronization on InfoQ.com!
http://www.infoq.com/presentations/twitter-soa
Presented at QCon New York
www.qconnewyork.com
Tech Lead of Tweet Service Team
Backstory
⇢Exponential growth in traffic
⇢Cost of failure has gone up
⇢Success rate constantly improving
⇢Company has grown 10x over the last 3yrs
Tweets Per Day (chart; x-axis: 2006, 2010, 2013; y-axis marks: 200,000,000 and 400,000,000)
Site Success Rate (chart; x-axis: 2008, 2010, 2013; y-axis marks: 99._% and 100%)
World Cup
“off the monorail”
not a lot of traffic
Routing Presentation Logic Storage
Monorail MySQL
Challenges
⇢storage I/O bottlenecks
⇢poor concurrency, runtime performance
⇢brittle
⇢too many cooks in the same kitchen
⇢lack of clear ownership
⇢leaky abstractions / tight-coupling
Goals of SOA
⇢not just a web-stack
⇢isolate responsibilities and concerns
⇢site speed
⇢reliability, isolate failures
⇢developer productivity
API Boundaries
Routing Presentation Logic Storage
Monorail
MySQL
Tweet
Flock
Redis
Memcache
Cache
Routing Presentation Logic Storage
MySQL
Tweet Store
Flock
Redis
Memcached
Cache
TFE
Monorail
Tweet Service
User Service
Timeline Service
SocialGraph Service
DirectMessage Service
User Store
API
Web
Search
Feature-X
Feature-Y
HTTP THRIFT THRIFT*
Finagle
⇢service discovery
⇢load balancing
⇢retrying
⇢thread/connection pooling
⇢stats collection
⇢distributed tracing
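A minimal sketch of three of the responsibilities listed above (load balancing, retrying, stats collection), written in plain Python rather than with Finagle itself. The class `BalancedClient`, the `flaky_send` transport, and the host names are hypothetical and only illustrate the idea; Finagle's real API is different and richer.

```python
import itertools
from collections import Counter


class BalancedClient:
    """Hypothetical client wrapper: round-robin balancing, retries, basic stats."""

    def __init__(self, hosts, send):
        self._hosts = itertools.cycle(hosts)  # round-robin iterator over hosts
        self._send = send                     # callable(host, request) -> response
        self.stats = Counter()                # counts successes and failures

    def call(self, request, max_attempts=3):
        last_error = None
        for _ in range(max_attempts):
            host = next(self._hosts)
            try:
                response = self._send(host, request)
                self.stats["success"] += 1
                return response
            except ConnectionError as err:
                self.stats["failure"] += 1
                last_error = err              # retry on the next host
        raise last_error


def flaky_send(host, request):
    # Stand-in transport: host "b" is down, the others echo the request.
    if host == "b":
        raise ConnectionError(f"{host} unavailable")
    return f"{host}:{request}"


client = BalancedClient(["a", "b", "c"], flaky_send)
results = [client.call(f"req{i}") for i in range(4)]
```

Putting this logic in one shared library means each service does not have to reimplement it.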
Future[T]
⇢async, non-blocking IO
⇢fixed number of threads
⇢very high levels of concurrency
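The same idea can be sketched with Python's asyncio, used here only as an analogy for Future[T]: many requests are in flight at once while the number of threads stays fixed (one, in this sketch). `fetch_tweet` and the simulated delay are assumptions for illustration.

```python
import asyncio
import threading


async def fetch_tweet(tweet_id):
    # Simulated non-blocking I/O; a real client would await a socket here.
    await asyncio.sleep(0.01)
    return {"id": tweet_id, "thread": threading.get_ident()}


async def fetch_many(tweet_ids):
    # All requests are started together; none of them blocks a thread while waiting.
    return await asyncio.gather(*(fetch_tweet(t) for t in tweet_ids))


tweets = asyncio.run(fetch_many(range(1000)))
threads_used = {t["thread"] for t in tweets}
```

All 1000 simulated requests complete on a single thread, which is the property the slide describes.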
Tweet Service
Writes
⇢daily: > 400 MM tweets
⇢avg: > 5000 TPS
⇢daily peak: 9000 TPS
⇢max peak: 33,388 TPS
Tweet Service
Reads
⇢daily: 350 Billion tweets
⇢avg: 4 MM TPS
⇢median latency: 7 ms
⇢p99 latency: 120 ms (bulk requests)
⇢server count: low hundreds
⇢cache: ~ 15 TB
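A quick arithmetic check of the figures above, assuming traffic is averaged over a full 24-hour day. The write volume on the slide is a lower bound ("> 400 MM"), so the computed write average is a lower bound as well.

```python
SECONDS_PER_DAY = 24 * 60 * 60        # 86,400

writes_per_day = 400_000_000          # "> 400 MM tweets" (lower bound)
reads_per_day = 350_000_000_000       # "350 Billion tweets"

avg_write_tps = writes_per_day / SECONDS_PER_DAY   # about 4,630 per second
avg_read_tps = reads_per_day / SECONDS_PER_DAY     # about 4.05 million per second
reads_per_write = reads_per_day / writes_per_day   # 875 reads for every write
max_peak_vs_avg = 33_388 / 5_000                   # max peak is roughly 6.7x the average
```

The read average matches the slide's 4 MM TPS, and the read-to-write ratio shows why the read path is served largely from cache.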
user metadata
mentioned-user metadata
url metadata
media metadata
card metadata
geo metadata
counts
mentions urls
pctd
rt/rp/fv
perspectivals
client-appcards
conversati
spam
language
geo
media
reply/
Storage
Memcached
Cache
Tweet Service
Upstream Services
Url Service
Media
Cards
Geo
Spam
User Service
Language
Service
Tweets TFlock MySQL
Snowflake Firehose Search
Write-Only
Mail
Timeline
Service
Tweet
Reads/
Writes
Tweet
Service
Url Service
Media
Geo
Spam
User Service
Language
MySQL
HTTP API
Snowflake
Tweet Writes Compose
TFlock
Timeline
Tweet Store
Memcached
Hosebird
Search
Mail
Replication
Store/Deliver
Tweet
Service
TFlock
Timeline
Tweet Store
Memcached
Store
Hosebird
Search
Mail
Replication
Deliver
DeferredRPC
Timeline
Fanout
Tweet
Service
Fanout
Redis
Redis
Redis
Social
Graph
Service
Timeline Cache
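One way to read the fanout diagram above: when a tweet is written, the fanout step asks the Social Graph Service for the author's followers and pushes the tweet id onto each follower's cached timeline. The sketch below uses in-memory dictionaries in place of the Social Graph Service and Redis; the user names and the `TIMELINE_LIMIT` cap are assumptions.

```python
from collections import defaultdict, deque

TIMELINE_LIMIT = 800  # assumed cap on cached entries per timeline

# Stand-in for the Social Graph Service: author -> follower ids.
followers = {"alice": ["bob", "carol"], "bob": ["carol"]}

# Stand-in for the Redis timeline cache: user -> newest-first tweet ids.
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))


def fanout(author, tweet_id):
    """Push a new tweet id onto the cached timeline of every follower."""
    delivered = 0
    for follower in followers.get(author, []):
        timelines[follower].appendleft(tweet_id)
        delivered += 1
    return delivered


fanout("alice", 1)
fanout("bob", 2)
```

The work is done at write time, so reading a timeline becomes a single cache lookup.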
Hosebird
Tweet
Service
Streaming
Social
Graph
Hosebird
Firehose
Firehose
Firehose
Hosebird
User Stream
User Stream
User Stream
Hosebird
Track/Follow
Track/Follow
Track/Follow
Top 3 Lessons Learned
Lessons Learned #1
Incremental change eventually wins
Step 1: Make the smallest possible change

Step 2: Verify/tweak

Step 3: GOTO Step 1
Deploy Often and Incrementally
⇢Keep the changes in each deploy small
⇢Validate on a canary server
Enabling a New Feature
⇢Dynamic configuration
⇢Deploy with feature turned off
⇢Turn on to 1%, verify, increment
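The rollout steps above can be sketched as a percentage gate driven by dynamic configuration. The feature name `new_cards`, the dictionary standing in for the configuration system, and the hashing scheme are assumptions, not the mechanism described in the talk.

```python
import hashlib

# Stand-in for dynamic configuration: feature name -> percent of users enabled.
rollout_percent = {"new_cards": 0}    # deployed with the feature turned off


def bucket(feature, user_id):
    """Map (feature, user) to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(feature, user_id):
    return bucket(feature, user_id) < rollout_percent.get(feature, 0)


users = range(10_000)
off = sum(is_enabled("new_cards", u) for u in users)
rollout_percent["new_cards"] = 1      # turn on to 1%, then verify
at_one = sum(is_enabled("new_cards", u) for u in users)
rollout_percent["new_cards"] = 100    # after repeated increments
at_full = sum(is_enabled("new_cards", u) for u in users)
```

Because the bucket is stable per user, anyone enabled at 1% stays enabled as the percentage is raised.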
Bringing up a New Service
⇢Find something small to break off
⇢Validate correctness and capacity
⇢Slowly turn up
⇢Repeat
Tweet Service Read-Path
⇢Reads from secondary services
⇢Reads from new API services
⇢Reads from Monorail
Tweet Service Write-Path
⇢Moved store/deliver effects one-by-one
⇢Moved compose phase
⇢Switched from Monorail to new API service
Moving an Effect
⇢Decide in calling service
⇢Pass flag with RPC call
⇢Slowly turn up
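A sketch of the three steps above, assuming the effect being moved is delivery to Search. The calling service decides per request, passes the decision as a flag on the call, and performs the effect itself when it does not delegate. The function names and the percentage setting are hypothetical.

```python
import random

effect_in_new_service_percent = 10  # dynamic setting, turned up slowly


def tweet_service_write(tweet, deliver_to_search):
    """New service: performs the effect only when the caller asks for it."""
    effects = ["store"]
    if deliver_to_search:
        effects.append("search")
    return effects


def monorail_write(tweet, rng=random):
    """Calling service: decides per request which side owns the effect."""
    delegate = rng.random() * 100 < effect_in_new_service_percent
    effects = tweet_service_write(tweet, deliver_to_search=delegate)
    if not delegate:
        effects.append("search (legacy path)")  # old code path still does the work
    return effects
```

Keeping the decision in one place means the effect happens exactly once per request, whichever side performs it.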
If Reliability Matters...
Make big changes in small steps
Top 3 Lessons Learned
⇢1) Incremental change eventually wins
Lessons Learned #2
Integration testing is hard
Scenario #1
Testing a change within a single service
Difficulty: Easy to Medium
Tap Compare
TFE
Production Dark
Log
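One reading of the Tap Compare diagram: the same request is sent to the production instance and to a dark instance running the change, the user is always served the production response, and any difference is logged. The function and field names below are assumptions.

```python
def tap_compare(request, production, dark, log):
    """Serve from production; mirror the request to the dark instance and log diffs."""
    expected = production(request)
    try:
        actual = dark(request)
        if actual != expected:
            log.append({"request": request, "production": expected, "dark": actual})
    except Exception as err:  # a dark failure must never affect the user
        log.append({"request": request, "production": expected, "dark_error": repr(err)})
    return expected


def production(request):
    return {"id": request, "text": "hello"}


def dark(request):
    # Candidate build with a deliberate difference for odd ids.
    text = "hello" if request % 2 == 0 else "HELLO"
    return {"id": request, "text": text}


diff_log = []
responses = [tap_compare(i, production, dark, diff_log) for i in range(4)]
```

An empty log over enough real traffic is evidence that the change preserves behavior.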
Capture and Replay
Client
Production
Staging
Scribe Iago Log
Forked Traffic
Client
Production Staging Forward
Verifier
Dev
Scenario #2
Testing a change that spans multiple services
Difficulty: Medium to Hard
Challenges
⇢Need to launch multiple services with change
⇢Need to point those services at each other
⇢A single service may have many dependencies (memcache, kestrel, MySQL)
⇢Can’t run everything on one box
Testing Changes
API
Logic
Storage
Logic
Storage Storage
Easy
Hard
Hardest
Unsolved Problem
Manual and tedious process
Moral of the story
Plan and build your integration testing framework early
Top 3 Lessons Learned
⇢1) Incremental change eventually wins
⇢2) Plan integration testing strategy early
Lessons Learned #3
Failure is always an option
More Hops = More Failures
Retries can be tricky
⇢Dogpile!
⇢Use a library like Finagle if possible
⇢Route around problems
⇢Idempotency is critical
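A sketch of two of the points above, under stated assumptions: retries use exponential backoff with random jitter so that clients do not all retry at the same moment (the dogpile), and the write is keyed so that replaying it is harmless. `FlakyStore` is a hypothetical stand-in that applies each write but loses the first few replies.

```python
import random
import time


def backoff_delay(attempt, base=0.05, cap=1.0):
    """Exponential backoff with full jitter, capped at `cap` seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


class FlakyStore:
    """Applies every write, but the first `lost_replies` responses are lost."""

    def __init__(self, lost_replies):
        self.rows = {}
        self.write_calls = 0
        self._lost_replies = lost_replies

    def write(self, key, text):
        self.write_calls += 1
        self.rows.setdefault(key, text)   # idempotent: same key never duplicates
        if self.write_calls <= self._lost_replies:
            raise TimeoutError("reply lost")
        return key


def write_with_retries(store, key, text, max_attempts=5, sleep=time.sleep):
    for attempt in range(max_attempts):
        try:
            return store.write(key, text)
        except TimeoutError:
            sleep(backoff_delay(attempt))  # wait a randomized interval, then retry
    raise TimeoutError("gave up after retries")


store = FlakyStore(lost_replies=2)
written = write_with_retries(store, "tweet-1", "hello", sleep=lambda s: None)
```

The write reached the store three times, yet only one row exists, which is why idempotency makes retries safe.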
Plan for entire rack failure
⇢Overprovision
⇢Mix services within rack
⇢Test this scenario in production
Degrade gracefully
⇢Avoid SPOFs
⇢Overprovision, add redundancy
⇢Have a fallback strategy
⇢Dark-mode a feature instead of total failure
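A minimal sketch of a fallback strategy, assuming the optional data is the tweet's counts: if the counts lookup fails, or the feature has been switched to dark mode, the tweet is still returned without it. `hydrate_tweet` and `broken_counts` are hypothetical names.

```python
def hydrate_tweet(tweet, fetch_counts, dark_mode=False):
    """Return the tweet with counts when possible, without them otherwise."""
    result = dict(tweet)
    if dark_mode:
        return result                 # feature switched off on purpose
    try:
        result["counts"] = fetch_counts(tweet["id"])
    except Exception:
        result["counts"] = None       # fallback: serve the tweet without counts
    return result


def broken_counts(tweet_id):
    raise ConnectionError("counts service unavailable")


tweet = {"id": 1, "text": "hello"}
healthy = hydrate_tweet(tweet, lambda tweet_id: {"rt": 3, "fv": 5})
degraded = hydrate_tweet(tweet, broken_counts)
dark = hydrate_tweet(tweet, broken_counts, dark_mode=True)
```

In both failure cases the reader still sees the tweet, only with less detail.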
Partial Failures
⇢More common
⇢Harder to handle
⇢Harder to imagine
If Reliability Matters...
Spend most of your time preparing for failures
Top 3 Lessons Learned
⇢1) Incremental change eventually wins
⇢2) Integration testing is hard
⇢3) Failure is always an option
@JoinTheFlock

Decomposing Twitter: Adventures in Service-Oriented Architecture