This deck presents key metrics and scaling challenges at Twitter. It reports 70 million tweets per day (about 800 tweets per second) and 8 terabytes of data generated daily. The stated goal is to support half the world's population and all of its devices with real-time delivery, while also handling indexing, search, analytics, relevance systems, graph databases, and storage at that scale. The document invites the reader to follow the author on Twitter to learn more about scaling challenges.
2. Giving a talk entitled “Twitter by the Numbers” with @twitterU at Cal. Lots of ##s!
1 minute ago via Twitter for iPhone
Retweeted by 1 person
11. MySQL
Can’t generate IDs fast enough
Centralized and a single point of failure
snowflake
Highly available and uncoordinated (10kqps)
Compatible with the ecosystem
http://github.com/twitter/snowflake
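A minimal sketch of how an uncoordinated ID generator of this kind can work. The slide does not describe the ID layout; the 41/10/12 bit split (timestamp, worker, sequence) is assumed from the public snowflake README, and the epoch value, class name, and `clock` parameter are hypothetical choices for illustration. This is not the snowflake code itself.

```python
import threading
import time

# Assumed layout: 41-bit millisecond timestamp, 10-bit worker id, 12-bit sequence.
WORKER_BITS = 10
SEQUENCE_BITS = 12
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1
EPOCH_MS = 1_262_304_000_000  # arbitrary custom epoch (2010-01-01 UTC), illustration only


class IdGenerator:
    """Generates roughly time-ordered 64-bit IDs without talking to other workers."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock = clock          # injectable so the sketch can be tested
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                # Same millisecond: bump the per-worker sequence.
                self._sequence = (self._sequence + 1) & SEQUENCE_MASK
                if self._sequence == 0:
                    # Sequence exhausted; wait for the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )
```

Because each worker embeds its own id in every value, workers never need to coordinate, which is what removes the central database as a bottleneck and single point of failure.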
12. Photo used under Creative Commons from champura
1 TB generated per day
8 TB generated per day
13. 8 TB per day in total ≈ 100 MB per sec
Photo used under Creative Commons from Mac Users Guide
= 80 MB per sec
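The per-second figure can be checked with a few lines of arithmetic. This sketch assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), which the slide does not specify; the helper name `per_second` is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

TB = 10 ** 12  # assumption: decimal terabytes
MB = 10 ** 6   # assumption: decimal megabytes


def per_second(per_day):
    """Convert a daily total into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


data_rate_mb = per_second(8 * TB) / MB
print(f"{data_rate_mb:.1f} MB/s")  # 92.6 MB/s, which the slide rounds to about 100
```

This is an average; peak rates would be higher, and the slide gives no peak figure.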
14. Where do they go?
Followed by
Following
Asymmetric Digraph
18. Photo used under Creative Commons from jurvetson
flockdb
Distributed graph database
High rate of CRUD operations
Complex set arithmetic queries
http://github.com/twitter/flockdb
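To make "asymmetric digraph" and "set arithmetic queries" concrete, here is a toy in-memory follow graph. It is only an illustration of the data model: the class and method names are hypothetical, and it is not the FlockDB API or its storage design.

```python
from collections import defaultdict


class FollowGraph:
    """Toy directed graph: an edge src -> dst means 'src follows dst'."""

    def __init__(self):
        # Each edge is stored in both directions so either side can be read quickly.
        self._following = defaultdict(set)
        self._followers = defaultdict(set)

    def follow(self, src, dst):
        self._following[src].add(dst)
        self._followers[dst].add(src)

    def unfollow(self, src, dst):
        self._following[src].discard(dst)
        self._followers[dst].discard(src)

    def following(self, user):
        return set(self._following.get(user, ()))

    def followers(self, user):
        return set(self._followers.get(user, ()))

    def mutual(self, user):
        # Set intersection: accounts the user follows that also follow back.
        return self.following(user) & self.followers(user)

    def common_followers(self, a, b):
        # Set intersection across two users' follower sets.
        return self.followers(a) & self.followers(b)


g = FollowGraph()
g.follow("raffi", "ladygaga")      # one-way edge: asymmetric
g.follow("raffi", "justinbieber")
g.follow("justinbieber", "raffi")  # reciprocal edge
g.follow("fan", "ladygaga")
g.follow("fan", "justinbieber")
```

Here `g.mutual("raffi")` returns only `justinbieber`, since following is not automatically reciprocal, and `g.common_followers("ladygaga", "justinbieber")` is a set intersection of the kind the slide calls set arithmetic.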
19. @ladygaga
mother mons†er
6.1 million followers
@BarackObama
44th President of the United States
5.3 million followers
@justinbieber
Justin Bieber
5.1 million followers
@raffi
me!
4.1 thousand followers
20. How do they get out?
6B API calls per day ≈ 70,000 calls per second
21. REST API
XML/JSON API over HTTP
Poll-based system / pseudo real-time
hosebird
Streaming API
Long poll HTTP
Near real-time delivery of Tweets
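The difference from polling is that a streaming client opens one long-lived connection and reads messages as they arrive. The sketch below shows only the client-side reading loop. It assumes a newline-delimited JSON wire format with blank keep-alive lines, which the slides do not state, and it uses an in-memory `io.BytesIO` as a stand-in for the HTTP response body; `iter_stream` is a hypothetical helper name.

```python
import io
import json


def iter_stream(stream):
    """Yield one decoded message per non-blank line of a long-lived byte stream."""
    for raw in stream:
        line = raw.strip()
        if not line:
            continue  # assumed keep-alive: a blank line carries no message
        yield json.loads(line)


# Stand-in for an open HTTP response body (assumption: one JSON object per line).
fake_body = io.BytesIO(
    b'{"id": 1, "text": "hello"}\r\n'
    b"\r\n"
    b'{"id": 2, "text": "world"}\r\n'
)

tweets = list(iter_stream(fake_body))
print([t["text"] for t in tweets])  # ['hello', 'world']
```

With a real connection the loop would block between lines instead of ending, so messages are handled as they arrive rather than on the next poll.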
24. Where do we want to be?
Today - 150M people generate ~1000 TPS
Tomorrow - we want to support half the world and all its devices
(5B phones and 6B people)
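For a rough sense of the gap between today and tomorrow, the slide's own numbers can be extrapolated. This is a back-of-the-envelope sketch that assumes the tweet rate scales linearly with the number of people, which the slides do not claim; it ignores devices and any change in per-person activity.

```python
SECONDS_PER_DAY = 86_400

TODAY_USERS = 150_000_000             # from the slide: 150M people
TODAY_TPS = 1_000                     # from the slide: ~1000 tweets per second
TARGET_USERS = 6_000_000_000 // 2     # half of the 6B people on the slide

# Average activity implied by today's numbers.
tweets_per_user_per_day = TODAY_TPS * SECONDS_PER_DAY / TODAY_USERS

# Assumption: per-person activity stays constant as the user base grows.
projected_tps = TODAY_TPS * TARGET_USERS / TODAY_USERS

print(f"{tweets_per_user_per_day:.3f} tweets per person per day")  # 0.576
print(f"{projected_tps:,.0f} TPS at half the world")               # 20,000
```

Under that assumption the target is about a 20x increase in average write rate, before accounting for peaks or for the fan-out of each tweet to followers.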
25. Real challenges in front of us
Real time
Indexing, search, and analytics
Relevance systems
Graph databases
Storage
Scalability and efficiency