27. “Finding more of anything never stays a problem long on the web... What is hard is finding less, but ordered by quality, relevance or urgency.”
- Clay Shirky
32. some api notes
•responses have to be fast: a few hundred milliseconds
•returning full-text feeds for offline caching
•but feeds can be 300K+ (not great for 3G/4G networks)
•gzip! (400K down to 80K)
•balancing speed with utility (again, for offline caching)
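A minimal sketch of the gzip point above, using only Python's standard library. The feed builder, its field names, and the item count are hypothetical stand-ins, not the news.me API. The filler text is highly repetitive, so it compresses far better than a real feed would; the 400K-to-80K figure on the slide is the number to trust for real content.

```python
import gzip
import json


def build_feed(n_items: int) -> bytes:
    """Build a hypothetical full-text feed as JSON bytes (illustrative only)."""
    items = [
        {
            "id": i,
            "title": f"Article {i}",
            "body": "Full article text goes here, repeated for bulk. " * 40,
        }
        for i in range(n_items)
    ]
    return json.dumps({"items": items}).encode("utf-8")


def compress_feed(raw: bytes, level: int = 6) -> bytes:
    """Gzip the payload; higher levels trade CPU time for smaller output."""
    return gzip.compress(raw, compresslevel=level)


raw = build_feed(150)
packed = compress_feed(raw)
print(f"raw: {len(raw) / 1024:.0f}K, gzipped: {len(packed) / 1024:.0f}K")

# The client must get back exactly what the server sent.
assert gzip.decompress(packed) == raw
```

In practice the web server or framework usually applies gzip when the client sends `Accept-Encoding: gzip`, so application code rarely calls it directly.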
33. news.me backend (ec2 + s3)
content from:
[diagram components: aggregator-1, aggregator-2 ... aggregator-n; mongo; redis; kinda big content data store™]
34. some aggregation notes
•aggregating ~30 million tweets and fb shares (with links) per day
•the streams move pretty fast. had some growing pains with mongo
•kinda big content data store™ = article content and metadata stored in memcache + multiple ec2 instances + s3
•kinda big content data store™ = over 1 TB of content data
•auto-scale the aggregators when we need them - helps keep the cost down :-)
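To put the volume in perspective, ~30 million shares per day averages out to roughly 347 per second. The sketch below works through that arithmetic and adds a hypothetical sizing rule for the aggregators. The per-instance capacity of 100 shares/sec and the 3x peak factor are assumptions for illustration, not figures from the talk, and this is not how news.me actually scales.

```python
import math

SHARES_PER_DAY = 30_000_000  # figure from the slide
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(per_day: int) -> float:
    """Average events per second for a given daily volume."""
    return per_day / SECONDS_PER_DAY


def aggregators_needed(rate_per_sec: float,
                       per_instance_capacity: float,
                       min_instances: int = 1) -> int:
    """Hypothetical sizing rule: enough instances to cover the current rate."""
    return max(min_instances, math.ceil(rate_per_sec / per_instance_capacity))


avg = average_rate(SHARES_PER_DAY)
print(f"average: {avg:.0f} shares/sec")

# Assumed capacity of 100 shares/sec per aggregator (illustrative only).
print("instances at average load:", aggregators_needed(avg, 100))
print("instances at an assumed 3x peak:", aggregators_needed(3 * avg, 100))
```

Sizing for the current rate rather than the peak is what keeps the cost down: instances are added when the stream speeds up and released when it slows.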
35. why use aws?
easy to scale
focus on product
pay for what you need
community (all of the Betaworks companies are on AWS)
37. a few tips...
decouple everything (makes dealing with failure and scaling easier)
measure everything!
keep it lean - one person
don’t reinvent the wheel
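The first two tips can be sketched together: two stages that talk only through a queue, with a timing recorded for every item processed. This is a single-process illustration using Python's standard library; the stage names and the lowercase "processing" step are hypothetical, and a real deployment would more likely put a network queue between separate machines.

```python
import queue
import threading
import time


def fetch_stage(links, out_q):
    """Producer: hands links to the queue and knows nothing about consumers."""
    for link in links:
        out_q.put(link)
    out_q.put(None)  # sentinel: no more work


def process_stage(in_q, results, timings):
    """Consumer: pulls links off the queue and records how long each took."""
    while True:
        link = in_q.get()
        if link is None:
            break
        start = time.perf_counter()
        results.append(link.lower())  # stand-in for real processing
        timings.append(time.perf_counter() - start)


def run_pipeline(links):
    q = queue.Queue(maxsize=100)  # bounded, so a slow consumer applies backpressure
    results, timings = [], []
    worker = threading.Thread(target=process_stage, args=(q, results, timings))
    worker.start()
    fetch_stage(links, q)
    worker.join()
    return results, timings


results, timings = run_pipeline(["HTTP://EXAMPLE.COM/A", "HTTP://EXAMPLE.COM/B"])
print(results, f"{len(timings)} timings recorded")
```

Because the stages share only the queue, either side can be restarted, replaced, or scaled without the other needing to change.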