AWS Start-Up Tour 2009 / ShareThis


ShareThis, AWS Start-Up Tour 2009, Sunnyvale

Published in: Technology

  1. ShareThis on AWS
     Paco Nathan, Data Insights
     AWS Start-Up Tour 2009-06-16
  2. What Does ShareThis Do?
     • “Make it simple to share any online content”
     • Social content sharing platform
     • ESPN, FOX, CS Monitor, HuffPost, CBS Marketwatch, Wired, TechCrunch, ThinkGeek, etc.
     • When a news story goes viral on a major publisher, our sharing services must scale out to keep pace
  4. Why Our Company Uses AWS
     • >10^6 publishers, >10^9 users, >10^10 URLs
     • Early stage start-up, < 25 people, “wearing lots of hats”, ultra fast-paced R&D
     • Spikes in popular stories impose demands throughout the architecture: API services, loggers, DW, BI, etc.
     • How can this level of service be built 100% in the cloud?
  6. System Architecture
     • Each service designed for cost-effective, horizontal scale-out
     • API served by cluster of LAMP stack + cluster of Nginx
     • AsterData: nCluster infrastructure “hub-and-spoke” pattern
     • Cascading: abstraction layer for tying together components
     • Batch jobs on Elastic MapReduce, AsterData SQL/MR
     • SQS, EBS, SimpleDB, MTurk, plus other AWS services
  8. Key Learnings
     • Capability to scale out horizontally without having to recode, rebuild, etc.: add new EC2 nodes to clusters
     • Authoritative data + backups in S3, a great approach for disaster recovery (DR)
     • Wide range of use cases implemented: widget API, log clean-up, vertical search, business intelligence, etc.
     • Developers launch their own sandbox instances, which makes dev/test/debug cycles more efficient
     • Staff enabled to “wear even more hats” with less risk
  9. Cascading + Elastic MapReduce
  10. Cascading + Elastic MapReduce
     • “Syntax is for humans, APIs are for software”
     • Defines apps as set operations applied to data flows
     • Engineers & data scientists don’t think in terms of MapReduce primitives, key/value pairs, etc.
     • Integrates Hadoop API + other APIs (S3, SQS, JDBC)
     • Expresses end-points as Java design patterns, compiled code, not just a scramble of scripts
  11. Cascading + Elastic MapReduce
     • Highly scalable, fault-tolerant framework for batch jobs
     • Dramatically reduced need for Ops overhead
     • Excellent command line tools make the dev/test/debug cycle very efficient with “Big Data”
     • Highly expert staff, very responsive and helpful in forums
     • Cascading example code in developer resources: “LogAnalyzer for CloudFront” and “Multitool”
  12. Hadoop Book / Case Study
     ShareThis case study, "Cascading" by Chris K Wensel, in…
  13. Contacts
     @pacoid on Twitter
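
Slide 10 describes Cascading as defining apps as operations applied to data flows, so that engineers do not have to think in MapReduce primitives or key/value pairs. The sketch below illustrates that idea only; it is plain Python, not the Cascading or Hadoop API. The log format, the sample records, and every function name are assumptions made up for illustration, not anything taken from the ShareThis system.

```python
from collections import Counter

# Hypothetical log format (an assumption): "<timestamp>\t<publisher>\t<url>"
SAMPLE_LOG = [
    "2009-06-16T10:00:00\tespn.com\thttp://espn.com/story/1",
    "2009-06-16T10:00:05\twired.com\thttp://wired.com/a",
    "malformed line",
    "2009-06-16T10:01:00\tespn.com\thttp://espn.com/story/1",
]


def parse(lines):
    """Split each raw line into fields and drop malformed records."""
    for line in lines:
        fields = line.split("\t")
        if len(fields) == 3:
            yield tuple(fields)


def project_url(records):
    """Keep only the URL field of each (timestamp, publisher, url) record."""
    for _timestamp, _publisher, url in records:
        yield url


def count_by_key(keys):
    """Group identical keys and count the members of each group."""
    return Counter(keys)


def share_counts(lines):
    """Compose the steps into one flow: parse, then project, then count."""
    return count_by_key(project_url(parse(lines)))


if __name__ == "__main__":
    for url, n in share_counts(SAMPLE_LOG).most_common():
        print(url, n)
```

Each step consumes a stream of records and produces another, so the job reads as a chain of named operations rather than as hand-written map and reduce functions. A framework in the style the slides describe would plan such a chain onto MapReduce jobs; here it simply runs in memory.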