16 months @ SoundCloud

A short talk about SoundCloud that I gave at my former university.

  1. 16 months @ SoundCloud • Tobias Schmidt • @dagrobie • github.com/grobie • Hasso-Plattner Institute, Potsdam - 13 June 2012
  2. What is SoundCloud?
  3. What is SoundCloud? • social sound platform • record, upload, share, listen to all kinds of sound • Web, iPhone, Android, Mac • 15 million users • platform with 16k apps, over 250 in own app store
  4. Organization • 6 locations (4 offices, main office in Berlin) • 133 people (26 nationalities) • 21 teams (builders, operators, pushers) • 12 task forces, 5 work groups
  5. Organization Builders Operators Pushers Design Finance Platform Engineering HR Community Product Legal Content Marketing Office
  6. Engineering client HTML 5 Android iOS apps App API Partner Discovery Payment tools Audio Data Activities Bazooka Delivery Analysis Systems engineering
  7. soundcloud.com
  8. HTML 5
  9. iOS / Android
  10. Technology • Git / GitHub (133 private, 60 public repos) • Ruby, shell, JavaScript, Scala, CoffeeScript, C++, Go, Java, Python, Objective-C, C#, C • Rails, Chef, Node.js, jQuery, ... • MySQL, Cassandra, Hadoop, RabbitMQ, memcached, Solr, elasticsearch • S3, EC2, Akamai, Edgecast • nginx, Debian
  11. Request flow CDN Router Load balancer Cache HDFS memcached App / API MySQL Cassandra Worker RabbitMQ Activities Transcoder
  12. MySQL • main storage for all user data • mothership: 1 master, 9 slaves, 900 GB data • backup slave intentionally lags 1 hour behind • automated server selection based on slave consistency
  13. Cassandra I: activities • index for stream events • e.g. friend A liked track B, C uploaded track D • write-optimized (several thousand writes per second) • filled by activities, read from mothership • 2 clusters, with a data size of 1.1 TB
  14. Cassandra II: stitch • time series of events in different resolutions (hour, day, week, etc.) • for example: plays, likes, profile views, etc. • cluster of 16 nodes, 1 TB replicated data • data managed by own tool • at least 50k reads/second
  15. Hadoop • storage of all raw data, logs, events, etc. • 20 nodes, data size of 137 TB • map/reduce jobs for analytics • initial data source for stitch
  16. Workflow
  17. My roles at SoundCloud • app engineer (Feb 2011 - Sep 2011) • service architect (Oct 2011) • fire fighter (Nov 2011 - Dec 2011) • site reliability engineer (Jan 2012 - now)
  18. app engineer (Feb 2011 - Sep 2011) • Connect with Facebook, find FB friends • extended social network sharing • spam protection • lots of bug fixes • performance optimizations
  19. service architect (Oct 2011) • App / API publishes events (new Comment, Follower, etc.) • RabbitMQ • Notifications: API requests to enrich messages, delivers push notifications • APNS, Android
  20. fire fighter (Nov 2011 - Dec 2011) • MySQL database optimizations • Cassandra cluster split • User suspension • lots of performance optimizations
  21. site reliability engineer (Jan 2012 - now) • rewrite of online schema change tool • new CI test environment • tools to easily spawn new EC2 instances • rewrite of master/slave adapter • Ruby upgrade
  22. Thank you • http://soundcloud.com/jobs
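
Slide 14 describes stitch as a time series of events kept in different resolutions (hour, day, week, etc.). The sketch below illustrates that roll-up idea in Python. It is not SoundCloud's tool: the names `bucket_start`, `roll_up`, and `sample_events`, the in-memory `Counter`, and the choice of Monday as the start of a week are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def bucket_start(ts: datetime, resolution: str) -> datetime:
    """Truncate a timestamp to the start of its hour, day, or week.

    Hypothetical helper; weeks are assumed to start on Monday.
    """
    if resolution == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if resolution == "day":
        return day
    if resolution == "week":
        # weekday() is 0 for Monday, so this steps back to Monday 00:00.
        return day - timedelta(days=ts.weekday())
    raise ValueError(f"unknown resolution: {resolution}")


def roll_up(events, resolutions=("hour", "day", "week")):
    """Count events per (kind, resolution, bucket start).

    `events` is an iterable of (kind, timestamp) pairs, e.g. ("play", ts).
    A real store would write these counters to a database; a Counter
    stands in for it here.
    """
    counts = Counter()
    for kind, ts in events:
        for resolution in resolutions:
            counts[(kind, resolution, bucket_start(ts, resolution))] += 1
    return counts


# Made-up sample data, not taken from the talk.
sample_events = [
    ("play", datetime(2012, 6, 13, 10, 15)),
    ("play", datetime(2012, 6, 13, 10, 45)),
    ("play", datetime(2012, 6, 13, 23, 5)),
    ("like", datetime(2012, 6, 14, 9, 0)),
]

counts = roll_up(sample_events)
```

Each event increments one counter per resolution, so a read for "plays per day" becomes a single key lookup instead of a scan over raw events. In the sample data, the three plays on 13 June 2012 give a count of 2 in the 10:00 hour bucket and 3 in both the day bucket and the week bucket starting Monday, 11 June.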
