
Serverless data processing built for internet SCALE

Ilai Malka (Big Data Developer) & Opher Dubrovsky (Big Data Team lead) @ Nielsen:
You too can build a serverless data pipeline processing 250 billion events/day. In this talk you’ll hear details from a real-life ad delivery system we’ve built running on AWS Lambda serverless infrastructure.
You’ll hear about:
- System design & pitfalls to avoid
- Fault tolerance, self-healing and recoverability
- CI/CD process & avoiding development velocity slowdown

  1. Serverless Data Processing… Built for Internet SCALE. Copyright © 2017 The Nielsen Company (US), LLC. Confidential and proprietary. Do not distribute.
  2. About the speakers • Opher Dubrovsky, Big Data Team-Lead • Ilai Malka, Big Data Developer • Our post about serverless: https://medium.com/nmc-techblog/going-serverless-c334ae242ca6 • NMC Tech Blog: https://medium.com/nmc-techblog
  3. What You’ll Hear About • Our data pipeline • Why serverless? • Problems we had to solve • Solution architecture • Results
  4. About Nielsen Marketing Cloud (NMC) • Segmentation • Upload to networks • Run campaigns
  5. Example: a campaign for new Nike women's shoes, aimed at women who live in Tel Aviv (TLV) and like to run
  6. Users and Segments
     Users:
       ID      Segment Tags
       aa123   Tel-Aviv, female, sports, development
       c5551   Tel-Aviv, male, cooking_enthusiast
       bb34a   Tel-Aviv, female
     Segments:
       Segment              IDs (user list)
       Tel-Aviv             aa123, c5551, bb34a
       female               aa123, bb34a
       male                 c5551
       sports               aa123
       development          aa123
       cooking_enthusiast   c5551
  7. Users and Segments (same two tables as slide 6), plus a campaign question
     Campaign question: users in Tel-Aviv that are female
     Result: aa123, bb34a
  8. The Old System (diagram; labels: Files to Process, Servers, Java code, Platforms)
  9. The Old System (same diagram) • Hard to: scale, monitor, manage, add platforms
  10. The Challenges • 250 Billion Tags/Day • Scale up / down • Fault tolerance • Cost effective
  11. Solutions Explored • EC2 machines • Spark / EMR • Serverless (Lambda)
  12. Serverless (Lambda) • Costs can escalate • Environment limitations • Scale is built-in • Focus on your business goals
  13. The Architecture
  14. The Architecture
  15. The Data Path
  16. The Management Path
  17. Results • 250 Billion Tags/Day • 0.3 - 1 TB/hour • Quickly scales up and down • Top day ever: 750 Billion Tags/Day
  18. Cost • ~ $1000 per day • ~ $4 per billion tags • Total cost ($): 25,487 per month • Lambda 74% • S3 13% • DB 11% • Other 2%
  19. What we got for FREE • Scale up / down (a no-brainer) • Shorter time to market • Cost feedback loop • Canary deployment
  20. What we built on our own • Ad-network-specific requirements • Rate limiter / back pressure • Task management + prioritization • Cost control
  21. Should You Consider This? YES!
  22. Questions?
  23. This artwork was created using Nielsen data. Come and join us! https://www.comeet.co/jobs/nielsen/33.000
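
Slides 6 and 7 describe turning a per-user list of segment tags into a per-segment list of user IDs, then answering a campaign question by combining segments. A minimal Python sketch of that idea follows, using the sample data from the slides. The function names `build_segment_index` and `users_matching` are hypothetical, and nothing here is meant to reflect how the Nielsen pipeline is actually implemented; it only illustrates the lookup as an inverted index plus a set intersection.

```python
from collections import defaultdict

# Sample "Users" table from slide 6: user ID -> segment tags.
users = {
    "aa123": ["Tel-Aviv", "female", "sports", "development"],
    "c5551": ["Tel-Aviv", "male", "cooking_enthusiast"],
    "bb34a": ["Tel-Aviv", "female"],
}


def build_segment_index(users):
    """Invert user -> tags into segment -> set of user IDs (hypothetical helper)."""
    index = defaultdict(set)
    for user_id, tags in users.items():
        for tag in tags:
            index[tag].add(user_id)
    return dict(index)


def users_matching(index, required_tags):
    """Return the users that belong to every segment in required_tags."""
    sets = [index.get(tag, set()) for tag in required_tags]
    if not sets:
        return set()
    return set.intersection(*sets)


segments = build_segment_index(users)

# Campaign question from slide 7: users in Tel-Aviv that are female.
result = users_matching(segments, ["Tel-Aviv", "female"])
print(sorted(result))  # ['aa123', 'bb34a']
```

The result matches the one shown on slide 7. A segment that no user carries simply contributes an empty set, so the intersection is empty as well.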

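The cost figures on slide 18 can be cross-checked with a little arithmetic. The sketch below uses only the numbers shown on the slides; the 30-day month is an assumption, and the slide values are rounded, so the two daily estimates are expected to agree only roughly.

```python
# Figures taken from slides 17 and 18.
TAGS_PER_DAY_BILLIONS = 250
COST_PER_BILLION_TAGS_USD = 4
MONTHLY_TOTAL_USD = 25_487
SHARES = {"Lambda": 0.74, "S3": 0.13, "DB": 0.11, "Other": 0.02}

# Assumption (not stated on the slides): a 30-day month.
DAYS_PER_MONTH = 30

# Daily cost implied by the per-billion rate: 250 * $4.
daily_from_rate = TAGS_PER_DAY_BILLIONS * COST_PER_BILLION_TAGS_USD

# Daily cost implied by the monthly total.
daily_from_month = MONTHLY_TOTAL_USD / DAYS_PER_MONTH

# Monthly spend per service, rounded to whole dollars.
per_service = {name: round(MONTHLY_TOTAL_USD * share) for name, share in SHARES.items()}

print(f"Daily cost from rate:  ${daily_from_rate}")
print(f"Daily cost from month: ${daily_from_month:.0f}")
for name, usd in per_service.items():
    print(f"{name}: ${usd} per month")
```

Under these assumptions the per-billion rate gives $1000 per day, in line with the slide's "~ $1000 per Day", while the monthly total works out to roughly $850 per day, and Lambda accounts for about $18,860 of the monthly bill.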