
Leveraging an in-house modeling framework for fun and profit

Talk given by Mike Skarlinski and Brian Graham from WW (new Weight Watchers) data science team in 5th NYC RecSys meetup, June 20, 2019, hosted at WW HQ

Published in: Data & Analytics

  1. Leveraging an in-house modeling framework for fun and profit. Mike Skarlinski & Brian Graham, {michael.skarlinski, brian.graham}@weightwatchers.com, June 2019
  2. Outline
     • Introduction: data science at WW, the new Weight Watchers
     • Problem: scalable, simple modeling and recommendation systems with a small team
     • Solution: design and benefits of building a framework
     • Implementation: examples of deployed recommenders
  3. WW is a data-driven application to help members on their wellness journeys: member social network, activity and food tracking, weight progress and goals, recipe and food database.
  4. As a new team, we are tasked with building a foundation of data products. Product areas: Social Network (Connect), Growth, WW Program, Infrastructure. Data products: churn model, return model, LTV models, Single Member View, recipe recommender, similar recipes, composite foods ontology, personalized feed, groups search, who to follow, APIs, Primrose.
  5. Data science team’s success hinges on effectively sharing work and knowledge. Team: Brian Graham, Reka Daniel-Weiner, Yameng (Eliza) Zhang, Kevin Zecchini, Carl Anderson, Michael (Mike) Skarlinski, plus three open positions (hint hint). Timeline labels: Dec. 2019, May 2018, Jan. 2019, Mar. 2019, Feb. 2019 ... How can we build software that helps us grow and develop as a team?
  6. WW recommender and modeling challenges
  7. Taking stock of our own challenges at WW: what would make a good recommender system here? Serialization is slow, but our medium-sized data can be kept in RAM. We have no live features, but we know Docker and k8s. Onboarding should be easy, via a mono repo with config as code.
  8. We built a framework to solve our challenges and enforce our design decisions. (Open source coming soon!)
  9. Primrose: a framework for simple, quick modeling deployments
  10. Primrose has features to address each design consideration (data science, infrastructure, people):
      • A Python in-memory DAG runner, with no serialization between nodes of the DAG.
      • The DAG is defined with a configuration-as-code approach, with one container for all models.
      • ML and data manipulation operations are abstracted, so data scientists can easily extend the framework.
      Primrose (Production In-Memory Solution) is a framework for solving WW’s most common use cases: caching batched predictions, with machine-learning engineering baked in.
  11. Primrose jobs are executed as directed acyclic graphs (DAGs) in Python. Flexibility: any number of operations is allowed in a single DAG, across any Python library. Data and functions are passed between nodes in an object that understands how to extract the correct data for each node.
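
To make the in-memory DAG idea concrete, here is a minimal sketch of a runner that executes nodes in dependency order and keeps every result in a shared dictionary, so nothing is serialized between nodes. It is not the Primrose API: `run_dag`, `NodeFn`, and the node names are hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for whatever scheduler Primrose uses.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List

# Hypothetical type: a node is a function from upstream results to its own result.
NodeFn = Callable[[Dict[str, Any]], Any]


def run_dag(nodes: Dict[str, NodeFn], upstream: Dict[str, List[str]]) -> Dict[str, Any]:
    """Run each node once, in dependency order, sharing results in memory."""
    results: Dict[str, Any] = {}
    # static_order() yields every node after all of its predecessors
    # and raises graphlib.CycleError if the graph is not acyclic.
    for name in TopologicalSorter(upstream).static_order():
        # Hand each node only the outputs of its declared upstream nodes.
        inputs = {dep: results[dep] for dep in upstream.get(name, [])}
        results[name] = nodes[name](inputs)
    return results


# Toy three-node pipeline: read -> sort -> top.
nodes = {
    "read": lambda inputs: [3, 1, 2],
    "sort": lambda inputs: sorted(inputs["read"]),
    "top": lambda inputs: inputs["sort"][-1],
}
upstream = {"read": [], "sort": ["read"], "top": ["sort"]}
results = run_dag(nodes, upstream)
```

Because results stay in a plain dictionary, a downstream node receives the same Python objects its upstream node produced, which is the property the slide highlights for medium-sized data that fits in RAM.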
  12. DAGs are composed of implementation-agnostic, extensible nodes for data science. Data scientists can write any class that matches the abstract interface and incorporate it in their DAGs. Data scientists can write individual nodes using any Python framework or library they choose.
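
One way to picture "any class that matches the abstract interface" is an abstract base class with a single required method. The sketch below is an assumption about the shape of such a contract, not the actual Primrose interface: `AbstractNode`, `run`, and `ScaleNode` are hypothetical names.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class AbstractNode(ABC):
    """Hypothetical node contract; the real Primrose interface may differ."""

    def __init__(self, name: str, config: Dict[str, Any]) -> None:
        self.name = name
        self.config = config

    @abstractmethod
    def run(self, inputs: Dict[str, Any]) -> Any:
        """Consume upstream results and return this node's result."""


class ScaleNode(AbstractNode):
    """Example node: multiply every upstream value by a configured factor."""

    def run(self, inputs: Dict[str, Any]) -> List[float]:
        factor = self.config.get("factor", 1.0)
        return [value * factor for value in inputs["values"]]


scaled = ScaleNode("scale", {"factor": 2}).run({"values": [1, 2, 3]})
```

The body of `run` is free to call any Python library, which is what keeps the nodes implementation agnostic; the base class only fixes how a node is constructed and invoked.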
  13. Primrose is run like an ETL pipeline, in a single Docker container for each configuration.
  14. For simpler deployments, Primrose uses a “configuration as code” approach. Object configuration and DAG structure are built in a configuration JSON. Primrose validates the configuration and instantiates the correct classes at runtime, with different outputs and results for each DAG. Diagram: recipe recommender DAG JSON, churn model DAG JSON, and Connect feed DAG JSON go into the Primrose container; out come success, fame, money...
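
The validate-then-instantiate step can be sketched with a small class registry. Everything here is illustrative: the JSON schema (`class`, `params`, `upstream`), the `REGISTRY`, and the two node classes are assumptions made for the example, not the Primrose configuration format.

```python
import json
from typing import Any, Dict

# Hypothetical registry mapping the "class" string in a config to a Python class.
REGISTRY: Dict[str, type] = {}


def register(cls: type) -> type:
    REGISTRY[cls.__name__] = cls
    return cls


@register
class ListReader:
    def __init__(self, values):
        self.values = values

    def run(self, inputs):
        return list(self.values)


@register
class TopK:
    def __init__(self, k):
        self.k = k

    def run(self, inputs):
        (upstream_values,) = inputs.values()  # expects exactly one upstream node
        return sorted(upstream_values, reverse=True)[: self.k]


def load_config(text: str) -> Dict[str, Dict[str, Any]]:
    """Parse the JSON and fail fast on unknown classes or dangling edges."""
    config = json.loads(text)
    for name, spec in config.items():
        if spec.get("class") not in REGISTRY:
            raise ValueError(f"node {name!r}: unknown class {spec.get('class')!r}")
        for dep in spec.get("upstream", []):
            if dep not in config:
                raise ValueError(f"node {name!r}: unknown upstream {dep!r}")
    return config


def build_nodes(config: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Instantiate the class named by each node spec with its params."""
    return {
        name: REGISTRY[spec["class"]](**spec.get("params", {}))
        for name, spec in config.items()
    }


CONFIG = """
{
  "read": {"class": "ListReader", "params": {"values": [5, 9, 2]}},
  "top": {"class": "TopK", "params": {"k": 2}, "upstream": ["read"]}
}
"""
config = load_config(CONFIG)
nodes = build_nodes(config)
read_out = nodes["read"].run({})
top_out = nodes["top"].run({"read": read_out})
```

With this split, a new model means a new JSON file rather than new deployment code, which matches the slide's picture of several DAG configurations feeding one container image.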
  15. The framework has helped our team grow and develop production models. Deployed 3 production models and 3 production recommenders. Onboarded 6 members in less than a year; everyone is working in the framework! We’re going to open-source Primrose! Keep on the lookout or contact us!
  16. WW Recommender Examples
  17. Food is at the core of our product
  18. We know you and meet you where you are. Personalize your experience using your data. Example foods: coffee, croissant, fish tacos, apple, cobb salad, pasta with red sauce, ice cream.
  19. Recipe Recommendations: Similar Recipes, Dinner Recommendations
  20. Similar Recipes Flow. Input: US WW recipes (only recipes with images). Two similarity signals: similar ingredients and similar names. Each document is an ingredient list or a name string; documents are lemmatized, tokenized, and TF-IDF vectorized, then compared by cosine similarity and ranked. Filters: dietary, course, cuisine, main ingredient.
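
The similarity step can be illustrated with a small, dependency-free sketch. The deck's production DAG uses NLTK with custom lemmatization and scikit-learn; this version swaps in a whitespace tokenizer and a hand-rolled TF-IDF (a common smoothed-IDF variant) so it stays self-contained. The recipe data and function names are made up for the example.

```python
import math
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    # Lowercase whitespace split; a stand-in for lemmatization + tokenization.
    return text.lower().split()


def tfidf_vectors(docs: Dict[str, str]) -> Dict[str, Dict[str, float]]:
    """Build a sparse TF-IDF vector (term -> weight) for every document."""
    tokens = {key: tokenize(text) for key, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter(term for toks in tokens.values() for term in set(toks))
    vectors = {}
    for key, toks in tokens.items():
        counts = Counter(toks)
        vectors[key] = {
            # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        }
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def most_similar(key: str, vectors: Dict[str, Dict[str, float]], top_n: int = 2) -> List[str]:
    """Rank every other document by cosine similarity to `key`."""
    scores = [(other, cosine(vectors[key], vec)) for other, vec in vectors.items() if other != key]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scores[:top_n]]


# Toy ingredient-list documents (illustrative data, not WW recipes).
recipes = {
    "fish_tacos": "fish tortilla cabbage lime",
    "shrimp_tacos": "shrimp tortilla cabbage lime",
    "cobb_salad": "lettuce egg bacon avocado",
    "pasta": "pasta tomato garlic basil",
}
vectors = tfidf_vectors(recipes)
ranked = most_similar("fish_tacos", vectors, top_n=2)
```

Running the same functions on recipe name strings instead of ingredient lists would give the second signal the slide mentions; the business-logic filters would then be applied to the ranked candidates.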
  21. Productionalize in a Primrose DAG. Nodes: Google BigQuery data lake reader; NLTK + custom lemmatization; sklearn TF-IDF + cosine similarity; business logic (filters); write to GCS bucket and Google MemoryStore. Success! logging.info('Your newbie DS has written production quality code.')
  24. Dinner Recommendations Flow. Inputs: US WW recipes, similar ingredients and similar names, business logic. Eligible members: US members with 2 weeks of tracking history who tracked >= 1 recipe. Potential recs: for each tracked recipe, the most similar and 2nd most similar recipes (some marked X in the diagram), giving n = 4 recommendations.
  25. Productionalizing is easier the second time: same BQ reader class, different SQL input file; new postprocess class to sort, filter and interleave potential recommendations. Success! logging.warning('Data Scientist is developing software engineering skills.')
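
The "sort, filter and interleave" postprocess step might look roughly like the following. This is a sketch under assumptions: the deck does not spell out the interleaving rule, so the round-robin-by-rank order, the function name `interleave_recs`, and the exclusion set are all hypothetical.

```python
from itertools import zip_longest
from typing import Dict, List, Set


def interleave_recs(
    similar: Dict[str, List[str]],  # tracked recipe -> similar recipes, best first
    tracked: List[str],             # recipes the member tracked
    exclude: Set[str],              # e.g. recipes the member already tracked
    n: int = 4,
) -> List[str]:
    """Take every tracked recipe's best match, then every 2nd-best match, and so on."""
    ranked_lists = [similar.get(recipe, []) for recipe in tracked]
    seen = set(exclude)
    recs: List[str] = []
    # zip_longest groups candidates by rank; shorter lists are padded with None.
    for tier in zip_longest(*ranked_lists):
        for candidate in tier:
            if candidate is None or candidate in seen:
                continue  # filter padding, exclusions, and duplicates
            seen.add(candidate)
            recs.append(candidate)
            if len(recs) == n:
                return recs
    return recs


similar = {"a": ["a1", "a2"], "b": ["b1", "a1"], "c": ["a", "c2"]}
tracked = ["a", "b", "c"]
recs = interleave_recs(similar, tracked, exclude=set(tracked), n=4)
```

Interleaving by rank keeps the final list from being dominated by neighbours of a single tracked recipe, which is one plausible reason to add this step on top of the similar-recipes output.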
  26. Final Deployment Architecture: BigQuery data lake → Primrose containers (Dinner Recs and Similar Recipes, each refreshed daily) → MemoryStore Redis cache → Recipe Recs micro-service container (Flask API endpoint) → clients (Android, iOS, web).
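
The batch-write, cached-read pattern in this architecture can be sketched without any external services. Here a dict-backed class stands in for the Redis cache, and plain functions stand in for the daily Primrose job and the API handler; the key format `model:entity_id` and all names are assumptions for illustration.

```python
import json
from typing import Dict, List, Optional


class InMemoryCache:
    """Dict-backed stand-in for a key-value store such as Redis."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


def write_batch(cache: InMemoryCache, model: str, predictions: Dict[str, List[str]]) -> None:
    """Daily batch job: store each entity's precomputed recommendations as JSON."""
    for entity_id, recs in predictions.items():
        cache.set(f"{model}:{entity_id}", json.dumps(recs))


def read_recs(cache: InMemoryCache, model: str, entity_id: str) -> List[str]:
    """API handler logic: a cache lookup only, with no model code at request time."""
    raw = cache.get(f"{model}:{entity_id}")
    return json.loads(raw) if raw is not None else []


cache = InMemoryCache()
write_batch(cache, "dinner_recs", {"member-1": ["r1", "r2"]})
hit = read_recs(cache, "dinner_recs", "member-1")
miss = read_recs(cache, "dinner_recs", "member-2")
```

Keeping the serving path to a key lookup is what lets the models refresh on their own daily schedule while the client-facing endpoint stays simple.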
  27. Q & A. Open sourcing Primrose soon at https://github.com/ww-tech (tech blog: https://medium.com/ww-tech-blog)
