Talk given at HealthRecSys workshop at RecSys, Copenhagen, 2019-09-20
Our purpose at WW (the new Weight Watchers) is to "inspire
healthy habits for real life. For people, families, communities, the
world - for everyone." For 56 years, we’ve been a leader in weight
loss. Now, however, our mission is bigger and broader: drive health and wellness, making healthy habits accessible to all, not just a few.
The question is, how do you deliver on that? Humans are notoriously fickle, stubborn, and irrational. They don’t always do what is in their best interests. Moreover, behavioral change and habit formation are genuinely hard. It is all too easy to skip going to the gym, to give in to that extra cookie, or to put off taking time for yourself to reduce stress.
In this session, we’ll discuss what’s involved in behavioral change
and nudges, and how the WW data science team are working on
personalized experiences, various recommenders, and other data
products at scale to aid our members’ success.
2. Outline
● Intro to WW: purpose, program, type and scale of data
● Behavioral Nudges
● WW Data Products
● Primrose: how we develop and deploy ML models
● Q&A
6. HEALTH PARADOX #1
We spend more time and
money than ever before on wellness,
but we’ve never been
more unhealthy.
7. HEALTH PARADOX #2
Despite all the advances
in science and food production,
eating healthy
has never felt more complicated.
14. All calories are NOT created equal.
SmartPoints is about health, not just calories
15. Nutritional science to make healthy eating simpler
SmartPoints nudges you towards a healthy eating
pattern with more fruits, vegetables and lean
protein, and less sugar and saturated fat.
∙ Calories establish the baseline.
∙ Sugar and Saturated Fat increase the
SmartPoints value.
∙ Protein lowers the SmartPoints value.
∙ Foods that form the foundation of a healthy
eating pattern have a SmartPoints value of
zero.
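The direction of each term above can be illustrated with a toy scoring function. This is only a sketch: the coefficients are invented and are not WW's SmartPoints formula, and `toy_points` and `ZERO_POINT_FOODS` are hypothetical names.

```python
# Toy illustration only: coefficients are made up, NOT WW's SmartPoints formula.
ZERO_POINT_FOODS = {"apple", "broccoli", "chicken breast"}  # hypothetical examples

def toy_points(name, calories, sugar_g, sat_fat_g, protein_g):
    """Return a non-negative integer score that follows the slide's directions."""
    if name in ZERO_POINT_FOODS:
        return 0  # foundation foods score zero
    score = (
        0.03 * calories     # calories establish the baseline
        + 0.1 * sugar_g     # sugar increases the value
        + 0.3 * sat_fat_g   # saturated fat increases the value
        - 0.1 * protein_g   # protein lowers the value
    )
    return max(0, round(score))

# Two hypothetical foods with the same calories but different nutrition.
cookie = toy_points("cookie", calories=200, sugar_g=18, sat_fat_g=5, protein_g=2)
yogurt = toy_points("plain yogurt", calories=200, sugar_g=6, sat_fat_g=1, protein_g=20)
```

With equal calories, the sugary, fatty item scores higher than the protein-rich one, which is the nudge the slide describes.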
16. ZeroPoint™ Foods
ZeroPoint foods form the
foundation of a healthy
eating pattern and have a
low risk of overeating.
They don’t have to be
weighed, measured, or
tracked.
25. Dynamics and Scale
In Q2:
● Social network: 2 million posts, 14 million comments, 70 million likes
● 1 million members tracked a physical activity
27. Member Journey (Illustrative)
Visitor → Sign up → On board, then any of:
● Track food
● Read article
● Do activity
● Buy snacks on e-comm
● Meditate
● Attend meeting
● Purchase snacks
● Post on Connect
● Personal coach
● Go on cruise
● Try meal kit
● Call customer service
● Redeem win
● Make recipe
● Use Alexa assistant
Almost none of this is personalized!
28. Big data:
● Food
● Activity
● Exercises
● Challenges
● Social network
● Workshops
● Personal Coaches
● CRM
● Fulfillment
● Meal kits
● Supermarket foods
● E-commerce
● Cruises
...for 56 years
29. Scale of Data
● Nackers et al. (2010) showed that fast (≥0.68 kg/week) weight loss in the first month predicts higher weight loss success at 6 months than slow (<0.23 kg/week) or moderate initial weight loss
● Sample size: 298
● We checked our weigh-in data to compare these results to what we observe in our member base
Nackers, Ross & Perri (2010). The Association Between Rate of Initial Weight Loss and Long-Term Success in Obesity Treatment: Does Slow and Steady Win the Race? Int J Behav Med. 17(3):161-167.
30. Scale of Data
● Members considered:
1) started and ended their membership between Apr 2017 and Apr 2019
2) were members for at least 6 months
3) weighed in in their first week, fourth week and sixth month
4) were obese at the beginning of their membership (BMI > 30)
● All analyses used (mostly) unfiltered self-reported data.
Sample size: 211,000 members!
31. 7 kg median weight loss after 6 months
● 211k members
● 54% digital (sampling bias)
● Mostly female
● Median birth year 1971
● Median start BMI: 35
● Median BMI @ 6 months: 32.5
32. Significant effect of initial weight loss rate on weight loss success
● Weight loss success defined as losing 5% or 10% of initial start weight
● Initial weight loss speed (members per group):
○ Fast (≥0.68 kg/week): 116,107
○ Moderate (≥0.23 and <0.68 kg/week): 61,204
○ Slow (<0.23 kg/week): 34,107
Chart: proportion of members who lost 5% or 10% of initial body weight at 6 months, by group*
*all differences statistically significant
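The grouping and success definitions above can be sketched as follows. This is a minimal illustration, not the team's analysis code: the function names are hypothetical, and dividing the first-month loss by 4 weeks is an assumption about how the rate was computed.

```python
def initial_rate_group(start_kg, week4_kg, weeks=4):
    """Classify first-month weight loss rate using the thresholds on the slide."""
    rate = (start_kg - week4_kg) / weeks  # kg lost per week (assumed definition)
    if rate >= 0.68:
        return "fast"
    if rate >= 0.23:
        return "moderate"
    return "slow"

def reached_target(start_kg, month6_kg, fraction):
    """True if the member lost at least `fraction` (0.05 or 0.10) of start weight."""
    return (start_kg - month6_kg) >= fraction * start_kg

# Hypothetical member: 100 kg at start, 96.8 kg at week 4, 92 kg at month 6.
group = initial_rate_group(100.0, 96.8)
lost_5 = reached_target(100.0, 92.0, 0.05)
lost_10 = reached_target(100.0, 92.0, 0.10)
```

Counting `lost_5` and `lost_10` within each group gives the proportions plotted on the slide.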
36. “Any addition to or modification of the environment
that influenced consumers in a predictable way,
without changing economic incentive”
Altering environment by changing presentation of options — called choice architecture
37. “Any aspect of the choice architecture that alters
people’s behavior in a predictable way (1) without
forbidding any options or (2) significantly changing
their economic incentives. Putting fruit at eye level
counts as a nudge; banning junk food does not.”
38. Nudging & Choice Architecture
● iNcentives
● Understand mappings
● Defaults
● Give feedback
● Expect error
● Structure complex choices
This is Thaler & Sunstein’s framework. Instead, I will use Blumenthal-Barby & Burroughs.
39. Nudging & Choice Architecture
Category | Explanation | Examples
Priming | Subconscious; physical, verbal, sensational | Place unhealthy options out of sight or farther away in cafe
Salience | Informational; attention grabbing; emotional | Calorie label; graphic image on cigarette cartons; recommenders
Default | Pre-set default choice; good option for do-nothing behavior | Automatic opt-in, have to explicitly change or opt-out: benefits, organ donation
Incentive / Gamify | Reward or punish for behavior; recognition | Badge; coupon; $$?
Commitment / Ego | Get someone to make a commitment; leverage ego, pride | Sign up for 5k; invest (pay for membership); share with friends
Norms / Messenger | Use others to establish a norm and for consumer to compare themselves | 80% of (other) people are organ donors; most people wear seatbelts
Blumenthal-Barby & Burroughs. (2012). Seeking better health care outcomes. The ethics of using the nudge. Am. J. Bioethics 12(2):1-10.
41. Everything is on the menu
Members are empowered with lots of choices.
We are not dictating diet, exercise regime, etc. Hence, these are nudges.
43. Priming
Visibility, accessibility, availability
Example notifications:
● Wellness Wins awareness: “You just earned 5 WellnessWins! Find out how your healthy habits can pay off with WW's rewards program.”
● Wellness Wins redemption: “Your rewards are waiting... Way to go, Eric! 🎉 You’ve earned 1,500 Wins. Redeem them now.”
● Dinner recommendation: “What’s for dinner? 🤔 These recipes were picked just for you and your daily SmartPoints Budget.”
● Weight tracking reminder: “Reminder: It’s time to weigh in. Members who track their weight regularly tend to lose more.”
We plan to leverage notification nudges, similar to these, to help keep people on track.
45. Saliency
Meaningful, relevant info
● Points on a very large number of foods
● Clear “mappings” make it easier to make good choices:
○ instead of calorie counting, 300 cal (or is it kJ, kcal?) → 5 points
60. Incentives / Gamify
WellnessWins™
A first-of-its-kind program that rewards members for building healthy habits.
You earn “Wins” for:
● Tracking meals (breakfast, lunch, dinner)
● Tracking activity
● Tracking weight
● Attending workshops
61. Incentives / Gamify
Milestones are rewarded for:
• Weight loss: 5, 10, 25, 50, 75, 100, 125,
150, 175 & 200 lbs
• Goal weight
WellnessWins celebrates outcomes
“I’ve got my keyring with all my little bangles hanging off of it. I love that
thing. It might seem stupid but it was just fun to get those rewards along
the way, a physical manifestation of your success”
69. Motivation
● Member is doing the work.
● Important for them to
remind themselves why
they are doing this
70. Willpower
Duckworth, Milkman, & Laibson. (2018). Beyond willpower: strategies for reducing failures of self-control. Psychological Science in the
Public Interest. 19(3) 102–129
“Weight Watchers, for example, coaches dieters to use an array of self-deployed situational and cognitive
strategies and, in addition, sponsors in-person meetings, communicates social norms, and provides a
phone app to track eating and exercise”
71. Effect Sizes
Cadario & Chandon. (2019). Which healthy nudges work best? A meta-analysis of field experiments. Marketing Science. DOI: 10.1287/mksc.2018.1128
● Meta-analysis of 90 articles and 96 field experiments (299 effect sizes)
● Average effect of healthy eating nudges: Cohen’s d = 0.23
● = 124 kcal change in daily intake, or -7.2%
● 8 tablespoons sugar / day
72. Summary
Incentives / Recognition at multiple temporal scales
● Initial: first track, first barcode scan
● Daily: blue dot
● Weekly: weight check in
● Continuous: streaks
● Monthly: review
● Event: Wins, milestone
● Annual: annual review???
Multiple types of recognition
● Non tangible: badges, kudos
● Peers: Connect
● Goods in kind: Wins
Highly primed experience
● Easy: SmartPoints, FitPoints
● Available: in your pocket, AMZN, neighborhood
● Supportive: community
High saliency
● Progress
● Food, fitness
● Tips
73. Columns: Initial | Daily | Weekly | Monthly | Milestone | Year. Items per row, in slide order:
Food
● Saliency: MyDay blue dot; MyDay blue dot; MyDay blue dot; newsletter; month in review; badges/tips?
● Reward & Recognition (R&R): badges; #bluedotchallenge; badges; streaks; #bluedotchallenge; badges; recommenders; #bluedotchallenge; badges; month in review; Wellness Wins; badges
● Defaults: SP budget?
Fitness
● Saliency: MyDay; MyDay; newsletter; month in review
● R&R: badges; badges; badges; month in review
Weight
● Saliency: coaches / meetings; month in review
● R&R: badges; badges; badges; coaches / meetings; month in review; Wellness Wins; badges
● Defaults: check-in day
76. Data products at WW
● Growth: churn model, return model, LTV models, Single Member View
● Program: recipe recommender, similar recipes, auto-tags, clustering member foods, composite foods
● Social Network (Connect): personalized feed, groups search, who to follow
● Infrastructure: APIs, Primrose
81. Recipe Recommendations
Similar Recipes Dinner Recommendations
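from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# note: push tokenization and handling of n-grams down to tokenize() in concrete classes
self.tfidf = TfidfVectorizer(tokenizer=self.tokenize)
self.term_document_matrix = self.tfidf.fit_transform(self.docs)

def cosine_similarity_matrix(self):
    # pairwise cosine similarity between all recipe documents
    return cosine_similarity(self.term_document_matrix)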
82. Similar Recipes Flow
Input: US WW Recipes (*only recipes with images*)
Two similarity views: Similar Ingredients, Similar Names
● document = ingredient list or name string
● lemmatize, tokenize, TF-IDF
● Cosine similarity
● Rank
Filters: dietary, course, cuisine, main ingredient
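The flow above can be sketched end to end in plain Python. This is a dependency-free stand-in for the scikit-learn pipeline, not the production code: the tokenizer is a simplification of the lemmatize step, the idf uses the same smoothed form that scikit-learn applies by default, and the recipes are hypothetical.

```python
import math
from collections import Counter

def tokenize(doc):
    # Stand-in for lemmatize + tokenize: lowercase, split on commas and spaces.
    return doc.lower().replace(",", " ").split()

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (tf * smoothed idf)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(tokens).items()} for tokens in tokenized]

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def most_similar(recipes, query, k=2):
    """Rank the other recipes by cosine similarity to `query`."""
    names = list(recipes)
    vecs = dict(zip(names, tfidf_vectors([recipes[n] for n in names])))
    scores = [(cosine(vecs[query], vecs[n]), n) for n in names if n != query]
    return [n for _, n in sorted(scores, reverse=True)[:k]]

# Hypothetical recipes: document = ingredient list.
recipes = {
    "chicken stir fry": "chicken, broccoli, soy sauce, garlic, ginger",
    "beef stir fry": "beef, broccoli, soy sauce, garlic, ginger",
    "banana pancakes": "banana, egg, flour, milk",
    "garlic chicken": "chicken, garlic, lemon, olive oil",
}
ranked = most_similar(recipes, "chicken stir fry")
```

Swapping the ingredient strings for recipe names gives the "similar names" view; the dietary, course, and cuisine filters would be applied to `ranked` afterwards.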
83. Dinner Recommendations Flow
Inputs: US WW Recipes; Similar Ingredients; Similar Names
Eligible Members:
● 2 weeks of tracking history
● Tracked >= 1 recipe
● US members
Potential Recs: for each tracked recipe, the most similar and 2nd most similar recipes (some candidates are marked X on the slide)
Business Logic: n = 4 recommendations
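One way to turn the per-recipe candidates into n = 4 recommendations is to interleave them by rank and drop unusable ones. This is a sketch under assumptions: that the X marks stand for filtered candidates (here, recipes already tracked or already chosen), and that `dinner_recs` and the data are hypothetical.

```python
from itertools import zip_longest

def dinner_recs(tracked, similar, n=4):
    """Interleave similar recipes for each tracked recipe, best matches first."""
    seen = set(tracked)  # assumption: never recommend something already tracked
    ranked_lists = [similar.get(recipe, []) for recipe in tracked]
    recs = []
    # zip_longest walks rank 1 for every tracked recipe, then rank 2, and so on.
    for same_rank in zip_longest(*ranked_lists):
        for candidate in same_rank:
            if candidate is not None and candidate not in seen:
                seen.add(candidate)
                recs.append(candidate)
                if len(recs) == n:
                    return recs
    return recs

# Hypothetical data: each tracked recipe maps to its most similar recipes, in order.
similar = {
    "chili": ["bean soup", "taco bowl"],
    "taco bowl": ["burrito", "nachos"],
    "pad thai": ["bean soup", "spring rolls"],
}
recs = dinner_recs(["chili", "taco bowl", "pad thai"], similar)
```

Interleaving by rank keeps the list from being dominated by a single tracked recipe.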
84. Food Embeddings
● Motivation: want to learn a space of foods where
similar foods are located near each other
● Applications
○ Recommend low point substitute foods
○ Input into recipe recommender
○ Classify new foods and users
● How to do this? Word embeddings!
85. Word Embedding Overview
● Dense real-valued vectors representing word
meaning
● Idea: words with similar meanings are grouped
together in the embedding space
● Many forms of meaning are conflated since there
is only one representation per word
86. FastText Behind the Scenes
● Learns embeddings using either Skip-gram or
CBOW algorithms
● But learns representations for sub-word units
rather than entire words
● Representations for whole words are composed
from subword representations
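The subword units can be illustrated with a few lines of Python. This sketch only extracts character n-grams with `<` and `>` boundary markers (3 to 6 characters, fastText's default range); it leaves out the hashing and vector lookup that the library performs, and `char_ngrams` is a hypothetical helper.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Character n-grams of a word wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    return [
        wrapped[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(wrapped) - n + 1)
    ]

grams = char_ngrams("milk", min_n=3, max_n=4)
# grams == ['<mi', 'mil', 'ilk', 'lk>', '<mil', 'milk', 'ilk>']

# Food names that share subwords share part of their representation.
shared = set(char_ngrams("milk")) & set(char_ngrams("milkshake"))
```

Because "milk" and "milkshake" share n-grams such as `<milk`, their composed vectors start out related even if one of them is rare in the data.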
87. Preliminary Attempts
● Attempt 1 (using food log data) [1]:
○ Split food names into tokens
○ Each food name = 1 document
○ Average token embeddings for food name
○ Append calorie-normalized nutritional info
○ Did not work well, but might work with better preprocessing
● Attempt 2 (using recipe data) [2]:
○ Context = recipe ingredients
○ Each recipe = 1 document
○ Did not work well, recipe data too small
[1] Diet2Vec: Multi-scale analysis of massive dietary data; Tansey et al. 2016
[2] Cooking up Food Embeddings; Sauer et al. 2017
88. Final Attempt
● Context = ordered food entries, grouped by user id and time of day (meal type) over one week
● Preprocessed data same way as in [2]
● Each “word” in a document is a whole food name
● Best result from using subword unit modeling
● Other ideas: filtering for power users
Example document:
(UUID=1234, breakfast, week1) = [Monday breakfast, Tuesday breakfast, Wednesday breakfast, ...]
= [coffee, toast, jam, apple, coffee, orange_juice, tea, cereal, 2%_milk, banana, ...]
Will learn associations among items:
● within meal: cereal↔2% milk, cereal↔whole milk
● among meals: apple↔banana, coffee↔tea
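The document construction described above can be sketched as a simple grouping step. The row layout, field order, and `build_documents` are assumptions for illustration; the resulting token lists are what an embedding trainer would consume.

```python
from collections import defaultdict

def build_documents(entries):
    """Group food log rows into one ordered 'document' per (user, meal type, week)."""
    docs = defaultdict(list)
    # Sort by timestamp so each document keeps the order in which foods were tracked.
    for user_id, meal_type, week, _timestamp, food in sorted(entries, key=lambda e: e[3]):
        token = food.lower().replace(" ", "_")  # whole food name becomes one "word"
        docs[(user_id, meal_type, week)].append(token)
    return dict(docs)

# Hypothetical rows: (user id, meal type, week, ISO timestamp, food name)
entries = [
    (1234, "breakfast", 1, "2019-04-02T08:00", "Cereal"),
    (1234, "breakfast", 1, "2019-04-01T08:05", "Toast"),
    (1234, "breakfast", 1, "2019-04-01T08:00", "Coffee"),
    (1234, "breakfast", 1, "2019-04-02T08:01", "2% milk"),
    (1234, "dinner", 1, "2019-04-01T19:00", "Orange juice"),
]
docs = build_documents(entries)
```

Keeping several days of the same meal in one document is what lets the model see both within-meal pairs (cereal, 2%_milk) and across-day alternatives (coffee, tea).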
89. Post-processing for Substitute Extraction
● One of the main goals of the project was to extract
substitute food items
● Food data contains category information
● Simply eliminate results from NN list that are not in
the same category
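The category filter on the nearest-neighbour list can be sketched as below. The 2-d vectors, categories, and `substitutes` function are made up for illustration; real embeddings would come from the trained model.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def substitutes(query, embeddings, categories, k=2):
    """Nearest neighbours of `query`, keeping only foods in the same category."""
    scored = sorted(
        ((cosine(embeddings[query], vec), food)
         for food, vec in embeddings.items() if food != query),
        reverse=True,
    )
    same_category = [f for _, f in scored if categories[f] == categories[query]]
    return same_category[:k]

# Hypothetical 2-d embeddings and categories, for illustration only.
embeddings = {
    "whole_milk": (0.9, 0.1),
    "2%_milk": (0.88, 0.15),
    "cereal": (0.85, 0.2),
    "skim_milk": (0.8, 0.3),
    "apple": (0.1, 0.9),
}
categories = {"whole_milk": "dairy", "2%_milk": "dairy", "skim_milk": "dairy",
              "cereal": "grains", "apple": "fruit"}
subs = substitutes("whole_milk", embeddings, categories)
```

Here cereal is a close neighbour of whole milk because the two are eaten together, but it is not a substitute, so the category filter removes it.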
92. Seeking positivity, Getting help, Sharing goals, Encouraging others, Making friends, Building a brand
Qualitative research from our experience research / analytics teams
93. Seeking positivity (1 of 6)
“I want to feel good and see inspirational and useful posts.”
94. Getting help (2 of 6)
“I post when I have questions or need encouragement.”
95. Sharing goals (3 of 6)
“I post to show what I did or am going to do.”
96. Encouraging others (4 of 6)
“I give back the support I received to those who need it.”
97. Making friends (5 of 6)
“I build and invest in meaningful relationships.”
98. Building a brand (6 of 6)
“I create content to grow a large following.”
99. “Hidden” agendas may change throughout the day or over the course of a member’s journey
102. Group Search
Example: I love biking. Is there a group for this? (Groups: Cycle, Camping, Brides, ...)
Issue: search only uses title and description, not content
Solution: we’ve provided top 100 terms + top 100 hashtags per group
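Extracting the top terms and hashtags for a group could look like the sketch below. It is a simplified assumption of the approach: plain frequency counts with no stop-word removal or lemmatization, and `group_search_terms` plus the posts are hypothetical.

```python
import re
from collections import Counter

def group_search_terms(posts, top_n=100):
    """Return the most common plain terms and hashtags in a group's posts."""
    terms, hashtags = Counter(), Counter()
    for post in posts:
        for token in re.findall(r"#?\w+", post.lower()):
            counter = hashtags if token.startswith("#") else terms
            counter[token] += 1
    return ([t for t, _ in terms.most_common(top_n)],
            [h for h, _ in hashtags.most_common(top_n)])

# Hypothetical posts from a cycling group.
posts = [
    "Rode 20 miles on my bike today #cycling",
    "New bike day! #cycling #bikelife",
    "Anyone biking this weekend? #cycling",
]
terms, hashtags = group_search_terms(posts, top_n=2)
```

Indexing these lists alongside each group's title and description lets a search for "bike" find the group even when the title only says "Cycle".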
109. Taking Stock of our own challenges
What would make a good recommender system at WW?
● Slow serialization, but our medium data can be kept in RAM...
● No live features, but we know Docker, k8s...
● Easy onboarding: mono repo with config as code...
110. Primrose has features to address each design consideration
● Data science: Python in-memory DAG runner, with no serialization between nodes of the DAG.
● Infrastructure: DAG is defined with a configuration-as-code approach; one container for all models.
● People: abstract ML and data manipulation operations; data scientists can easily extend the framework.
Primrose (Production In-Memory Solution): a framework for solving WW’s most common use cases, caching batched predictions with machine-learning engineering baked in.
114. Primrose jobs are executed as directed acyclic graphs (DAGs) in Python
115. DAGs are composed of implementation-agnostic, extensible nodes for data science
Data scientists can write individual nodes using any Python framework or library they choose
116. Primrose is run like an ETL pipeline in a single Docker container for each configuration
117. Single Primrose image
Each node reads from and writes to a shared Primrose DataObject:
● Read prediction data (PostgreSQL): add features
● Read trained model (GCS): add model
● Clean prediction features: get features, add cleaned features
● Get model predictions: get model and cleaned features, add predictions
● Write model predictions (GCS): get predictions
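The idea of nodes sharing one in-memory object, with no serialization between steps, can be sketched as follows. This is not Primrose's actual API: the `DataObject` class here is a minimal stand-in, and the node functions replace the database and storage reads with hard-coded values.

```python
class DataObject:
    """Minimal stand-in for a shared in-memory store passed between nodes."""
    def __init__(self):
        self._data = {}

    def add(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]

def read_features(data):
    data.add("features", [[1.0, None], [2.0, 3.0]])  # stands in for a database read

def read_model(data):
    data.add("model", lambda row: sum(row))  # stands in for a model loaded from storage

def clean_features(data):
    cleaned = [[x if x is not None else 0.0 for x in row] for row in data.get("features")]
    data.add("cleaned_features", cleaned)

def predict(data):
    model = data.get("model")
    data.add("predictions", [model(row) for row in data.get("cleaned_features")])

def run(nodes):
    data = DataObject()  # one object lives in RAM for the whole run
    for node in nodes:   # nodes are listed in dependency order
        node(data)
    return data

result = run([read_features, read_model, clean_features, predict])
predictions = result.get("predictions")
```

Nothing is written to disk between steps, which is the trade-off the slides describe for medium-sized data that fits in memory.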
118. For simpler deployments: Primrose uses a “configuration as code” approach
● Object configuration and DAG structure are built in a configuration JSON
● Primrose validates the configuration and instantiates the correct classes at runtime
● Different outputs and results for each DAG
Recipe recommender DAG JSON, Churn Model DAG JSON, Connect Feed DAG JSON → Primrose container → Success, fame, money...
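A configuration-as-code runner can be sketched in a few lines. The JSON schema, class registry, and function names below are hypothetical and much simpler than Primrose's real configuration format; the sketch only shows the validate, instantiate, and run-in-dependency-order steps, using the standard library's `graphlib` (Python 3.9+).

```python
import json
from graphlib import TopologicalSorter  # Python 3.9+

class Reader:
    def run(self, inputs):
        return [3, 1, 2]

class Sorter:
    def run(self, inputs):
        return sorted(inputs[0])

REGISTRY = {"Reader": Reader, "Sorter": Sorter}  # hypothetical node classes

CONFIG = json.loads("""
{
  "nodes": {
    "read": {"class": "Reader", "after": []},
    "sort": {"class": "Sorter", "after": ["read"]}
  }
}
""")

def validate(config):
    """Reject unknown classes and dependencies before anything runs."""
    for name, spec in config["nodes"].items():
        if spec["class"] not in REGISTRY:
            raise ValueError(f"{name}: unknown class {spec['class']}")
        for dep in spec["after"]:
            if dep not in config["nodes"]:
                raise ValueError(f"{name}: unknown dependency {dep}")

def run_dag(config):
    validate(config)
    graph = {name: spec["after"] for name, spec in config["nodes"].items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():  # raises CycleError on cycles
        spec = config["nodes"][name]
        node = REGISTRY[spec["class"]]()  # instantiate the class at runtime
        results[name] = node.run([results[d] for d in spec["after"]])
    return results

results = run_dag(CONFIG)
```

A different model is then just a different JSON file run by the same container image.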
121. Productionalize Similar Recipes in Primrose DAG
● Google BigQuery data lake reader
● NLTK + custom lemmatization
● Sklearn TF-IDF + cosine similarity
● Business logic (filters)
● Write to GCS bucket and Google Memorystore
Success!
logging.info('Your newbie DS has written production quality code.')
125. Productionalizing is easier the second time (Dinner Recommendations)
● Same BQ reader class, different SQL input file
● New postprocess class to sort, filter and interleave potential recommendations
Success!
logging.warning('Data Scientist is developing software engineering skills.')
127. Wrap Up
Nudges:
● Once is not enough: nudge at different times, channels, and timescales
● Recognition is really important: nudge before, recognize after
● Holistic view: challenges, community, personality
Primrose:
● In-memory, config-as-code, extensible
● Helped our new team be productive and get models into prod
● Available today
128. Questions
● carl.anderson@ww.com
● @leapingllamas
● Food RecSys: https://arxiv.org/abs/1809.02862
● Primrose: https://github.com/ww-tech/primrose
● Tech blog: https://medium.com/ww-tech-blog
Hiring: especially data scientists
in Toronto