Recommendation Architecture: Understanding the Components of a Personalized Recommendation System
When we talk about recommendation systems, we typically focus on novel algorithms and formulations for collaborative filtering. However, building a system that recommends items to a user in a personalized way often involves many more components than a collaborative filter; it requires a much broader ecosystem of functionality, tools, and development pipelines. This presentation discusses a holistic approach to building recommendation systems, including 1) how A/B testing works with machine learning to iterate toward better recommendations, 2) how to couple an information-retrieval-based search stack with collaborative filtering to capture user intent in a personalized way, and 3) how to make recommendations more relevant and interpretable.
2. OpenTable: Deliver great experiences at every step, based on who you are
[Diagram: stages Before, During, and After for Diners and Restaurants, with the labels "Understanding & Evolving" and "Attracting & Planning"]
3. OpenTable in Numbers
• Our network connects diners with more than
32,000 restaurants worldwide.
• Our diners have spent more than $30 billion
at our partner restaurants.
• OpenTable seats more than 16 million diners
each month.
• Every month, OpenTable diners write more
than 450,000 restaurant reviews
7. What’s the Goal
Minimizing Engineering Time to Improve The
Metric that Matters
• Make it Easy to Measure
• Make it Easy to Iterate
• Reduce Iteration Cycle Times
8. Importance of A/B Testing
• If you don’t measure it,
you can’t improve it
• Metrics Drive Behavior
• Continued Forward
Progress
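One way to make "measure it" concrete is a significance check on the conversion rates of two A/B arms. The sketch below uses a pooled two-proportion z-test; the slides do not say which statistical test is used, so the test choice, the function name `two_proportion_z_test`, and the sample counts are assumptions for illustration only.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two_sided_p) for the conversion-rate difference B - A.

    Assumption: a pooled two-proportion z-test is an acceptable
    approximation for the experiment; the slides do not specify a test.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both arms need at least one observation")
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("conversion rate is 0 or 1 in both arms")
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: control converts 100/1000, treatment 130/1000.
z, p = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference is unlikely to be noise, but the business metric and the decision threshold still have to be chosen up front.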
9. Pick Your Business Metric
Revenue, Conversions
• OpenTable
• Amazon
Engagement
• Netflix
• Pandora
• Spotify
11. Measuring & The Iteration Loop
[Diagram: a loop between Optimize Models and A/B Testing, labeled Days, Weeks, Predict, and Measure]
12. Measuring & The Iteration Loop
[Diagram: a loop connecting Analyze & Introspect, Optimize Models, and A/B Testing, labeled Hours, Days, Weeks, Insights, Predict, and Measure]
13. Ranking Objectives
Objectives:
• Training Error
- Minimize Loss Function
§ Often Convex
• Generalization Error
- Precision at K
• A/B Metric
- Conversion / Engagement
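Of the three objectives above, Precision at K is the easiest to show in a few lines. The sketch below is a minimal offline version; the function name and the sample restaurant identifiers are hypothetical.

```python
def precision_at_k(ranked_items, relevant_items, k):
    """Fraction of the top-k ranked items that are relevant.

    ranked_items: item ids ordered best-first by the model.
    relevant_items: set of item ids the user actually engaged with
    in held-out data.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    # Divide by k (not len(top_k)) so short lists are penalized.
    return hits / k


# Hypothetical ranking and held-out bookings for one diner.
ranking = ["bistro", "sushi_bar", "steakhouse", "taqueria"]
booked = {"bistro", "steakhouse", "ramen_shop"}
print(precision_at_k(ranking, booked, k=3))
```

Averaging this value over many users gives an offline generalization metric that can be tracked before a model reaches an A/B test.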
14. Training, Generalization, and Online Error
• Training: Train on your specific dataset
- Dealing with Sparseness
• Test/Generalization: How does it generalize
to unseen data?
- Hyper-Parameter Tuning
• Online: How does it perform in the wild?
- Model interaction effects between recommended
items (diversity)
15. Fundamental Differences in Usage
Right now vs. Planning
Cost of Being Wrong
Search vs. Recommendations
16. Recommendation Stack
[Diagram: recommendation stack with the components Query Interpretation, Retrieval, Ranking (Item & Explanation), Index Building, Context for Query & User, Model Building, Explanation Content, Visualization, Collaborative Filters, and Item / User Metadata]
17. Using Context, Frequency & Sentiment
• Context
- Implicit: Location, Time, Mobile/Web
- Explicit: Query
• High End Restaurant for Dinner
- Low Frequency, High Sentiment
• Fast, Mediocre Sushi for Lunch
- High Frequency, Moderate
Sentiment
18. How to use this data
• Frequency Data:
- General: Popularity
- Personalized: Implicit CF
• Sentiment Data:
- General: Good Experience
- Personalized: Explicit CF
• Good Recommendation
- Use both to drive your Business Metric
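The non-personalized ("General") signals above can be sketched as follows: popularity comes from visit frequency and good experience from average star rating. The linear blend at the end is purely an assumption for illustration, since the slides only say to use both signals; all names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def general_signals(visits, ratings):
    """Compute popularity (visit counts) and mean star rating per restaurant.

    visits: iterable of (user, restaurant) pairs (frequency data).
    ratings: iterable of (user, restaurant, stars) triples (sentiment data).
    """
    popularity = Counter(restaurant for _, restaurant in visits)
    totals = defaultdict(lambda: [0.0, 0])  # restaurant -> [sum, count]
    for _, restaurant, stars in ratings:
        totals[restaurant][0] += stars
        totals[restaurant][1] += 1
    mean_rating = {r: s / n for r, (s, n) in totals.items()}
    return popularity, mean_rating


def blended_score(restaurant, popularity, mean_rating, weight=0.5, max_stars=5.0):
    """Hypothetical linear blend of normalized popularity and sentiment."""
    max_pop = max(popularity.values())
    pop = popularity.get(restaurant, 0) / max_pop
    sentiment = mean_rating.get(restaurant, 0.0) / max_stars
    return weight * pop + (1 - weight) * sentiment


# Frequent, moderately rated lunch sushi vs. a rare, highly rated steakhouse.
visits = [("u1", "sushi"), ("u2", "sushi"), ("u1", "sushi"), ("u3", "steak")]
ratings = [("u1", "sushi", 3), ("u2", "sushi", 3), ("u3", "steak", 5)]
popularity, mean_rating = general_signals(visits, ratings)
for name in ("sushi", "steak"):
    print(name, round(blended_score(name, popularity, mean_rating), 3))
```

In practice the weight would be chosen by whichever setting moves the business metric, not fixed by hand.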
19. Ranking
Phase 1: Bootstrap through heuristics
Phase 2: Learn to Rank
• Many models
- E [ Revenue | Query, Position, Item, User ]
- E [ Engagement | Query, Position, Item, User ]
- Regression, RankSVM, LambdaMART…
• Modeling Diversity is Important
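The slides do not say how diversity is modeled. As one simple illustration, the sketch below greedily re-ranks model scores with a penalty for repeating a category, in the spirit of maximal marginal relevance. The function name, the penalty value, and the cuisine categories are all assumptions.

```python
def rerank_with_diversity(candidates, scores, categories, penalty=0.2, k=3):
    """Greedily pick k items, discounting items whose category was already picked.

    scores: item -> relevance score from a ranking model.
    categories: item -> category label (for example, cuisine).
    penalty: score deducted per already-selected item of the same category.
    """
    selected = []
    remaining = list(candidates)
    seen = {}  # category -> number of selected items in that category
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda item: scores[item] - penalty * seen.get(categories[item], 0),
        )
        selected.append(best)
        seen[categories[best]] = seen.get(categories[best], 0) + 1
        remaining.remove(best)
    return selected


# Hypothetical candidates: two Italian restaurants score highest.
scores = {"a": 0.9, "b": 0.85, "c": 0.8, "d": 0.5}
categories = {"a": "italian", "b": "italian", "c": "sushi", "d": "steak"}
print(rerank_with_diversity(["a", "b", "c", "d"], scores, categories))
```

With the penalty set to zero the result is the plain score ordering, which makes the effect of the diversity term easy to compare in an A/B test.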
20. Training Example
• Context Free (Collaborative Filtering)
- Train for Content Based and Collaborative Filtering models.
- Create an Ensemble Model
- Perform Hyper-Parameter Tuning for each model
• With Context (Search)
- Train a model using query (implicit & explicit)
§ Includes Context-Free Model
- Perform Hyper-Parameter Tuning
• Evaluate Model using A/B
- Change models, objective functions, etc.
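The ensemble and tuning steps above can be sketched with a two-model blend whose mixing weight is picked by grid search on held-out data. This is a minimal illustration, not the pipeline from the talk: the linear blend, the squared-error objective, the grid, and the names `blend` and `tune_alpha` are assumptions.

```python
def blend(cf_score, content_score, alpha):
    """Ensemble of a collaborative-filtering score and a content-based score."""
    return alpha * cf_score + (1 - alpha) * content_score


def tune_alpha(validation, grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Pick the blend weight with the lowest mean squared error.

    validation: list of (cf_score, content_score, observed_outcome) triples
    taken from data the individual models were not trained on.
    """
    def mse(alpha):
        errors = [(blend(cf, cb, alpha) - y) ** 2 for cf, cb, y in validation]
        return sum(errors) / len(errors)

    return min(grid, key=mse)


# Hypothetical held-out examples.
validation = [(1.0, 0.0, 0.75), (0.0, 1.0, 0.25), (0.4, 0.8, 0.5)]
print(tune_alpha(validation))
```

The offline choice is only a starting point; the final comparison between blends would still go through an A/B test.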
21. Training DataFlow
[Diagram: Collaborative Filter Service (Realtime); Collaborative Filter HyperParameter Tuning (Batch with Spark); Collaborative Filter Training (Batch with Spark)]
22. Training DataFlow
[Diagram: Collaborative Filter Service (Realtime); Collaborative Filter HyperParameter Tuning (Batch with Spark); Collaborative Filter Training (Batch with Spark); Search Service (Realtime); Search HyperParameter Tuning (Batch with Spark); Search Training (Batch with Spark)]
23. Training DataFlow
[Diagram: Collaborative Filter Service (Realtime); Collaborative Filter HyperParameter Tuning (Batch with Spark); Collaborative Filter Training (Batch with Spark); Search Service (Realtime); Search HyperParameter Tuning (Batch with Spark); Search Training (Batch with Spark); User Interaction Logs (Kafka); A/B Testing Dashboards; Other Services]
26. Summarizing Content
• Essential for Mobile
• Balance Utility With Trust?
- Summarize, but surface raw
data
• Example:
- Initially, read every review
- Later, use average star rating
34. Topic Modeling Methods
We applied two main topic
modeling methods:
• Latent Dirichlet Allocation
(LDA)
- (Blei et al. 2003)
• Non-negative Matrix
Factorization (NMF)
- (Arora et al. 2012)
35. Bag of Words Model
Example review: "The food was great! I loved the view of the sailboats."
food   great   chicken   sailboat   view   service
1      1       0         1          1      0
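The bag-of-words encoding on this slide can be reproduced with a few lines of code. This is a sketch under stated assumptions: the crude plural stripping in `normalize` is a stand-in for whatever stemming or lemmatization was actually used, and the function names are hypothetical.

```python
import re

# Vocabulary taken from the slide.
VOCABULARY = ["food", "great", "chicken", "sailboat", "view", "service"]


def normalize(token):
    """Crude plural stripping (assumption) so "sailboats" matches "sailboat"."""
    if token.endswith("s") and len(token) > 3:
        return token[:-1]
    return token


def bag_of_words(text, vocabulary=VOCABULARY):
    """Count how often each vocabulary word occurs in the text."""
    tokens = [normalize(t) for t in re.findall(r"[a-z]+", text.lower())]
    return [tokens.count(word) for word in vocabulary]


review = "The food was great! I loved the view of the sailboats."
print(bag_of_words(review))  # [1, 1, 0, 1, 1, 0]
```

Word order is discarded, which is what makes the representation cheap to build and easy to feed into topic models.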
36. Topics with NMF using TF-IDF
            Word 1   Word …   Word N
Review 1    0.8      0.9      0
Review …    0.6      0        0.8
Review N    0.9      0        0.8
[Reviews x Words] ≈ [Reviews x Topics] · [Topics x Words]
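The factorization of the reviews-by-words matrix into reviews-by-topics and topics-by-words factors can be sketched with the classic multiplicative update rules for NMF. This is a dependency-free illustration, not the implementation used in the talk: the update rule, the iteration count, the random initialization, and the tiny matrix (taken from the slide's example weights) are assumptions.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factor V (reviews x words) into W (reviews x topics) and H (topics x words)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_rows)]
    return w, h


# Example TF-IDF weights from the slide (3 reviews x 3 words).
V = [[0.8, 0.9, 0.0], [0.6, 0.0, 0.8], [0.9, 0.0, 0.8]]
W, H = nmf(V, n_topics=2)
print("reconstruction error:", round(frobenius_error(V, W, H), 4))
```

Each row of `W` can be read as a review's topic weights, and each row of `H` as the words that characterize a topic. For real corpora a library implementation would be the more practical choice.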
37. Describing Restaurants as Topics
Each review for a given restaurant has a certain topic distribution. Combining them, we identify the top topics for that restaurant.
[Diagram: topic distributions over Topic 01 to Topic 05 for review 1, review 2, ..., review N, combined into a single topic distribution for the restaurant]
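Combining per-review topic distributions into a restaurant-level profile can be sketched as below. The slide does not say how the distributions are combined, so the simple average used here, the function name, and the sample numbers are assumptions.

```python
def restaurant_topics(review_topic_dists, top_n=2):
    """Average per-review topic distributions and return the top topic indices.

    review_topic_dists: one list of topic weights per review, all the same length.
    Returns (combined_distribution, indices_of_top_topics).
    """
    if not review_topic_dists:
        raise ValueError("need at least one review")
    n_reviews = len(review_topic_dists)
    n_topics = len(review_topic_dists[0])
    combined = [
        sum(dist[t] for dist in review_topic_dists) / n_reviews
        for t in range(n_topics)
    ]
    top = sorted(range(n_topics), key=lambda t: combined[t], reverse=True)[:top_n]
    return combined, top


# Hypothetical distributions over five topics for three reviews.
reviews = [
    [0.6, 0.1, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.1, 0.1, 0.1],
    [0.5, 0.3, 0.1, 0.05, 0.05],
]
combined, top = restaurant_topics(reviews)
print(top)
```

The returned indices are zero-based, so index 0 corresponds to Topic 01 in the diagram.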