Conventionally, when we talk about recommender systems, we talk about collaborative filtering. While personalized recommendations through collaborative filtering are essential to effective recommendations, they are only one piece of a much broader ecosystem of functionality, tools, and development pipelines. This presentation discusses a holistic approach to building recommendation systems, including 1) iterating towards better recommendations, 2) the data pipelines required, 3) a machine-learned ranking approach based on an Information Retrieval formulation that leverages collaborative filtering, and 4) ways to make recommendations more relevant and interpretable.
2. OpenTable: Becoming an Experience Company
[Diagram: stages BEFORE, DURING, AFTER for RESTAURANTS and DINERS. Labels: Sharing & Remembering; Understanding & Evolving; Discovery & Convenience; Attracting & Planning; Delightful Dining]
Proprietary
3. Deliver great experiences at every step, based on who you are
[Diagram: stages BEFORE, DURING, AFTER for RESTAURANTS and DINERS. Labels: Understanding & Evolving; Attracting & Planning]
4. OpenTable in Numbers
• Our network connects diners with approximately 32,000 restaurants worldwide.
• Our diners have spent more than $25 billion at our partner restaurants.
• OpenTable seats more than 15 million diners each month.
• Every month, OpenTable diners write more than 400,000 restaurant reviews.
7. Building Recommendation Systems
• Importance of A/B Testing
• Generating Recommendations
• Recommendation Explanations
• Recommendation Infrastructure
8. What’s the Goal
Minimizing Engineering Time to Improve The Metric that Matters
• Make it Easy to Measure
• Make it Easy to Iterate
• Reduce Iteration Cycle Times
9. Importance of A/B Testing
• If you don’t measure it, you can’t improve it
• Metrics Drive Behavior
• Continued Forward Progress
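As an illustration of "measuring it", here is a minimal sketch of a two-proportion z-test for comparing conversion rates between a control and a variant. The function name and the counts are hypothetical; the slides do not say which statistical test is used in practice.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two_sided_p) for the difference in conversion rate, B minus A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both arms convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value of a standard normal, via the complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: control converts 500 of 10,000, variant 560 of 10,000.
z, p = two_proportion_z_test(500, 10_000, 560, 10_000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference is unlikely to be noise; the threshold to act on is a business decision.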
10. Pick Your Business Metric
Revenue, Conversions or Satisfaction
• OpenTable
• Amazon
Engagement
• Netflix
• Pandora
• Spotify
13. Measuring & The Iteration Loop
[Diagram: iteration loop with two stages. Optimize Models (Days, Predict); A/B Testing (Weeks, Measure)]
14. Measuring & The Iteration Loop
[Diagram: iteration loop with three stages. Analyze & Introspect (Hours, Insights); Optimize Models (Days, Predict); A/B Testing (Weeks, Measure)]
15. Fundamental Differences in Usage
Right now vs. Planning
Search vs. Recommendations
Cost of Being Wrong
16. Recommendation Stack
[Diagram: stack components. Query Interpretation; Retrieval; Collaborative Filters; Item / User Metadata; Ranking – Item & Explanation; Index Building; Context for Query & User; Model Building; Explanation Content; Visualization]
17. Query Interpretation & Retrieval
• Get User Intent
• Two Solutions
- Spelling Correction
- Auto Complete
• One Box, Many Types
- Name
- Cuisine
18. Ranking Objectives
Objectives:
• Training Error
- RMSE
• Generalization Error
- Precision at K
• A/B Metric
- Conversion
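The two offline objectives named above can be sketched as follows. This is a minimal illustration using the standard definitions of RMSE and Precision at K; the function names and sample data are hypothetical.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))

def precision_at_k(ranked_items, relevant_items, k):
    """Fraction of the top-k ranked items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    return hits / k

# Hypothetical data: two predicted ratings, and a ranked list of four restaurants.
print(rmse([4.0, 3.0], [5.0, 3.0]))
print(precision_at_k(["a", "b", "c", "d"], {"a", "c"}, k=3))
```

RMSE scores rating predictions, while Precision at K scores the ranked list a user would actually see; the online A/B metric remains the final arbiter.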
19. Ranking
Phase 1: Bootstrap through heuristics
Phase 2: Learn to Rank
• E [ Revenue | Query, Position, Item, User ]
• E [ Engagement | Query, Position, Item, User ]
• Modeling Diversity is Important
23. Modeling Confidence
• Understand Intersection of
- Support of User
- Support of Item
• How does support affect variability of prediction?
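One simple way to see how support affects variability is the standard error of a mean rating, which shrinks as the number of ratings grows. This is only a sketch under that assumption; the slides do not specify how confidence is actually modeled, and the ratings below are made up.

```python
import math
import statistics

def mean_with_standard_error(ratings):
    """Return (mean, standard error of the mean) for a list of ratings."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    mean = statistics.fmean(ratings)
    # Sample standard deviation divided by sqrt(n): more support, less variability.
    std_err = statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, std_err

# Hypothetical items with the same rating mix but different support.
low_support = [5, 3, 4, 2]          # 4 ratings
high_support = [5, 3, 4, 2] * 25    # 100 ratings

mean_low, se_low = mean_with_standard_error(low_support)
mean_high, se_high = mean_with_standard_error(high_support)
print(mean_low, se_low)
print(mean_high, se_high)
```

Both items have the same mean rating, but the estimate for the item with 100 ratings is far tighter than the one with 4.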
24. Frequency, Sentiment, and Context
• High End Restaurant for Dinner
- High Sentiment, Low Frequency
• Fast, Mediocre Sushi for Lunch
- High Frequency, Moderate Sentiment
25. How to use this data
• Frequency Data:
- General: Popularity
- Personalized: Implicit CF
• Sentiment Data:
- General: Good Experience
- Personalized: Explicit CF
• Good Recommendation
- Use both to drive your Business Metric
26. Collaborative Filtering Architecture
[Diagram: Hyper-Parameter Tuning (many days), Full Trainer (many hours), and Incremental Trainer (a few seconds) around a Model that takes a (User, Item) pair and outputs a Predicted Rating]
28. Reviews come in all shapes and sizes!
This really is a hidden gem and I'm not sure I want to share but I will. :) The owner, Claude, has been here for 47 years
and is all about quality, taste, and not overcharging for what he loves. My husband and I don't often get into the city at
night, but when we do this is THE place. The Grand Marnier Souffle' is the best I've had in my life - and I have a few
years on the life meter. The custard is not over the top and the texture of the entire dessert is superb. This is the only
family style French restaurant I'm aware of in SF. It also doesn't charge you an arm and a leg for their excellent quality
and that also goes for the wine list. Soup, salad, choice of main (try the lamb shank) and choice of dessert - for around
$42 w/o drinks.
“SUPERB!”
Bay Area Reviews
Post Jan 2013
32. Generating Topic Features
• Stop Words & Stemming
• Bag of Words Model
• TF/IDF
• Topic Modeling
• Describe Restaurants as Topics
33. Stop Words & Stemming
The food was great! I loved the view of the sailboats.
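A minimal sketch of what stop-word removal and stemming do to the example sentence. The stop-word list is a tiny illustrative one and `naive_stem` is a hypothetical stand-in that only strips a plural "s"; a real pipeline would typically use a fuller list and a proper stemmer such as Porter's.

```python
import re

# Tiny illustrative stop-word list (an assumption, not the list used in the talk).
STOP_WORDS = {"a", "an", "and", "i", "of", "the", "was"}

def naive_stem(word):
    """Strip a trailing plural 's'; a crude stand-in for a real stemmer."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def preprocess(text):
    """Lowercase, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(token) for token in tokens if token not in STOP_WORDS]

tokens = preprocess("The food was great! I loved the view of the sailboats.")
print(tokens)  # ['food', 'great', 'loved', 'view', 'sailboat']
```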
35. Bag of Words Model
The food was great! I loved the view of the sailboats.

food  great  chicken  sailboat  view  service
1     1      0        1         1     0
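The count vector above can be reproduced with a short sketch, assuming the sentence has already been stop-word filtered and stemmed into the tokens shown. The function name is hypothetical.

```python
from collections import Counter

def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary term occurs in the token list."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]

vocabulary = ["food", "great", "chicken", "sailboat", "view", "service"]
# Tokens of the example review after stop-word removal and stemming (assumed).
tokens = ["food", "great", "loved", "view", "sailboat"]

vector = bag_of_words(tokens, vocabulary)
print(vector)  # [1, 1, 0, 1, 1, 0]
```

Words outside the vocabulary ("loved" here) are simply ignored, and word order is discarded.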
36. TF-IDF
• Term Frequency - Inverse Document Frequency
• Final Value = TF(t) × IDF(t)
37. TF-IDF Example
The food was great! I loved the view of the sailboats.

food  great  chicken  sailboat  view  service
0.02  0.05   0        0.5       0.25  0
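A minimal TF-IDF sketch using one common variant: TF as the term's share of the document and IDF as log(N / document frequency). The slides do not state which variant was used, and the three-review corpus below is hypothetical, so the numbers will not match the illustrative values on the slide.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical corpus of three preprocessed reviews.
corpus = [
    ["food", "great", "loved", "view", "sailboat"],
    ["food", "great", "service"],
    ["food", "chicken", "service"],
]
weights = tf_idf(corpus)
print(weights[0])
```

In this corpus "food" appears in every review and gets weight 0, while "sailboat" appears in only one and is weighted highest, which is the point of the IDF term.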
38. Topic Modeling Methods
We applied two main topic modeling methods:
• Latent Dirichlet Allocation (LDA)
- (Blei et al. 2003)
• Non-negative Matrix Factorization (NMF)
- (Arora et al. 2012)
39. Topics with NMF using TF-IDF
           Word 1  Word …  Word N
Review 1   0.8     0.9     0
Review …   0.6     0       0.8
Review N   0.9     0       0.8

(Reviews × Words) ≈ (Reviews × Topics) × (Topics × Words)
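The factorization can be sketched in pure Python with the classic multiplicative update rules for NMF. This is an illustrative sketch, not the implementation used in the talk; the topic count, iteration count, and seed are arbitrary assumptions, and a production system would use a numerical library.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def squared_error(v, w, h):
    """Squared Frobenius distance between V and W x H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factor V (reviews x words) into W (reviews x topics) and H (topics x words)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_rows)]
    return w, h

# The TF-IDF values shown on the slide, factored into 2 topics.
V = [[0.8, 0.9, 0.0], [0.6, 0.0, 0.8], [0.9, 0.0, 0.8]]
W, H = nmf(V, n_topics=2)
print(squared_error(V, W, H))
```

Each row of `W` gives a review's weight on each topic, and each row of `H` gives a topic's weight on each word.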
40. Describing Restaurants as Topics
Each review for a given restaurant has a certain topic distribution. Combining them, we identify the top topics for that restaurant.
[Diagram: review 1, review 2, ..., review N each have weights over Topic 01 to Topic 05; combined, they give one Topic 01 to Topic 05 distribution for the Restaurant]
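One simple way to combine per-review topic distributions is to average them and keep the highest-weighted topics. Averaging is an assumption here; the slides only say the distributions are combined. The weights below are made up.

```python
def restaurant_topics(review_topic_weights, top_n=2):
    """Average per-review topic weights and return (averages, top topic indices)."""
    n_reviews = len(review_topic_weights)
    n_topics = len(review_topic_weights[0])
    averaged = [
        sum(review[t] for review in review_topic_weights) / n_reviews
        for t in range(n_topics)
    ]
    ranked = sorted(range(n_topics), key=lambda t: averaged[t], reverse=True)
    return averaged, ranked[:top_n]

# Hypothetical topic weights for three reviews over five topics.
reviews = [
    [0.6, 0.1, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.1, 0.1, 0.1],
    [0.5, 0.3, 0.0, 0.1, 0.1],
]
averaged, top_topics = restaurant_topics(reviews)
print(top_topics)  # [0, 1], i.e. Topic 01 and Topic 02
```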
45. Summarizing Content
• Essential for Mobile
• Balance Utility With Trust?
- Summarize, but surface raw data
• Example:
- Initially, read every review
- Later, use average star rating
47. Active Learning for Summarization
[Diagram: loop of Provide Labels, Train Model, Generate New Dataset, Evaluate Accuracy On Full Dataset]
• Incremental Supervised Learning
• Know Precision & Recall
• Always Forward Progress
• Generate Dataset: False Positive/Negative or Difficult to Discriminate
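The "know precision and recall" and "generate dataset" steps might look like the sketch below. It assumes a binary classifier that emits a score per example; the threshold, margin, function names, and sample data are all hypothetical.

```python
def precision_recall(examples, threshold=0.5):
    """examples: list of (score, label) pairs with label 0 or 1."""
    tp = sum(1 for s, y in examples if s >= threshold and y == 1)
    fp = sum(1 for s, y in examples if s >= threshold and y == 0)
    fn = sum(1 for s, y in examples if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def select_next_batch(examples, threshold=0.5, margin=0.1):
    """Indices of false positives/negatives and of scores close to the threshold."""
    selected = []
    for index, (score, label) in enumerate(examples):
        predicted = 1 if score >= threshold else 0
        is_error = predicted != label
        is_hard = abs(score - threshold) <= margin  # difficult to discriminate
        if is_error or is_hard:
            selected.append(index)
    return selected

# Hypothetical scored examples: (model score, true label).
scored = [(0.95, 1), (0.80, 0), (0.55, 1), (0.45, 0), (0.30, 1), (0.05, 0)]
precision, recall = precision_recall(scored)
batch = select_next_batch(scored)
print(precision, recall, batch)
```

Feeding the selected examples back into labeling and training focuses effort where the current model is wrong or unsure.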
48. Devil is in the Details
Attribute Tag – Dim Lighting
“I love the relaxed feel of this place – dark, small, and cozy – like a comfortable living room.”
54. Infrastructure
[Diagram: pipeline components. Service Logs; User Interactions; Queue; Batched Data For Analysis; Real-Time Processing; Analytics; Model Training; A/B Testing]
55. Multi-Datacenter Infrastructure
[Diagram: two Secondary DataCenters, each with Service, Service, and a Secondary Queue; one Primary DataCenter with Service, Service, Central Queue, Stream Processing, Batched Storage, Analytics, Model Pipeline, and A/B Testing]
56. Building Recommendation Systems
• Importance of A/B Testing
• Generating Recommendations
• Recommendation Explanations
• Recommendation Infrastructure
57. Team Composition
Team
• Data Scientist
- Math & Applied Machine Learning
- Relevancy and Accuracy
• Data Science Engineer
- Software Development
- Infrastructure, Speed and Maintainability
Everyone works on production systems