Recommendation Architecture 
Jeremy Schiff 
RecSys 2014 
Large Scale Recommender Systems 
10/10/2014
OpenTable: 
Becoming an Experience Company 
BEFORE DURING AFTER 
RESTAURANTS DINERS 
Sharing & Remembering
Understanding & Evolving
Discovery & Convenience
Attracting & Planning
Delightful Dining
Proprietary 2
Deliver great experiences at 
every step, based on who you are 
BEFORE DURING AFTER 
RESTAURANTS DINERS 
Understanding & Evolving
Attracting & Planning
Proprietary 3
OpenTable in Numbers 
• Our network connects diners with approximately 32,000 restaurants worldwide.
• Our diners have spent more than $25 billion at our partner restaurants.
• OpenTable seats more than 15 million diners each month.
• Every month, OpenTable diners write more than 400,000 restaurant reviews.
4
Recommendations 
≠ 
Collaborative Filtering 
5
So what are recommendations? 
6
Building Recommendation Systems 
• Importance of A/B Testing
• Generating Recommendations
• Recommendation Explanations
• Recommendation Infrastructure
7
What’s the Goal 
Minimizing Engineering Time to Improve The 
Metric that Matters 
• Make it Easy to Measure 
• Make it Easy to Iterate 
• Reduce Iteration Cycle Times 
8
Importance of A/B Testing 
• If you don’t measure it, you can’t improve it
• Metrics Drive Behavior
• Continued Forward Progress
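A common way to make measurement cheap is to assign each user to a variant deterministically, so the same user always sees the same experience. The sketch below is illustrative only: the function name `assign_variant`, the experiment key, and the default 50/50 split are assumptions, not details from the talk.

```python
import hashlib

def assign_variant(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically map a user to 'control' or 'treatment' (hypothetical helper)."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Interpret the first 8 hex digits as a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_share else "control"
```

Hashing the experiment name together with the user id keeps assignments independent across experiments.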
9
Pick Your Business Metric 
Revenue, Conversions or Satisfaction
• OpenTable 
• Amazon 
Engagement 
• Netflix 
• Pandora 
• Spotify 
10
Satisfaction Metric 
Recommend 
Restaurant 
User Query 
User Review 
Verify / 
Refine 
Models 
11
Measuring & The Iteration Loop 
Weeks: A/B Testing (Measure)
12
Measuring & The Iteration Loop 
Days: Optimize Models (Predict)
Weeks: A/B Testing (Measure)
13
Measuring & The Iteration Loop 
Hours: Analyze & Introspect (Insights)
Days: Optimize Models (Predict)
Weeks: A/B Testing (Measure)
14
Fundamental Differences in Usage 
Right now vs. Planning 
Search vs. Recommendations 
Cost of Being Wrong 
15
Recommendation Stack 
Query Interpretation
Retrieval
Collaborative Filters
Item / User Metadata
Ranking – Item & Explanation
Index Building
Context for Query & User
Model Building
Explanation Content
Visualization
16
Query Interpretation & Retrieval 
• Get User Intent 
• Two Solutions 
- Spelling Correction 
- Auto Complete 
• One Box, Many Types 
- Name 
- Cuisine 
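The two query-interpretation aids named above can be sketched with the standard library. This is a minimal illustration, not the production approach: the vocabulary, the function names, and the 0.6 similarity cutoff are all assumptions.

```python
import difflib

# Hypothetical vocabulary mixing cuisines and a made-up restaurant name.
VOCABULARY = ["sushi", "steakhouse", "italian", "french", "golden gate grill"]

def autocomplete(prefix, vocabulary=VOCABULARY, limit=5):
    """Return vocabulary entries that start with the typed prefix."""
    p = prefix.lower()
    return [term for term in vocabulary if term.startswith(p)][:limit]

def correct_spelling(query, vocabulary=VOCABULARY):
    """Return the closest vocabulary entry, or the query unchanged if none is close."""
    matches = difflib.get_close_matches(query.lower(), vocabulary, n=1, cutoff=0.6)
    return matches[0] if matches else query
```

Because one search box serves names and cuisines alike, a real system would also need to tag each vocabulary entry with its type; that step is omitted here.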
17
Ranking Objectives 
Objectives: 
• Training Error 
- RMSE 
• Generalization Error 
- Precision at K 
• A/B Metric 
- Conversion 
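The two offline objectives above can be computed as follows. This is a small sketch with assumed function names; the A/B metric itself can only be measured on live traffic.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and observed ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def precision_at_k(ranked_items, relevant_items, k):
    """Fraction of the top-k ranked items that are relevant."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    return hits / k
```

RMSE scores every prediction equally, while precision at K only looks at the head of the list, which is closer to what a diner actually sees.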
18
Ranking 
Phase 1: Bootstrap through heuristics 
Phase 2: Learn to Rank 
• E [ Revenue | Query, Position, Item, User ] 
• E [ Engagement | Query, Position, Item, User ] 
• Modeling Diversity is Important 
19
Modeling Support: Example 
20
Modeling Confidence 
• Understand Intersection of 
- Support of User 
- Support of Item 
• How does support affect 
variability of prediction? 
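One simple way to see how support affects variability is the standard error of a mean rating, which shrinks as the number of observations grows. The sketch below assumes "support" means the number of ratings behind an estimate; the function name is hypothetical and the talk does not say which confidence model was used.

```python
import math
import statistics

def mean_with_standard_error(ratings):
    """Return (mean, standard error); the error is infinite with fewer than 2 ratings."""
    n = len(ratings)
    mean = statistics.fmean(ratings)
    if n < 2:
        return mean, float("inf")
    return mean, statistics.stdev(ratings) / math.sqrt(n)

# Same average rating, very different support.
few = mean_with_standard_error([5, 3])
many = mean_with_standard_error([5, 3] * 50)
```

Both samples average 4.0, but the estimate backed by 100 ratings has a much smaller standard error than the one backed by 2.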
23
Frequency, Sentiment, and Context 
• High End Restaurant for Dinner 
- High Sentiment, Low Frequency 
• Fast, Mediocre Sushi for Lunch 
- High Frequency, Moderate 
Sentiment 
24
How to use this data 
• Frequency Data: 
- General: Popularity 
- Personalized: Implicit CF 
• Sentiment Data: 
- General: Good Experience 
- Personalized: Explicit CF 
• Good Recommendation 
- Use both to drive your Business Metric 
25
Collaborative Filtering Architecture 
Hyper-Parameter Tuning (many days)
Full Trainer (many hours)
Incremental Trainer (a few seconds)
Model: (User, Item) → Predicted Rating
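The split between a slow full trainer and a fast incremental trainer can be illustrated with a toy matrix-factorization model. This is a sketch under stated assumptions, not the architecture from the talk: plain SGD, the learning rate, the regularization value, and every function name here are hypothetical.

```python
import random

def sgd_step(user_vec, item_vec, rating, lr=0.05, reg=0.02):
    """Apply one SGD update for a single observed rating, in place."""
    pred = sum(u * v for u, v in zip(user_vec, item_vec))
    err = rating - pred
    for f in range(len(user_vec)):
        u, v = user_vec[f], item_vec[f]
        user_vec[f] += lr * (err * v - reg * u)
        item_vec[f] += lr * (err * u - reg * v)

def full_train(ratings, n_factors=2, epochs=200, seed=0):
    """Slow path: fit all user and item factors from scratch."""
    rng = random.Random(seed)
    users, items = {}, {}
    for user, item, _ in ratings:
        if user not in users:
            users[user] = [rng.uniform(0.1, 0.5) for _ in range(n_factors)]
        if item not in items:
            items[item] = [rng.uniform(0.1, 0.5) for _ in range(n_factors)]
    for _ in range(epochs):
        for user, item, rating in ratings:
            sgd_step(users[user], items[item], rating)
    return users, items

def incremental_update(users, items, user, item, rating, steps=20, n_factors=2):
    """Fast path: fold one new rating into the existing model."""
    users.setdefault(user, [0.1] * n_factors)
    items.setdefault(item, [0.1] * n_factors)
    for _ in range(steps):
        sgd_step(users[user], items[item], rating)

def predict(users, items, user, item):
    """Predicted rating for a (user, item) pair."""
    return sum(u * v for u, v in zip(users[user], items[item]))
```

The incremental path touches only the two vectors involved in the new rating, which is why it can run in seconds while the full trainer and any hyper-parameter search run offline.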
26
Analyzing Review Content 
27
Reviews come in all shapes and sizes! 
This really is a hidden gem and I'm not sure I want to share but I will. :) The owner, Claude, has been here for 47 years 
and is all about quality, taste, and not overcharging for what he loves. My husband and I don't often get into the city at 
night, but when we do this is THE place. The Grand Marnier Souffle' is the best I've had in my life - and I have a few 
years on the life meter. The custard is not over the top and the texture of the entire dessert is superb. This is the only 
family style French restaurant I'm aware of in SF. It also doesn't charge you an arm and a leg for their excellent quality 
and that also goes for the wine list. Soup, salad, choice of main (try the lamb shank) and choice of dessert - for around 
$42 w/o drinks. 
“SUPERB!” 
Bay Area Reviews 
Post Jan 2013 
28
The ingredients of a spectacular 
dining experience… 
29
… and a spectacularly bad one 
30
Content Features 
Pandora 
• Music Genome Project 
Natural Language Processing 
• Topics & Tags 
31
Generating Topic Features 
• Stop Words & Stemming 
• Bag of Words Model 
• TF-IDF
• Topic Modeling 
• Describe Restaurants as Topics 
32
Stop Words & Stemming 
The food was great! I loved the view of the 
sailboats. 
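A minimal sketch of this preprocessing step on the example sentence follows. The stop-word list and the suffix-stripping rule are deliberately tiny stand-ins (assumptions) for a real stop list and a real stemmer such as Porter's.

```python
import re

# Tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"the", "was", "i", "of", "a", "and"}

def naive_stem(word):
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, tokenize, drop stop words, then stem what remains."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]

tokens = preprocess("The food was great! I loved the view of the sailboats.")
# tokens == ['food', 'great', 'lov', 'view', 'sailboat']
```

The truncated stem "lov" is typical of crude suffix stripping; what matters is that "loved" and "loves" map to the same token.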
33
Bag of Words Model 
The food was great! I loved the view of the 
sailboats. 
food  great  chicken  sailboat  view  service
1     1      0        1         1     0
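The count vector above can be reproduced in a few lines. The vocabulary order is taken from the slide; the input tokens are assumed to be already stop-word filtered and stemmed, and the function name is hypothetical.

```python
from collections import Counter

VOCABULARY = ["food", "great", "chicken", "sailboat", "view", "service"]

def bag_of_words(tokens, vocabulary=VOCABULARY):
    """Count how often each vocabulary term occurs; word order is discarded."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]

# Assumed output of preprocessing for the example sentence.
vector = bag_of_words(["food", "great", "love", "view", "sailboat"])
# vector == [1, 1, 0, 1, 1, 0]
```

Tokens outside the vocabulary, such as "love" here, are simply ignored.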
35
TF-IDF 
• Term Frequency - Inverse Document 
Frequency 
• Final Value = TF(t) × IDF(t)
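The slide does not say which TF and IDF variants were used. The sketch below assumes the textbook choice, relative term frequency for TF and log(N / document frequency) for IDF; the function name is hypothetical.

```python
import math

def tf_idf(documents):
    """documents: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(documents)
    doc_freq = {}
    for doc in documents:
        for term in set(doc):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    weights = []
    for doc in documents:
        row = {}
        for term in set(doc):
            tf = doc.count(term) / len(doc)            # share of this document
            idf = math.log(n_docs / doc_freq[term])    # rarity across documents
            row[term] = tf * idf
        weights.append(row)
    return weights

docs = [["food", "great", "view"], ["food", "service"]]
w = tf_idf(docs)
```

With this variant a term that appears in every document, like "food" here, gets weight 0, while a term unique to one review keeps a positive weight.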
36
TF-IDF Example 
The food was great! I loved the view of the 
sailboats. 
food  great  chicken  sailboat  view  service
0.02  0.05   0        0.5       0.25  0
37
Topic Modeling Methods 
We applied two main topic modeling methods:
• Latent Dirichlet Allocation (LDA)
- (Blei et al. 2003)
• Non-negative Matrix Factorization (NMF)
- (Arora et al. 2012)
38
Topics with NMF using TF-IDF 
           Word 1   Word …   Word N
Review 1   0.8      0.9      0
Review …   0.6      0        0.8
Review N   0.9      0        0.8
(Reviews × Words) ≈ (Reviews × Topics) · (Topics × Words)
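The factorization above can be sketched in pure Python. This version uses the classic Lee and Seung multiplicative updates for squared error, which is one standard way to compute an NMF and is an assumption here; the talk does not say which solver was used, and all function names are hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared reconstruction error of V against W times H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, n_topics, iterations=200, seed=0, eps=1e-9):
    """Factor V (reviews x words) into W (reviews x topics) and H (topics x words)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(n_topics)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n)]
    return w, h

# TF-IDF matrix from the slide: 3 reviews x 3 words, factored into 2 topics.
V = [[0.8, 0.9, 0.0], [0.6, 0.0, 0.8], [0.9, 0.0, 0.8]]
W, H = nmf(V, n_topics=2)
```

Each row of W says how strongly a review expresses each topic, and each row of H says which words define a topic. Because the updates only multiply by non-negative ratios, both factors stay non-negative.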
39
Describing Restaurants as Topics 
Each review for a given restaurant has a certain topic distribution.
Combining them, we identify the top topics for that restaurant.
[Figure: topic weights (Topic 01 to Topic 05) for review 1, review 2, ..., review N, combined into a single topic profile for the restaurant]
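The slide does not say how the per-review distributions are combined. A simple assumption is to average them and keep the highest-weighted topics, as in this sketch (function name and sample numbers are hypothetical).

```python
def restaurant_topics(review_topic_weights, top_n=2):
    """Average per-review topic weights and return (top topic indices, mean weights)."""
    n_reviews = len(review_topic_weights)
    n_topics = len(review_topic_weights[0])
    mean = [sum(r[t] for r in review_topic_weights) / n_reviews for t in range(n_topics)]
    ranked = sorted(range(n_topics), key=lambda t: mean[t], reverse=True)
    return ranked[:top_n], mean

# Three reviews of one restaurant, three topics each (made-up weights).
reviews = [[0.7, 0.2, 0.1], [0.5, 0.1, 0.4], [0.6, 0.0, 0.4]]
top, profile = restaurant_topics(reviews)
# top == [0, 2]
```

Averaging treats every review equally; weighting by review length or recency would be a natural variation.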
40
Examples of Topics 
41
Varying Topic By Region 
• San Francisco 
• London 
• Chicago 
• New York 
42
Compelling Recommendations 
43
Recommendation Explanations 
• Amazon 
• Ness 
• Netflix
• Ness - Social 
44
Summarizing Content 
• Essential for Mobile 
• Balance Utility With Trust? 
- Summarize, but surface raw 
data 
• Example: 
- Initially, read every review 
- Later, use average star rating 
45
Summarizing Restaurant Attributes 
46
Active Learning for Summarization 
Provide Labels
Train Model
Generate New Dataset
Evaluate Accuracy On Full Dataset
• Incremental Supervised Learning 
• Know Precision & Recall 
• Always Forward Progress 
• Generate Dataset: False Positive/Negative or Difficult to Discriminate 
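Two pieces of the loop above lend themselves to a short sketch: choosing which examples to label next, and tracking precision and recall after each round. Picking the examples whose predicted probability is closest to 0.5 is one common reading of "difficult to discriminate" and is an assumption here, as are the function names.

```python
def select_for_labeling(scored_examples, budget=2):
    """scored_examples: list of (example_id, predicted_probability).
    Return the ids the model is least sure about."""
    by_uncertainty = sorted(scored_examples, key=lambda pair: abs(pair[1] - 0.5))
    return [example_id for example_id, _ in by_uncertainty[:budget]]

def precision_recall(predicted, actual):
    """Precision and recall for boolean predictions against boolean labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Known false positives and false negatives from the previous round can be appended to the same labeling batch, so that each round of labels targets the model's current mistakes.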
47
Devil is in the Details 
Attribute Tag – Dim Lighting 
“I love the relaxed feel of this place – dark, 
small, and cozy – like a comfortable living 
room.” 
48
Dish Recommendation 
• What to try once I have arrived? 
49
Infrastructure 
Service Logs
User Interactions
51
Infrastructure 
Service Logs
User Interactions
Queue
52
Infrastructure 
Service Logs
User Interactions
Queue
Batched Data For Analysis
Real-Time Processing
53
Infrastructure 
Service Logs
User Interactions
Queue
Batched Data For Analysis
Real-Time Processing
Analytics
Model Training
A/B Testing
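The fan-out from one queue to a batch path and a real-time path can be illustrated with an in-memory stand-in. Everything in this sketch is an assumption made for illustration: the event shape, the function name, and the use of Python's `queue.Queue` in place of whatever message queue the production system used.

```python
import queue
from collections import Counter

def drain(event_queue, batch_store, realtime_counts):
    """Send every queued event to both the batch store and the real-time counters."""
    while True:
        try:
            event = event_queue.get_nowait()
        except queue.Empty:
            break
        batch_store.append(event)               # kept for analytics and model training
        realtime_counts[event["type"]] += 1     # updated immediately for live processing

# Hypothetical service logs and user interactions.
events = queue.Queue()
for e in ({"type": "search"}, {"type": "reservation"}, {"type": "search"}):
    events.put(e)

batch_store, realtime_counts = [], Counter()
drain(events, batch_store, realtime_counts)
```

Writing each event once and reading it on both paths keeps analytics, model training, and A/B measurement working from the same record of what happened.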
54
Multi-Datacenter Infrastructure 
Primary DataCenter
• Service, Service
• Central Queue
• Stream Processing
• Batched Storage
• Analytics
• Model Pipeline
• A/B Testing
Secondary DataCenter (two shown)
• Service, Service
• Secondary Queue
55
Building Recommendation Systems 
• Importance of A/B Testing
• Generating Recommendations
• Recommendation Explanations
• Recommendation Infrastructure
56
Team Composition 
Team 
• Data Scientist 
- Math & Applied Machine Learning 
- Relevancy and Accuracy 
• Data Science Engineer 
- Software Development 
- Infrastructure, Speed and Maintainability 
Everyone works on production systems 
57
Questions? 
Jeremy Schiff 
jschiff@opentable.com


Editor's Notes

  1. LambdaRank, LambdaMART