At Measured Search, we enable companies to elevate the experience of search-based applications faster and with more confidence. SearchStax, our platform, amplifies open source search engines (Solr and Elasticsearch) to accelerate development, reduce operational overhead and improve performance. Join Measured Search's Eric Melz and Ticketmaster's Praveena Subrahmanyam to learn how Ticketmaster is moving towards a data-driven search culture and creating a world-class search experience for its live entertainment fans.
Data-Driven Approach to Search Relevance
1. Data-Driven Approach to Search Relevance
Eric Melz
Measured Search
Praveena Subrahmanyam
Ticketmaster
Los Angeles Search, Data, and Analytics Meetup
June 26, 2017
2. About the Speakers
Praveena Subrahmanyam
• Senior Architect and Search Lead at Ticketmaster
• ~ 2 years at Ticketmaster
• Geek, Mom, Travel enthusiast
Eric Melz
• Head of Engineering at Measured Search
• Over 20 years in tech - LinkedIn, Google, Oracle, etc.
• Used to work at Ticketmaster
3. About Ticketmaster
The World’s Leading Live Entertainment Company
• A Live Nation Company
• Founded over 40 years ago
• Selling over 400 million tickets each year
• Supporting 240K events, 200K attractions and 100K venues across 80+ countries
• Open APIs
• Follow us @ticketmaster
4. Search at Ticketmaster
• From the homepage, search is the most-used feature
• 50-60% of sessions use search
5. Challenges
• Relevancy
  • Text Relevancy
  • Popularity
  • Geo
  • Personalization
  • Fix one thing, break another!
  • Long tail
• Performance
  • Index
  • Query
• Scale
  • Documents
  • QPS
• Multilingual Documents
  • Storing
  • Querying
6. Approaches
• Exploratory
  • Manual Testing
  • Reports
• Feedback
  • Social Media
  • Internal
  • Dev Jams
• Data Driven
8. Measured Search
Measured Search® enables companies to elevate the experience of search-based applications faster and with more confidence.
Accelerate your timeline
SearchStax: Open Source based Platform-as-a-Service
Accelerate your time to market by flattening the Solr learning curve and going straight to development. Focus on your search application and save months of headaches in setup, provisioning, production readiness and administration.
Peace of Mind
Managed Services and Support
Our always-ready Solr experts are only a call or an email away, every day, all day and night, all year round. Enjoy peace of mind with fully managed Solr-as-a-Service.
On-Demand Expertise
Highly Skilled and Experienced Open Source Search Experts
Our engineers have decades of experience and have delivered numerous engagements in the field of search, analytics and machine learning. These same search experts are available on an ad hoc basis to help ensure your project's success.
10. A / B Testing - Fundamentals
Split User population into Segments
Each Segment sees a different variant
• Control - existing version (“A”)
• Treatment - proposed version (“B”)
Variable - metric we hope improves in the treatment group
11. A / B Testing - Example
Split Users into Segments
• segmentId = userId mod 2
Each Segment sees a different variant
• Control - existing version (“A”)
  • Blue Button
• Treatment - proposed version (“B”)
  • Green Button
Variable - metric we hope improves in the treatment group
• Click rate
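The segmentation rule above can be sketched in a few lines of Python. This is a minimal illustration, not the speakers' implementation: the function names and the choice of which segment sees which button are assumptions made for the example.

```python
def assign_segment(user_id: int, num_segments: int = 2) -> int:
    """Return the segment for a user: segmentId = userId mod 2."""
    return user_id % num_segments


# Assumption for illustration: segment 0 is the control, segment 1 the treatment.
VARIANTS = {0: "control (blue button)", 1: "treatment (green button)"}


def variant_for(user_id: int) -> str:
    """Look up the variant shown to a user based on their segment."""
    return VARIANTS[assign_segment(user_id)]


for uid in (100, 101, 102, 103):
    print(uid, variant_for(uid))
```

Because the segment depends only on the user id, a returning user keeps seeing the same variant.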
12. Search - Fundamentals
Query (aka Search): Paul M
Result Set:
1. Paul McCartney
2. Paul Manafort
3. Justin Bieber
Rank (aka Position): 1, 2, 3
Result Item: each entry in the Result Set
13. Search A / B Testing - Variants
Variant parameters: Search Index + Ranking Algorithm
• Control: Index A + Ranking A
• Treatment: Index B + Ranking B
Query: Paul M → ?
14. Search A / B Testing - Variables
Click Through Rate
Paul M
1. Paul McCartney
2. Paul Manafort
3. Justin Bieber
Click!
15. Search A / B Testing - Variables
Click Through Rate (CTR)
Score = # Clicks / # Searches
Higher scores are better
Control (4 searches for "Paul M"; results: 1. Paul McCartney, 2. Paul Manafort, 3. Justin Bieber)
• Clicked: 3
• Not Clicked: 1
• CTR = 3/4
Treatment (4 searches for "Paul M"; results: 1. Justin Bieber, 2. Paul Manafort, 3. Paul McCartney)
• Clicked: 1
• Not Clicked: 3
• CTR = 1/4
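As a small worked example of Score = # Clicks / # Searches, the sketch below recomputes the two CTR values from this slide. The function name and the list-of-booleans representation are assumptions chosen for illustration.

```python
from typing import Sequence


def click_through_rate(clicked: Sequence[bool]) -> float:
    """CTR = number of searches with a click / number of searches."""
    if not clicked:
        raise ValueError("need at least one search")
    return sum(clicked) / len(clicked)


# One entry per search: True if the searcher clicked a result.
control_clicks = [True, True, False, True]       # 3 of 4 searches clicked
treatment_clicks = [False, False, False, True]   # 1 of 4 searches clicked

print("control CTR:", click_through_rate(control_clicks))
print("treatment CTR:", click_through_rate(treatment_clicks))
```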
16. Search A / B Testing - Variables
Manual (aka Human) Relevance Ranking
Foreach Query Q
• Foreach Item I
  • Manually assign Relevance(Q,I)
Query    Item            Relevance
Paul M   Justin Bieber   5
Paul M   Paul Manafort   20
Paul M   Paul McCartney  98
Paul Ma  Justin Bieber   5
Paul Ma  Paul Manafort   90
Paul Ma  Paul McCartney  70
17. Search A / B Testing - Variables
Human Ranking - Example
Score = Sum(Relevance / Rank)
Higher scores are better
Control (query: Paul M)
Rank   Item            Relevance  Relevance / Rank
1      Paul McCartney  98         98 / 1
2      Paul Manafort   20         20 / 2
3      Justin Bieber   5          5 / 3
Total                             109.7
Treatment (query: Paul M)
Rank   Item            Relevance  Relevance / Rank
1      Justin Bieber   5          5 / 1
2      Paul Manafort   20         20 / 2
3      Paul McCartney  98         98 / 3
Total                             47.7
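The Score = Sum(Relevance / Rank) calculation can be checked with a short Python sketch. The relevance values come from the "Paul M" rows of the manual rating table; the function and variable names are assumptions made for this example.

```python
from typing import Dict, Sequence

# Manually assigned relevance for the query "Paul M" (from the rating table).
RELEVANCE_PAUL_M: Dict[str, int] = {
    "Paul McCartney": 98,
    "Paul Manafort": 20,
    "Justin Bieber": 5,
}


def ranking_score(ranked_items: Sequence[str], relevance: Dict[str, int]) -> float:
    """Sum of relevance / rank over the result set, with ranks starting at 1."""
    return sum(
        relevance[item] / rank
        for rank, item in enumerate(ranked_items, start=1)
    )


control_results = ["Paul McCartney", "Paul Manafort", "Justin Bieber"]
treatment_results = ["Justin Bieber", "Paul Manafort", "Paul McCartney"]

print("control:", round(ranking_score(control_results, RELEVANCE_PAUL_M), 1))
print("treatment:", round(ranking_score(treatment_results, RELEVANCE_PAUL_M), 1))
```

Dividing by rank rewards variants that put highly rated items near the top, which is why the control ordering scores much higher here.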
18. Search A / B Testing - Variables
Human Ranking - Issue
Foreach Query Q
• Foreach Item I
• Manually assign Relevance(Q,I)
100K queries x 100K items = 10,000,000,000 ratings!
19. Search A / B Testing - Variables
Average Click Position
Score = Average(Click Pos)
Lower scores are better
Control (4 searches for "Paul M"; results: 1. Paul McCartney, 2. Paul Manafort, 3. Justin Bieber)
• Clicked positions: 1, 2, 1, 1
• Avg Click Pos = (1 + 2 + 1 + 1) / 4 = 1.25
Treatment (4 searches for "Paul M"; results: 1. Justin Bieber, 2. Paul Manafort, 3. Paul McCartney)
• Clicked positions: 3, 2, 3 (one search Not Clicked)
• Avg Click Pos = (3 + 2 + 3) / 3 = 2.67
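A minimal Python sketch of Score = Average(Click Pos), using the clicked positions from this slide. The function name is an assumption, and searches without a click are simply left out of the list, which mirrors how the slide's treatment average divides by 3 rather than 4.

```python
from typing import Sequence


def average_click_position(positions: Sequence[int]) -> float:
    """Mean 1-based position of clicked results; lower is better."""
    if not positions:
        raise ValueError("need at least one click")
    return sum(positions) / len(positions)


control_positions = [1, 2, 1, 1]   # four clicked searches
treatment_positions = [3, 2, 3]    # the fourth search had no click

print("control:", average_click_position(control_positions))
print("treatment:", round(average_click_position(treatment_positions), 2))
```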
20. Search A / B Testing - Variables
Mean Reciprocal Rank (MRR)
Score = Average(1 / Click Pos)
Higher scores are better (will be in range (0,1])
Control (4 searches for "Paul M"; results: 1. Paul McCartney, 2. Paul Manafort, 3. Justin Bieber)
• Clicked positions: 1, 2, 1, 1
• MRR = (1/1 + 1/2 + 1/1 + 1/1) / 4 = 0.88
Treatment (4 searches for "Paul M"; results: 1. Justin Bieber, 2. Paul Manafort, 3. Paul McCartney)
• Clicked positions: 3, 2, 3 (one search Not Clicked)
• MRR = (1/3 + 1/2 + 1/3) / 3 = 0.39
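Score = Average(1 / Click Pos) can be sketched the same way. As on the slide, this version averages over clicked searches only; other MRR definitions count a search without a click as 0, so treat that choice (and the function name) as an assumption of this example.

```python
from typing import Sequence


def mean_reciprocal_rank(positions: Sequence[int]) -> float:
    """Average of 1 / position over clicked searches; higher is better."""
    if not positions:
        raise ValueError("need at least one click")
    return sum(1 / pos for pos in positions) / len(positions)


control_positions = [1, 2, 1, 1]
treatment_positions = [3, 2, 3]

print("control MRR:", mean_reciprocal_rank(control_positions))
print("treatment MRR:", mean_reciprocal_rank(treatment_positions))
```

Compared with average click position, the reciprocal compresses differences deep in the list: moving a click from position 1 to 2 costs far more than moving it from 9 to 10.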
21. A / B Testing - Variables - No Results Searches
Score = # No-Result-Searches / # Searches
Lower scores are better (will be in range [0,1])
Control
Query        Outcome
Paul M       Results: 1. Paul McCartney, 2. Paul Manafort
Paul         NO RESULTS!
Justin Beeb  Results: 1. Justin Bieber
Justin Bieb  Results: 1. Justin Bieber
No Results = 1/4
Treatment
Query        Outcome
Paul M       Results: 1. Paul McCartney, 2. Paul Manafort
Paul         NO RESULTS!
Justin Beeb  NO RESULTS!
Justin Bieb  Results: 1. Justin Bieber
No Results = 2/4
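The no-result score is a simple ratio, sketched below in Python. The per-search result counts are an assumed encoding of the slide's example (one search in four returns nothing for the control, two in four for the treatment); the function name is also hypothetical.

```python
from typing import Sequence


def no_result_rate(result_counts: Sequence[int]) -> float:
    """Share of searches that returned zero results; lower is better."""
    if not result_counts:
        raise ValueError("need at least one search")
    empty = sum(1 for count in result_counts if count == 0)
    return empty / len(result_counts)


# Number of results returned for each of four searches (illustrative encoding).
control_counts = [2, 0, 1, 1]
treatment_counts = [2, 0, 0, 1]

print("control:", no_result_rate(control_counts))
print("treatment:", no_result_rate(treatment_counts))
```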
22. A / B Testing - Issues
• Need adequate sample sizes to achieve statistical significance
• Treatment should…
  • Have negligible impact to business
    • Revenue
    • Goodwill
  • Be production ready
    • Secure
    • Performant
    • Acceptable UX
    • Compatible with prod tech stack
  • Have org approval for prod release
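To illustrate the sample-size point, here is a rough Python sketch of a two-proportion z-test on click-through rates. The slides do not name a significance test; this test, the function name, and the 400-search counts are assumptions chosen for illustration, and the normal approximation is not reliable at very small sample sizes.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    clicks_a: int, n_a: int, clicks_b: int, n_b: int
) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference between two rates."""
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("rates are identical at 0 or 1; test is undefined")
    z = (p_b - p_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Same 3/4 vs 1/4 rates at two hypothetical sample sizes.
z_small, p_small = two_proportion_z_test(3, 4, 1, 4)
z_large, p_large = two_proportion_z_test(300, 400, 100, 400)

print("4 searches per variant:   p =", round(p_small, 3))
print("400 searches per variant: p =", p_large)
```

With only four searches per variant, the gap between 3/4 and 1/4 does not reach the conventional 0.05 threshold; at 400 searches per variant the same rates are overwhelmingly significant.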
23. Model Simulation - Fundamentals
• Alternative to A/B testing - Simulation
• Don’t direct traffic to different variants
• Single variant - control
• Record requests to control
• Replay recorded requests against treatment (in dev environment)
• Measure performance of treatment against control
24. Search Model Simulation - Specifics
• Record (from control)
  • Searches (queries)
  • Searchclicks (queries + item + item position)
• Replay (to treatment)
  • Searches - used to compute
    • % of No-Result searches
  • Searchclicks - used to compute
    • Average Click Position
    • MRR
• Report
  • Metrics
    • Average Click Pos
    • MRR
    • % of No-Result Searches
  • Items clicked on in control, but not found in treatment
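The record / replay / report steps above can be sketched as a small Python function. Everything named here is hypothetical: `SearchClick`, `simulate`, and the dictionary-backed `treatment_search` stand in for real event data and a real treatment index, and excluding missing items from the position metrics is an assumption rather than something the slides specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass
class SearchClick:
    query: str
    item: str
    position: int  # position of the clicked item in the control results


def simulate(
    searches: Sequence[str],
    clicks: Sequence[SearchClick],
    treatment_search: Callable[[str], List[str]],
) -> Dict[str, object]:
    """Replay recorded control traffic against the treatment and report metrics."""
    no_results = sum(1 for query in searches if not treatment_search(query))
    positions: List[int] = []
    missing: List[Tuple[str, str]] = []
    for click in clicks:
        results = treatment_search(click.query)
        if click.item in results:
            # Where the item clicked in control now sits in the treatment.
            positions.append(results.index(click.item) + 1)
        else:
            missing.append((click.query, click.item))
    return {
        "no_result_rate": no_results / len(searches) if searches else None,
        "avg_click_pos": sum(positions) / len(positions) if positions else None,
        "mrr": sum(1 / p for p in positions) / len(positions) if positions else None,
        "missing": missing,
    }


# Toy treatment index, for illustration only.
TREATMENT_INDEX = {
    "Paul M": ["Justin Bieber", "Paul Manafort", "Paul McCartney"],
    "Justin Bieb": ["Justin Bieber"],
}


def treatment_search(query: str) -> List[str]:
    return TREATMENT_INDEX.get(query, [])


recorded_searches = ["Paul M", "Paul M", "Justin Bieb", "Justin Beeb"]
recorded_clicks = [
    SearchClick("Paul M", "Paul McCartney", 1),
    SearchClick("Paul M", "Paul Manafort", 2),
    SearchClick("Justin Beeb", "Justin Bieber", 1),
]

report = simulate(recorded_searches, recorded_clicks, treatment_search)
print(report)
```

The same metrics computed from the recorded control positions give the baseline to compare this report against.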
25. Model Simulation - Flow
[Flow diagram. Components: Searcher, Analyst, Control Index (A), Treatment Index (B), SearchStax, Event Data, Model Simulator. Arrow labels: Searches, Track Events, Start Simulation, Fetch Data, Run Queries, Upload Results, Fetch Results.]