This document provides an overview of learning to rank (LTR) for search results. It describes how a search engine understands queries and user intent, retrieves candidate documents from an index, and ranks them. Ranking is framed as a learning problem in which machine learning models are trained on labeled data. The document compares three approaches to learning to rank (pointwise, pairwise, and listwise) and argues that listwise is preferred because it directly optimizes ranked lists while avoiding issues of the other two methods. It also addresses the challenges of collecting unbiased training data from click logs.
6. bird’s-eye view of how a search engine works
[Diagram: a user with an information need issues a query; the system retrieves documents and ranks them using an IR model; the user selects from the results]
7. Pre Retrieval / Retrieval / Post Retrieval
Pre retrieval
– Process the input query: rewrite, check spelling, etc.
– Hit the appropriate (potentially several) search nodes with the processed query
Retrieval
– Given a query, retrieve all documents matching it, along with a score
Post retrieval
– Merge and sort results from the different search nodes
– Add the information the front end needs to render the search results, as sketched below
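To make the three phases concrete, here is a minimal, hypothetical Python sketch; the Hit type, the toy occurrence-count scoring, and the in-memory "nodes" are illustrative stand-ins, not how a production engine is built.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    doc_id: int
    score: float

def pre_retrieval(raw_query: str) -> str:
    # Process the input query: normalize it; real systems also rewrite,
    # spell-check, and segment the query here.
    return raw_query.strip().lower()

def retrieve(node: dict, query: str) -> list:
    # Each search node scores the documents it holds; the toy score is
    # simply the count of query-term occurrences.
    terms = query.split()
    return [Hit(doc_id, float(sum(text.count(t) for t in terms)))
            for doc_id, text in node.items()
            if all(t in text for t in terms)]

def post_retrieval(per_node_results: list) -> list:
    # Merge and sort results from the different search nodes by score.
    merged = [h for results in per_node_results for h in results]
    return sorted(merged, key=lambda h: h.score, reverse=True)

# Usage: two "search nodes", each holding a shard of the corpus.
nodes = [
    {0: "abraham lincoln biography", 1: "lincoln motors"},
    {2: "abraham lincoln speeches and letters"},
]
query = pre_retrieval("  Abraham Lincoln ")
for hit in post_retrieval([retrieve(n, query) for n in nodes]):
    print(hit)
```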
9. Claim #1: Search is about understanding the query/user intent
10. Understanding intent
[Diagram: the query is segmented and tagged with recognized entities, e.g.
– TITLE-237: software engineer, software developer, programmer, …
– CO-1441: Google Inc. (Industry: Internet)
– GEO-7583: Country: US, Lat: 42.3482 N, Long: 75.1890 W
Recognized tag types: NAME, TITLE, COMPANY, SCHOOL, GEO, SKILL]
13. The Search Index
Inverted Index: mapping from (search) terms to the list of documents they appear in
Forward Index: mapping from documents to metadata about them
14. Posting List
D0 = “it is what it is”
D1 = “what is it”
D2 = “it is a banana”

Term   | Posting list (DocId:Frequency)
a      | 2:1
banana | 2:1
is     | 0:2, 1:1, 2:1
it     | 0:2, 1:1, 2:1
what   | 0:1, 1:1
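A small sketch of how the inverted and forward indexes from the previous slide could be built over these three documents; the whitespace tokenization and the "length" metadata are simplifications for illustration.

```python
from collections import Counter, defaultdict

docs = {
    0: "it is what it is",
    1: "what is it",
    2: "it is a banana",
}

inverted = defaultdict(list)  # term -> [(doc_id, term frequency), ...]
forward = {}                  # doc_id -> metadata about the document

for doc_id, text in docs.items():
    tokens = text.split()
    forward[doc_id] = {"length": len(tokens)}
    for term, freq in Counter(tokens).items():
        inverted[term].append((doc_id, freq))

for term in sorted(inverted):
    print(term, inverted[term])
# a [(2, 1)]
# banana [(2, 1)]
# is [(0, 2), (1, 1), (2, 1)]
# it [(0, 2), (1, 1), (2, 1)]
# what [(0, 1), (1, 1)]
```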
15. Candidate selection for “abraham lincoln”
Posting lists
– “abraham” => {5, 7, 8, 23, 47, 101}
– “lincoln” => {7, 23, 101, 151}
Query = “abraham AND lincoln”
– Retrieved set => {7, 23, 101} (the intersection of the two posting lists; a sketch follows)
Some systems-level issues
– How does one represent posting lists efficiently?
– How does one traverse a very long posting list (e.g. for words like “the”, “an”)?
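A minimal sketch of the AND-query intersection above: since posting lists are kept sorted by DocId, a linear merge computes the retrieved set without materializing hash sets. (Real systems additionally use skip pointers and compressed posting lists for very long lists.)

```python
def intersect(p1: list, p2: list) -> list:
    # Linear merge of two sorted posting lists: advance whichever
    # pointer sits on the smaller DocId; emit DocIds present in both.
    i = j = 0
    out = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            out.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return out

# Posting lists from the slide:
abraham = [5, 7, 8, 23, 47, 101]
lincoln = [7, 23, 101, 151]
print(intersect(abraham, lincoln))  # [7, 23, 101]
```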
17. What is search ranking?
Ranking
– Find an ordered list of documents according to the relevance between the documents and the query
Traditional search
– f(query, document) => score
Social networks context
– f(query, document, user) => score
– Find an ordered list of documents according to the relevance between documents, query, and user
18. Why LTR?
Manual models become hard to tune with a very large number of features and non-convex interactions
Leverages large volumes of click-through data in an automated way
Unique challenges are involved in crowdsourcing personalized ranking
Key issues
– How do we collect training data?
– How do we avoid biases?
– How do we train the model?
19. TRAINING
[Diagram: documents for training → features; human evaluation → labels; features + labels → machine learning model]
22. Mining click stream
Approach: Clicked = Relevant, Not-Clicked = Not Relevant
[Diagram: the user’s eye-scan direction is top-down, so low-ranked results may never be seen; are unclicked results unfairly penalized?]
23. Position Bias
“Accurately Interpreting Clickthrough Data as Implicit Feedback” – Joachims et al., ACM SIGIR, 2005
– Experiment #1: present users with normal Google search results
55.56% of users clicked the first result
5.56% clicked the second result
– Experiment #2: same result page, but the 1st and 2nd results were flipped
57.14% of users clicked the first result
7.14% clicked the second result
Even after flipping, the first position still captures most clicks, showing that click probability depends heavily on position, not just relevance.
26. FAIR PAIRS
Fair Pairs: randomize the order within adjacent result pairs; Clicked = Relevant, Skipped = Not Relevant (a sketch follows)
– Great at dealing with position bias
– Does not invert models
[Radlinski and Joachims, AAAI’06]
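A simplified sketch of the Fair Pairs idea, assuming the ranking is split into fixed adjacent pairs; the published algorithm also randomizes where the pair boundaries fall, which this toy version omits.

```python
import random

def fair_pairs(ranking: list) -> list:
    # Flip each adjacent pair (1,2), (3,4), ... independently with
    # probability 1/2, so each result of a pair is shown on top
    # equally often.
    shown = list(ranking)
    for i in range(0, len(shown) - 1, 2):
        if random.random() < 0.5:
            shown[i], shown[i + 1] = shown[i + 1], shown[i]
    return shown

def pair_preferences(shown: list, clicked: set) -> list:
    # Within a pair, Clicked = Relevant beats Skipped = Not Relevant;
    # the randomized order makes the comparison fair w.r.t. position.
    prefs = []
    for i in range(0, len(shown) - 1, 2):
        a, b = shown[i], shown[i + 1]
        if a in clicked and b not in clicked:
            prefs.append((a, b))
        elif b in clicked and a not in clicked:
            prefs.append((b, a))
    return prefs

shown = fair_pairs(["d1", "d2", "d3", "d4"])
print(shown, pair_preferences(shown, clicked={"d2"}))
```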
27. Issue #2 – Sampling Bias
Sampling bias
– Users click or skip only what is shown to them
– What about low-scoring results from the existing model?
– Add low-scoring results as “easy negatives” so the model learns about bad results that were never presented to the user
[Diagram: a SERP paginated into pages 1, 2, 3, … n; results on pages the user never sees can all be assigned label 0]
29. Avoiding Sampling Bias – Easy negatives
Invasive way
– For a small sample of users, add bad results to the SERP to verify that they are indeed bad
– Not really recommended, since it hurts the user experience
Non-invasive way
– Assume we have a decent model
– Take tail results from it and add them to the training data as “easy negatives” (sketched below)
– A similar approach can be used for “easy positives”, depending on the application
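A minimal sketch of the non-invasive approach, assuming hypothetical dict-shaped training examples: sample a few documents from the tail of the current model’s ranking and label them 0.

```python
import random

def add_easy_negatives(training_data: list, ranked_tail: list, k: int = 5) -> list:
    # Sample a few never-shown, low-scoring documents from the tail of
    # the current model's ranking and label them 0 ("easy negatives").
    for doc in random.sample(ranked_tail, min(k, len(ranked_tail))):
        training_data.append({"doc": doc, "label": 0})
    return training_data

# Usage: clicked/skipped examples from the SERP plus sampled tail docs.
data = [{"doc": "d1", "label": 1}, {"doc": "d2", "label": 0}]
tail = ["d950", "d951", "d952", "d999"]
print(add_easy_negatives(data, tail, k=2))
```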
30. How to collect training data?
Implicit relevance judgments from click logs, including both clicked and unclicked results from the SERP (avoids position bias)
Add easy negatives (avoids sampling bias)
35. Learning to Rank
Pointwise: Reduce ranking to binary classification
[Diagram: documents grouped per query with absolute labels, e.g. Q1: +, +, +, −; Q2: +, −, −, −; Q3: +, +, −, −]
Limitations
– Assumes relevance is absolute
– Relevant documents associated with different queries are put into the same class
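A minimal pointwise sketch using scikit-learn; the two-dimensional feature vectors and the query groupings are made-up illustrations. Note how the +/− labels from different queries end up pooled into one binary problem, which is exactly the limitation above.

```python
from sklearn.linear_model import LogisticRegression

# One row per (query, document) pair; labels are absolute (+ = 1, - = 0)
# and pooled across queries Q1..Q3.
X = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.1, 0.9],   # Q1
     [0.7, 0.2], [0.3, 0.8],                           # Q2
     [0.6, 0.4], [0.2, 0.6]]                           # Q3
y = [1, 1, 0, 0,
     1, 0,
     1, 0]

model = LogisticRegression().fit(X, y)

# Rank a query's candidates by P(relevant): higher probability, higher rank.
scores = model.predict_proba([[0.5, 0.5], [0.9, 0.2]])[:, 1]
print(scores.argsort()[::-1])  # candidate indices, best first
```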
36. Learning to Rank
Pairwise: Reduce ranking to classification of document pairs w.r.t. the
same query
– {(Q1, A>B), (Q2, C>D), (Q3, E>F)}
38. Learning to Rank
Pairwise
– No longer assumes absolute relevance
– Limitation: does not differentiate between inversions at top vs. bottom positions (a sketch of the pairwise reduction follows)
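A minimal pairwise sketch, again with made-up features: each preference such as (Q1, A>B) becomes a training example on the feature difference, the classic linear reduction used by RankNet-style models.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# (features of the preferred doc, features of the other doc) per query.
pairs = [
    ([0.9, 0.1], [0.2, 0.7]),  # Q1: A > B
    ([0.7, 0.2], [0.3, 0.8]),  # Q2: C > D
    ([0.6, 0.4], [0.2, 0.6]),  # Q3: E > F
]
X, y = [], []
for a, b in pairs:
    X.append(np.subtract(a, b)); y.append(1)  # a beats b
    X.append(np.subtract(b, a)); y.append(0)  # mirrored example

model = LogisticRegression().fit(X, y)

# At serving time, w . x is a per-document score; sorting by it ranks
# all of a query's documents at once.
docs = {"A": np.array([0.9, 0.1]), "B": np.array([0.2, 0.7])}
w = model.coef_[0]
print(sorted(docs, key=lambda d: -(w @ docs[d])))  # ['A', 'B']
```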
39. Listwise approach - DCG
Objective: come up with a function that converts an entire set of ranked search results, each with a relevance label, into a single score
Characteristics of such a function
– Higher relevance in the ranked set => higher score
– Higher relevance at higher positions => higher score
With p documents in the search results, each document i having relevance rel_i:
DCG_p = \sum_{i=1}^{p} \frac{2^{rel_i} - 1}{\log_2(i + 1)}
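A direct transcription of the DCG formula, with a toy example showing both characteristics listed above.

```python
import math

def dcg(relevances: list) -> float:
    # DCG_p = sum over i of (2^{rel_i} - 1) / log2(i + 1)
    return sum((2 ** rel - 1) / math.log2(i + 1)
               for i, rel in enumerate(relevances, start=1))

print(dcg([3, 2, 0]))  # ~8.89: high relevance at the top
print(dcg([0, 2, 3]))  # ~5.39: same labels, worse positions, lower score
```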
41. NDCG based optimization
NDCG@k = Normalized(DCG@k)
– Ensures the value is between 0.0 and 1.0
Since NDCG directly represents the “value” of a particular ranking given the relevance labels, one can directly formulate ranking as maximizing NDCG@k (say, k = 5)
Directly pluggable into a variety of algorithms, including coordinate ascent
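A minimal NDCG@k sketch: divide DCG@k by the DCG@k of the ideal ordering of the same labels, which pins the value to [0.0, 1.0].

```python
import math

def dcg_at_k(rels: list, k: int) -> float:
    return sum((2 ** rel - 1) / math.log2(i + 1)
               for i, rel in enumerate(rels[:k], start=1))

def ndcg_at_k(rels: list, k: int) -> float:
    # Normalize by the DCG of the best possible ordering of the same
    # labels, so 1.0 means a perfect ranking.
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

print(ndcg_at_k([3, 2, 0, 1], k=3))  # ~0.95: one inversion near the top
print(ndcg_at_k([3, 2, 1, 0], k=3))  # 1.0: already the ideal ordering
```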
42. Learning to Rank
Pointwise
✓ Simple to understand and debug
✓ Straightforward to use
✕ Query independent
✕ Assumes relevance is absolute
Pairwise
✓ Assumes relevance is relative
✓ Depends on the query
✕ Loss function is agnostic to position
Listwise
✓ Directly operates on ranked lists
✓ Loss function is aware of position
✕ More complicated: non-convex loss functions, higher training time
43. Search Ranking
[Diagram: click logs → training data → model → offline evaluation → online A/B test/debug, with score = f(query, user, document)]
44. tl;dr revisited
Ranking interacts heavily with retrieval and query understanding
– Query understanding affects intent detection, fixing user errors, etc.
– Retrieval affects candidate selection, speed, etc.
Ground truth > features > model*
– Truth data is affected by biases
Listwise > pairwise > pointwise
– Listwise, while more complicated, avoids some model-level issues present in the pairwise and pointwise methods
* Airbnb engineering blog: http://nerds.airbnb.com/architecting-machine-learning-system-risk/
45. Useful references
“From RankNet to LambdaRank to LambdaMART: An Overview” – Christopher Burges
“Learning to Rank for Information Retrieval” – Tie-Yan Liu
RankLib – implementations of several LTR approaches
46. LinkedIn search is powered by …
We are hiring!
careers.linkedin.com