Online Learning to Rank
by Edward W Huang (ewhuang3) and
Yerzhan Suleimenov (suleime1)
Prepared as an assignment for CS410: Text Information Systems in Spring 2016
Introduction
What is learning to rank?
• Many information retrieval problems are ranking problems
• Also known as machine-learned ranking
– Uses machine learning techniques to create ranking models
• Training data: queries and documents matched with relevance
judgements
– Model sorts objects by relevance, preference, or importance
– Finds optimal combination of features
Applications of learning to rank
• Ranking problems in information retrieval
– Document Retrieval
– Sentiment analysis
– Product rating
– Anti-spam measures
– Search engines
• Many more applications beyond information retrieval
– Machine translation
– Computational biology
Online vs. offline learning to rank
• Offline: training set is produced by human assessors
– Time consuming and expensive to produce
– Not always in line with actual user preferences
• Online: training data comes from users interacting with the system
– Users leave traces of interaction data: query
reformulations, mouse movements, clicks, etc.
– Clicks especially valuable when interpreted as
preferences
Big issue with online learning to rank
• Exploration-exploitation dilemma
– Have to obtain feedback to improve the system (exploration), while also using what has
already been learned to optimize result quality (exploitation)
– Solutions are discussed later
Creating Ranking Models
Ranking model training framework
• Discriminative training attributes
– Input space
– Output space
– Hypothesis space
– Loss function
• Ranking model is trained to predict the ground truth labels of the training set
as accurately as possible, as measured by the loss function
• Test phase: new query arrives, trained ranking model sorts
documents according to relevance to query
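The test phase described above can be sketched as follows, assuming a linear scoring function. The function name, feature vectors, and weights are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def rank_documents(weights: List[float],
                   doc_features: Dict[str, List[float]]) -> List[str]:
    """Sort document ids by descending linear score (dot product w . x)."""
    def score(doc_id: str) -> float:
        return sum(w * x for w, x in zip(weights, doc_features[doc_id]))

    return sorted(doc_features, key=score, reverse=True)


# Hypothetical learned weights and per-document feature vectors for one query.
weights = [0.7, 0.3]
candidates = {"d1": [0.2, 0.9], "d2": [0.8, 0.1], "d3": [0.5, 0.5]}
ranking = rank_documents(weights, candidates)
print(ranking)
```

Training amounts to choosing the weights so that the induced orderings incur a low loss on the labeled queries.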
Algorithms for learning to rank problems
• Categorized into three groups by their framework (input
representation and loss function)
– Pointwise
– Pairwise
– Listwise
T.-Y. Liu. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval,
3(3): 225–331, 2009.
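As a rough illustration of how the three groups differ in input representation, the sketch below builds each kind of training instance from the same graded judgments for one query. The helper names and the toy judgments are assumptions made for this example.

```python
from itertools import combinations


def pointwise_examples(judged):
    """Each document with its label is a separate training instance."""
    return [(doc, label) for doc, label in judged]


def pairwise_examples(judged):
    """Each instance is an ordered pair (preferred_doc, other_doc)."""
    pairs = []
    for (doc_a, rel_a), (doc_b, rel_b) in combinations(judged, 2):
        if rel_a > rel_b:
            pairs.append((doc_a, doc_b))
        elif rel_b > rel_a:
            pairs.append((doc_b, doc_a))
        # Documents with equal labels express no preference and are skipped.
    return pairs


def listwise_example(query, judged):
    """The whole judged list for a query is one training instance."""
    return (query, list(judged))


# Hypothetical graded relevance judgments for a single query.
judged = [("d1", 2), ("d2", 0), ("d3", 1)]
print(pointwise_examples(judged))
print(pairwise_examples(judged))
print(listwise_example("q1", judged))
```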
Limitations of pointwise approach
• Does not consider interdependency among documents
• Does not make use of the fact that some documents are
associated with the same query
• Most IR evaluation measures are query-level and position-based, which a
per-document loss does not capture
Pairwise and listwise
• Both have online variants that can balance exploration and exploitation,
addressing the previously mentioned dilemma
• Pairwise approach
– Input: pairs of documents with labels identifying which one is
preferred
– Learns classifier to predict these labels
• Listwise approach
– Input: entire document list associated with a certain query
– Directly optimizes evaluation measures (e.g., Normalized Discounted
Cumulative Gain)
Hofmann, Katja, Shimon Whiteson, and Maarten De Rijke. "Balancing Exploration and Exploitation in Listwise
and Pairwise Online Learning to Rank for Information Retrieval." Information Retrieval 16.1
(2012): 63-90.
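For reference, Normalized Discounted Cumulative Gain can be computed as in the sketch below. It assumes the common exponential gain 2^rel - 1 and a log2 position discount; other gain and discount choices exist.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of graded labels listed in ranked order."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (label-sorted) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


print(ndcg([3, 2, 1]))  # labels already in ideal order
print(ndcg([0, 1]))     # relevant document ranked second
```

Because the measure depends on the positions of all documents for a query at once, it is a natural target for the listwise approach.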
Absolute and relative feedback approaches
• Use feedback to learn personalized rankings
• Absolute feedback: contextual bandits learning
• Relative feedback: gradient methods and inferred preferences
between complete result rankings
• Relative feedback usually performs better
– More robust to noisy feedback
– Copes better with larger document spaces
Chen, Yiwei, and Katja Hofmann. "Online Learning to Rank: Absolute vs. Relative" Proceedings of the
24th International Conference on World Wide Web - WWW '15 Companion (2015).
State of the Art Learning
Improving learning performance
• Search engine clicks are useful, but might be biased
– Bias might come from attractive titles, snippets, or captions
• Method to detect and compensate for caption bias
– Enables reweighting of clicks based on the likelihood that they are caption biased
– Clicks on results with attractive captions count as weaker evidence of relevance
K. Hofmann, F. Behr, and F. Radlinski. On caption bias in interleaving experiments. In Proc. of CIKM,
2012.
Handling caption bias
• Allow weighting of clicks based on likelihood that
each click is caption biased
• Model click probability as function of position,
relevance, and caption bias
– Visual characteristics of individual documents
– Pairwise feature to focus on relationships with
neighboring documents
• Learn model weights from past behavior of users
• Remove caption bias to obtain an evaluation that
better reflects relevance
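One way to picture the bullets above is a logistic click model whose caption term can be switched off to down-weight clicks. The logistic form, the feature names, the weight values, and the reweighting rule are all assumptions for illustration and are not taken from the cited paper.

```python
import math


def click_probability(position, relevance, caption_attractiveness, weights):
    """Hypothetical logistic click model over position, relevance, caption."""
    z = (weights["bias"]
         + weights["position"] * position
         + weights["relevance"] * relevance
         + weights["caption"] * caption_attractiveness)
    return 1.0 / (1.0 + math.exp(-z))


def click_weight(position, relevance, caption_attractiveness, weights):
    """Down-weight a click by how much the caption inflated its probability."""
    with_caption = click_probability(position, relevance,
                                     caption_attractiveness, weights)
    without_caption = click_probability(position, relevance, 0.0, weights)
    return min(1.0, without_caption / with_caption)


# Assumed weights; in practice they would be learned from past user behavior.
weights = {"bias": -1.0, "position": -0.3, "relevance": 1.5, "caption": 0.8}
print(click_weight(position=1, relevance=0.5,
                   caption_attractiveness=1.0, weights=weights))
```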
Improving learning speed
• Search engine clicks can be interpreted using interleaved comparison
methods (two main methods)
– Reliably infer preferences between pairs of rankers
• Dueling bandit gradient descent learns from these comparisons
– Requires pairwise comparisons involving users between all exploratory
rankers
• Multileave gradient descent learns from comparisons of multiple rankers at
once
– Uses a single user interaction
– Fast
Schuth, Anne, Harrie Oosterhuis, Shimon Whiteson, and Maarten De Rijke. "Multileave Gradient Descent for Fast Online
Learning to Rank." Proceedings of the Ninth ACM International Conference on Web Search and Data Mining - WSDM '16
(2016).
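A single update of dueling bandit gradient descent can be sketched as below. The `candidate_wins` callback stands in for an interleaved comparison on live clicks; here it is simulated with a hidden target vector, which is purely an assumption for the demo, as are the step sizes.

```python
import random


def unit_vector(dim, rng):
    """Sample a direction uniformly from the unit sphere."""
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = sum(v * v for v in vec) ** 0.5
    return [v / norm for v in vec]


def dbgd_step(weights, candidate_wins, rng, delta=1.0, alpha=0.1):
    """Probe one random direction; move toward it only if the candidate wins."""
    direction = unit_vector(len(weights), rng)
    candidate = [w + delta * u for w, u in zip(weights, direction)]
    if candidate_wins(weights, candidate):
        return [w + alpha * u for w, u in zip(weights, direction)]
    return list(weights)


# Simulated comparison: the ranker closer to a hidden target vector wins.
target = [1.0, -2.0, 0.5]


def distance(w):
    return sum((a - b) ** 2 for a, b in zip(w, target)) ** 0.5


def candidate_wins(current, candidate):
    return distance(candidate) < distance(current)


rng = random.Random(0)
start = [0.0, 0.0, 0.0]
weights = list(start)
for _ in range(200):
    weights = dbgd_step(weights, candidate_wins, rng)
print(distance(start), distance(weights))
```

Each step spends one comparison on one exploratory ranker, which is the cost that multileaving reduces by comparing several candidates in a single interaction.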
Evaluating Rankers
How to evaluate rankers?
• After training a ranker, we need to find out how effective it is
• Offline evaluation methods
– Dependent on explicit expert judgements
– Costly and hard to scale in practice
• Online evaluation methods
– Leverage online data that can reflect ranker quality
– Click-based ranker evaluation (discussed next)
• State of the art software: Lerot
– Evaluates different algorithms
– Can simulate user clicking behaviour with user models
Schuth, Anne, Katja Hofmann, Shimon Whiteson, and Maarten De Rijke. "Lerot." Proceedings of the 2013 Workshop on Living Labs for
Information Retrieval Evaluation - LivingLab '13 (2013).
Click-based ranker evaluation
• Online evaluation strategy based on clickthrough data
• Independent of expert judgments, unlike conventional evaluation
methods
– Measure reflects interest of an actual user rather than interest of an expert
providing relevance judgement
Challenges of using clickthrough data
• Handling presentation bias
– Design user interface with three features
• Blind test: interface hides the random variables underlying the hypothesis test
• Click to preference: a user's click should reflect their actual judgment
• Low usability impact: interface should remain interactive and user-friendly
• Identifying the superior of two rankers
– Unified user interface that sends user query to both rankers
– Mix two ranking results (discussed next)
– Show combined ranking to the user and record interesting/relevant clicks
T. Joachims, Evaluating Retrieval Performance Using Clickthrough Data, in J. Franke and G.
Nakhaeizadeh and I. Renz, "Text Mining", Physica/Springer Verlag, pp 79-96, 2003.
Mixing two ranking results
• Also known as interleaving
• Key is to mix so that the top n results of the combined list draw about
equally from both rankers
• Algorithms vary in mixing strategy
– Balanced Interleaving
– Team-Draft Interleaving
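A minimal sketch of team-draft interleaving follows: the ranker with fewer picks so far adds its highest-ranked unused document, and a coin flip breaks ties. The function name and the example rankings are assumptions for illustration.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, rng):
    """Return the interleaved list and a map from document to contributing team."""
    interleaved, team_of = [], {}
    picks = {"A": 0, "B": 0}
    sources = {"A": ranking_a, "B": ranking_b}
    total = len(set(ranking_a) | set(ranking_b))
    while len(interleaved) < total:
        # The team with fewer picks goes next; a coin flip breaks ties.
        if picks["A"] != picks["B"]:
            team = "A" if picks["A"] < picks["B"] else "B"
        else:
            team = rng.choice(["A", "B"])
        remaining = [d for d in sources[team] if d not in team_of]
        if not remaining:
            # This team has nothing left to offer, so the other team picks.
            team = "B" if team == "A" else "A"
            remaining = [d for d in sources[team] if d not in team_of]
        doc = remaining[0]
        interleaved.append(doc)
        team_of[doc] = team
        picks[team] += 1
    return interleaved, team_of


ranking_a = ["d1", "d2", "d3", "d4"]
ranking_b = ["d3", "d1", "d5", "d6"]
interleaved, team_of = team_draft_interleave(ranking_a, ranking_b,
                                             random.Random(42))
print(interleaved)
print(team_of)
```

Recording which team contributed each document is what later allows clicks to be credited to a ranker.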
Leveraging click responses from mixed rankings
• Each click is interpreted as the user's preference for the ranker that provided the
clicked link
• How clicks are scored and aggregated is therefore critical
– The scoring rule is known as the test statistic
– Essential to reliable evaluation of rankers
• One basic approach is to assign equal weights to all clicks
– Suboptimal since not all clicks are equally significant
– Caption bias!
• More advanced test statistics, discussed next
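The basic equal-weight approach can be sketched as follows, assuming each document in the mixed ranking is tagged with the ranker ("A" or "B") that contributed it. The tagging and the click log below are hypothetical.

```python
def equal_weight_outcome(clicked_docs, team_of):
    """Give every click one unit of credit and report the preferred ranker."""
    credit = {"A": 0, "B": 0}
    for doc in clicked_docs:
        team = team_of.get(doc)
        if team is not None:
            credit[team] += 1
    if credit["A"] == credit["B"]:
        return "tie", credit
    return ("A" if credit["A"] > credit["B"] else "B"), credit


# Hypothetical team assignment and clicks for one query impression.
team_of = {"d1": "A", "d3": "B", "d2": "A", "d5": "B"}
winner, credit = equal_weight_outcome(["d1", "d2", "d5"], team_of)
print(winner, credit)
```

A learned test statistic would replace the constant credit of 1 with a per-click weight.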
Test statistics for evaluation
• Max mean difference: learn weights to maximize mean score difference between best and worst
rankers
• Inverse z-test: optimize statistical power of z-test by maximizing the z-score (equivalently,
minimizing the p-value)
– Removes assumption of equal variance of weights
• Learn to invert the Wilcoxon Signed-Rank Test
– Produces scoring function to optimize Wilcoxon test
• Max mean difference performs the worst
• Inverse z-test performs the best
Yisong Yue, Yue Gao, O. Chapelle, Ya Zhang, T. Joachims, Learning more powerful test statistics for click-
based retrieval evaluation, Proceedings of the Conference on Research and Development in Information
Retrieval (SIGIR), 2010.
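To make the z-test concrete, the sketch below computes a z-score from per-query score differences (positive when the better ranker received more click credit) and the matching one-sided p-value under a normal approximation. The sample scores are invented for illustration.

```python
import math
from statistics import mean, stdev


def z_score(per_query_scores):
    """Mean score divided by its standard error (needs n >= 2 and nonzero spread)."""
    n = len(per_query_scores)
    return mean(per_query_scores) / (stdev(per_query_scores) / math.sqrt(n))


def one_sided_p_value(z):
    """P(Z >= z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical per-query click-credit differences between two rankers.
scores = [1.0, 1.0, -1.0, 1.0, 0.0, 1.0, 1.0, -1.0, 1.0, 1.0]
z = z_score(scores)
print(z, one_sided_p_value(z))
```

A larger z-score corresponds to a smaller p-value, so weighting clicks to raise the z-score means fewer queries are needed to reach a given significance level.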
How good are interleaving methods?
• Interleaving methods are compared against baseline:
conventional evaluation methods based on absolute metrics
• Conventional evaluation methods based on absolute metrics
– Absolute usage statistics are expected to monotonically
change with respect to ranker quality
• Interleaving methods
– More user clicks are expected for better ranker
Relative performance of interleaving methods
• Experiment results on two rankers whose relative qualities are known by
construction
• Conventional evaluation methods based on absolute metrics
– Did not reliably identify high-quality rankers
– Absolute usage statistics did not monotonically change with respect to ranker quality
• Balanced Interleaving and Team-Draft Interleaving algorithms
– Reliably identified high-quality rankers
– Number of preferences for better ranker is significantly larger
F. Radlinski, M. Kurup, T. Joachims, How Does Clickthrough Data Reflect Retrieval Quality?, Proceedings of
the ACM Conference on Information and Knowledge Management (CIKM), 2008.
How reliable are interleaving methods, and why choose them?
• Results of interleaving agree with conventional evaluation
methods
• Yields statistically reliable preferences where absolute
metrics often do not
• Economical: statistical evaluation power of 10 interleaved clicks is
approximately equal to that of 1 manually judged query
• Not sensitive to different click aggregation schemes
• Can complement or even replace standard evaluation methods
based on manual judgments or absolute metrics
O. Chapelle, T. Joachims, F. Radlinski, Yisong Yue, Large-Scale Validation and Analysis of Interleaved Search Evaluation, ACM
Transactions on Information Systems (TOIS), 30(1):6.1-6.41, 2012.
Future directions
• Extend current linear online learning to rank approaches with algorithms
that can effectively learn more complex models
• Design and experiment with more complex models of click behavior to
better understand various click biases
• Learn from additional signals, such as click dwell time and use of the back button, to
filter raw clicks
• Understand the range of domains in which interleaving methods are highly effective
• Improve gradient-descent-based rankers by covering all search directions to
speed up learning
References
1. Chen, Yiwei, and Katja Hofmann. "Online Learning to Rank: Absolute vs. Relative" Proceedings of the 24th
International Conference on World Wide Web - WWW '15 Companion (2015).
2. F. Radlinski, M. Kurup, T. Joachims, How Does Clickthrough Data Reflect Retrieval Quality?, Proceedings of
the ACM Conference on Information and Knowledge Management (CIKM), 2008.
3. Hofmann, Katja, Shimon Whiteson, and Maarten De Rijke. "Balancing Exploration and Exploitation in
Listwise and Pairwise Online Learning to Rank for Information Retrieval." Information
Retrieval 16.1 (2012): 63-90.
4. T. Joachims, Evaluating Retrieval Performance Using Clickthrough Data, in J. Franke and G. Nakhaeizadeh
and I. Renz, "Text Mining", Physica/Springer Verlag, pp 79-96, 2003.
5. K. Hofmann, F. Behr, and F. Radlinski. On caption bias in interleaving experiments. In Proc. of CIKM, 2012.
6. O. Chapelle, T. Joachims, F. Radlinski, Yisong Yue, Large-Scale Validation and Analysis of Interleaved Search
Evaluation, ACM Transactions on Information Systems (TOIS), 30(1):6.1-6.41, 2012.
7. Schuth, Anne, Harrie Oosterhuis, Shimon Whiteson, and Maarten De Rijke. "Multileave Gradient Descent for
Fast Online Learning to Rank." Proceedings of the Ninth ACM International Conference on Web Search and
Data Mining - WSDM '16 (2016).
8. Schuth, Anne, Katja Hofmann, Shimon Whiteson, and Maarten De Rijke. "Lerot." Proceedings of the 2013
Workshop on Living Labs for Information Retrieval Evaluation - LivingLab '13 (2013).
9. T.-Y. Liu. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3):
225–331, 2009.
10. Yisong Yue, Yue Gao, O. Chapelle, Ya Zhang, T. Joachims, Learning more powerful test statistics for click-
based retrieval evaluation, Proceedings of the Conference on Research and Development in Information
Retrieval (SIGIR), 2010.

REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...Universidade Federal de Sergipe - UFS
 
Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologycaarthichand2003
 
Thermodynamics ,types of system,formulae ,gibbs free energy .pptx
Thermodynamics ,types of system,formulae ,gibbs free energy .pptxThermodynamics ,types of system,formulae ,gibbs free energy .pptx
Thermodynamics ,types of system,formulae ,gibbs free energy .pptxuniversity
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentationtahreemzahra82
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxNandakishor Bhaurao Deshmukh
 
Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...navyadasi1992
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...D. B. S. College Kanpur
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024AyushiRastogi48
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensorsonawaneprad
 
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024Jene van der Heide
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxmalonesandreagweneth
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationColumbia Weather Systems
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 

Recently uploaded (20)

Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptx
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather Station
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical Engineering
 
basic entomology with insect anatomy and taxonomy
basic entomology with insect anatomy and taxonomybasic entomology with insect anatomy and taxonomy
basic entomology with insect anatomy and taxonomy
 
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
 
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxGENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
 
Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technology
 
Thermodynamics ,types of system,formulae ,gibbs free energy .pptx
Thermodynamics ,types of system,formulae ,gibbs free energy .pptxThermodynamics ,types of system,formulae ,gibbs free energy .pptx
Thermodynamics ,types of system,formulae ,gibbs free energy .pptx
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensor
 
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather Station
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
 

Online Learning to Rank

  • 1. Online Learning to Rank
    by Edward W Huang (ewhuang3) and Yerzhan Suleimenov (suleime1)
    Prepared as an assignment for CS410: Text Information Systems in Spring 2016
  • 3. What is learning to rank?
    • Many information retrieval problems are ranking problems
    • Also known as machine-learned ranking
      – Uses machine learning techniques to create ranking models
    • Training data: queries and documents matched with relevance judgements
      – Model sorts objects by relevance, preference, or importance
      – Finds optimal combination of features
  • 4. Applications of learning to rank
    • Ranking problems in information retrieval
      – Document retrieval
      – Sentiment analysis
      – Product rating
      – Anti-spam measures
      – Search engines
    • Many more applications, not just in information retrieval!
      – Machine translation
      – Computational biology
  • 5. Online vs. offline learning to rank
    • Training set is produced by human assessors (offline)
      – Time consuming and expensive to produce
      – Not always in line with actual user preferences
    • Data of users interacting with system (online)
      – Users leave trace of interaction data: query reformulations, mouse movements, clicks, etc.
      – Clicks especially valuable when interpreted as preferences
  • 6. Big issue with online learning to rank
    • Exploration-exploitation dilemma
      – Have to obtain feedback to improve system, while also utilizing past models to optimize result quality
      – Discuss solutions later
  • 8. Ranking model training framework
    • Discriminative training attributes
      – Input space
      – Output space
      – Hypothesis space
      – Loss function
    • Training phase: ranking model learns to predict the ground-truth labels in the training set, with error measured by the loss function
    • Test phase: new query arrives, trained ranking model sorts documents according to relevance to query
  • 9. Algorithms for learning to rank problems
    • Categorized into three groups by their framework (input representation and loss function)
      – Pointwise
      – Pairwise
      – Listwise
    T.-Y. Liu. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3): 225–331, 2009.
  • 10. Limitations of pointwise approach
    • Does not consider interdependency among documents
    • Does not make use of the fact that some documents are associated with the same query
    • Most IR evaluation measures are query-level and position-based
  • 11. Pairwise and listwise
    • Potential solutions to the previously mentioned exploration-exploitation dilemma
    • Pairwise approach
      – Input: pairs of documents with labels identifying which one is preferred
      – Learns classifier to predict these labels
    • Listwise approach
      – Input: entire document list associated with a certain query
      – Directly optimizes evaluation measures (e.g., Normalized Discounted Cumulative Gain)
    Hofmann, Katja, Shimon Whiteson, and Maarten De Rijke. "Balancing Exploration and Exploitation in Listwise and Pairwise Online Learning to Rank for Information Retrieval." Information Retrieval 16.1 (2012): 63-90.
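The listwise bullet above names Normalized Discounted Cumulative Gain (NDCG) as an evaluation measure. As a rough illustration only, the sketch below computes NDCG for one ranked list of graded relevance labels. The gain form (2^rel - 1) and the log2 discount are one common convention, assumed here rather than taken from the slides, and the function names `dcg` and `ndcg` are hypothetical.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain of a ranked list of relevance grades.

    Assumes the common (2^rel - 1) / log2(rank + 1) formulation.
    """
    if k is not None:
        relevances = relevances[:k]
    return sum((2 ** rel - 1) / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances, k=None):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    # A list with no relevant documents has no meaningful ideal; return 0.
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0
```

A list that is already sorted by relevance scores 1.0, and pushing a relevant document down the list lowers the score, which is the position sensitivity the pointwise approach does not capture.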
  • 12. Absolute and relative feedback approaches
    • Use feedback to learn personalized rankings
    • Absolute feedback: contextual bandits learning
    • Relative feedback: gradient methods and inferred preferences between complete result rankings
    • Relative is usually better
      – Robust to noisy feedback
      – Deals with larger document spaces
    Chen, Yiwei, and Katja Hofmann. "Online Learning to Rank: Absolute vs. Relative." Proceedings of the 24th International Conference on World Wide Web - WWW '15 Companion (2015).
  • 14. Improving learning performance
    • Search engine clicks are useful, but might be biased
      – Bias might come from attractive titles, snippets, or captions
    • Method to detect and compensate for caption bias
      – Enable reweighting of clicks based on the likelihood that each click is caption biased
      – Attractive, clicked links are considered less relevant
    K. Hofmann, F. Behr, and F. Radlinski. On caption bias in interleaving experiments. In Proc. of CIKM, 2012.
  • 15. Handling caption bias
    • Allow weighting of clicks based on likelihood that each click is caption biased
    • Model click probability as function of position, relevance, and caption bias
      – Visual characteristics of individual documents
      – Pairwise feature to focus on relationships with neighboring documents
    • Learn model weights from past behavior of users
    • Remove caption bias to obtain an evaluation that better reflects relevance
  • 16. Improving learning speed
    • Search engine clicks can be interpreted using interleaved comparison methods (two main methods)
      – Reliably infer preferences between pairs of rankers
    • Dueling bandit gradient descent learns from these comparisons
      – Requires pairwise comparisons involving users between all exploratory rankers
    • Multileave gradient descent learns from comparisons of multiple rankers at once
      – Uses a single user interaction
      – Fast
    Schuth, Anne, Harrie Oosterhuis, Shimon Whiteson, and Maarten De Rijke. "Multileave Gradient Descent for Fast Online Learning to Rank." Proceedings of the Ninth ACM International Conference on Web Search and Data Mining - WSDM '16 (2016).
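To make the dueling bandit gradient descent idea above more concrete, here is a minimal sketch for a linear ranker, under stated assumptions. The callback `candidate_wins` is a hypothetical stand-in for an interleaved comparison with real users, and the step sizes `delta` (exploration) and `alpha` (update) are illustrative values, not taken from the slides or the cited papers.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Sample a direction uniformly from the unit sphere."""
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def dbgd(initial_weights, candidate_wins, steps, delta=1.0, alpha=0.1, seed=0):
    """Dueling bandit gradient descent over linear ranker weights.

    candidate_wins(current, candidate) is an assumed callback that returns
    True when the comparison (e.g. an interleaving experiment) prefers the
    exploratory candidate ranker over the current one.
    """
    rng = random.Random(seed)
    weights = list(initial_weights)
    for _ in range(steps):
        direction = random_unit_vector(len(weights), rng)
        # Exploratory ranker: a step of size delta along a random direction.
        candidate = [w + delta * u for w, u in zip(weights, direction)]
        if candidate_wins(weights, candidate):
            # Move a smaller step alpha toward the winning direction.
            weights = [w + alpha * u for w, u in zip(weights, direction)]
    return weights
```

Each iteration compares the current ranker with exactly one exploratory ranker, which is the cost that multileave gradient descent reduces by comparing several candidates in a single user interaction.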
  • 18. How to evaluate rankers?
    • After training a ranker, we need to find out how effective it is
    • Offline evaluation methods
      – Dependent on explicit expert judgements
      – Not feasible in practice
    • Online evaluation methods
      – Leverage online data that can reflect ranker quality
      – Click-based ranker evaluation (discussed next)
    • State of the art software: Lerot
      – Evaluates different algorithms
      – Can simulate user clicking behaviour with user models
    Schuth, Anne, Katja Hofmann, Shimon Whiteson, and Maarten De Rijke. "Lerot." Proceedings of the 2013 Workshop on Living Labs for Information Retrieval Evaluation - LivingLab '13 (2013).
  • 19. Click-based ranker evaluation
    • Online evaluation strategy based on clickthrough data
    • Independent of expert judgments, unlike conventional evaluation methods
      – Measure reflects interest of an actual user rather than interest of an expert providing relevance judgement
  • 20. Challenges of using clickthrough data
    • Handling presentation bias
      – Design user interface with three features
        • Blind test: hidden random variables underlying the hypothesis test
        • Click to preference: user's click should reflect their actual judgment
        • Low usability impact: interactive, user-friendly interface
    • Identifying the superior of two rankers
      – Unified user interface that sends user query to both rankers
      – Mix two ranking results (discussed next)
      – Show combined ranking to the user and record interesting/relevant clicks
    T. Joachims, Evaluating Retrieval Performance Using Clickthrough Data, in J. Franke and G. Nakhaeizadeh and I. Renz, "Text Mining", Physica/Springer Verlag, pp 79-96, 2003.
  • 21. Mixing two ranking results
    • Also known as interleaving
    • Key is to mix so that both rankers are equally represented in the top n results
    • Algorithms vary in mixing strategy
      – Balanced Interleaving
      – Team-Draft Interleaving
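As one possible illustration of the mixing step above, the sketch below implements a simple version of Team-Draft Interleaving: the ranker with fewer picks so far contributes its highest-ranked unused document, and ties are broken by a coin flip. The function names `team_draft_interleave` and `credit_clicks` are hypothetical, and crediting each click equally to the contributing ranker is the basic scheme, not a tuned test statistic.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, length, rng=None):
    """Mix two rankings; return the combined list and the team of each slot."""
    rng = rng or random.Random()
    rankings = {"A": ranking_a, "B": ranking_b}
    picks = {"A": 0, "B": 0}
    interleaved, teams = [], []
    while len(interleaved) < length:
        # Documents each ranker could still contribute, in its own order.
        remaining = {team: [d for d in docs if d not in interleaved]
                     for team, docs in rankings.items()}
        if not remaining["A"] and not remaining["B"]:
            break
        # The team with fewer picks goes next; a coin flip breaks ties.
        if picks["A"] != picks["B"]:
            team = "A" if picks["A"] < picks["B"] else "B"
        else:
            team = rng.choice("AB")
        if not remaining[team]:
            team = "B" if team == "A" else "A"
        interleaved.append(remaining[team][0])
        teams.append(team)
        picks[team] += 1
    return interleaved, teams


def credit_clicks(teams, clicked_positions):
    """Count clicks per ranker, giving every click equal weight."""
    wins = {"A": 0, "B": 0}
    for position in clicked_positions:
        wins[teams[position]] += 1
    return wins
```

The ranker with more credited clicks over many queries is inferred to be preferred; the user only ever sees the single combined list, which keeps the comparison blind.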
  • 22. Leveraging click responses from mixed rankings
    • Each click response represents the user's preference for the ranker that provided the clicked link
    • Thus, proper leverage of clicks is critical
      – The click-scoring scheme is known as the test statistic
      – Essential to reliable evaluation of rankers
    • One basic approach is to assign equal weights to all clicks
      – Suboptimal since not all clicks are equally significant
      – Caption bias!
    • More advanced test statistics, discussed next
  • 23. Test statistics for evaluation
    • Max mean difference: learn weights to maximize mean score difference between best and worst rankers
    • Inverse z-test: optimize statistical power of z-test by maximizing the z-score (equivalently, minimizing the p-value)
      – Removes assumption of equal variance of weights
    • Inverse rank test: learns to invert Wilcoxon Signed-Rank Test
      – Produces scoring function to optimize Wilcoxon test
    • Max mean difference performs the worst
    • Inverse z-test performs the best
    Yisong Yue, Yue Gao, O. Chapelle, Ya Zhang, T. Joachims, Learning more powerful test statistics for click-based retrieval evaluation, Proceedings of the Conference on Research and Development in Information Retrieval (SIGIR), 2010.
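The z-test idea above can be sketched as follows, as an illustration under assumptions rather than the cited paper's implementation. Each click is assumed to be described by a sign (+1 if the clicked result came from ranker A, -1 if from ranker B) and a feature vector; a linear weight vector turns clicks into per-query scores, and the z-score of those scores measures how confidently A is preferred. The names `query_score` and `z_score` and the feature layout are hypothetical.

```python
import math


def query_score(clicks, weights):
    """Signed, weighted sum of the clicks observed for one query.

    clicks is a list of (sign, features) pairs; sign is +1 for ranker A
    and -1 for ranker B (an assumed encoding).
    """
    return sum(sign * sum(w * f for w, f in zip(weights, features))
               for sign, features in clicks)


def z_score(per_query_scores):
    """z-score of the mean per-query score against a true mean of zero."""
    n = len(per_query_scores)
    if n < 2:
        raise ValueError("need at least two queries to estimate variance")
    mean = sum(per_query_scores) / n
    variance = sum((s - mean) ** 2 for s in per_query_scores) / (n - 1)
    if variance == 0:
        # Every query gave the same score: no evidence or unanimous evidence.
        return 0.0 if mean == 0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(variance / n)
```

With all weights equal this reduces to the basic equal-weight scheme; learning the weights to increase the z-score on rankers of known relative quality is the idea behind the inverse z-test.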
  • 24. How good are interleaving methods?
    • Interleaving methods are compared against baseline: conventional evaluation methods based on absolute metrics
    • Conventional evaluation methods based on absolute metrics
      – Absolute usage statistics are expected to monotonically change with respect to ranker quality
    • Interleaving methods
      – More user clicks are expected for better ranker
  • 25. Relative performance of interleaving methods
    • Experiment results on two rankers whose relative qualities are known by construction
    • Conventional evaluation methods based on absolute metrics
      – Did not reliably identify high-quality rankers
      – Absolute usage statistics did not monotonically change with respect to ranker quality
    • Balanced Interleaving and Team-Draft Interleaving algorithms
      – Reliably identified high-quality rankers
      – Number of preferences for better ranker is significantly larger
    F. Radlinski, M. Kurup, T. Joachims, How Does Clickthrough Data Reflect Retrieval Quality?, Proceedings of the ACM Conference on Information and Knowledge Management (CIKM), 2008.
  • 26. How reliable are interleaved methods, and why choose them?
    • Results of interleaving agree with conventional evaluation methods
    • Achieves statistically reliable preferences compared to absolute metrics
    • Economical: statistical evaluation power of 10 interleaved clicks is approximately equal to 1 manually judged query
    • Not sensitive to different click aggregation schemes
    • Can complement or even replace standard evaluation methods based on manual judgments or absolute metrics
    O. Chapelle, T. Joachims, F. Radlinski, Yisong Yue, Large-Scale Validation and Analysis of Interleaved Search Evaluation, ACM Transactions on Information Systems (TOIS), 30(1):6.1-6.41, 2012.
  • 27. Future directions
    • Extending current linear learning approaches with online learning to rank algorithms that can effectively learn more complex models
    • Designing and experimenting with more complex models of click behavior to better understand various click biases
    • Learning distinctive properties, such as click dwell time and use of the back button, to filter raw clicks
    • Understanding the range of domains in which interleaving methods are highly effective
    • Improving gradient descent based rankers by covering all search directions to speed up the learning process
  • 28. References
    1. Chen, Yiwei, and Katja Hofmann. "Online Learning to Rank: Absolute vs. Relative." Proceedings of the 24th International Conference on World Wide Web - WWW '15 Companion (2015).
    2. F. Radlinski, M. Kurup, T. Joachims, How Does Clickthrough Data Reflect Retrieval Quality?, Proceedings of the ACM Conference on Information and Knowledge Management (CIKM), 2008.
    3. Hofmann, Katja, Shimon Whiteson, and Maarten De Rijke. "Balancing Exploration and Exploitation in Listwise and Pairwise Online Learning to Rank for Information Retrieval." Information Retrieval 16.1 (2012): 63-90.
    4. T. Joachims, Evaluating Retrieval Performance Using Clickthrough Data, in J. Franke and G. Nakhaeizadeh and I. Renz, "Text Mining", Physica/Springer Verlag, pp 79-96, 2003.
    5. K. Hofmann, F. Behr, and F. Radlinski. On caption bias in interleaving experiments. In Proc. of CIKM, 2012.
    6. O. Chapelle, T. Joachims, F. Radlinski, Yisong Yue, Large-Scale Validation and Analysis of Interleaved Search Evaluation, ACM Transactions on Information Systems (TOIS), 30(1):6.1-6.41, 2012.
    7. Schuth, Anne, Harrie Oosterhuis, Shimon Whiteson, and Maarten De Rijke. "Multileave Gradient Descent for Fast Online Learning to Rank." Proceedings of the Ninth ACM International Conference on Web Search and Data Mining - WSDM '16 (2016).
    8. Schuth, Anne, Katja Hofmann, Shimon Whiteson, and Maarten De Rijke. "Lerot." Proceedings of the 2013 Workshop on Living Labs for Information Retrieval Evaluation - LivingLab '13 (2013).
    9. T.-Y. Liu. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3): 225–331, 2009.
    10. Yisong Yue, Yue Gao, O. Chapelle, Ya Zhang, T. Joachims, Learning more powerful test statistics for click-based retrieval evaluation, Proceedings of the Conference on Research and Development in Information Retrieval (SIGIR), 2010.