SIGIR 2017 talk
Tokyo, Japan
A significant amount of search queries originate from some real-world
information need or task [13]. In order to improve the
search experience of end users, it is important to have accurate
representations of tasks. As a result, a significant amount of research
has been devoted to extracting proper representations of tasks in
order to enable search systems to help users complete their tasks, as
well as providing the end user with better query suggestions [9],
better recommendations [41], satisfaction prediction [36] and
improved task-level personalization [24, 38]. Most existing
task extraction methodologies focus on representing tasks as flat
structures. However, tasks often tend to have multiple subtasks
associated with them, and a more natural representation of tasks
would be in terms of a hierarchy, where each task can be composed
of multiple (sub)tasks. To this end, we propose an efficient Bayesian
nonparametric model for extracting hierarchies of such tasks &
subtasks. We evaluate our method on real-world query log
data through both quantitative and crowdsourced experiments and highlight the importance of considering task/subtask hierarchies.
12. Extracting Search Tasks: Prior Work
Problems:
• Linking queries to the on-going task leads to long chains
• Impure tasks
• Rely on a large corpus of pre-tagged queries
• Do not aggregate across users
• Tasks are not necessarily flat structures
• Complex tasks decompose into sub-tasks
19. Hierarchical Task Extraction
• Build upon Bayesian Rose Trees
• Each node of the tree corresponds to a task
• Each task is represented by a set of queries
• Goal: find the tree structure T that maximizes p(Q_T | T)
• The number of partitions consistent with T can be exponentially large
• Approximate using dynamic programming:

p(Q_T | T) = Σ_{φ(T) ∈ Part(T)} p(φ(T)) p(Q_T | φ(T))   [mixture over partitions of the data points]

p(Q_T | T) = π_T f(Q_T) + (1 − π_T) ∏_{T_i ∈ ch(T)} p(leaves(T_i) | T_i)

where f(Q_T) is the likelihood that the queries in Q_T all belong to the same task, π_T is the prior probability of that event, and the product term recursively mixes over partitions of the data points into subtasks.
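The dynamic-programming recursion above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `f` here is a hypothetical stand-in for the single-task marginal likelihood (the paper derives it from query affinities), and `pi` plays the role of π_T with a fixed value.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    queries: list = field(default_factory=list)   # queries stored at a leaf
    children: list = field(default_factory=list)  # subtask nodes

def leaves(node):
    """All queries in the subtree rooted at this node (Q_T)."""
    if not node.children:
        return list(node.queries)
    return [q for child in node.children for q in leaves(child)]

def f(queries):
    # Hypothetical single-task marginal: rewards shared vocabulary.
    # The paper uses a likelihood built from query-affinity features instead.
    terms = [t for q in queries for t in q.split()]
    return math.exp(-len(set(terms)) / max(1, len(terms)))

def likelihood(node, pi=0.5):
    # p(Q_T | T) = pi * f(Q_T) + (1 - pi) * prod_{T_i in ch(T)} p(leaves(T_i) | T_i)
    if not node.children:
        return f(node.queries)
    prod = 1.0
    for child in node.children:
        prod *= likelihood(child, pi)
    return pi * f(leaves(node)) + (1 - pi) * prod
```

Each node mixes the "all one task" explanation of its queries with the product of its children's likelihoods, so the full sum over partitions never has to be enumerated.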
25. Experimental Evaluation
• Experiment 1: Search task identification
• Experiment 2: Crowd-sourced evaluation of the hierarchy
• Experiment 3: Term prediction application
Baselines (task extraction & hierarchical model baselines):
1. Bestlink-SVM
2. QC-WCC/QC-HTC
3. LDA-Hawkes
4. LDA-TW
5. Jones hierarchy
6. BHCD: Bayesian Hierarchical Community Detection
7. Bayesian agglomerative clustering
26. Experimental Evaluation – I [Search Task Identification]
• Pairwise precision/recall:
• LDA-TW performs worst
  • Too strong assumptions on queries belonging to the same task
• Gains over QC-HTC/WCC
  • Query affinities can better reflect semantic relationships
• A flattened version of the hierarchy is useful too!
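Pairwise precision/recall treats every pair of queries as a binary same-task decision against a gold labeling. A small self-contained sketch (the dict-based labelings are illustrative):

```python
from itertools import combinations

def pairwise_prf(gold, pred):
    """Pairwise precision, recall, and F1 between two task labelings.

    gold, pred: dicts mapping a query id to its task label.
    A pair counts as positive when both queries share a label.
    """
    tp = fp = fn = 0
    for a, b in combinations(sorted(gold), 2):
        same_gold = gold[a] == gold[b]
        same_pred = pred[a] == pred[b]
        if same_pred and same_gold:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_gold:
            fn += 1
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```

Because the metric is defined over pairs rather than clusters, it applies equally to flat task partitions and to a flattened cut of a task hierarchy.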
28. Experimental Evaluation – III [Term Prediction]
• Indirect evaluation based on term prediction:
  1. Construct the hierarchy
  2. Map each session to the correct node in the hierarchy
  3. Leverage node queries for term prediction
• Assumption: identifying good tasks should help in predicting future queries
• Data: intersection of TREC Session track & AOL log data
• Outperforms flat-task extraction techniques as well as hierarchical baselines
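Step 3 above can be sketched as a simple frequency-based ranker over the queries mapped to a task node. This is an illustrative stand-in, not the paper's term-prediction model:

```python
from collections import Counter

def predict_terms(node_queries, k=5):
    """Rank candidate terms for a user's next query by how often they
    appear in the queries mapped to the same task node."""
    counts = Counter(t for q in node_queries for t in q.lower().split())
    return [term for term, _ in counts.most_common(k)]
```

The intuition matches the assumption stated above: if the extracted node really captures one coherent task, its queries share vocabulary with the queries the user issues next.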