A Quick Introduction to {!ctf}.
Dr. Gavin Ruddy
gav@pontneo.com
23 Jan 2016
What is {!ctf}?
A relatively simple way to make query returns more efficient, dynamic
and intelligent
Developed to improve results for a medical research database
based in SW UK - Trip (www.tripdatabase.com)
Uses the flow of people through content to help
• organise, extend & filter the items in query returns
• share & develop collective intelligence
• adapt naturally as things change
⇒ Implicit self-organisation without complex data, models or processing
{!ctf} stands for click-through filter
Click data
• reflects intelligent choices
• integrates community activity
• adds useful dimensions to content
For example, counting clicks on items and integrating over time
results in “Most Viewed” sections often seen on websites
These reflect collective behaviour and enable a dynamic connection
between users and content
Here’s one for “Rheumatology” documents on Trip varying in response
to click traffic at the beginning of 2015 …
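As a rough illustration of "counting clicks and integrating over time", the sketch below ranks items by clicks inside a trailing time window. The log format (pairs of item id and timestamp), the function name `most_viewed` and the 30-day window are assumptions made for this example; the slides do not say how Trip integrates clicks over time.

```python
from collections import Counter
from datetime import datetime, timedelta


def most_viewed(clicks, now, window=timedelta(days=30), top_n=3):
    """Rank items by the number of clicks inside a trailing time window.

    clicks: iterable of (item_id, timestamp) pairs (hypothetical log format).
    """
    cutoff = now - window
    counts = Counter(item for item, ts in clicks if cutoff <= ts <= now)
    return counts.most_common(top_n)


# Made-up click log for illustration only.
clicks = [
    ("doc_a", datetime(2015, 1, 2)),
    ("doc_b", datetime(2015, 1, 3)),
    ("doc_a", datetime(2015, 1, 10)),
    ("doc_c", datetime(2014, 11, 1)),  # falls outside the 30-day window
    ("doc_a", datetime(2015, 1, 12)),
    ("doc_b", datetime(2015, 1, 14)),
]

print(most_viewed(clicks, now=datetime(2015, 1, 15)))
```

Re-running the function with a later `now` lets old clicks drop out of the window, which is one simple way a "Most Viewed" list can follow changing traffic.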
Click data also identify the paths users take between items (sometimes
referred to as clickstream)
⇒ Lots of user click traffic connecting items
⇒ Forming a dynamic, bottom-up “Knowledge Map”
⇒ Useful source of time dependent intelligence …
{!ctf} extends this idea
For example, of all clicks in Trip search returns:
• 5% - from single click sessions
• 95% - from multiple click sessions (Ñ = 7)
• 25% - connect items in the same return
• 70% - connect to items not in the same return
• 15% - user modifies search query
• 55% - intersecting (i.e. related) searches
Traffic between items can be recorded, counted, integrated over time,
filtered etc.
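Recording and counting traffic between items might look something like the sketch below. It assumes each session is already available as an ordered list of clicked item ids; the name `count_transitions` and the sample sessions are hypothetical, and the real {!ctf} bookkeeping may differ.

```python
from collections import Counter


def count_transitions(sessions):
    """Count directed item-to-item moves within each click session.

    sessions: iterable of lists of item ids in click order (assumed format).
    """
    traffic = Counter()
    for session in sessions:
        # Pair each click with the next click in the same session.
        for src, dst in zip(session, session[1:]):
            if src != dst:  # ignore repeat clicks on the same item
                traffic[(src, dst)] += 1
    return traffic


# Made-up sessions; the single-click session contributes no traffic.
sessions = [["a", "b", "c"], ["a", "b"], ["c"], ["b", "c", "a"]]
traffic = count_transitions(sessions)
print(traffic)
```

The resulting counter is a small directed "knowledge map": keys are (from, to) pairs and values are how often users made that move.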
Applications: recommendations
These connections can usefully extend queries
For example, {!ctf} personal recommendations use
• a list of items a user has visited
• & movement of other users to/from these items
to identify and rank interesting related items the user hasn’t visited
⇒ Responsive list of intelligent recommendations without
complex models, data or processing …
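The idea above can be sketched in a few lines: score every item the user has not visited by the traffic that connects it to items the user has visited. The function `recommend`, the traffic dictionary and the plain summing of counts are assumptions for illustration, not the author's implementation.

```python
from collections import defaultdict


def recommend(visited, traffic, top_n=5):
    """Rank unvisited items by click traffic to/from the user's visited items.

    traffic: dict mapping (src, dst) to a count of moves made by other users.
    """
    visited = set(visited)
    scores = defaultdict(int)
    for (src, dst), count in traffic.items():
        if src in visited and dst not in visited:
            scores[dst] += count  # others moved from a visited item to dst
        elif dst in visited and src not in visited:
            scores[src] += count  # others arrived at a visited item from src
    # Highest traffic first; ties broken by item id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Made-up traffic counts for illustration only.
traffic = {
    ("a", "b"): 4,
    ("b", "c"): 3,
    ("d", "a"): 2,
    ("c", "e"): 5,
    ("a", "c"): 1,
}
print(recommend(["a", "b"], traffic))
```

Because the scores are read straight from current traffic counts, the list changes as soon as the counts do, with no model to retrain.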
How good are {!ctf} recommendations?
{!ctf} on the famous MovieLens 1M dataset produces comparable
precision, recall and ranking to more complex methods (see Lu, L. et
al. (2012). Recommender Systems. Physics Reports, 519(1), pp.1-49)
[Figure: Mean Average Precision (y-axis, 0 to 0.7) against cg, the {!ctf} specificity param (x-axis, 0 to 1), by source of recommendations: all traffic; to/from high ratings only; high ratings, same demographic; most popular items]
Mean Average Precision of top 100 recommendations calculated from each user’s 5 highest rated
items, for some different components of traffic and at different values of cg. cg reduces the ranking of
widely connected, generally popular items, boosting more specifically connected recommendations.
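For readers unfamiliar with the metric, the sketch below shows one common way to compute Mean Average Precision. Normalisation conventions vary; this version divides by the number of relevant items, which is an assumption and may not match the exact variant used for the MovieLens results. The item ids are made up.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of relevant items."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant)


def mean_average_precision(rankings, relevants):
    """Mean of average_precision over users (one ranking per user)."""
    aps = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(aps) / len(aps) if aps else 0.0


# Two hypothetical users: recommended lists and the items they actually liked.
rankings = [["m1", "m2", "m3", "m4"], ["m5", "m6"]]
relevants = [{"m1", "m3"}, {"m6"}]
print(mean_average_precision(rankings, relevants))
```

The first user scores (1/1 + 2/3) / 2 and the second scores 1/2, so the mean is about 0.67.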
How good are {!ctf} recommendations?
BUT, high precision means boring & pointless recommendations
For movies, the “interesting” recommendations are those connected by enough (but not too much)
click traffic. Hence, lower precision can produce better results. An active feedback loop helps
dynamic, shared interest communities find the right balance themselves.
[Figure: fraction of top 100 recommendations (y-axis, 0 to 1) against cg, the {!ctf} specificity param (x-axis, 0 to 1), split into: below average rating; stuff user already knows; obscure; interesting]
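The slides do not give the formula behind cg. Purely as an assumed illustration of a specificity parameter, the sketch below divides the traffic linking a candidate to the user's items by the candidate's overall popularity raised to the power cg. The function names, the power-law form and the numbers are all hypothetical and should not be read as the {!ctf} method.

```python
def specificity_scores(link_traffic, popularity, cg):
    """Score candidates, penalising widely connected items as cg grows.

    link_traffic: candidate -> traffic linking it to the user's items.
    popularity:   candidate -> total traffic through the candidate (> 0).
    cg:           0 leaves raw traffic untouched; 1 fully normalises by popularity.
    NOTE: this power-law penalty is an assumption, not the author's formula.
    """
    return {
        item: count / (popularity[item] ** cg)
        for item, count in link_traffic.items()
    }


def rank(link_traffic, popularity, cg):
    """Return candidate ids ordered from highest to lowest score."""
    scores = specificity_scores(link_traffic, popularity, cg)
    return sorted(scores, key=lambda item: (-scores[item], item))


# Made-up numbers: a very popular item and a specifically connected one.
link_traffic = {"blockbuster": 50, "niche": 10}
popularity = {"blockbuster": 1000, "niche": 20}

print(rank(link_traffic, popularity, cg=0.0))
print(rank(link_traffic, popularity, cg=1.0))
```

With cg at 0 the popular item ranks first; with cg at 1 the specifically connected item overtakes it, which mirrors the trade-off between precision and "interesting" results described above.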
Applications: search
Click data helps to improve search efficiency
Because click rates diminish extremely quickly down a list, there are
large gains from getting the top of a search return “right”
The following {!ctf} search
• boosts items that have recently been visited more
• injects items strongly related to returned items by lots of
recent click traffic (i.e. “recommendations”)
• responds to changing (in this case, unfiltered) click traffic
over time
This helps improve both precision and recall, again without complex
models, data or processing …
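The boost and inject steps can be sketched as a re-ranking pass over text-match results. Everything here is an assumption for illustration: the function name `ctf_style_search`, the multiplicative boost, the `inject_min` threshold and the sample data are hypothetical, and the actual Solr plugin may combine scores quite differently.

```python
def ctf_style_search(text_hits, recent_clicks, traffic,
                     boost=0.1, inject_min=5, top_n=10):
    """Re-rank text-match hits with click data and inject related items.

    text_hits:     item -> text match score (e.g. from TF-IDF).
    recent_clicks: item -> recent click count.
    traffic:       (src, dst) -> recent click traffic between two items.
    All weights are illustrative assumptions, not {!ctf} defaults.
    """
    # 1. Boost returned items that were visited more often recently.
    scores = {
        item: score * (1 + boost * recent_clicks.get(item, 0))
        for item, score in text_hits.items()
    }
    # 2. Inject items strongly connected to a returned item by click traffic.
    for (src, dst), count in traffic.items():
        for hit, other in ((src, dst), (dst, src)):
            if hit in text_hits and other not in text_hits and count >= inject_min:
                scores[other] = scores.get(other, 0.0) + boost * count
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Made-up data: one strongly connected item that the text match missed.
text_hits = {"vhf_review": 2.0, "dengue_guideline": 1.5}
recent_clicks = {"vhf_review": 10, "ebola_report": 40}
traffic = {
    ("vhf_review", "ebola_report"): 30,
    ("dengue_guideline", "malaria_note"): 2,  # too little traffic to inject
}

results = ctf_style_search(text_hits, recent_clicks, traffic)
print([item for item, _ in results])
```

Because the inputs are recent click counts, re-running the same query later can return a different order or different injected items, which is the third behaviour listed above.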
How good is {!ctf} search?
On Trip data, 1.5 to 4+ times more efficient than the underlying text
match algorithm (simple, unboosted TF-IDF on titles)
Overall average on Trip (before feedback & at various {!ctf}
settings) is 1.83 and does not fall below 1.
See http://www.slideshare.net/pontneo/click-through-filterprototyperesultsv2 for details
These dynamics are not noise … simply a direct reflection of the active
interests in the relevant part of the information space
A simple illustration: Ebola in 2014
[Figure: “Ebola” interest on Google Trends (../News/Health/Infectious Diseases), 01/01/2014 to 05/02/2015]
[Figure: Clicks on “Ebola” documents in Trip (a tiny signal in the total Trip traffic), 01/01/2014 to 05/02/2015]
[Figure: “Ebola” documents in top 10 {!ctf} search results for “hemorrhagic fever” on Trip (sum relevance), over 60% as “recommendations”, 01/01/2014 to 05/02/2015]
Conclusions
Low level, high frequency collaborative method that “naturally” brings
users and information together at the right level
Modifying query responses using click data ⇒
• clear efficiency improvements
• intelligent, responsive content delivery
• efficient knowledge sharing
Forms an implicit self-organising feedback loop that results in continual
evolution of responses & communities without complex data or methods
For more detail see “Click-Through Filter” e.g.
http://www.slideshare.net/pontneo/better-search-implementation-of-click-through-filter-as-a-query-parser-plugin-for-apache-solr-lucene
