IRGIRGroup @UAM
Bias in recommendation: avoid it or embrace it?
Amazon, Barcelona, February 17, 2020
Bias in recommendation:
avoid it or embrace it?
Pablo Castells
Universidad Autónoma de Madrid
http://ir.ii.uam.es/castells
Amazon, Barcelona, February 17, 2020
Outline
1. Bias and fairness
2. Removing the bias
3. Understanding the bias
4. Conclusion
Bias in search
• “Search engine manipulation is a serious threat to the democratic system of government” (Robert Epstein)
• Google could manipulate 2.6 – 10M votes
[Figure: pro-Clinton vs. pro-Trump bias in search results by date, before and after election day]
R. Epstein, R. E. Robertson. A Method for
Detecting Bias in Search Rankings, with
Evidence of Systematic Bias Related to the
2016 Presidential Election. White paper,
American Institute for Behavioral Research
and Technology, June 2017.
R. Epstein, R. E. Robertson. The search
engine manipulation effect (SEME) and its
possible impact on the outcomes of
elections. PNAS 112(33), August 2015.
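Recidivism prediction
COMPAS system: ground truth evaluation proves bias
False positives: Black 45%, White 23%
False negatives: Black 28%, White 48%
$P(\mathrm{FP} \mid \mathrm{Black}) \gg P(\mathrm{FP} \mid \mathrm{White})$
$P(\mathrm{FN} \mid \mathrm{Black}) \ll P(\mathrm{FN} \mid \mathrm{White})$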
Bias = under / over-represented features
PhDs (USA): Female 53%, Male 47%
Full professors (USA): Female 32%, Male 68%
$P(\mathrm{Female} \mid \mathrm{Professor}) \ll P(\mathrm{Female} \mid \mathrm{PhD})$
$P(\mathrm{Female} \mid \mathrm{Professor}) \ll P(\mathrm{Female})$
Bias = under / over-represented features
PhDs (Spain): Female 51%, Male 49%
Full professors (Spain): Female 16%, Male 84%
$P(\mathrm{Female} \mid \mathrm{Professor}) \ll P(\mathrm{Female} \mid \mathrm{PhD})$
$P(\mathrm{Female} \mid \mathrm{Professor}) \ll P(\mathrm{Female})$
Bias = under / over-represented features
Fortune 500 employees: White male 62%, Other 38%
Fortune 500 CEOs: White male 91%, Other 9%
$P(\text{White male} \mid \mathrm{CEO}) \gg P(\text{White male} \mid \mathrm{Employee})$
Bias in algorithms
Bias in many application domains
• Face recognition / surveillance
• Recruiting
• Loans
• News, social media
• Search
• ···
Typically the bias is in the data, in the history – the algorithm
learns and reproduces / amplifies the human bias
Baeza-Yates, R. Bias on the Web. Communications of the ACM 61(6), May 2018.
Bias in search engines’ results?
• Promote own services
• Gender and ethnic stereotypes
– In autocomplete
– In spelling correction
• Impact on people’s perceptions (e.g. shift voting)
• Relevance bias
Bias is… bad?
The popularity bias in information retrieval
• Popularity: “unidimensional” bias
– The overrepresented “feature” is the item itself
– Analysis can be generalized to any feature
• Issues related to user satisfaction
– Does the bias hurt the system effectiveness?
– Does the bias distort evaluation?
Bias in search
• In input
– “Popularity” bias in queries
– “Popularity” bias in click logs
– “Popularity” bias in sales
– “Popularity” bias in Web links
– Position bias in clicks
• In output
– Sellers in search results
– Expose the catalog
• Bias in offline evaluation
Bias in recommendation
In the (input) data
[Figure: user-item interaction matrix, split into popular items and the rest of items (long tail)]
[Figure: number of interactions per item, popular items vs. long-tail items]
Bias in recommendation
In algorithms (output)
[Figure: number of times each item appears in the top 10 vs. its number of positive ratings, one panel each for matrix factorization, user-based kNN, item-based kNN and the oracle (optimal!!)]
R. Cañamares, P. Castells. A Probabilistic Reformulation of Memory-Based Collaborative Filtering – Implications on Popularity Biases. SIGIR 2017.
D. Jannach et al. What recommenders recommend: an analysis of recommendation biases and possible countermeasures. UMUAI 25(5), Dec. 2015.
MovieLens 1M dataset
• 1M ratings, 6K users, 4K items
• Random rating split 80% training / 20% test
Bias in recommendation
Netflix dataset
• 100M ratings, 0.5M users, 18K items
• Random rating split 80% training / 20% test
In offline evaluation
[Figure: nDCG@10 on Netflix for random, average rating value, positive rating count, user-based kNN and matrix factorization]
R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
P. Cremonesi, Y. Koren, R. Turrin. Performance of recommender algorithms on top-n recommendation tasks. RecSys 2010.
D. Jannach et al. What recommenders recommend: an analysis of recommendation biases and possible countermeasures. UMUAI 25(5), Dec. 2015.
Bias in recommendation
What to do about the bias?
Outline
1. Bias and fairness
2. Removing the bias
3. Understanding the bias
4. Conclusion
What to do about the bias
Answer 1 – Bias is bad
⇒ Remove the bias in your recommendations
Avoiding popularity: novelty and diversity
Avoid popularity in the output recommendations
• Novelty: limited value of popular recommendations
Try to move towards the long tail
• Diversity / fairness: avoid filter bubble and concentration over few items
Give all items some chance to be exposed
→ Reranking, multi-armed bandits, etc.
[Figure: items sorted by #interactions, with items a and b marked]
P. Castells, N. J. Hurley, S. Vargas. Novelty and Diversity in
Recommender Systems. In Recommender Systems Handbook,
2nd edition. Springer, 2015.
Avoiding popularity: novelty and diversity
Item novelty model: distance or identity between the recommended item and a context
Context → resulting notion
• Target user’s experience → Unexpectedness
• Other items in the same recommendation → Intra-list diversity
• Everyone else’s experience → Long-tail novelty
• Everyone else’s recommendations → Sales diversity
P. Castells, N. J. Hurley, S. Vargas. Novelty and Diversity in Recommender Systems.
In Recommender Systems Handbook, 2nd edition. Springer, 2015.
Problem solved?
Avoiding popularity: novelty and diversity
Novelty / diversity
Relevance
Stimulation
P. Castells, N. J. Hurley, S. Vargas. Novelty and Diversity in Recommender Systems. In Recommender Systems Handbook, 2nd edition. Springer, 2015.
Avoiding popularity: novelty and diversity
[Figure: items sorted by #interactions, with items a, b and c marked]
Still a bias
What to do about the bias
Answer 2 – Bias is bad
⇒ Remove the bias in offline evaluation
Popularity bias in offline evaluation
Ratings are missing not at random (MNAR)
[Figure: user-item matrix showing observed user-item interactions and unobserved preferences, with items split into popular items (short head) and the rest of items (long tail)]
R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
Popularity bias in offline evaluation
[Figure: user-item matrix showing test data (relevant items), training data and unobserved preferences, with items split into popular items (short head) and the rest of items (long tail)]
avg P@𝑘 ∼
+
𝑘
R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
Ratings are missing not at random (MNAR)
[Figure: nDCG@10 on Netflix for random, average rating value, positive rating count (Nr. ratings), user-based kNN and matrix factorization]
Removing the popularity bias in offline evaluation
A. Handling the (test) data
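• Flat test
• Popularity strata
• Temporal split
[Figure: #ratings per item under a flat test and under popularity strata; temporal split of the ratings over time into training data and test data (relevant items), with unobserved preferences]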
A. Bellogín, P. Castells, I. Cantador. Statistical Biases in Information Retrieval Metrics for Recommender Systems. Information Retrieval 20(6), July 2017.
P. Cremonesi, Y. Koren, R. Turrin. Performance of recommender algorithms on top-n recommendation tasks. RecSys 2010.
H. Steck. Training and Testing of Recommender Systems on Data Missing not at Random. KDD 2010.
H. Steck. Item popularity and recommendation accuracy. RecSys 2011.
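One way to read the flat test option is sketched below, assuming that "flat" means every item contributes the same number of test ratings, so popular items no longer dominate the test set. The helper name `flat_test_split`, the `per_item` parameter and the choice to leave items with too few ratings in training only are illustrative assumptions, not details taken from the cited papers.

```python
import random
from collections import defaultdict

def flat_test_split(ratings, per_item, seed=0):
    """Split (user, item, rating) tuples so that every item in the test set
    has exactly `per_item` test ratings (hypothetical helper)."""
    rng = random.Random(seed)
    by_item = defaultdict(list)
    for row in ratings:
        by_item[row[1]].append(row)
    train, test = [], []
    for item in sorted(by_item):
        rows = by_item[item]
        if len(rows) <= per_item:
            # Too few ratings: keep the item in training only (assumption).
            train.extend(rows)
            continue
        rng.shuffle(rows)
        test.extend(rows[:per_item])
        train.extend(rows[per_item:])
    return train, test

# Toy data: item "a" is popular, "b" is mid-tail, "c" has a single rating.
RATINGS = ([(u, "a", 5) for u in range(10)]
           + [(u, "b", 4) for u in range(4)]
           + [(0, "c", 3)])
```

With `per_item=2`, items "a" and "b" each get two test ratings regardless of their popularity, and "c" stays in training.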
Removing the popularity bias in offline evaluation
B. Correcting for bias in the metrics
C. Unbiased learning
D. Unbiased datasets
A. Bellogín, P. Castells, I. Cantador. Statistical Biases in Information Retrieval Metrics for Recommender Systems. Information Retrieval 20(6), July 2017.
P. Cremonesi, Y. Koren, R. Turrin. Performance of recommender algorithms on top-n recommendation tasks. RecSys 2010.
J. M. Hernández-Lobato, N. Houlsby, Z. Ghahramani. Probabilistic Matrix Factorization with Non-random Missing Data. ICML 2014.
H. Steck. Training and Testing of Recommender Systems on Data Missing not at Random. KDD 2010.
H. Steck. Item popularity and recommendation accuracy. RecSys 2011.
Techniques: stratified recall, off-policy evaluation, inverse propensity scoring, ···
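Debiasing evaluation: Inverse Propensity Scoring
Propensity of observing the pair $(u, i)$: $P(\mathit{Observe} \mid u, i)$
What you want to measure:
$P = \frac{1}{|R|} \sum_{i \in R} \mathit{Relevant}(u,i)$
What you can measure (biased estimate):
$\hat{P} = \frac{1}{|R|} \sum_{i \in R} \mathit{Relevant}(u,i) \cdot \mathit{Observe}(u,i)$
Unbiased estimate:
$\hat{P}_{\mathrm{IPS}} = \frac{1}{|R|} \sum_{i \in R} \frac{\mathit{Relevant}(u,i) \cdot \mathit{Observe}(u,i)}{P(\mathit{Observe} \mid u,i)}$
Problems: 1) High variance, and 2) How to estimate propensity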
T. Schnabel, A. Swaminathan, A. Singh, N. Chandak, T. Joachims. Recommendations as Treatments: Debiasing Learning and Evaluation. ICML 2016.
Swaminathan, A., Krishnamurthy, A., Agarwal, A., Dudik, M., Langford, J., Jose, D., Zitouni, I. Off-policy Evaluation for Slate Recommendation. NIPS 2017.
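The difference between the biased and the inverse-propensity-scored estimate can be illustrated with a minimal sketch. It assumes that $R$ is the list of items recommended to one user and that the true propensities are known, which is rarely the case in practice. The toy items, propensity values and function names are hypothetical.

```python
import random

def naive_precision(recs, relevant, observed):
    """Precision that counts only relevant items whose rating was observed."""
    hits = sum(1 for i in recs if i in relevant and i in observed)
    return hits / len(recs)

def ips_precision(recs, relevant, observed, propensity):
    """Same sum, but each observed relevant item is weighted by 1 / P(Observe | u, i)."""
    total = sum(1.0 / propensity[i] for i in recs if i in relevant and i in observed)
    return total / len(recs)

def simulate(recs, relevant, propensity, trials=20000, seed=0):
    """Average both estimators over many simulated observation patterns."""
    rng = random.Random(seed)
    naive_sum = ips_sum = 0.0
    for _ in range(trials):
        observed = {i for i in recs if rng.random() < propensity[i]}
        naive_sum += naive_precision(recs, relevant, observed)
        ips_sum += ips_precision(recs, relevant, observed, propensity)
    return naive_sum / trials, ips_sum / trials

# Toy setting (assumed values): true precision is 2 / 4 = 0.5.
RECS = ["a", "b", "c", "d"]
RELEVANT = {"a", "c"}
PROPENSITY = {"a": 0.5, "b": 0.5, "c": 0.25, "d": 0.25}

naive_avg, ips_avg = simulate(RECS, RELEVANT, PROPENSITY)
print(f"naive: {naive_avg:.3f}  IPS: {ips_avg:.3f}  true: 0.500")
```

On average the naive estimate settles near (0.5 + 0.25) / 4 = 0.1875 while the IPS estimate settles near the true 0.5. Individual IPS values swing widely (a single observation of "c" counts four times), which is the high-variance problem listed on the slide.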
Debiasing evaluation: experiments
P. Castells, R. Cañamares. Characterization of Fair Experiments for Recommender System Evaluation – A Formal Analysis. REVEAL@RecSys 2018.
[Figure: Recall@10 on MovieLens 1M under random split, temporal split, flat test and IPS, for random, average rating value, positive rating count, user-based kNN and matrix factorization]
Debiasing evaluation: Inverse Propensity Scoring
• Playlist recommendation
• Comparison of 12 recommender systems
• Metric: impression-to-stream
1. Online (multivariate) AB test
2. Offline evaluation with IPS variants
– IPS
– Capped IPS
– Normalized capped IPS
A. Gruson, P. Chandar, C. Charbuillet, J. McInerney, S. Hansen, D. Tardieu, B. Carterette. Offline Evaluation to Make Decisions about Playlist Recommendation
Algorithms. WSDM 2019.
Spotify evaluation experiment
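The three IPS variants named above can be sketched in their common textbook form: capping clips large weights, and normalizing divides by the sum of weights instead of the sample count. This is not necessarily the exact formulation used in the cited Spotify study. The sketch assumes each logged impression is one the evaluated recommender would also have shown, that `rewards` holds 1 when the impression led to a stream, and that `propensities` holds the logging policy's probability of showing it; all names and values are hypothetical.

```python
def ips(rewards, propensities):
    """Plain IPS: weight each logged reward by 1 / propensity."""
    weights = [1.0 / p for p in propensities]
    return sum(w * r for w, r in zip(weights, rewards)) / len(rewards)

def capped_ips(rewards, propensities, cap):
    """Capped IPS: clip each weight at `cap` to limit variance (adds some bias)."""
    weights = [min(1.0 / p, cap) for p in propensities]
    return sum(w * r for w, r in zip(weights, rewards)) / len(rewards)

def normalized_capped_ips(rewards, propensities, cap):
    """Normalized capped IPS: divide by the sum of clipped weights,
    which keeps the estimate inside the range of the rewards."""
    weights = [min(1.0 / p, cap) for p in propensities]
    return sum(w * r for w, r in zip(weights, rewards)) / sum(weights)

# Toy log (assumed values): four impressions, two of them streamed.
REWARDS = [1, 0, 1, 0]
PROPENSITIES = [0.5, 0.25, 0.1, 1.0]

print(ips(REWARDS, PROPENSITIES))                       # 3.0
print(capped_ips(REWARDS, PROPENSITIES, 5.0))           # 1.75
print(normalized_capped_ips(REWARDS, PROPENSITIES, 5.0))
```

Plain IPS here exceeds 1 even though the reward is a rate, which shows how a single low-propensity impression can dominate. Capping reduces that effect, and normalization brings the value back to a valid rate (7/12 in this toy log).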
Debiasing evaluation: Inverse Propensity Scoring
[Figure: system ranking from the AB test vs. system ranking from IPS, capped IPS and normalized capped IPS, one panel per estimator]
A. Gruson, P. Chandar, C. Charbuillet, J. McInerney, S. Hansen, D. Tardieu, B. Carterette. Offline Evaluation to Make Decisions about Playlist Recommendation
Algorithms. WSDM 2019.
Spotify evaluation experiment: recommender system ranking comparison
Debiasing evaluation: Inverse Propensity Scoring
[Figure: pairwise comparison of the system rankings produced by IPS, capped IPS and normalized capped IPS]
A. Gruson, P. Chandar, C. Charbuillet, J. McInerney, S. Hansen, D. Tardieu, B. Carterette. Offline Evaluation to Make Decisions about Playlist Recommendation
Algorithms. WSDM 2019.
Spotify evaluation experiment: recommender system ranking comparison
Debiasing evaluation: unbiased data
Yahoo! R3
Free user interaction
5,400 Yahoo! radio users
10 random tracks per user
MNAR training data
MAR test data
B. Marlin, R. Zemel. Collaborative prediction and ranking with non-random missing data. RecSys 2009.
130K ratings
1,000 music tracks
randomly sampled
Unbiased test data: experiments
[Figure: Recall@10 on Yahoo! R3 and CM100K under three evaluation settings: MNAR random split, IPS, and MAR test]
P. Castells, R. Cañamares. Characterization of Fair Experiments for Recommender System Evaluation – A Formal Analysis. REVEAL@RecSys 2018.
Matrix factorization
Random
Average rating value
Positive rating count
User-based kNN
Bias in recommendation
Is bias bad?
How bad?
Outline
1. Bias and fairness
2. Removing the bias
3. Understanding the bias
4. Conclusion
Can we trust our experiments?
Observed metric value: computed on available user taste observations
≈ ?
True metric value: computed with full knowledge of user tastes
[Figure: two user-item matrices, one with relevant, non relevant and missing ratings, the other with full knowledge of relevant and non relevant items]
R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
Understanding the bias
Observation vs. relevance
[Figure: user-item matrices for $\mathit{Observe}(u,i)$ (observed vs. missing rating) and $\mathit{Relevant}(u,i)$ (relevant vs. non relevant), and their conjunction $\mathit{Relevant}(u,i) \wedge \mathit{Observe}(u,i)$, shown over several animation steps]
R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
Formal analysis
Very simple questions
1. Does popularity help or hurt recommendation effectiveness?
2. Which is better, the majority taste (positive rating count)
or the higher consensus (average rating value)?
3. Do biased metric values agree with true (unbiased) values?
R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
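The two non-personalized rankings contrasted in question 2 can be made concrete with a small sketch. It assumes a rating counts as positive when it is at least 4 on a 1 to 5 scale; that threshold, the toy ratings and the function name are illustrative choices.

```python
from collections import defaultdict

def rank_items(ratings, threshold=4):
    """Return two rankings of the items in (user, item, rating) tuples:
    by positive rating count (largest majority) and by average rating
    value (highest consensus)."""
    positives = defaultdict(int)
    total = defaultdict(float)
    count = defaultdict(int)
    for _, item, value in ratings:
        count[item] += 1
        total[item] += value
        if value >= threshold:
            positives[item] += 1
    items = sorted(count)  # alphabetical order breaks ties deterministically
    majority = sorted(items, key=lambda i: -positives[i])
    consensus = sorted(items, key=lambda i: -(total[i] / count[i]))
    return majority, consensus

# Toy data: "x" is widely rated with mixed opinions, "y" has few but unanimous fans.
RATINGS = [
    (1, "x", 5), (2, "x", 4), (3, "x", 2), (4, "x", 1), (5, "x", 5),
    (1, "y", 5), (2, "y", 5),
    (3, "z", 3),
]

majority, consensus = rank_items(RATINGS)
print(majority)   # ['x', 'y', 'z']
print(consensus)  # ['y', 'x', 'z']
```

The two criteria disagree on the top item: "x" has more positive ratings, while "y" has the higher average.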
Research questions
[Figure: effectiveness scale from random recommendation up to optimal non-personalized recommendation and optimal recommendation; where highest consensus, largest majority, personalized recommendations and bad personalized recommendations fall on this scale is left open (?)]
R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
Where does popularity come from?
[Figure: items sorted by #interactions, with items a and b marked]
What made a be so much more popular than b?
R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
Rating generation
[Diagram: rating generation for user $u$ and item $i$ through the variables $\mathit{Discover}(u,i)$, $\mathit{Engage}(u,i)$, $\mathit{Observe}(u,i)$ and $\mathit{Relevant}(u,i)$; the recommendation algorithm receives $\mathit{Relevant}(u,i) \wedge \mathit{Observe}(u,i)$]
R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
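Conditional (in)dependences between variables
[Diagram: user $u$, item $i$ and the variables $\mathit{Discover}(u,i)$, $\mathit{Engage}(u,i)$, $\mathit{Observe}(u,i)$, $\mathit{Relevant}(u,i)$; simplified dependence graph over $i$, $\mathit{Discover}$, $\mathit{Relevant}$ and $\mathit{Observe}$]
Popularity distribution over items (#interactions): $p(\mathit{Observe} \mid i)$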
R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
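Conditional (in)dependences between variables
$p(O \mid i) = p(O \mid R, i)\, p(R \mid i) + p(O \mid \neg R, i)\, p(\neg R \mid i)$
$p(O \mid R, i) = p(O \mid D, R, i)\, p(D \mid R, i)$
[Diagram: user $u$, item $i$ and the variables abbreviated as $D_{u,i}$, $E_{u,i}$, $O_{u,i}$, $R_{u,i}$; simplified dependence graph over $i$, $D$, $R$ and $O$]
Popularity distribution over items (#interactions): $p(O \mid i)$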
Conditional (in)dependences between variables
$p(O \mid R, i) = p(O \mid D, R, i)\, p(D \mid R, i)$
Popularity distribution over items (#interactions): $p(O \mid i)$
Three cases, each with its own dependence graph over $i$, $D$, $R$ and $O$:
1. Observation depends just on relevance
2. Observation independent from relevance
3. Observation depends on both items and relevance
Findings
1. Observation conditionally independent from item
[Figure: effectiveness scale from random recommendation to optimal non-personalized recommendation and optimal recommendation, placing highest consensus and largest majority]
Biased and unbiased precision agree: biased $\hat{P} \propto$ unbiased $P$
Even if $P(\mathit{Observed} \mid \neg\mathit{Relevant}) > P(\mathit{Observed} \mid \mathit{Relevant})$
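A short sketch of why case 1 yields agreement, written for a non-personalized recommendation of a single item $i$. It is an illustration under the slide's assumption and not the full argument of the cited paper. The symbols $P(i)$ and $\hat{P}(i)$ for the expected true and observed precision of recommending $i$ are introduced here for convenience.

```latex
\begin{align*}
P(i)       &= p(R \mid i) \\
\hat{P}(i) &= p(R, O \mid i) = p(O \mid R, i)\, p(R \mid i) \\
           &= p(O \mid R)\, p(R \mid i)
              && \text{(case 1: } O \text{ is independent of } i \text{ given } R\text{)} \\
           &\propto P(i)
\end{align*}
```

The factor $p(O \mid R)$ is the same for every item, so ranking items by $\hat{P}$ or by $P$ gives the same order. The term $p(O \mid \neg R)$ never enters the observed precision, which is why the agreement holds even when non-relevant items are more likely to be observed than relevant ones.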
Findings
2. Observation conditionally independent from relevance
a) Observation correlates with relevance
[Figure: effectiveness scale from random recommendation to optimal non-personalized recommendation and optimal recommendation, placing highest consensus and largest majority]
Biased and unbiased precision agree: biased $\hat{P} \propto$ unbiased $P$
Findings
2. Observation conditionally independent from relevance
b) Observation does not correlate with relevance
[Figure: two effectiveness scales, biased $\hat{P}$ and unbiased $P$, from random recommendation to optimal non-personalized recommendation and optimal recommendation; highest consensus and largest majority are placed on each scale]
Biased and unbiased precision disagree
Findings
2. Observation conditionally independent from relevance
c) Observation correlates negatively with relevance
[Figure: two effectiveness scales, biased $\hat{P}$ and unbiased $P$, from random recommendation to optimal non-personalized recommendation and optimal recommendation; highest consensus and largest majority are placed on each scale]
Biased and unbiased precision disagree!!
Findings
3. No assumption
$\mathbb{E}[P@1 \mid \theta] = \int_{\Omega^n} \mathbb{E}[P@1 \mid \theta, \omega]\, d\omega$
[Figure: two effectiveness scales, biased $\hat{P}$ and unbiased $P$, from random recommendation to optimal non-personalized recommendation and optimal recommendation; highest consensus and largest majority are placed on each scale]
Biased and unbiased precision disagree
Dependence between observation and relevance
1. Observation conditionally independent from item
For example…
• Find items through search engines, good recommender systems, good friends
• Rational herd behavior
• Rate based on whether you like
[Figure: per-item breakdown into relevant rated, relevant unrated, non-relevant rated and non-relevant unrated]
Dependence between observation and relevance
1. Observation conditionally independent from item
[Figure: per-item breakdown into relevant rated, relevant unrated, non-relevant rated and non-relevant unrated]
Mellow Barcelona: I found and chose this nice hotel
Gates Diagonal: I never saw this one
Relevance possibly explains the resulting observation I produced in Booking.com
Dependence between observation and relevance
1. Observation conditionally independent from item
[Figure: per-item breakdown into relevant rated, relevant unrated, non-relevant rated and non-relevant unrated]
Other possible examples…
Dependence between observation and relevance
2. Observation conditionally independent from relevance
For example…
• Heavy (and/or good) advertisement
• Social conformity, fashion
• Reinforcement loops
• Randomness + snowball effects
[Figure: per-item breakdown into relevant rated, relevant unrated, non-relevant rated and non-relevant unrated, in three panels: a) positive correlation, b) no correlation, c) negative correlation]
Typical case
[Figure: effectiveness scale from random recommendation to optimal non-personalized recommendation, placing highest consensus and largest majority]
Biased $\hat{P} \propto$ unbiased $P$
Empirical results would suggest the typical case is a mix of
1. Relevance dependence
2. Item dependence with a) positive correlation
Observation bias stronger than relevance
Typical case
Typical case would seem a combination of these two:
1. Observation conditionally independent from item
2. Observation conditionally independent from relevance, a) positive correlation
[Figure: for each of the two cases, per-item breakdown into relevant rated, relevant unrated, non-relevant rated and non-relevant unrated]
Closing the loop: relevance + novelty
CM100K (ir.ii.uam.es/cm100k)
1,000 music tracks randomly sampled from deezer.com
1,000 users
100 MAR judgments
User is familiar with / User is not familiar with
R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
Closing the loop: relevance + novelty
[Figure: nDCG@10 on undiscovered items, CM100K]
R. Cañamares, P. Castells. From the PRP to the Low Prior Discovery Recall Principle for Recommender Systems. SIGIR 2018.
Implications on personalized algorithms
[Figure: nDCG@10 of non-normalized kNN (biased to popularity) and normalized kNN (biased to avg rating), observed on MovieLens 1M, and observed vs. true on CM100K]
Biased evaluation: non-normalized > normalized
Unbiased evaluation: non-normalized < normalized
R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
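The two kNN variants compared above can be sketched in their usual user-based form, which may differ in detail from the formulation in the cited papers. The non-normalized score sums similarity-weighted neighbor ratings, so it grows with the number of neighbors who rated the item. The normalized score divides by the summed similarity of those neighbors, which turns it into a weighted average rating. The function name and toy data are hypothetical, and similarities are assumed positive.

```python
def knn_scores(neighbors, ratings, normalize):
    """Score items for a target user.

    neighbors: {neighbor_id: similarity to the target user}
    ratings:   {neighbor_id: {item: rating}}
    """
    numerator, denominator = {}, {}
    for neighbor, sim in neighbors.items():
        for item, rating in ratings.get(neighbor, {}).items():
            numerator[item] = numerator.get(item, 0.0) + sim * rating
            denominator[item] = denominator.get(item, 0.0) + sim
    if normalize:
        # Weighted average of neighbor ratings: favors well-liked items.
        return {item: numerator[item] / denominator[item] for item in numerator}
    # Plain weighted sum: favors items rated by many neighbors.
    return numerator

# Toy data: "A" is rated by all neighbors with a middling score,
# "B" by a single neighbor with the top score.
NEIGHBORS = {"v1": 0.5, "v2": 0.5, "v3": 0.5}
RATINGS = {"v1": {"A": 3, "B": 5}, "v2": {"A": 3}, "v3": {"A": 3}}

print(knn_scores(NEIGHBORS, RATINGS, normalize=False))  # {'A': 4.5, 'B': 2.5}
print(knn_scores(NEIGHBORS, RATINGS, normalize=True))   # {'A': 3.0, 'B': 5.0}
```

The non-normalized variant ranks the popular item "A" first and the normalized variant ranks the highly rated item "B" first, matching the "biased to popularity" and "biased to avg rating" labels in the figure.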
Generalization to other biases
Complex observation biases
[Figure: user-item matrices of $\mathit{Relevant}(u,i)$ (relevant vs. non relevant) and $\mathit{Observe}(u,i)$ (observed)]
R. Cañamares, P. Castells. A Probabilistic Reformulation of Memory-Based Collaborative Filtering – Implications on Popularity Biases. SIGIR 2017.
Outline
1. Bias and fairness
2. Removing the bias
3. Analysis of popularity in recommendation
4. Conclusion
Conclusions
• Popularity can be OK as long as it emerges out of relevance
• MNAR offline evaluation tends to agree with MAR evaluation
– Ratings appear to depend on both relevance and items
– Item dependence may be stronger, but tends to agree with relevance
– User bias to rate relevant or non-relevant items should not make a difference
• Consensus seems slightly better behaved than majority
– And much better at novel relevant findings
• No universal solution to deal with bias: understand the bias
– Caution with possible scenarios of strong item dependence that is uncorrelated with, or goes against, relevance
• Analysis can be generalized to other biases and features
Ongoing and future directions
• Inverse propensity scoring
• Unbiased datasets
• Popularity bias in false-positive metrics
• Popularity from social network dynamics
• Multi-armed bandit recommendation algorithms
– Specific algorithms (e.g. bandit kNN)
– Better understanding of feedback loop effects and how to cope with them
R. Cañamares, M. Redondo, P. Castells. Multi-Armed Recommender System Bandit Ensembles. RecSys 2019.
J. Sanz-Cruzado, E. López, P. Castells. A Simple Multi-Armed Nearest-Neighbor Bandit for Interactive Recommendation. RecSys 2019.

Bias in recommendation: avoid it or embrace it?

  • 1. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Bias in recommendation: avoid it or embrace it? Pablo Castells Universidad Autónoma de Madrid http://ir.ii.uam.es/castells Amazon, Barcelona, February 17, 2020
  • 2. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Outline 1. Bias and fairness 2. Removing the bias 3. Understanding the bias 4. Conclusion
  • 3. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 1. Bias and fairness 2. Removing the bias 3. Understanding the bias 4. Conclusion Outline
  • 4. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Bias in search  “Search engine manipulation is a serious threat to the democratic system of government”  Google could manipulate 2.6 – 10M votes Before Pro-Clinton Pro-Trump Date After Robert Epstein R. Epstein, R. E. Robertson. A Method for Detecting Bias in Search Rankings, with Evidence of Systematic Bias Related to the 2016 Presidential Election. White paper, American Institute for Behavioral Research and Technology, June 2017. R. Epstein, R. E. Robertson. The search engine manipulation effect (SEME) and its possible impact on the outcomes of elections. PNAS 112(33), August 2015. Election day
  • 5. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 COMPAS system: ground truth evaluation proves bias Black 45% White 23% Black 28% White 48% False positives False negatives 𝑃 FP Black ≫ 𝑃 FP 𝑃 FN Black ≪ 𝑃 FN Recidivism prediction
  • 6. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Bias = under / over-represented features Female 53% Male 47% Female 32% Male 68% PhDs (USA) Full professors (USA) 𝑃 Female Professor ≪ 𝑃 Female PhD 𝑃 Female Professor ≪ 𝑃 Female
  • 7. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Bias = under / over-represented features Female 16% Male 84% PhDs (Spain) Full professors (Spain) 𝑃 Female Professor ≪ 𝑃 Female PhD 𝑃 Female Professor ≪ 𝑃 Female Female 51% Male 49%
  • 8. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Bias = under / over-represented features Other 38% White male 62% Other 9% White male 91% Fortune 500 employees Fortune 500 CEOs 𝑃 White male CEO ≫ 𝑃 White male Employee
  • 9. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Bias in algorithms Bias in many application domains  Face recognition / surveillance  Recruiting  Loans  News, social media  Search  ··· Typically the bias is in the data, in the history – the algorithm learns and reproduces / amplifies the human bias Baeza-Yates, R. Bias on the Web. Communications of the ACM 61(6), May 2018.
  • 10. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Bias in search engines’ results?  Promote own services  Gender and ethnic stereotypes – In autocomplete – In spelling correction  Impact on people’s perceptions (e.g. shift voting)  Relevance bias
  • 11. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Bias is… bad?
  • 12. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 The popularity bias in information retrieval  Popularity: “unidimensional” bias – The overrepresented “feature” is the item itself – Analysis can be generalized to any feature  Issues related to user satisfaction – Does the bias hurt the system effectiveness – Does the bias distort evaluation
  • 13. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Bias in search  In input – “Popularity” bias in queries – “Popularity” bias in click logs – “Popularity” bias in sales – “Popularity” bias in Web links – Position bias in clicks  In output – Sellers in search results – Expose the catalog  Bias in offline evaluation
  • 14. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Bias in recommendation Items Users Popular items Rest of items (long tail) Items Nºinteractions In the (input) data Popular items Long-tail items
  • 15. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 0 1000 2000 0 1000 2000 0 400 800 0 1000 2000 0 1000 2000 3000 0 1000 2000 Bias in recommendation In algorithms (output) Matrix factorization Nº positive ratings Nºtimestop10 800 400 0 0 1000 2000 User-based kNN Nº positive ratings 2000 1000 0 0 1000 2000 Item-based kNN Nº positive ratings 3000 1000 0 0 1000 2000 0 2000 4000 0 1000 2000 Oracle optimal !! Nº positive ratings 4000 2000 0 0 1000 2000 2000 R. Cañamares, P. Castells. A Probabilistic Reformulation of Memory-Based Collaborative Filtering – Implications on Popularity Biases. SIGIR 2017. D. Jannach et al. What recommenders recommend: an analysis of recommendation biases and possible countermeasures. UMUAI 25(5), Dec. 2015. MovieLens 1M dataset  1M ratings, 6K users, 4K items  Random rating split 80% training / 20% test
  • 16. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Bias in recommendation Netflix dataset  100M ratings, 0.5M users, 18K items  Random rating split 80% training / 20% test Random Positive rating count User-based kNN Matrix factorization0.3 0.2 0.1 0 nDCG@10 Average rating value In offline evaluation R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018. P. Cremonesi, Y. Koren, R. Turrin.. Performance of recommender algorithms on top-n recommendation tasks. RecSys 2010. D. Jannach et al. What recommenders recommend: an analysis of recommendation biases and possible countermeasures. UMUAI 25(5), Dec. 2015. 0 0.1 0.2 0.3 Random Avg.rating Nr.ratings User-based Matrixfact. nDCG@10 Netflix
  • 17. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Bias in recommendation What to do about the bias?
  • 18. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Outline 1. Bias and fairness 2. Removing the bias 3. Understanding the bias 4. Conclusion
  • 19. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 1. Bias and fairness 2. Removing the bias 3. Understanding the bias 4. Conclusion Outline
  • 20. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 What to do about the bias Answer 1 – Bias is bad ⇒ Remove the bias in your recommendations
  • 21. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Avoiding popularity: novelty and diversity Avoid popularity in the output recommendations  Novelty: limited value of popular recommendations Try to move towards the long tail  Diversity / fairness: avoid filter bubble and concentration over few items Give all items some chance to be exposed → Reranking, multiarmed bandits, etc. Items #interactions 𝑎 𝑏 P. Castells, N. J. Hurley, S. Vargas. Novelty and Diversity in Recommender Systems. In Recommender Systems Handbook, 2nd edition. Springer, 2015.
  • 22. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Avoiding popularity: novelty and diversity Context Recommended item Target user’s experience Everyone else’s experience Everyone else’s recommendations Other items in the same recommendation Unexpectedness Intra-list diversity Long-tail novelty Sales diversity Distance or identity Item novelty model P. Castells, N. J. Hurley, S. Vargas. Novelty and Diversity in Recommender Systems. In Recommender Systems Handbook, 2nd edition. Springer, 2015. Problem solved?
  • 23. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Avoiding popularity: novelty and diversity Novelty / diversity Relevance Stimulation P. Castells, N. J. Hurley, S. Vargas. Novelty and Diversity in Recommender Systems. In Recommender Systems Handbook, 2nd edition. Springer, 2015.
  • 24. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Avoiding popularity: novelty and diversity Items #interactions 𝑎 𝒃 𝒄 Still a bias
  • 25. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 What to do about the bias Answer 2 – Bias is bad ⇒ Remove the bias in offline evaluation
  • 26. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Popularity bias in offline evaluation Popular items (short head) Rest of items (long tail) Observed user-item interaction Unobserved preference Items Users Ratings are missing not at random (MNAR) R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
  • 27. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Positive rating count Popularity bias in offline evaluation Test data (relevant items) Training data Unobserved preference Items Users Popular items (short head) Rest of items (long tail) avg P@𝑘 ∼ + 𝑘 R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018. Ratings are missing not at random (MNAR) 0.3 0.2 0.1 0 nDCG@10 0 0.1 0.2 0.3 Random Avg.rating Nr.ratings User-based Matrixfact. nDCG@10 Netflix
  • 28. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Removing the popularity bias in offline evaluation A. Handling the (test) data Items Items #ratings Flat test Popularity strata Time Temporal split Test data (relevant items) Training data Unobserved preference A. Bellogín, P. Castells, I. Cantador. Statistical Biases in Information Retrieval Metrics for Recommender Systems. Information Retrieval 20(6), July 2017. P. Cremonesi, Y. Koren, R. Turrin.. Performance of recommender algorithms on top-n recommendation tasks. RecSys 2010. H. Steck. Training and Testing of Recommender Systems on Data Missing not at Random. KDD 2010. H. Steck. Item popularity and recommendation accuracy. RecSys 2011.
  • 29. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Removing the popularity bias in offline evaluation B. Correcting for bias in the metrics C. Unbiased learning D. Unbiased datasets A. Bellogín, P. Castells, I. Cantador. Statistical Biases in Information Retrieval Metrics for Recommender Systems. Information Retrieval 20(6), July 2017. P. Cremonesi, Y. Koren, R. Turrin.. Performance of recommender algorithms on top-n recommendation tasks. RecSys 2010. J. M. Hernández-Lobato, N. Houlsby, Z. Ghahramani. Probabilistic Matrix Factorization with Non-random Missing Data. ICML 2014. H. Steck. Training and Testing of Recommender Systems on Data Missing not at Random. KDD 2010. H. Steck. Item popularity and recommendation accuracy. RecSys 2011. Stratified recall Off-policy evaluation Inverse propensity scoring ···
  • 30. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 Debiasing evaluation: Inverse Propensity Scoring 𝑃 𝑂𝑏𝑠𝑒𝑟𝑣𝑒 𝑢,𝑖 𝑢 𝑖 𝑃 = 1 𝑅 ෍ 𝑖∈𝑅 𝑅𝑒𝑙𝑒𝑣𝑎𝑛𝑡 𝑢,𝑖 What you want to measure 𝑃 = 1 𝑅 ෍ 𝑖∈𝑅 𝑅𝑒𝑙𝑒𝑣𝑎𝑛𝑡 𝑢,𝑖 · 𝑂𝑏𝑠𝑒𝑟𝑣𝑒 𝑢,𝑖 What you can measure Biased estimate Problems: 1) High variance, and 2) How to estimate propensity 𝑃 = 1 𝑅 ෍ 𝑖∈𝑅 𝑅𝑒𝑙𝑒𝑣𝑎𝑛𝑡 𝑢,𝑖 · 𝑂𝑏𝑠𝑒𝑟𝑣𝑒 𝑢,𝑖 𝑃 𝑂𝑏𝑠𝑒𝑟𝑣𝑒 𝑢,𝑖 Unbiasedestimate T. Schnabel, A. Swaminathan, A. Singh, N. Chandak, T. Joachims. Recommendations as Treatments: Debiasing Learning and Evaluation. ICML 2016. Swaminathan, A., Krishnamurthy, A., Agarwal, A., Dudik, M., Langford, J., Jose, D., Zitouni, I. Off-policy Evaluation for Slate Recommendation. NIPS 2017.
  • 31. IRGIRGroup @UAM Bias in recommendation: avoid it or embrace it? Amazon, Barcelona, February 17, 2020 0.005 0 0.01 0.015 0 0.04 0.080.08 0 0.04 0 0.1 0.20.2 0.1 0 0.005 0 0.01 0.015 0 0.04 0.080.08 0 0.04 0 0.1 0.20.2 0.1 0 0.005 0 0.01 0.015 0 0.04 0.080.08 0 0.04 0 0.1 0.20.2 0.1 0 Debiasing evaluation: experiments P. Castells, R. Cañamares. Characterization of Fair Experiments for Recommender System Evaluation – A Formal Analysis. REVEAL@RecSys 2018. Temporal split IPSRandom split MovieLens 1M Recall@10 Flat test Matrix factorization Random Average rating value Positive rating count User-based kNN 0.05 0 0.1
  • 32. Debiasing evaluation: Inverse Propensity Scoring. Spotify evaluation experiment: playlist recommendation; comparison of 12 recommender systems; metric: impression-to-stream. 1. Online (multivariate) AB test. 2. Offline evaluation with IPS variants: IPS, capped IPS, normalized capped IPS. A. Gruson, P. Chandar, C. Charbuillet, J. McInerney, S. Hansen, D. Tardieu, B. Carterette. Offline Evaluation to Make Decisions about Playlist Recommendation Algorithms. WSDM 2019.
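The slide names the three IPS variants without giving formulas. The sketch below uses the common textbook forms (clipping the inverse-propensity weight at a cap, and self-normalizing by the sum of weights); these are assumptions of this example and not necessarily the exact estimators of the cited paper. The rewards, propensities and cap value are made up.

```python
def capped_ips(rewards, propensities, cap):
    """IPS with each weight 1/p clipped at `cap` to limit variance."""
    weights = [min(1.0 / p, cap) for p in propensities]
    return sum(w * r for w, r in zip(weights, rewards)) / len(rewards)


def normalized_capped_ips(rewards, propensities, cap):
    """Capped IPS divided by the sum of weights instead of the sample size."""
    weights = [min(1.0 / p, cap) for p in propensities]
    return sum(w * r for w, r in zip(weights, rewards)) / sum(weights)


# Hypothetical logged impressions: reward 1 if the impression led to a stream.
rewards = [1, 0, 1, 0]
propensities = [0.5, 0.25, 0.01, 0.5]

plain = capped_ips(rewards, propensities, cap=float("inf"))  # no capping
capped = capped_ips(rewards, propensities, cap=10.0)
normalized = normalized_capped_ips(rewards, propensities, cap=10.0)
print(plain, capped, normalized)
```

A single impression with propensity 0.01 dominates the plain estimate. Capping reduces its influence at the cost of some bias, and normalization keeps the result within the range of the observed rewards.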
  • 33. Debiasing evaluation: Inverse Propensity Scoring. Spotify evaluation experiment: recommender system ranking comparison. [Figure: system rankings from the AB test plotted against IPS, capped IPS and normalized capped IPS; axes run from 0 to 12.] A. Gruson, P. Chandar, C. Charbuillet, J. McInerney, S. Hansen, D. Tardieu, B. Carterette. Offline Evaluation to Make Decisions about Playlist Recommendation Algorithms. WSDM 2019.
  • 34. Debiasing evaluation: Inverse Propensity Scoring. Spotify evaluation experiment: recommender system ranking comparison. [Figure: system rankings from IPS, capped IPS and normalized capped IPS plotted against one another; axes run from 0 to 12.] A. Gruson, P. Chandar, C. Charbuillet, J. McInerney, S. Hansen, D. Tardieu, B. Carterette. Offline Evaluation to Make Decisions about Playlist Recommendation Algorithms. WSDM 2019.
  • 35. Debiasing evaluation: unbiased data. Yahoo! R3: 5,400 Yahoo! radio users; 1,000 music tracks randomly sampled; 130K ratings. Free user interaction: MNAR training data. 10 random tracks per user: MAR test data. B. Marlin, R. Zemel. Collaborative prediction and ranking with non-random missing data. RecSys 2009.
  • 36. Unbiased test data: experiments. [Figure: Recall@10 on Yahoo! R3 and CM100K under MNAR random split, IPS and MAR test, for matrix factorization, user-based kNN, positive rating count, average rating value and random recommendation.] P. Castells, R. Cañamares. Characterization of Fair Experiments for Recommender System Evaluation – A Formal Analysis. REVEAL@RecSys 2018.
  • 37. Bias in recommendation. Is bias bad? How bad?
  • 38. Outline: 1. Bias and fairness 2. Removing the bias 3. Understanding the bias 4. Conclusion
  • 40. Can we trust our experiments? Observed metric value (computed on available user taste observations) ≈ ? True metric value (computed with full knowledge of user tastes). [Figure: two users × items matrices; legend: relevant, non-relevant, missing ratings.] R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
  • 41. Understanding the bias: observation vs. relevance. [Figure: users × items matrices for $\mathit{Relevant}(u,i)$ and $\mathit{Observe}(u,i)$; legend: relevant, non-relevant, observed.] R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
  • 42. Understanding the bias: observation vs. relevance. [Figure: users × items matrix for $\mathit{Observe}(u,i) \wedge \mathit{Relevant}(u,i)$; legend: relevant, non-relevant, observed.] R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
  • 43. Understanding the bias: observation vs. relevance. [Figure: users × items matrix for $\mathit{Relevant}(u,i) \wedge \mathit{Observe}(u,i)$; legend: relevant, non-relevant, missing rating.] R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
  • 44. Understanding the bias: observation vs. relevance. [Figure: users × items matrices for $\mathit{Relevant}(u,i)$ and $\mathit{Observe}(u,i)$; legend: relevant, non-relevant, observed.] R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
  • 48. Formal analysis. Very simple questions: 1. Does popularity help or hurt recommendation effectiveness? 2. Which is better, the majority taste (positive rating count) or the higher consensus (average rating value)? 3. Do biased metric values agree with true (unbiased) values? R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
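The two non-personalized rankings contrasted in question 2 can be sketched in a few lines of Python. The ratings are made up, and the threshold that defines a "positive" rating (4 on a 1 to 5 scale) is an assumption of this example, not a value given in the talk.

```python
def rank_by_positive_count(ratings, threshold=4):
    """Majority taste: rank items by the number of ratings >= threshold."""
    counts = {item: sum(1 for r in rs if r >= threshold) for item, rs in ratings.items()}
    return sorted(counts, key=counts.get, reverse=True)


def rank_by_average(ratings):
    """Consensus: rank items by their average rating value."""
    averages = {item: sum(rs) / len(rs) for item, rs in ratings.items()}
    return sorted(averages, key=averages.get, reverse=True)


# Hypothetical ratings per item on a 1-5 scale.
ratings = {
    "a": [5, 4, 4, 2, 1, 1],  # widely rated, mixed opinions
    "b": [5, 5],              # few ratings, all positive
    "c": [4, 3],
}

majority_ranking = rank_by_positive_count(ratings)
consensus_ranking = rank_by_average(ratings)
print(majority_ranking, consensus_ranking)
```

Item "a" tops the majority ranking because many users rated it, yet it comes last by consensus. The two criteria can therefore order the same items very differently.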
  • 49. Research questions. [Diagram: a scale of recommendation effectiveness with optimal recommendation, optimal non-personalized recommendation, personalized recommendations, bad personalized recommendations and random recommendation; the positions of highest consensus and largest majority are marked with question marks.] R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
  • 50. Where does popularity come from? [Figure: number of interactions per item, with two items $a$ and $b$ highlighted.] What made $a$ so much more popular than $b$? R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
  • 51. Rating generation. [Diagram: user $u$ and item $i$ with the variables $\mathit{Discover}(u,i)$, $\mathit{Engage}(u,i)$, $\mathit{Relevant}(u,i)$ and $\mathit{Observe}(u,i)$; the box labelled "Rec algorithm" is shown with $\mathit{Relevant}(u,i) \wedge \mathit{Observe}(u,i)$.] R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
  • 52. Conditional (in)dependences between variables. [Diagram: user $u$ and item $i$ with $\mathit{Discover}(u,i)$, $\mathit{Engage}(u,i)$, $\mathit{Relevant}(u,i)$ and $\mathit{Observe}(u,i)$; dependence graph over $i$, $\mathit{Discover}$, $\mathit{Relevant}$ and $\mathit{Observe}$.] Popularity distribution (number of interactions per item): $p(\mathit{Observe} \mid i)$. R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
  • 53. Conditional (in)dependences between variables. Popularity distribution (number of interactions per item): $p(O \mid i) = p(O \mid R, i)\, p(R \mid i) + p(O \mid \neg R, i)\, p(\neg R \mid i)$, where $p(O \mid R, i) = p(O \mid D, R, i)\, p(D \mid R, i)$. [Diagram: user $u$ and item $i$ with $D(u,i)$, $E(u,i)$, $O(u,i)$ and $R(u,i)$; dependence graph over $i$, $D$, $R$ and $O$.]
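As a worked instance of the decomposition on this slide, the following Python sketch computes $p(O \mid i)$ for a single item from made-up component probabilities. It assumes that the discovery factorization shown for relevant items applies in the same form to non-relevant items; the parameter names are illustrative.

```python
def p_observe_given_item(p_rel, p_disc_rel, p_disc_nonrel,
                         p_obs_disc_rel, p_obs_disc_nonrel):
    """Popularity of one item, p(O|i), by the law of total probability.

    p_rel             : p(R|i)
    p_disc_rel        : p(D|R,i)
    p_disc_nonrel     : p(D|not R,i)
    p_obs_disc_rel    : p(O|D,R,i)
    p_obs_disc_nonrel : p(O|D,not R,i)
    """
    p_obs_rel = p_obs_disc_rel * p_disc_rel           # p(O|R,i)
    p_obs_nonrel = p_obs_disc_nonrel * p_disc_nonrel  # p(O|not R,i), same form assumed
    return p_obs_rel * p_rel + p_obs_nonrel * (1.0 - p_rel)


# Hypothetical item: liked by 30% of users, discovered more often by those
# who like it, and rated more often once discovered if it is relevant.
popularity = p_observe_given_item(
    p_rel=0.3, p_disc_rel=0.5, p_disc_nonrel=0.2,
    p_obs_disc_rel=0.8, p_obs_disc_nonrel=0.4,
)
print(popularity)
```

With these numbers, relevant users contribute 0.4 × 0.3 = 0.12 and non-relevant users contribute 0.08 × 0.7 = 0.056, so the item's popularity is 0.176.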
  • 54. Conditional (in)dependences between variables. Popularity distribution (number of interactions per item): $p(O \mid i)$, with $p(O \mid R, i) = p(O \mid D, R, i)\, p(D \mid R, i)$. Three cases: 1. Observation depends just on relevance. 2. Observation independent from relevance. 3. Observation depends on both items and relevance. [Diagram: user $u$ and item $i$ with $D(u,i)$, $E(u,i)$, $O(u,i)$ and $R(u,i)$; one dependence graph over $i$, $D$, $R$ and $O$ per case.]
  • 55. Findings. 1. Observation conditionally independent from item: biased and unbiased precision agree, biased $\hat{P} \propto$ unbiased $P$, even if $P(\mathit{Observed} \mid \neg\mathit{Relevant}) > P(\mathit{Observed} \mid \mathit{Relevant})$. [Diagram: highest consensus and largest majority placed on a scale with random recommendation, optimal non-personalized recommendation and optimal recommendation.]
  • 56. Findings. 2. Observation conditionally independent from relevance, a) observation correlates with relevance: biased and unbiased precision agree, biased $\hat{P} \propto$ unbiased $P$. [Diagram: highest consensus and largest majority placed on a scale with random recommendation, optimal non-personalized recommendation and optimal recommendation.]
  • 57. Findings. 2. Observation conditionally independent from relevance, b) observation does not correlate with relevance: biased and unbiased precision disagree. [Diagram: positions of highest consensus and largest majority shown separately for biased $\hat{P}$ and unbiased $P$, on a scale with random recommendation, optimal non-personalized recommendation and optimal recommendation.]
  • 58. Findings. 2. Observation conditionally independent from relevance, c) observation correlates negatively with relevance: biased and unbiased precision disagree!! [Diagram: positions of largest majority and highest consensus shown separately for biased $\hat{P}$ and unbiased $P$, on a scale with random recommendation, optimal non-personalized recommendation and optimal recommendation.]
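A deliberately simplified Python illustration of case 2 follows; it is not the paper's derivation. It assumes a single recommended item per user, so that the true precision of recommending item $i$ is $p(R \mid i)$, and it takes the observed precision to be proportional to $p(R \mid i)\, p(O \mid i)$, since under case 2 observation depends on the item but not on relevance. All probabilities are made up.

```python
def best_by_true_precision(items):
    """Item maximizing the true P@1, taken here as p(R|i)."""
    return max(items, key=lambda i: items[i]["p_rel"])


def best_by_observed_precision(items):
    """Item maximizing the observed P@1, taken here as p(R|i) * p(O|i)."""
    return max(items, key=lambda i: items[i]["p_rel"] * items[i]["p_obs"])


# a) observation correlates positively with relevance
positive = {
    "x": {"p_rel": 0.8, "p_obs": 0.5},
    "y": {"p_rel": 0.3, "p_obs": 0.05},
}
# c) observation correlates negatively with relevance
negative = {
    "x": {"p_rel": 0.8, "p_obs": 0.05},
    "y": {"p_rel": 0.3, "p_obs": 0.5},
}

for name, items in (("positive", positive), ("negative", negative)):
    print(name, best_by_true_precision(items), best_by_observed_precision(items))
```

With positive correlation both criteria select item "x". With negative correlation the observed metric prefers "y", although "x" is relevant to far more users.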
  • 59. Findings. 3. No assumption: $\mathbb{E}[P@1 \mid \theta] = \int_{\Omega^n} \mathbb{E}[P@1 \mid \theta, \omega]\, d\omega$. Biased and unbiased precision disagree. [Diagram: positions of highest consensus and largest majority shown separately for biased $\hat{P}$ and unbiased $P$, on a scale with random recommendation, optimal non-personalized recommendation and optimal recommendation.]
  • 60. Dependence between observation and relevance. 1. Observation conditionally independent from item. For example: finding items through search engines, good recommender systems or good friends; rational herd behavior; rating based on whether you like the item. [Figure: per-item proportions of relevant rated, relevant unrated, non-relevant rated and non-relevant unrated.]
  • 61. Dependence between observation and relevance. 1. Observation conditionally independent from item. Example with two hotels (Mellow Barcelona, Gates Diagonal): "I found and chose this nice hotel" versus "I never saw this one". Relevance possibly explains the resulting observation I produced in Booking.com. [Figure: per-item proportions of relevant rated, relevant unrated, non-relevant rated and non-relevant unrated.]
  • 62. Dependence between observation and relevance. 1. Observation conditionally independent from item. Other possible examples… [Figure: per-item proportions of relevant rated, relevant unrated, non-relevant rated and non-relevant unrated.]
  • 63. Dependence between observation and relevance. 2. Observation conditionally independent from relevance: a) positive correlation, b) no correlation, c) negative correlation. For example: heavy (and/or good) advertisement; social conformity, fashion; reinforcement loops; randomness + snowball effects. [Figure: three panels, one per case, showing per-item proportions of relevant rated, relevant unrated, non-relevant rated and non-relevant unrated.]
  • 64. Dependence between observation and relevance. 2. Observation conditionally independent from relevance: a) positive correlation, b) no correlation, c) negative correlation. [Figure: three panels, one per case, showing per-item proportions of relevant rated, relevant unrated, non-relevant rated and non-relevant unrated.]
  • 65. Typical case. Empirical results would suggest the typical case is a mix of 1. relevance dependence and 2. item dependence with a) positive correlation. Observation bias stronger than relevance. Biased $\hat{P} \propto$ unbiased $P$. [Diagram: highest consensus and largest majority placed on a scale with random recommendation and optimal non-personalized recommendation.]
  • 66. Typical case. The typical case would seem a combination of these two: 1. observation conditionally independent from item, and 2. observation conditionally independent from relevance with a) positive correlation. [Figure: two panels, one per case, showing per-item proportions of relevant rated, relevant unrated, non-relevant rated and non-relevant unrated.]
  • 67. Closing the loop: relevance + novelty. CM100K (ir.ii.uam.es/cm100k): 1,000 music tracks randomly sampled from deezer.com; 1,000 users; 100 MAR judgments; tracks are labelled as "user is familiar with" or "user is not familiar with". R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
  • 68. Closing the loop: relevance + novelty. [Figure: undiscovered nDCG@10 on CM100K.] R. Cañamares, P. Castells. From the PRP to the Low Prior Discovery Recall Principle for Recommender Systems. SIGIR 2018.
  • 69. Implications on personalized algorithms. [Figure: nDCG@10 on MovieLens 1M and CM100K, observed (Obs) and true values, for non-normalized kNN (biased to popularity) and normalized kNN (biased to avg rating).] Biased evaluation: non-normalized > normalized. Unbiased evaluation: non-normalized < normalized. R. Cañamares, P. Castells. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. SIGIR 2018.
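The slide does not define the two kNN variants. The Python sketch below assumes the standard user-based formulation, in which the non-normalized score sums similarity-weighted neighbor ratings and the normalized score divides that sum by the total similarity of the neighbors who rated the item. The similarities and ratings are made up.

```python
def knn_scores(neighbor_sims, ratings, normalize):
    """Score items for a target user from neighbor similarities and ratings.

    neighbor_sims : {neighbor: similarity to the target user}
    ratings       : {item: {neighbor: rating}}
    """
    scores = {}
    for item, by_neighbor in ratings.items():
        raters = [v for v in by_neighbor if v in neighbor_sims]
        sim_sum = sum(neighbor_sims[v] for v in raters)
        if sim_sum == 0:
            continue  # no neighbor rated this item
        weighted = sum(neighbor_sims[v] * by_neighbor[v] for v in raters)
        scores[item] = weighted / sim_sum if normalize else weighted
    return scores


# Hypothetical neighborhood of the target user.
neighbor_sims = {"v1": 0.5, "v2": 0.5, "v3": 0.5}
ratings = {
    "popular": {"v1": 3, "v2": 3, "v3": 3},  # rated by everyone, lukewarm
    "niche": {"v1": 5},                      # rated once, very positive
}

non_normalized = knn_scores(neighbor_sims, ratings, normalize=False)
normalized = knn_scores(neighbor_sims, ratings, normalize=True)
print(non_normalized, normalized)
```

Without normalization the score grows with the number of neighbors who rated the item, so the popular item wins. With normalization the score becomes a weighted average rating, and the niche item wins.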
  • 70. Generalization to other biases: complex observation biases. [Figure: users × items matrices for $\mathit{Relevant}(u,i)$ and $\mathit{Observe}(u,i)$; legend: relevant, non-relevant, observed.] R. Cañamares, P. Castells. A Probabilistic Reformulation of Memory-Based Collaborative Filtering – Implications on Popularity Biases. SIGIR 2017.
  • 71. Outline: 1. Bias and fairness 2. Removing the bias 3. Analysis of popularity in recommendation 4. Conclusion
  • 73. Conclusions. Popularity can be ok as long as it emerges out of relevance. MNAR offline evaluation tends to agree with MAR evaluation: ratings appear to depend on both relevance and items; item dependence may be stronger, but tends to agree with relevance; user bias to rate relevant or non-relevant items should not make a difference. Consensus seems slightly better behaved than majority, and much better at novel relevant findings. No universal solution to deal with bias: understand the bias, and be cautious with possible scenarios where a strong item dependence is uncorrelated with relevance or runs against it. The analysis can be generalized to other biases and features.
  • 74. Ongoing and future directions. Inverse propensity scoring. Unbiased datasets. Popularity bias in false-positive metrics. Popularity from social network dynamics. Multi-armed bandit recommendation algorithms: specific algorithms (e.g. bandit kNN); better understanding of feedback loop effects and how to cope with them. R. Cañamares, M. Redondo, P. Castells. Multi-Armed Recommender System Bandit Ensembles. RecSys 2019. J. Sanz-Cruzado, E. López, P. Castells. A Simple Multi-Armed Nearest-Neighbor Bandit for Interactive Recommendation. RecSys 2019.