The document summarizes a presentation given by Rocío Cañamares and Pablo Castells from the Universidad Autónoma de Madrid. The presentation explored how social network effects can influence popularity biases in recommender systems. It described a stochastic model of social communication and rating behavior to simulate how items become popular. The model was used to run simulation-based experiments that analyzed the relationship between popularity, relevance, and recommendation precision under different scenarios. The experiments aimed to determine when popularity is an effective recommendation strategy compared to random recommendations.
1. Exploring social network effects on popularity biases in recommender systems
Rocío Cañamares and Pablo Castells
IR Group (IRG) @ UAM, Universidad Autónoma de Madrid
http://ir.ii.uam.es
6th ACM RecSys Workshop on Recommender Systems and the Social Web – RSWeb 2014
Foster City, CA, 6 October 2014
2. Outline of my talk
Why is popularity effective?
When is popularity effective?
– How does an item become popular?
– A stochastic model of social communication and rating behavior
Simulation-based experiments for “what if” scenarios
Conclusions
3. The effectiveness of popularity in top-k recommendation
Popularity tests well for top-k precision in offline experiments
(Cremonesi et al., RecSys 2010, etc.)
But… does this reflect true precision?
…or might there be an artificial bias that rewards popular items
in the offline experimental procedure?
There is of course the issue of lack of novelty, but we shall focus
here on accuracy
4. Why is popularity effective?
5. Why is popularity rank an effective recommendation strategy?
The good old rating matrix…
(figure: a users × items matrix whose cells are either observed user-item interactions or unobserved preferences)
6. Why is popularity rank an effective recommendation strategy?
Rating matrix in practice
(figure: observed interactions concentrate on a few popular items, the short head; the rest of the items, the long tail, are mostly unobserved preference)
7. Why is popularity rank an effective recommendation strategy?
In a random split, popular items have more test hits than average
Thus recommending them is effective (at least better than random)
But how about true precision? What's in the unobserved cells?
(figure: the matrix split into training data, test data – the relevant items – and unobserved preference; short-head items hold most of the test hits)
avg P@k ∼ (number of test hits in the top k) / k
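The split-and-count argument above can be checked in a few lines. A minimal sketch on synthetic data (all distributions and parameters are hypothetical, not the talk's setup): item rating counts follow a skewed distribution, a random split sends some ratings to test, and ranking by popularity collects more test hits per user than a random ranking.

```python
import random

random.seed(7)

n_users, n_items, k = 200, 50, 10

# Hypothetical skewed rating behavior: item i is rated with probability 1/(i+1)
ratings = {(u, i) for u in range(n_users) for i in range(n_items)
           if random.random() < 1.0 / (i + 1)}

# Random split: each rating goes to test with probability 0.2
test = {r for r in ratings if random.random() < 0.2}
train = ratings - test

def observed_p_at_k(ranking):
    # Observed P@k: fraction of the top-k recommended (non-training) items
    # appearing in the user's test ratings, averaged over users
    total = 0.0
    for u in range(n_users):
        recs = [i for i in ranking if (u, i) not in train][:k]
        total += sum((u, i) in test for i in recs) / k
    return total / n_users

# Popularity rank by training rating count vs. a random rank
pop_rank = sorted(range(n_items),
                  key=lambda i: -sum(1 for (_, j) in train if j == i))
rnd_rank = random.sample(range(n_items), n_items)

print(observed_p_at_k(pop_rank), observed_p_at_k(rnd_rank))
```

Since popular items carry proportionally more test ratings, the popularity ranking wins on observed precision by construction, which is exactly the bias the slide describes.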
8. Or is it? A simplified toy example
(figure: a two-item toy example, items A and B, comparing the observed and true P@1 of popularity and random recommendation given the items' ratings)
9. When is popularity effective?
10. When is popularity effective?
Why do popular items get more ratings?
And how does that relate with item relevance?
(“relevance” meaning target users like the items)
11. Rating generation
In order for a rating to be produced…
1. Discovery: the user needs to discover the item
– And then find out whether or not she likes it
2. Rating decision: the user needs to tell the system about it
– I.e. rate the item
So the biases in discovery and rating decisions should result in
(may explain?) biases in rating distribution (i.e. popularity)
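The two-step process above can be mimicked directly. In this sketch the probabilities are hypothetical placeholders, not values from the talk: a rating only exists if discovery happened first, and the rating decision may depend on whether the user liked the item.

```python
import random

random.seed(0)

# Hypothetical rating-decision probabilities: p(rate | seen, liked) and
# p(rate | seen, ¬liked); discovery (seen) is a precondition for both
P_RATE = {True: 0.8, False: 0.2}

def maybe_rate(seen, liked):
    # Step 1: no discovery, no rating; step 2: decide whether to rate
    return seen and random.random() < P_RATE[liked]

trials = 10_000
liked_rated = sum(maybe_rate(True, True) for _ in range(trials)) / trials
disliked_rated = sum(maybe_rate(True, False) for _ in range(trials)) / trials
print(liked_rated, disliked_rated)
```

With any asymmetry between the two decision probabilities, liked-and-seen items end up rated far more often, which is how the decision biases surface as biases in the rating distribution.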
12. Discovery sources
How do people find items?
We search/browse for them
We randomly run into them
They are advertised to us
They are brought to us by a recommender system
···
We find them through our friends
We define a stochastic model
– Social communication and rating
– User decisions dependent on item relevance
We analyze the effect on popularity precision
– Simulation
13. A model of social discovery and rating propagation
Rating decision: p(rate | seen, liked), p(rate | seen, ¬liked)
Communication decision: p(tell | seen, liked), p(tell | seen, ¬liked)
• Known item sampling
• Friend sampling
• Bootstrapping discovery from exogenous source
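A minimal sketch of such a propagation loop, with a hypothetical random friendship graph and made-up parameter values (none of this reproduces the talk's actual configuration): each cycle a user samples a known item and a friend, decides whether to tell, and the receiving friend, upon discovery, decides whether to rate.

```python
import random

random.seed(1)

n_users, n_items = 30, 20
# Hypothetical friendship graph: 4 random friends per user
friends = {u: random.sample([v for v in range(n_users) if v != u], 4)
           for u in range(n_users)}
likes = {(u, i): random.random() < 0.3
         for u in range(n_users) for i in range(n_items)}

P_TELL = {True: 0.9, False: 0.3}  # p(tell | seen, liked / ¬liked), made up
P_RATE = {True: 0.9, False: 0.4}  # p(rate | seen, liked / ¬liked), made up

seen = {u: set() for u in range(n_users)}
ratings = set()

def discover(u, i):
    # On first discovery, the user decides whether to rate the item
    if i not in seen[u]:
        seen[u].add(i)
        if random.random() < P_RATE[likes[u, i]]:
            ratings.add((u, i))

for cycle in range(2000):
    if cycle % 100 == 0:  # bootstrapping: exogenous random discovery
        discover(random.randrange(n_users), random.randrange(n_items))
    u = random.randrange(n_users)
    if seen[u]:
        i = random.choice(sorted(seen[u]))  # known-item sampling
        v = random.choice(friends[u])       # friend sampling
        if random.random() < P_TELL[likes[u, i]]:
            discover(v, i)                  # the friend now knows the item

print(len(ratings), "ratings produced")
```

Items that users like (and therefore tell about) spread faster through the network, so the rating distribution emerges from the interplay of the four decision probabilities and the graph topology.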
14. From user behavior model to macro social effect
User behavior model parameters:
– Communication-relevance bias: p(tell | seen, liked), p(tell | seen, ¬liked)
– Rating-relevance decision bias: p(rate | seen, liked), p(rate | seen, ¬liked)
These induce global biases:
– Global discovery-relevance bias: p(seen | liked), p(seen | ¬liked)
– Global rating-relevance bias: p(liked | rated), p(liked | ¬rated)
…which determine the expected precision of popularity-rank recommendation
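Under an independence assumption, the per-decision parameters compose into the global rating-relevance bias by simple probability algebra. The numbers below are hypothetical, purely illustrative:

```python
# Hypothetical illustration of how micro parameters aggregate into the
# global bias p(liked | rated); none of these values come from the talk
p_liked = 0.3                      # prior probability that an item is liked
p_seen = {True: 0.5, False: 0.2}   # global discovery bias p(seen | liked/¬liked)
p_rate = {True: 0.8, False: 0.3}   # rating decision p(rate | seen, liked/¬liked)

# p(rated, liked=L) = p(liked=L) * p(seen | L) * p(rate | seen, L)
joint = {L: (p_liked if L else 1 - p_liked) * p_seen[L] * p_rate[L]
         for L in (True, False)}
p_liked_given_rated = joint[True] / (joint[True] + joint[False])
print(round(p_liked_given_rated, 3))  # 0.741: rated items skew toward liked
```

With these particular values the rated items are much more likely to be liked than the prior suggests; flipping the biases toward disliked items would push p(liked | rated) below the prior instead.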
15. Two approaches to analyze the model effects
Theoretical analysis – challenging! Work in progress…
Simulate and see what happens…
16. Experiments
17. Experiments – Simulation setup
Social network: ~4,000 users, ~88,000 arcs
– Facebook network data from Jure Leskovec
– Random graphs: Barabási–Albert, Erdős–Rényi
3,700 items
We simulate a relevance distribution with a long-tail shape,
randomly assigned to user-item pairs
Bootstrapping: exogenous random discovery every ~1,000 time cycles
Stop simulation when 500,000 ratings are produced (roughly MovieLens 1M scale)
(figure: the long-tail relevance probability assigned to the ~3,700 items)
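A long-tail relevance assignment of this kind can be sketched as follows; the decay shape and constants are hypothetical, chosen only to give a few highly relevant items and a long tail of barely relevant ones:

```python
import random

n_items, n_users = 3700, 4000  # roughly the scale used in the slides

# Hypothetical long-tail shape: relevance probability decays with item index
rel_prob = [1.0 / (1 + 0.01 * i) ** 2 for i in range(n_items)]

def relevant(u, i):
    # Relevance randomly assigned per user-item pair, deterministically seeded
    return random.Random(u * n_items + i).random() < rel_prob[i]

head = sum(p > 0.5 for p in rel_prob)
print(head, "items with relevance probability above 0.5")
```

Seeding per pair keeps the assignment reproducible across simulation runs without storing the full users × items relevance matrix.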
18. Experiments – Simulation setup
At any point in the simulation we are able to:
– Split the rating data and run a recommender system (e.g. popularity)
– Measure the precision of the recommendations – observed and true
By running different configurations we can observe the
results in different scenarios
– We generally test one bias at a time: discovery or rating
– We show single-shot runs, not averages
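Because the simulation knows the ground-truth preferences, observed and true precision can be computed side by side at any point. A self-contained sketch on synthetic data (all distributions are hypothetical): observed precision counts test hits, true precision checks the underlying likes.

```python
import random

random.seed(5)

n_users, n_items, k = 100, 40, 5

# Hypothetical ground truth, and ratings whose probability depends on liking
likes = {(u, i): random.random() < 0.4
         for u in range(n_users) for i in range(n_items)}
ratings = {(u, i) for (u, i), l in likes.items()
           if random.random() < (0.3 if l else 0.1)}

test = {r for r in ratings if random.random() < 0.2}
train = ratings - test

def precision_at_k(ranking, hit):
    # Average over users of hits among the top-k non-training recommendations
    total = 0.0
    for u in range(n_users):
        recs = [i for i in ranking if (u, i) not in train][:k]
        total += sum(hit(u, i) for i in recs) / k
    return total / n_users

pop = sorted(range(n_items),
             key=lambda i: -sum((u, i) in train for u in range(n_users)))

observed_p = precision_at_k(pop, lambda u, i: (u, i) in test)  # test hits
true_p = precision_at_k(pop, lambda u, i: likes[u, i])         # ground truth
print(observed_p, true_p)
```

The two metrics use the same recommendation lists and differ only in the hit criterion, which is what makes their disagreement in later slides meaningful.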
19. Research questions for experiments
How does popularity compare with random recommendation precision, depending on the four user behavior parameters?
Does it make a difference to consider all ratings or only positive
ratings in popularity rank?
Do the social network topology and network phenomena make a difference?
Can observed and true precision disagree?
23. Social network topology effect
Setting: p(tell | seen, liked) = 1, p(tell | seen, ¬liked) = 1, p(rate | seen, liked) = 1, p(rate | seen, ¬liked) = 0
(figures: observed and true P@10 on the Facebook and Barabási–Albert networks, comparing relevant popularity and plain popularity against random recommendation)
24. Contradicting observed and true precision
Setting: p(tell | seen, liked) = 0, p(tell | seen, ¬liked) = 1, p(rate | seen, liked) = 1, p(rate | seen, ¬liked) = 1
(figure: observed vs. true P@10 of simple popularity, positive popularity and random recommendation; observed and true precision point in opposite directions)
25. Conclusions
Observed precision of popularity is always better than random
True precision of popularity is worse than random when:
– Users talk about items they dislike more often than ones they like
– Users rate items they dislike more often than ones they like
Positive popularity is considerably more robust than simple popularity
– Fairly immune to user rating behavior on disliked items
Viral effects in temporal split
– Determined by a) user communication frequency, and b) social network topology
– Early popular items are recommendable to fewer users than in a random split
– Popularity may then become less useful for recommendation
True and observed precision can thus be mutually inconsistent
26. Future work
Analytic work (in progress)
The model is easy to generalize; to mention just a few possibilities…
– Arbitrarily biased exogenous sources, including recommender systems
– Dynamic social network, dynamic item lifecycles
– User behavior dependence on discovery source
– Social influence propagation, dynamic user preferences
So far a first step
– Understanding how social behavior patterns impact true popularity effectiveness
Next questions
– User studies
– Tracking and detecting the collective behavior patterns in real settings
– What to do about it
a) In the evaluation procedure & metrics and/or interpretation of results
b) In the algorithms which may potentially take popularity as a signal