Who will follow whom? Exploiting Semantics for Link Prediction in Attention-Information Networks

International Semantic Web Conference 2012. Boston, US

Matthew Rowe (1), Milan Stankovic (2, 3), Harith Alani (4)

(1) School of Computing and Communications, Lancaster University, Lancaster, UK
(2) Hypios Research, 187 rue du Temple, 75003 Paris, France
(3) Université Paris-Sorbonne, 28 rue Serpente, 75006 Paris
(4) Knowledge Media Institute, The Open University, Milton Keynes, UK

@mrowebot | m.rowe@lancaster.ac.uk
http://www.matthew-rowe.com | http://www.lancs.ac.uk/staff/rowem/


Attention Information Networks
The intersection of information and social networks
[Yin et al., 2011]:
Users can follow other users: u subscribes to v
u = Follower, v = Followee
User u is paying attention to the content from user v
[Figure: directed edge from u to v]

Users become 'Information Hubs' [Romero and Kleinberg, 2010]
Tune in to get real time event information
E.g. #Sandy, #Arabspring, #Londonriots
People become social sensors


Attention is paid to the information that users publish


Attention Economics

Large uptake/adoption of Attention-Information Networks:
31.9% increase in Twitter users in 2011
Attention becomes a limited commodity
“What counts now is what is most scarce now, namely attention.”
[Goldhaber, 1997]
Users must consider who they wish to subscribe to
Whose content do I wish to receive?
Who interests me?
If we can understand who will follow whom & follower decisions:
Predict social capital based on expected network growth;
Facilitate audience building
Of interest to digital marketing firms, e.g. for boosting a client's presence



Outline

Problem Formulation
Related Work
Follower-Decision Hypotheses
Formulating the Problem

Approach
Features
Concept Disambiguation with User Contexts

Experiments
Dataset
Experimental Setup
Results: Prediction Accuracy
Results: Follower-Decision Patterns



Related Work

Network-topology approaches [Golder and Yardi, 2010,
Yin et al., 2011, Backstrom and Leskovec, 2011]:
Path structures, common followers and common friends
Local metadata approaches [Schifanella et al., 2010,
Leroy et al., 2010, Brzozowski and Romero, 2011]:
Common tags (Flickr, YouTube), group information (on Flickr)
Local metadata approaches use tags or group memberships, but no
concepts
No examination of the follower-decision behaviour patterns
And no exploration of divergent follower decision behaviour



Follower-Decision Hypotheses

H1. Following a user is performed when there is a topical
affinity between the follower and the followee
[Schifanella et al., 2010] found social and topical homophily to be
correlated on Flickr
H2. Users who do not focus on specific topics do not base
their follower-decisions on topical information but on social
factors
Unfocussed users show divergent decision behaviour

H3. Users who are more socially connected are driven by social
rather than topical factors
High-degree users are driven by social network effects



Formulating the Problem

A directed social network is a graph G = ⟨V, E⟩, where:
V denotes the set of users (nodes), and
E is the set of edges (⟨u, v⟩ ∈ E) between nodes,
meaning that u follows v
An egocentric social network (egonet) of u is denoted by Γ(u)
Γ−(u) denotes the follower network (incoming edges)
Γ+(u) denotes the followee network (outgoing edges)
A given user u is provided with a set of recommended users R(u)
R(u) ∩ Γ+(u) = ∅
Goal: induce a function between users and recommendations:
f : V × R → {0, 1}
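
The formulation above can be sketched in code. This is a minimal illustration only: the class name FollowGraph, the method candidate_pairs and the toy user names are assumptions made for the example and do not come from the original work.

```python
from collections import defaultdict


class FollowGraph:
    """Directed graph G = <V, E>; an edge (u, v) means that u follows v."""

    def __init__(self, edges):
        self.followees = defaultdict(set)  # Gamma+(u): users that u follows
        self.followers = defaultdict(set)  # Gamma-(u): users that follow u
        for u, v in edges:
            self.followees[u].add(v)
            self.followers[v].add(u)

    def candidate_pairs(self, u, recommended):
        """Return the (u, v) pairs to classify, keeping R(u) disjoint from Gamma+(u)."""
        return [(u, v) for v in recommended
                if v != u and v not in self.followees[u]]


# Toy data (hypothetical): alice already follows bob and dave.
graph = FollowGraph([("alice", "bob"), ("carol", "bob"), ("alice", "dave")])
pairs = graph.candidate_pairs("alice", ["bob", "carol", "erin"])
```

Each pair in pairs is one instance for the binary function f: the label is 1 if u goes on to follow v and 0 otherwise.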



Predicting Links in Attention-Information Networks

Given our problem setting we want to:
1. Identify the best performing general model;
2. Explore follower-decision behaviour and how this differs between users
Problem is a binary classification task: pairwise features between u
and each of their recommendations (v ∈ R(u))
To enable accurate prediction and explore different factors behind
link creation we implement:
Social features: based on the network-structure
Topical features: based on content published by u and v
Visibility features: based on the user noticing a prospective followee
We now explain the various features which are computed between u
and v ∈ R(u)...



Social

Social features account for the topology of the network and the existence
of edges present within the network prior to recommendations
Mutual Followers Count: Measures the overlap of the follower sets
(i.e. the set of users connecting into a given user) of u and v
Mutual Followees Count: Measures the overlap of the followee sets
Mutual Friends Count: Measures the overlap of the friend sets
Friendship is denoted by a bi-directional edge between nodes
Mutual Neighbours Count: Measures the overlap of the ego-centric
networks of u and v whilst ignoring the directions of the links in the
network
[Zhou et al., 2009, Yin et al., 2011, Backstrom and Leskovec, 2011]
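
A minimal sketch of the four social features, assuming the follow graph is given as a plain dictionary from each user to the set of users they follow. The function names and the toy graph are illustrative assumptions, not the authors' implementation.

```python
def invert(followees):
    """Build Gamma-(x) (follower sets) from Gamma+(x) (followee sets)."""
    followers = {user: set() for user in followees}
    for user, outgoing in followees.items():
        for target in outgoing:
            followers.setdefault(target, set()).add(user)
    return followers


def social_features(followees, u, v):
    """Pairwise overlap counts between users u and v."""
    followers = invert(followees)
    in_u, in_v = followers.get(u, set()), followers.get(v, set())
    out_u, out_v = followees.get(u, set()), followees.get(v, set())
    return {
        "mutual_followers": len(in_u & in_v),
        "mutual_followees": len(out_u & out_v),
        # A friend is a user connected by edges in both directions.
        "mutual_friends": len((in_u & out_u) & (in_v & out_v)),
        # Neighbours ignore edge direction.
        "mutual_neighbours": len((in_u | out_u) & (in_v | out_v)),
    }


# Toy graph (hypothetical): each key follows the users in its set.
toy = {"u": {"a", "b"}, "v": {"a", "c"}, "a": {"u", "v"}, "b": {"v"}, "c": set()}
features = social_features(toy, "u", "v")
```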



Topical (I)

In attention-information networks users pay attention to the content of
other users
Topical features use: a) tags, b) concept bags, c) concept graphs
Tag Vectors: Examining tag/keyword overlap between u and v
[Schifanella et al., 2010]
Cosine Similarity: between the tag vectors of u and v
Concept Bags: Examining overlap of concepts
Return concepts from user content, then derive the concept bag
vector
Cosine Similarity: similarity between the concept bag vectors of u
and v
Jensen-Shannon Divergence: probability distribution divergence
between concept bag vectors of u and v
Greater divergence means greater dissimilarity between topics
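
The two vector-based measures can be sketched as follows, assuming each tag vector or concept bag is a non-empty dictionary of counts. The base-2 logarithm (which bounds the divergence by 1) and the example bags are assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def js_divergence(a, b):
    """Jensen-Shannon divergence between the normalised count vectors."""
    keys = a.keys() | b.keys()
    total_a, total_b = sum(a.values()), sum(b.values())
    p = {k: a.get(k, 0) / total_a for k in keys}
    q = {k: b.get(k, 0) / total_b for k in keys}
    m = {k: 0.5 * (p[k] + q[k]) for k in keys}

    def kl(x, y):
        # Terms with x[k] == 0 contribute nothing to the sum.
        return sum(x[k] * math.log2(x[k] / y[k]) for k in keys if x[k] > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical concept bags for three users.
bag_u = {"Sport": 2, "Music": 1}
bag_v = {"Sport": 2, "Music": 1}
bag_w = {"Politics": 3}
same_cos, same_js = cosine(bag_u, bag_v), js_divergence(bag_u, bag_v)
diff_cos, diff_js = cosine(bag_u, bag_w), js_divergence(bag_u, bag_w)
```

Identical bags give a cosine of 1 and a divergence of 0, while bags with no shared concept give a cosine of 0 and the maximum divergence.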



Topical (II)
[Figure: concept graph with concepts C1, C2 and C3 and users u and v]

Concept Graphs: Semantic relatedness of users using graph-based
metrics
Measure distances between concepts from tags of u and v : d(ci , cj )
Distance measures have two varieties, based on input tags:
1. Tag Intersection: Intersection of the tag sets of u and v
2. All Tags: All tags from the tag sets of u and v
Three distance measures are computed for d(ci, cj) using the above sets:

Shortest Path: least number of steps from ci to cj (Bellman-Ford
algorithm)
Hitting Time: number of steps for a random walker to leave ci and
reach cj [Fouss et al., 2007]
Commute Time: number of steps for a random walker to leave ci
and reach cj , and then return to ci
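
The three distances can be sketched on a toy concept graph. The sketch assumes a connected, undirected graph with unit-weight edges and solves the hitting-time equations h(i) = 1 + Σ_k P(i, k) h(k), with h(target) = 0, by Gauss-Jordan elimination. The graph, the function names and the choice of solver are assumptions for illustration, not the original implementation.

```python
def bellman_ford(adj, source):
    """Least number of steps from source to every node (unit edge weights)."""
    dist = {n: float("inf") for n in adj}
    dist[source] = 0
    for _ in range(len(adj) - 1):
        changed = False
        for a in adj:
            for b in adj[a]:
                if dist[a] + 1 < dist[b]:
                    dist[b] = dist[a] + 1
                    changed = True
        if not changed:
            break
    return dist


def hitting_times(adj, target):
    """Expected random-walk steps from every node to target."""
    nodes = [n for n in adj if n != target]
    idx = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)
    # Augmented matrix for (I - P) h = 1, restricted to non-target nodes.
    rows = [[0.0] * (size + 1) for _ in range(size)]
    for n in nodes:
        i = idx[n]
        rows[i][i] = 1.0
        rows[i][size] = 1.0
        for k in adj[n]:
            if k != target:
                rows[i][idx[k]] -= 1.0 / len(adj[n])
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(size):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                for c in range(col, size + 1):
                    rows[r][c] -= factor * rows[col][c]
    h = {n: rows[idx[n]][size] / rows[idx[n]][idx[n]] for n in nodes}
    h[target] = 0.0
    return h


def commute_time(adj, ci, cj):
    """Expected steps to walk from ci to cj and then back to ci."""
    return hitting_times(adj, cj)[ci] + hitting_times(adj, ci)[cj]


# Toy concept graph (hypothetical): a path c1 - c2 - c3.
concepts = {"c1": ["c2"], "c2": ["c1", "c3"], "c3": ["c2"]}
shortest = bellman_ford(concepts, "c1")["c3"]
hitting = hitting_times(concepts, "c3")["c1"]
commute = commute_time(concepts, "c1", "c3")
```

On this path graph the shortest path from c1 to c3 is 2 steps, while the random-walk measures are larger because the walker can step backwards.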



Visibility

The presence of information published by a prospective followee could
inﬂuence users in their follower-decisions
Retweet Count: total number of times a given user (v ) has been
retweeted by members of the followee network belonging to u
Mention Count: total number of times a given user (v ) has been
mentioned by members of the followee network belonging to u
Comment Count: total number of times a given user (v ) has had
his content commented on by members of the followee network
belonging to u
Weighted Counts: weight each count by reply-frequency with
ego-network member
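
A minimal sketch of the visibility counts, assuming interactions (retweets, mentions or comments) are logged as (actor, target) pairs. The slide does not spell out the weighting scheme, so the weighted variant below assumes that each interaction is weighted by how often u replies to the acting followee; that reading, the function names and the toy data are all assumptions.

```python
def visibility_count(interactions, followees_of_u, v):
    """Number of interactions targeting v made by members of Gamma+(u)."""
    return sum(1 for actor, target in interactions
               if actor in followees_of_u and target == v)


def weighted_visibility_count(interactions, followees_of_u, v, reply_freq):
    """As above, with each interaction weighted by reply_freq[actor].

    reply_freq is ASSUMED to map a followee of u to the frequency with
    which u replies to that followee.
    """
    return sum(reply_freq.get(actor, 0.0) for actor, target in interactions
               if actor in followees_of_u and target == v)


# Toy retweet log (hypothetical): (retweeter, retweeted user).
retweets = [("a", "v"), ("a", "v"), ("b", "v"), ("c", "v"), ("a", "x")]
followees_of_u = {"a", "b"}
retweet_count = visibility_count(retweets, followees_of_u, "v")
weighted = weighted_visibility_count(
    retweets, followees_of_u, "v", {"a": 0.5, "b": 0.25})
```

The same two functions cover the mention and comment variants by passing a different interaction log.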



Features Summary

Type        Feature Name                            Output Domain
Social      Mutual Followers Count                  {0} ∪ N+
            Mutual Followees Count                  {0} ∪ N+
            Mutual Friends Count                    {0} ∪ N+
            Mutual Neighbours Count                 {0} ∪ N+
Topical     Tag Vectors - Cosine                    [0, 1]
            Concept Bags - Cosine                   [0, 1]
            Concept Bags - JS-Divergence            R+
            Concept Graphs - Int - Shortest Path    N+
            Concept Graphs - All - Shortest Path    N+
            Concept Graphs - Int - Hitting Time     R+
            Concept Graphs - All - Hitting Time     R+
            Concept Graphs - Int - Commute Time     R+
            Concept Graphs - All - Commute Time     R+
Visibility  Retweet Count                           {0} ∪ N+
            Mention Count                           {0} ∪ N+
            Comment Count                           {0} ∪ N+
            Weighted Retweet Count                  {0} ∪ R+
            Weighted Mention Count                  {0} ∪ R+
            Weighted Comment Count                  {0} ∪ R+



Concept Disambiguation with User Contexts

Distances across the concept graph capture semantic relatedness
Distance metrics require a mapping between a tag and a concept...
Polysemy Problem: one tag can be mapped to multiple concepts
[Cantador et al., 2011] propose ‘distributional aggregation’ to
choose the most representative tag for a web resource:
Voting mechanism: Tag usage frequency amongst a collection of
users
Our voting mechanism: concept frequency given the user
For a given tag: count candidate concept frequency in concept bag
CTu , choose the most frequent
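
The user-context vote can be sketched as below. The mapping from tags to candidate concepts, the concept names and the tie-breaking rule (first candidate listed wins) are assumptions for the example.

```python
from collections import Counter


def disambiguate(tag, candidates, concept_bag):
    """Pick the candidate concept for tag that is most frequent in the user's bag."""
    freq = Counter(concept_bag)
    options = candidates.get(tag, [])
    if not options:
        return None  # tag has no known concept
    # Ties go to the candidate listed first (an arbitrary choice).
    return max(options, key=lambda concept: freq[concept])


# Hypothetical polysemous tag with two candidate concepts.
candidates = {"apple": ["Apple_Inc", "Apple_(fruit)"]}
tech_user_bag = ["Apple_Inc", "Technology", "Apple_Inc", "Apple_(fruit)"]
food_user_bag = ["Cooking", "Apple_(fruit)", "Apple_(fruit)"]
tech_choice = disambiguate("apple", candidates, tech_user_bag)
food_choice = disambiguate("apple", candidates, food_user_bag)
```

The same tag resolves to different concepts for the two users because the vote is taken over each user's own concept bag rather than over the whole collection.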



Dataset
Knowledge Discovery and Data mining (KDD) Cup 2012 Follower
Prediction task
Chinese microblogging platform Tencent Weibo
Users, recommendations, and outcomes
Follow-graph of users
Set of tags found within each user’s content
Tag-categorisation data and category graph

[Figure: frequency distributions. (a) Categories per Tag: frequency c(n) against categories (n). (b) Recommendations per User: frequency c(n) against recommendations (n).]



Experimental Setup

1. General Follower Prediction: seeking a follower model
Randomly selected 10% of users and built pairwise feature vectors
2. Binned Follower Prediction: seeking behaviour-specific models
Divided users into 10 bins based on: a) concept-bag entropy, b)
out-degree
Selected all the users from the low and high bins, built feature vectors
Divided each dataset into an 80:20 split for training and testing
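
A sketch of the entropy-based binning, assuming Shannon entropy over each user's concept bag and equal-frequency bins assigned by rank. The slides do not state how the bin boundaries were chosen, so the binning rule and the toy bags here are assumptions.

```python
import math
from collections import Counter


def concept_entropy(concept_bag):
    """Shannon entropy (bits) of the concept distribution; low means focussed."""
    counts = Counter(concept_bag)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def equal_frequency_bins(scores, n_bins=10):
    """Assign each user a bin index from 0 (lowest scores) to n_bins - 1."""
    ranked = sorted(scores, key=scores.get)
    return {user: min(rank * n_bins // len(ranked), n_bins - 1)
            for rank, user in enumerate(ranked)}


# Hypothetical concept bags: one focussed user, one unfocussed user.
focussed = concept_entropy(["Sport", "Sport", "Sport", "Sport"])
unfocussed = concept_entropy(["Sport", "Music", "Politics", "Film"])
bins = equal_frequency_bins({"a": 0.1, "b": 0.5, "c": 1.2, "d": 2.0}, n_bins=2)
```

The out-degree bins can be built the same way by passing |Γ+(u)| for each user as the score.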
For each experiment:
1. Model Selection
2. Pattern Analysis
Evaluation Measures:
1. Area Under the receiver operating characteristic Curve (AUC)
2. Matthews correlation coefficient (MCC)
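
Both evaluation measures can be computed directly from their standard definitions. The sketch below assumes binary labels in {0, 1}, with at least one positive and one negative example for the AUC. The labels, scores and the 0.5 threshold are made up for illustration.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient from the confusion-matrix counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = u followed v) and classifier scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]
auc_value = auc(labels, scores)
mcc_value = mcc(labels, predictions)
```

An MCC of 0 corresponds to chance-level prediction, which makes it a useful complement to AUC when the two classes are imbalanced.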



Results: Prediction Accuracy

General Follower Prediction Model
Topical features perform significantly better
Models significantly outperform the random model
Binned Follower Prediction Models
Concept entropy: low - topical features; high - social features
Degree: low and high - topical features
Visibility features have little effect on predictions (majority are zero)
[Figure: (c) AUC and (d) MCC for the Social, Topical, Visibility and All feature sets on the Full, Entropy-Low, Entropy-High, Degree-Low and Degree-High datasets.]



Results: Follower-Decision Patterns

Connections are formed...
In the General Follower Prediction Model when:
users share neighbours
users are closer in terms of the subjects they discuss
In the Binned Follower Prediction Model
for low entropy and low degree users when:
same feature pattern as the general model
for high entropy users when:
users have an overlap of subscribers
tags differ, but similar concepts!
for high degree users when:
users listen to the same people
users share a topical affinity with the same pattern as the general
model



Findings

General behaviour pattern: topical homophily
[Schifanella et al., 2010] found socially close users to have high tag
cosine
Our approach detects latent patterns based on concept graphs
On common followers:
[Golder and Yardi, 2010, Brzozowski and Romero, 2011] found
mutual audience to correlate with link creation
We find that: mutual followers should be reduced in the general
model
On common neighbours:
[Leroy et al., 2010] found an increase in mutual neighbours to
correlate with link creation
Similar effect in our findings
Divergent behaviour for high entropy users: suggests a need for
bespoke models



Conclusions

Our approach for link prediction outperforms: a) a random baseline,
b) existing network-structure approaches
General follower-decision model identified topical homophily effects
Accounting for behaviour uncovered different follower-decisions:
Unfocussed users follow users with whom they have conceptual
affinity
Concept-graphs allowed for latent effects to be identified
Applicable over the linked data graph
Can improve recommendations by accounting for behaviour and
building bespoke models:
Growing the platform’s network and increasing social capital
Understand who will follow whom, and audience growth



Future Work
Apply our approach over Twitter and YouTube: are findings
consistent?
Extract concepts from content, measure distances across the Linked
Data graph
Inclusion of more nuanced user behaviour
Conjecture: performance is conditioned on time-sensitive user
behaviour
User Churn: detecting the complement of link creation
25 days of Twitter logs show this (red):
∆(u) = |Γ−_{t+1}(u)| − |Γ−_t(u)|   (1)

[Figure: histogram of ∆ (x-axis, from −40 to 40) against frequency c(∆) (y-axis, from 0 to 6000)]



Questions

Twitter: @mrowebot
Email: m.rowe@lancaster.ac.uk
WWW: http://www.matthew-rowe.com
WWW: http://www.lancs.ac.uk/staff/rowem/

