Social applications implemented on a peer-to-peer (P2P) architecture mine the social graph of their users for improved performance in search, recommendations, resource sharing and others. In such applications, the social graph that connects their users is distributed on the peer-to-peer system: the traversal of the social graph translates to a socially-informed routing in the peer-to-peer layer.
In this work we introduce the model of a projection graph that is the result of mapping a social graph onto a peer-to-peer network. We analytically formulate the relation between metrics in the social graph and in the projection graph. We focus on three such graph metrics: degree centrality, node betweenness centrality, and edge betweenness centrality. We evaluate experimentally the feasibility of estimating these metrics in the projection graph from the metrics of the social graph. Our experiments on real networks show that when mapping communities of 50-150 users on a peer, there is an optimal organization of the projection graph with respect to degree and node betweenness centrality. In this range, the association between the properties of the social graph and the projection graph is the highest, and thus the properties of the (dynamic) projection graph can be inferred from the properties of the (slower changing) social graph. We discuss the applicability of our findings to aspects of peer-to-peer systems such as data dissemination, social search, peer vulnerability, and data placement and caching.
Inferring Peer Centrality in Socially-Informed Peer-to-Peer Systems. Nicolas Kourtellis and Adriana Iamnitchi. In Proceedings of 11th IEEE International Conference on Peer-to-Peer Computing (P2P'11), Kyoto, Japan, Aug 2011
Inferring Peer Centrality in Socially-Informed Peer-to-Peer Systems
1. Inferring Peer Centrality
in Socially-Informed P2P Systems
Nicolas Kourtellis, Adriana Iamnitchi
Department of Computer Science & Engineering
University of South Florida
Tampa, USA
11th IEEE International Conference on Peer-to-Peer Computing
Kyoto, Japan, 2011
2. Socially-aware Applications
Applications collect and use social information:
Location, collocation, history of interactions, etc.
Build (implicit/explicit) social network of users
Use: reduce spam, provide recommendations, etc.
Wide range of system architectures
How does the social network of users affect the load
in a P2P architecture?
Decentralization of user social data
• MobiClique
• Yarta
• ...
• PeerSoN
• LifeSocial.KOM
• Safebook
• Prometheus
• …
3. Social Graphs & P2P Networks
Users connected with application-specific edges
User-contributed peers form a P2P network
User social graph is partitioned into subgraphs &
stored on peers
Questions:
How do applications traverse a distributed social graph?
What does it mean for the P2P routing?
4. Application Example
Invite user G’s 2-hop hiking contacts to a trip
=> 1-hop={B, C, E} 2-hops={A, D, F, I}
Social graph traversals => many P2P lookups
Application performance affected by projection of social graph on peers
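The traversal in this example can be sketched in a few lines of Python. This is only an illustration: the slide's graph is not reproduced in the transcript, so `SOCIAL_EDGES` is a hypothetical edge list chosen to give the same 1-hop and 2-hop sets for user G, and `USER_TO_PEER` is an invented placement of users on peers. The sketch counts how many remote peers must be contacted to read the friends' adjacency lists.

```python
from collections import defaultdict

# Hypothetical toy social graph, chosen only so that user G's 1-hop and
# 2-hop contacts match the slide; the real example graph is not given.
SOCIAL_EDGES = [("G", "B"), ("G", "C"), ("G", "E"),
                ("B", "A"), ("C", "D"), ("E", "F"), ("E", "I")]

# Hypothetical placement of users on peers.
USER_TO_PEER = {"G": "P1", "B": "P1", "C": "P2", "D": "P2",
                "E": "P3", "F": "P3", "I": "P3", "A": "P4"}


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (user, user) edges."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def two_hop_contacts(adj, user_to_peer, source):
    """Return (1-hop set, 2-hop set, set of remote peers contacted)."""
    home = user_to_peer[source]
    one_hop = set(adj[source])
    remote_peers = set()
    two_hop = set()
    for friend in one_hop:
        # Reading a friend's adjacency list means contacting the peer
        # that stores that friend; only non-local peers cost a lookup.
        peer = user_to_peer[friend]
        if peer != home:
            remote_peers.add(peer)
        two_hop |= adj[friend]
    two_hop -= one_hop | {source}
    return one_hop, two_hop, remote_peers
```

Under this assumed placement, a different mapping of the same users to peers changes the number of remote peers contacted without changing the result sets, which is the sense in which the projection affects application performance.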
5. Projection Graph
How do the properties of the projection graph compare with the properties of the social graph projected?
[Figure labels: Projection Graph (PG), P2P Overlay, Social Graph (SG)]
6. Projection Graph Model
Social Graph $SG = (V, E)$
$V$ = set of users, $E$ = set of social edges
Projection Graph $PG = (V_P, E_P)$
$V_P$ = set of peers, $E_P$ = set of P2P edges
$P_V(i)$ = set of users mapped on peer $P_i$, $P_i \in V_P$
$(P_i, P_j) \in E_P$ iff $\exists\, a \in P_V(i),\ \exists\, b \in P_V(j)$ s.t. $(a, b) \in E$
$w(P_i, P_j) = \{(a, b) \in E \mid a \in P_V(i),\ b \in P_V(j)\}$
Uses:
Study properties of peers such as centrality
Study how the social graph topology affects P2P routing & system performance
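A minimal Python sketch of this model follows. It assumes the social graph is given as a list of user pairs and the mapping of users to peers as a dictionary; the function name `project` and the data layout are illustrative choices, not part of the original work. Each P2P edge keeps the set of social edges that induce it, following the definition of w.

```python
from collections import defaultdict


def project(social_edges, user_to_peer):
    """Map social graph edges onto peers.

    Returns a dict keyed by an unordered peer pair (a frozenset) whose
    value is the set of social edges crossing between the two peers.
    """
    pg = defaultdict(set)
    for a, b in social_edges:
        pa, pb = user_to_peer[a], user_to_peer[b]
        if pa != pb:  # social edges inside one peer create no P2P edge
            pg[frozenset((pa, pb))].add((a, b))
    return dict(pg)
```

If a numeric edge weight is needed, one plausible reading (an assumption here) is the size of each set, for example `{pair: len(edges) for pair, edges in project(E, mapping).items()}`.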
8. Degree Centrality
[Figure: example graph with lettered nodes]
Number of edges of a node
High degree centrality peers: Network Hubs
Can be targeted to directly influence many other peers with a message broadcast or to distribute a search query
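Degree centrality as described here can be computed directly from an edge list. The sketch below is illustrative: dividing by n - 1 is a common normalization and is an assumption, since the slide does not say whether scores are normalized, and `hubs` is a hypothetical helper for picking the highest-degree nodes.

```python
def degree_centrality(edges, normalized=True):
    """Degree centrality of every node in an undirected graph.

    edges: iterable of (u, v) pairs. Duplicate edges and self-loops
    do not add to a node's degree.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set())
        neighbors.setdefault(v, set())
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    # Assumed normalization: divide by the maximum possible degree.
    scale = 1.0 / (n - 1) if normalized and n > 1 else 1.0
    return {node: len(nbrs) * scale for node, nbrs in neighbors.items()}


def hubs(centrality, k):
    """Return the k nodes with the highest score (ties broken by name)."""
    return sorted(centrality, key=lambda x: (-centrality[x], x))[:k]
```

The same functions apply to either graph: pass social edges to score users, or the peer pairs of the projection graph to score peers.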
9. Node Betweenness Centrality
[Figure: example graph with lettered nodes]
Measures the extent to which a node lies on the shortest path between two other nodes
High betweenness centrality peers: Control communication between distant peers
Can host data caches for reduced latency to locate data
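One standard way to compute node betweenness is Brandes' algorithm; the source does not say which method the authors used, so the following is a sketch under that assumption. It handles unweighted, undirected graphs and returns unnormalized scores. The input format (a dict in which every node appears as a key) is an illustrative choice.

```python
from collections import deque


def node_betweenness(adj):
    """Brandes' algorithm for unweighted, undirected graphs.

    adj: dict mapping every node to an iterable of its neighbors.
    Returns, for each node, the number of shortest paths between other
    node pairs that pass through it (fractional when paths tie).
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and predecessors.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}
```

On a simple chain of five nodes the middle node gets the highest score, which matches the intuition that high betweenness nodes sit between distant parts of the graph.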
10. Edge Betweenness Centrality
[Figure: example graph with lettered nodes]
Measures the extent to which an edge lies on the shortest path between two nodes
High betweenness centrality edges: Connect distant parts of P2P network
Can be monitored to block malware traffic
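Edge betweenness can be obtained with the edge variant of Brandes' algorithm, in which the dependency that flows from a node to its predecessor is credited to the edge between them. This is a sketch under the assumption of an unweighted, undirected graph; the source does not state how the scores were computed, and the function name and input format are illustrative.

```python
from collections import deque


def edge_betweenness(adj):
    """Unnormalized edge betweenness for an unweighted, undirected graph.

    adj: dict mapping every node to an iterable of its neighbors.
    Returns a dict keyed by frozenset({u, v}) for each edge that lies
    on at least one shortest path.
    """
    score = {}
    for s in adj:
        # BFS from s, recording shortest-path counts and predecessors.
        order, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Credit each predecessor edge with its share of the paths.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1 + delta[w])
                edge = frozenset((v, w))
                score[edge] = score.get(edge, 0.0) + share
                delta[v] += share
    # Each unordered pair was counted once from each endpoint.
    return {e: c / 2 for e, c in score.items()}
```

For two triangles joined by a single bridge edge, the bridge receives the highest score, which is the kind of edge the slide describes as connecting distant parts of the network.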
11. Calculating Peer Centrality
Challenging because of:
Limited access to user data (e.g., privacy settings)
P2P network scale
Peer churn
Through experimental analysis on the social and
projection graph, we investigate how to
circumvent these limitations
12. Experimental Questions
Can we approximate the centrality of peers using
the centrality scores of their users?
How does the number of users storing data per
peer affect the centrality scores of their peers?
Social graph is less dynamic than the P2P network
Calculate the centrality scores of users infrequently & use them
to estimate their peer’s centrality
Spoiler Alert!
[1, ~150] users/peer: Can estimate degree &
betweenness centrality of peers with good
accuracy
Above 150 users/peer: The projection graph
becomes highly connected => peers do not
differentiate in centrality
13. Experimental Methodology
Naturally-formed communities offer incentives for resource sharing => 1 community subgraph mapped per peer
Projection graphs generated from 5 real social graphs
Communities detected via recursive Louvain algorithm*
Varied average community size: 5, 10, 20, …, 1000 users/peer
Calculate correlation of centralities of users and their peers
Compare average centralities of users and their peers
Identify top centrality peers from their users’ scores
Social Network   Users    Edges
gnutella04       10,876   39,994
gnutella31       62,561   147,878
enron            33,696   180,811
epinions         75,877   405,739
slashdot         82,168   504,230
*V. D. Blondel et al., “Fast unfolding of communities in large networks”, Journal of Statistical Mechanics: Theory and Experiment, vol. 10, 2008.
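The correlation step of this methodology can be sketched as follows. The transcript does not name the correlation coefficient, so Pearson's r is an assumption here, as is the use of the summed (cumulative) user scores per peer as the quantity compared against each peer's own score. All function names are illustrative.

```python
import math
from collections import defaultdict


def cumulative_user_scores(user_scores, user_to_peer):
    """Sum the centrality scores of the users mapped on each peer."""
    totals = defaultdict(float)
    for user, score in user_scores.items():
        totals[user_to_peer[user]] += score
    return dict(totals)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def user_peer_correlation(user_scores, user_to_peer, peer_scores):
    """Correlate cumulative user scores with the peers' own scores."""
    cumulative = cumulative_user_scores(user_scores, user_to_peer)
    peers = sorted(peer_scores)
    return pearson([cumulative.get(p, 0.0) for p in peers],
                   [peer_scores[p] for p in peers])
```

Here `user_scores` would come from the social graph and `peer_scores` from the projection graph; a value near 1 means the peers' centrality can be estimated from their users' scores.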
14. Correlation of Centrality Scores
[1-150] users/peer:
Projection graph resembles closely social graph
Highest correlation of social & projection graph metrics
Degree & node betweenness estimated from local information (cumulative scores)
After 150 users/peer:
Projection graph topology loses social properties
Highly connected network
Peers participate equally in graph traversal
[Figure: correlation (0 to 1) vs. Users/Peer (ticks at 1, 10, 100, 1000) for gnutella04, enron, gnutella31, epinions, slashdot. Panels: (a) Degree Centrality Correlation, (b) Node Betweenness Centrality Correlation, (c) Edge Betweenness Centrality Correlation]
15. Comparison of Centrality Scores
Increase number of users/peer => turning point in projection graph
More connections with other peers => increase peer degree & betweenness to maximum
More social edges within peers => decrease edge betweenness to minimum
[Figure: centrality vs. Users/Peer (ticks at 1, 10, 100, 1000) for gnutella04, enron, gnutella31, epinions, slashdot, two series per network. Panels: (a) Degree Centrality (series *_CDCU and *_DCP), (b) Node Betweenness Centrality (series *_CNBCU and *_NBCP), (c) Edge Betweenness Centrality (series *_CEBCU and *_EBCP)]
16. Finding High Betweenness Peers
Placing data caches on high betweenness peers can reduce latency to locate data
Can we identify such peers, knowing the top betweenness users or communities?
Top 5% betweenness centrality users => top betweenness centrality peers with 80–90% accuracy
[Figure: Peer Overlap (0 to 1) vs. Users/Peer (ticks at 1, 10, 100, 1000) for top 1%, 5%, 10%. Panels: Method 1, with top-N% users; Method 2, with top-N% communities]
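The idea behind the first method (top-N% users) can be sketched as below. The exact definition of peer overlap is not given in the transcript, so this is one plausible reading and should be treated as an assumption: the peers hosting the top-N% users form the predicted set, which is compared with an equally sized set of peers that actually have the highest betweenness. The function names are hypothetical.

```python
import math


def top_fraction(scores, fraction):
    """Keys of the highest-scoring `fraction` of entries (at least one)."""
    k = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=lambda key: (-scores[key], key))
    return set(ranked[:k])


def peer_overlap(user_scores, user_to_peer, peer_scores, fraction):
    """Share of predicted top peers that are actual top peers.

    Assumed definition: predicted peers are those hosting the top
    `fraction` of users; actual peers are the same number of peers
    with the highest scores in the projection graph.
    """
    predicted = {user_to_peer[u] for u in top_fraction(user_scores, fraction)}
    ranked = sorted(peer_scores, key=lambda p: (-peer_scores[p], p))
    actual = set(ranked[:len(predicted)])
    return len(predicted & actual) / len(predicted)
```

The second method would follow the same pattern with community scores in place of individual user scores.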
17. Summary of Findings
[1, ~150] users/peer:
Projection graph resembles closely social graph
Highest correlation of social & projection graph metrics
Degree & node betweenness can be estimated from
local information (cumulative scores of users)
Cannot estimate well edge betweenness
Above 150 users/peer:
Projection graph topology loses social properties
A highly connected projection graph
No differentiation in peer centrality
Top betweenness centrality users can pinpoint the top
betweenness centrality peers with good accuracy
Overall: Applications can calculate the centrality scores of users infrequently
to estimate peer centrality
Social graph changes slowly compared to P2P network
18. Impact on Applications & Systems
Target high degree peers to:
Decrease search time
Increase breadth of search and diversity of results
Target high betweenness peers to:
Monitor information flow and collect traces
Place data caches and indexes of data location
Quarantine malware outbursts
Disseminate software patches
Tackle P2P churn
Predict centrality of peers to allocate resources
Reduce overlay overhead
Enhance routing tables with P2P edges for faster &
more secure peer discovery
19. Thank you!
This work was supported by NSF Grants:
CNS 0952420 and CNS 0831785
http://www.cse.usf.edu/dsg/
nkourtel@mail.usf.edu