2. Outline
SNA = Complex Network Analysis on Social Networks
Notation & Metrics
  Degree Distribution
  Path Lengths
  Transitivity
Models
  Random Graphs
  Small-Worlds
  Preferential Attachment
Models Discussion
Conclusion
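3. Network
Directed Network
$G = (V, E)$, $E \subset V^2$, with no self-loops: $\{(x, x) \mid x \in V\} \cap E = \emptyset$
$k_i^{out} = \sum_j A_{ij}$, $k_i^{in} = \sum_j A_{ji}$, $k_i = k_i^{in} + k_i^{out}$
Undirected Network
$A$ symmetric: $k_i = \sum_j A_{ji} = \sum_j A_{ij}$
Adjacency Matrix
$A_{ij} = 1$ if $(i, j) \in E$, $0$ otherwise
Degree Distribution
$p_x = \frac{1}{n} \#\{i \mid k_i = x\}$
Average Degree
$\langle k \rangle = n^{-1} \sum_{x \in V} k_x$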
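As a minimal sketch of the degree definitions on this slide, the following computes in-degree, out-degree, the degree distribution and the average degree from an adjacency matrix stored as a list of rows. The function names (`degrees`, `degree_distribution`, `average_degree`) and the example graph are illustrative assumptions, not part of the slides; the directed convention $k_i = k_i^{in} + k_i^{out}$ is used.

```python
from collections import Counter


def degrees(A):
    """Return (k_in, k_out, k) for a directed adjacency matrix A (list of rows)."""
    n = len(A)
    k_out = [sum(A[i][j] for j in range(n)) for i in range(n)]  # row sums
    k_in = [sum(A[j][i] for j in range(n)) for i in range(n)]   # column sums
    k = [k_in[i] + k_out[i] for i in range(n)]
    return k_in, k_out, k


def degree_distribution(k):
    """p_x = #{i : k_i = x} / n, returned as a dict {x: p_x}."""
    n = len(k)
    return {x: count / n for x, count in Counter(k).items()}


def average_degree(k):
    """<k> = (1/n) * sum of all degrees."""
    return sum(k) / len(k)


# Hypothetical directed example with edges 0 -> 1, 0 -> 2, 1 -> 2
A = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]
k_in, k_out, k = degrees(A)
print(k_in, k_out, k, degree_distribution(k), average_degree(k))
```

For an undirected network the matrix is symmetric, so row sums and column sums coincide and either one alone gives $k_i$.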
4. Measure of Transitivity
Local Clustering Coefficient: $C_i = \binom{k_i}{2}^{-1} T(i)$
$T(i)$: # distinct triangles with $i$ as vertex
Clustering Coefficient: $C = \frac{1}{n} \sum_{i \in V} C_i$
$C = \frac{\text{number of closed paths of length 2}}{\text{number of paths of length 2}} = \frac{(\text{number of triangles}) \times 3}{\text{number of connected triples}}$
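The two definitions of C above can be compared on a small graph. This is a sketch under stated assumptions: the graph is undirected and stored as a dict of neighbour sets, $C_i$ is taken as 0 when $k_i < 2$ (a convention chosen here, the slide does not state one), and the function names are hypothetical.

```python
from itertools import combinations


def local_clustering(adj, i):
    """C_i = T(i) / binom(k_i, 2); taken as 0 when k_i < 2 (assumed convention)."""
    nbrs = adj[i]
    k = len(nbrs)
    if k < 2:
        return 0.0
    # T(i): pairs of neighbours of i that are themselves linked
    triangles = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return triangles / (k * (k - 1) / 2)


def average_clustering(adj):
    """C = (1/n) * sum of the local coefficients."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)


def transitivity(adj):
    """(closed paths of length 2) / (paths of length 2)."""
    closed = 0
    total = 0
    for nbrs in adj.values():
        for u, v in combinations(nbrs, 2):
            total += 1
            if v in adj[u]:
                closed += 1
    return closed / total if total else 0.0


# Hypothetical example: triangle 0-1-2 plus a pendant vertex 3 attached to 2
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(average_clustering(adj), transitivity(adj))
```

On this example the averaged local coefficient is 7/12 while the ratio of closed to total paths is 3/5, so the two definitions do not coincide in general.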
5. Shortest Path Length and Diameter
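The matrix product depends on the scalar operations of the semi-ring $(\mathcal{A}, +, \cdot)$, where $\mathcal{A}$ is the set of adjacency matrices:
$AB = A \mathbin{+.\cdot} B$, $[AB]_{ij} = \sum_k A_{ik} \cdot B_{kj}$
Other matrix products make sense: e.g., $(\mathcal{A}, +, \wedge)$ or $(\mathcal{A}, \wedge, +)$, where $\wedge$ is min
We consider: $S_k(M) = M \mathbin{+.\wedge} \left( M^k \mathbin{\wedge.+} M^k \right)$
Shortest path lengths matrix: $L = (S_n \circ \dots \circ S_1)(M)$
Diameter: $d = \max_{ij} L_{ij}$
Average shortest path: $\langle L_{ij} \rangle$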
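To illustrate how a matrix product over the (min, +) semi-ring yields shortest path lengths, here is a small sketch. It uses plain repeated squaring of the distance matrix, which is a standard technique and not necessarily the exact $S_k$ operator of the slide; the function names are hypothetical and the graph is assumed unweighted and connected.

```python
INF = float("inf")


def min_plus(A, B):
    """Matrix product over the (min, +) semi-ring: [AB]_ij = min_k (A_ik + B_kj)."""
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shortest_path_lengths(adj):
    """All-pairs shortest path lengths by repeated min-plus squaring."""
    n = len(adj)
    # Distance matrix for paths of at most 1 hop:
    # 0 on the diagonal, 1 for an edge, inf otherwise.
    L = [[0 if i == j else (1 if adj[i][j] else INF) for j in range(n)]
         for i in range(n)]
    hops = 1
    while hops < n - 1:  # a shortest path has at most n - 1 hops
        L = min_plus(L, L)
        hops *= 2
    return L


# Hypothetical example: path graph 0 - 1 - 2 - 3
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
L = shortest_path_lengths(adj)
diameter = max(max(row) for row in L)
pairs = [L[i][j] for i in range(4) for j in range(4) if i != j]
average_shortest_path = sum(pairs) / len(pairs)
print(diameter, average_shortest_path)
```

Each squaring doubles the maximum number of hops accounted for, so about $\log_2 n$ products are enough.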
6. Computational Complexity of ASPL
All pairs shortest path, matrix based (parallelizable): $O(n^{3+\alpha})$, $\alpha \approx 3/4$
All pairs shortest path, Bellman-Ford: $O(n^3)$
All pairs shortest path, Dijkstra w. Fibonacci Heaps: $O(n^2 \log n + nm)$
Computing the CPL
$x = M_q(S)$: $q \#S$ elements are $\le x$ and $(1-q) \#S$ are $> x$
$x = L_{q\delta}(S)$: $q \#S (1-\delta)$ elements are $\le x$ and $(1-q) \#S (1-\delta)$ are $> x$
Huber Algorithm
Let $R$ be a random sample of $S$ such that $\#R = s$, with $s = \frac{2}{q^2} \ln\frac{2}{\epsilon} \frac{(1-\delta)^2}{\delta^2}$; then $L_{q\delta}(S) = M_q(R)$ with probability $p = 1 - \epsilon$.
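A sketch of the sampling idea behind the Huber algorithm: instead of sorting all of S, take the q-quantile of a random sample of size s. The sample-size formula is the one reconstructed from this slide; sampling with replacement, the quantile convention and the function names are assumptions made for illustration.

```python
import math
import random


def huber_sample_size(q, delta, eps):
    """s = (2 / q^2) * ln(2 / eps) * ((1 - delta) / delta)^2, rounded up."""
    return math.ceil(2.0 / q**2 * math.log(2.0 / eps)
                     * ((1.0 - delta) / delta) ** 2)


def quantile(values, q):
    """M_q: smallest x such that at least q * #S elements are <= x."""
    ordered = sorted(values)
    idx = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[idx]


def approximate_quantile(values, q, delta, eps, rng=random):
    """Estimate M_q(S) as M_q(R), with R a random sample of size s.

    Sampling is done with replacement (an assumption of this sketch).
    """
    s = huber_sample_size(q, delta, eps)
    sample = [rng.choice(values) for _ in range(s)]
    return quantile(sample, q)


# Hypothetical usage: approximate median of 100000 path lengths
rng = random.Random(0)
lengths = [rng.randint(1, 20) for _ in range(100000)]
print(huber_sample_size(0.5, 0.1, 0.05),
      approximate_quantile(lengths, 0.5, 0.1, 0.05, rng))
```

The sample size depends only on q, δ and ε, not on the size of S, which is what makes the approach attractive for large networks.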
8. Facebook Hugs Degree Distribution
Nodes: 1322631 Edges: 1555597
m/n: 1.17 CPL: 11.74
Clustering Coefficient: 0.0527
Number of Components: 18987
Isles: 0
Largest Component Size: 1169456
[Figure: log-log plot of the degree distribution; x axis from 1 to 1000, y axis from 1 to 10000000]
For large k we have statistical fluctuations
For small k power-laws do not hold
9. Many networks have power-law degree distribution
$p_k \propto k^{-\gamma}$, $\gamma > 1$
$\langle k^r \rangle = ?$
• Citation networks
• Biological networks
• WWW graph
• Internet graph
• Social Networks
[Figure: log-log plot of a power-law with gamma=3; x axis from 1 to 1000, y axis from 0.1 to 1000000]
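10. Erdős-Rényi Random Graphs
Ensembles of Graphs: $G(n, p)$, $G(n, m)$
$\Pr(A_{ij} = 1) = p$
Connectedness Threshold: $p = \log n / n$
[Figure: panels for different values of p]
When we describe values of properties, we actually mean the expected value of the property:
$d := \langle d \rangle = \sum_G \Pr(G) \cdot d(G) \propto \frac{\log n}{\log \langle k \rangle}$
$\Pr(G) = p^m (1-p)^{\binom{n}{2} - m}$
$\langle m \rangle = \binom{n}{2} p$, $\langle k \rangle = (n-1) p$, $C = \langle k \rangle (n-1)^{-1}$
$p_k = \binom{n-1}{k} p^k (1-p)^{n-1-k}$; for $n \to \infty$, $p_k = e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}$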
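The statement that properties are expected values over the ensemble can be checked numerically. The sketch below samples a few graphs from $G(n, p)$ and compares the empirical mean degree with $(n-1)p$; the function names, the seed and the parameter values are arbitrary choices for illustration.

```python
import random
from itertools import combinations


def gnp(n, p, rng=random):
    """Sample G(n, p): each of the binom(n, 2) possible edges is present
    independently with probability p."""
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def mean_degree(adj):
    """Average number of neighbours per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


rng = random.Random(42)
n, p = 300, 0.02
samples = [mean_degree(gnp(n, p, rng)) for _ in range(10)]
empirical = sum(samples) / len(samples)  # average over the sampled ensemble
expected = (n - 1) * p                   # <k> = (n - 1) p
print(empirical, expected)
```

Individual graphs fluctuate around the expected value; averaging over more samples narrows the gap.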
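11. Watts-Strogatz Model
In the modified model, we only add the edges.
$k_i = \kappa + s_i$ ($\kappa$: edges in the lattice; $s_i$: # added shortcuts)
$p_s = e^{-\kappa p} \frac{(\kappa p)^s}{s!}$
$p_k = e^{-\kappa p} \frac{(\kappa p)^{k-\kappa}}{(k-\kappa)!}$
$C = \frac{3(\kappa - 2)}{4(\kappa - 1) + 8 \kappa p + 4 \kappa p^2}$
CPL $\approx \frac{\log(n p \kappa)}{\kappa^2 p}$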
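A minimal sketch of the modified model (shortcuts are added, no lattice edge is removed), together with the closed-form expressions for C and for the shortcut distribution. It assumes an even κ, one shortcut attempt per lattice edge, and it discards attempts that would create a self-loop or a duplicate edge; these choices and all function names are assumptions of the sketch, not taken from the slides.

```python
import math
import random


def ring_lattice(n, kappa):
    """Ring of n nodes, each linked to its kappa nearest neighbours (kappa even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, kappa // 2 + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def add_shortcuts(adj, kappa, p, rng=random):
    """For each of the n * kappa / 2 lattice edges, add one random shortcut
    with probability p. Returns the number of shortcuts actually added."""
    n = len(adj)
    added = 0
    for _ in range(n * kappa // 2):
        if rng.random() < p:
            u, v = rng.randrange(n), rng.randrange(n)
            if u != v and v not in adj[u]:  # skip self-loops and duplicates
                adj[u].add(v)
                adj[v].add(u)
                added += 1
    return added


def clustering_formula(kappa, p):
    """C = 3 (kappa - 2) / (4 (kappa - 1) + 8 kappa p + 4 kappa p^2)."""
    return 3 * (kappa - 2) / (4 * (kappa - 1) + 8 * kappa * p + 4 * kappa * p**2)


def shortcut_pmf(s, kappa, p):
    """Poisson probability of s shortcut ends at a node, with mean kappa * p."""
    return math.exp(-kappa * p) * (kappa * p) ** s / math.factorial(s)


lattice = ring_lattice(1000, 4)
added = add_shortcuts(lattice, 4, 0.1, random.Random(1))
extra = sum(len(nbrs) for nbrs in lattice.values()) / 1000 - 4
print(added, extra, clustering_formula(4, 0.1))
```

The average number of shortcut ends per node (`extra`) should be close to κp, which is the mean of the Poisson distribution $p_s$.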
12. Watts-Strogatz Model - 10000 nodes k = 4
[Figure: CPL(p)/CPL(0) and C(p)/C(0) plotted against p; both axes from 0 to 1]
Short CPL
Large Clustering Coefficient
14. Barabási-Albert Model
BARABASI-ALBERT-MODEL(G, M0, STEPS)
  FOR K FROM 1 TO STEPS
    N0 ← NEW-NODE(G)
    ADD-NODE(G, N0)
    A ← MAKE-ARRAY()
    FOR N IN NODES(G)
      PUSH(A, N)
      FOR J FROM 1 TO DEGREE(N)
        PUSH(A, N)
    FOR J FROM 1 TO M0
      N ← RANDOM-CHOICE(A)
      ADD-LINK(N0, N)
$\Pr(V = x) = \sum_{e \in N(x)} \Pr(E = e) = \frac{k_x}{m} = \frac{2 k_x}{\sum_x k_x}$
$p_k \propto k^{-3}$
CPL $\approx \frac{\log n}{\log \log n}$
$C \approx n^{-3/4}$
Scale-free entails short CPL
Transitivity disappears with network size
No analytical proof available
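The pseudocode above can be sketched in Python as follows. This is an illustrative version under stated assumptions: the graph starts from `n0` isolated nodes (the pseudocode takes an existing G instead), the candidate array is built before the new node is inserted so that the new node cannot link to itself, and repeated picks of the same target collapse into a single edge. As in the pseudocode, each node appears once plus once per incident edge, so the selection weight is k + 1 rather than exactly k.

```python
import random


def barabasi_albert(n0, m, steps, rng=random):
    """Grow an undirected graph {node: set(neighbours)} by preferential attachment.

    n0:    number of initial isolated nodes (assumption of this sketch)
    m:     number of link attempts per new node
    steps: number of nodes to add
    """
    adj = {i: set() for i in range(n0)}
    for _ in range(steps):
        # One entry per node plus one per incident edge: weight k + 1.
        pool = []
        for node, nbrs in adj.items():
            pool.extend([node] * (len(nbrs) + 1))
        new = len(adj)  # label of the new node
        adj[new] = set()
        for _ in range(m):
            target = rng.choice(pool)
            adj[new].add(target)
            adj[target].add(new)
    return adj


graph = barabasi_albert(3, 2, 500, random.Random(7))
edges = sum(len(nbrs) for nbrs in graph.values()) // 2
max_degree = max(len(nbrs) for nbrs in graph.values())
print(len(graph), edges, max_degree)
```

Rebuilding the array at every step mirrors the pseudocode and costs time proportional to the current graph size; keeping one growing array across steps would avoid that, at the price of departing from the slide.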
15.
OSN          Refs.            Users   Links    <k>    C     CPL   d     γ     r
Club Nexus   Adamic et al     2.5 K   10 K     8.2    0.2   4     13    n.a.  n.a.
Cyworld      Ahn et al        12 M    191 M    31.6   0.2   3.2   16          -0.1
Cyworld T    Ahn et al        92 K    0.7 M    15.3   0.3   7.2   n.a.  n.a.  0.4
LiveJournal  Mislove et al    5 M     77 M     17     0.3   5.9   20          0.2
Flickr       Mislove et al    1.8 M   22 M     12.2   0.3   5.7   27          0.2
Twitter      Kwak et al       41 M    1700 M   n.a.   n.a.  4     4.1   n.a.
Orkut        Mislove et al    3 M     223 M    106    0.2   4.3   9     1.5   0.1
Orkut        Ahn et al        100 K   1.5 M    30.2   0.3   3.8   n.a.  3.7   0.3
Youtube      Mislove et al    1.1 M   5 M      4.29   0.1   5.1   21          -0
Facebook     Gjoka et al      1 M     n.a.     n.a.   0.2   n.a.  n.a.        0.23
FB H         Nazir et al      51 K    116 K    n.a.   0.4   n.a.  29    n.a.
FB GL        Nazir et al      277 K   600 K    n.a.   0.3   n.a.  45    n.a.
BrightKite   Scellato et al   54 K    213 K    7.88   0.2   4.7   n.a.  n.a.
FourSquare   Scellato et al   58 K    351 K    12     0.3   4.6   n.a.  n.a.
LiveJournal  Scellato et al   993 K   29.6 M   29.9   0.2   4.9   n.a.  n.a.
Twitter      Java et al       87 K    829 K    18.9   0.1   n.a.  6           0.59
Twitter      Scellato et al   409 K   183 M    447    0.2   2.8   n.a.  n.a.
16.
      Static   Deg       C         Rigid
ER    Yes      Poisson   Low       -
WS    Yes      Poisson   Ok        Yes
BA    No       PL γ=3    Fixable   Yes
• Moreover:
• Mostly no navigability
• Uniformity assumption
• Sometimes too complex for analytic study
• Few features studied
• Power-law?
17. Alternative models for degree distributions
Power-laws are difficult to fit.
When they do fit, other distributions often fit better.
Power-law with cutoff almost always fits better than plain power-law:
$f(x; \gamma, \beta) = x^{-\gamma} e^{\beta x}$
Sometimes the log-normal distribution is more appropriate:
$f(x; \sigma, m) = \frac{1}{x \sigma (2\pi)^{1/2}} \exp\left( -\frac{(\log(x/m))^2}{2\sigma^2} \right)$
Most of the time, random and preferential attachment processes act together:
$F(x; r) = 1 - (rm)^{1+r} (x + rm)^{-(1+r)}$
$r \to 0$: scale-free; $r \to \infty$: negative exponential dist.
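The two limits of $F(x; r)$ can be checked numerically. The sketch below compares $F$ for large r with a negative exponential CDF of mean m, and checks that for small r the tail behaves like a power law. The function names and the parameter values are arbitrary choices for illustration.

```python
import math


def mixed_cdf(x, r, m):
    """F(x; r) = 1 - (r m)^(1 + r) * (x + r m)^(-(1 + r)),
    written as a single ratio to avoid overflow for large r."""
    return 1.0 - (r * m / (x + r * m)) ** (1.0 + r)


def exponential_cdf(x, m):
    """Negative exponential CDF with mean m."""
    return 1.0 - math.exp(-x / m)


m, x = 4.0, 10.0

# r -> infinity: F approaches the negative exponential CDF
for r in (1.0, 10.0, 1000.0):
    print(r, mixed_cdf(x, r, m), exponential_cdf(x, m))

# r -> 0: the tail 1 - F decays like x^-(1 + r), so doubling x
# multiplies the tail by roughly 2^-(1 + r)
r_small, x_large = 0.01, 100.0
tail_ratio = ((1.0 - mixed_cdf(2 * x_large, r_small, m))
              / (1.0 - mixed_cdf(x_large, r_small, m)))
print(tail_ratio, 2.0 ** -(1.0 + r_small))
```

The exponential limit follows from $(1 + x/(rm))^{-(1+r)} \to e^{-x/m}$ as $r \to \infty$.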
18. Milgram’s Experiment
• Random people from Omaha (Nebraska) & Wichita (Kansas) were asked to send a postcard to a person in Boston (Massachusetts):
  • Write the name on the postcard
  • Forward the message only to people personally known who were more likely to know the target
1st run: 64/296 arrived, most delivered to him by 2 men
2nd run: 24/160 arrived, 2/3 delivered by “Mr. Jacobs”
2 ≤ hops ≤ 10; µ=5.x
6 Degrees
CPL, hubs, ... and Kleinberg’s Intuition
19. Biased Preferential Attachment
At each step:
A new node is added to the network and is assigned to one of the sets P, I and L according to a probability distribution h
$e_0 \in \mathbb{N}^+$ edges are added to the network
for each edge (u,v), u is chosen with distribution $D^0$ and:
  if u ∈ I, v is a new node and is assigned to P;
  if u ∈ L, v is chosen according to $D^\gamma$.
$D^\beta(u) \propto \begin{cases} (\beta + 1)(k_u + 1) & u \in L \\ k_u + 1 & u \in I \\ 0 & u \in P \end{cases}$
No analytic results available.
20. Transitive Linking Model [Davidsen 02]
At each step:
TL: a random node is chosen, and it introduces to each other two other nodes that are linked to it; if the node does not have 2 edges, it introduces itself to a random node
RM: with probability p a node is chosen and removed along with its edges, and replaced with a node with one random edge
• When $p \ll 1$ the TL dominates the process:
  • the degree distribution is a power-law with cutoff
  • $1 - C = p(\langle k \rangle - 1)$, i.e., C is quite large in practice
• For larger values of p the two different processes concur to form an exponential degree distribution
• For $p \approx 1$ the degree distribution is essentially a Poisson distribution
Source: Bergenti, Franchi, Poggi (Univ. Parma), Models for Agent-based Simulation of SN, SNAMAS ’11
Instead of p it would make sense to have distinct p and r parameters for nodes leaving and entering the network
Few analytic results available.
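One step of the TL and RM rules described above might be sketched as follows. This is an illustration under assumptions: the graph is undirected and stored as a dict of neighbour sets, the removed node is replaced by a fresh node that reuses the same label, and the function name is hypothetical.

```python
import random


def transitive_linking_step(adj, p, rng=random):
    """Apply one TL move and, with probability p, one RM move, in place."""
    nodes = list(adj)

    # TL: a random node introduces two of its neighbours to each other.
    i = rng.choice(nodes)
    if len(adj[i]) >= 2:
        u, v = rng.sample(sorted(adj[i]), 2)
        adj[u].add(v)
        adj[v].add(u)
    else:
        # Fewer than 2 edges: the node introduces itself to a random other node.
        j = rng.choice([n for n in nodes if n != i])
        adj[i].add(j)
        adj[j].add(i)

    # RM: with probability p a random node is removed along with its edges
    # and replaced by a node with one random edge (the label is reused).
    if rng.random() < p:
        r = rng.choice(nodes)
        for nbr in adj[r]:
            adj[nbr].discard(r)
        adj[r] = set()
        j = rng.choice([n for n in nodes if n != r])
        adj[r].add(j)
        adj[j].add(r)


# Hypothetical run: 50 nodes, no initial edges, p = 0.05
network = {i: set() for i in range(50)}
rng = random.Random(3)
for _ in range(2000):
    transitive_linking_step(network, 0.05, rng)
print(sum(len(nbrs) for nbrs in network.values()) / len(network))
```

Using two separate probabilities for removal and insertion, as suggested on the slide, would only require splitting the RM branch into two independent draws.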
21.
[1] Dorogovtsev, S. N. and Mendes, J. F. F. 2003. Evolution of Networks: From Biological Nets to the Internet and WWW (Physics). Oxford University Press, USA.
[2] Watts, D. J. 2003. Small Worlds: The Dynamics of Networks between Order and Randomness (Princeton Studies in Complexity). Princeton University Press.
[3] Jackson, M. O. 2010. Social and Economic Networks. Princeton University Press.
[4] Newman, M. 2010. Networks: An Introduction. Oxford University Press, USA.
[5] Wasserman, S. and Faust, K. 1994. Social Network Analysis: Methods and Applications (Structural Analysis in the Social Sciences). Cambridge University Press.
[6] Scott, J. P. 2000. Social Network Analysis: A Handbook. Sage Publications Ltd.
[7] Kepner, J. and Gilbert, J. 2011. Graph Algorithms in the Language of Linear Algebra (Software, Environments, and Tools). Society for Industrial & Applied Mathematics.
[8] Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. 2009. Introduction to Algorithms. The MIT Press.
[9] Skiena, S. S. 2010. The Algorithm Design Manual. Springer.
[10] Bollobás, B. 1998. Modern Graph Theory. Springer.
[11] Watts, D. J. and Strogatz, S. H. 1998. Collective dynamics of ‘small-world’ networks. Nature. 393, 6684, 440-442.
[12] Barabási, A. L. and Albert, R. 1999. Emergence of scaling in random networks. Science. 286, 5439, 509.
[13] Kleinberg, J. 2000. The small-world phenomenon: an algorithm perspective. Proceedings of the thirty-second annual ACM symposium on Theory of computing. 163-170.
[14] Milgram, S. 1967. The small world problem. Psychology today. 2, 1, 60-67.
22. Thanks for your kind attention.
Enrico Franchi (efranchi@ce.unipr.it)
AOTLAB, Dipartimento Ingegneria dell’Informazione,
Università di Parma