Dynamics in large scale networks
John Clements
Supervised by: Dr. Babak Farzad, Dr. Henryk Fukś
Brock University
jc09xs@brocku.ca
February 01 2016
John Clements (Brock University) Dynamics in large scale networks February 01 2016 1 / 65
Table of Contents
1 Introduction
Definitions
2 A Brief History of Large network dynamics
Patterns in the removal of nodes from large networks.
Network properties
3 High Clustering
4 Node expiration
Connectivity and node expiration.
Degree
Clustering Coefficient
Conclusions
5 The server merger
Graphical evolution
The servers before the merger
The merger.
Degree differences.
6 Graph motifs
Finding bounds on the 3 node subgraphs.
Graph Theory Definition
Graph
A graph G is an ordered pair (V(G), E(G)) consisting of a set V(G) of vertices and a set E(G) of edges that form connections between them.
Network analysis Definitions
Degree
The degree of a vertex v in a graph G, denoted $k_G(v)$, is the number of edges of G incident with v.[?]
Clustering coefficient
The clustering coefficient of a node v is:
\[ c_v = \frac{2T(v)}{k_v (k_v - 1)} \]
where T(v) is the number of triangles (i.e. pairs of connected neighbors) that v is involved in. The clustering coefficient of a node of degree 0 or 1 is set to 0.[?]
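The two definitions above can be checked on a toy graph. The following is a minimal sketch, assuming the graph is stored as a dictionary mapping each node to the set of its neighbors; the function names and the example graph are illustrative and do not come from the original analysis.

```python
from itertools import combinations

def degree(adj, v):
    """Number of edges incident with v in an undirected simple graph."""
    return len(adj[v])

def clustering_coefficient(adj, v):
    """c_v = 2 T(v) / (k_v (k_v - 1)), set to 0 for degree 0 or 1."""
    k = len(adj[v])
    if k < 2:
        return 0.0
    # T(v): pairs of neighbors of v that are themselves connected
    triangles = sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
    return 2.0 * triangles / (k * (k - 1))

# Toy graph (hypothetical): triangle 0-1-2 plus a pendant node 3 attached to 0
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
for v in adj:
    print(v, degree(adj, v), clustering_coefficient(adj, v))
```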
A Brief Overview of Large Network Dynamics
There is an enormous number of studies and analyses of real-world large networks, covering nearly every type of online network.
Alongside these studies are network models.
However, the removal of nodes from large networks has rarely been studied empirically, and it has been incorporated into very few dynamic network models.
Many dynamic models have been proposed. These often include an edge removal process; models that use a node removal process are much rarer.
Examples: six degrees of separation, the actor network.
These studies range from single snapshots and painstakingly gathered survey data to event-based dynamic studies.
Most of these studies do not account for the removal of nodes.
Another area important to us is network modeling: many studies of real-world networks propose a model or provide a best fit of one.
Models
• Nodal attribute models
• Exponential random graphs
The two datasets.
The businesses competing for ad space on Google and Bing.
Why did we choose these networks in particular?
We looked at the removal, or lapse, process for businesses in a network of businesses competing for ad space on Google and Bing.
The network of friendships among avatars in the MMOFPS PlanetSide 2.
We looked for patterns in the removal, or expiration, of avatars in the massively multiplayer online game (MMOG) PlanetSide 2.
Look at the merger of two servers.
Examine the removal or lapse process for avatars, looking for a simple rule.
Why did we collect these?
• We thought we could find patterns in the removal or lapse of these nodes.
• Competition example
Crawler overview.
1 Gather a list of active avatar Ids from the server we want to crawl. Add them to the queue of Ids to check.
2 Get the friend lists of all avatar Ids in the queue from the API. If a request succeeds, remove that Id from the queue and add it to the list of visited Ids.
3 Go through each friend list and identify which Ids are valid. Save each of these valid friend relationships to the edge set.
4 While there are Ids in the queue, go to step two.
5 Record the edge list in a SQL table.
6 Gather the avatar attributes for each of the Ids found in the crawl and record them to a SQL table.
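Steps 1 to 4 of the crawler can be sketched as a queue-driven loop. This is only an illustration under stated assumptions: `fetch_friends`, `is_valid`, `max_attempts` and `fake_api` are hypothetical names, the real game API is replaced by a dictionary lookup, and the SQL steps (5 and 6) are left out.

```python
from collections import deque

def crawl(seed_ids, fetch_friends, is_valid, max_attempts=3):
    """Collect an undirected friendship edge set for the queued avatar Ids.

    fetch_friends(avatar_id) is assumed to return a list of friend Ids,
    or None when the request fails (hypothetical interface).
    """
    queue = deque(seed_ids)
    attempts = {}
    visited, edges = set(), set()
    while queue:                                   # step 4: loop until empty
        avatar_id = queue.popleft()
        if avatar_id in visited:
            continue
        friends = fetch_friends(avatar_id)         # step 2: query the API
        if friends is None:
            # failed request: requeue, but give up after max_attempts tries
            attempts[avatar_id] = attempts.get(avatar_id, 0) + 1
            if attempts[avatar_id] < max_attempts:
                queue.append(avatar_id)
            continue
        visited.add(avatar_id)
        for friend_id in friends:                  # step 3: keep valid Ids
            if friend_id != avatar_id and is_valid(friend_id):
                edges.add(frozenset((avatar_id, friend_id)))
    return visited, edges

# Stand-in for the API: avatar Id -> friend list (Id 99 is not a valid avatar)
fake_api = {1: [2, 3], 2: [1], 3: [1, 99]}
visited, edges = crawl([1, 2, 3], fake_api.get, lambda i: i in fake_api)
print(sorted(visited), sorted(tuple(sorted(e)) for e in edges))
```

Storing each edge as a `frozenset` means a friendship reported from both sides is recorded only once.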
PlanetSide 2 avatar attributes
The PlanetSide 2 servers
Datasets
Server location   7 days   44 days
US East           EW       Emerald
US West           CW       Connery
EU                MW       Miller
Our data is drawn from three PlanetSide 2 servers:
• Connery, the US West server.
• Emerald, the server created from the merger of Waterson and Mattherson.
• Miller, an EU server.
The available avatar attributes depend on when they were gathered.
Common to both datasets.
• Id
• Name
Exclusive
Avatars online in the last 44 days, stored in Connery, Emerald and Miller.
• Includes the server merger
• Starts on the 23rd of May.
Avatars online in the last 7 days, referred to by CW, EW and MW.
• Outfit Id
• Outfit size
• Creation date
• Login count
• Last login date
• Total time played and time played by month
• Number of kills and deaths by month.
Correlation between attributes
Average attribute correlation matrix for CW.
             Degree  CC      Br      Kills   Deaths  K/D     Time    Outfit Size
Degree        1.000
CC           -0.005   1.000
Br            0.305   0.059   1.000
Kills         0.210   0.006   0.499   1.000
Deaths        0.206   0.001   0.445   0.794   1.000
K/D           0.103   0.004   0.403   0.324   0.194   1.000
Time          0.246   0.004   0.510   0.792   0.892   0.280   1.000
Outfit Size   0.024  -0.033   0.088   0.003   0.065   0.008   0.056   1.000
Google Ad network visualization

High Clustering
Clustering coefficient in the PlanetSide 2 snapshots.
Advertisement networks.
Avatar states
Active vs Inactive
• An avatar that has been active since the previous snapshot is active.
• An inactive avatar is any avatar that is not active but is seen active again in the future.
Avatar states
• A new avatar is any avatar created after the previous snapshot.
• A dead or abandoned avatar is any avatar that never returns from inactivity.
• The third group is the immediately abandoned (IA) new avatars.
Small world for long diameters
            Advertisement network    Random graphs
            Bing      Google         Bing               Google
Diameter    7         8              3                  4
APL         2.528     2.752          2.945 (0.00180)    5.108 (0.00696)
Table: The diameter and average path length of the competition network
The clustering coefficient distribution of Google.
Panels: Full; Without the spike
The clustering coefficient distribution of Bing.
Panels: Full; Without the spike
Emerald, August 18th: a typical distribution of avatar clustering coefficient
Panels: Full; Without the spike
Edges connecting failed companies.
Compared with edges in random subgraphs.

Edge dynamics
                       CW       EW       MW
Edge formation
existing ↔ existing    29.97%   29.23%   25.15%
existing ↔ new          4.74%    5.27%    4.85%
existing ↔ IA           3.31%    3.71%    3.32%
new ↔ new               0.45%    0.53%    0.56%
new ↔ IA                0.30%    0.41%    0.39%
IA ↔ IA                 0.23%    0.29%    0.30%
Edge deletion
One removed            53.15%   54.70%   52.01%
Both removed            4.30%    5.02%    4.54%
Broken                  4.06%    4.23%    4.69%
Unstable                0.29%    0.30%    0.26%
Power law Degree Distribution
The impact of the degree
• Power law: $x^{-\alpha}$
• Exponential: $e^{-\lambda x}$
• Power law with exponential cutoff: $x^{-\alpha} e^{-\lambda x}$
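To see how the three candidate forms differ in the tail, they can be normalised over a finite range of degrees. This is a sketch only: the parameter values, the cutoff at `k_max` and the function name `candidate_pmfs` are assumptions made for illustration, not fitted values from the datasets.

```python
import math

def candidate_pmfs(k_max, alpha, lam):
    """Normalise the three functional forms over the degrees 1..k_max."""
    ks = range(1, k_max + 1)
    forms = {
        "power_law": [k ** -alpha for k in ks],
        "exponential": [math.exp(-lam * k) for k in ks],
        "cutoff": [k ** -alpha * math.exp(-lam * k) for k in ks],
    }
    return {name: [w / sum(ws) for w in ws] for name, ws in forms.items()}

# Illustrative parameters (assumed, not estimated from data)
pmfs = candidate_pmfs(k_max=100, alpha=2.0, lam=0.1)
for name, pmf in pmfs.items():
    print(name, pmf[0], pmf[-1])
```

With these parameters the exponential cutoff suppresses the probability of the largest degrees far below that of the pure power law.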
Degree of failed companies
Panels: Bing; Google
Degree of Dead Avatars
Clustering coefficient distribution of failed companies
Panels: Bing; Google
The normalized battle rank distribution.
Avatar state by size of outfit.
Creation date.
Conclusions
• Generally, the nodes that were removed from both networks were peripheral, in unimportant positions.
• But none of the patterns we did find were strong indicators in the end.
We initially started collecting the PlanetSide 2 dataset to capture the merger of two servers.
• This is the first time that a server merger has been captured and studied.
• Provides an easily studied analog to real-world mergers of populations.
Reading the graphs
These images were created using Gephi [?] with the ForceAtlas 2 layout. Node size scales linearly with degree, and colour is assigned according to the following table.
Colour Key
Origin        Faction: NC   TR   VS
Waterson
Mattherson
Neither
Merger: Waterson June 23
Merger: Mattherson June 23
June 30th
July 14th
August 4th
August 18th
September 15
November 17th.
December 17th.
February 23rd.
Assortativity
Assortativity measures the tendency of nodes to be connected to nodes similar to themselves in some way.
The assortativity coefficient is defined as follows:
\[ r = \frac{\sum_i e_{i,i} - \sum_i a_i^2}{1 - \sum_i a_i^2} \]
where $e_{i,j}$ is the fraction of edges connecting vertices of type i to vertices of type j, and $a_i$ is the fraction of edges connecting to a vertex of type i.
The minimum is:
\[ r_{\min} = \frac{-\sum_i a_i^2}{1 - \sum_i a_i^2} \]
which occurs when $e_{i,i} = 0$ for all i, that is, when no edge joins two vertices of the same type.
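The coefficient can be computed directly from an edge list and a type label per node. The sketch below assumes undirected edges, so each edge contributes half of its weight to $e_{i,j}$ and half to $e_{j,i}$; the function name and the toy labels ("W" and "M" standing in for server of origin) are illustrative only.

```python
def assortativity(edges, node_type):
    """r = (sum_i e_ii - sum_i a_i^2) / (1 - sum_i a_i^2) for undirected edges.

    Undefined (division by zero) when every edge end has the same type.
    """
    types = sorted({node_type[u] for edge in edges for u in edge})
    index = {t: i for i, t in enumerate(types)}
    n = len(types)
    e = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        i, j = index[node_type[u]], index[node_type[v]]
        # split each undirected edge evenly between e[i][j] and e[j][i]
        e[i][j] += 0.5 / len(edges)
        e[j][i] += 0.5 / len(edges)
    a = [sum(row) for row in e]
    trace = sum(e[i][i] for i in range(n))
    sum_a2 = sum(x * x for x in a)
    return (trace - sum_a2) / (1.0 - sum_a2)

# Hypothetical origin labels for four avatars
origin = {1: "W", 2: "W", 3: "M", 4: "M"}
print(assortativity([(1, 2), (3, 4)], origin))   # only same-origin edges
print(assortativity([(1, 3), (2, 4)], origin))   # only cross-origin edges
```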
The assortativity by origin.
Degree difference of cross-origin edges.
Degree difference of Mattherson to Waterson edges.
Graph motif
Finding the potential motifs in a Barabási-Albert graph.
Definition
The graph motifs of a network are patterns that occur significantly more often in it than expected in an ensemble of networks.[?]
The significance of a motif is measured with the simple Z score.
Significance
\[ Z = \frac{M - \bar{M}_r}{\sigma_r} \]
where M is the number of occurrences of the subgraph in the network, and $\bar{M}_r$ and $\sigma_r$ are the mean and standard deviation of the number found in the ensemble.
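As a small illustration of the Z score, the sketch below compares one observed subgraph count against counts from an ensemble. The counts are made-up numbers, and the use of the population standard deviation (`pstdev`) rather than the sample version is an assumption, since the slides do not say which was used.

```python
from statistics import mean, pstdev

def motif_z_score(observed, ensemble_counts):
    """Z = (M - mean of ensemble counts) / standard deviation of ensemble counts."""
    mu = mean(ensemble_counts)
    sigma = pstdev(ensemble_counts)   # population standard deviation (assumed)
    if sigma == 0:
        raise ValueError("ensemble counts have zero variance; Z is undefined")
    return (observed - mu) / sigma

# Made-up counts of one subgraph in the network and in four ensemble graphs
z = motif_z_score(30, [10, 12, 8, 10])
print(z)
```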
History of network motifs
Introduced by Shen-Orr et al.
Most research has focused on finding motifs efficiently, with tools such as:
• FANMOD 2006
• KAVOSH 2009
In 2013 Johan Ugander found the extremal bounds on the potential subgraphs found in any network by its density.
Significance profile example
The Barabási-Albert Algorithm
The algorithm takes two parameters: N, the number of nodes in the final graph, and m, the number of edges each new node forms to existing nodes.
• Create a graph with m unconnected nodes.
• While there are fewer than N nodes in the network, add a node with m edges to existing nodes.
• The probability of choosing an existing node v is proportional to its degree.[?]
\[ P(v) = \frac{k_v}{\sum_{i \in V(G)} k_i} \tag{1} \]
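The steps above can be sketched as follows. This is a minimal illustration rather than the implementation used for the results: the function name is hypothetical, the first new node is assumed to link to all m seed nodes (their degrees are all zero, so equation (1) is undefined at that point), and distinct targets are obtained by redrawing duplicates, which only approximates strict degree-proportional sampling.

```python
import random

def barabasi_albert(N, m, seed=None):
    """Return the edge list of a BA graph grown from m unconnected nodes."""
    if not 1 <= m < N:
        raise ValueError("need 1 <= m < N")
    rng = random.Random(seed)
    edges = []
    endpoints = []               # node v appears here once per incident edge
    targets = list(range(m))     # assumption: first new node joins all seeds
    for new in range(m, N):
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
        # choose m distinct targets for the next node; drawing uniformly from
        # `endpoints` selects node v with probability k_v / sum_i k_i
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = list(chosen)
    return edges

edges = barabasi_albert(N=50, m=3, seed=1)
print(len(edges))
```

Every node added after the seeds contributes exactly m edges, so the edge count depends only on N and m.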
Random graph ensemble
• Traditionally the ensemble consists of random graphs with the same degree distribution as the original network.
• However, this method results in some correlations that arise from the degree distribution itself.
• So we used an ensemble of $G_{n,p}$ random graphs with the same density as the original.
$G_{n,p}$ random graph
There are two parameters, n and p. Generate a graph with n nodes; for every pair of nodes, add an edge independently with probability p. The expected density of such a graph is equal to p.
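A $G_{n,p}$ graph as described above can be generated in a few lines. The sketch below is illustrative (the function names are not from the original work); to match an observed network, p would be set to that network's density.

```python
import random
from itertools import combinations

def gnp_random_graph(n, p, seed=None):
    """Include each of the C(n, 2) possible edges independently with probability p."""
    rng = random.Random(seed)
    return [pair for pair in combinations(range(n), 2) if rng.random() < p]

def density(n, edges):
    """Fraction of the possible node pairs that are connected."""
    return 2 * len(edges) / (n * (n - 1))

sample = gnp_random_graph(200, 0.05, seed=42)
print(density(200, sample))   # fluctuates around p = 0.05
```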
The 4 undirected triads
Possible triads
Triad        Probability
Empty        $(1 - p)^3$
One edge     $3p(1 - p)^2$
Open triad   $3p^2(1 - p)$
Triangle     $p^3$
The density of a completed BA graph is:
\[ p = \frac{2m(N - m)}{N(N - 1)} \tag{2} \]
So we can easily compute the expected number of each triad in a $G_{n,p}$ random graph:
\[ \binom{N}{3}(1 - p)^3, \quad \binom{N}{3}\,3p(1 - p)^2, \quad \binom{N}{3}\,3p^2(1 - p), \quad \binom{N}{3}\,p^3 \]
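The four expectations follow directly from equation (2). A short sketch, with illustrative function names and example parameters chosen here rather than taken from the slides:

```python
from math import comb

def ba_density(N, m):
    """Density of a completed BA graph, equation (2)."""
    return 2 * m * (N - m) / (N * (N - 1))

def expected_triads(N, p):
    """Expected number of each 3-node subgraph in a G(n,p) graph with n = N."""
    triples = comb(N, 3)
    return {
        "empty": triples * (1 - p) ** 3,
        "one_edge": triples * 3 * p * (1 - p) ** 2,
        "open_triad": triples * 3 * p ** 2 * (1 - p),
        "triangle": triples * p ** 3,
    }

counts = expected_triads(100, ba_density(100, 2))
for name, value in counts.items():
    print(name, round(value, 2))
```

The four probabilities sum to one, so the expected counts always add up to the total number of node triples.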
Empirical tests
The triangles created by low values of m.
Probabilistic bounds.
The size of the BA graph at time t depends entirely on the parameters N and m. Since we start from an empty graph and add m edges with every node, the number of edges at any given time must be:
\[ E(G_t) = m(t - m) \tag{3} \]
As a result, many other graph parameters can be calculated at any given step, such as the density:
\[ D = \frac{2m(t - m)}{t(t - 1)} \tag{4} \]
Examples of edge probabilities and bounds.
Additive bounds.
• So when
\[ N > \frac{8m^2 - 1 + \sqrt{16m^3 - 16m^2 + 1}}{8m - 2} \]
the probability of an edge is greater in the BA graph than in a $G_{n,p}$ graph.
• By simply counting how many of each subgraph are added at each step, we can find bounds, at least for small motifs.
• The expected number of any 3-node subgraph in our ensemble is simply $\binom{N}{3}P$, where P is the probability of such a subgraph in a $G_{n,p}$ random graph.
Bounds on triads in the BA graph vs the expected number in the ensemble.
Triangles
At each timestep we add m edges, which create at most $\binom{m}{2}$ triangles, and at least $\binom{m}{2}$ open triads are created at each step after the second.
Barabási-Albert
The upper bound on the number of triangle subgraphs is:
\[ \sum_{t=m+1}^{N} \binom{m}{2} = \frac{1}{2} m(m - 1)(N - m) \tag{5} \]
Random Graph Ensemble
The expected number of triangle subgraphs in a $G_{n,p}$ random graph is:
\[ \binom{N}{3} p^3 = \frac{4}{3} \, \frac{(N - 2)(N - m)^3 m^3}{(N - 1)^2 N^2} \tag{6} \]
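Both triangle formulas are easy to check numerically. The sketch below sums the per-step maximum term by term, confirms the closed form, and evaluates the ensemble expectation with p set to the BA density; the function names and the (N, m) pairs are arbitrary choices made for illustration.

```python
from math import comb

def ba_triangle_upper_bound(N, m):
    """Sum the per-step maximum of C(m, 2) new triangles for t = m+1 .. N."""
    return sum(comb(m, 2) for _ in range(m + 1, N + 1))

def gnp_expected_triangles(N, m):
    """Expected triangles in G(n,p) with n = N and p set to the BA density."""
    p = 2 * m * (N - m) / (N * (N - 1))
    return comb(N, 3) * p ** 3

for N, m in [(10, 2), (100, 2), (100, 50)]:
    bound = ba_triangle_upper_bound(N, m)
    # closed form of the sum: m (m - 1) (N - m) / 2
    assert 2 * bound == m * (m - 1) * (N - m)
    print(N, m, bound, round(gnp_expected_triangles(N, m), 2))
```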
Trivially, the upper bound exceeds the expected number:
\[ \frac{1}{2} m(m - 1)(N - m) > \frac{4}{3} \, \frac{(N - 2)(N - m)^3 m^3}{(N - 1)^2 N^2} \]
for all $1 < m < N$ (for m = 1 the BA graph is a tree and contains no triangles).
Open Triad.
Random Graph Ensemble
The expected number of open triads in a $G_{n,p}$ random graph is:
\[ \binom{N}{3} \, 3p^2(1 - p) = \frac{2(N - 2)(N - m)^2 m^2 (N^2 - 2Nm + 2m^2 - N)}{(N - 1)^2 N^2} \tag{5} \]
Barabási-Albert
The lower bound on the number of open triads in a BA graph is $\frac{1}{2} m(m + 1)(N - m)$, which is compared with the ensemble expectation:
\[ \frac{1}{2} m(m + 1)(N - m) \geq \frac{2(N - 2)(N - m)^2 m^2 (N^2 - 2Nm + 2m^2 - N)}{(N - 1)^2 N^2} \tag{6} \]
Open Triad: solution
Therefore, when m and N are chosen such that $m \geq \frac{N}{2} - 1$, the open triad will be a motif of the resulting graph.
One Edge.
Barabási-Albert
The minimum number of subgraphs containing a single edge:
\[ \sum_{t=m+1}^{N} \left[ \binom{t - 1}{2} - \binom{t - 1 - m}{2} \right] = \frac{1}{2} m(N - 2)(N - m) \tag{5} \]
Random Graph Ensemble
The expected number of subgraphs containing exactly one edge in a $G_{n,p}$ random graph is:
\[ \binom{N}{3} \, 3p(1 - p)^2 = \frac{(N - 2)(N - m) m (N^2 - 2Nm + 2m^2 - N)^2}{(N - 1)^2 N^2} \tag{6} \]
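The closed form of the one-edge sum can be confirmed by evaluating it term by term. A small sketch with an illustrative function name and arbitrary test values:

```python
from math import comb

def ba_one_edge_bound(N, m):
    """Evaluate sum_{t=m+1}^{N} [C(t-1, 2) - C(t-1-m, 2)] term by term."""
    return sum(comb(t - 1, 2) - comb(t - 1 - m, 2) for t in range(m + 1, N + 1))

for N, m in [(6, 3), (20, 4), (100, 2)]:
    total = ba_one_edge_bound(N, m)
    # closed form: m (N - 2) (N - m) / 2
    assert 2 * total == m * (N - 2) * (N - m)
    print(N, m, total)
```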
So this bound on the number of one-edge subgraphs in the BA graph is greater than the expected number in the $G_{n,p}$ ensemble when both of the following hold:
\[ m < \frac{1}{2} N + \frac{1}{2} \sqrt{(\sqrt{2} - 1) N^2 + (2 - \sqrt{2}) N} \tag{5} \]
\[ m > \frac{1}{2} N - \frac{1}{2} \sqrt{(\sqrt{2} - 1) N^2 + (2 - \sqrt{2}) N} \tag{6} \]
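The two thresholds are the roots obtained by setting the BA one-edge bound $\frac{1}{2} m(N - 2)(N - m)$ equal to the ensemble expectation $\binom{N}{3} 3p(1 - p)^2$. The sketch below, with illustrative function names and N = 100 chosen arbitrarily, checks numerically that the bound exceeds the expectation exactly for the values of m lying between the two roots.

```python
from math import comb, sqrt

def one_edge_roots(N):
    """Lower and upper thresholds on m from the two inequalities."""
    half_width = 0.5 * sqrt((sqrt(2) - 1) * N ** 2 + (2 - sqrt(2)) * N)
    return N / 2 - half_width, N / 2 + half_width

def bound_exceeds_expectation(N, m):
    """Compare the BA one-edge bound with the G(n,p) expectation."""
    bound = m * (N - 2) * (N - m) / 2
    p = 2 * m * (N - m) / (N * (N - 1))
    expected = comb(N, 3) * 3 * p * (1 - p) ** 2
    return bound > expected

low, high = one_edge_roots(100)
print(low, high)
for m in range(1, 100):
    assert bound_exceeds_expectation(100, m) == (low < m < high)
```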
Empty.
Barabási-Albert
The upper bound on the number of empty triads in a BA graph is:
\[ \binom{m}{3} + \sum_{t=m+1}^{N} \binom{t - 1 - m}{2} = \frac{1}{6} (N - 2)(N^2 - 3Nm + 3m^2 - N) \tag{7} \]
Random Graph Ensemble
The expected number of empty triads in a $G_{n,p}$ graph is:
\[ \binom{N}{3} (1 - p)^3 = \frac{1}{6} \, \frac{(N - 2)(N^2 - 2Nm + 2m^2 - N)^3}{(N - 1)^2 N^2} \tag{8} \]
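Equations (7) and (8) can both be evaluated directly for given N and m. The sketch below checks the closed form of the sum in (7) and computes (8) with p set to the BA density; the function names and the chosen (N, m) pairs are illustrative.

```python
from math import comb

def ba_empty_upper_bound(N, m):
    """C(m, 3) plus sum_{t=m+1}^{N} C(t-1-m, 2), evaluated term by term."""
    return comb(m, 3) + sum(comb(t - 1 - m, 2) for t in range(m + 1, N + 1))

def gnp_expected_empty(N, m):
    """Expected empty triads in G(n,p) with n = N and p set to the BA density."""
    p = 2 * m * (N - m) / (N * (N - 1))
    return comb(N, 3) * (1 - p) ** 3

for N, m in [(6, 3), (20, 4), (100, 2)]:
    bound = ba_empty_upper_bound(N, m)
    # closed form of (7): (N - 2)(N^2 - 3Nm + 3m^2 - N) / 6
    assert 6 * bound == (N - 2) * (N * N - 3 * N * m + 3 * m * m - N)
    print(N, m, bound, round(gnp_expected_empty(N, m), 2))
```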
For all $N > 5$ the upper bound is less than the expected value of empty subgraphs in the ensemble.
Final bounds
Probabilistic bounds
When the difference between m and N is such that
\[ N \geq \frac{8m^2 - \sqrt{16m^3 - 16m^2 + 1} - 1}{8m - 2} \]
holds, then the triangle and empty triads will never be a motif.
Additive bounds
The open triad will be a motif of a BA graph whenever:
\[ m > \frac{N}{2} - 1 \]
for any $N > 2$.
The single edge triad can only be a motif when:
\[ m < \frac{1}{2} N + \frac{1}{2} \sqrt{(\sqrt{2} - 1) N^2 + (2 - \sqrt{2}) N} \]
and
\[ m > \frac{1}{2} N - \frac{1}{2} \sqrt{(\sqrt{2} - 1) N^2 + (2 - \sqrt{2}) N} \]
for any valid N.
Future work
The formation of a server's social network.
Continuing from the server merger, we have records of the formation of several new servers that could be investigated, letting us learn how the original servers' structure came to be.
Identifying the social structure of players
Continuing from the work on the removal of players from the PlanetSide 2 network and from the ad network, it would be very helpful to have a better way of identifying the parent players or companies. This could potentially change the networks' structure greatly.
Future Work: Additional database analysis.
By their very nature, large datasets will always have more unanswered questions. There are a huge number of potential relationships between the networks, the actors and their removal that we did not have time to test, for example how many of the removed avatars had a typo in their name. This suggests some future work that seems interesting but is either outside the scope of large network analysis or simply something we did not have time to do.
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming GraphsHigh-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs
 
LCF: A Temporal Approach to Link Prediction in Dynamic Social Networks
 LCF: A Temporal Approach to Link Prediction in Dynamic Social Networks LCF: A Temporal Approach to Link Prediction in Dynamic Social Networks
LCF: A Temporal Approach to Link Prediction in Dynamic Social Networks
 
Network Analysis and Law: Introductory Tutorial @ Jurix 2011 Meeting (Vienna)
Network Analysis and Law: Introductory Tutorial @ Jurix 2011 Meeting (Vienna)Network Analysis and Law: Introductory Tutorial @ Jurix 2011 Meeting (Vienna)
Network Analysis and Law: Introductory Tutorial @ Jurix 2011 Meeting (Vienna)
 
Lect2 up010 (100324)
Lect2 up010 (100324)Lect2 up010 (100324)
Lect2 up010 (100324)
 
MIMO
MIMOMIMO
MIMO
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs
 
18 Diffusion Models and Peer Influence
18 Diffusion Models and Peer Influence18 Diffusion Models and Peer Influence
18 Diffusion Models and Peer Influence
 
09 Diffusion Models & Peer Influence
09 Diffusion Models & Peer Influence09 Diffusion Models & Peer Influence
09 Diffusion Models & Peer Influence
 
SDC: A Distributed Clustering Protocol
SDC: A Distributed Clustering ProtocolSDC: A Distributed Clustering Protocol
SDC: A Distributed Clustering Protocol
 
Dynamic Data Center concept
Dynamic Data Center concept  Dynamic Data Center concept
Dynamic Data Center concept
 
F14 lec12graphs
F14 lec12graphsF14 lec12graphs
F14 lec12graphs
 
Legal Analytics Course - Class 11 - Network Analysis and Law - Professors Dan...
Legal Analytics Course - Class 11 - Network Analysis and Law - Professors Dan...Legal Analytics Course - Class 11 - Network Analysis and Law - Professors Dan...
Legal Analytics Course - Class 11 - Network Analysis and Law - Professors Dan...
 
Finding Key Influencers and Viral Topics in Twitter Networks Related to ISIS ...
Finding Key Influencers and Viral Topics in Twitter Networks Related to ISIS ...Finding Key Influencers and Viral Topics in Twitter Networks Related to ISIS ...
Finding Key Influencers and Viral Topics in Twitter Networks Related to ISIS ...
 
2013 KDD conference presentation--"Multi-Label Relational Neighbor Classifica...
2013 KDD conference presentation--"Multi-Label Relational Neighbor Classifica...2013 KDD conference presentation--"Multi-Label Relational Neighbor Classifica...
2013 KDD conference presentation--"Multi-Label Relational Neighbor Classifica...
 
SRA Poster_Guzman
SRA Poster_GuzmanSRA Poster_Guzman
SRA Poster_Guzman
 
ICPSR - Complex Systems Models in the Social Sciences - Lecture 3 - Professor...
ICPSR - Complex Systems Models in the Social Sciences - Lecture 3 - Professor...ICPSR - Complex Systems Models in the Social Sciences - Lecture 3 - Professor...
ICPSR - Complex Systems Models in the Social Sciences - Lecture 3 - Professor...
 

Presentation

  • 1. Dynamics in large scale networks John Clements Supervised by: Dr. Babak Farzad, Dr. Henryk Fukś Brock University jc09xs@brocku.ca February 01 2016 John Clements (Brock University) Dynamics in large scale networks February 01 2016 1 / 65
  • 2. Table of Contents: 1 Introduction (Definitions); 2 A Brief History of Large network dynamics (Patterns in the removal of nodes from large networks; Network properties); 3 High Clustering; 4 Node expiration (Connectivity and node expiration; Degree; Clustering Coefficient; Conclusions); 5 The server merger (Graphical evolution; The servers before the merger; The merger; Degree differences); 6 Graph motifs (Finding bounds on the 3 node subgraphs).
  • 4. Graph Theory Definition. Graph: A graph G is an ordered pair (V(G), E(G)) consisting of a set V(G) of vertices and a set E(G) of edges that form connections between them.
  • 5. Network analysis Definitions. Degree: The degree of a vertex v in a graph G, denoted $k_G(v)$, is the number of edges of G incident with v.[?] Clustering coefficient: The clustering coefficient of a node v is $c_v = \frac{2T(v)}{k_v(k_v - 1)}$, where T(v) is the number of triangles (i.e. pairs of connected neighbors) v is involved in and $k_v$ is the degree of v. The clustering coefficient of a degree 0 or 1 node is set to 0.[?]
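The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, assuming the graph is stored as a dictionary mapping each node to the set of its neighbours; the names `clustering_coefficient` and `toy_graph` are hypothetical and not taken from the slides.

```python
from itertools import combinations


def clustering_coefficient(adj, v):
    """Local clustering coefficient c_v = 2 T(v) / (k_v (k_v - 1))."""
    neighbours = adj[v]
    k = len(neighbours)
    # Nodes of degree 0 or 1 are assigned a coefficient of 0 by convention.
    if k < 2:
        return 0.0
    # T(v): pairs of neighbours of v that are themselves connected.
    triangles = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return 2.0 * triangles / (k * (k - 1))


# Toy undirected graph (hypothetical data): a triangle a-b-c plus a pendant node d.
toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

degrees = {v: len(nbrs) for v, nbrs in toy_graph.items()}
coefficients = {v: clustering_coefficient(toy_graph, v) for v in toy_graph}
```

Node `a` has three neighbours but only one connected pair among them, so its coefficient is lower than that of `b` and `c`, whose two neighbours are connected.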
  • 7. A Brief overview of Large network dynamics. There is a truly enormous number of studies and analyses of real world large networks, covering nearly every type of online network. Examples: six degrees of separation, the actor network. These studies range from single snapshots and painstakingly gathered survey data to event based dynamic studies. Most of them do not account for the removal of nodes: the removal process has rarely been studied empirically and has been incorporated into very few dynamic network models. Many dynamic models have been proposed, and these often include an edge removal process; models that use a node removal process are much rarer. Another area important to us is network modeling: many of the studies of real world networks propose a model or provide best fits of one. Models: • Nodal attribute models • Exponential random graphs
  • 8. The two datasets. (1) The businesses competing for ad space on Google and Bing: we looked at the removal or lapse process for businesses in a network of businesses competing for ad space on Google and Bing. (2) The network of friendships among avatars in the MMOFPS PlanetSide 2: we looked for patterns in the removal or expiration of avatars in the massively multiplayer online game (MMOG) PlanetSide 2, looked at the merger of two servers, and examined the removal or lapse process for avatars, looking for a simple rule. Why did we choose these networks in particular? • We thought we could find patterns in the removal or lapse of these nodes. • Competition example
  • 9. Crawler overview. 1 Gather a list of active avatar Ids from the server we want to crawl and add them to the queue of Ids to check. 2 Get the friend lists of all avatar Ids in the queue from the API. When a request succeeds, remove that Id from the queue and add it to the list of visited Ids. 3 Go through each friend list and identify which Ids are valid. Save each valid friend relationship to the edge set. 4 While there are Ids in the queue, go to step 2. 5 Record the edge list in a SQL table. 6 Gather the avatar attributes for each of the Ids found in the crawl and record them in a SQL table.
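The crawl loop in steps 1 to 4 can be sketched as a breadth-first traversal. This is only an illustration under stated assumptions: `fetch_friends` and `is_valid` are hypothetical callables standing in for the game API and the Id validity check, a failed request is signalled by `None`, and the sketch enqueues newly discovered valid friends, which the slide does not state explicitly. Writing to SQL (steps 5 and 6) is left out.

```python
from collections import deque


def crawl(seed_ids, fetch_friends, is_valid, max_retries=3):
    """Breadth-first crawl of a friendship network.

    fetch_friends(avatar_id) is a hypothetical stand-in for the API call;
    it returns a list of friend Ids, or None when the request fails.
    """
    queue = deque(seed_ids)   # Ids still to check
    seen = set(seed_ids)      # Ids that have ever been queued
    visited = set()           # Ids whose friend list was fetched
    edges = set()             # undirected friendships
    attempts = {}
    while queue:
        avatar = queue.popleft()
        friends = fetch_friends(avatar)
        if friends is None:
            # Failed request: retry a bounded number of times (assumption).
            attempts[avatar] = attempts.get(avatar, 0) + 1
            if attempts[avatar] < max_retries:
                queue.append(avatar)
            continue
        visited.add(avatar)
        for friend in friends:
            if friend == avatar or not is_valid(friend):
                continue
            edges.add(frozenset((avatar, friend)))
            if friend not in seen:
                seen.add(friend)
                queue.append(friend)
    return visited, edges


# Stand-in for the API (hypothetical data): Id 99 is not a valid avatar.
fake_api = {1: [2, 3], 2: [1], 3: [1, 4, 99], 4: [3]}
visited, edges = crawl([1], fake_api.get, lambda avatar_id: avatar_id in fake_api)
```

Storing each friendship as a `frozenset` means a relationship reported from both sides is recorded once.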
  • 10. PlanetSide 2 avatar attributes. The PlanetSide 2 servers. Datasets by server location (7 days / 44 days): US East: EW / Emerald; US West: CW / Connery; EU: MW / Miller. Our data is drawn from three PlanetSide 2 servers: • Connery, the US West server. • Emerald, the US East server created from the merger of Waterson and Mattherson. • Miller, an EU server. The available avatar attributes depend on when the data was gathered. Common to both datasets: • Id • Name
  • 11. Exclusive to each dataset. Avatars online in the last 44 days, stored in Connery, Emerald and Miller: • Includes the server merger • Starts on the 23rd of May. Avatars online in the last 7 days, referred to as CW, EW and MW: • Outfit Id • Outfit size • Creation date • Login count • Last login date • Total time played and time played by month • Number of kills and deaths by month.
  • 12. Correlation between attributes. Average attribute correlation matrix for CW.
                    Degree   CC       Br      Kills   Deaths  K/D     Time    Outfit Size
      Degree        1.000
      CC           -0.005    1.000
      Br            0.305    0.059    1.000
      Kills         0.210    0.006    0.499   1.000
      Deaths        0.206    0.001    0.445   0.794   1.000
      K/D           0.103    0.004    0.403   0.324   0.194   1.000
      Time          0.246    0.004    0.510   0.792   0.892   0.280   1.000
      Outfit Size   0.024   -0.033    0.088   0.003   0.065   0.008   0.056   1.000
  • 13. Google Ad network visualization
  • 14. High Clustering. Clustering coefficient in the PlanetSide 2 snapshots. Advertisement networks.
  • 16. Avatar states. Active vs Inactive: • An avatar that was active after the previous snapshot is active. • An inactive avatar is any avatar that is not active but is seen active again in the future. Avatar states: • A new avatar is any avatar created after the previous snapshot. • A dead or abandoned avatar is any avatar that never returns from inactivity. • A third group is the immediately abandoned (IA) new avatars.
  • 17. Small world for long diameters. Advertisement network; Random graphs. Bing Google Diameter 7 8 3 4 APL 2.528 2.752 2.945 (0.00180) 5.108 (0.00696) Table: The diameter and average path length (APL) of the competition network.
  • 18. The clustering coefficient distribution of Google. Panels: Full; Without the spike.
  • 19. The clustering coefficient distribution of Bing. Panels: Full; Without the spike.
  • 20. Emerald, August 18th: a typical distribution of avatar clustering coefficient. Panels: Full; Without the spike.
  • 22. Edges connecting failed companies. Compared with edges in random subgraphs.
  • 23. Edge dynamics.
                              CW       EW       MW
      Edge formation
        existing ↔ existing   29.97%   29.23%   25.15%
        existing ↔ new         4.74%    5.27%    4.85%
        existing ↔ IA          3.31%    3.71%    3.32%
        new ↔ new              0.45%    0.53%    0.56%
        new ↔ IA               0.30%    0.41%    0.39%
        IA ↔ IA                0.23%    0.29%    0.30%
      Edge deletion
        One removed           53.15%   54.70%   52.01%
        Both removed           4.30%    5.02%    4.54%
        Broken                 4.06%    4.23%    4.69%
        Unstable               0.29%    0.30%    0.26%
  • 24. Power law Degree Distribution. The impact of the degree. • Power law: $x^{-\alpha}$ • Exponential: $e^{-\lambda x}$ • Power law with exponential cutoff: $x^{-\alpha} e^{-\lambda x}$
  • 25. Degree of failed companies. Panels: Bing; Google.
  • 26. Degree of Dead Avatars
  • 27. Clustering coefficient distribution of failed companies. Panels: Bing; Google.
  • 28. The normalized battle rank distribution.
  • 29. Avatar state by size of outfit.
  • 30. Creation date.
  • 31. Conclusions. • Generally, the nodes that were removed from both networks were peripheral, in unimportant positions. • But none of the patterns we did find were strong indicators in the end.
  • 33. We initially started collecting the PlanetSide 2 dataset to capture the merger of two servers. • This is the first time that a server merger has been captured and studied. • It provides an easily studied analog to real world mergers of populations.
  • 34. Reading the graphs. These images were created using Gephi [?] with the ForceAtlas 2 layout. Node size scales linearly with degree, and colour is assigned by the following table. Colour key: rows are Origin (Waterson, Mattherson, Neither); columns are Faction (NC, TR, VS).
  • 35. Merger: Waterson June 23
  • 36. Merger: Mattherson June 23
  • 37. June 30th
  • 38. July 14th
  • 39. August 4th
  • 40. August 18th
  • 41. September 15
  • 42. November 17th.
  • 43. December 17th.
  • 44. February 23rd.
  • 45. Assortativity: Assortativity measures the tendency of nodes to be connected to nodes similar to themselves in some way. The assortativity coefficient is defined as follows: $r = \frac{\sum_i e_{i,i} - \sum_i a_i^2}{1 - \sum_i a_i^2}$, where $e_{i,j}$ is the fraction of edges connecting vertices of type i to vertices of type j, and $a_i$ is the fraction of edges connecting to a vertex of type i. The minimum is $r_{\min} = -\frac{\sum_i a_i^2}{1 - \sum_i a_i^2}$, which occurs when $e_{i,i} = 0$ for all $i$.
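The coefficient above can be computed directly from an edge list and a type label per node. The sketch below is illustrative only: it assumes an undirected graph, counts each edge once in each direction when building $e_{i,j}$ so that $a_i = \sum_j e_{i,j}$, and uses hypothetical names (`assortativity`, `origin`) and toy data rather than the PlanetSide 2 dataset.

```python
from collections import defaultdict


def assortativity(edges, node_type):
    """Return (r, r_min) for an undirected edge list and a type per node."""
    e = defaultdict(float)
    for u, v in edges:
        tu, tv = node_type[u], node_type[v]
        # Count each undirected edge once per direction so e is symmetric.
        e[(tu, tv)] += 1.0
        e[(tv, tu)] += 1.0
    total = 2.0 * len(edges)
    types = {node_type[n] for edge in edges for n in edge}
    trace = sum(e[(t, t)] for t in types) / total             # sum_i e_ii
    a = {t: sum(e[(t, s)] for s in types) / total for t in types}
    sum_a2 = sum(x * x for x in a.values())                   # sum_i a_i^2
    # Undefined (division by zero) when every edge end has the same type.
    r = (trace - sum_a2) / (1.0 - sum_a2)
    r_min = -sum_a2 / (1.0 - sum_a2)
    return r, r_min


# Hypothetical origins: W for Waterson, M for Mattherson.
origin = {1: "W", 2: "W", 3: "M", 4: "M"}
r_within, _ = assortativity([(1, 2), (3, 4)], origin)      # only same-origin edges
r_across, r_min = assortativity([(1, 3), (2, 4)], origin)  # only cross-origin edges
```

With only same-origin edges the coefficient reaches its maximum of 1; with only cross-origin edges it equals the minimum $r_{\min}$.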
  • 46. The assortativity by origin.
  • 47. Degree difference of cross origin edges.
  • 48. Degree difference of Mattherson to Waterson edges.
  • 50. Graph motif. Finding the potential motifs in a Barabási-Albert graph. Definition: The graph motifs of a network are patterns that occur significantly more often in it than expected in an ensemble of networks [?]. The significance of a motif is measured with the simple Z score. Significance: $Z = \frac{M - \bar{M}_r}{\sigma_r}$, where $M$ is the number of occurrences of the subgraph in the network and $\bar{M}_r$ and $\sigma_r$ are the mean and standard deviation of the number found in the ensemble.
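As a small worked example of the Z score, the sketch below takes an observed subgraph count and the counts found in an ensemble. The slide does not say whether the sample or population standard deviation is used; this sketch assumes the sample standard deviation, and the function name and numbers are hypothetical.

```python
from statistics import mean, stdev


def motif_z_score(observed, ensemble_counts):
    """Z = (M - mean_r) / sigma_r for one subgraph type."""
    mu = mean(ensemble_counts)
    sigma = stdev(ensemble_counts)  # sample standard deviation (assumption)
    if sigma == 0:
        raise ValueError("ensemble counts have zero variance; Z is undefined")
    return (observed - mu) / sigma


# Hypothetical counts: 30 triangles observed, four ensemble graphs.
z = motif_z_score(30, [10, 12, 8, 10])
```

A large positive value means the subgraph occurs far more often in the network than in the ensemble.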
  • 51. History of network motifs. Introduced by Shen-Orr et al. Most research has focused on finding motifs efficiently, with tools such as: • FANMOD (2006) • KAVOSH (2009). In 2013 Johan Ugander found the extremal bounds on the potential subgraphs found in any network, as a function of its density.
  • 54. Significance profile example
  • 55. The Barabási-Albert Algorithm. The algorithm takes two parameters: N, the number of nodes in the final graph, and m, the number of edges each new node forms to existing nodes. • Create a graph with m unconnected nodes. • While there are fewer than N nodes in the network, add a node with m edges to existing nodes. • The probability of choosing an existing node v is proportional to its degree [?]: $P(v) = \frac{k_v}{\sum_{i \in V(G)} k_i}$ (1)
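The growth procedure above can be sketched as follows. This is a minimal illustration, not the implementation used for the slides. It makes two assumptions that the slide leaves open: the first new node connects to all m initial nodes (their degrees are all zero, so the preferential rule is undefined at that point), and the m targets of each new node are made distinct by redrawing, which makes the selection only approximately proportional to degree.

```python
import random


def barabasi_albert(N, m, seed=None):
    """Return the edge list of a BA graph with N nodes and m edges per new node."""
    if not 1 <= m < N:
        raise ValueError("need 1 <= m < N")
    rng = random.Random(seed)
    edges = []
    # Each node appears in this list once per unit of degree, so a uniform
    # draw from it is a degree-proportional draw over nodes.
    repeated = []
    targets = list(range(m))  # first new node links to all m initial nodes
    for new in range(m, N):
        for t in targets:
            edges.append((new, t))
        repeated.extend(targets)
        repeated.extend([new] * m)
        # Pick m distinct existing nodes for the next arrival.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return edges


ba_edges = barabasi_albert(50, 3, seed=1)
```

Because every new node adds exactly m edges, the edge list has m(N - m) entries regardless of the random choices.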
  • 56. Random graph ensemble. • Traditionally the ensemble consists of random graphs with the same degree distribution as the original network. • However, this method results in some correlations that arise from the degree distribution itself. • So we used an ensemble of $G_{n,p}$ random graphs with the same density as the original. $G_{n,p}$ random graph: there are two parameters, n and p. Generate a graph with n nodes and, for every pair of nodes, add an edge independently with probability p. The expected density of such a graph is equal to p.
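A $G_{n,p}$ ensemble matched to the density of a Barabási-Albert graph could be generated along these lines. The function names are hypothetical, and the density formula $p = 2m(N-m)/(N(N-1))$ follows from a BA graph with N nodes having m(N - m) edges; the ensemble size and seeding scheme are arbitrary choices for the sketch.

```python
import random
from itertools import combinations


def gnp_random_graph(n, p, seed=None):
    """Edge list of a G(n, p) graph: each pair is joined independently with probability p."""
    rng = random.Random(seed)
    return [pair for pair in combinations(range(n), 2) if rng.random() < p]


def density(n, edges):
    """Fraction of the n(n-1)/2 possible edges that are present."""
    return 2.0 * len(edges) / (n * (n - 1))


def matched_ensemble(N, m, size, seed=0):
    """G(n, p) graphs whose expected density equals that of a BA graph with parameters N, m."""
    p = 2.0 * m * (N - m) / (N * (N - 1))
    return [gnp_random_graph(N, p, seed + i) for i in range(size)]


ensemble = matched_ensemble(30, 2, size=5)
```

Individual graphs in the ensemble will have densities scattered around p; only the expectation matches.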
  • 58. The 4 undirected triads. Possible triads and their probabilities in a $G_{n,p}$ random graph: Empty $(1-p)^3$; One edge $3p(1-p)^2$; Open triad $3p^2(1-p)$; Triangle $p^3$. The density of a completed BA graph is: $p = \frac{2m(N-m)}{N(N-1)}$ (2). So we can easily compute the expected number of each triad in a $G_{n,p}$ random graph: $\binom{N}{3}(1-p)^3$, $\binom{N}{3}3p(1-p)^2$, $\binom{N}{3}3p^2(1-p)$, $\binom{N}{3}p^3$.
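The expected triad counts follow mechanically from the density, as this short sketch shows. The function names and the example parameters (N = 100, m = 2) are hypothetical choices for illustration.

```python
from math import comb


def ba_density(N, m):
    """Density of a completed BA graph, equation (2)."""
    return 2.0 * m * (N - m) / (N * (N - 1))


def expected_triads(N, p):
    """Expected number of each 3-node subgraph in a G(n, p) graph with n = N."""
    triples = comb(N, 3)
    return {
        "empty": triples * (1 - p) ** 3,
        "one_edge": triples * 3 * p * (1 - p) ** 2,
        "open_triad": triples * 3 * p ** 2 * (1 - p),
        "triangle": triples * p ** 3,
    }


N, m = 100, 2
counts = expected_triads(N, ba_density(N, m))
```

The four probabilities sum to 1, so the four expected counts sum to $\binom{N}{3}$, which is a quick sanity check on any implementation.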
  • 59. Empirical tests. The triangles created by low values of m.
  • 60. Probabilistic bounds. The graph at time t depends entirely on the parameters N and m. Since we start from an empty graph and add m edges with every node, the number of edges at any given time must be: $|E(G_t)| = m(t - m)$ (3). As a result, many other graph parameters can be calculated at any given step, such as the density: $D = \frac{2m(t-m)}{t(t-1)}$ (4). Examples of edge probabilities and bounds.
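Equations (3) and (4) can be checked against a step-by-step count. In this sketch time t is taken to be the number of nodes currently in the graph, which is how the formulas read; the function names and the choice m = 3 are hypothetical.

```python
def ba_edge_count(t, m):
    """Number of edges when the graph has t nodes, equation (3)."""
    return m * (t - m)


def ba_density_at(t, m):
    """Density when the graph has t nodes, equation (4)."""
    return 2.0 * m * (t - m) / (t * (t - 1))


m = 3
running_edges = 0
history = []
# Start from m unconnected nodes; every arrival adds exactly m edges.
for t in range(m + 1, 11):
    running_edges += m
    history.append((t, running_edges, ba_edge_count(t, m), ba_density_at(t, m)))
```

Each row of `history` pairs the accumulated edge count with the closed form, so any disagreement would show up immediately.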
  • 61. Additive bounds. • So when N is greater than $\frac{8m^2 - 1 + \sqrt{16m^3 - 16m^2 + 1}}{8m - 2}$, the probability of an edge is greater in the BA graph than in a $G_{n,p}$ graph. • If we simply count how many of each subgraph are added at each step, we can find bounds, at least for small motifs. • The expected number of any 3-node subgraph in our ensemble is simply $\binom{N}{3}P$, where P is the probability of such a subgraph in a $G_{n,p}$ random graph.
  • 62. Bounds on triads in the BA graph vs the expected number in the ensemble.
  • 63. Triangles. As we know, at each timestep we add m edges and therefore at most $\binom{m}{2}$ triangles, and at least $\binom{m}{2}$ open triads are created at each step after the second. Barabási-Albert: the upper bound on the number of triangle subgraphs is $\sum_{t=m+1}^{N}\binom{m}{2} = \frac{1}{2}m(m-1)(N-m)$ (5). Random graph ensemble: the expected number of triangle subgraphs in a $G_{n,p}$ random graph is $\binom{N}{3}p^3 = \frac{4}{3}\,\frac{(N-2)(N-m)^3 m^3}{(N-1)^2 N^2}$ (6).
  • 64. Triangles. Trivially, the Barabási-Albert upper bound $\frac{1}{2}m(m-1)(N-m)$ can be compared directly with the ensemble expectation $\frac{4}{3}\,\frac{(N-2)(N-m)^3 m^3}{(N-1)^2 N^2}$ for all $0 < m < N$.
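The two triangle expressions are easy to tabulate side by side for concrete parameters. The sketch below only evaluates the formulas from the slides; the function names and the parameter pairs are hypothetical.

```python
from math import comb


def triangle_upper_bound(N, m):
    """Additive upper bound on triangles in a BA graph: C(m, 2) per added node."""
    return comb(m, 2) * (N - m)


def expected_triangles(N, m):
    """Expected triangles in a G(n, p) graph with the density of the BA graph."""
    p = 2.0 * m * (N - m) / (N * (N - 1))
    return comb(N, 3) * p ** 3


# (N, m, upper bound in the BA graph, expectation in the ensemble)
rows = [
    (N, m, triangle_upper_bound(N, m), expected_triangles(N, m))
    for N, m in [(100, 1), (100, 2), (100, 5), (1000, 5)]
]
```

For m = 1 the BA graph is a tree, so the bound is 0 while the ensemble expectation is positive; for N = 100 and m = 2 the bound is 98 against an expectation of roughly 10.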
  • 65. Open Triad. Random graph ensemble: the expected number of open triads in a $G_{n,p}$ random graph is $\binom{N}{3}3p^2(1-p) = \frac{2(N-2)(N-m)^2 m^2 (N^2 - 2Nm + 2m^2 - N)}{(N-1)^2 N^2}$ (5). Barabási-Albert: the lower bound on the number of open triads in a BA graph is compared with this expectation: $\frac{1}{2}m(m+1)(N-m) \ge \frac{2(N-2)(N-m)^2 m^2 (N^2 - 2Nm + 2m^2 - N)}{(N-1)^2 N^2}$ (6).
  • 66. Open Triad: solution. Therefore, when m and N are chosen such that $m \ge \frac{N}{2} - 1$, the open triad will be a motif of the resulting graph.
  • 67. One Edge. Barabási-Albert: the minimum number of subgraphs containing a single edge is $\sum_{t=m+1}^{N}\left[\binom{t-1}{2} - \binom{t-1-m}{2}\right] = \frac{1}{2}m(N-2)(N-m)$ (5). Random graph ensemble: the expected number of subgraphs containing exactly one edge in a $G_{n,p}$ random graph is $\binom{N}{3}3p(1-p)^2 = \frac{(N-2)(N-m)m(N^2 - 2Nm + 2m^2 - N)^2}{(N-1)^2 N^2}$ (6).
  • 68. One Edge. So the maximum number of one edge subgraphs in the BA graph is greater than the expected number in the $G_{n,p}$ ensemble when: $m < \frac{1}{2}N + \frac{1}{2}\sqrt{(\sqrt{2}-1)N^2 + (2-\sqrt{2})N}$ (5) and $m > \frac{1}{2}N - \frac{1}{2}\sqrt{(\sqrt{2}-1)N^2 + (2-\sqrt{2})N}$ (6).
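Setting the BA one-edge count $\frac{1}{2}m(N-2)(N-m)$ equal to the ensemble expectation gives a quadratic in m whose roots are $\frac{1}{2}N \pm \frac{1}{2}\sqrt{(\sqrt{2}-1)N^2 + (2-\sqrt{2})N}$. The sketch below checks numerically, for one hypothetical choice N = 100, on which side of those roots the BA count is the larger of the two; the function names are illustrative.

```python
from math import comb, sqrt


def one_edge_ba_count(N, m):
    """One-edge triad count for the BA graph, from the sum over added nodes."""
    return 0.5 * m * (N - 2) * (N - m)


def one_edge_expected(N, m):
    """Expected one-edge triads in a G(n, p) graph with the BA density."""
    p = 2.0 * m * (N - m) / (N * (N - 1))
    return comb(N, 3) * 3 * p * (1 - p) ** 2


def one_edge_roots(N):
    """Values of m at which the two expressions are equal."""
    s = sqrt((sqrt(2) - 1) * N ** 2 + (2 - sqrt(2)) * N)
    return 0.5 * (N - s), 0.5 * (N + s)


N = 100
low, high = one_edge_roots(N)
# True when "BA count exceeds expectation" coincides with "m lies between the roots".
agrees = all(
    (one_edge_ba_count(N, m) > one_edge_expected(N, m)) == (low < m < high)
    for m in range(1, N)
)
```

For N = 100 the roots fall near 17.6 and 82.4, so the BA count exceeds the ensemble expectation for m from 18 to 82.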
  • 69. Empty. Barabási-Albert: the upper bound on the number of empty triads in a BA graph is $\binom{m}{3} + \sum_{t=m+1}^{N}\binom{t-1-m}{2} = \frac{1}{6}(N-2)(N^2 - 3Nm + 3m^2 - N)$ (7). Random graph ensemble: the expected number of empty triads in a $G_{n,p}$ graph is $\binom{N}{3}(1-p)^3 = \frac{1}{6}\,\frac{(N-2)(N^2 - 2Nm + 2m^2 - N)^3}{(N-1)^2 N^2}$ (8).
  • 70. Empty. For all $N > 5$ the upper bound is less than the expected value of empty subgraphs in the ensemble.
  • 71. Final bounds. Probabilistic bounds: when the difference between m and N is such that $N \ge \frac{8m^2 - \sqrt{16m^3 - 16m^2 + 1} - 1}{8m - 2}$ holds, the triangle and empty triads will never be a motif.
  • 72. Final bounds. Additive bounds: the open triad will be a motif of a BA graph whenever $m \ge \frac{N}{2} - 1$ for any $N > 2$. The single edge triad can only be a motif when $m < \frac{1}{2}N + \frac{1}{2}\sqrt{(\sqrt{2}-1)N^2 + (2-\sqrt{2})N}$ and $m > \frac{1}{2}N - \frac{1}{2}\sqrt{(\sqrt{2}-1)N^2 + (2-\sqrt{2})N}$ for any valid N.
  • 73. Future work. The formation of a server's social network: continuing from the server merger, we have records of the formation of several new servers that could be investigated, letting us learn how the original servers' structure came to be. Identifying the social structure of players: continuing from the work on the removal of players from the PlanetSide 2 network and from the ad network, it would be very helpful to have a better way of identifying the parent players or companies, which could potentially change the networks' structure greatly.
  • 75. Future Work: Additional database analysis. By their very nature, large datasets will always leave unanswered questions. There are a huge number of potential relationships between the networks, the actors and their removal that we did not have time to test, for example how many of the removed avatars had a typo in their name. This suggests some future work that seems interesting but is either outside the scope of large network analysis or simply something we did not have time to do.