Analysis of overlapping communities in signed complex networks; this paper compares three overlapping community detection algorithms in networks with both positive and negative connections.
Analysis of Overlapping Communities in Signed Networks
1. Lehrstuhl Informatik 5
(Information Systems)
Prof. Dr. M. Jarke
Mohsen
Shahriari,
Ying Li,
Ralf Klamma
Learning Layers
Analysis of
Overlapping
Communities in
Signed Complex
Networks
Slide 1
Analysis of Overlapping Communities in
Signed Complex Networks
Mohsen Shahriari, Ying Li, Ralf Klamma
Advanced Community Information Systems (ACIS)
RWTH Aachen University, Germany
shahriari@dbis.rwth-aachen.de
Chair of Computer Science 5
RWTH Aachen University
Slide 2
Agenda
• Introduction to OCD
• Related Work
• Motivation & Research Questions
• Overlapping Community Detection (OCD) Algorithms for Signed Networks
• Evaluation
• Results
• Conclusion and Outlook
Slide 3
Introduction to OCD in Signed Networks
• Community detection as an important part of network analysis
• Two key characteristics of signed social networks
- Nodes in the overlapping communities
- Relations with signs
• Community structure
- Inside communities: dense, positive
- Between communities: sparse, negative
[Figure: example signed network with edges labeled + and -]
Slide 4
Motivation
• Practical applications of OCD in signed networks such as
- Informal learning networks
- Review sites
- Open source developer networks
• Contribute to current research on OCD in signed networks, which has the following deficiencies
- Few algorithms
- No comparison between available algorithms
Slide 5
Related Work on Community Detection in Signed Graphs
• Non-overlapping community detection
- Agent-based finding and extracting communities (FEC) [YaCL07]
- Two-step approach by maximizing modularity and minimizing frustration [AnMa12]
- Clustering re-clustering algorithm (CRA) [AmPi13]
• Overlapping community detection
- Signed Disassortative Degree Mixing and Information Diffusion Algorithm (SDMID) [ShKl15]
- Signed Probabilistic Mixture Model (SPM) [CWYT14]
- Multi-objective Evolutionary Algorithm based on Similarity for Community Detection in Signed Networks (MEAs-SN) [LiLJ14]
Slide 6
Research Questions
• How do Signed Disassortative Degree Mixing and Information Diffusion (SDMID), the Signed Probabilistic Mixture model (SPM) and the Multi-objective Evolutionary Algorithm (MEA) perform in comparison with each other, in terms of knowledge-driven and statistical metrics?
• What are the structural properties of the covers detected by SDMID, SPM and MEA, and how do they differ?
Slide 7
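Signed Disassortative Degree Mixing and Information Diffusion Algorithm: Phase 1
Identify leaders
- Calculate the Local Leadership Value (LLD) using effective degree (ED) and normalized disassortativeness (DASS):
$$\mathrm{ED}(i) = \frac{\max\left(\mathrm{in}^{+}(i) - \mathrm{in}^{-}(i),\ 0\right)}{\mathrm{in}^{+}(i) + \mathrm{in}^{-}(i)}$$
$$\mathrm{DASS}(i) = \frac{\sum_{j \in \mathrm{Nei}(i)} \left(\deg(i) - \deg(j)\right)}{\sum_{j \in \mathrm{Nei}(i)} \left(\deg(i) + \deg(j)\right)}$$
$$\mathrm{LLD}(i) = \alpha \times \mathrm{DASS}(i) + (1 - \alpha) \times \mathrm{ED}(i)$$
- Identify local leaders:
$$\forall j \in \mathrm{Nei}(i):\ \mathrm{LLD}(i) \geq \mathrm{LLD}(j)$$
- Identify global leaders:
$$|\mathrm{FL}(i)| > \frac{\sum_{j \in \mathrm{LL}} |\mathrm{FL}(j)|}{|\mathrm{LL}|}$$
where FL: Follower Set, LL: Local Leader Set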
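The leader-identification phase can be illustrated with a minimal Python sketch. It assumes that in+(i) and in-(i) count the positive and negative incoming edges of node i, that deg counts neighbors regardless of sign and direction, and that follower sets are supplied by the caller, since the slide does not say how followers are assigned. All function names and the toy graph are hypothetical.

```python
from collections import defaultdict

def neighbor_sets(edges):
    """Nei(i): neighbors of i, ignoring edge direction (assumption)."""
    nei = defaultdict(set)
    for u, v in edges:
        nei[u].add(v)
        nei[v].add(u)
    return dict(nei)

def effective_degree(edges, i):
    """ED(i) = max(in+ - in-, 0) / (in+ + in-), over incoming edges of i."""
    pos = sum(1 for (_, v), sign in edges.items() if v == i and sign > 0)
    neg = sum(1 for (_, v), sign in edges.items() if v == i and sign < 0)
    return max(pos - neg, 0) / (pos + neg) if pos + neg else 0.0

def disassortativeness(nei, i):
    """DASS(i) = sum_j (deg(i) - deg(j)) / sum_j (deg(i) + deg(j)), j in Nei(i)."""
    deg_i = len(nei[i])
    num = sum(deg_i - len(nei[j]) for j in nei[i])
    den = sum(deg_i + len(nei[j]) for j in nei[i])
    return num / den if den else 0.0

def local_leadership(edges, alpha=0.5):
    """LLD(i) = alpha * DASS(i) + (1 - alpha) * ED(i) for every node."""
    nei = neighbor_sets(edges)
    return {i: alpha * disassortativeness(nei, i)
               + (1 - alpha) * effective_degree(edges, i) for i in nei}

def local_leaders(edges, alpha=0.5):
    """Nodes whose LLD is at least as large as that of every neighbor."""
    nei = neighbor_sets(edges)
    lld = local_leadership(edges, alpha)
    return sorted(i for i in nei if all(lld[i] >= lld[j] for j in nei[i]))

def global_leaders(followers):
    """followers: {local leader: follower set}. Keep leaders whose
    follower count exceeds the mean follower count of all local leaders."""
    mean = sum(len(f) for f in followers.values()) / len(followers)
    return sorted(i for i, f in followers.items() if len(f) > mean)

# Toy directed signed graph: {(source, target): sign}
edges = {(1, 0): +1, (2, 0): +1, (3, 0): +1, (4, 3): -1}
print(local_leaders(edges))
print(global_leaders({0: {1, 2, 3}, 5: {6}}))
```

In the toy graph, node 0 receives only positive edges and has the highest degree among its neighbors, so it is the only node that dominates its whole neighborhood.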
Slide 8
Signed Disassortative Degree Mixing and Information Diffusion Algorithm: Phase 2
Cascading (network coordination game)
- Assign behavior B to a leader node k and behavior A to all other nodes
- A node i with current behavior A changes its behavior to that of its neighbors (B) if the potential payoff p_B(i) is above a predefined threshold, i.e. LLD:
$$p_B(i) = \frac{\left|\{j \mid j \in \mathrm{Nei}^{+}(i) \text{ and } \mathrm{behavior}(j) = B\}\right| - \left|\{j \mid j \in \mathrm{Nei}^{-}(i) \text{ and } \mathrm{behavior}(j) = B\}\right|}{\left|\{j \mid j \in \mathrm{Nei}^{+}(i) \text{ and } \mathrm{behavior}(j) = B\}\right| + \left|\{j \mid j \in \mathrm{Nei}^{-}(i) \text{ and } \mathrm{behavior}(j) = B\}\right|}$$
[Figure: three snapshots of the cascade on an example signed network; node labels 0.6, 0.7, 0.5, 0.2; edges labeled + and -]
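The cascading step can be sketched in Python as follows. This is an illustration under assumptions, not the authors' implementation: the payoff is read as the number of positive neighbors with behavior B minus the number of negative neighbors with behavior B, divided by their sum; a node with no B-neighbors never switches; nodes are visited in sorted order until nothing changes. The names `payoff_b`, `cascade` and the example network are hypothetical.

```python
def payoff_b(signed_nei, behavior, i):
    """Payoff of node i for adopting B, or None if no neighbor has behavior B."""
    pos_b = sum(1 for j, s in signed_nei[i].items() if s > 0 and behavior[j] == "B")
    neg_b = sum(1 for j, s in signed_nei[i].items() if s < 0 and behavior[j] == "B")
    if pos_b + neg_b == 0:
        return None
    return (pos_b - neg_b) / (pos_b + neg_b)

def cascade(signed_nei, leader, threshold):
    """Spread behavior B from one leader; return the set of nodes that end with B.

    signed_nei: {node: {neighbor: +1 or -1}}
    threshold:  {node: value the payoff must exceed, e.g. the node's LLD}
    """
    behavior = {n: "A" for n in signed_nei}
    behavior[leader] = "B"
    changed = True
    while changed:
        changed = False
        for i in sorted(signed_nei):
            if behavior[i] != "A":
                continue
            p = payoff_b(signed_nei, behavior, i)
            if p is not None and p > threshold[i]:
                behavior[i] = "B"
                changed = True
    return {n for n, b in behavior.items() if b == "B"}

# Hypothetical network: leader k, two positive followers, one conflicted node c.
signed_nei = {
    "k": {"a": +1, "b": +1},
    "a": {"k": +1, "c": +1},
    "b": {"k": +1, "c": -1},
    "c": {"a": +1, "b": -1, "d": -1},
    "d": {"c": -1},
}
threshold = {n: 0.5 for n in signed_nei}
print(sorted(cascade(signed_nei, "k", threshold)))
```

Node c has one positive and one negative neighbor with behavior B, so its payoff is 0 and it stays outside the leader's community; repeating the cascade from each leader is what would allow a node to end up in several communities.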
Slide 9
Signed Probabilistic Mixture Model
• Based on the Expectation-Maximization (EM) method
• Maximize the log of the marginal likelihood of the signed network:
$$P(E \mid \omega, \theta) = \prod_{e_{ij} \in E} \left(\sum_{r} \omega_{rr}\,\theta_{ri}\,\theta_{rj}\right)^{A^{+}_{ij}} \left(\sum_{rs\,(r \neq s)} \omega_{rs}\,\theta_{ri}\,\theta_{sj}\right)^{A^{-}_{ij}}$$
- Estimation: use ω, θ to compute
o The probability that a positive edge comes from a community r
o The probability that a negative edge comes from two communities r and s
- Maximization: update ω, θ with these two probabilities by maximizing ln P(E | ω, θ)
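To make the objective concrete, the following Python sketch evaluates the log of the likelihood for a given parameter setting. It reads ω as a K×K matrix over community pairs and θ as a K×N matrix of node weights per community, with positive edges explained by a single community (terms ω_rr θ_ri θ_rj) and negative edges by two different communities (terms ω_rs θ_ri θ_sj, r ≠ s). The function name, the data layout and the numbers are assumptions for illustration; the EM updates themselves are not shown.

```python
import math

def spm_log_likelihood(edges, omega, theta):
    """Log-likelihood of a signed edge list under the mixture model.

    edges: iterable of (i, j, a) with a > 0 for positive, a < 0 for negative edges
    omega: K x K list of lists, omega[r][s]
    theta: K x N list of lists, theta[r][i]
    """
    k = len(omega)
    total = 0.0
    for i, j, a in edges:
        if a > 0:
            # positive edge: both endpoints drawn from the same community r
            mix = sum(omega[r][r] * theta[r][i] * theta[r][j] for r in range(k))
        else:
            # negative edge: endpoints drawn from two different communities r != s
            mix = sum(omega[r][s] * theta[r][i] * theta[s][j]
                      for r in range(k) for s in range(k) if r != s)
        total += abs(a) * math.log(mix)  # |a| plays the role of the exponent A_ij
    return total

# Hypothetical parameters for K = 2 communities and N = 3 nodes.
omega = [[0.4, 0.1],
         [0.1, 0.4]]
theta = [[0.6, 0.3, 0.1],
         [0.1, 0.2, 0.7]]
edges = [(0, 1, +1), (0, 2, -1)]
print(spm_log_likelihood(edges, omega, theta))
```

An EM implementation would alternate between computing the per-edge community probabilities from ω, θ and re-estimating ω, θ so that this value increases.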
Slide 10
Multi-Objective Evolutionary Algorithm Based on Similarity for Community Detection in Signed Networks
• Based upon structural similarity between adjacent nodes
$$s(u, v) = \frac{\sum_{x \in B(u) \cap B(v)} \Psi(x)}{\sqrt{\sum_{x \in B(u)} w_{ux}^{2}} \cdot \sqrt{\sum_{x \in B(v)} w_{vx}^{2}}}$$
where Ψ(x) = 0 if w_ux < 0 and w_vx < 0, and Ψ(x) = w_ux w_vx otherwise
• Objective functions
- Maximize the sum of positive similarities within communities
- Maximize the sum of negative similarities between communities
• The optimal solution is selected with MOEA/D (multiobjective evolutionary algorithm based on decomposition) [ZhLi07]
- Decomposition into scalar optimization subproblems
- Simultaneous optimization of these subproblems
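A small Python sketch of the signed similarity may help. It assumes that the denominator is the product of the Euclidean norms of the two nodes' edge-weight vectors and that B(u) is exactly the neighbor set listed for u in the adjacency dictionary; whether u itself belongs to B(u) is not stated on the slide, so a caller could add self-loops if needed. Function names and the example graph are hypothetical.

```python
import math

def psi(w_ux, w_vx):
    """Contribution of a common neighbor x: 0 if both edges are negative,
    otherwise the product of the two edge weights."""
    return 0.0 if (w_ux < 0 and w_vx < 0) else w_ux * w_vx

def similarity(adj, u, v):
    """Signed structural similarity s(u, v) for adj = {node: {neighbor: weight}}."""
    common = adj[u].keys() & adj[v].keys()
    num = sum(psi(adj[u][x], adj[v][x]) for x in common)
    norm_u = math.sqrt(sum(w * w for w in adj[u].values()))
    norm_v = math.sqrt(sum(w * w for w in adj[v].values()))
    return num / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical signed graph: u and v share a friend x and a common enemy y.
adj = {
    "u": {"v": 1, "x": 1, "y": -1},
    "v": {"u": 1, "x": 1, "y": -1},
    "x": {"u": 1, "v": 1},
    "y": {"u": -1, "v": -1},
}
print(similarity(adj, "u", "v"))  # positive: the shared friend x counts, the shared enemy y does not
print(similarity(adj, "u", "y"))  # negative: v is a friend of u but an enemy of y
```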
Slide 11
Evaluation Metrics
• Normalized mutual information (NMI): treats a node's membership in the k-th community of the detected cover and in the k′-th community of the real cover as two random variables and determines their mutual information
• Signed modularity: measures the strength of a community partition by taking into account the degree distribution
$$Q_{SO} = \frac{1}{2w^{+} + 2|w^{-}|} \sum_{i}\sum_{j} \left[ w_{ij} - \left( \frac{w_{i}^{+} w_{j}^{+}}{2w^{+}} - \frac{w_{i}^{-} w_{j}^{-}}{2|w^{-}|} \right) \right] \delta(C_i, C_j)$$
where δ(C_i, C_j): number of communities in which the edge w_ij resides
• Frustration: normalized weighted sum of the weights of negative edges inside communities and positive edges between communities
$$\mathrm{Frustration} = \frac{\alpha \times |w^{-}_{intra}| + (1 - \alpha) \times |w^{+}_{inter}|}{w^{+} + |w^{-}|}$$
• Execution time
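As an illustration of the frustration metric, the Python sketch below computes the weighted share of "misplaced" edge weight: negative weight inside communities and positive weight between communities, normalized by the total absolute edge weight. For overlapping covers it assumes that an edge is intra-community when its endpoints share at least one community; this rule, the function name and the example data are assumptions, not taken from the slide.

```python
def frustration(edges, cover, alpha=0.5):
    """edges: iterable of (u, v, w) with signed weight w
    cover: {node: set of community ids}
    Returns (alpha * |negative intra weight| + (1 - alpha) * positive inter weight)
            / (total absolute weight)."""
    neg_intra = 0.0
    pos_inter = 0.0
    total = 0.0
    for u, v, w in edges:
        total += abs(w)
        same_community = bool(cover[u] & cover[v])  # assumption for overlapping nodes
        if same_community and w < 0:
            neg_intra += abs(w)
        elif not same_community and w > 0:
            pos_inter += w
    return (alpha * neg_intra + (1 - alpha) * pos_inter) / total if total else 0.0

# Hypothetical cover: node c overlaps communities 0 and 1.
cover = {"a": {0}, "b": {0}, "c": {0, 1}, "d": {1}}
edges = [("a", "b", +1), ("b", "c", -1), ("c", "d", +2), ("a", "d", -1)]
print(frustration(edges, cover, alpha=0.5))
```

Only the negative edge between b and c violates the expected structure here, so lower values indicate covers that agree better with the edge signs.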
Slide 12
Synthetic Network Generator
• Follows the idea of [LiLJ14] and is based on the Lancichinetti-Fortunato-Radicchi (LFR) model (directed and unweighted) and a model from [YaCL07]
• Parameters
- From LFR: no. of nodes, average/max degree, negative exponents of the power-law degree and community size distributions, min/max community size, no. of overlapping nodes, no. of communities, fraction of edges that each node shares with other communities
- From [YaCL07]: proportion of negative edges inside communities (P-) and proportion of positive edges between communities (P+)
• Generation
- Step 1: Generate a normal LFR network
- Step 2: Negate all inter-community edges
- Step 3: Randomly negate a fraction P- of all intra-community edges
- Step 4: Randomly negate a fraction P+ of all inter-community edges
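The sign-assignment part of the generation procedure can be sketched in Python. The sketch does not generate the LFR network itself; it takes an existing edge list and community memberships as input and assumes that an edge counts as intra-community when its endpoints share at least one community. The function name, the rounding of the sampled fractions and the toy input are hypothetical.

```python
import random

def assign_signs(edges, membership, p_minus, p_plus, seed=0):
    """Assign +1/-1 signs to the edges of an unsigned network.

    edges:      list of (u, v) pairs, e.g. taken from an LFR network
    membership: {node: set of community ids}
    p_minus:    fraction of intra-community edges flipped to negative
    p_plus:     fraction of inter-community edges flipped to positive
    """
    rng = random.Random(seed)
    intra = [e for e in edges if membership[e[0]] & membership[e[1]]]
    inter = [e for e in edges if not (membership[e[0]] & membership[e[1]])]
    # Start from the ideal structure: positive inside, negative between.
    signs = {e: +1 for e in intra}
    signs.update({e: -1 for e in inter})
    # Add noise: negate a share of intra edges and of inter edges.
    for e in rng.sample(intra, round(p_minus * len(intra))):
        signs[e] = -1
    for e in rng.sample(inter, round(p_plus * len(inter))):
        signs[e] = +1
    return signs

# Toy input: node c overlaps communities 0 and 1.
membership = {"a": {0}, "b": {0}, "c": {0, 1}, "d": {1}, "e": {1}}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("a", "d"), ("b", "e")]
signs = assign_signs(edges, membership, p_minus=0.5, p_plus=0.5)
print(signs)
```

With P- = P+ = 0 the result is a perfectly structured signed network; raising either value adds edges that contradict the community structure.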
Slide 13
Experiments on Benchmark Networks: Community Structure (1)
[Figure: Community Distribution, two panels (Maxc=35 and Maxc=40); x-axis: Community Size; y-axis: No. of Communities; series: SDMID, MEA, SPM, Ground Truth]
Parameters: n=100, k=3, maxk=6, μ=0.1, t1=-2.0, t2=-1.0, minc=5, on=5, om=2, P-=0.01, P+=0.01
• SDMID's community distribution is closer to the ground truth than those of MEA and SPM
• SPM detects the largest communities
Slide 14
Experiments on Benchmark Networks: Community Structure (2)
[Figure: bar charts of Standalone Nodes and of Nodes in Overlapping Communities, three panels each; y-axis: No. of Nodes; series: SDMID, MEA, SPM, Ground Truth]
Nodes in Overlapping Communities, data labels per panel (SDMID, MEA, SPM, Ground Truth): 221, 1, 13, 5; 208, 17, 9, 5; 157, 11, 11, 5
• MEA detects the highest number of standalone nodes
• SDMID also identifies some of the nodes as standalone
• SDMID assigns most of the nodes as overlapping
Slide 15
Experiment on Real World Network Wiki-Elec: Metric Values
[Figure: Experiment on Wiki-Elec; bar chart of Modularity, Frustration and Execution Time in Minutes per algorithm]
Algorithm   Modularity   Frustration   Execution Time in Minutes
SDMID       0.28         0.10          0.16
MEA         0.21         0.11          3,101
SPM         0.26         0.10          1,760
• SDMID has the highest modularity value
• SDMID and SPM obtain the lowest frustration values
• SDMID is the best regarding the execution time
Slide 16
Experiments on Real World Network Wiki-Elec: Community Structure
[Figure: Community Distribution (size > 1); x-axis: Community Size; y-axis: No. of Communities; series: SDMID, MEA, SPM]
Algorithm   Standalone Nodes   Nodes in Overlapping Communities
SDMID       149                6,853
MEA         3,250              5
SPM         77                 6,354
• MEA detects most of the nodes as standalone and most of the nodes are in one community
• The fewest standalone nodes are observed for SDMID and SPM
• SDMID and SPM both detect a high number of overlapping nodes
Slide 17
Experiment Summary: Evaluation Radar
[Figure: radar charts for SDMID, MEA and SPM; Wiki-Elec Dataset axes: Modularity, Frustration, Execution Time; Benchmark Networks axes: Modularity, Frustration, NMI, Execution Time]
• On Wiki-Elec, SDMID has the best performance regarding modularity, execution time and frustration
• On the benchmark networks, SDMID has better performance regarding modularity, execution time and NMI
• SPM performs better regarding frustration
Slide 18
Experiment Summary: Community Structure
• SDMID
- Big-sized communities
- Large areas of overlapping
• MEAs-SN
- Small-sized communities
- Few nodes in the overlapping area
- Large number of stand-alone nodes
• SPM
- Predefined number of communities k
- Large areas of overlapping with a small k
Slide 19
Conclusion & Message
• We compared the SDMID, SPM and MEA OCD algorithms from different aspects
• There are few algorithms for overlapping community detection in signed networks
• Currently, SDMID and SPM are the best options for signed network datasets
• SDMID is the fastest and has the highest modularity
• SDMID obtained the best performance on the real world network Wiki-Elec
• SDMID might be a better choice when diffusion of opinions across community borders is preferred
Slide 20
References
• [CWYT14] Yi Chen, Xiaolong Wang, Bo Yuan and Buzhou Tang. Overlapping Community Detection in Networks with Positive and Negative Links. In: Journal of Statistical Mechanics: Theory and Experiment 2014.3: P03021, 2014.
• [LiLJ14] Chenlong Liu, Jing Liu and Zhongzhou Jiang. A Multiobjective Evolutionary Algorithm Based on Similarity for Community Detection from Signed Social Networks. In: IEEE Transactions on Cybernetics 44.12: pp. 2274-2286, 2014.
• [ShKl15] Mohsen Shahriari and Ralf Klamma. Signed Social Networks: Link Prediction and Overlapping Community Detection. In: Proceedings of IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. 2015.
• [YaCL07] Bo Yang, William K. Cheung and Jiming Liu. Community Mining from Signed Social Networks. In: IEEE Transactions on Knowledge and Data Engineering 19.10: pp. 1333-1348, 2007.
• [ZhLi07] Qingfu Zhang and Hui Li. MOEA/D: A Multiobjective Evolutionary Algorithm Based on Decomposition. In: IEEE Transactions on Evolutionary Computation 11.6: pp. 712-731, 2007.
Slide 21
Thank you!