Subscriber Churn Prediction Model using Social Network Analysis in Telecommunication Industry, by Chettapong Punyachonkool and Dr. Arnond Sakworawich
Presented at The First NIDA Business Analytics and Data Sciences Contest/Conference, organized by the School of Applied Statistics and Data Sciences Thailand
1. Subscriber Churn Prediction Model
using Social Network Analysis
In Telecommunication Industry
Chettapong Punyachonkool
Dr. Arnond Sakworawich
The First NIDA Business Analytics and Data Sciences Contest/Conference
September 2, 2016
2. Chettapong Punyachonkool
Data Engineer, Business Intelligence Strategy
The Siam Commercial Bank
chettapongp@gmail.com
www.linkedin.com/in/chettapong-punyachonkool
Business Analytic and
Research
Applied Statistics, NIDA
3. Topics
› Social Network Analysis basic concepts
› Social Network Analysis with R
› Visualizing Social Network
› Using SNA to predict Subscriber Churn in
Telco
5. Social Network
Social Network: A social structure composed of
individuals (or organizations) interconnected by one or
more specific types of interdependencies such as
friendship, kinship, financial exchanges, communication
exchanges, etc.
Source: Wael Elrifai (2013), Social Network Analysis: Practical Uses and Implementation.
Peak Consulting, http://www.peakconsulting.eu
6. Social Network Analysis
Social Network Analysis: The application of graph
theory to understand, categorize and quantify relationships
in a social network.
7. Why should you care about SNA?
Traditional marketing practices are becoming obsolete.
• Test and control group methodologies no longer work
as intended.
• Information exchange between individuals within
an online social network is extremely high.
• Difficult to keep control group “pure”.
• Need to understand behaviour across and within
communities rather than focusing just on individuals.
• Leverage (and protect against) high velocity of
information exchange within on-line social networks.
8. Why should you care about SNA?
Customers are sceptical: if you want to sell your
products to your customers, convince their friends.
Use social network analysis to understand more about
your customers and their communities.
9. Customer with the Role of an Influencer
• Influential user adopts a product or behaviour.
• Influential user tells (and influences) his or her
immediate contacts within the community.
• These immediate contacts tell their contacts.
It is important...
• To identify these people.
• To influence these people.
• To monitor the behaviour of these people.
10. Social Network Analysis Application
Source: http://www.martingrandjean.ch/connected-world-air-traffic-network/
13. Social Network Analysis Application
Source: Apichart Wisitkitchakarn (2013), Risk Analysis of East Asian Stock Markets,
The Capital Market Research Institute, The Stock Exchange of Thailand.
14. Social Network Analysis Application
Source: Valdis Krebs (2001). Connecting the Dots. Tracking Two Identified Terrorists
http://orgnet.com/tnet.html
15. Source: Dr. Giorgos Cheliotis, (201x). Social Network Analysis (SNA) including a tutorial on concepts and methods.
Communications and New Media, National University of Singapore
Social Network Analysis (SNA)
including a tutorial on concepts and methods
Social Media – Dr. Giorgos Cheliotis (gcheliotis@nus.edu.sg)
Communications and New Media, National University of Singapore
16. Practical applications
A very early example of network analysis
comes from the city of Königsberg (now
Kaliningrad). The mathematician Leonhard
Euler used a graph to prove that there is no
path that crosses each of the city’s bridges
exactly once (Newman et al, 2006).
SNA has its origins in both social science and in the
broader fields of network analysis and graph theory
Network analysis concerns itself with the
formulation and solution of problems that have a
network structure; such structure is usually
captured in a graph (see the circled structure to the right)
Graph theory provides a set of abstract concepts
and methods for the analysis of graphs. These, in
combination with other analytical tools and with
methods developed specifically for the visualization
and analysis of social (and other) networks, form
the basis of what we call SNA methods.
But SNA is not just a methodology; it is a unique
perspective on how society functions. Instead of
focusing on individuals and their attributes, or on
macroscopic social structures, it centers on relations
between individuals, groups, or social institutions
17. Basic Concepts
• Networks: how to represent various social networks
• Tie Strength: how to identify strong/weak ties in the network
• Key Players: how to identify key/central nodes in the network
• Cohesion: measures of overall network structure
18. Representing relations as networks
Can we study their interactions as a network?
[Figure: graph with vertices 1 to 4 representing Anne, Jim, Mary and John]
Communication:
Anne: Jim, tell the Murrays they’re invited
Jim: Mary, you and your dad should come for dinner!
Jim: Mr. Murray, you should both come for dinner
Anne: Mary, did Jim tell you about the dinner? You must come.
John: Mary, are you hungry?
…
19. Network terminology
[Figure: graph with four vertices (1 to 4) joined by edges]
Points    Lines            Field
vertices  edges, arcs      math
nodes     links            computer science
sites     bonds            physics
actors    ties, relations  sociology
20. Entering data on a directed graph
Graph (directed) on vertices 1 to 4
Edge list:
Vertex  Vertex
1       2
1       3
2       3
2       4
3       4
Adjacency matrix (row = source vertex, column = target vertex):
Vertex  1  2  3  4
1       -  1  1  0
2       0  -  1  1
3       0  0  -  1
4       0  0  0  -
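The conversion from an edge list to an adjacency matrix can be sketched in base R. This is only an illustration: the variable names are made up for the example, and rows are assumed to be source vertices and columns target vertices.

```r
# Directed edge list from the slide: one row per edge, source then target
edges <- matrix(c(1, 2,
                  1, 3,
                  2, 3,
                  2, 4,
                  3, 4), ncol = 2, byrow = TRUE)

n <- max(edges)                        # number of vertices (4)
adj <- matrix(0, nrow = n, ncol = n)   # start with no edges
adj[edges] <- 1                        # a two-column index sets adj[source, target]

# Undirected version: an edge in either direction counts for both vertices
adj_undirected <- (adj + t(adj) > 0) * 1

print(adj)
print(adj_undirected)
```

The diagonal is left at 0 here, where the slide prints "-" for the undefined self-relation.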
22. Representing an undirected graph
Graph (undirected) on vertices 1 to 4
Edge list remains the same:
Vertex  Vertex
1       2
1       3
2       3
2       4
3       4
Adjacency matrix becomes symmetric:
Vertex  1  2  3  4
1       -  1  1  0
2       1  -  1  1
3       1  1  -  1
4       0  1  1  -
23. Basic Concepts
• Tie Strength: how to identify strong/weak ties in the network
24. Adding weights to edges
Edge list: add a column of weights
Vertex  Vertex  Weight
1       2       30
1       3       5
2       3       22
2       4       2
3       4       37
Adjacency matrix: add weights instead of 1
Vertex  1   2   3   4
1       -   30  5   0
2       30  -   22  2
3       5   22  -   37
4       0   2   37  -
[Figure: undirected graph on vertices 1 to 4 with edge weights 30, 2, 37, 22 and 5]
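A minimal base R sketch of the weighted case, using the weights listed on this slide. The data frame and the derived "strength" (sum of edge weights per vertex) are illustrative additions, not part of the original deck.

```r
# Weighted edge list from the slide (undirected)
w_edges <- data.frame(from   = c(1, 1, 2, 2, 3),
                      to     = c(2, 3, 3, 4, 4),
                      weight = c(30, 5, 22, 2, 37))

n <- max(w_edges$from, w_edges$to)
W <- matrix(0, nrow = n, ncol = n)
W[cbind(w_edges$from, w_edges$to)] <- w_edges$weight  # fill one triangle
W <- W + t(W)                                         # mirror it: undirected graph

# Sum of weights per vertex, one simple proxy for how strongly a vertex is tied in
strength <- rowSums(W)
print(W)
print(strength)
```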
25. Adding weights to edges
The communication between Anne, Jim, Mary and John can be drawn as an undirected graph, with weights added to the edges.
[Figure: undirected graph on vertices 1 to 4 with edge weights 30, 37, 22 and 5]
26. Edge weights as relationship strength
• Edges can represent interactions, flows of information or goods, similarities/affiliations, or social relations
• Specifically for social relations, a ‘proxy’ for the strength of a tie can be:
(a) the frequency of interaction (communication) or the amount of flow (exchange)
(b) reciprocity in interaction or flow
(c) the type of interaction or flow between the two parties (e.g., intimate or not)
(d) other attributes of the nodes or ties (e.g., kin relationships)
(e) the structure of the nodes’ neighborhood (e.g. many mutual ‘friends’)
• Surveys and interviews allow us to establish the existence of mutual or one-sided strength/affection with greater certainty, but the proxies above are also useful
27. Basic Concepts
• Key Players: how to identify key/central nodes in the network
28. Interpretation of measures
Centrality measure: interpretation in social networks
• Degree: How many people can this person reach directly?
• Betweenness: How likely is this person to be the most direct route between two people in the network?
• Closeness: How fast can this person reach everyone in the network?
• Eigenvector: How well is this person connected to other well-connected people?
29. Degree centrality
• A node’s in-degree or out-degree is the number of links that lead into or out of the node
• In an undirected graph the two are identical
• Often used as a measure of a node’s degree of connectedness and hence also influence and/or popularity
• Useful in assessing which nodes are central with respect to spreading information and influencing others in their immediate ‘neighborhood’
Hypothetical graph, degree per node:
Node    1  2  3  4  5  6  7
Degree  2  3  4  1  4  1  1
Nodes 3 and 5 have the highest degree (4)
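A short igraph sketch of degree centrality. The figure is not available in this text, so the edge list below is a reconstruction (an assumption) chosen to be consistent with the degree, betweenness and clustering values quoted in these slides.

```r
library(igraph)

# Assumed edge list for the 7-node hypothetical graph (reconstructed, not from the source)
edges <- c(1,2, 1,3, 2,3, 2,5, 3,4, 3,5, 5,6, 5,7)
g <- make_graph(edges, n = 7, directed = FALSE)

deg <- degree(g)                      # number of links per node
print(deg)
top_nodes <- which(deg == max(deg))   # nodes with the highest degree
print(top_nodes)
```

For a directed graph, `degree(g, mode = "in")` and `degree(g, mode = "out")` separate incoming from outgoing links.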
30. Betweenness centrality
• For a given node v, calculate the number of shortest paths between nodes i and j that pass through v, and divide by all shortest paths between nodes i and j
• Sum the above values for all node pairs i,j
• Sometimes normalized such that the highest value is 1 or that the sum of all betweenness centralities in the network is 1
• Shows which nodes are more likely to be in communication paths between other nodes
• Also useful in determining points where the network would break apart (think who would be cut off if nodes 3 or 5 disappeared)
Hypothetical graph, betweenness per node:
Node         1  2    3    4  5  6  7
Betweenness  0  1.5  6.5  0  9  0  0
Node 5 has higher betweenness centrality than 3
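The same calculation can be sketched with igraph's `betweenness()`. The edge list is again a reconstruction (an assumption) that matches the values quoted on the slides, since the figure is not part of this text.

```r
library(igraph)

# Assumed edge list for the 7-node hypothetical graph (reconstructed, not from the source)
edges <- c(1,2, 1,3, 2,3, 2,5, 3,4, 3,5, 5,6, 5,7)
g <- make_graph(edges, n = 7, directed = FALSE)

btw <- betweenness(g)            # raw shortest-path betweenness per node
btw_scaled <- btw / max(btw)     # one normalization: highest value becomes 1
print(btw)
print(round(btw_scaled, 2))
```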
31. Closeness centrality
• Calculate the mean length of all shortest paths from a node to all other nodes in the network (i.e. how many hops on average it takes to reach every other node)
• Take the reciprocal of the above value so that higher values are ‘better’ (indicate higher closeness) like in other measures of centrality
• It is a measure of reach, i.e. the speed with which information can reach other nodes from a given starting node
Hypothetical graph, closeness per node:
Node       1    2     3     4     5     6     7
Closeness  0.5  0.67  0.75  0.46  0.75  0.46  0.46
Nodes 3 and 5 have the highest (i.e. best) closeness, while node 2 fares almost as well
Note: Sometimes closeness is calculated without taking the reciprocal of the mean shortest path length. Then lower values are ‘better’.
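The two steps (mean shortest-path length, then its reciprocal) can be sketched with igraph as follows. The edge list is a reconstruction (an assumption) consistent with the values on the slides.

```r
library(igraph)

# Assumed edge list for the 7-node hypothetical graph (reconstructed, not from the source)
edges <- c(1,2, 1,3, 2,3, 2,5, 3,4, 3,5, 5,6, 5,7)
g <- make_graph(edges, n = 7, directed = FALSE)

d <- distances(g)                           # matrix of shortest-path lengths (hops)
mean_hops <- rowSums(d) / (vcount(g) - 1)   # mean distance to the other nodes
closeness_manual <- 1 / mean_hops           # reciprocal: higher is 'better'

# igraph's built-in measure with normalization uses the same definition
closeness_igraph <- closeness(g, normalized = TRUE)
print(round(closeness_manual, 2))
```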
32. Eigenvector centrality
• A node’s eigenvector centrality is proportional to the sum of the eigenvector centralities of all nodes directly connected to it
• In other words, a node with a high eigenvector centrality is connected to other nodes with high eigenvector centrality
• This is similar to how Google ranks web pages: links from highly linked-to pages count more
• Useful in determining who is connected to the most connected nodes
Hypothetical graph, eigenvector centrality per node:
Node         1     2     3     4     5     6     7
Eigenvector  0.36  0.49  0.54  0.19  0.49  0.17  0.17
Node 3 has the highest eigenvector centrality, closely followed by 2 and 5
Note: The term ‘eigenvector’ comes from mathematics (matrix algebra), but it is not necessary for understanding how to interpret this measure
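For readers who do want to see the matrix algebra, here is a base R sketch: the centrality vector is the principal eigenvector of the adjacency matrix. The edge list is a reconstruction (an assumption) consistent with the values on the slides, and scaling the vector to unit length is one convention among several.

```r
# Assumed edge list for the 7-node hypothetical graph (reconstructed, not from the source)
edges <- matrix(c(1,2, 1,3, 2,3, 2,5, 3,4, 3,5, 5,6, 5,7), ncol = 2, byrow = TRUE)
A <- matrix(0, 7, 7)
A[edges] <- 1
A <- A + t(A)                      # symmetric adjacency matrix (undirected graph)

e <- eigen(A, symmetric = TRUE)    # eigenvalues are returned in decreasing order
lambda <- e$values[1]              # largest eigenvalue
v <- abs(e$vectors[, 1])           # principal eigenvector, unit length, made non-negative

# Defining property: each score is proportional to the sum of its neighbours' scores
neighbour_sums <- as.numeric(A %*% v)
print(round(v, 2))
print(which.max(v))
```

With igraph loaded, `eigen_centrality(g)$vector` returns the same ranking, scaled so that the largest score is 1.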
38. Basic Concepts
• Cohesion: how to characterize a network’s structure
39. Reciprocity (degree of)
[Figure: directed graph on vertices 1 to 4]
• The ratio of the number of relations which are reciprocated (i.e. there is an edge in both directions) over the total number of relations in the network
• …where two vertices are said to be related if there is at least one edge between them
• In the example network this would be 2/5=0.4 (whether this is considered high or low depends on the context)
• A useful indicator of the degree of mutuality and reciprocal exchange in a network, which relate to social cohesion
• Only makes sense in directed graphs
Reciprocity for network = 0.4
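The ratio can be sketched in base R. The figure's exact arrows are not recoverable from this text, so the directed graph below is hypothetical: it is built only to have five related pairs, two of them reciprocated, as in the slide's 2/5 example.

```r
# Hypothetical directed graph: 5 related pairs, 2 of them mutual (assumption)
arcs <- rbind(c(1, 2), c(2, 1),   # pair {1,2}: reciprocated
              c(3, 4), c(4, 3),   # pair {3,4}: reciprocated
              c(1, 3),            # one-way
              c(2, 4),            # one-way
              c(2, 3))            # one-way
A <- matrix(0, 4, 4)
A[arcs] <- 1

mutual  <- sum(A == 1 & t(A) == 1) / 2   # pairs with edges in both directions
related <- sum(A == 1 | t(A) == 1) / 2   # pairs with at least one edge
reciprocity_ratio <- mutual / related
print(reciprocity_ratio)
```

In igraph, `reciprocity(g, mode = "ratio")` follows this pair-based definition, while the default mode counts reciprocated edges instead of pairs.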
40. Density
• A network’s density is the ratio of the number of edges in the network over the total number of possible edges between all pairs of nodes (which is n(n-1)/2, where n is the number of vertices, for an undirected graph)
• In the example network density=5/6=0.83 (i.e. it is a fairly dense network; the opposite would be a sparse network)
• It is a common measure of how well connected a network is (in other words, how closely knit it is); a perfectly connected network is called a clique and has density=1
• A directed graph will have half the density of its undirected equivalent, because there are twice as many possible edges, i.e. n(n-1)
• Density is useful in comparing networks against each other, or in doing the same for different regions within a single network
[Figure: the same graph on vertices 1 to 4 drawn undirected (density = 5/6 = 0.83) and directed (density = 5/12 = 0.42); the legend distinguishes edges present in the network from edges that are possible but not present]
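The arithmetic behind both density values, as a small base R sketch (the function name is illustrative):

```r
# Density = edges present / edges possible
network_density <- function(n_vertices, n_edges, directed = FALSE) {
  possible <- n_vertices * (n_vertices - 1)   # ordered pairs
  if (!directed) possible <- possible / 2     # unordered pairs
  n_edges / possible
}

density_undirected <- network_density(4, 5)                   # 5 of 6 possible edges
density_directed   <- network_density(4, 5, directed = TRUE)  # 5 of 12 possible edges
print(round(c(density_undirected, density_directed), 2))
```

With an igraph object, `edge_density(g)` computes the same ratio.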
41. Clustering
• A node’s clustering coefficient is the number of closed triplets in the node’s neighborhood over the total number of triplets in the neighborhood. It is also known as transitivity.
$$\text{clustering coefficient} = \frac{CT}{TT}$$
where CT is the number of closed triplets and TT is the total number of triplets.
Clustering coefficient per node:
Node         1  2     3     4    5     6    7
Coefficient  1  0.67  0.33  N/a  0.17  N/a  N/a
Network clustering coefficient = 0.375
(3 nodes in each triangle x 2 triangles = 6 closed triplets divided by 16 total)
[Figure: example graph on nodes 1 to 7, split into Cluster A and Cluster B]
42. Clustering
• E.g., node 1 has a value of 1 because it is only connected to 2 and 3, and these nodes are also connected to one another (i.e. the only triplet in the neighborhood of 1 is closed). We say that nodes 1, 2, and 3 form a clique.
$$\text{clustering coefficient (node 1)} = \frac{3}{3} = 1$$
43. Clustering
The clustering coefficient of a node can also be computed from its degree:
$$CC_n = \frac{2 N_n}{K_n (K_n - 1)}$$
K_n = degree of node n
N_n = number of links between the neighbors of node n
CC_n = clustering coefficient of node n
For node 2, which has 3 neighbors with 2 links between them:
$$CC_2 = \frac{2 \cdot 2}{3 \cdot (3 - 1)} = \frac{4}{6} = 0.67$$
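The formula can be checked with a short base R sketch. The edge list is a reconstruction (an assumption) consistent with the per-node values quoted on these slides, and the function name is illustrative.

```r
# Assumed edge list for the 7-node hypothetical graph (reconstructed, not from the source)
edges <- matrix(c(1,2, 1,3, 2,3, 2,5, 3,4, 3,5, 5,6, 5,7), ncol = 2, byrow = TRUE)
A <- matrix(0, 7, 7)
A[edges] <- 1
A <- A + t(A)

# CC_n = 2 * N_n / (K_n * (K_n - 1)); undefined (NA) for nodes with fewer than 2 neighbours
local_cc <- function(A, v) {
  nb <- which(A[v, ] == 1)          # neighbours of v
  k <- length(nb)                   # K_n: degree
  if (k < 2) return(NA_real_)
  links <- sum(A[nb, nb]) / 2       # N_n: links among the neighbours
  2 * links / (k * (k - 1))
}
cc <- sapply(1:7, function(v) local_cc(A, v))

# Network level: closed triplets over all triplets
k_all <- rowSums(A)
total_triplets  <- sum(k_all * (k_all - 1) / 2)
closed_triplets <- sum(sapply(1:7, function(v) {
  nb <- which(A[v, ] == 1)
  sum(A[nb, nb, drop = FALSE]) / 2
}))
global_cc <- closed_triplets / total_triplets
print(round(cc, 2))
print(global_cc)
```

With igraph, `transitivity(g, type = "local")` and `transitivity(g, type = "global")` give the corresponding node-level and network-level values.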
44. Clustering
• Clustering algorithms identify clusters or ‘communities’ within networks based on network structure and specific clustering criteria
– Hierarchical clustering
– Similarity-based clustering
– Betweenness clustering (the example with two clusters, A and B, is based on edge betweenness, an equivalent for edges of the betweenness centrality presented earlier for nodes)
[Figure: example graph on nodes 1 to 7, split into Cluster A and Cluster B]
45. Average and longest distance
• The longest shortest path (distance) between any two nodes in a network is called the network’s diameter
• The diameter of the example network is 3; it is a useful measure of the reach of the network (as opposed to looking only at the total number of vertices or edges)
• It also indicates how long it will take at most to reach any node in the network (sparser networks will generally have greater diameters)
• The average of all shortest paths in a network is also interesting because it indicates how far apart any two nodes will be on average (average distance)
[Figure: example graph on nodes 1 to 7 with the diameter path highlighted]
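Both quantities can be sketched with igraph. The edge list is a reconstruction (an assumption) consistent with the values quoted on the slides.

```r
library(igraph)

# Assumed edge list for the 7-node hypothetical graph (reconstructed, not from the source)
edges <- c(1,2, 1,3, 2,3, 2,5, 3,4, 3,5, 5,6, 5,7)
g <- make_graph(edges, n = 7, directed = FALSE)

diam <- diameter(g)            # longest shortest path, in hops
avg_dist <- mean_distance(g)   # mean shortest-path length over all node pairs
print(diam)
print(avg_dist)
```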
48. Katherine Ognyanova, www.kateto.net
NetSciX 2016 School of Code Workshop, Wroclaw, Poland
Assistant Professor at the School of Communication and
Information at Rutgers University.
Network Analysis and
Visualization with R and igraph
49. Basic SNA with R and igraph
} Networks
} Tie Strength
} Key Players
} Cohesion
Source: Asst. Prof Katherine Ognyanova. Network Analysis with R and igraph: NetSci X Tutorial
the NetSciX 2016 School of Code Workshop, Wroclaw, Poland (www.kateto.net)
50. Basic SNA with R and igraph
• Networks
– Create network
– Edges, vertices and attributes
– Read network data from files
– Turning networks into igraph objects
• Tie Strength
• Key Players
• Cohesion
51. Create Network
› library(igraph)
› # Undirected triangle; make_graph() is the igraph 1.0+ name for graph()
› g1 <- make_graph( edges=c(1,2, 2,3, 3,1), n=3, directed=FALSE )
› plot(g1)
52. Create Network
› # Directed by default; n=10 leaves vertices 4 to 10 isolated
› g2 <- make_graph( edges=c(1,2, 2,3, 3,1), n=10 )
› plot(g2)
53. Create Network
› # With named vertices, the number of vertices does not need to be given
› g3 <- make_graph( c("John", "Jim", "Jim", "Jill", "Jill", "John") )
› plot(g3)
54. Create Network
› # Named edges (including a repeated edge and a self-loop) plus isolated vertices
› g4 <- make_graph( c("John", "Jim",
"Jim", "Jack", "Jim", "Jack",
"John", "John"),
isolates=c("Jesse", "Janis",
"Jennifer", "Justin") )
› # Plot options can be passed directly to plot()
› plot(g4, edge.arrow.size=.5,
vertex.color="gold",
vertex.size=15,
vertex.frame.color="gray",
vertex.label.color="black",
vertex.label.cex=0.8,
vertex.label.dist=2,
edge.curved=0.2)
55. Create Network
› # Undirected ties: dashes only
› plot(graph_from_literal(a---b, b---c))
› # Directed ties: + marks the arrowhead (a to b, c to b)
› plot(graph_from_literal(a--+b, b+--c))
› # Mutual ties: + on both ends
› plot(graph_from_literal(a+-+b, b+-+c))
56. Edge, Vertex and Attributes
› # The edges of the object
› E(g4)
› # The vertices of the object
› V(g4)
› # The network matrix
› g4[]
57. Edge, Vertex and Attributes
Attributes of g4:
vertex_attr: name, gender
edge_attr: type, weight
58. Read Network from files
3.1 DATASET 1: edgelist
› Dataset1-Media-Example-NODES.csv
› Dataset1-Media-Example-EDGES.csv
3.2 DATASET 2: matrix
› Dataset2-Media-User-Example-NODES.csv
› Dataset2-Media-User-Example-EDGES.csv
59. Turning networks into igraph objects
We start by converting the raw data to an igraph network object.
Here we use igraph’s graph.data.frame function (named graph_from_data_frame
in igraph 1.0 and later), which takes two data frames: d and vertices.
d describes the edges of the network. Its first two columns are the
IDs of the source and the target node for each edge. The following
columns are edge attributes (weight, type, label, or anything else).
vertices starts with a column of node IDs. Any following columns
are interpreted as node attributes.
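A minimal sketch of this conversion with `graph_from_data_frame()`. The two data frames are invented toy data, not the tutorial's CSV files; only the column layout follows the description above.

```r
library(igraph)

# Toy edge table (hypothetical): source ID, target ID, then edge attributes
edges_df <- data.frame(from   = c("s01", "s01", "s02"),
                       to     = c("s02", "s03", "s03"),
                       weight = c(10, 3, 5),
                       type   = c("hyperlink", "mention", "hyperlink"))

# Toy node table (hypothetical): node ID first, then node attributes
nodes_df <- data.frame(id    = c("s01", "s02", "s03", "s04"),
                       media = c("Outlet A", "Outlet B", "Outlet C", "Outlet D"))

net <- graph_from_data_frame(d = edges_df, vertices = nodes_df, directed = TRUE)

print(vcount(net))      # nodes, including the isolated s04
print(ecount(net))      # edges
print(E(net)$weight)    # edge attribute taken from the third column of d
print(V(net)$media)     # node attribute taken from the second column of vertices
```

Passing `vertices` is what keeps nodes that appear in no edge, such as s04 here.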
64. Basic SNA with R and igraph
• Networks
• Tie Strength
– Plotting networks with igraph
– Assign weight
– Assign type of tie
– Delete ties
• Key Players
• Cohesion
65. Plotting networks with igraph
We can set the node & edge options in two ways
› Specify them in the plot()
› Set attributes and add them to the igraph object
66. Basic SNA with R and igraph
• Networks
• Tie Strength
• Key Players
– Network centrality with igraph: degree, closeness, betweenness, eigenvector
• Cohesion
67. Network centrality with igraph
› Degree centrality
› Closeness centrality
› Betweenness centrality
› Eigenvector centrality
69. Basic SNA with R and igraph
• Networks
• Tie Strength
• Key Players
• Cohesion
– Density and reciprocity
– Clustering
– Average and longest distance
70. Transitivity with igraph
71. Community Clustering based on edge betweenness
72. Averages & longest distance
73. Using SNA to predict
Subscriber churn
in Telecommunication industry
75. What is Customer Churn?
“Churn represents the loss of an existing customer to a competitor”
A prevalent problem in:
› Telecommunication services
› Home mortgage refinance
› Credit card
Churn is especially important to mobile phone service providers
› Easy for a subscriber to switch services.
› Mobile Number Portability (MNP) will remove the last important obstacle
77. Core CRM in Telecommunication
§ Segmentation
§ Predictive Analytics
§ Customer Acquisition
– Cost of Customer Acquisition (COCA)
§ Servicing
– QoS (Network)
– Call Center, Service Hall/Center (Net Promoter Score)
§ Customer Retention
– Churn Prediction
§ Customer Lifetime Value (CLV)
78. Segmentation
Most telcos define their customer segments using some of the following:
› Payment type (prepaid vs. postpaid)
› ARPU (Average Revenue Per User: revenue generated)
› Tenure (age of user: AOU)
› Demographics (location, income, job, gender, age, etc.)
› Usage
– voice, data, other non-voice, roaming
› Handsets/Devices
– 2G, 3G, 4G device, smartphone vs. feature phone
› Package
– Package, price plans
79. Predictive Analytics
Process current and historical data in order to make predictions
about future events.
› Making customer decisions.
› Next Best Offer
› Package & Price plan
› Cross-sell & Up-sell opportunities
› Credit scoring for setting dynamic limits (risk management)
› Fraud detection (postpaid only)
› Revenue Allocation
› Customer Lifetime Value
80. Servicing
› Quality of Service
– Network Utilization
– Dropped calls
› Call Center, Service Hall
– Number of times the customer contacts (complains to) the Call Center / Service Hall
– Service Scoring (Net Promoter Score)
81. Customer Retention
› Type of Churn
– Voluntary Churn
– Involuntary Churn
› Type of Customer Retention
– Reactive
– Proactive
83. Data Source
Customer Demographic:
• Zip code
• Income
• Occupation
• Age
• Gender
• Living Address
• Occupation Address
Order:
• Customer Type (Corp/SME/Individual)
• Payment Type (Pre/Post)
• Current Package
• Package Plan
• ARPU
• Additional product/service
Customer Relation:
• Number of questions about the services, e.g. from IVR
• Number of visits to retail shops or online website
• Number of Complaints solved
• Number of total complaints
Service Usage: (CDR)
• Number of calls
• Volume of Data usage
• Number of Outgoing calls
• Number of Incoming calls
• Number of Roaming calls
• Number of International calls
• Total minutes of usage (MOU)/Volume
• Number of Drop calls
84. Data Source
Billing Data:
• Total amount of bill
• Total number of barred (one-way barred)
• Total number of full barred (two-way
barred)
Network:
• Cell Site Location
• Network Type ( 4G/3G/2G )
• Network Utilization
• QoS
93. Summarize the call duration in minutes (Call Duration) and the number of calls (Number of Calls) per subscriber pair, i.e. per caller (MO) and callee (MT). This gives 1,747,835 relationship (link) transactions.
MO  MT  Call Duration  Number of Calls
A   B   3              1
A   C   2              4
A   D   16             2
B   D   23             9
E   D   1              1
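The summarization step can be sketched in base R. The raw call records below are invented for illustration (the deck does not show its CDR layout), and the column names are assumptions.

```r
# Hypothetical raw call records: one row per call (caller, callee, minutes)
cdr <- data.frame(MO      = c("A", "A", "A", "A", "B"),
                  MT      = c("B", "C", "C", "D", "D"),
                  minutes = c(3, 1, 1, 16, 23))

# Total minutes and number of calls for each caller-callee pair
dur <- aggregate(minutes ~ MO + MT, data = cdr, FUN = sum)
cnt <- aggregate(minutes ~ MO + MT, data = cdr, FUN = length)
names(dur)[3] <- "call_duration"
names(cnt)[3] <- "number_of_calls"

# One row per link: this is the weighted edge list of the call network
links <- merge(dur, cnt, by = c("MO", "MT"))
print(links)
```

Each row of `links` can then serve as a weighted edge, with either column used as the tie-strength proxy.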
94. Group the links using K-means clustering, with call duration in minutes (Call Duration) and number of calls (Number of Calls) as the clustering variables and k = 3. The resulting clusters are shown in the figure.
Blue : Cluster 1
Yellow : Cluster 2
Red : Cluster 3
95. Why choose K = 3?
1. Consider the number of members in each cluster as the number of clusters increases.
Number of transactions in each cluster, by number of clusters:
Cluster  K = 2      K = 3      K = 4      K = 5      K = 6
1        1,737,166  1,725,216  1,705,381  1,675,572  1,638,302
2        10,669     21,801     38,463     61,628     88,167
3        -          818        3,831      9,388      16,924
4        -          -          160        1,160      3,701
5        -          -          -          87         664
6        -          -          -          -          77
2. Consider the ratio of inter-cluster distances to intra-cluster distances as the number of clusters increases.
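A sketch of how the two criteria could be tabulated with base R's `kmeans()`. The data are simulated stand-ins, not the study's links, and the ratio of between-cluster to within-cluster sum of squares is used here as one possible proxy for the inter/intra-cluster distance ratio; the deck does not say which statistic was used.

```r
set.seed(42)

# Simulated links (assumption): many light pairs, some medium, a few very heavy
links <- data.frame(
  call_duration   = c(rexp(300, 1/3), rexp(40, 1/200), rexp(10, 1/2000)),
  number_of_calls = c(rpois(300, 2) + 1, rpois(40, 100), rpois(10, 350))
)

results <- lapply(2:6, function(k) {
  fit <- kmeans(links, centers = k, nstart = 10)
  list(k     = k,
       sizes = sort(fit$size, decreasing = TRUE),     # criterion 1: members per cluster
       ratio = fit$betweenss / fit$tot.withinss)      # criterion 2: separation proxy
})

for (r in results) {
  cat("k =", r$k, "| sizes:", r$sizes, "| between/within:", round(r$ratio, 2), "\n")
}
```

The two variables are on very different scales, so standardizing them (for example with `scale()`) before clustering is a design choice worth considering.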
96. Blue : Cluster 1, Yellow : Cluster 2, Red : Cluster 3
Cluster    Median of Call Duration  Median of Number of Calls  Group Description
Cluster#1  2.13                     2.00                       Few calls, short calls
Cluster#2  1,998.03                 348.50                     Few calls, long calls
Cluster#3  215.97                   101.00                     Frequent calls, short calls