Machine Learning for Data Mining
Hierarchical Clustering
Andres Mendez-Vazquez
July 27, 2015
Outline
1 Hierarchical Clustering
Definition
Basic Ideas
2 Agglomerative Algorithms
Introduction
Problems with Agglomerative Algorithms
Two Categories of Agglomerative Algorithms
Matrix Based Algorithms
Graph Based Algorithms
3 Divisive Algorithms
Introduction
4 Algorithms for Large Data Sets
Introduction
Clustering Using REpresentatives (CURE)
Concepts
Hierarchical Clustering Algorithms
They are quite different from the previous clustering algorithms.
Actually
They produce a hierarchy of clusterings.
Dendrogram: Hierarchical Clustering
Hierarchical Clustering
The clustering is obtained by cutting the dendrogram at a desired level:
Each connected component forms a cluster.
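To make the cut concrete, here is a minimal sketch assuming SciPy and NumPy are available (the slides do not prescribe any library): `linkage` builds the dendrogram and `fcluster` cuts it at a chosen dissimilarity level, so each connected component below the cut becomes one cluster.

```python
# Minimal sketch of dendrogram cutting, assuming SciPy; the data is made up.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Two well-separated blobs in the plane.
X = np.vstack([rng.normal(0, 0.5, (10, 2)), rng.normal(5, 0.5, (10, 2))])

Z = linkage(X, method="single")                    # build the hierarchy
labels = fcluster(Z, t=2.0, criterion="distance")  # cut at height 2.0
print(labels)  # points in the same component get the same label
```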
Example
[Figure: dendrogram]
Basic Ideas
At each step t
A new clustering is obtained based on the clustering produced at the previous step t − 1.
Two Main Types
1 Agglomerative Algorithms.
1 Start with each item being a single cluster.
2 Eventually, all items belong to the same cluster.
2 Divisive Algorithms.
1 Start with all items belonging to the same cluster.
2 Eventually, each item forms a cluster on its own.
A minimal run of the agglomerative direction is sketched below.
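As a quick illustration of the bottom-up direction, a minimal sketch assuming scikit-learn is available (divisive schemes are not part of scikit-learn); `n_clusters` selects the level of the hierarchy at which merging stops.

```python
# A minimal agglomerative run, assuming scikit-learn; the data is illustrative.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
# n_clusters picks the level of the hierarchy at which we stop merging.
model = AgglomerativeClustering(n_clusters=2, linkage="single").fit(X)
print(model.labels_)  # e.g. [0 0 1 1] (label numbering may differ)
```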
Therefore
Given the previous ideas
It is necessary to define the concept of nesting!!!
After all, we are given a divisive and an agglomerative procedure.
Nested Clustering
Definition
1 A clustering $\mathcal{R}_i$ containing k clusters is said to be nested in the clustering $\mathcal{R}_{i+1}$, which contains r < k clusters, if each cluster in $\mathcal{R}_i$ is a subset of a cluster in $\mathcal{R}_{i+1}$.
2 At least one cluster of $\mathcal{R}_i$ is a proper subset of a cluster in $\mathcal{R}_{i+1}$.
This is written as
$\mathcal{R}_i \sqsubset \mathcal{R}_{i+1}$ (1)
Example
We have
The following set {x1, x2, x3, x4, x5}.
With the following structures
$\mathcal{R}_1 = \{\{x_1, x_3\}, \{x_4\}, \{x_2, x_5\}\}$
$\mathcal{R}_2 = \{\{x_1, x_3, x_4\}, \{x_2, x_5\}\}$
Here $\mathcal{R}_1 \sqsubset \mathcal{R}_2$.
Again
Hierarchical Clustering produces a hierarchy of clusterings!!! A small check of the nesting relation is sketched below.
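A small helper makes the definition operational; this is a sketch (the function name and set encoding are my own), checking both the subset condition and the proper-subset condition.

```python
# Check the nesting relation R1 ⊏ R2 from the definition above; clusterings
# are encoded as lists of frozensets. Names here are illustrative only.
def is_nested(R1, R2):
    subset = all(any(c1 <= c2 for c2 in R2) for c1 in R1)  # every cluster fits
    proper = any(any(c1 < c2 for c2 in R2) for c1 in R1)   # one strictly smaller
    return subset and proper

R1 = [frozenset({"x1", "x3"}), frozenset({"x4"}), frozenset({"x2", "x5"})]
R2 = [frozenset({"x1", "x3", "x4"}), frozenset({"x2", "x5"})]
print(is_nested(R1, R2))  # True
```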
Agglomerative Algorithms
Initial State
You have N clusters, each containing one element of the data set X.
At each step t, you have a clustering $\mathcal{R}_t$ with N − t clusters.
Then, a new clustering structure $\mathcal{R}_{t+1}$ is generated by merging two clusters.
In that way...
We have
At each step, each cluster of $\mathcal{R}_t$ is a subset of a cluster in $\mathcal{R}_{t+1}$ (with at least one proper subset), or
$\mathcal{R}_t \sqsubset \mathcal{R}_{t+1}$ (2)
The Basic Algorithm for Agglomerative Clustering
For this
We have a function g (Ci, Cj) defined on all pairs of clusters to measure similarity or dissimilarity.
t denotes the current level of the hierarchy.
Algorithm
Initialization:
    Choose $\mathcal{R}_0 = \{C_i = \{x_i\}, i = 1, \ldots, N\}$
    t = 0
Repeat:
    t = t + 1
    Find the pair of clusters (Ci, Cj) in $\mathcal{R}_{t-1}$ such that
        g (Ci, Cj) is the maximum of a similarity function
        (or the minimum of a dissimilarity function) over all pairs
    Define Cq = Ci ∪ Cj and $\mathcal{R}_t = (\mathcal{R}_{t-1} - \{C_i, C_j\}) \cup \{C_q\}$
Until all vectors lie in a single cluster
A direct Python transcription follows.
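As a sketch, the scheme above transcribes almost line by line into Python; here g is instantiated as the minimum pairwise Euclidean distance (a single-link choice, one assumption the pseudocode leaves open). This is the naive O(N³) version.

```python
# Naive agglomerative clustering following the scheme above; g is taken to
# be the minimum pairwise Euclidean distance (a single-link choice).
import numpy as np

def agglomerative(X):
    """Return the merge history as (level t, Ci, Cj, Cq = Ci U Cj)."""
    clusters = [frozenset([i]) for i in range(len(X))]  # R_0: one point each
    history, t = [], 0
    while len(clusters) > 1:                 # until a single cluster remains
        t += 1
        best = None
        for a in range(len(clusters)):       # scan all pairs of clusters
            for b in range(a + 1, len(clusters)):
                g = min(np.linalg.norm(X[i] - X[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or g < best[0]:
                    best = (g, a, b)
        _, a, b = best
        Cq = clusters[a] | clusters[b]       # merge the closest pair
        history.append((t, clusters[a], clusters[b], Cq))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(Cq)
    return history

X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
for step in agglomerative(X):
    print(step)
```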
Enforcing Nesting
Note the following
“We can say that if two vectors come together into a single cluster at level t of the hierarchy, they will remain in the same cluster for all subsequent clusterings.”
Thus
$\mathcal{R}_0 \sqsubset \mathcal{R}_1 \sqsubset \mathcal{R}_2 \sqsubset \cdots \sqsubset \mathcal{R}_{N-1}$ (3)
Hurrah!!!
Enforcing the nesting property!!!
Problems with Agglomerative Algorithms
First - Related to the Nesting Property
There is no way to recover from a “poor” clustering that may have occurred at an earlier level of the hierarchy.
Second
At each level t, there are N − t clusters.
Thus, to go from level t to level t + 1, the number of cluster pairs compared is
$\binom{N-t}{2} = \frac{(N-t)(N-t-1)}{2}$ (4)
The total number of pairs compared is
$\sum_{t=0}^{N-1} \binom{N-t}{2}$ (5)
Thus
We have that
$\sum_{t=0}^{N-1} \binom{N-t}{2} = \sum_{k=1}^{N} \binom{k}{2} = \frac{(N-1)N(N+1)}{6}$ (6)
Thus
The complexity of this scheme is $O(N^3)$.
However
You still depend on the nature of g.
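A quick numeric sanity check of identity (6), sketched for a few values of N:

```python
# Sanity-check identity (6): the sum of C(N-t, 2) over t equals (N-1)N(N+1)/6.
from math import comb

for N in (2, 5, 10, 100):
    lhs = sum(comb(N - t, 2) for t in range(N))
    rhs = (N - 1) * N * (N + 1) // 6
    assert lhs == rhs, (N, lhs, rhs)
print("identity (6) holds for the tested values of N")
```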
Two Categories of Agglomerative Algorithms
There are two
1 Matrix theory based.
2 Graph theory based.
Matrix Theory Based
As the name says, they are based on the N × N dissimilarity matrix P0 = P(X).
At each merging step, the matrix is reduced by one level ⇒ Pt becomes an (N − t) × (N − t) matrix.
Matrix Based Algorithm
Matrix Updating Algorithmic Scheme (MUAS)
Initialization:
    Choose $\mathcal{R}_0 = \{C_i = \{x_i\}, i = 1, \ldots, N\}$
    P0 = P(X)
    t = 0
Repeat:
    t = t + 1
    Find the pair of clusters (Cr, Cs) in $\mathcal{R}_{t-1}$ such that
        $d(C_r, C_s) = \min_{i,j:\, i \neq j} d(C_i, C_j)$
    Define Cq = Cr ∪ Cs and $\mathcal{R}_t = (\mathcal{R}_{t-1} - \{C_r, C_s\}) \cup \{C_q\}$
    Create Pt by the strategy below
Until all vectors lie in a single cluster
Matrix Based Algorithm
Strategy
1 Delete the two rows and columns that correspond to the merged clusters.
2 Add a new row and a new column that contain the distances between the newly formed cluster and the old (unaffected at this level) clusters.
Distance Used in These Schemes
It has been pointed out that there is one general distance (the Lance–Williams recurrence) behind these algorithms
$d(C_q, C_s) = a_i d(C_i, C_s) + a_j d(C_j, C_s) + b\, d(C_i, C_j) + c\, |d(C_i, C_s) - d(C_j, C_s)|$
where different values of $a_i$, $a_j$, b, and c correspond to different choices of the dissimilarity measure.
Using this distance, it is possible to generate several algorithms
1 The single link algorithm.
2 The complete link algorithm.
3 The weighted pair group method average.
4 The unweighted pair group method centroid.
5 Etc...
A small sketch of the update appears below.
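As a sketch of how the recurrence is used, the function below updates the distance from a merged cluster to a third one; the coefficient table lists the single-link and complete-link choices (standard values from the literature).

```python
# Lance-Williams style update for the distance from Cq = Ci U Cj to Cs.
def lw_update(d_is, d_js, d_ij, ai, aj, b, c):
    return ai * d_is + aj * d_js + b * d_ij + c * abs(d_is - d_js)

COEFFS = {
    "single":   (0.5, 0.5, 0.0, -0.5),  # reduces to min(d_is, d_js)
    "complete": (0.5, 0.5, 0.0,  0.5),  # reduces to max(d_is, d_js)
}

# Example: Ci and Cj merge; their distances to Cs were 2.0 and 5.0.
print(lw_update(2.0, 5.0, 1.0, *COEFFS["single"]))    # 2.0
print(lw_update(2.0, 5.0, 1.0, *COEFFS["complete"]))  # 5.0
```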
For example
The single link algorithm
This is obtained if we set $a_i = 1/2$, $a_j = 1/2$, $b = 0$, $c = -1/2$.
Thus, we have
$d(C_q, C_s) = \min\{d(C_i, C_s), d(C_j, C_s)\}$ (7)
Please look at the example in the Dropbox
It is an interesting example.
Agglomerative Algorithms Based on Graph Theory
Consider the following
1 Each node in the graph G corresponds to a vector.
2 Clusters are formed by connecting nodes.
3 A certain property, h (k), needs to be respected.
Common Properties: Node Connectivity
The node connectivity of a connected subgraph is the largest integer k such that all pairs of nodes are joined by at least k paths having no nodes in common.
Agglomerative Algorithms Based on Graph Theory
Common Properties: Edge Connectivity
The edge connectivity of a connected subgraph is the largest integer k
such that all pairs of nodes are joined by at least k paths having no edges
in common.
Common Properties: Node Degree
The degree of a connected subgraph is the largest integer k such that
each node has at least k incident edges.
Basically, We use the Same Scheme, But...
The function
$g_{h(k)}(C_r, C_s) = \min_{x \in C_r,\, y \in C_s} \{d(x, y) \mid \text{Property}\}$ (8)
Property
The subgraph of G defined by $C_r \cup C_s$ is connected and either
1 it has the property h(k), or
2 it is complete.
Examples
Examples
1 Single Link Algorithm.
2 Complete Link Algorithm.
There is another style of clustering
Clustering algorithms based on the Minimum Spanning Tree; a sketch follows.
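A sketch of the MST-based style, assuming SciPy: build the minimum spanning tree of the complete distance graph, delete the k − 1 heaviest tree edges, and read the clusters off as connected components.

```python
# MST-based clustering sketch: cut the k-1 heaviest edges of the MST.
import numpy as np
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def mst_clusters(X, k):
    D = squareform(pdist(X))                  # dense pairwise distance matrix
    mst = minimum_spanning_tree(D).toarray()  # tree with N - 1 weighted edges
    if k > 1:
        for w in np.sort(mst[mst > 0])[-(k - 1):]:
            mst[mst == w] = 0                 # delete a heaviest edge
    _, labels = connected_components(mst, directed=False)
    return labels

X = np.array([[0, 0], [0, 1], [5, 5], [5, 6], [10, 0]])
print(mst_clusters(X, 3))  # three components: {0,1}, {2,3}, {4}
```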
Divisive Algorithms
Reverse Strategy
Start with a single cluster and split it iteratively.
Generalized Divisive Scheme
Algorithm (PROBLEM: what is wrong!!!)
Initialization:
    Choose $\mathcal{R}_0 = \{X\}$
    P0 = P(X)
    t = 0
Repeat:
    t = t + 1
    For i = 1 to t:
        Given the cluster $C_{t-1,i}$, generate all possible pairs of subclusters
    Next i
    Find the pair $(C^1_{t-1,j}, C^2_{t-1,j})$ that maximizes g
    Create $\mathcal{R}_t = (\mathcal{R}_{t-1} - \{C_{t-1,j}\}) \cup \{C^1_{t-1,j}, C^2_{t-1,j}\}$
Until all vectors lie in a single cluster
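The scheme above is exponential, since it enumerates every 2-partition of every cluster. As a contrast, here is a sketch of a common practical divisive variant, bisecting 2-means (not the exhaustive scheme above, and assuming scikit-learn is available): repeatedly split the largest cluster in two.

```python
# Bisecting divisive clustering sketch: split the largest cluster with
# 2-means until the desired number of clusters is reached.
import numpy as np
from sklearn.cluster import KMeans

def bisecting_divisive(X, n_clusters):
    clusters = [np.arange(len(X))]        # start with one cluster holding all
    while len(clusters) < n_clusters:
        clusters.sort(key=len)
        idx = clusters.pop()              # take the largest cluster
        labels = KMeans(n_clusters=2, n_init=10).fit_predict(X[idx])
        clusters += [idx[labels == 0], idx[labels == 1]]
    return clusters

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(6, 1, (20, 2))])
print([len(c) for c in bisecting_divisive(X, 2)])
```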
Algorithms for Large Data Sets
There are several
1 The CURE Algorithm
2 The ROCK Algorithm
3 The Chameleon Algorithm
4 The BIRCH Algorithm
Clustering Using REpresentatives (CURE)
Basic Idea
Each cluster Ci has a set of representatives $R_{C_i} = \{x^{(i)}_1, x^{(i)}_2, \ldots, x^{(i)}_K\}$ with K > 1.
What is happening
By using multiple representatives for each cluster, the CURE algorithm tries to “capture” the shape of each one.
However
In order to avoid taking into account irregularities (for example, outliers) in the border of the cluster:
The initially chosen representatives are “pushed” toward the mean of the cluster.
Therefore
This action is known
As “shrinking”, in the sense that the volume of space “defined” by the representatives is shrunk toward the mean of the cluster.
Shrinking Process
Given a cluster C
Select the point x ∈ C with the maximum distance from the mean of C and set $R_C = \{x\}$ (the set of representatives).
Then
1 For i = 2 to min{K, nC}:
2 Determine y ∈ C − RC that lies farthest from the points in RC.
3 RC = RC ∪ {y}
Shrinking Process
Do the Shrinking
Shrink the points x ∈ RC toward the mean mC of C by a factor α.
Actually
$x \leftarrow (1 - \alpha)x + \alpha m_C, \quad \forall x \in R_C$ (9)
Both steps are sketched in code below.
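Both steps translate into a short sketch (the function and variable names are my own; K and α are the CURE parameters): farthest-point selection seeded at the point farthest from the mean, followed by the shrink of equation (9).

```python
# Representative selection plus shrinking for one cluster C (rows = points).
import numpy as np

def representatives(C, K, alpha):
    m = C.mean(axis=0)
    # Seed with the point farthest from the mean, then add farthest points.
    R = [C[np.argmax(np.linalg.norm(C - m, axis=1))]]
    while len(R) < min(K, len(C)):
        d = np.min([np.linalg.norm(C - r, axis=1) for r in R], axis=0)
        R.append(C[np.argmax(d)])        # point farthest from current reps
    # Shrink toward the mean: x <- (1 - alpha) x + alpha m  (equation (9)).
    return np.array([(1 - alpha) * r + alpha * m for r in R])

C = np.random.default_rng(1).normal(size=(30, 2))
print(representatives(C, K=4, alpha=0.3))
```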
Resulting set RC
Thus
The resulting set RC is the set of representatives of C.
Thus, the distance between two clusters is defined as
$d(C_i, C_j) = \min_{x \in R_{C_i},\, y \in R_{C_j}} d(x, y)$ (10)
Clustering Using REpresentatives (CURE)
Basic Algorithm
Input: a set of points X = {x1, x2, ..., xN}
Output: C clusters
1 For every cluster Ci = {xi}, store Ci.mC = xi and Ci.RC = {xi}.
2 Ci.closest stores the cluster closest to Ci.
3 All the input points are inserted into a k-d tree T.
4 Insert each cluster into the heap Q (clusters are arranged in increasing order of the distance between Ci and Ci.closest).
5 While size(Q) > C:
6 Remove the top element Ci of Q and merge it with Cj = Ci.closest.
7 Then compute the new representative points for the merged cluster Ck = Ci ∪ Cj.
8 Also remove Ci and Cj from T and Q.
9 Also, for all the clusters Ch ∈ Q, update Ch.closest and relocate Ch.
10 Insert Ck into Q.
A simplified sketch of this loop follows.
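Here is a simplified sketch of the merge loop, reusing the `representatives` helper from the previous snippet; for brevity it searches for the closest pair by brute force instead of maintaining the k-d tree and heap that give the real algorithm its efficiency.

```python
# Simplified CURE merge loop (brute-force closest pair; no k-d tree/heap).
import numpy as np

def cure(X, n_clusters, K=4, alpha=0.3):
    clusters = [np.array([i]) for i in range(len(X))]
    reps = [X[c] for c in clusters]       # initially each point is its own rep
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):    # closest pair under distance (10)
            for b in range(a + 1, len(clusters)):
                d = min(np.linalg.norm(p - q)
                        for p in reps[a] for q in reps[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = np.concatenate([clusters[a], clusters[b]])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        reps = [r for k, r in enumerate(reps) if k not in (a, b)]
        clusters.append(merged)
        reps.append(representatives(X[merged], K, alpha))
    return clusters
```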
Complexity of CURE
Too Prohibitive
$O(N^2 \log_2 N)$ (11)
Possible Solution
CURE does the following
The technique adopted by the CURE algorithm, in order to reduce the computational complexity, is random sampling.
Actually
That is, a sample set X′ is created from X by choosing randomly N′ out of the N points of X.
However, one has to ensure that the probability of missing a cluster of X due to this sampling is low.
This can be guaranteed if the number of sampled points N′ is sufficiently large.
Then
Having estimated N′
CURE forms a number of p = N/N′ sample data sets by successive random samples.
In other words
X is partitioned randomly into p subsets.
For this, a parameter q is selected
Then, the points in each partition are clustered until N′/q clusters are formed, or until the distance between the closest pair of clusters to be merged in the next iteration step exceeds a user-defined threshold.
Once this has been finished
A second clustering pass is done
On the (at most) $p \cdot N'/q = N/q$ clusters obtained from all the subsets.
The Goal
To apply the merging procedure described previously to the (at most) N/q clusters, so that we end up with the required final number, m, of clusters.
Finally
Each point x in the data set X that is not used as a representative in any one of the m clusters is assigned to one of them according to the following strategy.
Finally
First
A random sample of representative points from each of the m clusters is chosen.
Then
Based on these representatives, the point x is assigned to the cluster that contains the representative closest to it.
Experiments reported by Guha et al. show that CURE
Is sensitive to parameter selection:
Specifically, K must be large enough to capture the geometry of each cluster.
In addition, N′ must be higher than a certain percentage of N (≈ 2.5%).
Not only that
The value of α also affects CURE
For small values, CURE looks similar to an MST-based clustering.
For large values, CURE resembles an algorithm with a single representative per cluster.
Worst Case Complexity
$O(N'^2 \log_2 N')$ (12)
 
Clustering in Machine Learning.pdf
Clustering in Machine Learning.pdfClustering in Machine Learning.pdf
Clustering in Machine Learning.pdfSudhanshiBakre1
 
Preparation Data Structures 04 array linear_list
Preparation Data Structures 04 array linear_listPreparation Data Structures 04 array linear_list
Preparation Data Structures 04 array linear_listAndres Mendez-Vazquez
 
iiit delhi unsupervised pdf.pdf
iiit delhi unsupervised pdf.pdfiiit delhi unsupervised pdf.pdf
iiit delhi unsupervised pdf.pdfVIKASGUPTA127897
 
Clustering &amp; classification
Clustering &amp; classificationClustering &amp; classification
Clustering &amp; classificationJamshed Khan
 
Artificial Neural Network.pptx
Artificial Neural Network.pptxArtificial Neural Network.pptx
Artificial Neural Network.pptxshashankbhadouria4
 
Clustering and Classification Algorithms Ankita Dubey
Clustering and Classification Algorithms Ankita DubeyClustering and Classification Algorithms Ankita Dubey
Clustering and Classification Algorithms Ankita DubeyAnkita Dubey
 

Similar to 28 Machine Learning Unsupervised Hierarchical Clustering (20)

cluster analysis
cluster analysiscluster analysis
cluster analysis
 
Poggi analytics - clustering - 1
Poggi   analytics - clustering - 1Poggi   analytics - clustering - 1
Poggi analytics - clustering - 1
 
A Tree Kernel based approach for clone detection
A Tree Kernel based approach for clone detectionA Tree Kernel based approach for clone detection
A Tree Kernel based approach for clone detection
 
A tree kernel based approach for clone detection
A tree kernel based approach for clone detectionA tree kernel based approach for clone detection
A tree kernel based approach for clone detection
 
Legal Analytics Course - Class 9 - Clustering Algorithms (K-Means & Hierarch...
Legal Analytics Course - Class 9 -  Clustering Algorithms (K-Means & Hierarch...Legal Analytics Course - Class 9 -  Clustering Algorithms (K-Means & Hierarch...
Legal Analytics Course - Class 9 - Clustering Algorithms (K-Means & Hierarch...
 
Data Science - Part VII - Cluster Analysis
Data Science - Part VII -  Cluster AnalysisData Science - Part VII -  Cluster Analysis
Data Science - Part VII - Cluster Analysis
 
Joint unsupervised learning of deep representations and image clusters
Joint unsupervised learning of deep representations and image clustersJoint unsupervised learning of deep representations and image clusters
Joint unsupervised learning of deep representations and image clusters
 
Clustering in Machine Learning.pdf
Clustering in Machine Learning.pdfClustering in Machine Learning.pdf
Clustering in Machine Learning.pdf
 
Neural nw k means
Neural nw k meansNeural nw k means
Neural nw k means
 
Clustering.pptx
Clustering.pptxClustering.pptx
Clustering.pptx
 
Preparation Data Structures 04 array linear_list
Preparation Data Structures 04 array linear_listPreparation Data Structures 04 array linear_list
Preparation Data Structures 04 array linear_list
 
K mean-clustering
K mean-clusteringK mean-clustering
K mean-clustering
 
iiit delhi unsupervised pdf.pdf
iiit delhi unsupervised pdf.pdfiiit delhi unsupervised pdf.pdf
iiit delhi unsupervised pdf.pdf
 
Hierachical clustering
Hierachical clusteringHierachical clustering
Hierachical clustering
 
Clustering &amp; classification
Clustering &amp; classificationClustering &amp; classification
Clustering &amp; classification
 
Artificial Neural Network.pptx
Artificial Neural Network.pptxArtificial Neural Network.pptx
Artificial Neural Network.pptx
 
Machine Learning - Clustering
Machine Learning - ClusteringMachine Learning - Clustering
Machine Learning - Clustering
 
Clustering and Classification Algorithms Ankita Dubey
Clustering and Classification Algorithms Ankita DubeyClustering and Classification Algorithms Ankita Dubey
Clustering and Classification Algorithms Ankita Dubey
 
Clustering.pdf
Clustering.pdfClustering.pdf
Clustering.pdf
 
Introduction (1).pdf
Introduction (1).pdfIntroduction (1).pdf
Introduction (1).pdf
 

More from Andres Mendez-Vazquez

01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectors01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectorsAndres Mendez-Vazquez
 
01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issues01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issuesAndres Mendez-Vazquez
 
05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiation05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiationAndres Mendez-Vazquez
 
01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep Learning01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep LearningAndres Mendez-Vazquez
 
25 introduction reinforcement_learning
25 introduction reinforcement_learning25 introduction reinforcement_learning
25 introduction reinforcement_learningAndres Mendez-Vazquez
 
Neural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning SyllabusNeural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning SyllabusAndres Mendez-Vazquez
 
Introduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabusIntroduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabusAndres Mendez-Vazquez
 
Ideas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data SciencesIdeas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data SciencesAndres Mendez-Vazquez
 
20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variations20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variationsAndres Mendez-Vazquez
 
Introduction Mathematics Intelligent Systems Syllabus
Introduction Mathematics Intelligent Systems SyllabusIntroduction Mathematics Intelligent Systems Syllabus
Introduction Mathematics Intelligent Systems SyllabusAndres Mendez-Vazquez
 

More from Andres Mendez-Vazquez (20)

2.03 bayesian estimation
2.03 bayesian estimation2.03 bayesian estimation
2.03 bayesian estimation
 
05 linear transformations
05 linear transformations05 linear transformations
05 linear transformations
 
01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectors01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectors
 
01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issues01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issues
 
01.02 linear equations
01.02 linear equations01.02 linear equations
01.02 linear equations
 
01.01 vector spaces
01.01 vector spaces01.01 vector spaces
01.01 vector spaces
 
06 recurrent neural_networks
06 recurrent neural_networks06 recurrent neural_networks
06 recurrent neural_networks
 
05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiation05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiation
 
Zetta global
Zetta globalZetta global
Zetta global
 
01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep Learning01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep Learning
 
25 introduction reinforcement_learning
25 introduction reinforcement_learning25 introduction reinforcement_learning
25 introduction reinforcement_learning
 
Neural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning SyllabusNeural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning Syllabus
 
Introduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabusIntroduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabus
 
Ideas 09 22_2018
Ideas 09 22_2018Ideas 09 22_2018
Ideas 09 22_2018
 
Ideas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data SciencesIdeas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data Sciences
 
Analysis of Algorithms Syllabus
Analysis of Algorithms  SyllabusAnalysis of Algorithms  Syllabus
Analysis of Algorithms Syllabus
 
20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variations20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variations
 
17 vapnik chervonenkis dimension
17 vapnik chervonenkis dimension17 vapnik chervonenkis dimension
17 vapnik chervonenkis dimension
 
A basic introduction to learning
A basic introduction to learningA basic introduction to learning
A basic introduction to learning
 
Introduction Mathematics Intelligent Systems Syllabus
Introduction Mathematics Intelligent Systems SyllabusIntroduction Mathematics Intelligent Systems Syllabus
Introduction Mathematics Intelligent Systems Syllabus
 

Recently uploaded

Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdf
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdfPaper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdf
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdfNainaShrivastava14
 
List of Accredited Concrete Batching Plant.pdf
List of Accredited Concrete Batching Plant.pdfList of Accredited Concrete Batching Plant.pdf
List of Accredited Concrete Batching Plant.pdfisabel213075
 
Artificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewArtificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewsandhya757531
 
Prach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism CommunityPrach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism Communityprachaibot
 
Levelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument methodLevelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument methodManicka Mamallan Andavar
 
TEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACHTEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACHSneha Padhiar
 
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmComputer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmDeepika Walanjkar
 
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSHigh Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSsandhya757531
 
Immutable Image-Based Operating Systems - EW2024.pdf
Immutable Image-Based Operating Systems - EW2024.pdfImmutable Image-Based Operating Systems - EW2024.pdf
Immutable Image-Based Operating Systems - EW2024.pdfDrew Moseley
 
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Sumanth A
 
Ch10-Global Supply Chain - Cadena de Suministro.pdf
Ch10-Global Supply Chain - Cadena de Suministro.pdfCh10-Global Supply Chain - Cadena de Suministro.pdf
Ch10-Global Supply Chain - Cadena de Suministro.pdfChristianCDAM
 
multiple access in wireless communication
multiple access in wireless communicationmultiple access in wireless communication
multiple access in wireless communicationpanditadesh123
 
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfComprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfalene1
 
Robotics Group 10 (Control Schemes) cse.pdf
Robotics Group 10  (Control Schemes) cse.pdfRobotics Group 10  (Control Schemes) cse.pdf
Robotics Group 10 (Control Schemes) cse.pdfsahilsajad201
 
System Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingSystem Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingBootNeck1
 
Cost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionCost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionSneha Padhiar
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdfAkritiPradhan2
 
Novel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsNovel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsResearcher Researcher
 
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTESCME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTESkarthi keyan
 
Energy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxEnergy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxsiddharthjain2303
 

Recently uploaded (20)

Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdf
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdfPaper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdf
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdf
 
List of Accredited Concrete Batching Plant.pdf
List of Accredited Concrete Batching Plant.pdfList of Accredited Concrete Batching Plant.pdf
List of Accredited Concrete Batching Plant.pdf
 
Artificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewArtificial Intelligence in Power System overview
Artificial Intelligence in Power System overview
 
Prach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism CommunityPrach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism Community
 
Levelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument methodLevelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument method
 
TEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACHTEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACH
 
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmComputer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
 
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSHigh Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
 
Immutable Image-Based Operating Systems - EW2024.pdf
Immutable Image-Based Operating Systems - EW2024.pdfImmutable Image-Based Operating Systems - EW2024.pdf
Immutable Image-Based Operating Systems - EW2024.pdf
 
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
 
Ch10-Global Supply Chain - Cadena de Suministro.pdf
Ch10-Global Supply Chain - Cadena de Suministro.pdfCh10-Global Supply Chain - Cadena de Suministro.pdf
Ch10-Global Supply Chain - Cadena de Suministro.pdf
 
multiple access in wireless communication
multiple access in wireless communicationmultiple access in wireless communication
multiple access in wireless communication
 
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfComprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
 
Robotics Group 10 (Control Schemes) cse.pdf
Robotics Group 10  (Control Schemes) cse.pdfRobotics Group 10  (Control Schemes) cse.pdf
Robotics Group 10 (Control Schemes) cse.pdf
 
System Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingSystem Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event Scheduling
 
Cost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionCost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based question
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
 
Novel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsNovel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending Actuators
 
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTESCME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
 
Energy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxEnergy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptx
 

28 Machine Learning Unsupervised Hierarchical Clustering

• 16. Therefore. Given the previous ideas, it is necessary to define the concept of nesting: after all, both agglomerative and divisive procedures build each clustering from the previous one.
• 18. Nested Clustering. Definition: a clustering $\mathfrak{R}_i$ containing $k$ clusters is said to be nested in the clustering $\mathfrak{R}_{i+1}$, which contains $r < k$ clusters, if (1) each cluster in $\mathfrak{R}_i$ is a subset of a cluster in $\mathfrak{R}_{i+1}$, and (2) at least one cluster in $\mathfrak{R}_i$ is a proper subset of a cluster in $\mathfrak{R}_{i+1}$. This is written as
$\mathfrak{R}_i \sqsubset \mathfrak{R}_{i+1}$ (1)
• 21. Example. We have the set $\{x_1, x_2, x_3, x_4, x_5\}$ with the following structures: $\mathfrak{R}_1 = \{\{x_1, x_3\}, \{x_4\}, \{x_2, x_5\}\}$ and $\mathfrak{R}_2 = \{\{x_1, x_3, x_4\}, \{x_2, x_5\}\}$, so $\mathfrak{R}_1 \sqsubset \mathfrak{R}_2$. Again: hierarchical clustering produces a hierarchy of clusterings!!!
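To make the definition concrete, here is a minimal Python sketch of a nesting check; the function name is_nested and the variable names are illustrative, not from the slides.

# Clustering R1 is nested in R2 if every cluster of R1 is a subset of
# some cluster of R2, and at least one containment is proper.
def is_nested(r1, r2):
    """Return True if clustering r1 (list of sets) is nested in r2."""
    proper = False
    for c in r1:
        parent = next((p for p in r2 if c <= p), None)
        if parent is None:
            return False          # some cluster of r1 is split across r2
        if c < parent:
            proper = True         # at least one proper inclusion
    return proper

# The example from the slides:
r1 = [{"x1", "x3"}, {"x4"}, {"x2", "x5"}]
r2 = [{"x1", "x3", "x4"}, {"x2", "x5"}]
print(is_nested(r1, r2))  # True
print(is_nested(r2, r1))  # False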
• 26. Agglomerative Algorithms: Initial State. You have $N$ clusters, each containing one element of the data set $X$. At each step $t$, you have a clustering $\mathfrak{R}_t$ with $N - t$ clusters; from it, a new clustering $\mathfrak{R}_{t+1}$ is generated.
• 30. In that way, at each step each cluster of $\mathfrak{R}_t$ is a subset of a cluster in $\mathfrak{R}_{t+1}$, with at least one proper inclusion, or
$\mathfrak{R}_t \sqsubset \mathfrak{R}_{t+1}$ (2)
• 31. The Basic Algorithm for Agglomerative Clustering. We have a function $g(C_i, C_j)$ defined on all pairs of clusters to measure similarity or dissimilarity, and $t$ denotes the current level of the hierarchy. Algorithm:
    Initialization: choose $\mathfrak{R}_0 = \{C_i = \{x_i\},\ i = 1, \ldots, N\}$; set $t = 0$.
    Repeat: $t = t + 1$; find the pair of clusters $(C_i, C_j)$ in $\mathfrak{R}_{t-1}$ such that $g(C_i, C_j)$ is the maximum of a similarity function (or the minimum of a dissimilarity function) over all pairs; define $C_q = C_i \cup C_j$ and $\mathfrak{R}_t = (\mathfrak{R}_{t-1} - \{C_i, C_j\}) \cup \{C_q\}$.
    Until all vectors lie in a single cluster.
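The scheme above can be sketched directly; a minimal Python version, assuming a single-link g (the minimum point-to-point distance) and a user-supplied distance function. This illustrates the control flow only, not an efficient implementation.

def agglomerate(points, dist):
    # R_0: every point is its own cluster
    clusters = [[p] for p in points]
    history = [list(map(tuple, clusters))]
    while len(clusters) > 1:
        # find the pair (C_i, C_j) minimizing g = min pairwise distance
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                g = min(dist(x, y) for x in clusters[i] for y in clusters[j])
                if best is None or g < best[0]:
                    best = (g, i, j)
        _, i, j = best
        # C_q = C_i U C_j ; R_t = R_{t-1} - {C_i, C_j} U {C_q}
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        history.append(list(map(tuple, clusters)))
    return history

euclid = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
print(agglomerate([(0, 0), (0, 1), (5, 5)], euclid)[-2])  # the 2-cluster level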
• 34. Enforcing Nesting. Note the following: if two vectors come together into a single cluster at level $t$ of the hierarchy, they remain in the same cluster for all subsequent clusterings. Thus
$\mathfrak{R}_0 \sqsubset \mathfrak{R}_1 \sqsubset \mathfrak{R}_2 \sqsubset \cdots \sqsubset \mathfrak{R}_{N-1}$ (3)
Hurrah!!! The nesting property is enforced!!!
• 38. Problems with Agglomerative Algorithms. First (related to the nesting property): there is no way to recover from a "poor" clustering that may have occurred at an earlier level of the hierarchy. Second: at each level $t$ there are $N - t$ clusters, so at level $t + 1$ the number of pairs compared is
$\binom{N-t}{2} = \frac{(N-t)(N-t-1)}{2}$ (4)
Hence the total number of pairs compared over all levels is
$\sum_{t=0}^{N-1} \binom{N-t}{2}$ (5)
• 43. Thus, we have that
$\sum_{t=0}^{N-1} \binom{N-t}{2} = \sum_{k=1}^{N} \binom{k}{2} = \frac{(N-1)N(N+1)}{6}$ (6)
so the complexity of this scheme is $O(N^3)$. However, the exact behavior still depends on the nature of $g$.
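A quick numeric sanity check of the closed form in Eq. (6), using Python's math.comb:

from math import comb

N = 20
total = sum(comb(N - t, 2) for t in range(N))
print(total, (N - 1) * N * (N + 1) // 6)  # both print 1330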
• 47. Two Categories of Agglomerative Algorithms. There are two: (1) matrix theory based, and (2) based on graph theory concepts. Matrix theory based: as the name says, they operate on an $N \times N$ dissimilarity matrix $P_0 = P(X)$; at each merging the matrix is reduced by one, so $P_t$ becomes an $(N - t) \times (N - t)$ matrix.
• 51. Matrix Based Algorithm: Matrix Updating Algorithmic Scheme (MUAS).
    Initialization: choose $\mathfrak{R}_0 = \{C_i = \{x_i\},\ i = 1, \ldots, N\}$ and $P_0 = P(X)$; set $t = 0$.
    Repeat: $t = t + 1$; find the pair of clusters $(C_i, C_j)$ in $\mathfrak{R}_{t-1}$ such that $d(C_i, C_j) = \min_{r,s,\ r \neq s} d(C_r, C_s)$; define $C_q = C_i \cup C_j$ and $\mathfrak{R}_t = (\mathfrak{R}_{t-1} - \{C_i, C_j\}) \cup \{C_q\}$; create $P_t$ by the strategy below.
    Until all vectors lie in a single cluster.
• 52. Strategy: (1) delete the two rows and columns that correspond to the merged clusters; (2) add a new row and a new column containing the distances between the newly formed cluster and the old (unaffected at this level) clusters.
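The strategy translates directly into matrix operations; a NumPy sketch, using the single-link rule from the next slides as the example update (the function name merge_update is illustrative):

import numpy as np

def merge_update(P, i, j):
    """Merge clusters i and j in proximity matrix P (single-link update)."""
    new_row = np.minimum(P[i], P[j])   # d(C_q, C_s) = min(d(C_i,C_s), d(C_j,C_s))
    new_row = np.delete(new_row, [i, j])
    # delete the two old rows/columns, then append the new row/column
    P = np.delete(np.delete(P, [i, j], axis=0), [i, j], axis=1)
    P = np.vstack([P, new_row])
    P = np.column_stack([P, np.append(new_row, 0.0)])
    return P

P = np.array([[0., 1., 4.],
              [1., 0., 3.],
              [4., 3., 0.]])
print(merge_update(P, 0, 1))  # 2x2 matrix; d({0,1}, {2}) = 3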
• 54. Distance Used in These Schemes. It has been pointed out that all these algorithms share a single general distance update (the Lance-Williams recurrence):
$d(C_q, C_s) = a_i\, d(C_i, C_s) + a_j\, d(C_j, C_s) + b\, d(C_i, C_j) + c\, |d(C_i, C_s) - d(C_j, C_s)|$
where different values of $a_i$, $a_j$, $b$ and $c$ correspond to different choices of the dissimilarity measure. With this recurrence it is possible to generate several algorithms: (1) the single link algorithm, (2) the complete link algorithm, (3) the weighted pair group method average, (4) the unweighted pair group method centroid, (5) etc.
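The recurrence is easy to implement, and the named algorithms differ only in their coefficients; a small Python sketch (the coefficient values shown for single and complete link are the standard ones):

def lance_williams(d_is, d_js, d_ij, ai, aj, b, c):
    return ai * d_is + aj * d_js + b * d_ij + c * abs(d_is - d_js)

# Single link: ai = aj = 1/2, b = 0, c = -1/2  ->  min(d_is, d_js)
print(lance_williams(2.0, 5.0, 1.0, 0.5, 0.5, 0.0, -0.5))  # 2.0
# Complete link: ai = aj = 1/2, b = 0, c = +1/2 ->  max(d_is, d_js)
print(lance_williams(2.0, 5.0, 1.0, 0.5, 0.5, 0.0, 0.5))   # 5.0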
• 60. For example, the single link algorithm is obtained by setting $a_i = 1/2$, $a_j = 1/2$, $b = 0$, $c = -1/2$. Thus, we have
$d(C_q, C_s) = \min\{d(C_i, C_s), d(C_j, C_s)\}$ (7)
Please look at the example in the Dropbox; it is an interesting example.
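In practice one rarely hand-rolls this; assuming SciPy is available, the single-link hierarchy for a small data set can be obtained as follows (a usage sketch, not part of the slides):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([[0, 0], [0, 1], [5, 5], [5, 6]])
Z = linkage(X, method="single")                # the full merge hierarchy
print(fcluster(Z, t=2, criterion="maxclust"))  # e.g. [1 1 2 2]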
• 63. Agglomerative Algorithms Based on Graph Theory. Consider the following: (1) each node in the graph $G$ corresponds to a vector; (2) clusters are formed by connecting nodes; (3) a certain property, $h(k)$, needs to be respected. Common property, node connectivity: the node connectivity of a connected subgraph is the largest integer $k$ such that all pairs of nodes are joined by at least $k$ paths having no nodes in common.
• 67. Common property, edge connectivity: the edge connectivity of a connected subgraph is the largest integer $k$ such that all pairs of nodes are joined by at least $k$ paths having no edges in common. Common property, node degree: the degree of a connected subgraph is the largest integer $k$ such that each node has at least $k$ incident edges.
• 69. Basically, we use the same scheme, but with the function
$g_{h(k)}(C_r, C_s) = \min_{x \in C_r,\ y \in C_s} \{d(x, y) \mid \text{Property}\}$ (8)
Property: the subgraph of $G$ defined by $C_r \cup C_s$ (1) is connected and either (1.1) has the property $h(k)$ or (1.2) is complete.
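The properties above can be checked on a candidate subgraph $C_r \cup C_s$; a sketch using NetworkX, where the example graph and the threshold k are illustrative:

import networkx as nx

G = nx.Graph([(0, 1), (1, 2), (2, 0), (0, 3), (1, 3)])  # subgraph of C_r U C_s

k = 2
print(nx.is_connected(G))                  # connected?
print(nx.node_connectivity(G) >= k)        # node connectivity >= k
print(nx.edge_connectivity(G) >= k)        # edge connectivity >= k
print(min(d for _, d in G.degree()) >= k)  # node degree >= k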
• 74. Examples: (1) the single link algorithm, (2) the complete link algorithm. There is another style of clustering: clustering algorithms based on the minimum spanning tree.
• 78. Divisive Algorithms: the reverse strategy. Start with a single cluster and split it iteratively.
• 79. Generalized Divisive Scheme (the slide poses a problem: what is wrong with it?).
    Initialization: choose $\mathfrak{R}_0 = \{X\}$, $P_0 = P(X)$; set $t = 0$.
    Repeat: $t = t + 1$; for $i = 1$ to $t$: given the cluster $C_{t-1,i}$ of the current partition, generate all possible pairs of subclusters; next $i$. Find the pair $(C^1_{t-1,j}, C^2_{t-1,j})$ that maximizes $g$; create $\mathfrak{R}_t = (\mathfrak{R}_{t-1} - \{C_{t-1,j}\}) \cup \{C^1_{t-1,j}, C^2_{t-1,j}\}$.
    Until each vector lies in its own cluster. (The transcribed stopping condition, "until all vectors lie in a single cluster", is the agglomerative one; spotting that, and the cost of enumerating all two-way splits, is the point of the question.)
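One way to see the computational burden of this scheme: a single split requires examining every two-way partition of a cluster, which is exponential in its size. A brute-force Python sketch of one divisive step, maximizing a single-link g between the two parts (illustrative only; the function name best_split is not from the slides):

from itertools import combinations

def best_split(cluster, dist):
    items = list(cluster)
    best = None
    rest = items[1:]  # fixing items[0] in part A avoids mirror splits
    for r in range(len(items)):
        for combo in combinations(rest, r):
            a = {items[0], *combo}
            b = set(items) - a
            if not b:
                continue
            g = min(dist(x, y) for x in a for y in b)
            if best is None or g > best[0]:
                best = (g, a, b)
    return best

euclid = lambda p, q: sum((u - v) ** 2 for u, v in zip(p, q)) ** 0.5
print(best_split([(0, 0), (0, 1), (5, 5)], euclid))  # splits off (5, 5)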
• 81. Algorithms for Large Data Sets. There are several: (1) the CURE algorithm, (2) the ROCK algorithm, (3) the Chameleon algorithm, (4) the BIRCH algorithm.
• 85. Clustering Using REpresentatives (CURE). Basic idea: each cluster $C_i$ has a set of representatives $R_{C_i} = \{x^{(i)}_1, x^{(i)}_2, \ldots, x^{(i)}_K\}$ with $K > 1$. By using multiple representatives per cluster, the CURE algorithm tries to "capture" the shape of each one. However, to avoid taking into account irregularities (for example, outliers) at the border of the cluster, the initially chosen representatives are "pushed" toward the mean of the cluster.
• 89. Therefore, this action is known as "shrinking", in the sense that the volume of space "defined" by the representatives is shrunk toward the mean of the cluster.
• 90. Shrinking Process. Given a cluster $C$: select the point $x \in C$ with the maximum distance from the mean of $C$ and set $R_C = \{x\}$ (the set of representatives). Then: (1) for $i = 2$ to $\min\{K, n_C\}$: (2) determine $y \in C - R_C$ that lies farthest from the points in $R_C$; (3) $R_C = R_C \cup \{y\}$.
• 95. Shrinking Process: do the shrinking. Shrink the points $x \in R_C$ toward the mean $m_C$ of $C$ by a factor $\alpha$:
$\hat{x} = (1 - \alpha)\, x + \alpha\, m_C, \quad \forall x \in R_C$ (9)
• 97. Resulting set $R_C$. The resulting set $R_C$ is the set of representatives of $C$. The distance between two clusters is then defined as
$d(C_i, C_j) = \min_{x \in R_{C_i},\ y \in R_{C_j}} d(x, y)$ (10)
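The selection steps, the shrinking of Eq. (9), and the representative distance of Eq. (10) fit in a few lines; a NumPy sketch, where the function names and the defaults K = 4, alpha = 0.3 are illustrative choices, not values from the slides:

import numpy as np

def cure_representatives(C, K=4, alpha=0.3):
    C = np.asarray(C, dtype=float)
    m = C.mean(axis=0)                                    # cluster mean m_C
    # start with the point farthest from the mean
    reps = [C[np.argmax(np.linalg.norm(C - m, axis=1))]]
    while len(reps) < min(K, len(C)):
        # add the point farthest from the current representatives
        d = np.min([np.linalg.norm(C - r, axis=1) for r in reps], axis=0)
        reps.append(C[np.argmax(d)])
    # shrink: x_hat = (1 - alpha) x + alpha m_C
    return [(1 - alpha) * x + alpha * m for x in reps]

def cure_distance(Ri, Rj):
    # d(C_i, C_j) = min over the representatives of the two clusters
    return min(np.linalg.norm(x - y) for x in Ri for y in Rj)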
  • 99. Images/cinvestav- Clustering Using REpresentatives (CURE) Basic Algorithm Input : A set of points X = {x1, x2, ..., xN } Output : C clusters 1 For every cluster Ci = {xi} store Ci.mC = {xi} and Ci.RC = {xi} 2 Ci.closest stores the cluster closest to Ci. 3 All the input points are inserted into a K − d tree T. 4 Insert each cluster into the heap Q. (Clusters are arranged in increasing order of distances between Ci and Ci.closest). 5 While size(Q) > C 6 Remove the top element of Q, Ci and merge it with Cj == Ci.closest. 7 Then compute the new representative points for the merged cluster Ck = Ci ∪ Cj. 8 Also remove Ci and Cj from T and Q. 9 Also for all the clusters Ch ∈ Q, update Ch.closest and relocate Ch. 10 insert Ck into Q 40 / 46
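The sketch below mirrors the merge loop above in plain Python, under simplifying assumptions: the closest pair of clusters is found by a linear scan rather than the heap Q and the k-d tree T (so it runs in roughly O(N³) instead of the O(N² log N) of the real algorithm), and shrink_representatives and cluster_distance are the hypothetical helpers sketched earlier.

```python
import numpy as np

def cure(points, n_clusters, n_reps=10, alpha=0.5):
    """Simplified CURE merge loop: each point starts as its own
    cluster; the two closest clusters (by representative distance,
    Eq. 10) are merged until n_clusters remain."""
    clusters = [{'members': points[i:i + 1],
                 'reps': points[i:i + 1]} for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (the top of Q in the
        # full algorithm), here by exhaustive scan.
        _, a, b = min(
            ((cluster_distance(clusters[i]['reps'], clusters[j]['reps']), i, j)
             for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda t: t[0])
        # Merge C_i and C_j into C_k and recompute its representatives.
        merged = np.vstack([clusters[a]['members'], clusters[b]['members']])
        c_k = {'members': merged,
               'reps': shrink_representatives(merged, n_reps, alpha)}
        # Remove C_i and C_j, insert C_k (the Q/T bookkeeping of steps 8-10).
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(c_k)
    return clusters
```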
• 112. Complexity of CURE
Too prohibitive:

$O(N^2 \log_2 N) \qquad (11)$
• 113. Possible Solution
CURE does the following: to reduce the computational complexity, CURE adopts random sampling.
That is, a sample set X′ is created from X by randomly choosing N′ of the N points of X. However, one has to ensure that the probability of missing a cluster of X due to this sampling is low. This can be guaranteed if the number of sampled points N′ is sufficiently large.
• 116. Then
Having estimated N′, CURE forms a number p = N/N′ of sample data sets by successive random samples. In other words, X is partitioned randomly into p subsets.
Then a parameter q > 1 is selected, and the points in each partition are clustered until N′/q clusters are formed, or until the distance between the closest pair of clusters to be merged in the next iteration step exceeds a user-defined threshold.
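For concreteness, with hypothetical numbers: if N = 100,000 and the sample is 2.5% of the data, then N′ = 2,500 and p = N/N′ = 40 partitions; choosing q = 5, each partition is clustered down to N′/q = 500 clusters, so the second pass described next starts from at most pN′/q = N/q = 20,000 clusters.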
• 119. Once This Has Finished
A second clustering pass is done on the at most pN′/q = N/q clusters obtained from all the subsets.
The Goal: apply the merging procedure described previously to these (at most) N/q clusters, so that we end up with the required final number, m, of clusters.
Finally, each point x in the data set X that is not used as a representative in any of the m clusters is assigned to one of them according to the following strategy.
• 122. Finally
First, a random sample of representative points from each of the m clusters is chosen. Then, based on these representatives, the point x is assigned to the cluster that contains the representative closest to it.
Experiments reported by Guha et al. show that CURE is sensitive to parameter selection:
The number of representatives K must be large enough to capture the geometry of each cluster.
In addition, N′ must be higher than a certain percentage of N (≈ 2.5%).
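A minimal sketch of this assignment strategy, assuming each final cluster carries the reps array from the earlier sketches; for simplicity it scans all representatives of each cluster, whereas CURE itself draws only a random sample of them.

```python
import numpy as np

def assign_point(x, clusters):
    """Assign a non-representative point x to the cluster that
    contains the representative closest to it; returns the index
    of that cluster in the clusters list."""
    def nearest_rep_dist(c):
        return np.linalg.norm(c['reps'] - x, axis=1).min()
    return min(range(len(clusters)),
               key=lambda i: nearest_rep_dist(clusters[i]))
```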
• 127. Not Only That
The value of α also affects CURE:
For small values of α, CURE behaves similarly to MST-based clustering.
For large values of α, CURE resembles an algorithm with a single representative per cluster.
Worst-case complexity on the sample:

$O(N'^2 \log_2 N') \qquad (12)$