Machine Learning for Data Mining
Hierarchical Clustering
Andres Mendez-Vazquez
July 27, 2015
Outline

1 Hierarchical Clustering
    Definition
    Basic Ideas
2 Agglomerative Algorithms
    Introduction
    Problems with Agglomerative Algorithms
    Two Categories of Agglomerative Algorithms
    Matrix Based Algorithms
    Graph Based Algorithms
3 Divisive Algorithms
    Introduction
4 Algorithms for Large Data Sets
    Introduction
    Clustering Using REpresentatives (CURE)
Concepts

Hierarchical Clustering Algorithms
They are quite different from the previous clustering algorithms.

Actually
They produce a hierarchy of clusterings.
Dendrogram: Hierarchical Clustering

The clustering is obtained by cutting the dendrogram at a desired level: each connected component forms a cluster.
Example
A dendrogram (figure).
Basic Ideas

At each step t
A new clustering is obtained based on the clustering produced at the previous step t − 1.

Two Main Types
1 Agglomerative Algorithms.
    1 Start with each item being a single cluster.
    2 Eventually all items belong to the same cluster.
2 Divisive Algorithms.
    1 Start with all items belonging to the same cluster.
    2 Eventually each item forms a cluster on its own.
Therefore

Given the previous ideas
It is necessary to define the concept of nesting!!!
After all, both the divisive and the agglomerative procedures produce a sequence of clusterings.
Nested Clustering

Definition
1 A clustering $\mathcal{R}_i$ containing $k$ clusters is said to be nested in the clustering $\mathcal{R}_{i+1}$, which contains $r < k$ clusters, if each cluster in $\mathcal{R}_i$ is a subset of a set in $\mathcal{R}_{i+1}$.
2 At least one cluster of $\mathcal{R}_i$ is a proper subset of a set in $\mathcal{R}_{i+1}$. This is written as

$$\mathcal{R}_i \sqsubset \mathcal{R}_{i+1} \quad (1)$$
Example

We have
The following set $\{x_1, x_2, x_3, x_4, x_5\}$.

With the following structures
$\mathcal{R}_1 = \{\{x_1, x_3\}, \{x_4\}, \{x_2, x_5\}\}$
$\mathcal{R}_2 = \{\{x_1, x_3, x_4\}, \{x_2, x_5\}\}$
so that $\mathcal{R}_1 \sqsubset \mathcal{R}_2$.

Again
Hierarchical Clustering produces a hierarchy of clusterings!!!
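Since the definition is purely set-theoretic, it can be checked mechanically. The following minimal Python sketch (the helper is_nested is our own, not from any library) tests the example above:

def is_nested(R1, R2):
    # R1, R2: clusterings given as lists of sets.
    # Premise: R2 must contain fewer clusters than R1 (r < k).
    # Condition 1: every cluster of R1 is a subset of some set in R2.
    # Condition 2: at least one cluster of R1 is a *proper* subset.
    subset = all(any(c1 <= c2 for c2 in R2) for c1 in R1)
    proper = any(any(c1 < c2 for c2 in R2) for c1 in R1)
    return len(R2) < len(R1) and subset and proper

R1 = [{"x1", "x3"}, {"x4"}, {"x2", "x5"}]
R2 = [{"x1", "x3", "x4"}, {"x2", "x5"}]
print(is_nested(R1, R2))  # True:  R1 is nested in R2
print(is_nested(R2, R1))  # False: the relation is not symmetric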
Agglomerative Algorithms

Initial State
You have N clusters, each containing one element of the data X.
At each step i, you have a clustering $\mathcal{R}_i$ with N − i clusters.
Then, a new clustering structure $\mathcal{R}_{i+1}$ is generated.
In that way...

We have
At each step, each cluster of $\mathcal{R}_i$ is a proper subset of a cluster in $\mathcal{R}_{i+1}$, or

$$\mathcal{R}_i \sqsubset \mathcal{R}_{i+1} \quad (2)$$
The Basic Algorithm for Agglomerative Clustering

For this
We have a function g(C_i, C_j) defined for all pairs of clusters to measure similarity or dissimilarity.
t denotes the current level of the hierarchy.

Algorithm

Initialization:
    Choose R_0 = {C_i = {x_i}, i = 1, ..., N}
    t = 0
Repeat:
    t = t + 1
    Find the pair of clusters (C_i, C_j) in R_(t-1) such that
        g(C_i, C_j) is the maximum (similarity) or minimum (dissimilarity)
        over all pairs of clusters
    Define C_q = C_i ∪ C_j and R_t = (R_(t-1) − {C_i, C_j}) ∪ {C_q}
Until all vectors lie in a single cluster
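As a concrete illustration, here is a minimal Python sketch of this scheme, assuming g is a dissimilarity (so we merge the pair with the minimum value; for a similarity one would take the maximum). It returns the whole sequence of clusterings R_0, ..., R_(N-1):

from itertools import combinations

def agglomerative(points, g):
    # R_0: every point is its own cluster (clusters = tuples of indices).
    clustering = [(i,) for i in range(len(points))]
    history = [list(clustering)]
    while len(clustering) > 1:
        # Find the pair (Ci, Cj) in R_(t-1) minimizing the dissimilarity g.
        ci, cj = min(combinations(clustering, 2),
                     key=lambda pair: g(pair[0], pair[1], points))
        # Define Cq = Ci ∪ Cj and form R_t.
        cq = tuple(sorted(ci + cj))
        clustering = [c for c in clustering if c not in (ci, cj)] + [cq]
        history.append(list(clustering))
    return history

# Example g: single-link dissimilarity between clusters of 1-D points.
def single_link(ci, cj, pts):
    return min(abs(pts[a] - pts[b]) for a in ci for b in cj)

for level in agglomerative([0.0, 0.4, 3.0, 3.2, 9.0], single_link):
    print(level)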
Enforcing Nesting

Note the following
“We can say that if two vectors come together into a single cluster at level t of the hierarchy, they will remain in the same cluster for all subsequent clusterings.”

Thus

$$\mathcal{R}_0 \sqsubset \mathcal{R}_1 \sqsubset \mathcal{R}_2 \sqsubset \cdots \sqsubset \mathcal{R}_{N-1} \quad (3)$$

Hurrah!!!
We are enforcing the nesting property!!!
Problems with Agglomerative Algorithms

First - Related to the Nesting Property
There is no way to recover from a “poor” clustering that may have occurred at an earlier level of the hierarchy.

Second
At each level t, there are N − t clusters. Thus, at level t + 1, the total number of pairs compared is

$$\binom{N-t}{2} = \frac{(N-t)(N-t-1)}{2} \quad (4)$$

So the total number of pairs compared over the whole hierarchy is

$$\sum_{t=0}^{N-1} \binom{N-t}{2} \quad (5)$$
Thus

We have that

$$\sum_{t=0}^{N-1} \binom{N-t}{2} = \sum_{k=1}^{N} \binom{k}{2} = \frac{(N-1)N(N+1)}{6} \quad (6)$$

Thus
The complexity of this scheme is $O(N^3)$.

However
You still depend on the nature of g.
Two Categories of Agglomerative Algorithms

There are two
1 Matrix Theory Based.
2 Graph Theory Based.

Matrix Theory Based
As the name says, they are based on the N × N dissimilarity matrix P_0 = P(X).
At each merging step the matrix is reduced by one level ⇒ P_t becomes an (N − t) × (N − t) matrix.
Matrix Based Algorithm

Matrix Updating Algorithmic Scheme (MUAS)

Initialization:
    Choose R_0 = {C_i = {x_i}, i = 1, ..., N}
    P_0 = P(X)
    t = 0
Repeat:
    t = t + 1
    Find the pair of clusters (C_i, C_j) in R_(t-1) such that
        d(C_i, C_j) = min over all pairs (C_r, C_s), r ≠ s, of d(C_r, C_s)
    Define C_q = C_i ∪ C_j and R_t = (R_(t-1) − {C_i, C_j}) ∪ {C_q}
    Create P_t by the strategy below
Until all vectors lie in a single cluster
Matrix Based Algorithm

Strategy
1 Delete the two rows and columns that correspond to the merged clusters.
2 Add a new row and a new column that contain the distances between the newly formed cluster and the old (unaffected at this level) clusters.
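The row-and-column surgery is mechanical. A small numpy sketch of the strategy (update_matrix is our own helper; the vector new_dists must come from whatever distance-update rule the scheme uses, e.g. the general formula on the next slide):

import numpy as np

def update_matrix(P, i, j, new_dists):
    # 1. Delete the rows and columns of the two merged clusters i and j.
    keep = [k for k in range(P.shape[0]) if k not in (i, j)]
    Q = P[np.ix_(keep, keep)]
    # 2. Append a row and a column with the distances between the
    #    newly formed cluster and the old, unaffected clusters.
    Q = np.pad(Q, ((0, 1), (0, 1)))
    Q[-1, :-1] = new_dists
    Q[:-1, -1] = new_dists
    return Q

P = np.array([[0., 1., 4.], [1., 0., 2.], [4., 2., 0.]])
# Merge clusters 0 and 1; single-link distance to cluster 2 is min(4, 2) = 2.
print(update_matrix(P, 0, 1, np.array([2.0])))  # 3x3 -> 2x2 matrix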
Distance Used in These Schemes

It has been pointed out that a single general distance update covers all of these algorithms

$$d(C_q, C_s) = a_i d(C_i, C_s) + a_j d(C_j, C_s) + b\, d(C_i, C_j) + c\, |d(C_i, C_s) - d(C_j, C_s)|$$

where different values of $a_i$, $a_j$, $b$ and $c$ correspond to different choices of the dissimilarity measure.

Using this distance, it is possible to generate several algorithms
1 The single link algorithm.
2 The complete link algorithm.
3 The weighted pair group method average.
4 The unweighted pair group method centroid.
5 Etc...
For example

The single link algorithm
This is obtained if we set $a_i = 1/2$, $a_j = 1/2$, $b = 0$, $c = -1/2$.

Thus, we have

$$d(C_q, C_s) = \min\{d(C_i, C_s), d(C_j, C_s)\} \quad (7)$$

since $\frac{d_1 + d_2}{2} - \frac{|d_1 - d_2|}{2} = \min(d_1, d_2)$.

Please look at the example in the Dropbox
It is an interesting example.
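A short sketch makes the specialization concrete; the single-link coefficient values below are the ones quoted on the slide, and the complete-link choice c = +1/2, which yields the maximum instead, is included for contrast:

def updated_distance(d_is, d_js, d_ij, ai, aj, b, c):
    # General update: d(Cq, Cs) for the merged cluster Cq = Ci ∪ Cj.
    return ai * d_is + aj * d_js + b * d_ij + c * abs(d_is - d_js)

single_link   = dict(ai=0.5, aj=0.5, b=0.0, c=-0.5)
complete_link = dict(ai=0.5, aj=0.5, b=0.0, c=+0.5)

d_is, d_js, d_ij = 2.0, 5.0, 1.0
print(updated_distance(d_is, d_js, d_ij, **single_link))    # 2.0 = min
print(updated_distance(d_is, d_js, d_ij, **complete_link))  # 5.0 = max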
Agglomerative Algorithms Based on Graph Theory

Consider the following
1 Each node in the graph G corresponds to a vector.
2 Clusters are formed by connecting nodes.
3 A certain property, h(k), needs to be respected.

Common Properties: Node Connectivity
The node connectivity of a connected subgraph is the largest integer k such that all pairs of nodes are joined by at least k paths having no nodes in common.
Agglomerative Algorithms Based on Graph Theory

Common Properties: Edge Connectivity
The edge connectivity of a connected subgraph is the largest integer k such that all pairs of nodes are joined by at least k paths having no edges in common.

Common Properties: Node Degree
The degree of a connected subgraph is the largest integer k such that each node has at least k incident edges.
Basically, We Use the Same Scheme, But...

The function

$$g_{h(k)}(C_r, C_s) = \min_{x \in C_r,\, y \in C_s} \{d(x, y) \mid \text{Property}\} \quad (8)$$

Property
The subgraph of G defined by $C_r \cup C_s$
1 is connected, and either
    1 it has the property h(k), or
    2 it is complete.
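A sketch of the property test, assuming the networkx library is available and taking h(k) to be node connectivity (the first of the properties listed above); only the induced-subgraph check is shown, not the full merging loop:

import networkx as nx

def satisfies_property(G, Cr, Cs, k):
    # Induced subgraph of G on Cr ∪ Cs.
    nodes = set(Cr) | set(Cs)
    H = G.subgraph(nodes)
    if not nx.is_connected(H):
        return False
    n = len(nodes)
    is_complete = H.number_of_edges() == n * (n - 1) // 2
    return is_complete or nx.node_connectivity(H) >= k

G = nx.Graph([(0, 1), (1, 2), (2, 0), (2, 3)])
print(satisfies_property(G, {0, 1}, {2}, k=2))  # True: a triangle is complete
print(satisfies_property(G, {0, 1}, {3}, k=1))  # False: 3 is isolated in the induced subgraph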
Examples

Examples
1 Single Link Algorithm
2 Complete Link Algorithm

There is another style of clustering
Clustering Algorithms Based on the Minimum Spanning Tree
Divisive Algorithms

Reverse Strategy
Start with a single cluster and split it iteratively.
Generalized Divisive Scheme

Algorithm (PROBLEM: what is wrong with it?!!)

Initialization:
    Choose R_0 = {X}
    P_0 = P(X)
    t = 0
Repeat:
    t = t + 1
    For i = 1 to t:
        Given the cluster C_(t-1,i), generate all possible pairs of sub-clusters
    Find the pair (C1_(t-1,j), C2_(t-1,j)) that maximizes g
    Create R_t = (R_(t-1) − {C_(t-1,j)}) ∪ {C1_(t-1,j), C2_(t-1,j)}
Until all vectors lie in a single cluster
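One hint about the question posed on this slide: “generate all possible” sub-clusters hides an exponential cost, since a cluster of n points can be split into two non-empty parts in 2^(n−1) − 1 ways. A small sketch (our own helper) that enumerates them:

from itertools import combinations

def two_way_splits(items):
    # Pin the first element into part1 so each unordered split appears once.
    items = list(items)
    first, rest = items[0], items[1:]
    for r in range(len(rest) + 1):
        for combo in combinations(rest, r):
            part1 = {first, *combo}
            part2 = set(items) - part1
            if part2:
                yield part1, part2

print(sum(1 for _ in two_way_splits(range(10))))  # 511 = 2**9 - 1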
Algorithms for Large Data Sets

There are several
1 The CURE Algorithm
2 The ROCK Algorithm
3 The Chameleon Algorithm
4 The BIRCH Algorithm
Clustering Using REpresentatives (CURE)

Basic Idea
Each cluster $C_i$ has a set of representatives $R_{C_i} = \{x_1^{(i)}, x_2^{(i)}, \ldots, x_K^{(i)}\}$ with $K > 1$.

What is happening
By using multiple representatives for each cluster, the CURE algorithm tries to “capture” the shape of each one.

However
In order to avoid taking into account irregularities (for example, outliers) in the border of the cluster, the initially chosen representatives are “pushed” toward the mean of the cluster.
Therefore

This action is known
As “shrinking”, in the sense that the volume of space “defined” by the representatives is shrunk toward the mean of the cluster.
Shrinking Process

Given a cluster C
Select the point x ∈ C with the maximum distance from the mean of C and set R_C = {x} (the set of representatives).

Then
1 For i = 2 to min{K, n_C}:
2     Determine y ∈ C − R_C that lies farthest from the points in R_C.
3     R_C = R_C ∪ {y}
Shrinking Process

Do the Shrinking
Shrink the points x ∈ R_C toward the mean m_C of C by a factor α.

Actually

$$x = (1 - \alpha)x + \alpha m_C, \quad \forall x \in R_C \quad (9)$$
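The selection and shrinking steps of the last two slides fit in a few lines of numpy; a sketch, assuming the cluster is given as an (n_C × d) array and with cure_representatives as our own name for the helper:

import numpy as np

def cure_representatives(C, K, alpha):
    m = C.mean(axis=0)  # the cluster mean m_C
    # Seed with the point farthest from the mean...
    reps = [C[np.argmax(np.linalg.norm(C - m, axis=1))]]
    # ...then repeatedly add the point farthest from the chosen set.
    while len(reps) < min(K, len(C)):
        d = np.min([np.linalg.norm(C - r, axis=1) for r in reps], axis=0)
        reps.append(C[np.argmax(d)])
    # Shrink: x = (1 - alpha) x + alpha m_C for all x in R_C.
    return (1 - alpha) * np.array(reps) + alpha * m

C = np.random.default_rng(0).normal(size=(200, 2))
print(cure_representatives(C, K=5, alpha=0.3).shape)  # (5, 2)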
Resulting Set R_C

Thus
The resulting set R_C is the set of representatives of C.

Thus, the distance between two clusters is defined as

$$d(C_i, C_j) = \min_{x \in R_{C_i},\, y \in R_{C_j}} d(x, y) \quad (10)$$
  99. 99. Images/cinvestav- Clustering Using REpresentatives (CURE) Basic Algorithm Input : A set of points X = {x1, x2, ..., xN } Output : C clusters 1 For every cluster Ci = {xi} store Ci.mC = {xi} and Ci.RC = {xi} 2 Ci.closest stores the cluster closest to Ci. 3 All the input points are inserted into a K − d tree T. 4 Insert each cluster into the heap Q. (Clusters are arranged in increasing order of distances between Ci and Ci.closest). 5 While size(Q) > C 6 Remove the top element of Q, Ci and merge it with Cj == Ci.closest. 7 Then compute the new representative points for the merged cluster Ck = Ci ∪ Cj. 8 Also remove Ci and Cj from T and Q. 9 Also for all the clusters Ch ∈ Q, update Ch.closest and relocate Ch. 10 insert Ck into Q 40 / 46
  100. 100. Images/cinvestav- Clustering Using REpresentatives (CURE) Basic Algorithm Input : A set of points X = {x1, x2, ..., xN } Output : C clusters 1 For every cluster Ci = {xi} store Ci.mC = {xi} and Ci.RC = {xi} 2 Ci.closest stores the cluster closest to Ci. 3 All the input points are inserted into a K − d tree T. 4 Insert each cluster into the heap Q. (Clusters are arranged in increasing order of distances between Ci and Ci.closest). 5 While size(Q) > C 6 Remove the top element of Q, Ci and merge it with Cj == Ci.closest. 7 Then compute the new representative points for the merged cluster Ck = Ci ∪ Cj. 8 Also remove Ci and Cj from T and Q. 9 Also for all the clusters Ch ∈ Q, update Ch.closest and relocate Ch. 10 insert Ck into Q 40 / 46
  101. 101. Images/cinvestav- Clustering Using REpresentatives (CURE) Basic Algorithm Input : A set of points X = {x1, x2, ..., xN } Output : C clusters 1 For every cluster Ci = {xi} store Ci.mC = {xi} and Ci.RC = {xi} 2 Ci.closest stores the cluster closest to Ci. 3 All the input points are inserted into a K − d tree T. 4 Insert each cluster into the heap Q. (Clusters are arranged in increasing order of distances between Ci and Ci.closest). 5 While size(Q) > C 6 Remove the top element of Q, Ci and merge it with Cj == Ci.closest. 7 Then compute the new representative points for the merged cluster Ck = Ci ∪ Cj. 8 Also remove Ci and Cj from T and Q. 9 Also for all the clusters Ch ∈ Q, update Ch.closest and relocate Ch. 10 insert Ck into Q 40 / 46
  102. 102. Images/cinvestav- Clustering Using REpresentatives (CURE) Basic Algorithm Input : A set of points X = {x1, x2, ..., xN } Output : C clusters 1 For every cluster Ci = {xi} store Ci.mC = {xi} and Ci.RC = {xi} 2 Ci.closest stores the cluster closest to Ci. 3 All the input points are inserted into a K − d tree T. 4 Insert each cluster into the heap Q. (Clusters are arranged in increasing order of distances between Ci and Ci.closest). 5 While size(Q) > C 6 Remove the top element of Q, Ci and merge it with Cj == Ci.closest. 7 Then compute the new representative points for the merged cluster Ck = Ci ∪ Cj. 8 Also remove Ci and Cj from T and Q. 9 Also for all the clusters Ch ∈ Q, update Ch.closest and relocate Ch. 10 insert Ck into Q 40 / 46
  103. 103. Images/cinvestav- Clustering Using REpresentatives (CURE) Basic Algorithm Input : A set of points X = {x1, x2, ..., xN } Output : C clusters 1 For every cluster Ci = {xi} store Ci.mC = {xi} and Ci.RC = {xi} 2 Ci.closest stores the cluster closest to Ci. 3 All the input points are inserted into a K − d tree T. 4 Insert each cluster into the heap Q. (Clusters are arranged in increasing order of distances between Ci and Ci.closest). 5 While size(Q) > C 6 Remove the top element of Q, Ci and merge it with Cj == Ci.closest. 7 Then compute the new representative points for the merged cluster Ck = Ci ∪ Cj. 8 Also remove Ci and Cj from T and Q. 9 Also for all the clusters Ch ∈ Q, update Ch.closest and relocate Ch. 10 insert Ck into Q 40 / 46
  104. 104. Images/cinvestav- Clustering Using REpresentatives (CURE) Basic Algorithm Input : A set of points X = {x1, x2, ..., xN } Output : C clusters 1 For every cluster Ci = {xi} store Ci.mC = {xi} and Ci.RC = {xi} 2 Ci.closest stores the cluster closest to Ci. 3 All the input points are inserted into a K − d tree T. 4 Insert each cluster into the heap Q. (Clusters are arranged in increasing order of distances between Ci and Ci.closest). 5 While size(Q) > C 6 Remove the top element of Q, Ci and merge it with Cj == Ci.closest. 7 Then compute the new representative points for the merged cluster Ck = Ci ∪ Cj. 8 Also remove Ci and Cj from T and Q. 9 Also for all the clusters Ch ∈ Q, update Ch.closest and relocate Ch. 10 insert Ck into Q 40 / 46
  105. 105. Images/cinvestav- Clustering Using REpresentatives (CURE) Basic Algorithm Input : A set of points X = {x1, x2, ..., xN } Output : C clusters 1 For every cluster Ci = {xi} store Ci.mC = {xi} and Ci.RC = {xi} 2 Ci.closest stores the cluster closest to Ci. 3 All the input points are inserted into a K − d tree T. 4 Insert each cluster into the heap Q. (Clusters are arranged in increasing order of distances between Ci and Ci.closest). 5 While size(Q) > C 6 Remove the top element of Q, Ci and merge it with Cj == Ci.closest. 7 Then compute the new representative points for the merged cluster Ck = Ci ∪ Cj. 8 Also remove Ci and Cj from T and Q. 9 Also for all the clusters Ch ∈ Q, update Ch.closest and relocate Ch. 10 insert Ck into Q 40 / 46
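To make the flow of the algorithm concrete, here is a minimal Python sketch of the merge loop. It is not the authors' implementation: it replaces the k-d tree and heap with brute-force searches, and the names (Cluster, shrunken_representatives, cure_merge) and the defaults for K and the shrinking factor are illustrative assumptions.

```python
import numpy as np

ALPHA = 0.5   # assumed shrinking factor (the parameter "a"/alpha of later slides)
N_REPS = 4    # assumed number of representatives per cluster (the parameter K)

def shrunken_representatives(points, mean, k=N_REPS, alpha=ALPHA):
    """Pick up to k well-scattered points, then shrink them toward the mean."""
    reps = [max(points, key=lambda p: np.linalg.norm(p - mean))]
    while len(reps) < min(k, len(points)):
        # farthest-point heuristic: the next representative maximizes its
        # distance to the representatives already chosen
        reps.append(max(points,
                        key=lambda p: min(np.linalg.norm(p - r) for r in reps)))
    return [r + alpha * (mean - r) for r in reps]

class Cluster:
    """Toy cluster holding its points, mean (Ci.mC) and representatives (Ci.RC)."""
    def __init__(self, points):
        self.points = list(points)
        self.mean = np.mean(self.points, axis=0)
        self.reps = shrunken_representatives(self.points, self.mean)

def cluster_distance(ci, cj):
    """CURE's cluster distance: minimum distance between representative points."""
    return min(np.linalg.norm(a - b) for a in ci.reps for b in cj.reps)

def cure_merge(clusters, n_clusters):
    """Agglomerate until n_clusters remain (brute force: no k-d tree, no heap)."""
    clusters = list(clusters)
    while len(clusters) > n_clusters:
        # the globally closest pair plays the role of the heap's top element
        i, j = min(((a, b) for a in range(len(clusters))
                           for b in range(a + 1, len(clusters))),
                   key=lambda ab: cluster_distance(clusters[ab[0]], clusters[ab[1]]))
        merged = Cluster(clusters[i].points + clusters[j].points)  # Ck = Ci U Cj
        clusters = [c for t, c in enumerate(clusters) if t not in (i, j)]
        clusters.append(merged)
    return clusters

# usage: 10 random 2-D points merged down to C = 3 clusters
rng = np.random.default_rng(0)
data = [rng.normal(size=2) for _ in range(10)]
result = cure_merge([Cluster([p]) for p in data], 3)
```

Recomputing the closest pair from scratch each iteration is what the heap and k-d tree avoid; that shortcut is the reason this sketch is far from the paper's complexity bound.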
Complexity of CURE

Too prohibitive:

$O\left(N^2 \log_2 N\right)$ (11)
Possible Solution

CURE does the following
The technique adopted by the CURE algorithm, in order to reduce the computational complexity, is that of random sampling.

Actually
That is, a sample set X′ is created from X by randomly choosing N′ out of the N points of X.

However, one has to ensure that the probability of missing a cluster of X due to this sampling is low
This can be guaranteed if the number of sampled points N′ is sufficiently large.
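As a small illustration of this step, the sketch below draws N′ points uniformly without replacement; the function name is hypothetical. How large N′ must be to keep the probability of missing a cluster low is derived with Chernoff bounds in Guha et al.'s paper and is not reproduced here.

```python
import numpy as np

def draw_sample(X, n_prime, seed=0):
    """Draw a random sample X' of n_prime points from X without replacement."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=n_prime, replace=False)
    return X[idx]

# e.g. keep ~2.5% of the data, the lower bound reported later in these slides
X = np.random.default_rng(1).normal(size=(2_000, 2))
X_prime = draw_sample(X, n_prime=int(0.025 * len(X)))
```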
Then

Having estimated N′
CURE forms a number of p = N/N′ sample data sets by successive random samplings.

In other words
X is partitioned randomly into p subsets, each of size N′.

For this a parameter q > 1 is selected
Then, the points in each partition are clustered until N′/q clusters are formed, or until the distance between the closest pair of clusters to be merged in the next iteration step exceeds a user-defined threshold.
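A minimal sketch of this first pass, reusing the hypothetical Cluster and cure_merge helpers from the earlier sketch; the user-defined distance threshold for early stopping is omitted for brevity.

```python
import numpy as np

def partition_and_precluster(X, n_prime, q, seed=0):
    """First pass: split X into p = N // n_prime random subsets of size ~n_prime
    and cluster each one down to (at most) n_prime // q clusters."""
    rng = np.random.default_rng(seed)
    shuffled = X[rng.permutation(len(X))]
    p = max(1, len(X) // n_prime)
    preclusters = []
    for part in np.array_split(shuffled, p):
        singletons = [Cluster([x]) for x in part]
        preclusters.extend(cure_merge(singletons, max(1, n_prime // q)))
    return preclusters
```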
Once this has been finished

A second clustering pass is done
On the at most p · N′/q = N/q clusters obtained from all the subsets.

The Goal
To apply the merging procedure described previously to all (at most) N/q clusters, so that we end up with the required final number, m, of clusters.

Finally
Each point x in the data set X that is not used as a representative in any one of the m clusters is assigned to one of them according to the following strategy.
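With the hypothetical helpers above, the second pass is just the same merge loop applied to the pooled pre-clusters; the values of n_prime, q and m below are illustrative toy settings, not recommendations.

```python
# Second pass: merge the (at most) N/q pre-clusters down to m final clusters.
m = 3
preclusters = partition_and_precluster(X, n_prime=200, q=10)
final_clusters = cure_merge(preclusters, m)
```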
Finally

First
A random sample of representative points from each of the m clusters is chosen.

Then
Based on the previous representatives, the point x is assigned to the cluster that contains the representative closest to it.

Experiments reported by Guha et al. show that CURE is sensitive to parameter selection
Specifically, K (the number of representative points per cluster) must be large enough to capture the geometry of each cluster.
In addition, N′ must be higher than a certain percentage of N (≈ 2.5% of N).
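A sketch of this assignment strategy under the same toy assumptions; assign_points and reps_per_cluster are illustrative names, and the representatives come from the hypothetical Cluster objects defined earlier.

```python
import numpy as np

def assign_points(X, clusters, reps_per_cluster=3, seed=0):
    """Label every point by the cluster owning the nearest sampled representative."""
    rng = np.random.default_rng(seed)
    sampled = []                                   # (representative, cluster id)
    for c_id, c in enumerate(clusters):
        take = min(reps_per_cluster, len(c.reps))  # random sample of reps per cluster
        for r_id in rng.choice(len(c.reps), size=take, replace=False):
            sampled.append((c.reps[r_id], c_id))
    # each point goes to the cluster containing its closest sampled representative
    return [min(sampled, key=lambda rc: np.linalg.norm(x - rc[0]))[1] for x in X]

labels = assign_points(X, final_clusters)
```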
Not only that

The value of the shrinking factor a also affects CURE
For small values, CURE behaves like an MST-based (single-link) clustering.
For large values, CURE resembles an algorithm with a single representative per cluster.

Worst Case Complexity

$O\left(N'^2 \log_2 N'\right)$ (12)
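The two regimes follow directly from the shrinking formula r′ = r + a(m − r): a quick check using the hypothetical shrunken_representatives helper from the first sketch.

```python
import numpy as np

pts = [np.array([0.0, 0.0]), np.array([4.0, 0.0]), np.array([2.0, 3.0])]
mean = np.mean(pts, axis=0)

# a near 0: representatives stay on the cluster boundary, so inter-cluster
# distances are driven by extreme points -- single-link / MST-like behavior.
print(shrunken_representatives(pts, mean, k=3, alpha=0.01))

# a near 1: every representative collapses onto the mean -- the cluster is
# effectively described by a single (centroid) representative.
print(shrunken_representatives(pts, mean, k=3, alpha=0.99))
```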