SlideShare a Scribd company logo
1 of 19
PPrreesseenntteedd BByy,, 
AAsshhwwiinn SShheennooyy MM 
44CCBB1133SSCCSS0022 
1
 Partitioning methods 
 HHiieerraarrcchhiiccaall mmeetthhooddss 
 DDeennssiittyy--bbaasseedd mmeetthhooddss 
 GGrriidd--bbaasseedd mmeetthhooddss 
 MMooddeell--bbaasseedd mmeetthhooddss 
(conceptual clustering, 
neural networks) 
2
 AA ppaarrttiittiioonniinngg mmeetthhoodd:: construct a partition of 
a database D of nn objects into a set of k clusters 
such that 
o each cluster contains at least one object 
o each object belongs to exactly one cluster 
3
mmeetthhooddss: 
o kk--mmeeaannss : Each cluster is 
represented by the center of the 
cluster (cceennttrrooiidd). 
o kk--mmeeddooiiddss : Each cluster is 
represented by one of the objects in 
the cluster (mmeeddooiidd). 
4
 IInnppuutt ttoo tthhee aallggoorriitthhmm: the number of clusters k, 
and a database of n objects 
 AAllggoorriitthhmm ccoonnssiissttss ooff ffoouurr sstteeppss: 
1. partition object into k nonempty 
subsets/clusters 
2. compute a seed points as the cceennttrrooiidd (the 
mean of the objects in the cluster) for each 
cluster in the current partition 
3. assign each object to the cluster with the 
nearest centroid 
4. go back to Step 2, stop when there are no 
more new assignments 
5
Alternative algorithm aallssoo ccoonnssiissttss ooff ffoouurr sstteeppss: 
1. arbitrarily choose k objects as the initial 
cluster centers (centroids) 
2. (re)assign each object to the cluster with the 
nearest centroid 
3. update the centroids 
4. go back to Step 2, stop when there are no 
more new assignments 
6
7 
10 
9 
8 
7 
6 
5 
4 
3 
2 
1 
0 
10 
9 
8 
7 
6 
5 
4 
3 
2 
1 
0 1 2 3 4 5 6 7 8 9 10 0 
0 1 2 3 4 5 6 7 8 9 10 
10 
9 
8 
7 
6 
5 
4 
3 
2 
1 
0 
0 1 2 3 4 5 6 7 8 9 10 
10 
9 
8 
7 
6 
5 
4 
3 
2 
1 
0 
0 1 2 3 4 5 6 7 8 9 10
 IInnppuutt ttoo tthhee aallggoorriitthhmm: the number of clusters k, 
and a database of n objects 
 AAllggoorriitthhmm ccoonnssiissttss ooff ffoouurr sstteeppss: 
1. arbitrarily choose k objects as the initial 
mmeeddooiiddss (representative objects) 
2. assign each remaining object to the cluster 
with the nearest medoid 
3. select a nonmedoid and replace one of the 
medoids with it if this improves the clustering 
4. go back to Step 2, stop when there are no 
more new assignments 
8
 AA hhiieerraarrcchhiiccaall mmeetthhoodd:: construct a hierarchy of 
clustering, not just a single partition of objects. 
 The number of clusters kk is not required as an 
input . 
 Use a ddiissttaannccee mmaattrriixx as clustering criteria 
 A tteerrmmiinnaattiioonn ccoonnddiittiioonn can be used (e.g., a 
number of clusters) 
9
11..aagggglloommeerraattiivvee (bottom-up): 
oplace each object in its own cluster. 
omerge in each step the two most similar clusters until 
there is only one cluster left or the termination 
condition is satisfied. 
22..ddiivviissiivvee (top-down): 
ostart with one big cluster containing all the objects 
odivide the most distinctive cluster into smaller 
clusters and proceed until there are n clusters or the 
termination condition is satisfied. 
10
11 
Step 0 Step 1 Step 2 Step 3 Step 4 
a a b 
b 
c 
d 
e 
d e 
a b c d e 
c d e 
Step 4 Step 3 Step 2 Step 1 Step 0 
aagggglloommeerraattiivvee 
ddiivviissiivvee
 Birch: Balanced Iterative Reducing and Clustering using 
Hierarchies. 
 Incrementally construct a CF (Clustering Feature) tree, a 
hierarchical data structure for multiphase clustering 
Phase 1: scan DB to build an initial in-memory CF tree 
(a multi-level compression of the data that tries to 
preserve the inherent clustering structure of the data). 
Phase 2: use an arbitrary clustering algorithm to 
cluster the leaf nodes of the CF-tree. 
12
 Clustering feature: 
• summary of the statistics for a given subcluster: the 0-th, 1st and 
2nd moments of the subcluster from the statistical point of view. 
• registers crucial measurements for computing cluster and utilizes 
storage efficiently 
A CF tree is a height-balanced tree that stores the clustering features 
for a hierarchical clustering 
• A nonleaf node in a tree has descendants or “children” 
• The nonleaf nodes store sums of the CFs of their children 
 A CF tree has two parameters 
• Branching factor: specify the maximum number of children. 
• threshold: max diameter of sub-clusters stored at the leaf nodes 
13
14 
CF1 
child1 
Root 
CF3 
child3 
CF2 
child2 
CF6 
child6 
CF1 
child1 
Non-leaf node 
CF3 
child3 
CF2 
child2 
CF5 
child5 
B = 7 
L = 6 
Leaf node Leaf node 
CF1 CF2 CF6 prev next CF1 CF2 CF4 prev next
 ROCK: RObust Clustering using linKs 
 Major ideas 
• Use links to measure similarity/proximity. 
15
 Links: # of common neighbors 
• C1 <a, b, c, d, e>: {a, b, c}, {a, b, d}, {a, b, e}, {a, c, d}, {a, c, e}, {a, 
d, e}, {b, c, d}, {b, c, e}, {b, d, e}, {c, d, e} 
• C2 <a, b, f, g>: {a, b, f}, {a, b, g}, {a, f, g}, {b, f, g} 
 Let T1 = {a, b, c}, T2 = {c, d, e}, T3 = {a, b, f} 
• link(T1, T2) = 4, since they have 4 common neighbors 
 {a, c, d}, {a, c, e}, {b, c, d}, {b, c, e} 
• link(T1, T3) = 3, since they have 3 common neighbors 
 {a, b, d}, {a, b, e}, {a, b, g} 
 Thus link is a better measure than Jaccard coefficient 
16
 Measures the similarity based on a dynamic model 
• Two clusters are merged only if the interconnectivity and 
closeness (proximity) between two clusters are high relative to the 
internal interconnectivity of the clusters and closeness of items 
within the clusters 
 A two-phase algorithm 
1. Use a graph partitioning algorithm: cluster objects into a large 
number of relatively small sub-clusters 
2. Use an agglomerative hierarchical clustering algorithm: find the 
genuine clusters by repeatedly combining these sub-clusters 
17
18 
Construct 
Sparse Graph Partition the Graph 
Merge Partition 
Final Clusters 
Data Set
THANK YOU 
19

More Related Content

What's hot

3.3 hierarchical methods
3.3 hierarchical methods3.3 hierarchical methods
3.3 hierarchical methodsKrish_ver2
 
Unsupervised learning (clustering)
Unsupervised learning (clustering)Unsupervised learning (clustering)
Unsupervised learning (clustering)Pravinkumar Landge
 
Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)Mustafa Sherazi
 
New Approach for K-mean and K-medoids Algorithm
New Approach for K-mean and K-medoids AlgorithmNew Approach for K-mean and K-medoids Algorithm
New Approach for K-mean and K-medoids AlgorithmEditor IJCATR
 
Large Scale Data Clustering: an overview
Large Scale Data Clustering: an overviewLarge Scale Data Clustering: an overview
Large Scale Data Clustering: an overviewVahid Mirjalili
 
3.5 model based clustering
3.5 model based clustering3.5 model based clustering
3.5 model based clusteringKrish_ver2
 
Data Mining: clustering and analysis
Data Mining: clustering and analysisData Mining: clustering and analysis
Data Mining: clustering and analysisDataminingTools Inc
 
3.2 partitioning methods
3.2 partitioning methods3.2 partitioning methods
3.2 partitioning methodsKrish_ver2
 
Introduction to Clustering algorithm
Introduction to Clustering algorithmIntroduction to Clustering algorithm
Introduction to Clustering algorithmhadifar
 
3.6 constraint based cluster analysis
3.6 constraint based cluster analysis3.6 constraint based cluster analysis
3.6 constraint based cluster analysisKrish_ver2
 
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...
K-MEDOIDS CLUSTERING  USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...K-MEDOIDS CLUSTERING  USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...ijscmc
 
05 Clustering in Data Mining
05 Clustering in Data Mining05 Clustering in Data Mining
05 Clustering in Data MiningValerii Klymchuk
 
K means Clustering
K means ClusteringK means Clustering
K means ClusteringEdureka!
 
Document clustering and classification
Document clustering and classification Document clustering and classification
Document clustering and classification Mahmoud Alfarra
 

What's hot (20)

3.3 hierarchical methods
3.3 hierarchical methods3.3 hierarchical methods
3.3 hierarchical methods
 
Unsupervised learning (clustering)
Unsupervised learning (clustering)Unsupervised learning (clustering)
Unsupervised learning (clustering)
 
Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
Clustering in Data Mining
Clustering in Data MiningClustering in Data Mining
Clustering in Data Mining
 
Unsupervised Learning
Unsupervised LearningUnsupervised Learning
Unsupervised Learning
 
New Approach for K-mean and K-medoids Algorithm
New Approach for K-mean and K-medoids AlgorithmNew Approach for K-mean and K-medoids Algorithm
New Approach for K-mean and K-medoids Algorithm
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
Large Scale Data Clustering: an overview
Large Scale Data Clustering: an overviewLarge Scale Data Clustering: an overview
Large Scale Data Clustering: an overview
 
3.5 model based clustering
3.5 model based clustering3.5 model based clustering
3.5 model based clustering
 
Data Mining: clustering and analysis
Data Mining: clustering and analysisData Mining: clustering and analysis
Data Mining: clustering and analysis
 
Clustering
ClusteringClustering
Clustering
 
3.2 partitioning methods
3.2 partitioning methods3.2 partitioning methods
3.2 partitioning methods
 
Introduction to Clustering algorithm
Introduction to Clustering algorithmIntroduction to Clustering algorithm
Introduction to Clustering algorithm
 
3.6 constraint based cluster analysis
3.6 constraint based cluster analysis3.6 constraint based cluster analysis
3.6 constraint based cluster analysis
 
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...
K-MEDOIDS CLUSTERING  USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...K-MEDOIDS CLUSTERING  USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...
 
05 Clustering in Data Mining
05 Clustering in Data Mining05 Clustering in Data Mining
05 Clustering in Data Mining
 
K means Clustering
K means ClusteringK means Clustering
K means Clustering
 
Document clustering and classification
Document clustering and classification Document clustering and classification
Document clustering and classification
 
Clustering
ClusteringClustering
Clustering
 

Viewers also liked

Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsAntonio Severien
 
Clustering: Large Databases in data mining
Clustering: Large Databases in data miningClustering: Large Databases in data mining
Clustering: Large Databases in data miningZHAO Sam
 
Machine Learning and Data Mining: 08 Clustering: Hierarchical
Machine Learning and Data Mining: 08 Clustering: Hierarchical Machine Learning and Data Mining: 08 Clustering: Hierarchical
Machine Learning and Data Mining: 08 Clustering: Hierarchical Pier Luca Lanzi
 
Linear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actionsLinear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actionsHesen Peng
 
Marketing wartosci dodanej (added value marketing)
Marketing wartosci dodanej (added value marketing)Marketing wartosci dodanej (added value marketing)
Marketing wartosci dodanej (added value marketing)Shake Your Brand
 
фотоальбом дороги
фотоальбом дорогифотоальбом дороги
фотоальбом дорогиNASTIA_090583
 
Citizen Journalism (Myanmar Version) 2014
Citizen Journalism (Myanmar Version) 2014Citizen Journalism (Myanmar Version) 2014
Citizen Journalism (Myanmar Version) 2014YoYam Hnine
 
The Fail Lecture
The Fail LectureThe Fail Lecture
The Fail Lectureryansmc
 
Concentrated Stress & Stress Tensor
Concentrated Stress & Stress TensorConcentrated Stress & Stress Tensor
Concentrated Stress & Stress TensorHassan Raza
 

Viewers also liked (12)

Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data Streams
 
Clustering: Large Databases in data mining
Clustering: Large Databases in data miningClustering: Large Databases in data mining
Clustering: Large Databases in data mining
 
Machine Learning and Data Mining: 08 Clustering: Hierarchical
Machine Learning and Data Mining: 08 Clustering: Hierarchical Machine Learning and Data Mining: 08 Clustering: Hierarchical
Machine Learning and Data Mining: 08 Clustering: Hierarchical
 
Linear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actionsLinear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actions
 
Marketing wartosci dodanej (added value marketing)
Marketing wartosci dodanej (added value marketing)Marketing wartosci dodanej (added value marketing)
Marketing wartosci dodanej (added value marketing)
 
фотоальбом дороги
фотоальбом дорогифотоальбом дороги
фотоальбом дороги
 
Citizen Journalism (Myanmar Version) 2014
Citizen Journalism (Myanmar Version) 2014Citizen Journalism (Myanmar Version) 2014
Citizen Journalism (Myanmar Version) 2014
 
The Fail Lecture
The Fail LectureThe Fail Lecture
The Fail Lecture
 
Newforma by numbers
Newforma by numbersNewforma by numbers
Newforma by numbers
 
Concentrated Stress & Stress Tensor
Concentrated Stress & Stress TensorConcentrated Stress & Stress Tensor
Concentrated Stress & Stress Tensor
 
Job app
Job appJob app
Job app
 
Pwpn
PwpnPwpn
Pwpn
 

Similar to DATA MINING:Clustering Types

Clustering techniques
Clustering techniquesClustering techniques
Clustering techniquestalktoharry
 
ML basic &amp; clustering
ML basic &amp; clusteringML basic &amp; clustering
ML basic &amp; clusteringmonalisa Das
 
Advanced database and data mining & clustering concepts
Advanced database and data mining & clustering conceptsAdvanced database and data mining & clustering concepts
Advanced database and data mining & clustering conceptsNithyananthSengottai
 
Mat189: Cluster Analysis with NBA Sports Data
Mat189: Cluster Analysis with NBA Sports DataMat189: Cluster Analysis with NBA Sports Data
Mat189: Cluster Analysis with NBA Sports DataKathleneNgo
 
Data Mining: Concepts and Techniques (3rd ed.) — Chapter 5
Data Mining:  Concepts and Techniques (3rd ed.)— Chapter 5 Data Mining:  Concepts and Techniques (3rd ed.)— Chapter 5
Data Mining: Concepts and Techniques (3rd ed.) — Chapter 5 Salah Amean
 
Pattern recognition binoy k means clustering
Pattern recognition binoy  k means clusteringPattern recognition binoy  k means clustering
Pattern recognition binoy k means clustering108kaushik
 
hierarchical clustering.pptx
hierarchical clustering.pptxhierarchical clustering.pptx
hierarchical clustering.pptxPriyadharshiniG41
 
multiarmed bandit.ppt
multiarmed bandit.pptmultiarmed bandit.ppt
multiarmed bandit.pptLPrashanthi
 
machine learning - Clustering in R
machine learning - Clustering in Rmachine learning - Clustering in R
machine learning - Clustering in RSudhakar Chavan
 
26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.ppt26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.pptvikassingh569137
 
MODULE 4_ CLUSTERING.pptx
MODULE 4_ CLUSTERING.pptxMODULE 4_ CLUSTERING.pptx
MODULE 4_ CLUSTERING.pptxnikshaikh786
 
K means clustering
K means clusteringK means clustering
K means clusteringKuppusamy P
 

Similar to DATA MINING:Clustering Types (20)

My8clst
My8clstMy8clst
My8clst
 
Clustering techniques
Clustering techniquesClustering techniques
Clustering techniques
 
ML basic &amp; clustering
ML basic &amp; clusteringML basic &amp; clustering
ML basic &amp; clustering
 
Clustering.pptx
Clustering.pptxClustering.pptx
Clustering.pptx
 
Advanced database and data mining & clustering concepts
Advanced database and data mining & clustering conceptsAdvanced database and data mining & clustering concepts
Advanced database and data mining & clustering concepts
 
Mat189: Cluster Analysis with NBA Sports Data
Mat189: Cluster Analysis with NBA Sports DataMat189: Cluster Analysis with NBA Sports Data
Mat189: Cluster Analysis with NBA Sports Data
 
Data Mining: Concepts and Techniques (3rd ed.) — Chapter 5
Data Mining:  Concepts and Techniques (3rd ed.)— Chapter 5 Data Mining:  Concepts and Techniques (3rd ed.)— Chapter 5
Data Mining: Concepts and Techniques (3rd ed.) — Chapter 5
 
Pattern recognition binoy k means clustering
Pattern recognition binoy  k means clusteringPattern recognition binoy  k means clustering
Pattern recognition binoy k means clustering
 
dm_clustering2.ppt
dm_clustering2.pptdm_clustering2.ppt
dm_clustering2.ppt
 
hierarchical clustering.pptx
hierarchical clustering.pptxhierarchical clustering.pptx
hierarchical clustering.pptx
 
Project PPT
Project PPTProject PPT
Project PPT
 
multiarmed bandit.ppt
multiarmed bandit.pptmultiarmed bandit.ppt
multiarmed bandit.ppt
 
machine learning - Clustering in R
machine learning - Clustering in Rmachine learning - Clustering in R
machine learning - Clustering in R
 
ClusetrigBasic.ppt
ClusetrigBasic.pptClusetrigBasic.ppt
ClusetrigBasic.ppt
 
Chapter 5.pdf
Chapter 5.pdfChapter 5.pdf
Chapter 5.pdf
 
26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.ppt26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.ppt
 
UNIT_V_Cluster Analysis.pptx
UNIT_V_Cluster Analysis.pptxUNIT_V_Cluster Analysis.pptx
UNIT_V_Cluster Analysis.pptx
 
MODULE 4_ CLUSTERING.pptx
MODULE 4_ CLUSTERING.pptxMODULE 4_ CLUSTERING.pptx
MODULE 4_ CLUSTERING.pptx
 
K means clustering
K means clusteringK means clustering
K means clustering
 
Birch1
Birch1Birch1
Birch1
 

Recently uploaded

US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionMebane Rash
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptSAURABHKUMAR892774
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHC Sai Kiran
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...121011101441
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfme23b1001
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitterShivangiSharma879191
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgUnit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgsaravananr517913
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 
Transport layer issues and challenges - Guide
Transport layer issues and challenges - GuideTransport layer issues and challenges - Guide
Transport layer issues and challenges - GuideGOPINATHS437943
 

Recently uploaded (20)

US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of Action
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.ppt
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECH
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgUnit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
Transport layer issues and challenges - Guide
Transport layer issues and challenges - GuideTransport layer issues and challenges - Guide
Transport layer issues and challenges - Guide
 

DATA MINING:Clustering Types

  • 1. PPrreesseenntteedd BByy,, AAsshhwwiinn SShheennooyy MM 44CCBB1133SSCCSS0022 1
  • 2.  Partitioning methods  HHiieerraarrcchhiiccaall mmeetthhooddss  DDeennssiittyy--bbaasseedd mmeetthhooddss  GGrriidd--bbaasseedd mmeetthhooddss  MMooddeell--bbaasseedd mmeetthhooddss (conceptual clustering, neural networks) 2
  • 3.  AA ppaarrttiittiioonniinngg mmeetthhoodd:: construct a partition of a database D of nn objects into a set of k clusters such that o each cluster contains at least one object o each object belongs to exactly one cluster 3
  • 4. mmeetthhooddss: o kk--mmeeaannss : Each cluster is represented by the center of the cluster (cceennttrrooiidd). o kk--mmeeddooiiddss : Each cluster is represented by one of the objects in the cluster (mmeeddooiidd). 4
  • 5.  IInnppuutt ttoo tthhee aallggoorriitthhmm: the number of clusters k, and a database of n objects  AAllggoorriitthhmm ccoonnssiissttss ooff ffoouurr sstteeppss: 1. partition object into k nonempty subsets/clusters 2. compute a seed points as the cceennttrrooiidd (the mean of the objects in the cluster) for each cluster in the current partition 3. assign each object to the cluster with the nearest centroid 4. go back to Step 2, stop when there are no more new assignments 5
  • 6. Alternative algorithm aallssoo ccoonnssiissttss ooff ffoouurr sstteeppss: 1. arbitrarily choose k objects as the initial cluster centers (centroids) 2. (re)assign each object to the cluster with the nearest centroid 3. update the centroids 4. go back to Step 2, stop when there are no more new assignments 6
  • 7. 7 10 9 8 7 6 5 4 3 2 1 0 10 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 10 0 0 1 2 3 4 5 6 7 8 9 10 10 9 8 7 6 5 4 3 2 1 0 0 1 2 3 4 5 6 7 8 9 10 10 9 8 7 6 5 4 3 2 1 0 0 1 2 3 4 5 6 7 8 9 10
  • 8.  IInnppuutt ttoo tthhee aallggoorriitthhmm: the number of clusters k, and a database of n objects  AAllggoorriitthhmm ccoonnssiissttss ooff ffoouurr sstteeppss: 1. arbitrarily choose k objects as the initial mmeeddooiiddss (representative objects) 2. assign each remaining object to the cluster with the nearest medoid 3. select a nonmedoid and replace one of the medoids with it if this improves the clustering 4. go back to Step 2, stop when there are no more new assignments 8
  • 9.  AA hhiieerraarrcchhiiccaall mmeetthhoodd:: construct a hierarchy of clustering, not just a single partition of objects.  The number of clusters kk is not required as an input .  Use a ddiissttaannccee mmaattrriixx as clustering criteria  A tteerrmmiinnaattiioonn ccoonnddiittiioonn can be used (e.g., a number of clusters) 9
  • 10. 11..aagggglloommeerraattiivvee (bottom-up): oplace each object in its own cluster. omerge in each step the two most similar clusters until there is only one cluster left or the termination condition is satisfied. 22..ddiivviissiivvee (top-down): ostart with one big cluster containing all the objects odivide the most distinctive cluster into smaller clusters and proceed until there are n clusters or the termination condition is satisfied. 10
  • 11. 11 Step 0 Step 1 Step 2 Step 3 Step 4 a a b b c d e d e a b c d e c d e Step 4 Step 3 Step 2 Step 1 Step 0 aagggglloommeerraattiivvee ddiivviissiivvee
  • 12.  Birch: Balanced Iterative Reducing and Clustering using Hierarchies.  Incrementally construct a CF (Clustering Feature) tree, a hierarchical data structure for multiphase clustering Phase 1: scan DB to build an initial in-memory CF tree (a multi-level compression of the data that tries to preserve the inherent clustering structure of the data). Phase 2: use an arbitrary clustering algorithm to cluster the leaf nodes of the CF-tree. 12
  • 13.  Clustering feature: • summary of the statistics for a given subcluster: the 0-th, 1st and 2nd moments of the subcluster from the statistical point of view. • registers crucial measurements for computing cluster and utilizes storage efficiently A CF tree is a height-balanced tree that stores the clustering features for a hierarchical clustering • A nonleaf node in a tree has descendants or “children” • The nonleaf nodes store sums of the CFs of their children  A CF tree has two parameters • Branching factor: specify the maximum number of children. • threshold: max diameter of sub-clusters stored at the leaf nodes 13
  • 14. 14 CF1 child1 Root CF3 child3 CF2 child2 CF6 child6 CF1 child1 Non-leaf node CF3 child3 CF2 child2 CF5 child5 B = 7 L = 6 Leaf node Leaf node CF1 CF2 CF6 prev next CF1 CF2 CF4 prev next
  • 15.  ROCK: RObust Clustering using linKs  Major ideas • Use links to measure similarity/proximity. 15
  • 16.  Links: # of common neighbors • C1 <a, b, c, d, e>: {a, b, c}, {a, b, d}, {a, b, e}, {a, c, d}, {a, c, e}, {a, d, e}, {b, c, d}, {b, c, e}, {b, d, e}, {c, d, e} • C2 <a, b, f, g>: {a, b, f}, {a, b, g}, {a, f, g}, {b, f, g}  Let T1 = {a, b, c}, T2 = {c, d, e}, T3 = {a, b, f} • link(T1, T2) = 4, since they have 4 common neighbors  {a, c, d}, {a, c, e}, {b, c, d}, {b, c, e} • link(T1, T3) = 3, since they have 3 common neighbors  {a, b, d}, {a, b, e}, {a, b, g}  Thus link is a better measure than Jaccard coefficient 16
  • 17.  Measures the similarity based on a dynamic model • Two clusters are merged only if the interconnectivity and closeness (proximity) between two clusters are high relative to the internal interconnectivity of the clusters and closeness of items within the clusters  A two-phase algorithm 1. Use a graph partitioning algorithm: cluster objects into a large number of relatively small sub-clusters 2. Use an agglomerative hierarchical clustering algorithm: find the genuine clusters by repeatedly combining these sub-clusters 17
  • 18. 18 Construct Sparse Graph Partition the Graph Merge Partition Final Clusters Data Set