Rajia cluster analysis

REHANA RAJ
DFK1307
DEPT OF FISH PROCESSING TECHNOLOGY
COLLEGE OF FISHERIES
MANGALORE
CLUSTER ANALYSIS

• Cluster analysis is a multivariate statistical technique
in which a large data set is segregated into several
groups based on homogeneity or similarity measures.
• Cluster analysis produces a sensible and informative
classification of an initially unclassified set of data,
with the desired accuracy, using the variable values
observed on each individual.
• It saves a lot of resources in terms of time, money, etc.

[Figure: data before clustering and after clustering]

• To assign observations to groups ("clusters")
• To divide the observations into homogeneous and
distinct groups
• To reduce the complexity of the data

• Generates several groups of data whose members are similar
• Groups are homogeneous internally and, as far as
possible, heterogeneous with respect to other groups
• Normally, the data consist of objects or persons
• Segregation is done based on more than two
variables.

• Hierarchical clustering
• Centroid-based clustering
• Distribution-based clustering
• Density-based clustering

• Hierarchical clustering is a method of cluster analysis which
seeks to build a hierarchy of clusters.
• Two types:
• Agglomerative (bottom-up):
◦ Start with each observation being a single cluster.
◦ Eventually all observations belong to the same cluster.
• Divisive (top-down):
◦ Start with all observations belonging to the same cluster.
◦ Eventually each observation forms a cluster on its own.
• The number of clusters need not be fixed to k in advance.
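
The agglomerative (bottom-up) procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say how the distance between two clusters is measured, so single linkage (distance between the closest members) is assumed here, and the function name `agglomerative` and the sample points are hypothetical.

```python
from math import dist

def agglomerative(points, n_clusters=1):
    """Single-linkage agglomerative clustering (assumed linkage).

    Returns (clusters, merges): clusters are lists of point indices,
    merges records which two clusters were joined and at what distance.
    """
    # Start with every observation in its own cluster.
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((list(clusters[a]), list(clusters[b]), d))
        # Merge cluster b into cluster a.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters, merges

# Hypothetical two-dimensional observations.
points = [(1.0, 1.0), (1.5, 1.0), (5.0, 5.0), (5.0, 5.5), (9.0, 1.0)]
clusters, merges = agglomerative(points, n_clusters=3)
print(clusters)

# Running down to a single cluster gives the full merge history.
all_in_one, full_history = agglomerative(points, n_clusters=1)
```

Stopping the merging at any level gives a different number of clusters, which is why the number of clusters does not have to be chosen beforehand.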

• Construction of a tree-based hierarchical diagram,
usually called a dendrogram. E.g., in the case of taxonomic
classification:
animal
  vertebrate: fish, reptile, amphib., mammal
  invertebrate: worm, insect, crustacean

• In centroid-based clustering, clusters are
represented by a central vector, which may not
necessarily be a member of the data set.
• It aims to partition n observations into k clusters.
• Each observation belongs to the cluster with the
nearest mean.
• Here, the number of clusters is fixed to k
(k-means clustering).
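
A minimal k-means sketch in plain Python illustrates the two alternating steps (assign each observation to the nearest mean, then recompute the means). The function name `k_means`, the choice of the first k points as starting centroids, and the sample points are assumptions made for this illustration, not something taken from the slides.

```python
from math import dist

def k_means(points, k, max_iter=100):
    """Lloyd's algorithm for k-means; returns (labels, centroids)."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each observation joins the nearest mean.
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return labels, centroids

# Hypothetical two-dimensional observations.
points = [(1.0, 1.0), (5.0, 5.0), (1.5, 1.0),
          (5.0, 5.5), (1.0, 1.5), (5.5, 5.0)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

In this example the final centroids are averages of the members and do not coincide with any observation, which is the sense in which the central vector need not be a member of the data set.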

• In distribution-based clustering, clusters can be defined as
objects belonging to the same distribution.
• The fitted models can capture correlation and dependence
between attributes.
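
As one possible illustration of distribution-based clustering, the sketch below fits a mixture of two one-dimensional Gaussian distributions with the expectation-maximisation (EM) procedure and assigns each observation to the distribution it most probably came from. The slides name no specific model, so the Gaussian mixture, the function names, the initialisation and the sample values are all assumptions.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std. dev. sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def fit_two_gaussians(xs, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture.

    Returns (weights, means, sigmas, labels).
    """
    # Assumption: start the two means at the smallest and largest value.
    means = [min(xs), max(xs)]
    spread = (max(xs) - min(xs)) / 4 or 1.0
    sigmas = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: probability that each observation belongs to each component.
        resp = []
        for x in xs:
            p = [weights[c] * normal_pdf(x, means[c], sigmas[c])
                 for c in range(2)]
            total = sum(p)
            resp.append([pc / total for pc in p])
        # M-step: re-estimate each component from its weighted members.
        for c in range(2):
            n_c = sum(r[c] for r in resp)
            weights[c] = n_c / len(xs)
            means[c] = sum(r[c] * x for r, x in zip(resp, xs)) / n_c
            var = sum(r[c] * (x - means[c]) ** 2
                      for r, x in zip(resp, xs)) / n_c
            sigmas[c] = max(sqrt(var), 1e-3)  # floor keeps the density finite
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return weights, means, sigmas, labels

# Hypothetical one-dimensional observations.
xs = [1.0, 1.2, 0.8, 1.1, 5.0, 5.3, 4.8, 5.1]
weights, means, sigmas, labels = fit_two_gaussians(xs)
print(labels)
```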

• In density-based clustering, clusters are defined as areas of
higher density than the rest of the data set.
• Objects in the sparse areas that are required to separate
clusters are usually considered to be noise and border
points.
• The most popular density-based clustering method is
DBSCAN (density-based spatial clustering of applications
with noise).
• OPTICS (Ordering Points To Identify the Clustering
Structure) is a generalization of DBSCAN that handles
different densities much better.

[Figure: density-based clustering with DBSCAN. DBSCAN assumes
clusters of similar density, and may have problems separating
nearby clusters.]
[Figure: OPTICS is a DBSCAN variant that handles different
densities much better.]
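
The DBSCAN idea can be sketched compactly in plain Python. The parameter names `eps` (neighbourhood radius) and `min_pts` (minimum number of neighbours for a dense region) follow common usage, but the function itself and the sample points are a hypothetical illustration rather than a reference implementation.

```python
from math import dist

def dbscan(points, eps, min_pts):
    """Returns one label per point: 0, 1, ... for clusters, -1 for noise."""
    NOISE, UNVISITED = -1, None
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # All points within eps of point i (including i itself).
        return [j for j in range(len(points))
                if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is a core point: expand through it
                queue.extend(nbrs)
    return labels

# Hypothetical observations: two dense groups and one isolated point.
points = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.2),
          (5.0, 5.0), (5.1, 5.2), (4.9, 5.1),
          (9.0, 9.0)]
labels = dbscan(points, eps=0.5, min_pts=3)
print(labels)
```

The isolated point lies in a sparse area and is labelled as noise, while the number of clusters is found from the data rather than fixed beforehand.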

1. Forming the clusters from the given data set, resulting
in a new variable that identifies cluster membership among
the cases (one-phase cluster)
2. Describing the clusters by re-crossing them with the data
(two-phase cluster)
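
The second step can be illustrated with a short Python sketch that takes the cluster membership variable from the first step and crosses it back with the original variables to describe each cluster. The function name `describe_clusters`, the variable names `x1` and `x2`, the observations and the membership labels are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

def describe_clusters(records, labels):
    """Summarise each cluster by its size and the mean of every variable."""
    groups = defaultdict(list)
    for record, label in zip(records, labels):
        groups[label].append(record)
    summary = {}
    for label, members in groups.items():
        summary[label] = {
            "size": len(members),
            "means": {var: mean(r[var] for r in members)
                      for var in members[0]},
        }
    return summary

# Hypothetical observations (step 1 input) ...
records = [
    {"x1": 1.0, "x2": 10.0},
    {"x1": 2.0, "x2": 12.0},
    {"x1": 8.0, "x2": 30.0},
    {"x1": 9.0, "x2": 34.0},
]
# ... and a hypothetical step 1 result: the new cluster membership variable.
labels = [0, 0, 1, 1]

summary = describe_clusters(records, labels)
for label, info in summary.items():
    print(label, info)
```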

[Figure: clustering phases for value added products]
VALUE ADDED PRODUCTS
  FISH CUTLET, FISH FINGER, FISH BURGER
  (One phase cluster: forming of clusters by the chosen data set)
FISH CUTLET
  Seer fish, Mackerel (Two phase cluster)
  Baked, Fried (Third phase cluster)

Advantages:
• Cuts down the cost of preparing a sampling frame and
other administrative factors.
• No special scales of measurement are necessary.
• The visual graphic provides a clear understanding of the
clusters.
Disadvantages:
• The choice of cluster-forming variables is often not based on
theory but made at random.
• In some cases, the clusters are difficult to determine.

Marketing: Helps marketers discover distinct groups in their
customer bases, and then use this knowledge to develop targeted
marketing programs
Land use: Identifying areas of similar land use in an earth
observation database
Insurance: Identifying groups of motor insurance policy holders
with a high average claim cost
City planning: Identifying groups of houses according to their
house type, value, and geographical location
Earthquake studies: Observed earthquake epicenters are expected
to cluster along continental faults
