Chapter 3
Data Mining Techniques
3.1 Introduction
• Parametric models describe the relationship between input and output
through algebraic equations in which some parameters are left unspecified.
These unspecified parameters are determined from input examples.
• Nonparametric techniques are more appropriate for data mining
applications. A nonparametric model is one that is data-driven. Recent
techniques are able to learn dynamically as data are added to the input.
This dynamic learning process allows the model to be created
continuously. The more data, the better the model.
• Nonparametric techniques are particularly suitable for database
applications with large amounts of dynamically changing data.
Nonparametric techniques include neural networks, decision trees, and
genetic algorithms.
3.2 Statistical Perspective
3.2.1 Point Estimation
• The bias of an estimator is the difference between the expected value of the estimator and the actual value. Letting E(Θ̂) denote the
expected value of the estimate Θ̂:
  Bias = E(Θ̂ − Θ) = E(Θ̂) − Θ
• One measure of the effectiveness of an estimate is the mean squared error (MSE), which is the expected value of the squared difference
between the estimate and the actual value:
  MSE = E[(Θ̂ − Θ)²]
• The root mean square error (RMSE) is found by taking the square root of the MSE.
• The root mean square (RMS) may also be used to estimate error or as another statistic to describe a distribution. Unlike the mean, it
does indicate the magnitude of the values:
  RMS = √( Σ_{j=1}^{n} x_j² / n )
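These point-estimation statistics translate directly into code. The sketch below uses a hypothetical sample of estimates, with the expectation approximated by averaging over that sample:

```python
import math

def bias(estimates, actual):
    """Bias: the mean of the estimates minus the actual value."""
    return sum(estimates) / len(estimates) - actual

def mse(estimates, actual):
    """Mean squared error: average squared difference from the actual value."""
    return sum((e - actual) ** 2 for e in estimates) / len(estimates)

def rms(values):
    """Root mean square: square root of the mean of the squared values."""
    return math.sqrt(sum(x * x for x in values) / len(values))

estimates = [4.8, 5.1, 5.3, 4.9]   # hypothetical estimates of a parameter
actual = 5.0
print(bias(estimates, actual))                 # a small positive bias
print(mse(estimates, actual))
print(math.sqrt(mse(estimates, actual)))       # RMSE: square root of the MSE
print(rms([3, 4]))
```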
• A popular estimating technique is the jackknife estimate. With this approach, the estimate of a parameter, Θ, is obtained by
omitting one value from the set of observed values. Given the set of jackknife estimates, Θ̂(i), we can obtain an overall estimate:
  Θ̂(·) = Σ_{i=1}^{n} Θ̂(i) / n
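A minimal sketch of the jackknife, assuming the statistic of interest is the mean (for which the overall jackknife estimate coincides with the sample mean); the data values are hypothetical:

```python
def jackknife_estimates(values, stat):
    """Compute the statistic on each leave-one-out subsample."""
    return [stat(values[:i] + values[i + 1:]) for i in range(len(values))]

def jackknife_overall(values, stat):
    """Overall jackknife estimate: the mean of the leave-one-out estimates."""
    ests = jackknife_estimates(values, stat)
    return sum(ests) / len(ests)

mean = lambda xs: sum(xs) / len(xs)
data = [1.0, 2.0, 3.0, 4.0]
print(jackknife_estimates(data, mean))  # [3.0, 2.667, 2.333, 2.0] approximately
print(jackknife_overall(data, mean))    # 2.5, the sample mean for this statistic
```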
• We can also determine a range of values within which the true parameter value should fall. This range is called a confidence interval.
3.2.2 Estimation and Summarization Models
• Maximum likelihood estimation (MLE) is a technique for point estimation. The approach obtains parameter estimates that maximize the
probability that the sample data X = {x_1, …, x_n} occur for the specific model f(x_i|Θ). The likelihood function is thus defined as
  L(Θ|x_1, …, x_n) = ∏_{i=1}^{n} f(x_i|Θ)
The value Θ̂ that maximizes L is the estimate chosen. This can be found by taking the derivative of L with respect to Θ and setting it to zero.
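A small illustration of MLE, using a hypothetical sample and a normal model with known variance; instead of solving the derivative analytically, this sketch maximizes the log-likelihood over a grid of candidate means:

```python
import math

def log_likelihood(mu, data, sigma=1.0):
    """ln L(mu | data) for a normal model with known standard deviation."""
    return sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (x - mu) ** 2 / (2 * sigma ** 2) for x in data)

data = [4.9, 5.2, 5.0, 5.3, 4.6]            # hypothetical sample
candidates = [i / 1000 for i in range(4000, 6000)]
mle = max(candidates, key=lambda mu: log_likelihood(mu, data))
print(mle)  # the grid maximum lands on the sample mean, 5.0
```

For the normal mean this reproduces the closed-form answer (the sample mean); the grid search stands in for taking the derivative and setting it to zero.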
• The expectation-maximization (EM) algorithm can solve the estimation problem with incomplete data. The EM algorithm finds an
MLE for a parameter (such as a mean) using a two-step process: estimation and maximization. These steps are applied iteratively
until successive parameter estimates converge. Such iterative estimates must satisfy
  ∂ln L(Θ|X) / ∂θ_i = 0
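A toy EM sketch for estimating a mean when some values are missing; the data are hypothetical. The E-step imputes each missing value with the current mean estimate, and the M-step recomputes the MLE of the mean over observed plus imputed values, iterating until successive estimates converge:

```python
def em_mean(observed, n_missing, tol=1e-9):
    """Estimate a mean when n_missing values are unobserved."""
    n = len(observed) + n_missing
    mu = sum(observed) / len(observed)   # initial guess from the observed data
    while True:
        filled_total = sum(observed) + n_missing * mu  # E-step: impute missing
        new_mu = filled_total / n                      # M-step: recompute mean
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu

print(em_mean([1.0, 5.0, 10.0, 4.0], 2))  # converges to 5.0
```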
• Models based on summarization provide an abstraction and summarization of the data as a whole. Well-known statistical
concepts such as mean, variance, standard deviation, median, and mode are simple models of the underlying population. Fitting the
population to a specific frequency distribution provides an even better model of the data.
• Visualization techniques help to display the structure of the data graphically (histograms, box plots, scatter diagrams).
3.2.3 Bayes Theorem
• Bayes rule is a technique to estimate the likelihood of a property given a set of data as evidence or input.
Suppose that either hypothesis h1 or hypothesis h2 must occur and x_i is an observable event; Bayes rule
states
  P(h1|x_i) = P(x_i|h1) P(h1) / ( P(x_i|h1) P(h1) + P(x_i|h2) P(h2) )
• P(h1|x_i) is called the posterior probability, while P(h1) is the prior probability associated with hypothesis h1. P(x_i)
is the probability of the occurrence of data value x_i, and P(x_i|h1) is the conditional probability that, given a
hypothesis, the tuple satisfies it. Bayes rule allows us to assign probabilities P(h_j|x_i) to hypotheses given a data value:
  P(h1|x_i) = P(x_i|h1) P(h1) / P(x_i)
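The two-hypothesis form of Bayes rule translates directly into code; the priors and likelihoods below are made-up numbers for illustration:

```python
def posterior(p_x_given_h1, p_h1, p_x_given_h2, p_h2):
    """Posterior P(h1|x) via Bayes rule for two exhaustive hypotheses."""
    num = p_x_given_h1 * p_h1
    # The denominator is P(x), expanded over the two hypotheses.
    return num / (num + p_x_given_h2 * p_h2)

# Hypothetical values: prior belief 0.6 vs 0.4, and how likely the
# observed event x is under each hypothesis.
print(posterior(0.9, 0.6, 0.3, 0.4))  # about 0.818: evidence favors h1
```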
3.2.4 Hypothesis Testing
• Hypothesis testing helps to determine whether a set of observed variable values is statistically significant (differs
from the expected case). This approach attempts to explain the observed data by testing a hypothesis against it. A
hypothesis is first made; then the observed values are compared, based on this hypothesis, to those of the
expected case. Assuming that O represents the observed data and E the expected values based on the
hypothesis, the chi-squared statistic, χ², is defined as:
  χ² = Σ (O − E)² / E
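The chi-squared statistic is a direct sum over categories; the observed and expected counts below are hypothetical:

```python
def chi_squared(observed, expected):
    """Chi-squared statistic: sum of (O - E)^2 / E over matching categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [50, 30, 20]   # hypothetical observed counts per category
expected = [40, 40, 20]   # expected counts under the hypothesis
print(chi_squared(observed, expected))  # 5.0
```

A large value relative to the chi-squared distribution for the given degrees of freedom suggests the observed data differ significantly from the expected case.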
3.2.5 Correlations and Regression
• Linear regression assumes that a linear relationship exists between the input and the output data.
The common formula for a linear relationship is:
  y = c_0 + c_1 x_1 + ⋯ + c_n x_n
• There are n input variables, which are called predictors or regressors; one output variable being
predicted (called a response); and n + 1 constants, which are chosen to fit the model to the input
sample. This is called multiple linear regression because there is more than one predictor.
• Both bivariate regression and correlation can be used to evaluate the strength of a relationship
between two variables.
• One standard formula to measure linear correlation is the correlation coefficient r ∈ [−1, 1]. Here
negative correlation indicates that one variable increases while the other decreases:
  r = Σ (x_i − X̄)(y_i − Ȳ) / √( Σ (x_i − X̄)² Σ (y_i − Ȳ)² )
• When two data variables have a strong correlation, they are similar. Thus, the correlation
coefficient can be used to define similarity for clustering or classification.
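The correlation coefficient can be computed directly from its formula; the two small series below are hypothetical:

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r in [-1, 1]."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) *
                    sum((y - my) ** 2 for y in ys))
    return num / den

print(correlation([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0: perfect positive
print(correlation([1, 2, 3, 4], [8, 6, 4, 2]))  # -1.0: perfect negative
```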
3.3 Similarity Measures
Those tuples that answer the query should be more like each other than those that do not answer it. Each IR query provides the class
definition in the form of the IR query itself. The classification problem then becomes one of determining similarity between each tuple
and the query, an O(n) problem rather than an O(n²) one. Common similarity measures used:
• Dice: sim(t_i, t_j) = 2 Σ_{h=1}^{k} t_ih t_jh / ( Σ_{h=1}^{k} t_ih² + Σ_{h=1}^{k} t_jh² )
  relates the overlap to the average size of the two sets together.
• Jaccard: sim(t_i, t_j) = Σ_{h=1}^{k} t_ih t_jh / ( Σ_{h=1}^{k} t_ih² + Σ_{h=1}^{k} t_jh² − Σ_{h=1}^{k} t_ih t_jh )
  measures the overlap of two sets as related to the whole set caused by their union.
• Cosine: sim(t_i, t_j) = Σ_{h=1}^{k} t_ih t_jh / √( Σ_{h=1}^{k} t_ih² · Σ_{h=1}^{k} t_jh² )
  relates the overlap to the geometric average of the two sets.
• Overlap: sim(t_i, t_j) = Σ_{h=1}^{k} t_ih t_jh / min( Σ_{h=1}^{k} t_ih², Σ_{h=1}^{k} t_jh² )
  determines the degree to which the two sets overlap.
Distance or dissimilarity measures are often used instead of similarity measures. These measure how unlike items are.
• Euclidean: dis(t_i, t_j) = √( Σ_{h=1}^{k} (t_ih − t_jh)² )
• Manhattan: dis(t_i, t_j) = Σ_{h=1}^{k} |t_ih − t_jh|
Since most similarity measures assume numeric (and often discrete) values, they may be difficult to use for general data types. A
mapping from the attribute domain to a subset of the integers may be used, together with some approach to determining the difference.
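The similarity and distance measures above can be sketched for numeric vectors; the tuples t1 and t2 below are hypothetical binary vectors:

```python
import math

def dot(ti, tj):
    return sum(a * b for a, b in zip(ti, tj))

def sq(ti):
    return sum(a * a for a in ti)

def dice(ti, tj):
    return 2 * dot(ti, tj) / (sq(ti) + sq(tj))

def jaccard(ti, tj):
    d = dot(ti, tj)
    return d / (sq(ti) + sq(tj) - d)

def cosine(ti, tj):
    return dot(ti, tj) / math.sqrt(sq(ti) * sq(tj))

def overlap(ti, tj):
    return dot(ti, tj) / min(sq(ti), sq(tj))

def euclidean(ti, tj):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ti, tj)))

def manhattan(ti, tj):
    return sum(abs(a - b) for a, b in zip(ti, tj))

t1, t2 = [1, 0, 1, 1], [1, 1, 0, 1]
print(dice(t1, t2), jaccard(t1, t2), cosine(t1, t2), overlap(t1, t2))
print(euclidean(t1, t2), manhattan(t1, t2))
```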
3.4 Decision Trees
A decision tree (DT) is a predictive modeling technique used in classification,
clustering, and prediction. A computational DT model consists of three parts:
• A decision tree
• An algorithm to create the tree
• An algorithm that applies the tree to data and solves the problem under
consideration (its complexity depends on the product of the number of levels
and the maximum branching factor).
Most decision tree techniques differ in how the tree is created. An algorithm
examines data from a training sample with known classification values in
order to build the tree, or the tree could be constructed by a domain expert.
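A minimal sketch of the third part, applying a tree to data. The tree layout, attributes, and thresholds below are hypothetical, with binary splits so the cost of classifying one tuple is proportional to the number of levels:

```python
# Each internal node: (attribute_index, threshold, left_subtree, right_subtree);
# each leaf is a class label. Values <= threshold descend to the left subtree.
tree = (0, 30.0,                 # split on attribute 0 (hypothetical field)
        "class_A",               # attribute 0 <= 30.0
        (1, 100.0,               # otherwise split on attribute 1
         "class_B",
         "class_C"))

def classify(tree, tuple_):
    """Walk from the root to a leaf, testing one attribute per level."""
    while not isinstance(tree, str):
        attr, threshold, left, right = tree
        tree = left if tuple_[attr] <= threshold else right
    return tree

print(classify(tree, (25.0, 50.0)))   # class_A
print(classify(tree, (40.0, 150.0)))  # class_C
```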
3.5 Neural Networks
• The NN can be viewed as a directed graph F = (V, A)
consisting of vertices and arcs. The vertices are
partitioned into source (input), sink (output), and
internal (hidden) nodes; every arc (i, j) is labeled
with a numeric weight w_ij; every node i is labeled with
a function f_i. The NN as an information-processing
system consists of a directed graph and various
algorithms that access the graph.
• NNs usually work only with numeric data.
• Artificial NNs can be classified, based on the type of
connectivity and learning, into feed-forward or
feedback, with supervised or unsupervised learning.
• Unlike decision trees, after a tuple is processed, the
NN may be changed to improve future performance.
• NNs have long training times and thus are not
appropriate for real-time applications. NNs can be
used in massively parallel systems.
Activation Functions
The output of each node i in the NN is based on the definition of an activation function f_i
associated with it. The activation f_i is applied to the input values x_1i, ⋯, x_ki and weights
w_1i, ⋯, w_ki. The inputs are usually combined in a sum-of-products form S = Σ_h w_hi x_hi.
The following are alternative definitions for the activation function f_i(S) at node i:
• Linear: f_i(S) = cS
• Threshold or step: f_i(S) = 1 if S > T, 0 otherwise
• Sigmoid: f_i(S) = 1 / (1 + e^(−cS)). This function possesses a simple derivative:
  ∂f_i/∂S = c f_i (1 − f_i)
• Hyperbolic tangent: f_i(S) = (1 − e^(−cS)) / (1 + e^(−cS))
• Gaussian: f_i(S) = e^(−S²/v)
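The activation functions can be written directly; the weights and inputs used to form S below are hypothetical:

```python
import math

def linear(S, c=1.0):
    return c * S

def step(S, T=0.0):
    return 1.0 if S > T else 0.0

def sigmoid(S, c=1.0):
    return 1.0 / (1.0 + math.exp(-c * S))

def tanh_act(S, c=1.0):
    return (1.0 - math.exp(-c * S)) / (1.0 + math.exp(-c * S))

def gaussian(S, v=1.0):
    return math.exp(-S * S / v)

# Sum-of-products combination of hypothetical weights and inputs.
weights, inputs = [0.5, -0.2, 0.1], [1.0, 2.0, 3.0]
S = sum(w * x for w, x in zip(weights, inputs))  # S = 0.4
print(step(S), sigmoid(S), tanh_act(S), gaussian(S))
```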
3.6 Genetic Algorithms
• Initially, a population of individuals, P, is created, typically at random. From this
population, a new population P′ of the same size is created. The algorithm repeatedly selects
individuals from which to create new ones. These parents, (i1, i2), are then used to produce
offspring or children, (o1, o2), using a crossover process. Then mutants may be generated. The
process continues until the new population satisfies the termination condition.
• A fitness function f is used to determine the best individuals in a population. It is then used in the
selection process to choose parents to keep. Given an objective by which the population can be
measured, the fitness function indicates how well the goodness objective is being met by an
individual.
• The simplest selection process is to select individuals based on their fitness. Here p_{I_i}
is the probability of selecting individual I_i. This type of selection is called roulette wheel selection:
  p_{I_i} = f(I_i) / Σ_{I_j ∈ P} f(I_j)
• A genetic algorithm (GA) is a computational model consisting of five parts: 1) a starting set, 2) a
crossover technique, 3) a mutation algorithm, 4) a fitness function, and 5) the GA algorithm itself.
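A sketch of roulette wheel selection under these definitions; the population, fitness values, and trial count are hypothetical:

```python
import random

def roulette_select(population, fitness, rng=random):
    """Select one individual with probability f(I_i) / sum of f over P."""
    total = sum(fitness(ind) for ind in population)
    r = rng.uniform(0, total)       # spin the wheel
    cumulative = 0.0
    for ind in population:
        cumulative += fitness(ind)  # each slot's width is its fitness
        if r <= cumulative:
            return ind
    return population[-1]           # guard against floating-point round-off

pop = ["a", "b", "c"]
fit = {"a": 1.0, "b": 2.0, "c": 7.0}.get
random.seed(42)
counts = {ind: 0 for ind in pop}
for _ in range(10000):
    counts[roulette_select(pop, fit)] += 1
print(counts)  # selection frequencies roughly proportional to fitness
```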
References:
Dunham, Margaret H. Data Mining: Introductory and Advanced Topics. Pearson Education, Inc., 2003.

More Related Content

What's hot

2.5 backpropagation
2.5 backpropagation2.5 backpropagation
2.5 backpropagationKrish_ver2
 
Sets and disjoint sets union123
Sets and disjoint sets union123Sets and disjoint sets union123
Sets and disjoint sets union123Ankita Goyal
 
backpropagation in neural networks
backpropagation in neural networksbackpropagation in neural networks
backpropagation in neural networksAkash Goel
 
State space search
State space searchState space search
State space searchchauhankapil
 
Classes and Objects
Classes and Objects  Classes and Objects
Classes and Objects yndaravind
 
Over fitting underfitting
Over fitting underfittingOver fitting underfitting
Over fitting underfittingSivapriyaS12
 
State space search and Problem Solving techniques
State space search and Problem Solving techniquesState space search and Problem Solving techniques
State space search and Problem Solving techniquesKirti Verma
 
04 Classification in Data Mining
04 Classification in Data Mining04 Classification in Data Mining
04 Classification in Data MiningValerii Klymchuk
 
FUNCTION DEPENDENCY AND TYPES & EXAMPLE
FUNCTION DEPENDENCY  AND TYPES & EXAMPLEFUNCTION DEPENDENCY  AND TYPES & EXAMPLE
FUNCTION DEPENDENCY AND TYPES & EXAMPLEVraj Patel
 
Perceptron (neural network)
Perceptron (neural network)Perceptron (neural network)
Perceptron (neural network)EdutechLearners
 
Bias and variance trade off
Bias and variance trade offBias and variance trade off
Bias and variance trade offVARUN KUMAR
 
Mapping ER and EER Model
Mapping ER and EER ModelMapping ER and EER Model
Mapping ER and EER ModelMary Brinda
 
1.2 steps and functionalities
1.2 steps and functionalities1.2 steps and functionalities
1.2 steps and functionalitiesKrish_ver2
 
Data preprocessing in Data Mining
Data preprocessing in Data MiningData preprocessing in Data Mining
Data preprocessing in Data MiningDHIVYADEVAKI
 
2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic concepts2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic conceptsKrish_ver2
 

What's hot (20)

2.5 backpropagation
2.5 backpropagation2.5 backpropagation
2.5 backpropagation
 
Sets and disjoint sets union123
Sets and disjoint sets union123Sets and disjoint sets union123
Sets and disjoint sets union123
 
Data reduction
Data reductionData reduction
Data reduction
 
backpropagation in neural networks
backpropagation in neural networksbackpropagation in neural networks
backpropagation in neural networks
 
Data mining tasks
Data mining tasksData mining tasks
Data mining tasks
 
State space search
State space searchState space search
State space search
 
Elmasri Navathe DBMS Unit-1 ppt
Elmasri Navathe DBMS Unit-1 pptElmasri Navathe DBMS Unit-1 ppt
Elmasri Navathe DBMS Unit-1 ppt
 
Classes and Objects
Classes and Objects  Classes and Objects
Classes and Objects
 
Over fitting underfitting
Over fitting underfittingOver fitting underfitting
Over fitting underfitting
 
State space search and Problem Solving techniques
State space search and Problem Solving techniquesState space search and Problem Solving techniques
State space search and Problem Solving techniques
 
Data mining primitives
Data mining primitivesData mining primitives
Data mining primitives
 
04 Classification in Data Mining
04 Classification in Data Mining04 Classification in Data Mining
04 Classification in Data Mining
 
FUNCTION DEPENDENCY AND TYPES & EXAMPLE
FUNCTION DEPENDENCY  AND TYPES & EXAMPLEFUNCTION DEPENDENCY  AND TYPES & EXAMPLE
FUNCTION DEPENDENCY AND TYPES & EXAMPLE
 
Sementic nets
Sementic netsSementic nets
Sementic nets
 
Perceptron (neural network)
Perceptron (neural network)Perceptron (neural network)
Perceptron (neural network)
 
Bias and variance trade off
Bias and variance trade offBias and variance trade off
Bias and variance trade off
 
Mapping ER and EER Model
Mapping ER and EER ModelMapping ER and EER Model
Mapping ER and EER Model
 
1.2 steps and functionalities
1.2 steps and functionalities1.2 steps and functionalities
1.2 steps and functionalities
 
Data preprocessing in Data Mining
Data preprocessing in Data MiningData preprocessing in Data Mining
Data preprocessing in Data Mining
 
2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic concepts2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic concepts
 

Viewers also liked

05 Clustering in Data Mining
05 Clustering in Data Mining05 Clustering in Data Mining
05 Clustering in Data MiningValerii Klymchuk
 
Data Mining Techniques
Data Mining TechniquesData Mining Techniques
Data Mining TechniquesHouw Liong The
 
Chapter 10 Data Mining Techniques
 Chapter 10 Data Mining Techniques Chapter 10 Data Mining Techniques
Chapter 10 Data Mining TechniquesHouw Liong The
 
Chapter 08 Data Mining Techniques
Chapter 08 Data Mining Techniques Chapter 08 Data Mining Techniques
Chapter 08 Data Mining Techniques Houw Liong The
 
01 Introduction to Data Mining
01 Introduction to Data Mining01 Introduction to Data Mining
01 Introduction to Data MiningValerii Klymchuk
 
Artificial Intelligence for Automated Decision Support Project
Artificial Intelligence for Automated Decision Support ProjectArtificial Intelligence for Automated Decision Support Project
Artificial Intelligence for Automated Decision Support ProjectValerii Klymchuk
 
Data mining tools (R , WEKA, RAPID MINER, ORANGE)
Data mining tools (R , WEKA, RAPID MINER, ORANGE)Data mining tools (R , WEKA, RAPID MINER, ORANGE)
Data mining tools (R , WEKA, RAPID MINER, ORANGE)Krishna Petrochemicals
 
Data Mining: Application and trends in data mining
Data Mining: Application and trends in data miningData Mining: Application and trends in data mining
Data Mining: Application and trends in data miningDataminingTools Inc
 
Data Mining and Business Intelligence Tools
Data Mining and Business Intelligence ToolsData Mining and Business Intelligence Tools
Data Mining and Business Intelligence ToolsMotaz Saad
 
Data Mining Concepts
Data Mining ConceptsData Mining Concepts
Data Mining ConceptsDung Nguyen
 
Monaco by Saseendranath vs
Monaco by Saseendranath vsMonaco by Saseendranath vs
Monaco by Saseendranath vsSaseendranath VS
 
Análise Pesquisa de Clima
Análise Pesquisa de ClimaAnálise Pesquisa de Clima
Análise Pesquisa de ClimaRafael Perez
 
Presentacion
PresentacionPresentacion
PresentacionSacnuaj
 
Actividad 1 teorias organizativas
Actividad 1 teorias organizativasActividad 1 teorias organizativas
Actividad 1 teorias organizativasJuan moncada
 

Viewers also liked (20)

02 Related Concepts
02 Related Concepts02 Related Concepts
02 Related Concepts
 
05 Clustering in Data Mining
05 Clustering in Data Mining05 Clustering in Data Mining
05 Clustering in Data Mining
 
Data mining
Data miningData mining
Data mining
 
Data Warehouse Project
Data Warehouse ProjectData Warehouse Project
Data Warehouse Project
 
Data Mining Techniques
Data Mining TechniquesData Mining Techniques
Data Mining Techniques
 
Chapter 10 Data Mining Techniques
 Chapter 10 Data Mining Techniques Chapter 10 Data Mining Techniques
Chapter 10 Data Mining Techniques
 
Chapter 08 Data Mining Techniques
Chapter 08 Data Mining Techniques Chapter 08 Data Mining Techniques
Chapter 08 Data Mining Techniques
 
Database Project
Database ProjectDatabase Project
Database Project
 
01 Introduction to Data Mining
01 Introduction to Data Mining01 Introduction to Data Mining
01 Introduction to Data Mining
 
Artificial Intelligence for Automated Decision Support Project
Artificial Intelligence for Automated Decision Support ProjectArtificial Intelligence for Automated Decision Support Project
Artificial Intelligence for Automated Decision Support Project
 
Data mining tools (R , WEKA, RAPID MINER, ORANGE)
Data mining tools (R , WEKA, RAPID MINER, ORANGE)Data mining tools (R , WEKA, RAPID MINER, ORANGE)
Data mining tools (R , WEKA, RAPID MINER, ORANGE)
 
Data Mining: Application and trends in data mining
Data Mining: Application and trends in data miningData Mining: Application and trends in data mining
Data Mining: Application and trends in data mining
 
Data Mining and Business Intelligence Tools
Data Mining and Business Intelligence ToolsData Mining and Business Intelligence Tools
Data Mining and Business Intelligence Tools
 
Data Mining Concepts
Data Mining ConceptsData Mining Concepts
Data Mining Concepts
 
Ghhh
GhhhGhhh
Ghhh
 
Monaco by Saseendranath vs
Monaco by Saseendranath vsMonaco by Saseendranath vs
Monaco by Saseendranath vs
 
Análise Pesquisa de Clima
Análise Pesquisa de ClimaAnálise Pesquisa de Clima
Análise Pesquisa de Clima
 
Presentacion
PresentacionPresentacion
Presentacion
 
Actividad 1 teorias organizativas
Actividad 1 teorias organizativasActividad 1 teorias organizativas
Actividad 1 teorias organizativas
 
Fishing
 Fishing Fishing
Fishing
 

Similar to Data Mining Techniques Chapter 3 Overview

Predictive analytics
Predictive analyticsPredictive analytics
Predictive analyticsDinakar nk
 
Unit 3 – AIML.pptx
Unit 3 – AIML.pptxUnit 3 – AIML.pptx
Unit 3 – AIML.pptxhiblooms
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learningAmAn Singh
 
Machine Learning.pdf
Machine Learning.pdfMachine Learning.pdf
Machine Learning.pdfBeyaNasr1
 
Ability Study of Proximity Measure for Big Data Mining Context on Clustering
Ability Study of Proximity Measure for Big Data Mining Context on ClusteringAbility Study of Proximity Measure for Big Data Mining Context on Clustering
Ability Study of Proximity Measure for Big Data Mining Context on ClusteringKamleshKumar394
 
Elements of Statistical Learning 読み会 第2章
Elements of Statistical Learning 読み会 第2章Elements of Statistical Learning 読み会 第2章
Elements of Statistical Learning 読み会 第2章Tsuyoshi Sakama
 
Cluster Analysis
Cluster Analysis Cluster Analysis
Cluster Analysis Baivab Nag
 
machine learning for engineering students
machine learning for engineering studentsmachine learning for engineering students
machine learning for engineering studentsKavitabani1
 
Lect 3 background mathematics
Lect 3 background mathematicsLect 3 background mathematics
Lect 3 background mathematicshktripathy
 
L1 statistics
L1 statisticsL1 statistics
L1 statisticsdapdai
 
Dimensionality Reduction.pptx
Dimensionality Reduction.pptxDimensionality Reduction.pptx
Dimensionality Reduction.pptxPriyadharshiniG41
 
Deep learning MindMap
Deep learning MindMapDeep learning MindMap
Deep learning MindMapAshish Patel
 
Lect 3 background mathematics for Data Mining
Lect 3 background mathematics for Data MiningLect 3 background mathematics for Data Mining
Lect 3 background mathematics for Data Mininghktripathy
 

Similar to Data Mining Techniques Chapter 3 Overview (20)

Predictive analytics
Predictive analyticsPredictive analytics
Predictive analytics
 
Unit 3 – AIML.pptx
Unit 3 – AIML.pptxUnit 3 – AIML.pptx
Unit 3 – AIML.pptx
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learning
 
Machine Learning.pdf
Machine Learning.pdfMachine Learning.pdf
Machine Learning.pdf
 
Ability Study of Proximity Measure for Big Data Mining Context on Clustering
Ability Study of Proximity Measure for Big Data Mining Context on ClusteringAbility Study of Proximity Measure for Big Data Mining Context on Clustering
Ability Study of Proximity Measure for Big Data Mining Context on Clustering
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
Elements of Statistical Learning 読み会 第2章
Elements of Statistical Learning 読み会 第2章Elements of Statistical Learning 読み会 第2章
Elements of Statistical Learning 読み会 第2章
 
Cluster Analysis
Cluster Analysis Cluster Analysis
Cluster Analysis
 
machine learning for engineering students
machine learning for engineering studentsmachine learning for engineering students
machine learning for engineering students
 
ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptx
 
Unit-3 Data Analytics.pdf
Unit-3 Data Analytics.pdfUnit-3 Data Analytics.pdf
Unit-3 Data Analytics.pdf
 
Unit-3 Data Analytics.pdf
Unit-3 Data Analytics.pdfUnit-3 Data Analytics.pdf
Unit-3 Data Analytics.pdf
 
Unit-3 Data Analytics.pdf
Unit-3 Data Analytics.pdfUnit-3 Data Analytics.pdf
Unit-3 Data Analytics.pdf
 
Lect 3 background mathematics
Lect 3 background mathematicsLect 3 background mathematics
Lect 3 background mathematics
 
L1 statistics
L1 statisticsL1 statistics
L1 statistics
 
Dimensionality Reduction.pptx
Dimensionality Reduction.pptxDimensionality Reduction.pptx
Dimensionality Reduction.pptx
 
Deep learning MindMap
Deep learning MindMapDeep learning MindMap
Deep learning MindMap
 
Lect 3 background mathematics for Data Mining
Lect 3 background mathematics for Data MiningLect 3 background mathematics for Data Mining
Lect 3 background mathematics for Data Mining
 
Multiple linear regression
Multiple linear regressionMultiple linear regression
Multiple linear regression
 
Ai saturdays presentation
Ai saturdays presentationAi saturdays presentation
Ai saturdays presentation
 

More from Valerii Klymchuk

Sample presentation slides template
Sample presentation slides templateSample presentation slides template
Sample presentation slides templateValerii Klymchuk
 
Crime Analysis based on Historical and Transportation Data
Crime Analysis based on Historical and Transportation DataCrime Analysis based on Historical and Transportation Data
Crime Analysis based on Historical and Transportation DataValerii Klymchuk
 

More from Valerii Klymchuk (7)

Sample presentation slides template
Sample presentation slides templateSample presentation slides template
Sample presentation slides template
 
Toronto Capstone
Toronto CapstoneToronto Capstone
Toronto Capstone
 
03 Data Representation
03 Data Representation03 Data Representation
03 Data Representation
 
05 Scalar Visualization
05 Scalar Visualization05 Scalar Visualization
05 Scalar Visualization
 
06 Vector Visualization
06 Vector Visualization06 Vector Visualization
06 Vector Visualization
 
07 Tensor Visualization
07 Tensor Visualization07 Tensor Visualization
07 Tensor Visualization
 
Crime Analysis based on Historical and Transportation Data
Crime Analysis based on Historical and Transportation DataCrime Analysis based on Historical and Transportation Data
Crime Analysis based on Historical and Transportation Data
 

Recently uploaded

专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 

Recently uploaded (20)

专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Advanced Machine Learning for Business Professionals

Data Mining Techniques Chapter 3 Overview

  • 2. 3.1 Introduction
  • Parametric models describe the relationship between input and output through algebraic equations in which some parameters are not specified. These unspecified parameters are determined from input examples.
  • Nonparametric techniques are more appropriate for data mining applications. A nonparametric model is data-driven: recent techniques are able to learn dynamically as data are added to the input, so the model is refined continuously. The more data, the better the model.
  • Nonparametric techniques are particularly suitable for database applications with large amounts of dynamically changing data. Nonparametric techniques include neural networks, decision trees, and genetic algorithms.
  • 3. 3.2 Statistical Perspective: Point Estimation
  • The bias of an estimator is the difference between the expected value of the estimator and the actual value. Letting E[\hat{\Theta}] denote the expected value of the estimator \hat{\Theta},
    \mathrm{Bias} = E[\hat{\Theta}] - \Theta
  • One measure of the effectiveness of an estimate is the mean squared error (MSE), which is the expected value of the squared difference between the estimate and the actual value:
    \mathrm{MSE} = E[(\hat{\Theta} - \Theta)^2]
  • The root mean square error (RMSE) is found by taking the square root of the MSE.
  • The root mean square (RMS) may also be used to estimate error or as another statistic to describe a distribution. Unlike the mean, it does indicate the magnitude of the values:
    \mathrm{RMS} = \sqrt{\frac{\sum_{j=1}^{n} x_j^2}{n}}
  • A popular estimation technique is the jackknife estimate. With this approach, the estimate \hat{\Theta}_{(i)} of a parameter \Theta is obtained by omitting the i-th value from the set of observed values. Given the set of jackknife estimates \hat{\Theta}_{(i)}, we can obtain an overall estimate:
    \hat{\Theta}_{(\cdot)} = \frac{\sum_{i=1}^{n} \hat{\Theta}_{(i)}}{n}
  • We may also determine a range of values within which the true parameter value should fall. This range is called a confidence interval.
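The jackknife procedure above can be sketched in a few lines of Python. This is a minimal illustration only: the function names are mine, and the mean is chosen as the parameter being estimated (for the mean, the overall jackknife estimate coincides with the sample mean).

```python
import statistics

def jackknife_estimates(values):
    """Leave-one-out estimates: Theta_(i) is computed with value i omitted."""
    n = len(values)
    return [statistics.mean(values[:i] + values[i + 1:]) for i in range(n)]

def jackknife_overall(values):
    """Overall estimate Theta_(.): the average of the leave-one-out estimates."""
    est = jackknife_estimates(values)
    return sum(est) / len(est)

data = [2.0, 4.0, 6.0, 8.0]
print(jackknife_estimates(data))   # [6.0, 5.333..., 4.666..., 4.0]
print(jackknife_overall(data))     # 5.0, the sample mean
```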
  • 4. 3.2.2 Estimation and Summarization Models
  • Maximum likelihood estimation (MLE) is a technique for point estimation. The approach obtains parameter estimates that maximize the probability that the sample data X = \{x_1, \ldots, x_n\} occur for the specific model f(x_i | \Theta). The likelihood function is thus defined as
    L(\Theta | x_1, \ldots, x_n) = \prod_{i=1}^{n} f(x_i | \Theta)
    The value of \Theta that maximizes L is the estimate chosen. It can be found by taking the derivative of L with respect to \Theta and setting it to zero.
  • The expectation-maximization (EM) algorithm can solve the estimation problem with incomplete data. The EM algorithm finds an MLE for a parameter (such as a mean) using a two-step process, estimation and maximization, applied iteratively until successive parameter estimates converge. Such iterative estimates must satisfy
    \frac{\partial \ln L(\Theta | X)}{\partial \theta_i} = 0
  • Models based on summarization provide an abstraction and summarization of the data as a whole. Well-known statistical concepts such as the mean, variance, standard deviation, median, and mode are simple models of the underlying population. Fitting the population to a specific frequency distribution provides an even better model of the data.
  • Visualization techniques help display the structure of the data graphically (histograms, box plots, scatter diagrams).
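A small MLE sketch for concreteness. The exponential model f(x | Θ) = Θe^{−Θx} and the grid search (standing in for setting the derivative of ln L to zero) are illustrative choices of mine, not from the slides; for this model the analytic MLE is 1 over the sample mean.

```python
import math

def log_likelihood(theta, xs):
    """ln L(theta | x_1..x_n) for the exponential model f(x|theta) = theta*exp(-theta*x)."""
    return sum(math.log(theta) - theta * x for x in xs)

def mle_grid(xs, candidates):
    """Pick the candidate theta that maximizes the likelihood."""
    return max(candidates, key=lambda t: log_likelihood(t, xs))

xs = [0.5, 1.0, 1.5, 2.0]                     # sample mean 1.25
grid = [i / 100 for i in range(1, 301)]       # theta candidates 0.01 .. 3.00
print(mle_grid(xs, grid))                     # 0.8, which is 1 / 1.25
```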
  • 5. 3.2.3 Bayes Theorem
  • Bayes rule is a technique to estimate the likelihood of a property given a set of data as evidence or input. Suppose that either hypothesis h_1 or hypothesis h_2 must occur and that x_i is an observable event. Bayes rule states:
    P(h_1 | x_i) = \frac{P(x_i | h_1) P(h_1)}{P(x_i | h_1) P(h_1) + P(x_i | h_2) P(h_2)}
  • P(h_1 | x_i) is called the posterior probability, while P(h_1) is the prior probability associated with hypothesis h_1. P(x_i) is the probability of the occurrence of data value x_i, and P(x_i | h_1) is the conditional probability that, given hypothesis h_1, the tuple satisfies it. Bayes rule allows us to assign probabilities P(h_j | x_i) to hypotheses given a data value:
    P(h_1 | x_i) = \frac{P(x_i | h_1) P(h_1)}{P(x_i)}
  • Hypothesis testing helps determine whether a set of observed variable values is statistically significant (differs from the expected case). This approach explains the observed data by testing a hypothesis against it. A hypothesis is first made; then the observed values are compared, based on this hypothesis, to those of the expected case. Assuming that O represents the observed data and E the expected values based on the hypothesis, the chi-squared statistic is defined as:
    \chi^2 = \sum \frac{(O - E)^2}{E}
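The two-hypothesis form of Bayes rule above translates directly into code. The function name and the numeric priors and likelihoods below are illustrative only.

```python
def posterior_h1(p_h1, p_h2, p_x_given_h1, p_x_given_h2):
    """Bayes rule for two exhaustive hypotheses:
    P(h1|x) = P(x|h1)P(h1) / (P(x|h1)P(h1) + P(x|h2)P(h2))."""
    num = p_x_given_h1 * p_h1
    return num / (num + p_x_given_h2 * p_h2)

# Priors 0.6 / 0.4 and likelihoods 0.2 / 0.5 give
# 0.12 / (0.12 + 0.20) = 0.375.
print(posterior_h1(0.6, 0.4, 0.2, 0.5))
```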
  • 6. 3.2.5 Correlations and Regression
  • Linear regression assumes that a linear relationship exists between the input and the output data. The common formula for a linear relationship is:
    y = c_0 + c_1 x_1 + \cdots + c_n x_n
  • There are n input variables, called predictors or regressors; one output variable being predicted, called a response; and n + 1 constants, chosen to fit the model to the input sample. This is called multiple linear regression because there is more than one predictor.
  • Both bivariate regression and correlation can be used to evaluate the strength of a relationship between two variables.
  • One standard formula to measure linear correlation is the correlation coefficient r ∈ [−1, 1], where a negative correlation indicates that one variable increases while the other decreases:
    r = \frac{\sum (x_i - \bar{X})(y_i - \bar{Y})}{\sqrt{\sum (x_i - \bar{X})^2 \sum (y_i - \bar{Y})^2}}
  • When two data variables have a strong correlation, they are similar. Thus, the correlation coefficient can be used to define similarity for clustering or classification.
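The correlation coefficient formula can be computed directly; this sketch implements it term by term (the function name is mine).

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r in [-1, 1]."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) *
                    sum((y - my) ** 2 for y in ys))
    return num / den

print(correlation([1, 2, 3, 4], [2, 4, 6, 8]))   # 1.0  (perfect positive)
print(correlation([1, 2, 3, 4], [8, 6, 4, 2]))   # -1.0 (perfect negative)
```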
  • 7. 3.3 Similarity Measures
  Tuples that answer a query should be more like each other than those that do not answer it. Each IR query provides the class definition in the form of the query itself, so the classification problem becomes one of determining the similarity between each tuple and the query: an O(n) rather than an O(n^2) problem. Common similarity measures used:
  • Dice relates the overlap to the average size of the two sets together:
    sim(t_i, t_j) = \frac{2 \sum_{h=1}^{k} t_{ih} t_{jh}}{\sum_{h=1}^{k} t_{ih}^2 + \sum_{h=1}^{k} t_{jh}^2}
  • Jaccard measures the overlap of two sets as related to the whole set formed by their union:
    sim(t_i, t_j) = \frac{\sum_{h=1}^{k} t_{ih} t_{jh}}{\sum_{h=1}^{k} t_{ih}^2 + \sum_{h=1}^{k} t_{jh}^2 - \sum_{h=1}^{k} t_{ih} t_{jh}}
  • Cosine relates the overlap to the geometric average of the two sets:
    sim(t_i, t_j) = \frac{\sum_{h=1}^{k} t_{ih} t_{jh}}{\sqrt{\sum_{h=1}^{k} t_{ih}^2 \sum_{h=1}^{k} t_{jh}^2}}
  • Overlap determines the degree to which the two sets overlap:
    sim(t_i, t_j) = \frac{\sum_{h=1}^{k} t_{ih} t_{jh}}{\min\left(\sum_{h=1}^{k} t_{ih}^2, \sum_{h=1}^{k} t_{jh}^2\right)}
  Distance or dissimilarity measures are often used instead of similarity measures. These measure how unlike items are:
  • Euclidean: dis(t_i, t_j) = \sqrt{\sum_{h=1}^{k} (t_{ih} - t_{jh})^2}
  • Manhattan: dis(t_i, t_j) = \sum_{h=1}^{k} |t_{ih} - t_{jh}|
  Since most similarity measures assume numeric (and often discrete) values, they may be difficult to use for general data types. A mapping from the attribute domain to a subset of the integers may be used, along with some approach to determining the difference.
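The four similarity measures can be written as one-liners over a shared dot product. A minimal sketch, treating tuples as equal-length numeric vectors (the function names are mine):

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dice(a, b):
    return 2 * _dot(a, b) / (_dot(a, a) + _dot(b, b))

def jaccard(a, b):
    return _dot(a, b) / (_dot(a, a) + _dot(b, b) - _dot(a, b))

def cosine(a, b):
    return _dot(a, b) / math.sqrt(_dot(a, a) * _dot(b, b))

def overlap(a, b):
    return _dot(a, b) / min(_dot(a, a), _dot(b, b))

t1, t2 = [1, 1, 0], [1, 0, 1]
# dice 0.5, jaccard 1/3, cosine 0.5, overlap 0.5
print(dice(t1, t2), jaccard(t1, t2), cosine(t1, t2), overlap(t1, t2))
```

Note how, on these binary vectors, Jaccard is the strictest of the four because the union in its denominator is the largest.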
  • 8. 3.4 Decision Trees
  A decision tree (DT) is a predictive modeling technique used in classification, clustering, and prediction. A computational DT model consists of three parts:
  • A decision tree
  • An algorithm to create the tree
  • An algorithm that applies the tree to data and solves the problem under consideration (its complexity depends on the product of the number of levels and the maximum branching factor)
  Most decision tree techniques differ in how the tree is created. An algorithm examines data from a training sample with known classification values in order to build the tree, or the tree could be constructed by a domain expert.
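The third part, applying the tree to a tuple, is the simplest to sketch. The tree encoding below (nested tuples of attribute, threshold, left, right, with strings as leaf labels) and the example tree are hypothetical, chosen only to show the walk from root to leaf; the number of comparisons is bounded by the number of levels.

```python
def apply_tree(node, record):
    """Walk a decision tree: internal nodes are (attribute, threshold,
    left_subtree, right_subtree); leaves are class labels."""
    while isinstance(node, tuple):
        attr, threshold, left, right = node
        node = left if record[attr] <= threshold else right
    return node

# Hypothetical two-level tree: split on attribute 0 at 5, then attribute 1 at 2.
tree = (0, 5, "low", (1, 2, "mid", "high"))
print(apply_tree(tree, {0: 3, 1: 9}))   # "low"
print(apply_tree(tree, {0: 8, 1: 9}))   # "high"
```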
  • 9. 3.5 Neural Networks
  • A NN can be viewed as a directed graph F = (V, A) consisting of vertices and arcs. The vertices are partitioned into source (input), sink (output), and internal (hidden) nodes; every arc (i, j) is labeled with a numeric weight w_{ij}; every node i is labeled with a function f_i. As an information processing system, the NN consists of a directed graph and various algorithms that access the graph.
  • A NN usually works only with numeric data.
  • Artificial NNs can be classified, based on the type of connectivity and learning, into feed-forward or feedback networks, with supervised or unsupervised learning.
  • Unlike decision trees, after a tuple is processed the NN may be changed to improve future performance.
  • NNs have a long training time and thus are not appropriate for real-time applications. NNs can be used in massively parallel systems.
  • 10. Activation Functions
  The output of each node i in the NN is based on the definition of an activation function f_i associated with it. The activation function f_i is applied to the input values x_{1i}, \ldots, x_{ki} and weights w_{1i}, \ldots, w_{ki}. The inputs are usually combined in a sum-of-products form, S = \sum_h w_{hi} x_{hi}. The following are alternative definitions for the activation function f_i(S) at node i:
  • Linear: f_i(S) = cS
  • Threshold or step: f_i(S) = 1 if S > T, 0 otherwise
  • Sigmoid: f_i(S) = \frac{1}{1 + e^{-cS}}. For c = 1 this function possesses a simple derivative: \frac{\partial f_i}{\partial S} = f_i (1 - f_i)
  • Hyperbolic tangent: f_i(S) = \frac{1 - e^{-cS}}{1 + e^{-cS}}
  • Gaussian: f_i(S) = e^{-S^2 / v}
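The five activation functions can be written directly from their definitions. A minimal sketch; the function names and default constants (c = 1, T = 0, v = 1) are mine:

```python
import math

def linear(s, c=1.0):
    return c * s

def step(s, t=0.0):
    return 1.0 if s > t else 0.0

def sigmoid(s, c=1.0):
    return 1.0 / (1.0 + math.exp(-c * s))

def tanh_act(s, c=1.0):
    return (1.0 - math.exp(-c * s)) / (1.0 + math.exp(-c * s))

def gaussian(s, v=1.0):
    return math.exp(-s * s / v)

# Checking the sigmoid derivative identity f * (1 - f) at S = 0, c = 1:
f = sigmoid(0.0)
print(f)             # 0.5
print(f * (1 - f))   # 0.25, the slope of the sigmoid at the origin
```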
  • 12. 3.6 Genetic Algorithms
  • Initially, a population of individuals P is created, typically at random. From this population, a new population P′ of the same size is created. The algorithm repeatedly selects individuals from which to create new ones. These parents (i_1, i_2) are then used to produce offspring, or children, (o_1, o_2) using a crossover process. Mutants may then be generated. The process continues until the new population satisfies the termination condition.
  • A fitness function f is used to determine the best individuals in a population. It is then used in the selection process to choose parents. Given an objective by which the population can be measured, the fitness function indicates how well the goodness objective is being met by an individual.
  • The simplest selection process is to select individuals in proportion to their fitness, where p(I_i) is the probability of selecting individual I_i. This type of selection is called roulette wheel selection:
    p(I_i) = \frac{f(I_i)}{\sum_{I_j \in P} f(I_j)}
  • A genetic algorithm (GA) is a computational model consisting of five parts: 1) a starting set of individuals, 2) a crossover technique, 3) a mutation algorithm, 4) a fitness function, and 5) the GA algorithm itself.
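Roulette wheel selection can be sketched as spinning a wheel whose slots are sized by fitness. The function name, the toy population, and the fitness values below are illustrative assumptions; over many spins, selection frequency tracks the fitness proportions p(I_i).

```python
import random

def roulette_select(population, fitness):
    """Select one individual with probability f(I_i) / sum_j f(I_j)."""
    total = sum(fitness(ind) for ind in population)
    r = random.random() * total          # a point on the wheel
    acc = 0.0
    for ind in population:
        acc += fitness(ind)              # each slot has width f(I_i)
        if r <= acc:
            return ind
    return population[-1]                # guard against float rounding

pop = ["a", "b", "c"]
fit = {"a": 1.0, "b": 2.0, "c": 7.0}.get   # "c" holds 70% of the wheel
random.seed(0)
counts = {ind: 0 for ind in pop}
for _ in range(10000):
    counts[roulette_select(pop, fit)] += 1
print(counts)   # roughly 1000 / 2000 / 7000
```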
  • 13. References: Dunham, Margaret H. “Data Mining: Introductory and Advanced Topics”. Pearson Education, Inc., 2003.