SETTING ARTIFICIAL NEURAL NETWORK PARAMETERS
NEED FOR SETTING PARAMETER VALUES
1. LOCAL MINIMA
• w1 – global minimum
• w2, w3 – local minima
[Figure: Erms plotted against a weight w; the curve reaches its global minimum Erms(min) at w1 and has local minima at w2 and w3. The toy gradient-descent run below illustrates getting trapped.]
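A minimal sketch of the trap, assuming a made-up one-dimensional error surface with one global and one local minimum: plain gradient descent converges to whichever minimum lies downhill from its starting weight.

```python
# Made-up 1-D error surface: global minimum near w = -1.03, local minimum near w = 0.96.
E  = lambda w: w**4 - 2*w**2 + 0.3*w
dE = lambda w: 4*w**3 - 4*w + 0.3     # derivative of E

def descend(w, eta=0.05, steps=200):
    """Plain gradient descent on E starting from weight w."""
    for _ in range(steps):
        w -= eta * dE(w)
    return w

print(descend(-2.0))   # ends near the global minimum (w ~ -1.03)
print(descend(+2.0))   # ends trapped in the local minimum (w ~ 0.96)
```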
NEED FOR SETTING PARAMETER VALUES
2. LEARNING RATE
• Small learning rate – slow, lengthy learning.
• Large learning rate –
  • the output may saturate,
  • or may swing back and forth across the desired value,
  • so training may still take too long.
3. Learning improves and network training converges more reliably if the inputs and outputs are statistical, i.e. numeric, quantities.
TYPES OF TRAINING
Supervised Training
• Supplies the neural network with inputs and the desired outputs
• The response of the network to the inputs is measured
• The weights are modified to reduce the difference between the actual and desired outputs
Unsupervised Training
• Supplies only inputs
• The neural network adjusts its own weights so that similar inputs produce similar outputs
• The network identifies the patterns and differences in the inputs without any external assistance
I. INITIALISATION OF WEIGHTS
• Large initial weights drive the outputs of layer 1 into saturation.
• The network then requires a long training time to emerge from saturation.
• Weights are therefore chosen small:
  between -1 and 1, or
  between -0.5 and 0.5.
INITIALISATION OF WEIGHTS
• PROBLEM WITH THIS CHOICE:
• If some of the input parameters are very large, they dominate the output.
• e.g. x = [ 10 2 0.2 1 ]
• SOLUTION:
• Initialize the weights inversely proportional to the inputs.
• The output then depends on the total input as a whole, not on any individual parameter.
RULE FOR INITIALISATION OF WEIGHTS
• Weights between the input and the 1st layer:
  v_ij = (1/2P) ∑_{p=1..P} 1 / |x_j^(p)|
• P is the total number of input patterns.
• Weights between the 1st layer and the output layer:
  w_ij = (1/2P) ∑_{p=1..P} 1 / f(∑_j v_ij x_j^(p))
A sketch implementing the first rule follows.
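A minimal sketch of the input-layer rule, assuming the patterns are the rows of a matrix X with no zero components. Note that, read literally, v_ij does not depend on i, so every hidden unit starts with the same weight vector; in practice small random perturbations would be added on top.

```python
import numpy as np

def init_input_weights(X, n_hidden):
    """v_ij = (1/2P) * sum over patterns p of 1/|x_j^(p)| (the slide's rule).

    X: array of shape (P, n_inputs), assumed to have no zero entries."""
    P = X.shape[0]
    v_row = np.sum(1.0 / np.abs(X), axis=0) / (2 * P)   # one value per input j
    return np.tile(v_row, (n_hidden, 1))                # identical for every hidden i

X = np.array([[10.0, 2.0, 0.2, 1.0],
              [ 8.0, 1.5, 0.4, 0.9]])
print(init_input_weights(X, n_hidden=3))
```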
II. FREQUENCY OF WEIGHT UPDATES
• Per-pattern training: the weights change after every input pattern is applied.
• The input set is repeated if the NN is not yet trained.
• Per-epoch training: an epoch is one pass through the process of presenting the network with the inputs and updating its weights.
• Many epochs are required to train the neural network.
• The weight changes suggested by the individual inputs are accumulated into a single change applied at the end of each epoch, i.e. after the whole set of patterns.
• The weights do not change at the end of each individual input.
• Also called BATCH MODE training.
FREQUENCY OF WEIGHT UPDATES
• Advantages / Disadvantages
• Batch-mode training is not possible for on-line training.
• For large applications with long training times, parallel processing may reduce the time taken by batch-mode training.
• Per-pattern training is more expensive, as the weights change more often.
• Per-pattern training suits small NNs and small data sets.
The sketch below contrasts the two update schedules.
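A minimal sketch contrasting the two schedules on a made-up toy problem: a single linear unit trained with the delta rule, once per pattern and once per epoch. The data, learning rate, and epoch count are illustrative choices.

```python
import numpy as np

X = np.array([[1.0, 0.5], [0.2, 1.0], [0.8, 0.3]])   # toy input patterns
d = np.array([1.0, -1.0, 1.0])                       # toy desired outputs
eta = 0.1

def per_pattern(w, epochs=20):
    for _ in range(epochs):
        for x, t in zip(X, d):
            w = w + eta * (t - w @ x) * x    # update immediately after each pattern
    return w

def per_epoch(w, epochs=20):
    for _ in range(epochs):
        dw = np.zeros_like(w)
        for x, t in zip(X, d):
            dw += eta * (t - w @ x) * x      # accumulate over the whole epoch
        w = w + dw                           # single batch update at epoch end
    return w

print("per-pattern:", per_pattern(np.zeros(2)))
print("per-epoch:  ", per_epoch(np.zeros(2)))
```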
III. LEARNING RATE
• FOR THE PERCEPTRON TRAINING ALGORITHM
• Too small η – very slow learning.
• Too large η – the output may saturate in one direction.
• η = 0 – no weight change at all.
• η = 1 – the common choice.
PROBLEM WITH η = 1
• If η = 1, then ∆w = ±x.
• New output = (w + ∆w)ᵀx = wᵀx ± xᵀx.
• So if wᵀx > xᵀx, the output stays positive and grows in one direction only.
• For the update to be able to flip the output we need |wᵀx| < |∆wᵀx| with ∆w = ±ηx, i.e.
• η (xᵀx) > |wᵀx|
• η > |wᵀx| / (xᵀx)
• η is normally between 0 and 1. A small numeric check follows.
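A tiny numeric check of the bound, with made-up w and x:

```python
import numpy as np

w = np.array([2.0, 0.0])   # current weights (made-up)
x = np.array([1.0, 1.0])   # misclassified pattern (made-up)

eta_min = abs(w @ x) / (x @ x)   # update can flip the output only if eta exceeds this
print(eta_min)                   # -> 1.0, so here even eta = 1 is not enough
```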
III. LEARNING RATE
• FOR THE BACK-PROPAGATION ALGORITHM
• Use a large η in early iterations and steadily decrease it as the NN converges.
• Increase η at every iteration that improves performance by a significant amount, and vice versa.
• Steadily double η until the error value worsens.
• If the second derivative of E, ∇²E, is constant and low, η can be large.
• If the second derivative of E, ∇²E, is large, η should be small.
• The second-derivative rules require more computation.
A sketch of the increase/decrease heuristic follows.
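A minimal sketch of the increase/decrease heuristic; the growth and back-off factors and the stand-in error curve are illustrative assumptions, not values from the slides.

```python
def adapt_eta(eta, err, prev_err, up=1.05, down=0.7):
    """Grow eta while the error improves; shrink it when the error worsens."""
    return eta * up if err < prev_err else eta * down

eta, prev_err = 0.5, float("inf")
for epoch in range(20):
    err = 1.0 / (epoch + 1)              # stand-in for the measured training error
    eta = adapt_eta(eta, err, prev_err)
    prev_err = err
print(eta)
```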
MOMENTUM
• Training is done to reduce the error.
• Training may stop at a local minimum instead of the global minimum.
MOMENTUM
• This can be prevented if the weight changes depend on the average gradient of the error, rather than on the gradient at a single point.
• Averaging ∂E/∂w over a small neighborhood leads the network in the general direction of decreasing MSE without getting stuck at local minima.
• Computing such averages may become complex.
MOMENTUM
• Shortcut method:
• The weight change at the ith iteration of the back-propagation algorithm also depends on the immediately preceding weight changes.
• This has an averaging effect.
• It damps drastic fluctuations in the weight changes over consecutive iterations.
• Achieved by adding a momentum term to the weight-update rule.
MOMENTUM
• ∆w_kj(t+1) = η δ_k x_j + α ∆w_kj(t)
• ∆w_kj(t) is the weight change applied at time t.
• α is a constant, 0 ≤ α ≤ 1.
• Disadvantage:
• The past training trend can strongly bias the current training.
• α depends on the application:
• α = 0 – past weight changes have no effect.
• α = 1 – the previous change carries over in full and can swamp the current gradient term.
A sketch of this update follows.
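A minimal sketch of the momentum update, assuming the gradient term η δ_k x_j has already been computed and is passed in as a vector; the constant gradient used below is a stand-in.

```python
import numpy as np

def momentum_step(w, grad_term, dw_prev, eta=0.1, alpha=0.9):
    """dw(t+1) = eta * grad_term + alpha * dw(t), per the slide's rule."""
    dw = eta * grad_term + alpha * dw_prev
    return w + dw, dw

w, dw = np.zeros(3), np.zeros(3)
g = np.array([1.0, 0.5, -0.2])           # stand-in for the delta_k * x_j term
for _ in range(5):
    w, dw = momentum_step(w, g, dw)      # past changes keep feeding in via dw
print(w)
```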
What constitutes a “good” training set?
• Samples must represent the general population.
• Samples must contain members of each class.
• Samples in each class must cover a wide range of variations or noise effects.
GENERALIZABILITY
• Poor generalization occurs most in large NNs trained on few inputs.
• The inputs are repeated during training until the error reduces.
• This leads to the network memorizing the input samples.
• Such a trained NN may behave correctly on the training data but fail on any unknown data.
• Also called over-training.
GENERALIZABILITY – SOLUTION
• The set of all known samples is broken into two disjoint (independent) sets:
• Training set – a group of samples used to train the neural network.
• Testing set – a group of samples used to test the performance of the neural network.
◦ Used to estimate the error rate.
• Training continues as long as the error on the test data gradually reduces.
• Training terminates as soon as the error on the test data increases.
GENERALIZABILITY
[Figure: error E versus training time. The error on the training data keeps decreasing, while the error on the test data decreases and then starts to rise; training is stopped at the time when the error on the test data starts to increase.]
• Performance on the test data is monitored over several iterations, not just one iteration. A minimal early-stopping sketch follows.
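A minimal early-stopping sketch under the assumptions above; train_step and test_error are hypothetical callbacks standing in for one epoch of training and one evaluation on the held-out test set, and patience implements the "several iterations" monitoring.

```python
def train_with_early_stopping(train_step, test_error, patience=5, max_epochs=1000):
    """Stop once the test error has failed to improve for `patience` epochs."""
    best_err, bad_epochs = float("inf"), 0
    for epoch in range(max_epochs):
        train_step()             # weights change on the training data only
        err = test_error()       # the test data never updates the weights
        if err < best_err:
            best_err, bad_epochs = err, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break            # test error has started to increase
    return best_err
```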
GENERALIZABILITY
• The weights are NOT changed on the test data.
• Over-training can also be avoided by using a small number of parameters (hidden nodes and weights).
• If the training set is small, multiple sets can be created by adding small, randomly generated noise or displacements:
• if X = { x1, x2, x3, …, xn }, then
• X’ = { x1+ß1, x2+ß2, x3+ß3, …, xn+ßn }.
A sketch of this noise augmentation follows.
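A minimal sketch of the augmentation, assuming the ß values are small Gaussian perturbations; the noise scale is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(X, copies=3, scale=0.01):
    """Return extra training sets X' = X + beta, with small random beta."""
    return [X + rng.normal(0.0, scale, size=X.shape) for _ in range(copies)]

X = np.array([10.0, 2.0, 0.2, 1.0])
for Xp in augment(X):
    print(Xp)
```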
NO. OF HIDDEN LAYERS AND NODES
• Mostly found by trial and error.
• Too few nodes – the NW may not be able to learn the task effectively.
• Too many nodes –
• computation becomes tedious and expensive;
• the NW may memorize the inputs and perform poorly on test data.
• A NW is called well trained if it performs well on data not used for training.
• Hence the NN should be capable of generalizing from its inputs, rather than memorizing them.
NO. OF HIDDEN LAYERS AND NODES
• Methods:
• Adaptive algorithm (a growth-variant sketch follows):
◦ Choose a large number of nodes and train.
◦ Gradually discard nodes one by one during training.
◦ Continue until performance falls below the acceptable level.
◦ The NN must be retrained at each change in the number of nodes.
◦ Or vice versa:
◦ choose a small number of nodes and increase the number of nodes until performance is satisfactory.
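A minimal sketch of the grow-until-satisfactory variant; train_and_score is a hypothetical helper that retrains the network with n hidden nodes and returns its score on the test set, and the target score is an illustrative assumption.

```python
def grow_hidden_nodes(train_and_score, start=2, max_nodes=64, target=0.95):
    """Add hidden nodes one at a time, retraining fully at every size."""
    score = 0.0
    for n in range(start, max_nodes + 1):
        score = train_and_score(n)   # hypothetical: retrain with n hidden nodes
        if score >= target:
            return n, score          # smallest size that performs satisfactorily
    return max_nodes, score          # give up at the size limit
```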
Let’s see how NN size advances:
• Linear classification:
[Figure: a single line L1 divides the plane into ax1+bx2+c > 0 on one side and ax1+bx2+c < 0 on the other.]
Let’s see how NN size advances:
• Two-class problem – nonlinear:
[Figure: a nonlinear two-class region bounded by two lines, L1 and L2.]
Let’s see how NN size advances:
• Two-class problem – nonlinear:
[Figure: a region P enclosed by four lines, L1 to L4.]
Let’s see how NN size advances:
• Two-class problem – nonlinear:
[Figure: a class spread over four separate regions, P1 to P4, each bounded by its own lines; the network grows as the number of regions grows.]
NUMBER OF INPUT SAMPLES
• As a rule of thumb: use 5 to 10 times as many samples as the number of weights to be trained.
• Baum and Haussler suggest:
◦ P > |W| / (1 - a), where
◦ P is the number of samples,
◦ |W| is the number of weights to be trained,
◦ a is the expected accuracy on the test set.
A worked example follows.
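A tiny worked check of the bound, with made-up numbers: a network with 100 trainable weights and a 90% target accuracy needs more than 100 / (1 - 0.9) = 1000 samples.

```python
def min_samples(n_weights, accuracy):
    """Baum-Haussler bound from the slide: P > |W| / (1 - a)."""
    return n_weights / (1.0 - accuracy)

print(min_samples(100, 0.90))   # -> 1000.0, so more than 1000 samples
```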
Non-numeric inputs
• Non-numeric inputs like colours have no inherent order.
• They cannot be placed along an axis, e.g. red-blue-green-yellow.
• The colour would become position-sensitive, which results in erroneous training.
• Hence assign a binary vector with one component per colour, e.g.:
• green – 0 0 1 0, red – 1 0 0 0,
• blue – 0 1 0 0, yellow – 0 0 0 1.
• But the input dimension increases drastically.
A sketch of this encoding follows.
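A minimal sketch of the encoding (now usually called one-hot encoding), matching the slide's colour-to-vector assignments:

```python
COLOURS = ["red", "blue", "green", "yellow"]   # one vector component per colour

def one_hot(colour):
    """Binary vector with a 1 only in this colour's component."""
    return [1 if c == colour else 0 for c in COLOURS]

print(one_hot("green"))   # [0, 0, 1, 0]
print(one_hot("red"))     # [1, 0, 0, 0]
```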
Termination criteria
• “Halt when the goal is achieved.”
• Perceptron training of linearly separable patterns –
◦ Goal: correct classification of all samples.
◦ Termination is assured if η is sufficiently small.
◦ The program may run indefinitely if η is not appropriate.
◦ A different choice of η may yield a different classification.
• Back-propagation algorithm using the delta rule –
◦ Termination can never be achieved with the above criterion, as the output can never reach exactly +1 or -1.
◦ Instead, fix Emin, the minimum acceptable error; training terminates when the error falls below Emin (see the sketch below).
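A minimal sketch of the Emin criterion; train_epoch is a hypothetical callback that runs one training epoch and returns the current error, and max_epochs is a safety cap that is an assumption, not from the slides.

```python
def train_until_emin(train_epoch, e_min=1e-3, max_epochs=10_000):
    """Terminate as soon as the error drops below E_min."""
    err = float("inf")
    for epoch in range(max_epochs):
        err = train_epoch()
        if err < e_min:
            return epoch, err    # goal achieved
    return max_epochs, err       # E_min never reached within the cap
```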
Termination criteria
• Perceptron training of linearly non-separable patterns –
◦ The above criterion would allow the procedure to run indefinitely.
◦ Instead, compare the amount of progress made in the recent past.
◦ If the number of misclassifications has not changed over a large number of steps, the samples are probably not linearly separable.
◦ A minimum percentage of correct classifications can be fixed as the termination condition.