2. Introduction
• Originally proposed by James Albus (1975)
• A class of sparse, coarse-coded associative memory algorithms that mimic the functionality of the mammalian cerebellum
• A localized three-layer feedforward neural network
3. A 3-layer structure
• The first-layer neurons are single-input, single-output buffers: arrays of feature-detecting units.
• Each neuron is a threshold unit: it outputs a one if the input falls within a limited range.
• When an input is presented at the input layer, a fixed number of neurons output ones and the others output zeros.
• The second layer = dual-input AND gates: each performs a logical AND on its related inputs from layer 1.
• The third layer = OR gates = a summation unit: it computes the weighted sum of the activated cells and produces the output (see the sketch below).
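To make the three layers concrete, here is a minimal sketch of the forward pass for a one-dimensional input. All names and values (n_a, res, the unit displacement per overlay) are illustrative assumptions, not taken from the slides:

n_a = 4                                    # number of overlapping AU tables
res = 16                                   # input quantization resolution
weights = [[0.0] * (res // n_a + 2) for _ in range(n_a)]

def cmac_output(s):
    """s in [0, 1): layer 1 quantizes the input; layer 2 activates one
    cell per overlay (the AND of the threshold units); layer 3 sums
    the activated weights."""
    q = int(s * res)                       # layer 1: quantization
    y = 0.0
    for j in range(n_a):                   # overlay j is displaced by j cells
        cell = (q + j) // n_a              # layer 2: the single active cell
        y += weights[j][cell]              # layer 3: summation
    return y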
7. Association unit (AU)
• An AU is essentially a lookup table.
(Figure: AU lookup tables over the input axes y1 and y2.)
• The AU tables store one weight value in each cell.
• Each AU has cells which are na times larger than the input quantization cell size, and which are also displaced along each axis by some constant.
• Each layer-2 neuron has a receptive field that is na × na units in size.
12. Displacement
• Good overlay displacements are achieved when na is prime and equal to 2ny + 1 (see the helper below).
• Bad overlay displacements are achieved when na is divisible by 6.
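As a quick illustration, here is a small helper encoding the rule just quoted; the function names and the "neither" label for the remaining cases are our own assumptions:

def is_prime(n):
    return n > 1 and all(n % k for k in range(2, int(n**0.5) + 1))

def overlay_quality(n_a, n_y):
    """Classify an overlay count n_a for an n_y-dimensional input."""
    if n_a % 6 == 0:
        return "bad"                       # divisible by 6
    if is_prime(n_a) and n_a == 2 * n_y + 1:
        return "good"
    return "neither"

print(overlay_quality(13, 6))  # 'good' (13 is prime and equals 2*6 + 1)
print(overlay_quality(12, 6))  # 'bad'  (12 is divisible by 6)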
15. How a CMAC network operates
• Two inputs (N = 2); the input vector is s = [s1, s2].
• Resolution elements: Qi = 6, i = 1, 2.
• Quantizing functions: K = 4.
• Point A: sA = [2.35, 2.36].
• Point B, a nearby point: sB = [2.65, 2.56] (worked through in the sketch below).
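The sketch below computes the active layer-2 cells for A and B. The input range [0, 3) (so each of the Qi = 6 resolution elements has width 0.5) and the unit shift per quantizing function are assumptions made for illustration; with them, A and B share 3 of their K = 4 cells.

K = 4           # quantizing functions (overlays)
CELL = 0.5      # assumed width of one resolution element

def active_cells(s):
    """Return the K active layer-2 cells for a 2-D input point."""
    q = [int(x / CELL) for x in s]               # layer-1 quantization
    return {(k, (q[0] + k) // K, (q[1] + k) // K) for k in range(K)}

sA, sB = [2.35, 2.36], [2.65, 2.56]
shared = active_cells(sA) & active_cells(sB)
print(len(shared), "of", K, "cells shared")      # prints: 3 of 4 cells shared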
16. Learning
• Learning is done in the third layer.
• Supervised learning: the output is compared with the desired output, and CMAC uses the difference to adjust its weights to get the desired response.
• Similar to the backpropagation learning algorithm used in the MLP model, except that:
1) it has only one layer of weights to modify;
2) it does not do any backpropagation like the MLP model.
17. Training
• Least Mean Square (LMS): weights are adjusted based on the desired and actual outputs.
• Each of the na referenced weights (one per lookup table) is adjusted so that the output comes closer to the desired value (a sketch follows).
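A minimal sketch of one LMS step, assuming the active_cells helper from the slide-15 sketch and an illustrative learning rate beta (both assumptions, not slide content):

from collections import defaultdict

beta = 0.5                        # assumed learning rate
weights = defaultdict(float)      # weight table indexed by active cell

def train_step(cells, y_desired):
    """Spread the output error equally over the referenced weights."""
    y = sum(weights[c] for c in cells)            # actual output
    delta = beta * (y_desired - y) / len(cells)   # LMS correction
    for c in cells:
        weights[c] += delta

train_step(active_cells(sA), 1.0)   # train point A toward output 1.0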
18. The input/desired-output pairs (training points) are normally presented to the CMAC in one of two ways:
• Random order (function approximation):
– a selection of training points from the function can be presented in random order to minimize learning interference.
• Trajectory order (used in an online controller):
– the training-point inputs will probably vary gradually, as they are tied to the system sensors, so each training point will be close to the previous one in the input space.
19. The number of CMAC weights
• The naïve approach, storing each weight as a separate number in a large floating-point array, is not possible.
– The maximum value of the table index pij (if the largest displacements are assumed) is roughly resj / na.
• The total number of CMAC weights is the number of AU tables times the number of weights stored per AU, which for equal resolutions is roughly na · (res/na)^ny.
20. • Increasing na reduces the number of weights but increases the number of weight tables.
• For typical values of res, na and ny the number of weights is huge.
• Ex: suppose res = 200, na = 10 and ny = 6. The number of weights is na · (res/na)^ny = 10 · 20^6 = 640 million.
• At four bytes per floating-point number this would require 2441 megabytes of memory, which is presently unreasonable.
• The problem is that the number of weights is exponential in the number of input dimensions ny.
• One way to reduce the number of weights is to use hashing (assuming that resj is sufficiently large).
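The count above can be reproduced directly (pure arithmetic; the variable names are ours):

res, n_a, n_y = 200, 10, 6
total = n_a * (res // n_a) ** n_y              # 10 * 20**6 = 640,000,000 weights
print(total, round(total * 4 / 2**20), "MB")   # ~2441 MB at 4 bytes per weight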
22. Hashing
• Idea: take an address into a large "virtual" memory and map it to an address in a smaller physical memory.
• Ex: suppose a program is required to quickly store and recall 100 seven-digit telephone numbers.
1) Use the telephone number as an index into an array of 10^7 entries, or
2) use an array of only (say) 251 entries, with the index into this array computed as index = number mod 251.
• The modulo function hashes the large phone-number address into a smaller index in the range 0, ..., 250.
• This makes more efficient use of memory.
• Hash collision: there is the problem that some of the phone numbers will hash to the same physical memory location (see the sketch below).
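The slide's example as code (the table size 251 is the slide's own "say" value; the sample phone numbers are made up and chosen to collide):

TABLE_SIZE = 251
table = [None] * TABLE_SIZE

def store(phone, value):
    table[phone % TABLE_SIZE] = value   # a collision silently overwrites

store(5551234, "first")
store(5551485, "second")   # 5551485 = 5551234 + 251, so same slot: collision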
23. How the CMAC uses hashing
– Hash collisions are ignored: if any two weights with the same physical address are written, the second one simply replaces the first.
– This works in practice because sensor values tend to be correlated with one another, and only a small subset of all possible input trajectories is ever seen, so the virtual address space is used sparsely (a sketch follows).
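A hedged sketch of this scheme: the (overlay, cell) tuple is the "virtual" address; Python's built-in hash stands in for whatever hash function a real implementation would use, and the physical table size is an arbitrary assumption. It reuses active_cells from the slide-15 sketch.

PHYS_SIZE = 4096
hashed_weights = [0.0] * PHYS_SIZE

def weight_index(cell):
    return hash(cell) % PHYS_SIZE          # virtual address -> physical slot

def hashed_output(cells):
    # Collisions are simply ignored: colliding cells share one weight.
    return sum(hashed_weights[weight_index(c)] for c in cells)

print(hashed_output(active_cells(sA)))     # 0.0 before any training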
24. Drawbacks
• Some cells are still not thoroughly visited, so it is difficult for hashing schemes to achieve a 100% memory utilization rate.
• When collisions occur, hash coding algorithms treat the weights stored in the colliding memory locations as valid.
• Therefore, most CMACs with hash coding algorithms may retrieve, output, and update irrelevant data.
• From the viewpoint of system control, this may cause CMACs to introduce control noise into the control process.
26. Training interference
• Points A and B have overlapping local generalization (LG) areas: they share some weights.
• Training at B will affect the value of xA, because v of its na weights will be altered (computed below for the slide-15 example).
27. • If v = 0 there is obviously no interference (there is no overlap); if v = na, A and B are the same point and the interference is maximal.
– This can be particularly problematic during trajectory training, because successive training points are then very likely to be close to each other.
– The problem can be reduced with a lower learning rate.
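In terms of the slide-15 sketch (where K plays the role of na), v is just the size of the overlap computed there:

v = len(active_cells(sA) & active_cells(sB))     # number of shared weights
print(v, "of", K, "of A's weights are altered by training at B")  # v = 3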
28. Generalization
• An individual association cell may be activated by more than one input pattern.
• This may be beneficial or non-beneficial:
– if two input patterns are going to produce similar outputs, it is desirable;
– when different output values are needed, a problem may occur.
29. Training sparsity
• To ensure good generalization, how far apart (in the input space) can the training points be in trajectory-ordered data?
32. Results from comparison
• The low-resolution CMAC network performed better than the high-resolution CMAC network when noise was present in the data.
• The high-resolution CMAC network was able to model noise-free data better.
• The functions learnt by the MLP and RBF networks are smooth despite the presence of noise in the data.
• The mesh plots for the CMAC network are jagged.
• The output values of a CMAC network tend to track the noise in the target values (panels (c) and (d)).
35. Advantages
very fast learning
( less complex calculations and weight involved
to be considered)
quickly training certain areas of memory
without affecting the whole memory
structure.
very important in real-time applications
36. Disadvantages
• The storage requirement:
– less than that of lookup tables,
– but significantly greater than that of MLP and RBF networks,
– and it gets worse as the dimensionality of the input space increases.
– This is not a major issue, given the recent abundant availability of memory chips plus the reuse of memory space.
• The function that is approximated by the CMAC is not smooth, but jagged.
• The output tends to track the noise in target values.
37. Applications
• Robot control: path planning for manipulators and mobile robots, control of manipulators
• Industrial processes
• Pole balancers and walking machines
• Character recognition
• Reinforcement learning, as a classifier system
• Color correction
• Rapid computation and function approximation
• Computing plant models for feedforward control online, in real time
39. An example of a two-dimensional space
• Consider an input vector u of size d as a point in d-dimensional space.
• The input space is quantized using a set of overlapping tiles.
• An input vector or query point activates one tile from each set.
• For input spaces of high dimensionality, the tiles form hypercubes.
• A query is performed by first activating all the tiles that contain the query point.
• The activated tiles in turn activate memory cells, which contain stored values: the weights of the system.
• Each tile is associated with a weight value, and the summing of these values produces the overall output.
• The CMAC output is therefore stored in a distributed fashion: the output corresponding to any point in the input space is derived from the values stored in a number of memory cells.
42. • The network's n-dimensional input vector is denoted by x, and the network's sparse internal representation is represented by the p-dimensional vector a; this vector is called the transformed input vector or the basis-function vector. Fig. 1 shows the schematic illustration of the CMAC neural network.
• The transformed input vector a has as many elements as the weight vector W.
• The second mapping is the output computation from the weight vector as the sum of the activated weights: y = aᵀW.
• The vector a has only ρ active elements; all others are equal to zero (see the sketch below).
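The two mappings in miniature (a sketch; the sizes p = 12, ρ = 4 and the active positions are invented for illustration):

p, rho = 12, 4
a = [0.0] * p
for i in (0, 3, 7, 9):                    # the rho active elements of a
    a[i] = 1.0
W = [float(i) for i in range(p)]          # some stored weights
y = sum(ai * wi for ai, wi in zip(a, W))  # output = sum of activated weights
print(y)                                  # 0 + 3 + 7 + 9 = 19.0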
43. Hash coding algorithms
• Hardware implementation of the CMAC is difficult to achieve, since its conceptual memory theoretically needs a huge space to address the encoded input information.
• Hash coding algorithms are generally applied to reduce this space to a more reasonable scale.
• However, this approach has some drawbacks.
• First, collisions frequently occur during mapping from a wide input domain to a relatively small field. For instance, the cells c3 and a2, as illustrated in Fig. 1, are projected onto the memory cell p1 simultaneously.
• Some algorithms have been introduced to achieve a low collision rate and high memory utilization. However, even in these improved algorithms some cells are still not thoroughly visited; in other words, it is difficult for them to achieve a 100% memory utilization rate.
• Second, when collisions occur, hash coding algorithms treat the weights stored in the colliding memory locations as valid. Therefore, most CMACs with hash coding algorithms may retrieve, output, and update irrelevant data as a result of a mapping crash.
• From the viewpoint of system control, this may cause CMACs to introduce control noise into the control process.
45. • If the distance between the input vectors is relatively short, then there should be some overlapping addresses, as is the case with fc(S2) and fc(S3) in Fig. 1.
• This situation is referred to as the generalization problem.
• This problem can also be explained as follows: if input vectors S2 and S3 are similar, and if adequately tuned weights have already been stored in the memory with respect to S2, then S3 can refer to the overlapping addresses associated with S2 to get more suitable weights for producing an output before updating.