3. Concept: Classifier
Training procedure
Give +ve and -ve examples to the system; the system will then learn to classify an unknown input. E.g. give pictures of faces (+ve examples) and non-faces (-ve examples) to train the system.
Detection procedure
Input an unknown sample (e.g. an image); the system will tell you whether it is a face or not.
[Figure: example face and non-face images.]
4. (Updated)!! First let us learn what a weak classifier h( ) is
A line v = mu + c, or v − mu = c:
• m, c are used to define the line (gradient m, intercept c).
• Any point in the gray area satisfies v − mu < c.
• Any point in the white area satisfies v − mu > c.
[Figure: the u-v plane with the line v = mu + c of gradient m through the origin region; the side where v − mu < c is gray, the side where v − mu > c is white.]
Case 1: If a point x = [u, v] is in the "gray" area, then h(x) = 1, otherwise −1. It can be written as:
h(x) = +1 if v − mu < c, −1 otherwise, where m, c are given constants.
Case 2: If a point x = [u, v] is in the "white" area, then h(x) = 1, otherwise −1. It can be written as:
h(x) = +1 if v − mu > c, −1 otherwise, where m, c are given constants.
At time t, combine case 1 and case 2 together to become one equation, and use the polarity p_t to control which case you want to use:
h_t(x) = +1 if p_t f_t(x) < p_t θ_t, −1 otherwise,
where polarity p_t ∈ {+1 or −1}, f_t is the function f_t(x = [u, v]) = v − mu, and θ_t = c is the threshold; m, c are constants, u, v are variables.
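To make the combined equation concrete, here is a minimal Python sketch of this line-based weak classifier; the function name weak_classify and the example values of m, c, p are illustrative assumptions, not part of the slides.

```python
# Line-based weak classifier h_t(x): the line v = m*u + c splits the
# plane, and the polarity p in {+1, -1} selects which side is +1.
def weak_classify(u, v, m, c, p):
    f = v - m * u                         # f_t(x = [u, v]) = v - mu
    return 1 if p * f < p * c else -1     # threshold theta_t = c

# Illustrative check with the line v = u (m = 1, c = 0), polarity +1:
print(weak_classify(0.0, -1.0, m=1.0, c=0.0, p=+1))  # below the line -> +1
print(weak_classify(0.0, +1.0, m=1.0, c=0.0, p=+1))  # above the line -> -1
```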
5. AdaBoost: Adaptive Boosting
Instead of resampling, AdaBoost uses training-set re-weighting: each training sample carries a weight that determines its probability of being selected for a training set.
AdaBoost is an algorithm for constructing a "strong" classifier as a linear combination of "simple" "weak" classifiers.
The final classification is based on a weighted vote of the weak classifiers.
6. Concept
Weak learners are drawn from the family of lines.
A weak learner h with p(error) = 0.5 is at chance.
Each data point has a class label, y_t = +1 or −1, and a weight, w_t = 1.
7. Concept
This one seems to be the best.
Each data point has a class label, y_t = +1 or −1, and a weight, w_t = 1.
This is a 'weak classifier': it performs slightly better than chance.
8. Concept
We set a new problem for which the previous weak classifier performs at chance again.
Each data point has a class label, y_t = +1 or −1, and we update the weights:
w_t ← w_t exp{−y_t H_t}
9.-11. Concept (repeated over further rounds)
Each round, we again set a new problem for which the previous weak classifiers perform at chance, and we update the weights in the same way:
w_t ← w_t exp{−y_t H_t}
Each data point keeps its class label, y_t = +1 or −1.
12. Concept
The strong (non-linear) classifier is built as the combination of all the weak (linear) classifiers.
[Figure: four weak classifiers f1, f2, f3, f4 jointly partitioning the plane.]
13. An example to show how AdaBoost works
Training: present ten samples to the system: [x_i = {u_i, v_i}, y_i = '+' or '−'];
5 +ve (blue, diamond) samples and 5 −ve (red, circle) samples. Train up the system.
Detection: give an input x_j = (1.5, 3.4); the system will tell you whether it is '+' or '−'. E.g. face or non-face.
Example: u = weight, v = height; the classification task is suitability to play in the basketball team.
[Figure: the ten training samples plotted against the u-axis and v-axis, e.g. [x_i = {−0.48, 0}, y_i = '+'], [x_i = {−0.2, −0.5}, y_i = '+'].]
14. AdaBoost concept
Objective: train a classifier to classify an unknown input to see if it is a circle or a square.
Training data: 6 squares, 5 circles.
Using this training data, how do we make a classifier? One axis-parallel weak classifier cannot achieve 100% classification: e.g. h1( ), h2( ), h3( ) all fail. That means no matter how you place the decision line (horizontally or vertically), you cannot get a 100% classification result. You may try it yourself!
The solution is a complex strong classifier H_complex( ). Such a strong classifier should work, but how can we find it?
ANSWER: combine many weak classifiers to achieve it.
15. How? Each classifier may not be perfect, but each can achieve an over-50% correct rate.
Combine the weak classifiers h1( ), h2( ), ..., h7( ) to form the final strong classifier:
H(x) = sign( Σ_{t=1..T} α_t h_t(x) )
with a weight α_i for each weak classifier, i = 1, 2, ..., 7.
[Figure: the seven weak classifiers h1( )-h7( ) and the combined classification result.]
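As a concrete illustration of this weighted vote, here is a minimal Python sketch; the name strong_classify and the argument layout are assumptions for illustration.

```python
# H(x) = sign( sum_t alpha_t * h_t(x) ): each weak classifier h_t votes,
# weighted by its confidence alpha_t; the sign of the total decides.
def strong_classify(x, weak_classifiers, alphas):
    vote = sum(a * h(x) for h, a in zip(weak_classifiers, alphas))
    return 1 if vote >= 0 else -1
```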
16. AdaBoost Algorithm
Given: (x_1, y_1), ..., (x_n, y_n), where x_i ∈ X, y_i ∈ {−1, +1}.
Initialization: initialize the (weight) distribution D_1(i) = 1/n, such that n = M + L, where M = number of positive (+1) examples and L = number of negative (−1) examples.
Main training loop. For t = 1, ..., T:
{ Step 1a: Find the classifier h_t : X → {−1, +1} that minimizes the error with respect to the distribution D_t; that means h_t = argmin_{h_j} ε_j, where
error ε_j = Σ_{i=1..n} D_t(i) · I(h_j(x_i) ≠ y_i), with I = 1 if h_j(x_i) ≠ y_i (classified incorrectly), 0 otherwise.
Step 1b: checking step (prerequisite): ε_t < 0.5 (an error smaller than 0.5 is ok), otherwise stop.
Step 2: α_t = (1/2) ln((1 − ε_t) / ε_t), the weight (or confidence value).
Step 3: D_{t+1}(i) = D_t(i) exp(−α_t y_i h_t(x_i)) / Z_t (see the next slide for an explanation).
Step 4: Current total cascaded-classifier error E_t = (1/n) Σ_{i=1..n} E_t^i, while ε_t is the current classifier error; E_t^i is defined as follows:
If x_i is correctly classified by the current cascaded classifier, i.e. sign(Σ_{j=1..t} α_j h_j(x_i)) = y_i, hence error E_t^i = 0.
If x_i is incorrectly classified by the current cascaded classifier, i.e. sign(Σ_{j=1..t} α_j h_j(x_i)) ≠ y_i, hence error E_t^i = 1.
If E_t = 0 then T = t, break; }
The output: the final strong classifier H(x) = sign( Σ_{t=1..T} α_t h_t(x) ).
(Initialization, main training loop, and the final strong classifier: see enlarged versions in the following slides.)
17. The normalization factor Z_t
Z_t normalizes D so that D_{t+1} is a probability distribution:
Z_t = correct_weight + incorrect_weight
= Σ_{i: correctly classified} D_t(i) e^{−α_t} + Σ_{i: incorrectly classified} D_t(i) e^{+α_t}
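The whole loop above fits in a short Python sketch. This assumes the weak-classifier pool is given as plain functions h(x) → {−1, +1} (e.g. the axis-parallel stumps used later in these slides); the names adaboost_train, candidates, etc. are illustrative, not from the slides.

```python
import math

def adaboost_train(X, y, candidates, T):
    """X: list of samples; y: labels in {-1, +1};
    candidates: pool of weak classifiers h(x) -> {-1, +1}."""
    n = len(X)
    D = [1.0 / n] * n                          # Initialization: D_1(i) = 1/n
    alphas, chosen = [], []
    for t in range(T):
        # Step 1a: pick the h_t that minimizes the weighted error over D_t
        def weighted_error(h):
            return sum(D[i] for i in range(n) if h(X[i]) != y[i])
        h_t = min(candidates, key=weighted_error)
        eps = weighted_error(h_t)
        if eps >= 0.5:                         # Step 1b: must beat chance
            break
        # Step 2: confidence weight (the max() guards log(1/0) when eps = 0)
        alpha = 0.5 * math.log((1 - eps) / max(eps, 1e-12))
        alphas.append(alpha)
        chosen.append(h_t)
        # Step 3: re-weight every sample, then normalize by Z_t
        D = [D[i] * math.exp(-alpha * y[i] * h_t(X[i])) for i in range(n)]
        Z = sum(D)
        D = [d / Z for d in D]
        # Step 4: stop once the cascaded classifier gets every sample right
        def H(x):
            return 1 if sum(a * h(x) for a, h in zip(alphas, chosen)) >= 0 else -1
        if all(H(X[i]) == y[i] for i in range(n)):
            break
    return alphas, chosen
```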
18. Main loop (steps 1, 2, 3)
For t = 1, ..., T:
{ Step 1a: Find the classifier h_t : X → {−1, +1} that minimizes the error with respect to the distribution D_t; that means h_t = argmin_{h_j} ε_j, where
error ε_j = Σ_{i=1..n} D_t(i) · I(h_j(x_i) ≠ y_i), with I = 1 if h_j(x_i) ≠ y_i (classified incorrectly), 0 otherwise.
Step 1b: checking step (prerequisite): ε_t < 0.5 (an error smaller than 0.5 is ok), otherwise stop.
Step 2: α_t = (1/2) ln((1 − ε_t) / ε_t), the weight (or confidence value).
Step 3: D_{t+1}(i) = D_t(i) exp(−α_t y_i h_t(x_i)) / Z_t (see the next slide for an explanation).
19. Main loop (step 4)
Step 4: Current total cascaded-classifier error E_t = (1/n) Σ_{i=1..n} E_t^i, where E_t^i is defined as follows:
If x_i is correctly classified by the current cascaded classifier, i.e. sign(Σ_{j=1..t} α_j h_j(x_i)) = y_i, hence error E_t^i = 0.
If x_i is incorrectly classified by the current cascaded classifier, i.e. sign(Σ_{j=1..t} α_j h_j(x_i)) ≠ y_i, hence error E_t^i = 1.
If E_t = 0 then T = t, break; }
The final strong classifier: H(x) = sign( Σ_{t=1..T} α_t h_t(x) ).
20. AdaBoost chooses this weight-update function deliberately
Recall Step 3: D_{t+1}(i) = D_t(i) exp(−α_t y_i h_t(x_i)) / Z_t.
Because α_t > 0 whenever ε_t < 0.5:
• when a training sample is correctly classified (y_i h_t(x_i) = +1), its weight is multiplied by e^{−α_t} < 1, so the weight decreases;
• when a training sample is incorrectly classified (y_i h_t(x_i) = −1), its weight is multiplied by e^{+α_t} > 1, so the weight increases.
Note: the normalization factor Z_t in step 3 makes D_{t+1} a probability distribution:
Z_t = correct_weight + incorrect_weight = Σ_{i: correctly classified} D_t(i) e^{−α_t} + Σ_{i: incorrectly classified} D_t(i) e^{+α_t}
21. Note: stopping criterion of the main loop
The main loop stops when all training data are correctly classified by the cascaded classifier up to stage t:
If E_t = 0 then T = t, break; }
where the current total cascaded-classifier error E_t = (1/n) Σ_{i=1..n} E_t^i, and E_t^i = 0 if x_i is correctly classified by the current cascaded classifier (sign(Σ_{j=1..t} α_j h_j(x_i)) = y_i), E_t^i = 1 if incorrectly classified (sign(Σ_{j=1..t} α_j h_j(x_i)) ≠ y_i).
22. D_t(i) = weight
D_t(i) is the probability distribution value of the i-th training sample at time t, i = 1, 2, ..., n. It shows how much you trust this sample.
At t = 1, all samples are treated the same, with equal weight: D_{t=1}(i) is the same for all i.
At t > 1, D_{t>1}(i) will be modified, as we will see later.
23. An example to show how AdaBoost works
Training: present ten samples to the system: [x_i = {u_i, v_i}, y_i = '+' or '−'];
5 +ve (blue, diamond) samples and 5 −ve (red, circle) samples. Train up the classification system.
Detection example: give an input x_j = (1.5, 3.4); the system will tell you whether it is '+' or '−'. E.g. face or non-face.
Example: you may treat u = weight, v = height; the classification task is suitability to play in the basketball team.
[Figure: the ten training samples on the u-axis and v-axis, e.g. [x_i = {−0.48, 0}, y_i = '+'], [x_i = {−0.2, −0.5}, y_i = '+'].]
24. Initialization
M = 5 +ve (blue, diamond) samples; L = 5 −ve (red, circle) samples; n = M + L = 10.
Initialize the weights: D_{t=1}(i) = 1/10 for all i = 1, 2, ..., 10.
So D_1(1) = 0.1, D_1(2) = 0.1, ..., D_1(10) = 0.1.
(Recall: Given (x_1, y_1), ..., (x_n, y_n) where x_i ∈ X, y_i ∈ {−1, +1}, initialize D_1(i) = 1/n, such that n = M + L; M positive examples, L negative examples.)
25. Select h( ): for simplicity in implementation, we use the axis-parallel weak classifier
Recall: h_t(x) = +1 if p_t f_t(x) < p_t θ_t, −1 otherwise,
where polarity p_t ∈ {+1 or −1}, θ_t = c is the threshold, and f_t is the function f_t(x = [u, v], m, c) = v − mu; m, c are constants, u, v are variables.
Axis-parallel weak classifiers:
• h_a(x) is a line of gradient m = 0 (horizontal line); the position of the line can be controlled by v_0.
• h_b(x) is a line of gradient m = ∞ (vertical line); the position of the line can be controlled by u_0.
[Figure: a horizontal separator h_a(x) at v = v_0 and a vertical separator h_b(x) at u = u_0 in the u-v plane.]
26. Steps 1a, 1b
Assume h( ) can only be a horizontal or vertical separator (axis-parallel weak classifier).
There are still many ways to set h( ); here, if this h_q( ) is selected, there will be 3 incorrectly classified training samples. See the 3 circled training samples.
We can go through all the h( )s and select the best one, with the least misclassification (see the following 2 slides).
{ Step 1a: Find the classifier h_t : X → {−1, +1} that minimizes the error with respect to the distribution D_t; that means h_t = argmin_{h_q} ε_q.
Step 1b: checking step (prerequisite): ε_t < 0.5 (an error smaller than 0.5 is ok), otherwise stop. }
[Figure: a candidate h_q( ), with the 3 training samples incorrectly classified by h_q( ) circled.]
27. Example: training (example slides from [Smyth 2007]); classify the ten red (circle) / blue (diamond) dots
Step 1a:
h_i(x) = +1 if p u < p u_i, −1 otherwise, with polarity p ∈ {+1, −1};
(x's v, v_i) is not used because h_i is parallel to the vertical axis.
Initialize: D_n(t=1) = 1/10.
You may choose one of the following axis-parallel (vertical-line) classifiers; the vertical dotted lines are possible choices:
h_{i=1}(x), ..., h_{i=4}(x), ..., h_{i=9}(x) at positions u_1, u_2, ..., u_9 along the u-axis.
There are 9 × 2 choices here: h_{i=1,2,3,...,9} (polarity +1) and h'_{i=1,2,3,...,9} (polarity −1).
28. Example: training (example slides from [Smyth 2007]); classify the ten red (circle) / blue (diamond) dots
Step 1a:
h_j(x) = +1 if p v < p v_j, −1 otherwise, with polarity p ∈ {+1, −1};
(x's u, u_j) is not used because h_j is parallel to the horizontal axis.
Initialize: D_n(t=1) = 1/10.
You may choose one of the following axis-parallel (horizontal-line) classifiers; the horizontal dotted lines are possible choices:
h_{j=1}(x), h_{j=2}(x), ..., h_{j=4}(x), ..., h_{j=9}(x) at positions v_1, v_2, ..., v_9 along the v-axis.
There are 9 × 2 choices here: h_{j=1,2,3,...,9} (polarity +1) and h'_{j=1,2,3,...,9} (polarity −1).
All together, including the previous slide: 36 choices.
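A small Python sketch of enumerating these 36 candidates; make_stump and the cut positions are illustrative assumptions (in the slides, the dotted lines lie between consecutive sample coordinates).

```python
# Build the 9 x 2 vertical plus 9 x 2 horizontal candidate stumps.
def make_stump(axis, threshold, polarity):
    # axis 0 = u (vertical separating line), axis 1 = v (horizontal line)
    def h(x):
        return 1 if polarity * x[axis] < polarity * threshold else -1
    return h

u_cuts = [0.1 * k for k in range(1, 10)]   # u_1..u_9 (illustrative positions)
v_cuts = [0.1 * k for k in range(1, 10)]   # v_1..v_9 (illustrative positions)
candidates = [make_stump(axis, t, p)
              for axis, cuts in ((0, u_cuts), (1, v_cuts))
              for t in cuts
              for p in (+1, -1)]
print(len(candidates))                     # 36
```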
29. Step 1b: find and check the error of the weak classifier h( )
To evaluate how successful your selected weak classifier h( ) is, we can evaluate its error rate:
ε_t = misclassification probability of h( ) = Σ_{i=1..n} D_t(i) · I(h_t(x_i) ≠ y_i), where I = 1 if h_t(x_i) ≠ y_i (classified incorrectly), 0 otherwise.
Checking: if ε_t ≥ 0.5 (something is wrong), stop the training, because by definition a weak classifier should be slightly better than a random choice (probability = 0.5).
So if ε_t ≥ 0.5, your h( ) is a bad choice; redesign another h″( ) and do the training based on the new h″( ).
(Step 1b, checking step, prerequisite: ε_t < 0.5, otherwise stop.)
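In code, this check is a one-line sum over the weight distribution (a sketch; the names are illustrative):

```python
# eps_t: total weight D(i) of the samples that h misclassifies.
def weighted_error(h, X, y, D):
    return sum(d for x_i, y_i, d in zip(X, y, D) if h(x_i) != y_i)

# Usage: if weighted_error(h, X, y, D) >= 0.5, reject this h( ) and
# pick (or design) another weak classifier.
```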
30. Assume h( ) can only be a horizontal or vertical separator
How many different classifiers are available?
If h_j( ) is selected as shown, circle the misclassified training samples. Find ε( ) to see the misclassification probability when the probability distribution (D) is the same for each sample.
Find the h( ) with minimum error.
(Step 1a: find the classifier h_t : X → {−1, +1} that minimizes the error with respect to the distribution D_t. Step 1b: checking step, prerequisite: ε_t < 0.5, otherwise stop.)
[Figure: a selected h_j( ) with its misclassified training samples circled.]
31. Result of step 2 at t = 1
[Figure: the selected classifier h_{t=1}(x), with the samples incorrectly classified by h_{t=1}(x) circled.]
32. Step 2 at t = 1 (refer to the previous slide)
Use ε_{t=1} = 0.3, because 3 samples are incorrectly classified:
ε_t = Σ_{i=1..n} D_t(i) · I(h_t(x_i) ≠ y_i) = 0.1 + 0.1 + 0.1 = 0.3, where I = 1 if h_t(x_i) ≠ y_i, 0 otherwise.
Step 2: α_t = (1/2) ln((1 − ε_t) / ε_t), where ε_t is the weighted error rate of classifier h_t.
So α_{t=1} = (1/2) ln((1 − 0.3) / 0.3) = 0.424.
The proof can be found at http://vision.ucsd.edu/~bbabenko/data/boosting_note.pdf. Also see the appendix.
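A quick numerical check of this arithmetic in plain Python, using only the slide's numbers:

```python
import math
eps = 0.3                                  # 0.1 + 0.1 + 0.1: three samples wrong
alpha = 0.5 * math.log((1 - eps) / eps)    # Step 2
print(round(alpha, 3))                     # 0.424
```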
33. Step 3 at t = 1: update D_t to D_{t+1}
Update the weight D_t(i) for each training sample i:
Step 3: D_{t+1}(i) = D_t(i) exp(−α_t y_i h_t(x_i)) / Z_t,
where Z_t is a normalization factor, so that D_{t+1} is a distribution (a probability function).
The proof can be found at http://vision.ucsd.edu/~bbabenko/data/boosting_note.pdf. Also see the appendix.
34. Step 3: find first Z_t (the normalization factor). Note that D_{t=1} = 0.1 and α_{t=1} = 0.424
Z_t = correct_weight + incorrect_weight
= Σ_{i: y_i = h(x_i)} D_t(i) e^{−α_t} + Σ_{i: y_i ≠ h(x_i)} D_t(i) e^{+α_t}
(Correctly classified: y_i h(x_i) = +1, so put −α_t in the exponent; incorrectly classified: y_i h(x_i) = −1, so put +α_t in the exponent.)
Currently t = 1 and D_{t=1}(i) = 0.1 for all i; there are 7 correctly classified and 3 incorrectly classified samples, so:
Z_{t=1} = 0.1 × 7 × e^{−0.424} + 0.1 × 3 × e^{+0.424}
= 0.1 × 7 × 0.65 + 0.1 × 3 × 1.52
= 0.455 + 0.456 = 0.911
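The same computation in Python. Note the slide rounds e^{−0.424} and e^{+0.424} to 0.65 and 1.52, which is why it reports 0.911; the unrounded total is about 0.917.

```python
import math
alpha, d = 0.424, 0.1          # alpha from Step 2; every weight is 0.1 at t = 1
Z = 7 * d * math.exp(-alpha) + 3 * d * math.exp(alpha)
print(round(Z, 3))             # 0.917 (0.911 on the slide, after rounding)
```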
35. Step 3: example: update D_t to D_{t+1}
If a sample is correctly classified, its weight D_{t+1} will decrease, and vice versa. Since Z_{t=1} = 0.911, e^{−0.42} = 0.65 and e^{+0.42} = 1.52:
D_{t+1, correct}(i) = D_t(i) e^{−α_t} / Z_t = 0.1 × e^{−0.42} / 0.911 = 0.1 × 0.65 / 0.911 = 0.0714 (decrease)
D_{t+1, incorrect}(i) = D_t(i) e^{+α_t} / Z_t = 0.1 × e^{+0.42} / 0.911 = 0.1 × 1.52 / 0.911 = 0.167 (increase)
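Checking the updated weights in Python (using the slide's rounded Z for comparison); up to rounding, the ten new weights sum to 1, confirming that D_{t+1} is again a distribution.

```python
import math
alpha, Z = 0.424, 0.911                       # the slide's values
d_correct = 0.1 * math.exp(-alpha) / Z        # ~0.0719 (slide: 0.0714)
d_wrong = 0.1 * math.exp(alpha) / Z           # ~0.168  (slide: 0.167)
print(round(7 * d_correct + 3 * d_wrong, 3))  # 1.006: ~1, the gap is only Z's rounding
```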
36. Now run the main training loop a second time (t = 2)
From t = 1, the updated weights are:
D_{t+1, correct}(i) = 0.1 × e^{−0.42} / 0.911 = 0.0714
D_{t+1, incorrect}(i) = 0.1 × e^{+0.42} / 0.911 = 0.167
[Figure: the training samples re-plotted with these new weights.]
37. Now run the main training loop a second time (t = 2), and then a third time (t = 3)
[Figure: the final classifier formed by combining the three weak classifiers.]
38. Combined classifier for t = 1, 2, 3
Combine to form the classifier (one more step may be needed for the final classifier):
H(x) = sign( Σ_{t=1..T} α_t h_t(x) )
H_{t=1..3}(x) = sign( 0.424 · h_1(x) + α_2 h_2(x) + α_3 h_3(x) )
Exercise: work out α_2 and α_3.
[Figure: the three weak classifiers h_{t=1}( ), h_{t=2}( ), h_{t=3}( ) and the decision regions 1, 2, 3.]
47. Example
• A feature's value is calculated as the difference between the sums of the pixels within the white and black rectangle regions:
f_i = Sum(r_{i,white}) − Sum(r_{i,black})
h_i(x) = +1 if f_i > threshold, −1 if f_i ≤ threshold
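A minimal Python sketch of evaluating such a rectangle feature on an image stored as a list of pixel rows; the two-rectangle layout and the threshold are illustrative assumptions (real detectors use an integral image to make each rectangle sum O(1)).

```python
def rect_sum(img, top, left, h, w):
    # Direct sum of the pixel values inside one rectangle.
    return sum(img[r][c] for r in range(top, top + h)
                         for c in range(left, left + w))

def haar_feature(img, white, black):
    # f_i = Sum(r_i,white) - Sum(r_i,black); white/black = (top, left, h, w)
    return rect_sum(img, *white) - rect_sum(img, *black)

def weak_classify(img, white, black, threshold):
    f = haar_feature(img, white, black)
    return 1 if f > threshold else -1      # h_i(x) as defined above
```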