Sampling strategies for Sequential Monte Carlo methods
Arnaud Doucet (1), Stéphane Sénécal (2)
(1) Department of Engineering, University of Cambridge
(2) The Institute of Statistical Mathematics
2004
Thanks to the Japanese Ministry of Education and the Japan Society for the Promotion of Science
1
Overview
– Introduction : state space models, Monte Carlo methods
– Sequential Importance Sampling/Resampling
– Strategies for sampling
– Examples, applications
– References
2
Estimation of state space models
xt = ft(xt−1, ut),   yt = gt(xt, vt)
p(x0:t|y1:t) → p(xt|y1:t) = ∫ p(x0:t|y1:t) dx0:t−1
distribution of x0:t ⇒ computation of estimates of x0:t :
x̂0:t = ∫ x0:t p(x0:t|y1:t) dx0:t → Ep(·|y1:t){f(x0:t)}
x̂0:t = arg max over x0:t of p(x0:t|y1:t)
3
Computation of the estimates
p(x0:t|y1:t) ⇒ multidimensional, non-standard distributions :
→ analytical, numerical approximations
→ integration, optimisation methods
⇒ Monte Carlo techniques
4
Monte Carlo approach
compute estimates for distribution π(.) → samples x1, . . . , xN ∼ π
[figure : samples x_1, . . . , x_N drawn from the density pi(x)]
⇒ distribution πN = (1/N) Σ_{i=1}^N δ_{xi} approximates π(.)
5
Monte Carlo estimates
SN(f) = (1/N) Σ_{i=1}^N f(xi) −→ ∫ f(x)π(x) dx = Eπ{f(x)}
arg max over (xi)1≤i≤N of πN(xi) approximates arg maxx π(x)
⇒ sampling xi ∼ π difficult
→ importance sampling techniques
6
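As a minimal numerical illustration of the estimate SN(f) above (not taken from the slides), the sketch below assumes π is a standard normal and f(x) = x², so SN(f) should approach Eπ{x²} = 1:

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_estimate(f, sampler, N=10_000):
    """Plain Monte Carlo estimate S_N(f) = (1/N) sum_i f(x_i), x_i ~ pi."""
    x = sampler(N)
    return np.mean(f(x))

# Example: pi = N(0,1), f(x) = x^2, so E_pi{f(x)} = 1.
print(mc_estimate(lambda x: x**2, lambda n: rng.standard_normal(n)))
```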
Importance Sampling
xi ∼ π → candidate/proposal distribution xi ∼ g
[figure : target density pi(x), proposal density g(x), and samples x_1, . . . , x_N drawn from g]
7
Importance Sampling
xi ∼ g ≠ π → (xi, wi) weighted sample
⇒ weight wi = π(xi) / g(xi)
[figure : samples from g(x) reweighted to represent pi(x)]
8
Estimation
importance sampling → computation of Monte Carlo estimates
e. g. expectations Eπ{f(x)} :
∫ f(x) [π(x)/g(x)] g(x) dx = ∫ f(x)π(x) dx
Σ_{i=1}^N wi f(xi) → ∫ f(x)π(x) dx = Eπ{f(x)}
dynamic model (xt, yt) ⇒ recursive estimation x0:t−1 → x0:t
Monte Carlo techniques ⇒ sampling sequences x0:t−1^(i) → x0:t^(i)
9
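A small self-normalized importance sampling sketch matching the estimator above (illustrative choices only: the target π is N(0,1), the proposal g is N(0,2²)):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

def is_estimate(f, N=10_000):
    """Self-normalized importance sampling: x_i ~ g, w_i prop. to pi(x_i)/g(x_i)."""
    x = rng.normal(0.0, 2.0, size=N)                     # proposal g = N(0, 2^2)
    w = norm.pdf(x, 0.0, 1.0) / norm.pdf(x, 0.0, 2.0)    # unnormalized weights pi/g
    w /= w.sum()                                         # normalize the weights
    return np.sum(w * f(x))

print(is_estimate(lambda x: x**2))  # ~ E_pi{x^2} = 1
```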
Sequential simulation
sampling sequences x0:t^(i) ∼ πt(x0:t) recursively :
[figure : target distributions p(x,t1) and p(x,t2) over the state variable x at times t1 < t2]
10
Sequential simulation : importance sampling
samples x0:t^(i) ∼ πt(x0:t) approximated by weighted particles (x0:t^(i), wt^(i))1≤i≤N
[figure : weighted particles approximating the targets p(x,t1) and p(x,t2)]
11
Sequential importance sampling
diffusing particles x0:t1^(i) → x0:t2^(i)
[figure : particles propagated from the target p(x,t1) to the target p(x,t2)]
⇒ sampling scheme x0:t−1^(i) → x0:t^(i)
12
Sequential importance sampling
updating weights wt1^(i) → wt2^(i)
[figure : particle weights updated between the targets p(x,t1) and p(x,t2)]
⇒ updating rule wt−1^(i) → wt^(i)
13
Sequential Importance Sampling
x0:t ∼ πt(x0:t) ⇒ (x0:t^(i), wt^(i))1≤i≤N
Simulation scheme t − 1 → t :
– Sampling step xt^(i) ∼ qt(xt|x0:t−1^(i))
– Updating weights
wt^(i) ∝ wt−1^(i) × πt(x0:t−1^(i), xt^(i)) / [πt−1(x0:t−1^(i)) qt(xt^(i)|x0:t−1^(i))]
incremental weight (iw)
normalizing Σ_{i=1}^N wt^(i) = 1
14
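A generic one-step SIS update written out in Python (a structural sketch of the scheme above; the target and proposal densities are passed in as user-supplied log-density callables and are not specified here):

```python
import numpy as np

def sis_step(paths, log_w, sample_q, log_q, log_pi_t, log_pi_tm1, rng):
    """One SIS step t-1 -> t for N weighted particle paths.

    paths       : list of N trajectories x_{0:t-1}^(i) (each a Python list)
    log_w       : array of N log-weights log w_{t-1}^(i)
    sample_q    : sample_q(path, rng) draws x_t ~ q_t(.|x_{0:t-1})
    log_q       : log_q(x_t, path) = log q_t(x_t|x_{0:t-1})
    log_pi_t    : log_pi_t(path_with_x_t) = log pi_t(x_{0:t})
    log_pi_tm1  : log_pi_tm1(path) = log pi_{t-1}(x_{0:t-1})
    """
    new_paths, new_log_w = [], np.empty_like(log_w)
    for i, path in enumerate(paths):
        x_t = sample_q(path, rng)                          # sampling step
        log_iw = (log_pi_t(path + [x_t])                   # incremental weight
                  - log_pi_tm1(path) - log_q(x_t, path))
        new_paths.append(path + [x_t])
        new_log_w[i] = log_w[i] + log_iw                   # weight update
    new_log_w -= np.logaddexp.reduce(new_log_w)            # normalize: sum_i w_t^(i) = 1
    return new_paths, new_log_w
```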
Sequential Importance Sampling
x0:t ∼ πt(x0:t) ⇒ (x0:t^(i), wt^(i))1≤i≤N
proposal + reweighting →
[figure : weighted particles approximating pi(x_t)]
15
Sequential Importance Sampling
proposal + reweighting → var{(wt^(i))1≤i≤N} increases with t
[figure : weight degeneracy of the particle approximation of pi(x_t)]
→ wt^(i) ≈ 0 for all i except one
16
⇒ Resampling
[figure : particles x_t^(1), . . . , x_t^(N) under pi(x_t) with offspring counts 0, 1, 2, 3]
→ draw N particle paths from the set (x0:t^(i))1≤i≤N with probabilities (wt^(i))1≤i≤N
17
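A minimal multinomial resampling step (one common implementation of the step above; the stratified/deterministic variants mentioned later are not shown):

```python
import numpy as np

def multinomial_resample(paths, weights, rng):
    """Draw N new particle paths with probabilities w_t^(i);
    the resampled paths all receive equal weight 1/N."""
    N = len(paths)
    idx = rng.choice(N, size=N, replace=True, p=weights)
    return [paths[i] for i in idx], np.full(N, 1.0 / N)
```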
Sequential Importance Sampling/Resampling
Simulation scheme t − 1 → t :
– Sampling step x't^(i) ∼ qt(x't|x0:t−1^(i))
– Updating weights wt^(i) ∝ wt−1^(i) × πt(x0:t−1^(i), x't^(i)) / [πt−1(x0:t−1^(i)) qt(x't^(i)|x0:t−1^(i))]
→ parallel computing
– ⇒ Resampling step : sample N paths from (x0:t−1^(i), x't^(i))1≤i≤N
→ particles interacting : computation at least O(N)
18
SISR for recursive estimation of state space models
xt = ft(xt−1, ut) → p(xt|xt−1)
yt = gt(xt, vt) → p(yt|xt)
Usual SISR : Bootstrap filter (Gordon et al. 93, Kitagawa 96) :
– Sampling step xt^(i) ∼ p(xt|xt−1^(i))
– Updating weights : incremental weight wt^(i) ∝ wt−1^(i) × iw, with iw ∝ p(yt|xt^(i))
– Stratified/Deterministic resampling
efficient, easy, fast for a wide class of models
→ tracking, time series
19
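A compact bootstrap filter sketch in Python, assuming (purely for illustration) the linear-Gaussian example model used later in the slides, xt = αxt−1 + ut, yt = xt + vt:

```python
import numpy as np

def bootstrap_filter(y, N=100, alpha=0.9, sigma_u=1.0, sigma_v=0.1, seed=0):
    """Bootstrap filter: propose from the prior p(x_t|x_{t-1}),
    weight by the likelihood p(y_t|x_t), resample at every step."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigma_u, size=N)                 # initial particles
    means = []
    for yt in y:
        x = alpha * x + rng.normal(0.0, sigma_u, N)      # sample x_t ~ p(x_t|x_{t-1})
        logw = -0.5 * ((yt - x) / sigma_v) ** 2          # log p(y_t|x_t) up to a constant
        w = np.exp(logw - logw.max())
        w /= w.sum()
        means.append(np.sum(w * x))                      # filtering estimate of x_t
        x = x[rng.choice(N, size=N, p=w)]                # multinomial resampling
    return np.array(means)
```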
Overview - Break
– Introduction :
→ state space models
→ estimation, computing estimates via Monte Carlo methods
→ importance sampling
– recursive estimation → sequential simulation
⇒ Sequential Importance Sampling/Resampling
– ⇒ Strategies for sampling :
→ designing/sampling “optimal” candidate distribution
→ considering blocks of variables : reweighting, → sampling
– Examples and applications
20
Improving simulation
sampling multimodal, multidimensional distributions
model with informative observation → peaky likelihood
→ prior dynamics to diffuse particles : poor approximation results
→ efficient propagation for a finite number of particles N
⇒ need for good sampling proposals
21
Improving simulation
Optimal proposal distribution qt(xt|x0:t−1^(i))
→ minimizing variance of incremental weight (wt^(i) ∝ wt−1^(i) × iw)
iw = πt(x0:t−1^(i), xt^(i)) / [πt−1(x0:t−1^(i)) qt(xt^(i)|x0:t−1^(i))]
⇒ 1-step ahead predictive :
πt(xt|x0:t−1) = p(xt|xt−1, yt)
⇒ incremental weight :
iw → πt(x0:t−1)/πt−1(x0:t−1) = p(x0:t−1|y1:t)/p(x0:t−1|y1:t−1) ∝ p(yt|xt−1) = ∫ p(yt|xt) p(xt|xt−1) dxt
22
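When the transition is Gaussian and the observation is linear-Gaussian (as in the examples later in the slides; assumed here only for illustration), the optimal proposal p(xt|xt−1, yt) and the incremental weight p(yt|xt−1) have closed forms; a sketch:

```python
import numpy as np

def optimal_proposal_step(x_prev, y_t, alpha, sigma_u, sigma_v, rng):
    """For x_t = alpha*x_{t-1} + u_t, y_t = x_t + v_t with Gaussian noises:
    sample x_t ~ p(x_t|x_{t-1}, y_t) and return the incremental weight p(y_t|x_{t-1}).
    x_prev is a numpy array of particles."""
    var = 1.0 / (1.0 / sigma_u**2 + 1.0 / sigma_v**2)          # proposal variance
    mean = var * (alpha * x_prev / sigma_u**2 + y_t / sigma_v**2)
    x_t = mean + np.sqrt(var) * rng.standard_normal(x_prev.shape)
    pred_var = sigma_u**2 + sigma_v**2                          # var of y_t given x_{t-1}
    iw = np.exp(-0.5 * (y_t - alpha * x_prev)**2 / pred_var) / np.sqrt(2 * np.pi * pred_var)
    return x_t, iw
```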
Approximations
Sampling the predictive distribution πt(xt|x0:t−1) = p(xt|xt−1, yt) :
– expansions of the p.d.f. or log(p.d.f.), Taylor
– mixture models : Gaussian mixtures Σ_i πi N(µi, σi²)
– Accept/Reject schemes
– Markov chain schemes : Metropolis-Hastings, Gibbs sampler
– dynamic stochastic simulation (Hybrid Monte Carlo)
– augmented sampling spaces :
→ slice samplers
→ auxiliary variables
23
Auxiliary variables
Pitt and Shephard 99 : approximating the predictive p(xt|xt−1^(k), yt)
via an augmented sampling space → p(xt, k|xt−1^(k), yt)
[figure : particles x_t−1^(i) under p(x_t−1|y_t−1) and their offspring x_t^(j) under p(x_t|y_t)]
index of particle k (→ number of offspring of particle xt−1^(k)) ∼ ·|yt
⇒ boost particles with high likelihood
24
Auxiliary variables
→ importance sampling for p(xt, k|xt−1^(k), yt) :
candidate distribution :
g(xt, k|xt−1, yt) ∝ p(yt|µt^(k)) p(xt|xt−1^(k))
where µt^(k) = mean, mode, or a draw from xt|xt−1^(k)
[figure : density p(x_t|x_t−1^(k)) with mu_t^(k) placed at its mean or maximum]
25
Auxiliary variables
– sample (xt^(j), kj)1≤j≤R from g(xt, k|xt−1^(k), yt) :
k ∼ g(k|xt−1, yt) ∝ ∫ p(yt|µt^(k)) p(xt|xt−1^(k)) dxt = p(yt|µt^(k))
xt ∼ p(xt|xt−1^(k))
– reweighting (xt^(j), kj) with wj ∝ p(yt|xt^(j)) / p(yt|µt^(kj))
– resample N paths from (x0:t−1^(kj), xt^(j))1≤j≤R with second stage weights wj
26
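A one-step auxiliary particle filter sketch following the scheme above; the concrete densities assume the linear-Gaussian example model, and µt^(k) is taken to be the prior mean α·xt−1^(k) (both illustrative choices):

```python
import numpy as np

def apf_step(x_prev, w_prev, y_t, alpha, sigma_u, sigma_v, rng, R=None):
    """Auxiliary particle filter step for x_t = alpha*x_{t-1} + u_t, y_t = x_t + v_t."""
    N = len(x_prev)
    R = R or N
    mu = alpha * x_prev                                        # mu_t^(k): prior mean
    # first stage: indices k with probability prop. to w_{t-1}^(k) p(y_t|mu_t^(k))
    log_first = np.log(w_prev) - 0.5 * ((y_t - mu) / sigma_v) ** 2
    first = np.exp(log_first - log_first.max()); first /= first.sum()
    k = rng.choice(N, size=R, p=first)
    # propagate the chosen particles with the prior transition
    x_t = alpha * x_prev[k] + rng.normal(0.0, sigma_u, R)
    # second stage weights: p(y_t|x_t^(j)) / p(y_t|mu_t^(kj))
    logw = -0.5 * ((y_t - x_t) ** 2 - (y_t - mu[k]) ** 2) / sigma_v ** 2
    w = np.exp(logw - logw.max()); w /= w.sum()
    idx = rng.choice(R, size=N, p=w)                           # resample N paths
    return x_t[idx], np.full(N, 1.0 / N), k[idx]
```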
Improving simulation
sampling/approximating predictive πt(xt|x0:t−1) may not be sufficient
for diffusing particles efficiently : e.g. when the discrepancy between successive targets (πt)t>0 is high :
⇒ consider a block of variables xt−L:t for a fixed lag L
27
Approaches using a block of variables
– discrete distributions : Meirovitch 85
– reweighting before resampling :
auxiliary variables Pitt and Shephard 99,
Wang et al. 02
⇒ discrete distribution → analytical form for proposal
xt ∼ πt+L(xt|x0:t−1) = ∫ πt+L(xt:t+L|x0:t−1) dxt+1:t+L
Meirovitch 85 : growing a polymer, random walk in discrete space
→ complexity X^L for lag L
28
Reweighting idea
29
Reweighting technique
30
Reweighting + resampling
[figure : particle paths reweighted then resampled, with offspring counts 0, 1, 2 per path]
31
Approaches using a block of variables
→ auxiliary variables : Pitt and Shephard 99
proposal distribution :
p(xt, k|xt−1^(k), yt:t+L) ∝ ∫ p(xt:t+L, k|xt−1^(k), yt:t+L) dxt+1:t+L
approximated with importance sampling :
g(xt, k|xt−1^(k), yt:t+L) = p(yt+L|µt+L^(k)) · · · p(yt|µt^(k)) p(xt|xt−1^(k))
→ sample (xt^(j), kj)1≤j≤R
kj ∼ g(k|yt:t+L) ∝ p(yt+L|µt+L^(k)) · · · p(yt|µt^(k))
xt^(j) ∼ p(xt|xt−1^(kj))
32
Approaches using a block of variables
auxiliary variables → resampling from (xt^(j))1≤j≤R :
→ propagate/sample xt+1^(j) → xt+L^(j) with prior transitions p(xt|xt−1)
→ use second stage weights : wt^(j) ∝ wt−1^(j) × iw
iw ∝ [p(yt+L|xt+L^(j)) · · · p(yt|xt^(j))] / [p(yt+L|µt+L^(kj)) · · · p(yt|µt^(kj))]
for resampling N paths (x0:t^(i))1≤i≤N
33
Approaches using a block of variables
→ reweighting before resampling : Wang et al. 02
[figure : particle x_t^(i) with weights w_t^(i), a_t^(i) and pilot paths x_t+L^(i_j) with weights a_t^(i_j)]
propagate particles xt^(i) → xt+1:t+L^(ij) for j = 1, . . . , R
compute weights at^(ij) ; particle path x0:t^(i) reweighted with e.g.
at^(i) = [Σ_{j=1}^R at^(ij)]^α
resampling from the set (x0:t^(i), at^(i))i=1,...,N
34
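A rough sketch of the pilot reweighting idea described above: each particle sends out R pilot paths propagated L steps with the prior, and their accumulated likelihoods give the pre-resampling weight at^(i). The linear-Gaussian example model and the aggregation exponent α are assumed here for illustration only.

```python
import numpy as np

def pilot_weights(x_t, y_future, R=10, alpha_pow=1.0,
                  alpha=0.9, sigma_u=1.0, sigma_v=0.1, rng=None):
    """a_t^(i) = (sum_j a_t^(i_j))^alpha, where a_t^(i_j) accumulates the
    likelihoods of pilot path j over the next L = len(y_future) observations."""
    rng = rng or np.random.default_rng()
    a = np.zeros(len(x_t))
    for i, xi in enumerate(x_t):
        x = np.full(R, xi)
        lik = np.ones(R)
        for y in y_future:                                  # L future observations
            x = alpha * x + rng.normal(0.0, sigma_u, R)     # prior transition
            lik *= np.exp(-0.5 * ((y - x) / sigma_v) ** 2)  # accumulate p(y|x)
        a[i] = lik.sum() ** alpha_pow
    return a / a.sum()
```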
Reweighting
→ need to sample/propagate xt from/by a block of variables :
πt+L(xt|x0:t−1) = ∫ πt+L(xt:t+L|x0:t−1) dxt+1:t+L
⇒ sampling a block of variables
→ design a proposal/candidate distribution
35
Sampling recursively a block of variables
[figure : time axis t−L, t−L+1, . . . , t−1, t]
xt−L:t−1 → xt−L+1:t : imputing xt and re-imputing xt−L+1:t−1
36
Sampling a block of variables
[figure : previous path x_0:t−1 and candidate block x'_t−L+1:t on the time axis t−L, . . . , t]
direct sampling :
xt−L+1:t ∼ qt(xt−L+1:t|x0:t−1)
37
Sampling a block of variables
[figure : path x_0:t−L, old block x_t−L+1:t−1 and candidate block x'_t−L+1:t on the time axis t−L, . . . , t]
proposal/candidate distribution for the block :
(x0:t−L, xt−L+1:t) ∼ ∫ πt−1(x0:t−1) qt(xt−L+1:t|x0:t−1) dxt−L+1:t−1
38
Sampling a block of variables
⇒ Idea : consider extended block of variables
(x0:t−L, xt−L+1:t) → (x0:t−L, xt−L+1:t−1, xt−L+1:t) = (x0:t−1, xt−L+1:t)
[figure : extended block made of x_0:t−L, the old block x_t−L+1:t−1 and the new block x'_t−L+1:t]
39
Sampling a block of variables
candidate distribution for extended block
(x0:t−L, xt−L+1:t−1, xt−L+1:t) = (x0:t−1, xt−L+1:t) :
(x0:t−1, xt−L+1:t) ∼ πt−1(x0:t−1)qt(xt−L+1:t|x0:t−1)
direct sampling :
(x0:t−L, xt−L+1:t) ∼ ∫ πt−1(x0:t−1) qt(xt−L+1:t|x0:t−1) dxt−L+1:t−1
40
Sampling a block of variables
target distribution for the block (x0:t−L, xt−L+1:t) :
πt(x0:t−L, xt−L+1:t)
⇒ auxiliary target distribution for the extended block
(x0:t−1, xt−L+1:t) = (x0:t−L, xt−L+1:t−1, xt−L+1:t) :
πt(x0:t−L, xt−L+1:t)rt(xt−L+1:t−1|x0:t−L, xt−L+1:t)
with rt = any conditional distribution
⇒ proposal + target distributions → importance sampling
41
Sequential Importance Block Sampling/Resampling
Simulation scheme t − 1 → t (index (i) dropped) :
– Proposal sampling step
xt−L+1:t ∼ qt(xt−L+1:t|x0:t−1)
– Updating weights
wt ∝ wt−1 × πt(x0:t−L, xt−L+1:t) rt(xt−L+1:t−1|x0:t−L, xt−L+1:t) / [πt−1(x0:t−1) qt(xt−L+1:t|x0:t−1)]
– Resampling step
42
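A structural sketch of the block weight update above for a single particle, with the block proposal qt, the auxiliary conditional rt and the targets passed in as user-supplied log-density callables (not tied to any specific model):

```python
import numpy as np

def block_sisr_step(path, log_w, L, sample_q, log_q, log_r, log_pi_t, log_pi_tm1, rng):
    """One block-SISR step t-1 -> t: re-impute x_{t-L+1:t-1} and append x_t.

    path : list holding x_{0:t-1}; sample_q(path, rng) returns the new block
    x_{t-L+1:t} (length L); log_q, log_r, log_pi_t, log_pi_tm1 are log-densities.
    """
    old_block = path[-(L - 1):] if L > 1 else []        # x_{t-L+1:t-1}
    new_block = list(sample_q(path, rng))               # x_{t-L+1:t}
    new_path = path[:len(path) - (L - 1)] + new_block   # (x_{0:t-L}, new block)
    log_iw = (log_pi_t(new_path) + log_r(old_block, path, new_block)
              - log_pi_tm1(path) - log_q(new_block, path))
    return new_path, log_w + log_iw
```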
Sampling techniques for a block of variables
sampling the block xt−L+1:t ∼ qt(xt−L+1:t|x0:t−1) :
→ forward-backward recursion : e. g. Carter and Kohn 94
[figure : time axis t−L, t−L+1, . . . , t−1, t]
xt−L:t−1 → xt−L+1:t : imputing xt and re-imputing xt−L+1:t−1
43
Sampling techniques for a block of variables
→ forward-backward recursion :
[figure : time axis t−L, t−L+1, . . . , t−1, t]
xt−L:t−1 → xt−L+1:t : imputing xt and re-imputing xt−L+1:t−1
→ approximations : expansions, mixture models, MCMC, . . .
44
Improving simulation
Optimal proposal distribution qt(xt−L+1:t|x0:t−1) :
→ minimizing variance of incremental weight wt^(i) ∝ wt−1^(i) × iw :
iw = πt(x0:t−L, xt−L+1:t) rt(xt−L+1:t−1|x0:t−L, xt−L+1:t) / [πt−1(x0:t−1) qt(xt−L+1:t|x0:t−1)]
⇒ qt = L-step ahead predictive
πt(xt−L+1:t|x0:t−L) = p(xt−L+1:t|xt−L, yt−L+1:t)
For one variable : optimal qt = 1-step ahead predictive
πt(xt|x0:t−1) = p(xt|xt−1, yt)
45
Improving simulation
→ block of variables ⇒ optimal proposal and target distribution
minimizing variance of incremental weight wt^(i) ∝ wt−1^(i) × iw
iw = πt(x0:t−L, xt−L+1:t) rt(xt−L+1:t−1|x0:t−L, xt−L+1:t) / [πt−1(x0:t−1) qt(xt−L+1:t|x0:t−1)]
→ optimal conditional distribution rt(xt−L+1:t−1|x0:t−L, xt−L+1:t)
⇒ rt = (L − 1)-step ahead predictive
πt−1(xt−L+1:t−1|x0:t−L) = p(xt−L+1:t−1|xt−L, yt−L+1:t−1)
46
Improving simulation
For optimal qt and rt, incremental weight wt^(i) ∝ wt−1^(i) × iw :
iw → πt(x0:t−L)/πt−1(x0:t−L) = p(x0:t−L|y1:t)/p(x0:t−L|y1:t−1)
∝ p(yt|xt−L, yt−L+1:t−1) = ∫ p(yt, xt−L+1:t|xt−L, yt−L+1:t−1) dxt−L+1:t
SISR for one variable with optimal proposal qt :
iw → πt(x0:t−1)/πt−1(x0:t−1) ∝ p(yt|xt−1) = ∫ p(yt|xt) p(xt|xt−1) dxt
Bootstrap filter : iw = p(yt|xt)
47
Approximations for block sampling
Sampling (sub-)optimal qt(xt−L+1:t|x0:t−1) :
→ exact/approximated forward-backward recursions
→ approximations : expansions, mixture models, MCMC, . . .
For approximated optimal qt and rt, incremental weight :
iw = πt(x0:t−L, xt−L+1:t) qt−1(xt−L+1:t−1|x0:t−L) / [πt−1(x0:t−1) qt(xt−L+1:t|x0:t−L)]
   = p(x0:t−L, xt−L+1:t|y1:t) q(xt−L+1:t−1|xt−L, yt−L+1:t−1) / [p(x0:t−1|y1:t−1) q(xt−L+1:t|xt−L, yt−L+1:t)]
48
Overview - Break
– Introduction : state space models, Monte Carlo methods
– Sequential Importance Sampling/Resampling
– Strategies for sampling :
→ “optimal” candidate distribution
sampling with e.g. auxiliary variables
→ considering a block of variables : reweighting
⇒ sampling a block of variables :
definition of importance sampling for a block
performing sampling → “optimal” candidate distribution
– ⇒ Examples, applications :
→ simple, complex models
→ why the sampling strategy for particles can be crucial ?
49
Example
Linear and Gaussian state space model :
xt = αxt−1 + ut,   x0, ut ∼ N(0, 1)
yt = xt + vt,   vt ∼ N(0, σ²)
Sequential Monte Carlo methods :
– Bootstrap filter, proposal p(xt|xt−1)
– SISR with optimal proposal p(xt|xt−1, yt)
– SISR for blocks with optimal proposal p(xt−L+1:t|xt−L, yt−L+1:t)
computed by forward-backward exact recursions
⇒ estimates compared with Kalman filter results
⇒ approximation of target distribution p(xt|y1:t)
50
Estimation
[figure : x(t) and its estimates vs. time index]
model (α, σ) = (0.9, 0.1),   x̂t = Σ_{i=1}^N wt^(i) xt^(i),   N = 100
51
Approximation of the target distribution
⇒ Effective Sample Size :
ESS = 1 / Σ_{i=1}^N [wt^(i)]²
w^(i) = 1/N for all i : ESS = N
[figure : evenly weighted particles under pi(x_t)]
w^(i) ≈ 0 ∀i except one : ESS = 1
[figure : one dominant weight under pi(x_t)]
⇒ Resampling performed for ESS ≤ N/2 or N/10
52
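The ESS criterion and the resampling trigger above, transcribed directly into Python (with the N/2 threshold used in the experiments as the default):

```python
import numpy as np

def effective_sample_size(weights):
    """ESS = 1 / sum_i [w_t^(i)]^2 for normalized weights."""
    return 1.0 / np.sum(weights ** 2)

def maybe_resample(particles, weights, rng, threshold_frac=0.5):
    """Resample (multinomially) only when ESS <= threshold_frac * N."""
    N = len(weights)
    if effective_sample_size(weights) <= threshold_frac * N:
        idx = rng.choice(N, size=N, p=weights)
        return particles[idx], np.full(N, 1.0 / N)
    return particles, weights
```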
Approximation of the target distribution
Resampling for ESS ≤ N/2, N = 100
[figure : ESS vs. time index]
ESS for Bootstrap (→), SISR (→) and SISR for blocks
of 2 variables (→) with optimal proposals
53
Approximation of the target distribution
Resampling for ESS ≤ N/2, N = 100, 100 time steps
algorithm ESS resampling steps CPU time
Bootstrap 11.19 99 0.84
optimal SISR 77.1 2 0.12
Block-SISR L = 2 99.1 1 0.23
54
Approximation of the target distribution
Resampling for ESS ≤ N/2, N = 100, ∞ time steps
algorithm ESS resampling steps CPU time
Bootstrap 10 100% ∝0.84
optimal SISR 75 0.04% ∝0.12
Block-SISR L = 2 99 0% ∝0.23
55
Approximation of the target distribution
Resampling for ESS ≤ N/2, various N
algorithm ESS resampling steps
Bootstrap 10%N 100%
optimal SISR 75%N 0.04%
Block-SISR L = 2 99%N 0%
computational complexity : resampling O(N) → CPU time
56
CPU time / number of particles N
Resampling for ESS ≤ N/2, 1,000 time steps
[figure : CPU time vs. number of particles N]
Bootstrap (→), SISR (→) and SISR for blocks
of 2 variables (→) with optimal proposals
57
CPU time / number of particles N
Resampling for ESS ≤ N/2, 1,000 time steps
[figure : CPU time vs. number of particles N]
SISR (→) and SISR for blocks
of 2 variables (→) with optimal proposals
58
Sequential Monte Carlo methods
for this model :
– Same estimation results as Kalman filtering
– choice of sampling scheme ⇒ different approximations of the target distribution
– N ≤ 500 : computational complexity, CPU time
→ SISR with optimal proposal p(xt|xt−1, yt)
– N ≥ 500 : → block SISR with optimal proposal
p(xt−L+1:t|xt−L, yt−L+1:t)
59
Sampling strategies
xt = αxt−1 + ut,   x0, ut ∼ N(0, σu²)
yt = xt + vt,   vt ∼ N(0, σv²)
– σv=0.1 → observation yt very informative relative to the prior (σu=1.0)
⇒ take yt into account for diffusing particles
p(xt|xt−1) → p(xt|xt−1, yt) ⇒ ESS increases
– α=0.9 → variables (xt)t correlated
⇒ sampling by block xt−L+1:t
a block of observations yt−L+1:t is more informative than a single one yt
p(xt|xt−1, yt) → p(xt−L+1:t|xt−L, yt−L+1:t) ⇒ ESS increases
60
Approximation of the target distribution
Resampling for ESS ≤ N/2, N = 100
[figure : four panels of ESS vs. time index]
ESS for Bootstrap (→), SISR with optimal proposals for 1 (→),
2 (→) and 10 variables (→)
σu=1.0, left/right : σv=0.1/1.0, top/bottom : α=0.9/0.5
61
CPU time / number of particles N
Resampling for ESS ≤ N/2, 1,000 time steps
[figure : four panels of CPU time vs. number of particles N]
Bootstrap (→), SISR with optimal proposals for 1 (→),
2 (→) and 10 variables (→)
σu=1.0, left/right : σv=0.1/1.0, top/bottom : α=0.9/0.5
62
CPU time / number of particles N
Resampling for ESS ≤ N/2, 1,000 time steps
[figure : four panels of CPU time vs. number of particles N]
SISR with optimal proposals for 1 (→), 2 (→) and 10 variables (→)
σu=1.0, left/right : σv=0.1/1.0, top/bottom : α=0.9/0.5
63
Approximation of the target distribution
Resampling for ESS ≤ N/2, N = 100
[figure : four panels of ESS vs. time index]
ESS for Bootstrap (→), SISR with optimal proposals for 1 (→),
2 (→) and 10 variables (→)
σu=0.1, left/right : σv=0.1/1.0, top/bottom : α=0.9/0.5
64
CPU time / number of particles N
Resampling for ESS ≤ N/2, 1,000 time steps
[figure : four panels of CPU time vs. number of particles N]
Bootstrap (→), SISR with optimal proposals for 1 (→),
2 (→) and 10 variables (→)
σu=0.1, left/right : σv=0.1/1.0, top/bottom : α=0.9/0.5
65
CPU time / number of particles N
Resampling for ESS ≤ N/2, 1,000 time steps
[figure : four panels of CPU time vs. number of particles N]
SISR with optimal proposals for 1 (→), 2 (→) and 10 variables (→)
σu=0.1, left/right : σv=0.1/1.0, top/bottom : α=0.9/0.5
66
Overview - Break
– Introduction : state space models, Monte Carlo methods
– Sequential Importance Sampling/Resampling
– Strategies for sampling
– Applications : Linear and Gaussian model
⇒ sampling strategy :
→ approximation of the target distribution, CPU time
→ information in the observation, dynamics of the state variable
→ nonlinear non-Gaussian models
67
Example
Nonlinear state space model :
xt = α(xt−1 + β xt−1³) + ut,   x0, ut ∼ N(0, σu²)
yt = xt + vt,   vt ∼ N(0, σv²)
Sequential Monte Carlo methods :
– Bootstrap filter, proposal p(xt|xt−1)
– SISR with optimal proposal p(xt|xt−1, yt) approximated by
KF/EKF
– SISR for blocks with optimal proposal p(xt−L+1:t|xt−L, yt−L+1:t)
approximated by forward-backward recursions with KF/EKF
Parameter values α=0.9, β=0.2, σu=0.1 and σv=0.05
⇒ approximation of target distribution p(xt|y1:t)
68
Simulation results
algorithm MSE ESS RS CPU
Bootstrap 0.0021 36.8 70.3 % 0.68
SISR-KF 0.0019 64.7 19.3% 0.44
SISR-EKF 0.0019 65.8 19.2% 0.48
BSISR-KF 0.0018 72.3 0.9% 0.21
BSISR-EKF 0.0018 73.5 0.8% 0.24
N = 100 particles, 100 runs of particle filters for a single and for a
block of L = 2 variables (MSE from KF/EKF = 0.0034).
69
Approximation of the target distribution
Resampling for ESS ≤ N/2, N = 100
[figure : approximated ESS vs. time index]
Approximated ESS vs. time index for a realization of the Bootstrap
filter (dotted), the SISR with Kalman filter proposal for a single
variable (dashdotted) and for a block of L=2 variables (straight).
70
Simulation results
block size L N=100 N=500 N=1000 RS
2 74 370 715 0.9%
3 96 493 985 0.9%
4 99 496 989 1%
5 98 494 988 1%
10 97 486 972 2.5%
Approximated ESS averaged over 100 runs of particle filters for
blocks of L variables, considering N particles.
71
CPU time / number of particles N
Resampling for ESS ≤ N/2, 1,000 time steps
[figure : CPU time vs. number of particles N]
CPU time vs. N for bootstrap filter (dotted), SISR with KF proposal
for a single variable (KF : dashed, EKF : dashdotted) and for a block
of L=2 variables (straight), 100 realizations.
72
CPU time / number of particles N
Resampling for ESS ≤ N/2, 1,000 time steps
[figure : computing time vs. number of particles N]
Computational time vs. N for block sampling scheme with lags from
L=2 (bottom), 3, 4, 5, 10 (top), 100 realizations.
73
Sequential Monte Carlo methods
for this model :
– Good approximations of the target distribution
– choice of sampling scheme ⇒ different approximations of the target distribution
– even for small N, block SISR with approximated optimal proposal
p(xt−L+1:t|xt−L, yt−L+1:t) is efficient for L=3, 4, 5
– → information in observation : σu=0.1, σv=0.05
74
Conclusion
⇒ Importance of proposal/candidate distribution for
Sequential Monte Carlo simulation methods
design of proposal :
→ information in the observation, dynamics of the state variable :
p(xt|xt−1) ←→ p(xt|yt, xt−1) ←→ p(xt|yt)
→ sampling a block/fixed lag of variables can be useful :
– for intermittent/informative observation, correlated variables
– applications ⇒ tracking, radar, navigation, positioning . . .
75
References - SISR, Sequential Monte Carlo
– N. Gordon, D. Salmond, and A. F. M. Smith, “Novel approach to
nonlinear and non-Gaussian Bayesian state estimation,”
Proceedings IEE-F, vol. 140, pp. 107–113, 1993.
– G. Kitagawa, “Monte Carlo filter and smoother for non-Gaussian
nonlinear state space models,” J. Comput. Graph. Statist., vol. 5,
pp. 1–25, 1996.
– A. Doucet, N. de Freitas, and N. Gordon, Eds., Sequential Monte
Carlo methods in practice, Statistics for engineering and
information science. Springer, 2001.
76
References - block/fixed lag approaches
– H. Meirovitch, “Scanning method as an unbiased simulation
technique and its application to the study of self-avoiding random
walks,” Phys. Rev. A, vol. 32, pp. 3699–3708, 1985.
– M. K. Pitt and N. Shephard, “Filtering via simulation : auxiliary
particle filter,” J. Am. Stat. Assoc., vol. 94, pp. 590–599, 1999.
– X. Wang, R. Chen, and D. Guo, “Delayed-pilot sampling for
mixture Kalman filter with application in fading channels,” IEEE
Trans. Sig. Proc., vol. 50, pp. 241–253, 2002.
77
References - block/fixed lag sampling methods
– A. Doucet and S. Sénécal, “Fixed-Lag Sequential Monte Carlo,”
accepted at EUSIPCO 2004.
– S. Sénécal and A. Doucet, “An example of sequential Monte Carlo
block sampling method,” AIC2003 Science of Modeling,
pp. 418–419, 2003.
– C. K. Carter and R. Kohn, “On the Gibbs sampling for state space
models,” Biometrika, vol. 81, pp. 541–553, 1994.
78