Presentation material (entirely in English) for our lab's reading group on "Pattern Recognition and Machine Learning" (PRML) by Bishop.
The sections covered are:
- The exponential family and its maximum likelihood estimation
- Overview of nonparametric methods for density estimation
- Kernel density estimators
- Nearest-neighbour methods and their application to classification
2. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Today's topics
1. The exponential family
   1. What is the exponential family?
   2. Maximum likelihood for EF
   3. How to decide priors for EF
2. Nonparametric methods
   1. What is the point of nonparametric methods?
   2. Kernel density estimator
   3. Nearest-neighbour methods
June 11, 2014
PRML 2.4-2.5
Shinichi TAMURA
12. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
The Exponential Family
Almost all of the distributions we have studied so far belong to a single class, namely the exponential family.
[Diagram: within the set of parametric distributions, the exponential family contains the Bernoulli, multinomial, Gaussian, beta, gamma, von Mises, etc.; the Gaussian mixture, etc. lie outside the exponential family.]
16. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
The Exponential Family
The exponential family over x given η is the class of distributions of the form
$p(x|\eta) = h(x)\, g(\eta) \exp\{\eta^{\mathsf T} u(x)\}$
• $\eta$: natural parameter
• $g(\eta)$: normalizing constant
• $\eta^{\mathsf T} u(x)$: the term where x and η meet
18. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
The Exponential Family
E.g. 1) The Bernoulli Distribution
$p(x|\eta) = \mu^{x} (1-\mu)^{1-x} = \sigma(-\eta) \exp(\eta x)$
where $\eta = \ln\frac{\mu}{1-\mu}$, $u(x) = x$, $h(x) = 1$, $g(\eta) = \sigma(-\eta)$, and σ is the logistic sigmoid.
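The Bernoulli rewriting above can be checked numerically. The following is a minimal sketch, not part of the original slides; the function names and the value of `mu` are illustrative assumptions.

```python
import math

def sigmoid(a):
    """Logistic sigmoid, sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def bernoulli_standard(x, mu):
    """Bernoulli pmf in the usual form mu^x (1 - mu)^(1 - x)."""
    return mu ** x * (1.0 - mu) ** (1 - x)

def bernoulli_expfam(x, eta):
    """Same pmf in exponential-family form: h(x) = 1, g(eta) = sigma(-eta), u(x) = x."""
    return sigmoid(-eta) * math.exp(eta * x)

mu = 0.3                          # illustrative value (assumption)
eta = math.log(mu / (1.0 - mu))   # natural parameter (log-odds)
for x in (0, 1):
    # The two columns should agree up to floating-point error.
    print(x, bernoulli_standard(x, mu), bernoulli_expfam(x, eta))
```

Both forms give the same probabilities because $\sigma(-\eta) = 1 - \mu$ and $\exp(\eta) = \mu / (1 - \mu)$.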
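21. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
The Exponential Family
E.g. 2) The Multinomial Distribution
$p(x|\eta) = \prod_{k=1}^{M} \mu_k^{x_k} = \exp(\eta^{\mathsf T} x)$
where $\eta = (\ln\mu_1, \ldots, \ln\mu_M)^{\mathsf T}$, $u(x) = x$, $h(x) = 1$, $g(\eta) = 1$.
$\Rightarrow \sum_{k=1}^{M} \exp(\eta_k) = \sum_{k=1}^{M} \mu_k = 1$
It's inconvenient! (Because of this constraint, the components of η are not independent.)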
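24. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
The Exponential Family
E.g. 2) The Multinomial Distribution
Remove the constraint by
$\mu_M = 1 - \sum_{k=1}^{M-1} \mu_k, \quad x_M = 1 - \sum_{k=1}^{M-1} x_k$
Therefore...
$p(x|\mu) = \exp\left\{ \sum_{k=1}^{M-1} x_k \ln\mu_k + \left(1 - \sum_{k=1}^{M-1} x_k\right) \ln\left(1 - \sum_{k=1}^{M-1} \mu_k\right) \right\}$
$= \exp\left\{ \sum_{k=1}^{M-1} x_k \ln\frac{\mu_k}{1 - \sum_{j=1}^{M-1} \mu_j} + \ln\left(1 - \sum_{k=1}^{M-1} \mu_k\right) \right\}$
$= \left(1 - \sum_{k=1}^{M-1} \mu_k\right) \exp\left\{ \sum_{k=1}^{M-1} x_k \ln\frac{\mu_k}{1 - \sum_{j=1}^{M-1} \mu_j} \right\}.$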
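25. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
The Exponential Family
E.g. 2') The Multinomial Distribution w/o constraint
$p(x|\eta) = \prod_{k=1}^{M} \mu_k^{x_k} = \left(1 + \sum_{k=1}^{M-1} \exp(\eta_k)\right)^{-1} \exp(\eta^{\mathsf T} x)$
where
$\eta = \left( \ln\frac{\mu_1}{1 - \sum_{j=1}^{M-1} \mu_j}, \ldots, \ln\frac{\mu_{M-1}}{1 - \sum_{j=1}^{M-1} \mu_j}, 0 \right)^{\mathsf T}$,
$u(x) = x$, $h(x) = 1$, $g(\eta) = \left(1 + \sum_{k=1}^{M-1} \exp(\eta_k)\right)^{-1}$.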
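The unconstrained parameterization above can be sketched in code as a pair of maps between μ and η. This is an illustrative sketch only; the function names are assumptions, and the last component of η is fixed at 0 as on the slide.

```python
import numpy as np

def mu_to_eta(mu):
    """Natural parameters eta_k = ln(mu_k / mu_M) for k = 1..M-1 (mu_M = 1 - sum of the others)."""
    mu = np.asarray(mu, dtype=float)
    return np.log(mu[:-1] / mu[-1])

def eta_to_mu(eta):
    """Inverse map: a softmax whose last component of eta is fixed at 0."""
    e = np.exp(np.asarray(eta, dtype=float))
    g = 1.0 / (1.0 + e.sum())          # g(eta)
    return np.append(g * e, g)         # (mu_1, ..., mu_{M-1}, mu_M)

def pmf_expfam(x, eta):
    """p(x|eta) = g(eta) exp(eta^T x) for a 1-of-M vector x."""
    eta = np.asarray(eta, dtype=float)
    eta_full = np.append(eta, 0.0)     # the M-th natural parameter is 0
    g = 1.0 / (1.0 + np.exp(eta).sum())
    return g * np.exp(eta_full @ np.asarray(x, dtype=float))

mu = np.array([0.2, 0.5, 0.3])         # illustrative probabilities (assumption)
eta = mu_to_eta(mu)
print(eta_to_mu(eta))                  # recovers mu up to rounding
print(pmf_expfam([0, 1, 0], eta))      # probability of the second category
```

Only M−1 free numbers are stored, so the sum-to-one constraint is satisfied automatically when mapping back to μ.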
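27. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
The Exponential Family
E.g. 3) The Gaussian Distribution
$p(x|\eta) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{ -\frac{1}{2\sigma^2} (x - \mu)^2 \right\}$
$= (2\pi)^{-1/2} (-2\eta_2)^{1/2} \exp\left( \frac{\eta_1^2}{4\eta_2} \right) \exp\left\{ (\eta_1, \eta_2) \begin{pmatrix} x \\ x^2 \end{pmatrix} \right\}$
where
$\eta = \left( \frac{\mu}{\sigma^2}, -\frac{1}{2\sigma^2} \right)^{\mathsf T}$,
$u(x) = (x, x^2)^{\mathsf T}$, $h(x) = (2\pi)^{-1/2}$, $g(\eta) = (-2\eta_2)^{1/2} \exp\left( \frac{\eta_1^2}{4\eta_2} \right)$.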
30. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Maximum likelihood for EF
OK, we know what the EF looks like.
Then how do we estimate the parameter?
Maximize the likelihood! This is the frequentist way.
33. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Maximum likelihood for EF
Suppose we have i.i.d. data $X = \{x_1, \ldots, x_N\}$.
The log-likelihood of η is
$\ln p(X|\eta) = \ln \prod_{n=1}^{N} p(x_n|\eta)$
$= \ln \prod_{n=1}^{N} h(x_n)\, g(\eta) \exp\{\eta^{\mathsf T} u(x_n)\}$
$= \sum_{n=1}^{N} \ln h(x_n) + N \ln g(\eta) + \eta^{\mathsf T} \sum_{n=1}^{N} u(x_n).$
$\therefore\; \nabla_\eta \ln p(X|\eta) = N \nabla_\eta \ln g(\eta) + \sum_{n=1}^{N} u(x_n).$
Set this gradient to zero to find the maximum.
34. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Maximum likelihood for EF
Therefore
$-\nabla_\eta \ln g(\eta_{\mathrm{ML}}) = \frac{1}{N} \sum_{n=1}^{N} u(x_n).$
Here, $\eta_{\mathrm{ML}}$ depends on the data only through $\sum_n u(x_n)$, so this quantity is called the "sufficient statistic".
We need to store only $\sum_n u(x_n)$ for estimation.
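35. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Maximum likelihood for EF
E.g.) Gaussian distribution
With $g(\eta) = (-2\eta_2)^{1/2} \exp\{\eta_1^2 / (4\eta_2)\}$ and $u(x) = (x, x^2)^{\mathsf T}$,
$-\nabla \ln g(\eta) = \left( -\frac{\eta_1}{2\eta_2},\; -\frac{1}{2\eta_2} + \frac{\eta_1^2}{4\eta_2^2} \right)^{\mathsf T} = \left( \mu,\; \sigma^2 + \mu^2 \right)^{\mathsf T}.$
$\therefore\; \mu_{\mathrm{ML}} = \frac{1}{N} \sum_n x_n, \quad \sigma^2_{\mathrm{ML}} = \frac{1}{N} \sum_n x_n^2 - \left( \frac{1}{N} \sum_n x_n \right)^2.$
That's what we already know.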
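As a hedged illustration of the Gaussian example, the sketch below computes the ML estimates from the two sufficient statistics alone. The function name, the seed, and the synthetic data are assumptions made for this example.

```python
import numpy as np

def gaussian_ml_from_sufficient_stats(sum_x, sum_x2, n):
    """ML estimates of (mu, sigma^2) from sum(x_n), sum(x_n^2) and N only."""
    mean_u1 = sum_x / n            # (1/N) sum x_n,   matches mu
    mean_u2 = sum_x2 / n           # (1/N) sum x_n^2, matches sigma^2 + mu^2
    mu_ml = mean_u1
    var_ml = mean_u2 - mean_u1 ** 2
    return mu_ml, var_ml

rng = np.random.default_rng(0)     # fixed seed for reproducibility (assumption)
x = rng.normal(loc=1.5, scale=2.0, size=10_000)   # synthetic data (assumption)

# Only two running sums and the count are needed, not the raw data.
mu_ml, var_ml = gaussian_ml_from_sufficient_stats(x.sum(), (x ** 2).sum(), x.size)
print(mu_ml, var_ml)
```

The result agrees with the sample mean and the (biased) sample variance, which is why storing $\sum_n u(x_n)$ is enough.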
36. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Maximum likelihood for EF
By the way, we want to know the relation between $\eta_{\mathrm{ML}}$ and $\eta$.
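38. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Maximum likelihood for EF
Taking the gradient with respect to η of the normalization condition
$\int h(x)\, g(\eta) \exp\{\eta^{\mathsf T} u(x)\}\, dx = 1$
gives
$\nabla g(\eta) \int h(x) \exp\{\eta^{\mathsf T} u(x)\}\, dx + \int h(x)\, g(\eta) \exp\{\eta^{\mathsf T} u(x)\}\, u(x)\, dx = 0.$
Since the first integral equals $1 / g(\eta)$ and the second is $\mathbb{E}[u(x)]$,
$\Leftrightarrow\; -\nabla \ln g(\eta) = \mathbb{E}[u(x)].$
This is similar to
$-\nabla_\eta \ln g(\eta_{\mathrm{ML}}) = \frac{1}{N} \sum_{n=1}^{N} u(x_n).$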
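The identity $-\nabla \ln g(\eta) = \mathbb{E}[u(x)]$ can be spot-checked for the Bernoulli case, where $g(\eta) = \sigma(-\eta)$ and $u(x) = x$. This is a rough numerical sketch using a finite difference; the step size and the value of `mu` are assumptions.

```python
import math

def log_g(eta):
    """ln g(eta) for the Bernoulli case: g(eta) = sigma(-eta) = 1 / (1 + exp(eta))."""
    return -math.log1p(math.exp(eta))

def neg_grad_log_g(eta, step=1e-6):
    """Central finite-difference approximation of -d/d(eta) ln g(eta)."""
    return -(log_g(eta + step) - log_g(eta - step)) / (2.0 * step)

mu = 0.3                                   # illustrative value (assumption)
eta = math.log(mu / (1.0 - mu))            # natural parameter
expected_u = 0 * (1.0 - mu) + 1 * mu       # E[u(x)] = E[x] = mu

print(neg_grad_log_g(eta), expected_u)     # the two numbers should nearly coincide
```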
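41. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Maximum likelihood for EF
According to the law of large numbers (LLN), the sample mean converges to the expectation, so $\eta_{\mathrm{ML}}$ converges to $\eta$.
$-\nabla_\eta \ln g(\eta_{\mathrm{ML}}) = \frac{1}{N} \sum_{n=1}^{N} u(x_n)$
$-\nabla \ln g(\eta) = \mathbb{E}[u(x)]$
The right-hand side of the first equation converges to that of the second, and hence so does the left-hand side.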
44. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Priors for EF
If you want to use Bayesian inference, a prior distribution is needed.
Then how should we choose it if we know nothing about the parameter?
48. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Priors for EF
Three candidates:
1. Conjugate priors ... easy to handle
2. Uniform distributions ... principle of indifference
3. Noninformative priors ... keep the effect of the prior small
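55. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Priors for EF – Conjugate priors
Distributions in the EF have the factor $g(\eta) \exp(\eta^{\mathsf T} u)$, so the conjugate prior is
$p(\eta|\chi, \nu) = f(\chi, \nu) \left[ g(\eta) \exp\{\eta^{\mathsf T} \chi\} \right]^{\nu} = f(\chi, \nu)\, g(\eta)^{\nu} \exp\{\nu\, \eta^{\mathsf T} \chi\},$
where $f(\chi, \nu)$ is the normalizing constant.
It gives the following posterior.
$p(\eta|X, \chi, \nu) \propto \prod_{n=1}^{N} h(x_n)\, g(\eta) \exp\{\eta^{\mathsf T} u(x_n)\} \times g(\eta)^{\nu} \exp\{\nu\, \eta^{\mathsf T} \chi\}$
$\propto g(\eta)^{N+\nu} \exp\left\{ \eta^{\mathsf T} \left( \sum_{n=1}^{N} u(x_n) + \nu\chi \right) \right\}$
The posterior has the same functional form as the prior (the two correspond).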
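Matching the posterior above to the prior form $g(\eta)^{\nu'} \exp\{\nu' \eta^{\mathsf T} \chi'\}$ gives $\nu' = \nu + N$ and $\chi' = (\sum_n u(x_n) + \nu\chi) / (\nu + N)$. The sketch below implements that bookkeeping; the function name and the toy Bernoulli data are assumptions for illustration.

```python
import numpy as np

def conjugate_update(chi, nu, u_values):
    """Posterior hyperparameters (chi', nu') from prior (chi, nu) and the values u(x_n).

    `u_values` holds one entry (scalar or vector) per data point.
    """
    u_values = np.asarray(u_values, dtype=float)
    u_values = u_values.reshape(len(u_values), -1)   # shape (N, dim of u)
    n = u_values.shape[0]
    nu_post = nu + n
    chi_post = (u_values.sum(axis=0) + nu * np.asarray(chi, dtype=float)) / nu_post
    return chi_post, nu_post

# Toy Bernoulli example (u(x) = x): prior behaves like 2 pseudo-observations with mean 0.5.
data = [1, 0, 1, 1]
chi_post, nu_post = conjugate_update(chi=[0.5], nu=2, u_values=data)
print(chi_post, nu_post)
```

In this reading, ν acts like an effective number of prior pseudo-observations and χ like their average sufficient statistic.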
58. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Priors for EF – Uniform distributions
The uniform distribution is a common choice for a discrete, bounded variable.
Cf. the principle of insufficient reason (or principle of indifference).
But two problems arise when it is applied to continuous variables:
1. The normalization problem
2. The transformation problem
60. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Priors for EF – Uniform distributions
1. Normalization problem
If the parameter is unbounded, the prior cannot be normalized:
$\int_{-\infty}^{\infty} p(\lambda)\, d\lambda = \int_{-\infty}^{\infty} \mathrm{const}\, d\lambda \to \infty$
Such priors are called "improper".
Note that these priors can still give proper posteriors, because the posterior is then proportional to the likelihood, which can be normalized.
63. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Priors for EF – Uniform distributions
2. Transformation problem
A non-linear transformation gives a non-constant prior.
E.g.) With $p(\lambda) = 1$ and $\eta = \sqrt{\lambda}$,
$p(\eta) = p(\lambda) \left| \frac{d\lambda}{d\eta} \right| = 2\eta$
which is not constant in η. Think "constant for what?"
(Sometimes, the posteriors are not sensitive to the difference.)
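A small simulation can make the transformation problem concrete. The sketch below assumes λ is uniform on (0, 1) so that the prior is proper and can be sampled; the seed, sample size, and bin count are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)                 # fixed seed (assumption)
lam = rng.uniform(0.0, 1.0, size=200_000)      # lambda uniform on (0, 1), so p(lambda) = 1
eta = np.sqrt(lam)                             # non-linear change of variable

# Empirical density of eta on 10 equal-width bins, compared with 2 * eta at the bin centres.
density, edges = np.histogram(eta, bins=10, range=(0.0, 1.0), density=True)
centres = 0.5 * (edges[:-1] + edges[1:])

for c, d in zip(centres, density):
    print(f"eta = {c:.2f}  empirical = {d:.3f}  2*eta = {2 * c:.3f}")
```

The empirical density rises linearly with η even though the samples of λ were uniform, so "uniform" only has meaning relative to a chosen parameterization.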
64. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Priors for EF – Uniform distributions
Keep these problems in mind:
1. The normalization problem
2. The transformation problem
66. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Priors for EF – Noninformative priors
Two examples of noninformative priors:
1. Priors for location parameters
2. Priors for scale parameters
These are constructed to affect the posterior as little as possible, so that the inference is as objective as possible.
69. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Priors for EF – Noninformative priors
1. Priors for location parameters
If the density has the form
$p(x|\mu) = f(x - \mu),$
a constant shift $\hat{x} = x + c$ (with $\hat{\mu} = \mu + c$) gives a density of the same form:
$p(\hat{x}|\hat{\mu}) = f(\hat{x} - \hat{\mu}).$
This property is called "translation invariance", and such a parameter is called a "location parameter".
71. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Priors for EF – Noninformative priors
1. Priors for location parameters
To reflect the translation invariance, the prior should satisfy
$\int_{A}^{B} p(\mu)\, d\mu = \int_{A}^{B} p(\mu - c)\, d\mu \quad \text{for } \forall A, B.$
$\iff p(\mu) = p(\mu - c).$
$\iff p(\mu) = \mathrm{constant}.$
We obtained the uniform distribution after all.
But unlike before, we know when to use it.
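74. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Priors for EF – Noninformative priors
1. Priors for location parameters
E.g.) The mean of a Gaussian
$p(x|\mu) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{ -\frac{1}{2\sigma^2} (x - \mu)^2 \right\}$
This has the form $f(x - \mu)$.
This prior is also obtained as a limit of conjugate priors:
$p(\mu) = \mathcal{N}(\mu|\mu_0, \sigma_0^2) \xrightarrow{\sigma_0^2 \to \infty} \mathrm{const.},$
$\mu_N = \frac{\sigma^2}{N\sigma_0^2 + \sigma^2} \mu_0 + \frac{N\sigma_0^2}{N\sigma_0^2 + \sigma^2} \mu_{\mathrm{ML}} \to \mu_{\mathrm{ML}},$
$\frac{1}{\sigma_N^2} = \frac{1}{\sigma_0^2} + \frac{N}{\sigma^2} \to \frac{N}{\sigma^2}.$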
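The limiting behaviour of the conjugate Gaussian prior can be seen numerically. This sketch simply evaluates the two posterior formulas above for growing prior variance; the function name and the numbers plugged in are assumptions.

```python
def posterior_of_mean(mu0, var0, mu_ml, var, n):
    """Posterior mean and variance of mu for known variance `var` and prior N(mu | mu0, var0)."""
    mu_n = (var / (n * var0 + var)) * mu0 + (n * var0 / (n * var0 + var)) * mu_ml
    var_n = 1.0 / (1.0 / var0 + n / var)
    return mu_n, var_n

# As the prior variance grows, the posterior mean approaches mu_ML = 2.0
# and the posterior variance approaches var / n = 0.1.
for var0 in (1.0, 1e2, 1e4, 1e8):
    print(var0, posterior_of_mean(mu0=0.0, var0=var0, mu_ml=2.0, var=1.0, n=10))
```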
77. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Priors for EF – Noninformative priors
2. Priors for scale parameters
If the density has the form
$p(x|\sigma) = \frac{1}{\sigma} f\left( \frac{x}{\sigma} \right),$
a constant scaling $\hat{x} = cx$ (with $\hat{\sigma} = c\sigma$) gives a density of the same form:
$p(\hat{x}|\hat{\sigma}) = \frac{1}{\hat{\sigma}} f\left( \frac{\hat{x}}{\hat{\sigma}} \right).$
This property is called "scale invariance", and such a parameter is called a "scale parameter".
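78. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Priors for EF – Noninformative priors
2. Priors for scale parameters
To reflect the scale invariance, the prior should satisfy
$\int_{A}^{B} p(\sigma)\, d\sigma = \int_{A}^{B} p\left( \frac{1}{c}\sigma \right) \frac{d\sigma}{d(c\sigma)}\, d\sigma \quad \text{for } \forall A, B.$
$\iff p(\sigma) = \frac{1}{c}\, p\left( \frac{1}{c}\sigma \right).$
$\iff p(\sigma) \propto \frac{1}{\sigma}.$
$\iff p(\ln \sigma) = \mathrm{const.}$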
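81. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Priors for EF – Noninformative priors
2. Priors for scale parameters
E.g.) The standard deviation of a Gaussian
$p(x|\sigma) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{ -\frac{1}{2\sigma^2} x^2 \right\}$
This has the form $\frac{1}{\sigma} f\left( \frac{x}{\sigma} \right)$.
This prior is also obtained as a limit of conjugate priors on the precision $\lambda = 1/\sigma^2$:
$p(\lambda) = \mathrm{Gam}(\lambda|a_0, b_0) \xrightarrow{a_0, b_0 \to 0} \frac{\mathrm{const}}{\lambda},$
$a_N = a_0 + \frac{N}{2} \to \frac{N}{2},$
$b_N = b_0 + \frac{N}{2} \sigma^2_{\mathrm{ML}} \to \frac{N}{2} \sigma^2_{\mathrm{ML}}.$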
82. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Priors for EF – Noninformative priors
Two examples of noninformative priors:
1. Priors for location parameters
$p(x|\mu) = f(x - \mu) \implies p(\mu) = \mathrm{const.}$
2. Priors for scale parameters
$p(x|\sigma) = \frac{1}{\sigma} f\left( \frac{x}{\sigma} \right) \implies p(\sigma) \propto \frac{1}{\sigma}$
88. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Nonparametric methods
So far we have learned the "parametric approach".
vs.
Now we will learn the "nonparametric approach".
What is the difference?
90. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Nonparametric methods
Parametric                                  | Nonparametric
Assume a specific form of the distribution  | Make few assumptions about the form of the distribution
Simple                                      | Complex (depends on data size)
Poor                                        | Rich / flexible
Efficient                                   | Inefficient
92. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Nonparametric methods
We will learn:
1. Histogram methods
2. Kernel density estimators
3. Nearest-neighbour methods
95. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Nonparametric methods
1. Histogram methods
Split the space into grids (or bins), and count the data points in each.
$p(x) = p_i = \frac{n_i}{N \Delta_i} \quad (x \in i\text{-th bin}),$
where
$\Delta_i$ = width of the i-th bin (usually the same for all i),
$n_i$ = number of observations assigned to the i-th bin,
$N$ = total number of observations.
This estimate is piecewise constant, hence discontinuous.
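The histogram estimate $p_i = n_i / (N \Delta_i)$ can be sketched with NumPy as below. The function name, the synthetic beta-distributed data, and the choice of 25 bins (Δ = 0.04 on [0, 1]) are assumptions for illustration.

```python
import numpy as np

def histogram_density(data, num_bins, low, high):
    """Return bin edges and p_i = n_i / (N * Delta_i) for equal-width bins on [low, high].

    Points outside [low, high] are not counted in any n_i but still count towards N.
    """
    data = np.asarray(data, dtype=float)
    counts, edges = np.histogram(data, bins=num_bins, range=(low, high))
    widths = np.diff(edges)                    # Delta_i, identical for all bins here
    return edges, counts / (data.size * widths)

rng = np.random.default_rng(0)                 # fixed seed (assumption)
x = rng.beta(2.0, 5.0, size=1_000)             # synthetic data on (0, 1) (assumption)
edges, p = histogram_density(x, num_bins=25, low=0.0, high=1.0)

# The piecewise-constant estimate integrates to 1 when all points fall inside the range.
print((p * np.diff(edges)).sum())
```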
101. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Nonparametric methods
1. Histogram methods – Example
[Figure: histogram density estimates of the same data on [0, 1] for bin widths Δ = 0.04, Δ = 0.08 and Δ = 0.25]
Δ is...
• Δ = 0.04: too narrow to catch enough points; too spiky (noisy)
• Δ = 0.08: good intermediate value
• Δ = 0.25: too wide to express the data; too smooth (less information)
Finding a good value is very important!
# of bins = $M^D$ for M bins per dimension in D dimensions (curse of dimensionality)
104. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Nonparametric methods
Lessons from histogram methods
Estimate the density at a particular point from the data points in a small local region.
The regions are defined by a "smoothing parameter", which controls the model complexity in relation to the data size.
Other problems
• Discontinuity
• Not scalable (curse of dimensionality)
108. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Nonparametric methods
Lessons from histogram methods
Let's consider a small local region $R$; then
$$\Pr(K \text{ out of } N \text{ data} \in R) = \frac{N!}{K!\,(N-K)!}\, P^{K} (1-P)^{N-K},$$
where $P = \int_{R} p(x)\,dx$.
If
1. K is large enough (smoother not too small)
2. p(x) is roughly constant over R (smoother small enough)
then $K \simeq NP$ and $P \simeq p(x)V$, where V is the volume of R, so
$$p(x) = \frac{K}{NV}.$$
The two conditions are contradictory, and the right balance between them depends on the data size.
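The estimate p(x) = K/(NV) can be checked numerically. The sketch below assumes a 1-D standard normal as the true density and an interval of half-width 0.1 as the region R. These choices, and the name `local_density_estimate`, are illustrative and not from the slides.

```python
import math
import random


def local_density_estimate(data, x, half_width):
    """Estimate p(x) as K / (N * V), with R = [x - half_width, x + half_width]."""
    n = len(data)
    k = sum(1 for xn in data if abs(xn - x) <= half_width)  # K: points inside R
    v = 2.0 * half_width                                     # V: length of R in 1-D
    return k / (n * v)


random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(20000)]  # assumed true density
true_p0 = 1.0 / math.sqrt(2.0 * math.pi)                  # exact p(0) for N(0, 1)
estimate = local_density_estimate(samples, 0.0, 0.1)
print(estimate, true_p0)
```

Shrinking `half_width` reduces the bias from condition 2 but leaves fewer points in R, which makes the estimate noisier (condition 1).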
112. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Kernel density estimators
Fix a region (e.g., a hypercube centred on x with side h) and count the data points in it with a kernel function k(u) (Parzen window):
$$k(\mathbf{u}) = \begin{cases} 1, & |u_i| \le 1/2 \quad (i = 1, \dots, D), \\ 0, & \text{otherwise.} \end{cases}$$
This k(u) is the cube centred on the origin with side 1. Note that it is a discontinuous kernel.
$$K = \sum_{n=1}^{N} k\!\left(\frac{\mathbf{x} - \mathbf{x}_n}{h}\right), \qquad V = h^{D},$$
$$\therefore\ p(\mathbf{x}) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{h^{D}}\, k\!\left(\frac{\mathbf{x} - \mathbf{x}_n}{h}\right).$$
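The Parzen-window estimate above can be sketched in a few lines of plain Python. The function names and the four sample points are made up for illustration.

```python
def cube_kernel(u):
    """k(u) = 1 if every |u_i| <= 1/2, else 0 (unit cube centred on the origin)."""
    return 1.0 if all(abs(ui) <= 0.5 for ui in u) else 0.0


def parzen_density(x, data, h):
    """p(x) = (1/N) * sum_n (1/h^D) * k((x - x_n) / h)."""
    n = len(data)
    d = len(x)
    total = 0.0
    for xn in data:
        u = [(xi - xni) / h for xi, xni in zip(x, xn)]
        total += cube_kernel(u)  # adds 1 when x_n lies in the cube of side h around x
    return total / (n * h ** d)


# Illustrative 2-D data (not from the slides).
points = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (3.0, 3.0)]
print(parzen_density((0.0, 0.0), points, h=1.0))  # 2 of the 4 points fall in the cube
```

Because the kernel is an indicator, the estimate jumps whenever a data point crosses the cube boundary, which is the discontinuity noted above.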
116. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Kernel density estimators
The symmetry of k(u) lets us re-interpret the result:
• N data points in the single cube centred on x
• N cubes centred on the data points x_n around x
118. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Kernel density estimators
Another choice of k(u): the Gaussian
$$k(\mathbf{u}) = \frac{1}{(2\pi)^{D/2}} \exp\!\left(-\frac{\|\mathbf{u}\|^{2}}{2}\right).$$
This kernel gives a continuous density.
You can use any function as a kernel as long as it satisfies
$$k(\mathbf{u}) \ge 0, \qquad \int k(\mathbf{u})\,d\mathbf{u} = 1.$$
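As a rough sketch, the Gaussian kernel can be plugged into the same estimator, p(x) = (1/N) Σ (1/h^D) k((x − x_n)/h). The function names and the two data points are assumptions for illustration.

```python
import math


def gaussian_kernel(u):
    """k(u) = (2*pi)^(-D/2) * exp(-||u||^2 / 2)."""
    d = len(u)
    sq_norm = sum(ui * ui for ui in u)
    return math.exp(-0.5 * sq_norm) / (2.0 * math.pi) ** (d / 2.0)


def gaussian_kde(x, data, h):
    """p(x) = (1/N) * sum_n (1/h^D) * k((x - x_n) / h) with the Gaussian kernel."""
    n = len(data)
    d = len(x)
    total = sum(
        gaussian_kernel([(xi - xni) / h for xi, xni in zip(x, xn)])
        for xn in data
    )
    return total / (n * h ** d)


# Illustrative 1-D data (not from the slides).
data = [(0.0,), (1.0,)]
print(gaussian_kde((0.5,), data, h=0.5))
```

Since the kernel is non-negative and integrates to one, the resulting estimate is itself a properly normalized density for any h > 0.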
121. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Kernel density estimators
Example
Again, we can see that the smoothing parameter h controls the outcome of the estimation.
[Figure: kernel density estimates for h = 0.005, h = 0.07, and h = 0.2; horizontal axis from 0 to 1, vertical axis from 0 to 5.]
124. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Nearest-neighbour methods
As the region, use a sphere that is centred on x and contains a fixed number K of data points:
$$p(\mathbf{x}) = \frac{K}{N\,V(\mathbf{x})},$$
where V(x) denotes the volume of the sphere.
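A minimal sketch of this estimator follows, assuming Euclidean distance and the standard formula for the volume of a D-dimensional ball. The function names and sample data are illustrative.

```python
import math


def sphere_volume(radius, dims):
    """Volume of a `dims`-dimensional ball: pi^(D/2) / Gamma(D/2 + 1) * r^D."""
    return math.pi ** (dims / 2.0) / math.gamma(dims / 2.0 + 1.0) * radius ** dims


def knn_density(x, data, k):
    """p(x) = K / (N * V(x)), where V(x) is the ball reaching the K-th nearest point."""
    dists = sorted(math.dist(x, xn) for xn in data)
    radius = dists[k - 1]  # radius is 0 (division by zero) if x coincides with K data points
    return k / (len(data) * sphere_volume(radius, len(x)))


# Illustrative 1-D data (not from the slides).
data = [(0.0,), (1.0,), (2.0,), (4.0,)]
print(knn_density((0.5,), data, k=2))
```

Here K is fixed by the user and the volume adapts to the local density of points, the opposite of the kernel approach.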
126. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Nearest-neighbour methods
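Note that this density cannot be normalized.
Beyond a point $x^{*}$ that is far away from all data points, the radius of the sphere grows linearly with x ($r(x) = x - x^{\dagger}$ for a fixed data point $x^{\dagger}$), so the density decays only like $1/r(x)$ and the integral diverges. In one dimension,
$$\int_{-\infty}^{\infty} \frac{dx}{r(x)} \ge \int_{x^{*}}^{\infty} \frac{dx}{r(x)} = \int_{x^{*}}^{\infty} \frac{dx}{x - x^{\dagger}} \to \infty.$$
$$\therefore\ \int_{\mathbb{R}^{D}} \frac{K}{N\,V(\mathbf{x})}\,d\mathbf{x} \propto \int_{\mathbb{R}^{D}} \frac{d\mathbf{x}}{r(\mathbf{x})^{D}} \to \infty.$$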
127. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Nearest-neighbour estimators
Example
Here again, the smoothing parameter K controls the outcome of the estimation.
Furthermore, in the K = 1 case we can observe that p(x) → ∞ at the data points.
[Figure: K-nearest-neighbour density estimates for K = 1, K = 5, and K = 30; horizontal axis from 0 to 1, vertical axis from 0 to 5.]
129. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Nonparametric methods
Another problem of kernels and NNs
These methods need all of the observed data for estimation, so both the time and the space complexity are O(N), which is very inefficient.
In that respect, parametric methods are quite efficient (cf. sufficient statistics).
Histograms are also efficient.
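For contrast, here is a sketch of the parametric side. It assumes a 1-D Gaussian model fitted by maximum likelihood from the running sums of x and x², so the data can be discarded after each update. The class name is made up, and the sum-of-squares variance formula is used for brevity even though it can be numerically fragile.

```python
import math


class GaussianSufficientStats:
    """Keeps only N, sum(x) and sum(x^2): O(1) memory regardless of the data size."""

    def __init__(self):
        self.n = 0
        self.sum_x = 0.0
        self.sum_x2 = 0.0

    def update(self, x):
        """Absorb one observation; the observation itself is not stored."""
        self.n += 1
        self.sum_x += x
        self.sum_x2 += x * x

    def ml_estimates(self):
        """Maximum-likelihood mean and (biased) variance from the sufficient statistics."""
        mean = self.sum_x / self.n
        var = self.sum_x2 / self.n - mean * mean
        return mean, var

    def density(self, x):
        """Evaluate the fitted Gaussian at x in O(1) time."""
        mean, var = self.ml_estimates()
        return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


stats = GaussianSufficientStats()
for x in [1.0, 2.0, 3.0]:  # illustrative data
    stats.update(x)
print(stats.ml_estimates())
```

Evaluating the kernel or K-NN estimate at a single point instead requires a pass over all N stored points.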
130. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Nonparametric methods
|                | Histograms | Kernels    | NNs       |
|----------------|------------|------------|-----------|
| K              | Not fixed  | Not fixed  | Fixed     |
| V              | Not fixed  | Fixed      | Not fixed |
| Smoother       | ∆          | h          | K         |
| Continuity     | No         | It depends | Yes*      |
| Dimensionality | Suffer     | Scalable   | Scalable  |
| Normalization  | Proper     | Proper     | Improper  |
| Data set       | Discard    | Keep       | Keep      |

* If K=1, not continuous
134. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Nearest-neighbour methods
Use NNs as a classifier
To do this, use the sphere that contains K points irrespective of their class:
$$p(\mathbf{x} \mid C_k) = \frac{K_k}{N_k V}, \qquad p(\mathbf{x}) = \frac{K}{NV},$$
where $K_k$ is the number of points of class $C_k$ inside the sphere and $N_k$ is the total number of points in class $C_k$.
The class priors are $p(C_k) = N_k / N$, so
$$p(C_k \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid C_k)\,p(C_k)}{p(\mathbf{x})} = \frac{K_k}{K}.$$
137. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Nearest-neighbour methods
Use NNs as a classifier
Therefore, x is assigned to the class that holds the largest share of its K nearest neighbours.
If K = 1, this is called the “nearest-neighbour rule”.
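The rule p(C_k | x) = K_k / K translates directly into a majority vote. This is a minimal sketch assuming Euclidean distance; the function name, the sample points and the tie-breaking behaviour (ties go to the class seen first among the nearest neighbours) are assumptions, not part of the slides.

```python
import math
from collections import Counter


def knn_classify(x, points, labels, k):
    """Return the majority class among the K nearest points and the posteriors K_k / K."""
    order = sorted(range(len(points)), key=lambda n: math.dist(x, points[n]))
    votes = Counter(labels[n] for n in order[:k])      # K_k for each class in the sphere
    posteriors = {c: kk / k for c, kk in votes.items()}
    return votes.most_common(1)[0][0], posteriors


# Illustrative two-class data (not from the slides).
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_classify((0.2, 0.2), points, labels, k=5))
```

With k=1 the same function implements the nearest-neighbour rule.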
139. NONPARAMETRIC METHODS
THE EXPONENTIAL FAMILY
Nearest-neighbour methods
Use NNs as classifier – Example
As in the discussion so far, K acts here as the smoothing parameter.
[Figure: K-nearest-neighbour classification results on the x6-x7 plane (both axes from 0 to 2) for K = 1, K = 3, and K = 31.]