Universal Prediction
without assuming either Discrete or Continuous
Joe Suzuki
Osaka University
November 13, 2012
Joe Suzuki (Osaka University) Universal Prediction without assuming either Discrete or ContinuousNovember 13, 2012 1 / 16
Problem
What is the probability that the sun will rise tomorrow?
Predict $x_{n+1} \in \{0, 1\}$ given $x^n := (x_1, \cdots, x_n) \in \{0, 1\}^n$.
Construct a computable $Q(x_{n+1}|x^n) \to P(x_{n+1}|x^n)$, such as
1. $Q(x_{n+1}|x^n) = \dfrac{c}{n}$
2. For $a, b > 0$, $Q(x_{n+1}|x^n) = \dfrac{c + a}{n + a + b}$
$c$: the number of occurrences of $x_{n+1}$ in $x^n$.
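The two candidate predictors can be tried on the sunrise question. The sketch below is illustrative only: the function names `frequency_predictor` and `smoothed_predictor` are not from the slides, and it assumes that $c$ counts the ones in $x^n$, so each function returns the probability that $x_{n+1} = 1$.

```python
from fractions import Fraction


def frequency_predictor(xs):
    """Rule 1: Q(x_{n+1} = 1 | x^n) = c / n, with c the number of ones in xs."""
    n = len(xs)
    if n == 0:
        raise ValueError("rule 1 is undefined for an empty history")
    return Fraction(sum(xs), n)


def smoothed_predictor(xs, a, b):
    """Rule 2: Q(x_{n+1} = 1 | x^n) = (c + a) / (n + a + b) for a, b > 0."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    n = len(xs)
    c = sum(xs)
    return (c + Fraction(a)) / (n + Fraction(a) + Fraction(b))


# Ten sunrises in a row, encoded as ones (a made-up history).
history = [1] * 10
print(frequency_predictor(history))       # 1
print(smoothed_predictor(history, 1, 1))  # 11/12
```

Rule 1 gives a sunless day probability zero after an unbroken run of sunrises, whereas rule 2 always leaves it some positive probability.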
Problem
Open Problems raised by Tom Cover in 1975, Moscow
In the betting, obtain 2 dollars if you win, or lose 1 dollar otherwise.
 
Problem 1: Existence of a universal gambling scheme
Is there any $Q^n$ s.t.
$$\frac{1}{n} \log[2^n Q^n(x^n)] \to \frac{1}{n} \log[2^n P^n(x^n)]$$
a.s. as $n \to \infty$ for any unknown stationary ergodic $P^n$?
Betting without knowledge converges to one with knowledge
(Bayesian strategy realizes the property)
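A small simulation can make the gambling statement concrete. The sketch below rests on assumptions not spelled out on the slide: the gambler splits the current capital between the two outcomes in proportion to $Q(\cdot|\text{past})$, so that the capital after $n$ rounds is $2^n Q^n(x^n)$, and the source is i.i.d. Bernoulli rather than a general stationary ergodic process. The function names and the value $p = 0.8$ are hypothetical.

```python
import math
import random


def log2_wealth_universal(xs):
    """log2 of 2^n Q^n(x^n) when betting the fraction (c + 1/2) / (i + 1) on a one."""
    ones = 0
    log_wealth = 0.0
    for i, x in enumerate(xs):
        q_one = (ones + 0.5) / (i + 1.0)
        q = q_one if x == 1 else 1.0 - q_one
        log_wealth += 1.0 + math.log2(q)  # capital is multiplied by 2 * q
        ones += x
    return log_wealth


def log2_wealth_informed(xs, p):
    """log2 of 2^n P^n(x^n) for a bettor who knows P(x = 1) = p."""
    return sum(1.0 + math.log2(p if x == 1 else 1.0 - p) for x in xs)


rng = random.Random(0)
p, n = 0.8, 10_000
xs = [1 if rng.random() < p else 0 for _ in range(n)]
rate_universal = log2_wealth_universal(xs) / n
rate_informed = log2_wealth_informed(xs, p) / n
print(rate_universal, rate_informed)
```

The two growth rates should differ by at most about $(\frac{1}{2}\log_2 n + 1)/n$ bits per round, which vanishes as $n$ grows.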
Problem
Problem 2: Existence of a universal prediction scheme
Is there any $Q$ s.t. for $x \in \{0, 1\}$
$$Q(x|x_{-n}^{-1}) \to P(x|x_{-\infty}^{-1})$$
a.s. as $n \to \infty$ for any unknown stationary ergodic $P$?
Ornstein 1978 (discrete, Non-Bayesian)
Algoet 1992 (extended to the Polish spaces, Non-Bayesian)
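$x_{-\infty}^{-1} \in \{0, 1\}^\infty \to (\{s_k\}, \{t_k\})$, $s_0 < s_1 < \cdots$, $t_0 < t_1 < \cdots$ s.t.
$$Q(x|x_{-t_k}^{-1}) = \frac{\#I_k(x) + 1/2}{\#I_k(0) + \#I_k(1) + 1}$$
$$I_k(x) = \{1 \le \tau \le s_k \mid x = x_{-\tau},\ x_{-t_k}^{-1} = x_{-\tau-t_k}^{-\tau-1}\}$$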
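The counting rule for $I_k(x)$ can be sketched for a single pair $(s, t)$. This is only an illustration of the formula as written: how the sequences $\{s_k\}$ and $\{t_k\}$ are chosen is the substance of the Ornstein and Algoet constructions and is not given on the slide, so `s` and `t` are plain inputs here. The name `match_estimate` and the list layout (chronological order, `past[-1]` being $x_{-1}$) are assumptions.

```python
def match_estimate(past, x, s, t):
    """(#I(x) + 1/2) / (#I(0) + #I(1) + 1) for a binary past.

    I(x) collects the shifts 1 <= tau <= s at which the t symbols before
    time -tau equal the latest t symbols and the symbol at -tau equals x.
    """
    n = len(past)
    if n < s + t:
        raise ValueError("need at least s + t past symbols")
    context = past[n - t:]  # x_{-t}, ..., x_{-1}
    counts = {0: 0, 1: 0}
    for tau in range(1, s + 1):
        window = past[n - tau - t:n - tau]  # x_{-tau-t}, ..., x_{-tau-1}
        if window == context:
            counts[past[n - tau]] += 1
    return (counts[x] + 0.5) / (counts[0] + counts[1] + 1.0)


# An alternating history 0, 1, 0, 1, ..., ending in 1 (a made-up example).
past = [0, 1] * 10
print(match_estimate(past, 0, s=10, t=1))
print(match_estimate(past, 1, s=10, t=1))
```

In this example the context `[1]` matches at five shifts and is followed by a 0 each time, so the estimate for 0 is $5.5/6$ and the two estimates sum to one.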
Problem
Bayesian for binary i.i.d. sources
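$$Q^n(x^n) = \int w(\theta) P(x^n|\theta)\, d\theta, \quad P(x^n|\theta) = \theta^c (1 - \theta)^{n-c}$$
where $c$ is the number of ones in $x^n$.
For $a, b > 0$,
$$w(\theta) \propto \theta^{a-1} (1 - \theta)^{b-1} \iff Q(x_{n+1}|x^n) = \frac{Q^{n+1}(x^{n+1})}{Q^n(x^n)} = \frac{c + a}{n + a + b} \quad \text{(for } x_{n+1} = 1\text{)}$$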
For $a = b = 1/2$ (Krichevsky-Trofimov),
$$-\frac{1}{n} \log Q^n(x^n) \to H := \sum_{x \in A} -P(x) \log P(x)$$
$$-\frac{1}{n} \log P^n(x^n) = \frac{1}{n} \sum_{i=1}^n -\log P(x_i) \to E[-\log P(x_i)] = H$$
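The relation between the mixture $Q^n$ and the sequential rule can be checked numerically. The sketch below assumes a Beta$(a, b)$ prior, for which $Q^n(x^n) = B(c + a, n - c + b) / B(a, b)$ with $c$ the number of ones. The helper names are illustrative, and the long sequence is idealized as containing exactly 30% ones.

```python
import math


def log_beta(p, q):
    """Natural log of the Beta function B(p, q)."""
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)


def log_q(c, n, a=0.5, b=0.5):
    """log Q^n(x^n) = log B(c + a, n - c + b) - log B(a, b) for c ones in n symbols."""
    return log_beta(c + a, n - c + b) - log_beta(a, b)


def entropy(theta):
    """H in nats for a binary i.i.d. source with P(1) = theta."""
    return -theta * math.log(theta) - (1 - theta) * math.log(1 - theta)


# Predictive probability of a one after c = 3 ones among n = 10 symbols.
c, n, a, b = 3, 10, 0.5, 0.5
ratio = math.exp(log_q(c + 1, n + 1, a, b) - log_q(c, n, a, b))
print(ratio, (c + a) / (n + a + b))

# Per-symbol code length of a long sequence with 30% ones, against H.
n_long = 100_000
rate = -log_q(30_000, n_long) / n_long
print(rate, entropy(0.3))
```

The first pair of numbers should agree up to rounding, and the second pair should differ by roughly $(\frac{1}{2} \log n)/n$.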
Problem
Universality
There exists $Q^n$ s.t. for any $P^n$
1. $Q(x|x_{-n}^{-1}) \to P(x|x_{-\infty}^{-1})$  (1)
2. $\dfrac{1}{n} \log \dfrac{P^n(x^n)}{Q^n(x^n)} \to 0$  (2)
$m$-ary ($m \ge 2$) rather than binary
stationary ergodic rather than i.i.d.
Ornstein 1978: (1)
Bayesian: (2) as well as (1)
Problem
Construct $Q^n$ satisfying (2) for the general case
$X^n$ should be stationary ergodic but can be either
discrete,
continuous, or
neither of them
Counting how many times $(X_{i+1} = x_{i+1}, X_i = x_i)$ occurs does not help.
Algoet 1992 does not imply (2) for the general case.
Density Functions
Suppose a density function $f$ exists for $X$
$A$: the range of $X$
$A_0 := \{A\}$
$A_{j+1}$ is a refinement of $A_j$
Example 1: Quantize $f$ over $A = [0, 1)$ to obtain histogram approximations
$f_1$ over $A_1 = \{[0, 1/2), [1/2, 1)\}$
$f_2$ over $A_2 = \{[0, 1/4), [1/4, 1/2), [1/2, 3/4), [3/4, 1)\}$
...
$f_j$ over $A_j = \{[0, 2^{-j}), [2^{-j}, 2 \cdot 2^{-j}), \cdots, [(2^j - 1) 2^{-j}, 1)\}$
...
$P_j^n(a^n) = \prod_{i=1}^n P_j(a_i)$, the probability of $a^n = (a_1, \cdots, a_n) \in A_j^n$
$Q_j^n$: a Bayesian measure
$$\frac{1}{n} \log \frac{P_j^n(a^n)}{Q_j^n(a^n)} \to 0 \text{ as } n \to \infty$$
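The quantization of Example 1 and the per-level convergence can be sketched as follows. The slide does not say which Bayesian measure $Q_j^n$ is meant, so a Krichevsky-Trofimov style mixture (Dirichlet prior with parameter 1/2) over the $2^j$ cells is assumed here. The data are drawn from the uniform density, for which $P_j(a) = 2^{-j}$ for every cell, and all names are illustrative.

```python
import math
import random


def cell_index(x, j):
    """Index of the cell of A_j (2**j equal-width cells of [0, 1)) that contains x."""
    if not 0.0 <= x < 1.0:
        raise ValueError("x must lie in [0, 1)")
    return int(x * 2 ** j)


def log_bayes_measure(cells, m):
    """log Q_j^n(a^n) under a KT-style (Dirichlet(1/2)) mixture over m cells."""
    counts = [0] * m
    total = 0.0
    for i, a in enumerate(cells):
        total += math.log((counts[a] + 0.5) / (i + m / 2))
        counts[a] += 1
    return total


rng = random.Random(1)
j, n = 2, 5_000
cells = [cell_index(rng.random(), j) for _ in range(n)]
log_p = n * math.log(1 / 2 ** j)  # uniform f gives P_j(a) = 2**-j for every cell
gap = (log_p - log_bayes_measure(cells, 2 ** j)) / n
print(gap)
```

The printed gap is $\frac{1}{n} \log (P_j^n / Q_j^n)$ for this sample and should already be close to zero at $n = 5000$.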
Density Functions
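$\lambda : \mathcal{B} \to \mathbb{R}$ (Lebesgue measure, $a = [b, c) \implies \lambda(a) = c - b$)
$(x_1, \cdots, x_n) \in (a_1, \cdots, a_n) \in A_j^n \implies$
$$f_j^n(x^n) := f_j(x_1) \cdots f_j(x_n) = \frac{P_j(a_1) \cdots P_j(a_n)}{\lambda(a_1) \cdots \lambda(a_n)}$$
$$g_j^n(x^n) := \frac{Q_j^n(a_1, \cdots, a_n)}{\lambda(a_1) \cdots \lambda(a_n)}$$
For $\{\omega_j\}_{j=1}^\infty$ with $\sum_j \omega_j = 1$, $\omega_j > 0$,
$$g^n(x^n) := \sum_{j=1}^\infty \omega_j g_j^n(x^n)$$
If we choose $\{A_j\}$ such that $f_j \to f$ as $j \to \infty$, for any $f$, almost surely
$$\frac{1}{n} \log \frac{f^n(x^n)}{g^n(x^n)} \to 0 \quad (3)$$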
B. Ryabko. IEEE Trans. on Inform. Theory, 55, 9, 2009.
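The mixture $g^n$ can be sketched in a few lines. Several choices below are assumptions rather than part of the slides: the dyadic partitions of Example 1, a KT-style $Q_j^n$ at each level, weights $\omega_j = 1/(j(j+1))$ (which sum to one), and truncation of the infinite sum at `levels` terms, which gives a lower bound on $g^n$. The test data are uniform on $[0, 1)$, so $\log f^n(x^n) = 0$.

```python
import math
import random


def log_g_level(xs, j):
    """log g_j^n(x^n): KT-style Q_j^n over 2**j cells, divided by the cell widths."""
    m = 2 ** j
    counts = {}
    total = 0.0
    for i, x in enumerate(xs):
        a = int(x * m)  # cell of A_j containing x, for x in [0, 1)
        c = counts.get(a, 0)
        # predictive probability of the cell, divided by its width 2**-j
        total += math.log((c + 0.5) / (i + m / 2)) + j * math.log(2.0)
        counts[a] = c + 1
    return total


def log_g(xs, levels=12):
    """log of sum_{j=1}^{levels} w_j g_j^n(x^n) with w_j = 1 / (j (j + 1))."""
    terms = [log_g_level(xs, j) - math.log(j * (j + 1)) for j in range(1, levels + 1)]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


rng = random.Random(2)
n = 2_000
xs = [rng.random() for _ in range(n)]  # uniform on [0, 1), so log f^n(x^n) = 0
gap = (0.0 - log_g(xs)) / n
print(gap)
```

The printed value is $\frac{1}{n} \log (f^n / g^n)$ for this sample. Working in the log domain avoids underflow, since $g_j^n(x^n)$ itself becomes extremely small or large as $n$ grows.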
Generalized Density Functions
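Exactly when does a density function exist?
$\mathcal{B}$: the Borel sets of $\mathbb{R}$
$\mu(D)$: the probability of $D \in \mathcal{B}$
When a density function exists
The following are equivalent ($\mu \ll \lambda$):
for each $D \in \mathcal{B}$, $\lambda(D) = 0 \implies \mu(D) = 0$
$\exists$ $\mathcal{B}$-measurable $\dfrac{d\mu}{d\lambda} := f$ s.t. $\mu(D) = \int_D f(t)\, d\lambda(t)$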
Generalized Density Functions
Estimating generalized density functions
Radon-Nikodym’s Theorem
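The following are equivalent ($\mu \ll \eta$):
for each $D \in \mathcal{B}$, $\eta(D) = 0 \implies \mu(D) = 0$
$\exists$ $\mathcal{B}$-measurable $\dfrac{d\mu}{d\eta} := f$ s.t. $\mu(D) = \int_D f(t)\, d\eta(t)$
Example 2: $\mu(\{k\}) > 0$, $\eta(\{k\}) := \dfrac{1}{k(k + 1)}$, $k \in B := \{1, 2, \cdots\}$
$$\mu(D) = \sum_{k \in D} f(k) \eta(\{k\}), \quad D \subseteq B$$
$$\mu \ll \eta \implies \frac{d\mu}{d\eta}(k) = f(k) = \frac{\mu(\{k\})}{\eta(\{k\})} = k(k + 1) \mu(\{k\})$$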
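Example 2 can be verified with exact arithmetic. The measure $\mu(\{k\}) = 2^{-k}$ used below is a hypothetical choice made only for the check. The slide requires just $\mu(\{k\}) > 0$.

```python
from fractions import Fraction


def eta(k):
    """Reference measure of the singleton {k}: 1 / (k (k + 1))."""
    return Fraction(1, k * (k + 1))


def mu(k):
    """An assumed probability on {1, 2, ...}: mu({k}) = 2**-k."""
    return Fraction(1, 2 ** k)


def density(k):
    """Radon-Nikodym derivative (d mu / d eta)(k) = k (k + 1) mu({k})."""
    return mu(k) / eta(k)


D = {1, 2, 3}
print(density(3))                             # 3/2
print(sum(density(k) * eta(k) for k in D))    # 7/8, which equals mu(D)
print(sum(eta(k) for k in range(1, 10)))      # 9/10, the sum telescopes to 1 - 1/10
```

The telescoping sum in the last line shows why $\eta$ is itself a probability measure on $\{1, 2, \cdots\}$: its partial sums are $1 - 1/(N + 1)$.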
Generalized Density Functions
$f_1$ over $B_1 := \{\{1\}, \{2, 3, \cdots\}\}$
$f_2$ over $B_2 := \{\{1\}, \{2\}, \{3, 4, \cdots\}\}$
...
$f_k$ over $B_k := \{\{1\}, \{2\}, \cdots, \{k\}, \{k + 1, k + 2, \cdots\}\}$
...
$$(y_1, \cdots, y_n) \in (b_1, \cdots, b_n) \in B_k^n \implies g_k^n(y^n) := \frac{Q_k^n(b_1, \cdots, b_n)}{\eta(b_1) \cdots \eta(b_n)}$$
$$g^n(y^n) := \sum_{k=1}^\infty \omega_k g_k^n(y^n)$$
If we choose $\{B_k\}$ s.t. $f_k \to f$, for any $f$, almost surely
$$\frac{1}{n} \log \frac{f^n(y^n)}{g^n(y^n)} \to 0 \quad (4)$$
$g^n(y^n) \prod_{i=1}^n \eta(\{y_i\})$ estimates $P(y^n) = f^n(y^n) \prod_{i=1}^n \eta(\{y_i\})$
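The estimate $g^n(y^n) \prod_i \eta(\{y_i\})$ can be sketched for integer-valued data. The following are assumptions made for illustration: $\eta(\{k\}) = 1/(k(k+1))$ as in Example 2, a KT-style $Q_k^n$ over the $k + 1$ cells of $B_k$, weights $\omega_k = 1/(k(k+1))$, truncation of the mixture at `levels` terms, and a geometric test source $P(y) = 2^{-y}$.

```python
import math
import random


def log_eta_cell(y, k):
    """log eta(b) for the cell b of B_k containing y, with eta({i}) = 1 / (i (i + 1))."""
    if y <= k:
        return -math.log(y * (y + 1))
    return -math.log(k + 1)  # eta({k+1, k+2, ...}) telescopes to 1 / (k + 1)


def log_g_level(ys, k):
    """log g_k^n(y^n): KT-style Q_k^n over the k + 1 cells of B_k, divided by eta."""
    m = k + 1
    counts = [0] * m
    total = 0.0
    for i, y in enumerate(ys):
        b = min(y, m) - 1  # cells 0..k-1 are singletons, cell k is the tail
        total += math.log((counts[b] + 0.5) / (i + m / 2)) - log_eta_cell(y, k)
        counts[b] += 1
    return total


def log_p_estimate(ys, levels=30):
    """log of g^n(y^n) * prod_i eta({y_i}), truncating the mixture at `levels`."""
    terms = [log_g_level(ys, k) - math.log(k * (k + 1)) for k in range(1, levels + 1)]
    top = max(terms)
    log_g = top + math.log(sum(math.exp(t - top) for t in terms))
    return log_g - sum(math.log(y * (y + 1)) for y in ys)


def sample_geometric(rng):
    """Draw y with P(y) = 2**-y on {1, 2, ...} (an assumed test source)."""
    y = 1
    while rng.random() < 0.5:
        y += 1
    return y


rng = random.Random(3)
n = 2_000
ys = [sample_geometric(rng) for _ in range(n)]
log_p_true = -sum(ys) * math.log(2.0)
gap = (log_p_true - log_p_estimate(ys)) / n
print(gap)
```

The printed gap is the per-symbol log ratio in (4). It shrinks slowly with $n$, because each finer partition $B_k$ adds one more cell whose probability has to be learned.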
Generalized Density Functions
The original case is contained as a special case
For $C = \{0, 1, \cdots, m - 1\}$, if we quantize
$C_1 = C_2 = \cdots = \{\{0\}, \{1\}, \cdots, \{m - 1\}\}$
$\eta(\{0\}) = \cdots = \eta(\{m - 1\}) = 1/m$
then $\mu \ll \eta$ and
$$z^n \in C^n \iff c^n \in C_1^n = C_2^n = \cdots$$
$$\implies f^n(z^n) = \frac{P^n(c^n)}{(1/m)^n}, \quad g_1^n(z^n) = g_2^n(z^n) = \cdots = g^n(z^n) = \sum_{l=1}^\infty \omega_l g_l^n(z^n) = \frac{Q^n(c^n)}{(1/m)^n}$$
$$\implies \frac{1}{n} \log \frac{f^n(z^n)}{g^n(z^n)} = \frac{1}{n} \log \frac{P^n(c^n)}{Q^n(c^n)} \to 0$$
The Solution
Universality in the generalized sense
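If $\mu^n \ll \eta^n$, there exists $g^n$, not depending on $f^n$, s.t.
$$\frac{1}{n} \log \frac{f^n(z^n)}{g^n(z^n)} \to 0$$
$$\mu^n(D^n) := \int_{D^n} f^n(z^n)\, d\eta^n(z^n), \quad \nu^n(D^n) := \int_{D^n} g^n(z^n)\, d\eta^n(z^n)$$
$$\frac{f^n(z^n)}{g^n(z^n)} = \frac{d\mu^n}{d\eta^n}(z^n) \Big/ \frac{d\nu^n}{d\eta^n}(z^n) = \frac{d\mu^n}{d\nu^n}(z^n)$$
Theorem (Suzuki, 2011)
$$\frac{1}{n} \log \frac{d\mu^n}{d\nu^n}(z^n) \to 0$$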
The Solution
Universal Prediction in the generalized sense
The generalized universal density function tells everything:
$$g(x_{n+1}|x^n) = \frac{g^{n+1}(x^{n+1})}{g^n(x^n)} \to f(x_{n+1}|x^n) = \frac{f^{n+1}(x^{n+1})}{f^n(x^n)}$$
For any $D \in \mathcal{B}$,
$$\nu(D|x^n) = \int_D g(x|x^n)\, d\eta(x)$$
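The predictive density $g(x_{n+1}|x^n) = g^{n+1}(x^{n+1}) / g^n(x^n)$ and the predictive probability $\nu(D|x^n)$ can be sketched for data on $[0, 1)$ with $\eta$ the Lebesgue measure. As assumptions for this illustration, $g^n$ is a truncated mixture over dyadic histograms with KT-style cell probabilities and weights $1/(j(j+1))$, and $D$ is an interval whose endpoints lie on the finest grid, where the predictive density is piecewise constant. The sample values are made up.

```python
import math


def log_g(xs, levels=6):
    """log of sum_j w_j g_j^n(x^n) over dyadic levels j = 1..levels."""
    terms = []
    for j in range(1, levels + 1):
        m = 2 ** j
        counts = [0] * m
        log_gj = 0.0
        for i, x in enumerate(xs):
            a = int(x * m)
            log_gj += math.log((counts[a] + 0.5) / (i + m / 2)) + j * math.log(2.0)
            counts[a] += 1
        terms.append(log_gj - math.log(j * (j + 1)))
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def predictive_density(x_next, xs, levels=6):
    """g(x_{n+1} | x^n) = g^{n+1}(x^{n+1}) / g^n(x^n)."""
    return math.exp(log_g(list(xs) + [x_next], levels) - log_g(xs, levels))


def predictive_mass(lo, hi, xs, levels=6):
    """nu([lo, hi) | x^n), for lo and hi that are multiples of 2**-levels."""
    width = 2.0 ** -levels
    first, last = round(lo / width), round(hi / width)
    # the density is constant on each finest cell, so midpoints integrate it exactly
    return sum(predictive_density((c + 0.5) * width, xs, levels) * width
               for c in range(first, last))


xs = [0.1, 0.2, 0.3, 0.15, 0.4, 0.05, 0.25, 0.35]  # all observations below 1/2
print(predictive_mass(0.0, 0.5, xs))
print(predictive_mass(0.0, 1.0, xs))
```

The second printed value should equal one up to rounding, because each level's predictive cell probabilities sum to one, so the ratio of truncated mixtures is still a proper conditional density.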
Summary
Summary and Discussion
Universal Prediction
Connection to Universal Bayesian Measures
Generalization without assuming Discrete or Continuous
Stronger universality in the sense of Bayes.
Many applications other than prediction
Bayesian network structure estimation (DCC 2012)
The Bayesian Chow-Liu Algorithm (PGM 2012)
Markov order estimation even when {Xi } is continuous

GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
Velocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.pptVelocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.ppt
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
 
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICEPATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
 
Exploring Criminology and Criminal Behaviour.pdf
Exploring Criminology and Criminal Behaviour.pdfExploring Criminology and Criminal Behaviour.pdf
Exploring Criminology and Criminal Behaviour.pdf
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
 
Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 

Universal Prediction without assuming either Discrete or Continuous

  • 1. Universal Prediction without assuming either Discrete or Continuous
    Joe Suzuki (Osaka University), November 13, 2012
  • 2. Problem
    What is the probability that the sun will rise tomorrow?
    Predict $x_{n+1} \in \{0,1\}$ given $x^n := (x_1, \cdots, x_n) \in \{0,1\}^n$.
    Construct a computable $Q(x_{n+1}|x^n) \to P(x_{n+1}|x^n)$, such as
    1. $Q(x_{n+1}|x^n) = \dfrac{c}{n}$
    2. For $a, b > 0$, $Q(x_{n+1}|x^n) = \dfrac{c + a}{n + a + b}$
    where $c$ is the number of occurrences of the value $x_{n+1}$ in $x^n$.
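The two candidate predictors on this slide can be compared with a minimal sketch. The function name `add_constant_predictor` and the nine-day history are illustrative assumptions, and the symmetric choice a = b = 1/2 is used so that the two predicted probabilities sum to one.

```python
from fractions import Fraction

def add_constant_predictor(history, symbol, a=Fraction(1, 2), b=Fraction(1, 2)):
    """Return (c + a) / (n + a + b), where c counts `symbol` in `history`."""
    n = len(history)
    c = sum(1 for x in history if x == symbol)
    return (c + a) / (n + a + b)

# Hypothetical data: the sun rose (1) on each of the last 9 days.
history = [1] * 9
p_rise = add_constant_predictor(history, 1)
p_not = add_constant_predictor(history, 0)
print(p_rise, p_not)
```

The plain frequency c/n would assign probability 0 to "no sunrise" after these nine days, whereas the add-constant rule keeps a small positive probability for the unseen outcome.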
  • 3. Problem
    Open problems raised by Tom Cover in 1975, Moscow.
    In the betting, obtain 2 dollars if you win, or lose 1 dollar otherwise.
    Problem 1: Existence of a universal gambling scheme.
    Is there any $Q^n$ s.t.
    $\dfrac{1}{n}\log[2^n Q^n(x^n)] \to \dfrac{1}{n}\log[2^n P^n(x^n)]$
    a.s. as $n \to \infty$ for any unknown stationary ergodic $P^n$?
    Betting without knowledge of $P^n$ converges to betting with that knowledge (a Bayesian strategy realizes this property).
  • 4. Problem
    Problem 2: Existence of a universal prediction scheme.
    Is there any $Q$ s.t. for $x \in \{0,1\}$
    $Q(x|x_{-n}^{-1}) \to P(x|x_{-\infty}^{-1})$
    a.s. as $n \to \infty$ for any unknown stationary ergodic $P$?
    Ornstein 1978 (discrete, non-Bayesian)
    Algoet 1992 (extended to Polish spaces, non-Bayesian)
    $x_{-\infty}^{-1} \in \{0,1\}^{\infty} \to (\{s_k\}, \{t_k\})$, $s_0 < s_1 < \cdots$, $t_0 < t_1 < \cdots$ s.t.
    $Q(x|x_{-t_k}^{-1}) = \dfrac{\#I_k(x) + 1/2}{\#I_k(0) + \#I_k(1) + 1}$
    $I_k(x) = \{1 \le \tau \le s_k \mid x = x_{-\tau},\ x_{-t_k}^{-1} = x_{-\tau-t_k}^{-\tau-1}\}$
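The counting rule on this slide can be sketched as follows. The slide does not say how the sequences {s_k} and {t_k} are chosen, so this sketch simply takes a window length `s` and a context length `t` as inputs; the function name and the alternating test sequence are illustrative assumptions.

```python
def context_count_predictor(past, x, s, t):
    """Estimate Q(x | x_{-t}^{-1}) by counting context matches; past[-1] is x_{-1}."""
    context = past[len(past) - t:]      # x_{-t}, ..., x_{-1}
    counts = {0: 0, 1: 0}
    for tau in range(1, s + 1):
        end = len(past) - tau           # index of x_{-tau}
        start = end - t                 # index of x_{-tau-t}
        if start < 0:
            break                       # not enough past to compare
        if past[start:end] == context:  # x_{-tau-t}^{-tau-1} equals x_{-t}^{-1}
            counts[past[end]] += 1
    return (counts[x] + 0.5) / (counts[0] + counts[1] + 1)

# Hypothetical past: a strictly alternating sequence ending in 1.
past = [0, 1, 0, 1, 0, 1, 0, 1]
q0 = context_count_predictor(past, 0, s=6, t=1)
q1 = context_count_predictor(past, 1, s=6, t=1)
print(q0, q1)
```

In this example the context "1" was followed by 0 in all three matches inside the window, so the estimate for 0 is (3 + 1/2) / (3 + 1).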
  • 5. Problem
    Bayesian for binary i.i.d. sources
    $Q^n(x^n) = \int w(\theta) P(x^n|\theta)\,d\theta$, $P(x^n|\theta) = \theta^c (1-\theta)^{n-c}$
    For $a, b > 0$,
    $w(\theta) \propto \theta^{a-1}(1-\theta)^{b-1} \iff Q(x_{n+1}|x^n) = \dfrac{Q^{n+1}(x^{n+1})}{Q^n(x^n)} = \dfrac{c + a}{n + a + b}$
    For $a = b = 1/2$ (Krichevsky-Trofimov),
    $-\dfrac{1}{n}\log Q^n(x^n) \to H := \sum_{x \in A} -P(x)\log P(x)$
    $-\dfrac{1}{n}\log P^n(x^n) = \dfrac{1}{n}\sum_{i=1}^{n} -\log P(x_i) \to E[-\log P(x_i)] = H$
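The convergence of the Krichevsky-Trofimov code length to the entropy can be checked numerically with a short sketch. The source parameter p = 0.3, the sample size, and the random seed are assumptions made for illustration only.

```python
import math
import random

def kt_log2_prob(xs):
    """log2 of the Krichevsky-Trofimov probability Q^n(x^n), computed sequentially."""
    counts = [0, 0]
    total = 0.0
    for n, x in enumerate(xs):
        # Predictive probability (c_x + 1/2) / (n + 1) for the observed symbol x.
        total += math.log2((counts[x] + 0.5) / (n + 1.0))
        counts[x] += 1
    return total

def binary_entropy(p):
    """Entropy H in bits of an i.i.d. binary source with P(1) = p."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

rng = random.Random(0)   # fixed seed, illustrative
p, n = 0.3, 20000        # assumed source parameter and length
xs = [1 if rng.random() < p else 0 for _ in range(n)]
rate = -kt_log2_prob(xs) / n
print(rate, binary_entropy(p))
```

The two printed numbers should be close for large n; the gap consists of sampling fluctuation plus a redundancy term of order (log n)/n.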
  • 6. Problem
    Universality
    There exists $Q^n$ s.t. for any $P^n$
    1. $Q(x|x_{-n}^{-1}) \to P(x|x_{-\infty}^{-1})$ (1)
    2. $\dfrac{1}{n}\log\dfrac{P^n(x^n)}{Q^n(x^n)} \to 0$ (2)
    m-ary ($m \ge 2$) rather than binary
    stationary ergodic rather than i.i.d.
    Ornstein 1978: (1)
    Bayesian: (2) as well as (1)
  • 7. Problem
    Problem: Construct $Q^n$ satisfying (2) for the general case.
    $X^n$ should be stationary ergodic but can be discrete, continuous, or neither.
    Counting how many times $(X_{i+1} = x_{i+1}, X_i = x_i)$ occurs does not help.
    Algoet 1992 does not imply (2) for the general case.
  • 8. Density Functions
    Suppose a density function $f$ exists for $X$.
    $A$: the range of $X$
    $A_0 := \{A\}$
    $A_{j+1}$ is a refinement of $A_j$
    Example 1: Quantize $f$ over $A = [0,1)$ to obtain histogram approximations
    $f_1$ over $A_1 = \{[0,1/2), [1/2,1)\}$
    $f_2$ over $A_2 = \{[0,1/4), [1/4,1/2), [1/2,3/4), [3/4,1)\}$
    ...
    $f_j$ over $A_j = \{[0,2^{-j}), [2^{-j}, 2\cdot 2^{-j}), \cdots, [(2^{j}-1)2^{-j}, 1)\}$
    ...
    $P_j^n(a^n) = \prod_{i=1}^{n} P_j(a_i)$, the probability of $a^n = (a_1, \cdots, a_n) \in A_j^n$
    $Q_j^n$: a Bayesian measure
    $\dfrac{1}{n}\log\dfrac{P_j^n(a^n)}{Q_j^n(a^n)} \to 0$ as $n \to \infty$
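The nested quantization of Example 1 can be sketched in a few lines, assuming level j splits [0, 1) into 2^j equal cells as in the listed A_1 and A_2. The function name `dyadic_cell` and the sample point 0.3 are illustrative assumptions.

```python
def dyadic_cell(x, j):
    """Return the cell [lo, hi) of the level-j partition of [0, 1) that contains x."""
    cells = 2 ** j
    index = min(int(x * cells), cells - 1)  # guard against rounding at the right edge
    return (index / cells, (index + 1) / cells)

x = 0.3
levels = [dyadic_cell(x, j) for j in range(1, 4)]
print(levels)

# Each level refines the previous one: the finer cell lies inside the coarser cell.
nested = all(
    coarse[0] <= fine[0] and fine[1] <= coarse[1]
    for coarse, fine in zip(levels, levels[1:])
)
print(nested)
```

The cell width 2^{-j} is the Lebesgue measure of the cell, which is the quantity a cell probability is divided by to obtain a histogram density.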
  • 9. Density Functions
    $\lambda: \mathcal{B} \to \mathbb{R}$ (Lebesgue measure, $a = [b, c) \implies \lambda(a) = c - b$)
    $(x_1, \cdots, x_n) \in (a_1, \cdots, a_n) \in A_j^n \implies$
    $f_j^n(x^n) := f_j(x_1)\cdots f_j(x_n) = \dfrac{P_j(a_1)\cdots P_j(a_n)}{\lambda(a_1)\cdots\lambda(a_n)}$
    $g_j^n(x^n) := \dfrac{Q_j^n(a_1, \cdots, a_n)}{\lambda(a_1)\cdots\lambda(a_n)}$
    For $\{\omega_j\}_{j=1}^{\infty}$ with $\sum_j \omega_j = 1$, $\omega_j > 0$:
    $g^n(x^n) := \sum_{j=1}^{\infty} \omega_j g_j^n(x^n)$
    If we choose $\{A_j\}$ such that $f_j \to f$ as $j \to \infty$, then for any $f$, almost surely
    $\dfrac{1}{n}\log\dfrac{f^n(x^n)}{g^n(x^n)} \to 0$ (3)
    B. Ryabko. IEEE Trans. on Inform. Theory, 55, 9, 2009.
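A minimal numerical sketch of the mixture density g^n follows, under several assumptions that are not in the slides: the partitions are the dyadic ones on [0, 1), each Q_j^n is a Krichevsky-Trofimov-type (add-1/2) estimator over the 2^j cells, the infinite sum is truncated at `max_level`, and the weights are ω_j proportional to 1/(j(j+1)), renormalized after truncation. The test density f(x) = 0.5 + x is also just an example.

```python
import math
import random

def log_mixture_density(xs, max_level=8):
    """log g^n(x^n) for a truncated mixture of dyadic histogram densities on [0, 1)."""
    n = len(xs)
    norm = sum(1.0 / (j * (j + 1)) for j in range(1, max_level + 1))
    log_terms = []
    for j in range(1, max_level + 1):
        cells = 2 ** j
        counts = [0] * cells
        log_q = 0.0
        for i, x in enumerate(xs):
            a = min(int(x * cells), cells - 1)
            # Assumed Bayesian measure Q_j: add-1/2 predictor over `cells` symbols.
            log_q += math.log((counts[a] + 0.5) / (i + 0.5 * cells))
            counts[a] += 1
        # g_j^n = Q_j^n / prod lambda(a_i), and every cell has lambda = 2^{-j}.
        log_g_j = log_q + n * j * math.log(2.0)
        log_w = math.log(1.0 / (j * (j + 1)) / norm)
        log_terms.append(log_w + log_g_j)
    top = max(log_terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in log_terms))

rng = random.Random(1)  # fixed seed, illustrative
n = 2000
# Inverse-CDF sampling from the assumed density f(x) = 0.5 + x on [0, 1).
xs = [(-1.0 + math.sqrt(1.0 + 8.0 * rng.random())) / 2.0 for _ in range(n)]
log_f = sum(math.log(0.5 + x) for x in xs)
rate = (log_f - log_mixture_density(xs)) / n
print(rate)
```

The printed value is the left-hand side of (3) at a finite n, so it should be small rather than exactly zero.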
  • 10. Generalized Density Functions
    Exactly when does a density function exist?
    $\mathcal{B}$: the Borel sets of $\mathbb{R}$
    $\mu(D)$: the probability of $D \in \mathcal{B}$
    When a density function exists: the following are equivalent ($\mu \ll \lambda$):
    for each $D \in \mathcal{B}$, $\lambda(D) = 0 \implies \mu(D) = 0$
    $\exists$ $\mathcal{B}$-measurable $\dfrac{d\mu}{d\lambda} := f$ s.t. $\mu(D) = \int_D f(t)\,d\lambda(t)$
  • 11. Generalized Density Functions
    Estimating generalized density functions
    Radon-Nikodym's Theorem: the following are equivalent ($\mu \ll \eta$):
    for each $D \in \mathcal{B}$, $\eta(D) = 0 \implies \mu(D) = 0$
    $\exists$ $\mathcal{B}$-measurable $\dfrac{d\mu}{d\eta} := f$ s.t. $\mu(D) = \int_D f(t)\,d\eta(t)$
    Example 2: $\mu(\{k\}) > 0$, $\eta(\{k\}) := \dfrac{1}{k(k+1)}$, $k \in B := \{1, 2, \cdots\}$
    $\mu(D) = \sum_{k \in D} f(k)\,\eta(\{k\})$, $D \subseteq B$
    $\mu \ll \eta \implies \dfrac{d\mu}{d\eta}(k) = f(k) = \dfrac{\mu(\{k\})}{\eta(\{k\})} = k(k+1)\,\mu(\{k\})$
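Example 2 can be verified with exact arithmetic. The choice μ({k}) = 2^{-k} below is an assumption made only to have a concrete μ; the reference measure η({k}) = 1/(k(k+1)) is the one from the slide.

```python
from fractions import Fraction

def eta(k):
    """Reference measure from Example 2: eta({k}) = 1 / (k (k + 1))."""
    return Fraction(1, k * (k + 1))

def mu(k):
    """Assumed example distribution: mu({k}) = 2^{-k}."""
    return Fraction(1, 2 ** k)

def density(k):
    """Generalized density f = d mu / d eta at k, equal to k (k + 1) mu({k})."""
    return mu(k) / eta(k)

D = [1, 2, 3]
mu_D = sum(mu(k) for k in D)
integral_D = sum(density(k) * eta(k) for k in D)   # sum over D of f(k) eta({k})
eta_mass = sum(eta(k) for k in range(1, 100))      # telescopes to 1 - 1/100
print(density(3), mu_D, integral_D, eta_mass)
```

The partial sums of η telescope to 1 - 1/(K+1), which confirms that η is itself a probability measure on {1, 2, ...}.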
  • 12. Generalized Density Functions
    $f_1$ over $B_1 := \{\{1\}, \{2, 3, \cdots\}\}$
    $f_2$ over $B_2 := \{\{1\}, \{2\}, \{3, 4, \cdots\}\}$
    ...
    $f_k$ over $B_k := \{\{1\}, \{2\}, \cdots, \{k\}, \{k+1, k+2, \cdots\}\}$
    ...
    $(y_1, \cdots, y_n) \in (b_1, \cdots, b_n) \in B_k^n \implies g_k^n(y^n) := \dfrac{Q_k^n(b_1, \cdots, b_n)}{\eta(b_1)\cdots\eta(b_n)}$
    $g^n(y^n) := \sum_{k=1}^{\infty} \omega_k g_k^n(y^n)$
    If we choose $\{B_k\}$ s.t. $f_k \to f$, then for any $f$, almost surely
    $\dfrac{1}{n}\log\dfrac{f^n(y^n)}{g^n(y^n)} \to 0$ (4)
    $g^n(y^n)\prod_{i=1}^{n}\eta(\{y_i\})$ estimates $P(y^n) = f^n(y^n)\prod_{i=1}^{n}\eta(\{y_i\})$
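The partitions B_k can be made concrete with a small sketch that maps a point y to its cell and returns the η-measure of that cell, using η({m}) = 1/(m(m+1)) from Example 2. The function name and the `"tail"` label are illustrative assumptions; the tail mass 1/(k+1) follows from the telescoping sum over m > k.

```python
from fractions import Fraction

def cell_and_eta(y, k):
    """Map y in {1, 2, ...} to its cell in B_k and return (cell label, eta(cell))."""
    if y <= k:
        return y, Fraction(1, y * (y + 1))   # singleton cell {y}
    # Tail cell {k+1, k+2, ...}: sum of 1/(m(m+1)) over m > k equals 1/(k+1).
    return "tail", Fraction(1, k + 1)

k = 3
print(cell_and_eta(2, k), cell_and_eta(7, k))

# The cells of B_k carry total eta-mass 1.
total = sum(cell_and_eta(m, k)[1] for m in range(1, k + 1)) + cell_and_eta(k + 1, k)[1]
print(total)
```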
  • 13. Generalized Density Functions
    The original case is contained as a special case.
    For $C = \{0, 1, \cdots, m-1\}$, if we quantize
    $C_1 = C_2 = \cdots = \{\{0\}, \{1\}, \cdots, \{m-1\}\}$
    $\eta(\{0\}) = \cdots = \eta(\{m-1\}) = 1/m$
    then $\mu \ll \eta$ and
    $z^n \in C^n \iff c^n \in C_1^n = C_2^n = \cdots \implies$
    $f^n(z^n) = \dfrac{P^n(c^n)}{(1/m)^n}$,
    $g_1^n(z^n) = g_2^n(z^n) = \cdots = g^n(z^n) = \sum_{l=1}^{\infty} \omega_l g_l^n(z^n) = \dfrac{Q^n(c^n)}{(1/m)^n}$
    $\implies \dfrac{1}{n}\log\dfrac{f^n(z^n)}{g^n(z^n)} = \dfrac{1}{n}\log\dfrac{P^n(c^n)}{Q^n(c^n)} \to 0$
  • 14. The Solution
    Universality in the generalized sense
    If $\mu^n \ll \eta^n$, there exists $g^n$ that does not depend on $f^n$ s.t.
    $\dfrac{1}{n}\log\dfrac{f^n(z^n)}{g^n(z^n)} \to 0$
    $\mu^n(D^n) := \int_{D^n} f^n(z^n)\,d\eta^n(z^n)$, $\nu^n(D^n) := \int_{D^n} g^n(z^n)\,d\eta^n(z^n)$
    $\dfrac{f^n(z^n)}{g^n(z^n)} = \dfrac{d\mu^n}{d\eta^n}(z^n) \Big/ \dfrac{d\nu^n}{d\eta^n}(z^n) = \dfrac{d\mu^n}{d\nu^n}(z^n)$
    Theorem (Suzuki, 2011): $\dfrac{1}{n}\log\dfrac{d\mu^n}{d\nu^n}(z^n) \to 0$
  • 15. The Solution
    Universal Prediction in the generalized sense
    The generalized universal density function tells everything:
    $g(x_{n+1}|x^n) = \dfrac{g^{n+1}(x^{n+1})}{g^n(x^n)} \to f(x_{n+1}|x^n) = \dfrac{f^{n+1}(x^{n+1})}{f^n(x^n)}$
    For any $D \in \mathcal{B}$, $\nu(D|x^n) = \int_D g(x|x^n)\,d\eta(x)$
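The predictive rule g(x_{n+1}|x^n) = g^{n+1}(x^{n+1}) / g^n(x^n) and the set probability ν(D|x^n) can be sketched for the continuous case. Everything specific here is an assumption for illustration: η is Lebesgue measure on [0, 1), the partitions are dyadic, each level uses an add-1/2 Bayesian estimator, the mixture is truncated at `max_level` with renormalized weights proportional to 1/(j(j+1)), and the eight data points are made up.

```python
import math

def log_g(xs, max_level=5):
    """log g^n(x^n) for a truncated dyadic-histogram mixture on [0, 1)."""
    n = len(xs)
    norm = sum(1.0 / (j * (j + 1)) for j in range(1, max_level + 1))
    terms = []
    for j in range(1, max_level + 1):
        cells = 2 ** j
        counts = [0] * cells
        log_q = 0.0
        for i, x in enumerate(xs):
            a = min(int(x * cells), cells - 1)
            log_q += math.log((counts[a] + 0.5) / (i + 0.5 * cells))
            counts[a] += 1
        log_w = math.log(1.0 / (j * (j + 1)) / norm)
        terms.append(log_w + log_q + n * j * math.log(2.0))
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def predictive_density(xs, x_next, max_level=5):
    """g(x_next | x^n) as the ratio g^{n+1}(x^n, x_next) / g^n(x^n)."""
    return math.exp(log_g(xs + [x_next], max_level) - log_g(xs, max_level))

def predictive_probability(xs, lo, hi, max_level=5):
    """nu([lo, hi) | x^n), for lo and hi on the finest dyadic grid."""
    cells = 2 ** max_level
    width = 1.0 / cells
    total = 0.0
    for c in range(cells):
        mid = (c + 0.5) * width   # g(. | x^n) is constant on each finest cell
        if lo <= mid < hi:
            total += predictive_density(xs, mid, max_level) * width
    return total

xs = [0.1, 0.15, 0.2, 0.12, 0.8, 0.18, 0.11, 0.9]   # hypothetical observations
p_low = predictive_probability(xs, 0.0, 0.25)
p_mid = predictive_probability(xs, 0.5, 0.75)
p_all = predictive_probability(xs, 0.0, 1.0)
print(p_low, p_mid, p_all)
```

Because the predictive density is a ratio of two mixture densities, it integrates to one over [0, 1), and regions that contained more past observations receive more predictive mass.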
  • 16. Summary
    Summary and Discussion
    Universal Prediction
    Connection to universal Bayesian measures
    Generalization without assuming discrete or continuous
    Stronger universality in the sense of Bayes
    Many applications other than prediction:
    Bayesian network structure estimation (DCC 2012)
    The Bayesian Chow-Liu Algorithm (PGM 2012)
    Markov order estimation even when $\{X_i\}$ is continuous