Harmonic Analysis & Deep Learning
Sungbin Lim
In this talk…
Mathematical theory of filters, activations, and
pooling across the layers of a DCNN
Encompasses general ingredients
Lipschitz continuity & Deformation sensitivity
WARNING: Very tough mathematics
…without non-Euclidean geometry (e.g. Geometric DL)
What is Harmonic Analysis?
$f(x) = \sum_{n \in \mathbb{N}} a_n \psi_n(x), \quad a_n := \langle f, \psi_n \rangle_H$
How to represent a function efficiently in the
sense of Hilbert space?
Number theory
Signal processing
Quantum mechanics
Neuroscience, Statistics, Finance, etc…
Includes PDE theory, Stochastic Analysis
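As a rough illustration of representing a function by its inner products with basis elements, the sketch below expands a function on [0, 1] in an orthonormal sine basis. The choice of basis, the test function `f`, the midpoint-rule quadrature, and the names `inner`, `psi`, and `partial_sum` are all assumptions made for this example, not part of the talk.

```python
import math

def inner(f, g, steps=2000):
    """Approximate the L2([0, 1]) inner product <f, g> with the midpoint rule."""
    h = 1.0 / steps
    return h * sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(steps))

def psi(n):
    """n-th element of the orthonormal sine basis of L2([0, 1]) (assumed basis)."""
    return lambda x: math.sqrt(2.0) * math.sin(n * math.pi * x)

def f(x):
    """Hypothetical function to represent."""
    return x * (1.0 - x)

N = 15
# Coefficients a_n = <f, psi_n> for n = 1..N.
coeffs = [inner(f, psi(n)) for n in range(1, N + 1)]

def partial_sum(x):
    """Truncated expansion sum_{n <= N} a_n psi_n(x)."""
    return sum(a * psi(n)(x) for n, a in enumerate(coeffs, start=1))

error_at_half = abs(f(0.5) - partial_sum(0.5))
```

Only 15 numbers are stored, yet the truncated sum stays close to `f`; this is the sense in which a good basis gives an efficient representation.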
Hilbert space & Inner product
Banach space: Normed space + Completeness
$\mathbb{C}^n, L_p, W_p^n, \cdots$
Hilbert space: Banach space + Inner product
$\mathbb{R}^d, L_2, W_2^n, \cdots$
$\langle u, v \rangle = \sum_{k=1}^{d} u_k v_k$
$\langle f, g \rangle_{L_2} = \int f(x) g(x)\,dx$
$\langle f, g \rangle_{W_2^n} = \langle f, g \rangle_{L_2} + \sum_{k=1}^{n} \langle \partial_x^k f, \partial_x^k g \rangle_{L_2}$
© Kyung-Min Rho
Why Harmonic Analysis?
$P_n(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$
Encoding ⇢ $(a_n, a_{n-1}, \ldots, a_1, a_0)$
Decoding ⇢ $P_n(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$
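The encoding and decoding steps can be sketched in a few lines: a polynomial is stored as its coefficient tuple and recovered by evaluation. The function name `decode` and the sample polynomial are illustrative assumptions.

```python
def decode(coeffs, x):
    """Evaluate P_n(x) from (a_n, ..., a_1, a_0) using Horner's rule."""
    result = 0
    for a in coeffs:
        result = result * x + a
    return result

# Encoding: P_2(x) = 3x^2 - 2x + 5 is stored as its coefficient tuple.
coeffs = (3, -2, 5)

# Decoding: the tuple alone is enough to evaluate the polynomial anywhere.
value = decode(coeffs, 2)  # 3*4 - 2*2 + 5
```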
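Why do we prefer polynomials?
Stone-Weierstrass theorem
Polynomials are universal approximators!
$\forall f \in C(X),\ \forall \varepsilon > 0,\ \exists P_n \ \text{s.t.}\ \max_{x \in X} |f(x) - P_n(x)| < \varepsilon$
Equivalently: $\forall f \in C(X),\ \exists P_n \ \text{s.t.}\ \lim_{n \to \infty} \|f - P_n\|_\infty = 0$
© Wikipedia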
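One constructive way to see uniform polynomial approximation on [0, 1] is through Bernstein polynomials. The sketch below is only an illustration under that assumption (the slides do not say which construction is meant); the test function and the helper names `bernstein` and `sup_error` are hypothetical.

```python
import math

def bernstein(f, n):
    """Return the degree-n Bernstein polynomial of f on [0, 1]."""
    samples = [f(k / n) for k in range(n + 1)]

    def p(x):
        return sum(s * math.comb(n, k) * x**k * (1 - x)**(n - k)
                   for k, s in enumerate(samples))
    return p

def sup_error(f, p, grid=201):
    """Approximate max |f - p| on a uniform grid of [0, 1]."""
    return max(abs(f(i / (grid - 1)) - p(i / (grid - 1))) for i in range(grid))

def f(x):
    """Continuous but not differentiable at 0.5 (hypothetical test function)."""
    return abs(x - 0.5)

# The uniform error should shrink as the degree grows.
errors = {n: sup_error(f, bernstein(f, n)) for n in (10, 50, 200)}
```

The error decays slowly for this non-smooth `f`, which hints that universality alone says little about efficiency.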
We can even approximate derivatives!
$\forall f \in C^k(X),\ \exists P_n \ \text{s.t.}\ \lim_{n \to \infty} \|f - P_n\|_{C^k} = 0$
Universal approximators = {DL, polynomials, trees, …}
But why do we not use polynomials?
© Wikipedia
Local interpolation works well for low dimension
© S. Mallat
Need $\varepsilon^{-d}$ points to cover $[0, 1]^d$ at a distance $\varepsilon$
High dimension ⇢ Curse of dimensionality!
© H. Bölcskei
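To make the growth concrete, the following sketch counts the points of a regular grid on the unit cube. The helper `grid_points` and the choice of spacing 0.1 are assumptions for illustration.

```python
def grid_points(cells_per_axis, d):
    """Points of a regular grid on [0, 1]^d with spacing eps = 1 / cells_per_axis."""
    return cells_per_axis ** d

# eps = 0.1 gives eps^(-d) = 10^d points.
counts = {d: grid_points(10, d) for d in (1, 2, 3, 10, 100)}
```

Already at d = 100 the count is 10^100, far beyond any dataset, so local interpolation cannot be the mechanism that works in high dimension.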
Universal approximator
= Good feature extractor?
…in HIGH dimension!
Nonlinear Feature Extraction
© S. Mallat, © H. Bölcskei
Dimension Reduction ⇢ Invariants
How?
© S. Mallat
Main Topic in Harmonic Analysis
Linear operator ⇢ Convolution + Multiplier
$L[f](x) = \langle T_x[K], f \rangle \iff \widehat{L[f]}(\omega) = \widehat{K}(\omega)\,\widehat{f}(\omega)$
Invariance vs Discriminability
Littlewood-Paley Condition ⇢ Semi-discrete Frame
$A\|f\|_H \le \|L[f]\|_H \le B\|f\|_H$
$\|L[f_1] - L[f_2]\|_H = \|L[f_1 - f_2]\|_H \ge A\|f_1 - f_2\|_H$
i.e. $f_1 \ne f_2 \Rightarrow L[f_1] \ne L[f_2]$
$\|\underbrace{L \circ \cdots \circ L}_{n\text{-fold}}[f]\|_H \le B\,\|\underbrace{L \circ \cdots \circ L}_{(n-1)\text{-fold}}[f]\|_H \le \cdots \le B^n \|f\|_H$
Banach fixed-point theorem
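A finite-dimensional analogue may help here. In the sketch below, signals are length-8 vectors, L is a circular convolution, the Fourier transform is a naive DFT, and H is replaced by the Euclidean norm; the kernel, the signal, and all function names are assumptions for illustration. Under these assumptions the multiplier is the DFT of the kernel, and the frame bounds A and B are the smallest and largest magnitudes of that multiplier.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (fine for short signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def apply_L(kernel, f):
    """L[f] = circular convolution of f with the kernel K."""
    n = len(f)
    return [sum(kernel[(i - j) % n] * f[j] for j in range(n)) for i in range(n)]

def norm(x):
    """Euclidean norm, standing in for the Hilbert-space norm."""
    return math.sqrt(sum(abs(v) ** 2 for v in x))

kernel = [0.6, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2]  # hypothetical filter K
f = [1.0, -2.0, 0.5, 3.0, 0.0, -1.0, 2.0, 0.25]     # hypothetical signal

# Convolution theorem: DFT(L[f]) equals the pointwise product DFT(K) * DFT(f).
Lf = apply_L(kernel, f)
gap = max(abs(a - b * c) for a, b, c in zip(dft(Lf), dft(kernel), dft(f)))

# Frame bounds are the extreme values of the multiplier |DFT(K)|.
multiplier = [abs(v) for v in dft(kernel)]
A, B = min(multiplier), max(multiplier)

# n-fold application: ||L^n f|| <= B^n ||f||.
g, n_fold_ok = f, True
for n in range(1, 6):
    g = apply_L(kernel, g)
    n_fold_ok = n_fold_ok and norm(g) <= B ** n * norm(f) + 1e-12
```

For this kernel the multiplier is 0.6 + 0.4 cos(2πk/8), so A is about 0.2 and B about 1: A > 0 keeps distinct signals distinct, while B ≤ 1 keeps repeated application from blowing up.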
Main Tasks in Deep CNN
Representation learning
Feature Extraction
Nonlinear transform
Lipschitz continuity
ex) ReLU, tanh, sigmoid …
$|f(x) - f(y)| \le C\|x - y\| \iff \|\nabla f(x)\| \le C$
How to control Lipschitz?
Theorem
$\|\rho(L[f])\|_H \le N(B, C)\|f\|_H$
No change in Invariance!
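The Lipschitz constants of the activations named above can be estimated numerically. The sketch below takes the largest finite-difference slope on a grid, which is a lower bound on the true constant C; the interval [-5, 5], the grid size, and the helper name `lipschitz_estimate` are assumptions.

```python
import math

def relu(x):
    return max(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lipschitz_estimate(f, lo=-5.0, hi=5.0, steps=10000):
    """Largest finite-difference slope of f on a uniform grid (a lower bound on C)."""
    h = (hi - lo) / steps
    xs = [lo + i * h for i in range(steps + 1)]
    return max(abs(f(b) - f(a)) / (b - a) for a, b in zip(xs, xs[1:]))

estimates = {
    "relu": lipschitz_estimate(relu),
    "tanh": lipschitz_estimate(math.tanh),
    "sigmoid": lipschitz_estimate(sigmoid),
}
```

The estimates come out near 1 for ReLU and tanh and near 1/4 for the sigmoid, matching the bounds on their derivatives.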
Proof) Let $\rho = \mathrm{ReLU}$, $H = W_2^1$. Then
$$
\begin{aligned}
\|\rho(L[f])\|_{W_2^1} &= \|\max\{L[f], 0\}\|_{L_2} + \|\nabla \rho(L[f])\|_{L_2} \\
&\le \|L[f]\|_{L_2} + \|\underbrace{\rho'(L[f])}_{=1 \text{ or } 0}\, \nabla(L[f])\|_{L_2} \\
&\le \|L[f]\|_{L_2} + \|\nabla(L[f])\|_{L_2} \\
&= \|L[f]\|_{W_2^1} \\
&\le B\|f\|_{W_2^1}
\end{aligned}
$$
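The key step of the proof, that ReLU increases neither the L2 part nor the gradient part of the norm, can be checked on a sampled signal. In this sketch the weak derivative is replaced by forward differences and the signal `g` is a hypothetical stand-in for L[f]; both are assumptions, so this is a discrete analogue rather than the statement itself.

```python
import math

def relu(values):
    return [max(v, 0.0) for v in values]

def l2(values, h):
    """Discrete L2 norm on a grid with spacing h."""
    return math.sqrt(h * sum(v * v for v in values))

def gradient(values, h):
    """Forward finite differences as a stand-in for the weak derivative."""
    return [(b - a) / h for a, b in zip(values, values[1:])]

def w12_norm(values, h):
    """Discrete analogue of ||g||_{L2} + ||grad g||_{L2}, as used in the proof."""
    return l2(values, h) + l2(gradient(values, h), h)

h = 0.01
# Hypothetical sign-changing signal playing the role of L[f].
g = [math.sin(6 * i * h) - 0.3 for i in range(101)]

before, after = w12_norm(g, h), w12_norm(relu(g), h)
```

Because |ReLU(a)| ≤ |a| and |ReLU(b) - ReLU(a)| ≤ |b - a| hold pointwise, `after` cannot exceed `before`.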
What about Discriminability?
Scale Invariant Feature
Translation Invariant
Stable at Deformation
© S. Mallat
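Scattering Network (Mallat, 2012)
$$\Phi(f) = \bigcup_n \Big\{ \underbrace{\cdots |f * g_{\lambda^{(j)}}| * g_{\lambda^{(k)}} \cdots * g_{\lambda^{(p)}}}_{n\text{-fold convolution}} * \chi_n \Big\}_{\lambda^{(j)}, \cdots, \lambda^{(p)}}$$
© H. Bölcskei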
Generalized Scattering Network (Wiatowski, 2015)
$$\Phi(f) = \bigcup_n \Big\{ \underbrace{\cdots |f * g_{\lambda^{(j)}}| * g_{\lambda^{(k)}} \cdots * g_{\lambda^{(p)}}}_{n\text{-fold convolution}} * \chi_n \Big\}_{\lambda^{(j)}, \cdots, \lambda^{(p)}}$$
Gabor frame
Tensor wavelet
Directional wavelet
Ridgelet frame
Curvelet frame
© H. Bölcskei
Linearize symmetries
“Space folding”, Cho (2014)
© S. Mallat
$f \mapsto S_n^{d/2} P_n(f)(S_n \cdot)$
Theorem
$|||\Phi_n(T_t f) - \Phi_n(f)||| = O\left(\frac{\|t\|}{\prod_{j=1}^{n} S_j}\right)$
Features become more translation invariant
with increasing network depth
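A toy cascade of convolution and modulus shows how translation invariance can arise. This sketch makes strong simplifying assumptions: signals are length-8 vectors, translations are circular shifts, and the output filter is replaced by a global average, so invariance is exact here rather than improving with depth as in the theorem. The filters, the signal, and all function names are hypothetical.

```python
def circ_conv(f, g):
    """Circular convolution of two equal-length real signals."""
    n = len(f)
    return [sum(f[j] * g[(i - j) % n] for j in range(n)) for i in range(n)]

def modulus_layer(signals, filters):
    """One layer: convolve every signal with every filter, then take |.|."""
    return [[abs(v) for v in circ_conv(s, g)] for s in signals for g in filters]

def scattering_features(f, filters, depth=2):
    """Toy scattering features: global averages of each layer's outputs."""
    layer, features = [f], [sum(f) / len(f)]
    for _ in range(depth):
        layer = modulus_layer(layer, filters)
        features += [sum(u) / len(u) for u in layer]
    return features

def shift(f, t):
    """Circular translation T_t f."""
    return [f[(i - t) % len(f)] for i in range(len(f))]

# Hypothetical filters (difference-like and smoothing), zero-padded to length 8.
filters = [[1.0, -1.0, 0, 0, 0, 0, 0, 0], [0.5, 0.5, 0, 0, 0, 0, 0, 0]]
f = [0.0, 1.0, 3.0, -2.0, 0.5, 0.0, 4.0, -1.0]

# Largest difference between features of f and of its translate.
gap = max(abs(a - b) for a, b in zip(scattering_features(f, filters),
                                     scattering_features(shift(f, 3), filters)))
```

The features of `f` and of its shifted copy agree up to rounding, because convolution commutes with shifts, the modulus acts pointwise, and averaging discards position.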
Theorem
$F_{\tau,\omega}[f](x) = e^{2\pi i \omega(x)} f(x - \tau(x))$
$|||\Phi(F_{\tau,\omega}[f]) - \Phi(f)||| \le C(\|\tau\|_\infty + \|\omega\|_\infty)\|f\|_{L_2}$
© Philip Scott Johnson
Multi-layer convolutions linearize features,
i.e. they are stable to deformations
Ergodic Reconstructions
© Philip Scott Johnson
© S. Mallat
David Hilbert
"We must know. We will know." (Wir müssen wissen. Wir werden wissen.)
Q&A

 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical Engineering
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensor
 

Harmonic Analysis and Deep Learning

  • 2. In this talk… Mathematical theory of filters, activations, and pooling across multiple layers of a DCNN. Encompasses general ingredients. Lipschitz continuity & deformation sensitivity. WARNING: very tough mathematics …without non-Euclidean geometry (e.g. Geometric DL)
  • 3. What is Harmonic Analysis? $f(x) = \sum_{n \in \mathbb{N}} a_n \phi_n(x), \quad a_n := \langle f, \phi_n \rangle_H$. How to represent a function efficiently in the sense of Hilbert space? Number theory. Signal processing. Quantum mechanics. Neuroscience, Statistics, Finance, etc… Includes PDE theory, Stochastic Analysis
  • 5. Hilbert space & Inner product Banach space : Hilbert space : © Kyung-Min Rho
  • 6. Hilbert space & Inner product. Banach space: Normed space + Completeness. Hilbert space: © Kyung-Min Rho
  • 7. Hilbert space & Inner product. Banach space: Normed space + Completeness. Hilbert space: Banach space + Inner product. © Kyung-Min Rho
  • 8. Hilbert space & Inner product. Banach space: Normed space + Completeness ($C^n, L_p, W^n_p, \cdots$). Hilbert space: Banach space + Inner product ($\mathbb{R}^d, L_2, W^n_2, \cdots$). © Kyung-Min Rho
  • 9. Hilbert space & Inner product. Banach space: Normed space + Completeness ($C^n, L_p, W^n_p, \cdots$). Hilbert space: Banach space + Inner product ($\mathbb{R}^d, L_2, W^n_2, \cdots$). $\langle u, v \rangle = \sum_{k=1}^{d} u_k v_k$, $\quad \langle f, g \rangle_{L_2} = \int f(x) g(x)\,dx$, $\quad \langle f, g \rangle_{W^n_2} = \langle f, g \rangle_{L_2} + \sum_{k=1}^{n} \langle \partial_x^k f, \partial_x^k g \rangle_{L_2}$. © Kyung-Min Rho
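The expansion $a_n := \langle f, \phi_n \rangle_H$ from the opening slides can be tried out in the simplest Hilbert space listed above, $\mathbb{R}^d$ with the Euclidean inner product. The following is only a minimal sketch: the orthonormal basis is an assumption (the columns of a QR factor of a random matrix), not something the slides specify.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8

# Assumed orthonormal basis of R^d: the columns of Q from a QR factorization.
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))

f = rng.standard_normal(d)

# Coefficients a_n = <f, phi_n>, with <u, v> = sum_k u_k v_k.
a = Q.T @ f

# Reconstruction f = sum_n a_n phi_n.
f_rec = Q @ a

reconstruction_error = float(np.linalg.norm(f - f_rec))
# Parseval: the squared norm of f equals the squared norm of its coefficients.
parseval_gap = float(abs(np.dot(f, f) - np.dot(a, a)))
```

With an orthonormal basis both quantities should sit at floating-point rounding level.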
  • 10. Why Harmonic Analysis? $P_n(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$
  • 11. Why Harmonic Analysis? $P_n(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$. Encoding: $(a_n, a_{n-1}, \dots, a_1, a_0)$
  • 12. Why Harmonic Analysis? $P_n(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$. Encoding: $(a_n, a_{n-1}, \dots, a_1, a_0)$. Decoding: $P_n(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$
  • 13. Why Harmonic Analysis? $P_n(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$. Encoding: $(a_n, a_{n-1}, \dots, a_1, a_0)$. Decoding: $P_n(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$. Why do we prefer polynomials?
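The encoding and decoding steps for a polynomial can be sketched as follows. The function names `encode` and `decode` are hypothetical, and using a least-squares fit (`np.polyfit`) for the encoding is an assumption made for illustration; the slides only state that a polynomial is represented by its coefficient tuple.

```python
import numpy as np

def decode(coeffs, x):
    """Evaluate P_n(x) from (a_n, ..., a_1, a_0) using Horner's rule."""
    result = 0.0
    for a in coeffs:
        result = result * x + a
    return result

def encode(xs, ys, degree):
    """Recover (a_n, ..., a_0) from samples by a least-squares fit."""
    return np.polyfit(xs, ys, degree)

coeffs = (2.0, -3.0, 0.0, 5.0)        # 2x^3 - 3x^2 + 5
xs = np.linspace(-1.0, 1.0, 9)
ys = decode(coeffs, xs)                # decoding: coefficients -> function values
recovered = encode(xs, ys, degree=3)   # encoding: function values -> coefficients
value_at_two = decode(coeffs, 2.0)
```

Four numbers are enough to store the whole cubic, which is the efficiency of representation the slides point at.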
  • 14. Stone-Weierstrass theorem. Polynomials are universal approximators! $\forall f \in C(X),\ \forall \varepsilon > 0,\ \exists P_n$ s.t. $\max_{x \in X} |f(x) - P_n(x)| < \varepsilon$ © Wikipedia
  • 15. Stone-Weierstrass theorem. Polynomials are universal approximators! $\forall f \in C(X),\ \exists P_n$ s.t. $\lim_{n \to \infty} \|f - P_n\|_\infty = 0$ © Wikipedia
  • 16. Stone-Weierstrass theorem. Polynomials are universal approximators! We can even approximate derivatives! $\forall f \in C^k(X),\ \exists P_n$ s.t. $\lim_{n \to \infty} \|f - P_n\|_{C^k} = 0$ © Wikipedia
  • 17. Stone-Weierstrass theorem. Polynomials are universal approximators! We can even approximate derivatives! $\forall f \in C^k(X),\ \exists P_n$ s.t. $\lim_{n \to \infty} \|f - P_n\|_{C^k} = 0$. Universal approximation = {DL, polynomials, trees, …} © Wikipedia
  • 18. Stone-Weierstrass theorem. Polynomials are universal approximators! We can even approximate derivatives! $\forall f \in C^k(X),\ \exists P_n$ s.t. $\lim_{n \to \infty} \|f - P_n\|_{C^k} = 0$. Universal approximation = {DL, polynomials, trees, …} But why do we not use polynomials? © Wikipedia
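The uniform convergence $\|f - P_n\|_\infty \to 0$ can be observed numerically. The theorem only asserts that approximating polynomials exist; the sketch below assumes one particular construction (Chebyshev interpolation via NumPy) and one particular target ($f = \exp$ on $[-1, 1]$), neither of which comes from the slides.

```python
import numpy as np
from numpy.polynomial import Chebyshev

def sup_error(func, deg, samples=2001):
    """Approximate max |f(x) - P_deg(x)| on [-1, 1] on a fine grid."""
    p = Chebyshev.interpolate(func, deg)   # interpolant on the default domain [-1, 1]
    x = np.linspace(-1.0, 1.0, samples)
    return float(np.max(np.abs(func(x) - p(x))))

# Sup-norm error for increasing polynomial degree.
errors = {deg: sup_error(np.exp, deg) for deg in (2, 4, 8, 16)}
```

For a smooth one-dimensional target the error falls quickly with the degree, which is why the question on the slide is about what goes wrong elsewhere.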
  • 19. Local interpolation works well for low dimension © S. Mallat
  • 20. Local interpolation works well for low dimension. Need $\varepsilon^{-d}$ points to cover $[0, 1]^d$ at a distance $\varepsilon$ © S. Mallat
  • 21. Local interpolation works well for low dimension. Need $\varepsilon^{-d}$ points to cover $[0, 1]^d$ at a distance $\varepsilon$. High dimension ⇢ Curse of dimension! © H. Bölcskei
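The $\varepsilon^{-d}$ count is easy to tabulate. This is a rough sketch that assumes a regular grid with spacing $\varepsilon$ along each axis, which matches the slide's count up to constants; the helper name `grid_points` is hypothetical.

```python
import math

def grid_points(eps, d):
    """Number of points in a regular grid with spacing eps on [0, 1]^d."""
    return math.ceil(1.0 / eps) ** d

# Points needed at resolution eps = 0.1 as the dimension grows.
counts = {d: grid_points(0.1, d) for d in (1, 2, 3, 10, 100)}
```

At a fixed resolution the count grows exponentially in $d$, so local interpolation becomes infeasible long before image-sized dimensions are reached.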
  • 22. Universal approximator = Good feature extractor ?
  • 23. Universal approximator = Good feature extractor …in HIGH dimension!
  • 24. Nonlinear Feature Extraction © S. Mallat, © H. Bölcskei
  • 25. Dimension Reduction ⇢ Invariants © S. Mallat
  • 26. Dimension Reduction ⇢ Invariants How? © S. Mallat
  • 27. Main Topic in Harmonic Analysis. Linear operator ⇢ Convolution + Multiplier: $L[f](x) = \langle T_x[K], f \rangle \iff \widehat{L[f]}(\omega) = \widehat{K}(\omega)\,\widehat{f}(\omega)$. Invariance vs Discriminability
  • 28. Main Topic in Harmonic Analysis. Linear operator ⇢ Convolution + Multiplier: $L[f](x) = \langle T_x[K], f \rangle \iff \widehat{L[f]}(\omega) = \widehat{K}(\omega)\,\widehat{f}(\omega)$. Invariance vs Discriminability
  • 29. Main Topic in Harmonic Analysis. Linear operator ⇢ Convolution + Multiplier: $L[f](x) = \langle T_x[K], f \rangle \iff \widehat{L[f]}(\omega) = \widehat{K}(\omega)\,\widehat{f}(\omega)$. Discriminability vs Invariance. Littlewood-Paley Condition ⇢ Semi-discrete Frame: $A\|f\|_H \le \|L[f]\|_H \le B\|f\|_H$
  • 30. Main Topic in Harmonic Analysis. Linear operator ⇢ Convolution + Multiplier: $L[f](x) = \langle T_x[K], f \rangle \iff \widehat{L[f]}(\omega) = \widehat{K}(\omega)\,\widehat{f}(\omega)$. Discriminability vs Invariance. Littlewood-Paley Condition ⇢ Semi-discrete Frame: $A\|f\|_H \le \|L[f]\|_H \le B\|f\|_H$. $\|L[f_1] - L[f_2]\|_H = \|L[f_1 - f_2]\|_H \ge A\|f_1 - f_2\|_H$, i.e. $f_1 \ne f_2 \Rightarrow L[f_1] \ne L[f_2]$
  • 31. Main Topic in Harmonic Analysis. Linear operator ⇢ Convolution + Multiplier: $L[f](x) = \langle T_x[K], f \rangle \iff \widehat{L[f]}(\omega) = \widehat{K}(\omega)\,\widehat{f}(\omega)$. Discriminability vs Invariance. Littlewood-Paley Condition ⇢ Semi-discrete Frame: $A\|f\|_H \le \|L[f]\|_H \le B\|f\|_H$. $\|\underbrace{L \circ \cdots \circ L}_{n\text{-fold}}[f]\|_H \le B\,\|\underbrace{L \circ \cdots \circ L}_{(n-1)\text{-fold}}[f]\|_H \le \cdots \le B^n \|f\|_H$
  • 32. Main Topic in Harmonic Analysis. Linear operator ⇢ Convolution + Multiplier: $L[f](x) = \langle T_x[K], f \rangle \iff \widehat{L[f]}(\omega) = \widehat{K}(\omega)\,\widehat{f}(\omega)$. Discriminability vs Invariance. Littlewood-Paley Condition ⇢ Semi-discrete Frame: $A\|f\|_H \le \|L[f]\|_H \le B\|f\|_H$. $\|\underbrace{L \circ \cdots \circ L}_{n\text{-fold}}[f]\|_H \le B\,\|\underbrace{L \circ \cdots \circ L}_{(n-1)\text{-fold}}[f]\|_H \le \cdots \le B^n \|f\|_H$. Banach fixed-point theorem
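The convolution/multiplier correspondence and the two-sided bound $A\|f\| \le \|L[f]\| \le B\|f\|$ can be checked in a discrete setting. The sketch below assumes circular convolution on $\mathbb{Z}_N$ as a stand-in for $L$, a random kernel `K`, and the choices $A = \min_\omega |\widehat{K}(\omega)|$, $B = \max_\omega |\widehat{K}(\omega)|$, which follow from Parseval's identity in this finite setting; none of these specifics come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64
K = rng.standard_normal(N)   # assumed kernel
f = rng.standard_normal(N)   # assumed input signal

# L[f] computed directly as a circular convolution.
direct = np.array([sum(K[(n - m) % N] * f[m] for m in range(N)) for n in range(N)])

# The same operator as a Fourier multiplier: hat(L[f]) = hat(K) * hat(f).
K_hat = np.fft.fft(K)
via_fft = np.real(np.fft.ifft(K_hat * np.fft.fft(f)))

# Frame-type bounds read off from the multiplier.
A = float(np.min(np.abs(K_hat)))
B = float(np.max(np.abs(K_hat)))
norm_f = float(np.linalg.norm(f))
norm_Lf = float(np.linalg.norm(direct))
```

A lower bound $A > 0$ means no frequency is annihilated (discriminability), while the upper bound $B$ controls how much the operator can amplify a signal.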
  • 33. Main Tasks in Deep CNN Representation learning Feature Extraction Nonlinear transform
  • 35. Main Tasks in Deep CNN: Representation learning, Feature Extraction, Nonlinear transform. Lipschitz continuity, e.g. ReLU, tanh, sigmoid, …: $|f(x) - f(y)| \le C\|x - y\| \iff \|\nabla f(x)\| \le C$
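The Lipschitz constants of the activations named on the slide can be estimated from finite-difference slopes. This is a rough numerical sketch under assumed settings (the interval $[-10, 10]$, the grid size, and the helper name `lipschitz_estimate` are all illustrative choices); it estimates $C$ from below and is not a proof.

```python
import numpy as np

def lipschitz_estimate(func, lo=-10.0, hi=10.0, samples=20001):
    """Largest slope |f(x) - f(y)| / |x - y| over neighbouring grid points."""
    x = np.linspace(lo, hi, samples)
    y = func(x)
    return float(np.max(np.abs(np.diff(y)) / np.diff(x)))

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

estimates = {
    "relu": lipschitz_estimate(relu),
    "tanh": lipschitz_estimate(np.tanh),
    "sigmoid": lipschitz_estimate(sigmoid),
}
```

The estimates should land near the known constants $C = 1$ for ReLU and tanh and $C = 1/4$ for the sigmoid, matching the maximum of each derivative.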
  • 36. How to control Lipschitz? Theorem: $\|\rho(L[f])\|_H \le N(B, C)\|f\|_H$. No change in Invariance!
  • 37. How to control Lipschitz? Theorem: $\|\rho(L[f])\|_H \le N(B, C)\|f\|_H$. No change in Invariance! Proof) Let $\rho = \mathrm{ReLU}$, $H = W^1_2$. Then
  • 38. How to control Lipschitz? Theorem: $\|\rho(L[f])\|_H \le N(B, C)\|f\|_H$. No change in Invariance! Proof) Let $\rho = \mathrm{ReLU}$, $H = W^1_2$. Then $\|\rho(L[f])\|_{W^1_2} = \|\max\{L[f], 0\}\|_{L_2} + \|\nabla \rho(L[f])\|_{L_2} \le \|L[f]\|_{L_2} + \|\underbrace{\rho'(L[f])}_{=1 \text{ or } 0} \nabla(L[f])\|_{L_2} \le \|L[f]\|_{L_2} + \|\nabla(L[f])\|_{L_2} = \|L[f]\|_{W^1_2} \le B\|f\|_{W^1_2}$
  • 43. How to control Lipschitz? Theorem: $\|\rho(L[f])\|_H \le N(B, C)\|f\|_H$. No change in Invariance! Proof) Let $\rho = \mathrm{ReLU}$, $H = W^1_2$. Then $\|\rho(L[f])\|_{W^1_2} = \|\max\{L[f], 0\}\|_{L_2} + \|\nabla \rho(L[f])\|_{L_2} \le \|L[f]\|_{L_2} + \|\underbrace{\rho'(L[f])}_{=1 \text{ or } 0} \nabla(L[f])\|_{L_2} \le \|L[f]\|_{L_2} + \|\nabla(L[f])\|_{L_2} = \|L[f]\|_{W^1_2} \le B\|f\|_{W^1_2}$. What about Discriminability?
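The key step of the proof, that applying ReLU does not increase the $W^1_2$ norm, can be checked on a discretized function. The sketch assumes a one-dimensional grid, forward differences for the gradient, and an arbitrary test function `g` standing in for $L[f]$; the norm is taken as the sum of the two $L_2$ norms, as written in the proof.

```python
import numpy as np

x = np.linspace(-3.0, 3.0, 6001)
h = x[1] - x[0]
g = np.sin(3.0 * x) * np.exp(-x**2)   # assumed stand-in for L[f]

def l2_norm(values, step):
    """Riemann-sum approximation of the L2 norm."""
    return float(np.sqrt(np.sum(values**2) * step))

def w12_norm(values, step):
    """||u||_{L2} + ||u'||_{L2}, with u' taken as a forward difference."""
    grad = np.diff(values) / step
    return l2_norm(values, step) + l2_norm(grad, step)

lhs = w12_norm(np.maximum(g, 0.0), h)   # ||ReLU(g)||_{W^1_2}
rhs = w12_norm(g, h)                    # ||g||_{W^1_2}
```

The inequality `lhs <= rhs` holds here because ReLU is 1-Lipschitz and satisfies $|\rho(a)| \le |a|$, which is exactly what the two estimates in the proof use.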
  • 44. Scale Invariant Feature Translation Invariant Stable at Deformation © S. Mallat
  • 45. Scale Invariant Feature Translation Invariant Stable at Deformation
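  • 46. Scattering Network (Mallat, 2012) $\Phi(f) = \bigcup_n \big( \underbrace{\cdots |f * g_{\lambda^{(j)}}| * g_{\lambda^{(k)}} \cdots * g_{\lambda^{(p)}}}_{n\text{-fold convolution}} * \chi_n \big)_{\lambda^{(j)}, \cdots, \lambda^{(p)}}$ © H. Bölcskei
  • 47. Generalized Scattering Network (Wiatowski, 2015) $\Phi(f) = \bigcup_n \big( \underbrace{\cdots |f * g_{\lambda^{(j)}}| * g_{\lambda^{(k)}} \cdots * g_{\lambda^{(p)}}}_{n\text{-fold convolution}} * \chi_n \big)_{\lambda^{(j)}, \cdots, \lambda^{(p)}}$ Gabor frame, Tensor wavelet, Directional wavelet, Ridgelet frame, Curvelet frame © H. Bölcskei
  • 48. Generalized Scattering Network (Wiatowski, 2015) $\Phi(f) = \bigcup_n \big( \underbrace{\cdots |f * g_{\lambda^{(j)}}| * g_{\lambda^{(k)}} \cdots * g_{\lambda^{(p)}}}_{n\text{-fold convolution}} * \chi_n \big)_{\lambda^{(j)}, \cdots, \lambda^{(p)}}$ © S. Mallat
  • 49. Generalized Scattering Network (Wiatowski, 2015) $\Phi(f) = \bigcup_n \big( \underbrace{\cdots |f * g_{\lambda^{(j)}}| * g_{\lambda^{(k)}} \cdots * g_{\lambda^{(p)}}}_{n\text{-fold convolution}} * \chi_n \big)_{\lambda^{(j)}, \cdots, \lambda^{(p)}}$ Linearize symmetries © S. Mallat
  • 50. Generalized Scattering Network (Wiatowski, 2015) $\Phi(f) = \bigcup_n \big( \underbrace{\cdots |f * g_{\lambda^{(j)}}| * g_{\lambda^{(k)}} \cdots * g_{\lambda^{(p)}}}_{n\text{-fold convolution}} * \chi_n \big)_{\lambda^{(j)}, \cdots, \lambda^{(p)}}$ Linearize symmetries. “Space folding”, Cho (2014) © S. Mallat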
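  • 51. Generalized Scattering Network (Wiatowski, 2015) $\Phi(f) = \bigcup_n \big( \underbrace{\cdots |f * g_{\lambda^{(j)}}| * g_{\lambda^{(k)}} \cdots * g_{\lambda^{(p)}}}_{n\text{-fold convolution}} * \chi_n \big)_{\lambda^{(j)}, \cdots, \lambda^{(p)}}$ Pooling: $f \mapsto S_n^{d/2} P_n(f)(S_n \cdot)$. Theorem: $|||\Phi^n(T_t f) - \Phi^n(f)||| = O\!\left( \frac{\|t\|}{\prod_{j=1}^{n} S_j} \right)$
  • 52. Generalized Scattering Network (Wiatowski, 2015) Pooling: $f \mapsto S_n^{d/2} P_n(f)(S_n \cdot)$. Theorem: $|||\Phi^n(T_t f) - \Phi^n(f)||| = O\!\left( \frac{\|t\|}{\prod_{j=1}^{n} S_j} \right)$. Features become more translation invariant with increasing network depth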
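The effect of averaging on translation sensitivity can be illustrated with a one-layer toy feature $|f * g| * \chi$. This is a loose sketch, not the construction of Mallat or Wiatowski: the band-pass filter `g`, the box filter used for $\chi$, the circular convolutions, and the use of the averaging width in place of the pooling factors $S_j$ are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 256
n = np.arange(N)

def circ_conv(a, b):
    """Circular convolution of two real length-N signals via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

# Assumed band-pass filter: a Gaussian window modulated by a cosine.
g = np.exp(-0.5 * ((n - N // 2) / 4.0) ** 2) * np.cos(2 * np.pi * 0.2 * n)

def feature(f, width):
    """Toy feature |f * g| * chi, with chi a normalized box of the given width."""
    u = np.abs(circ_conv(f, g))
    chi = np.zeros(N)
    chi[:width] = 1.0 / width
    return circ_conv(u, chi)

f = rng.standard_normal(N)
shifted = np.roll(f, 5)   # T_t f with t = 5 samples

# Distance between the features of f and of its translate, per averaging width.
gap = {w: float(np.linalg.norm(feature(shifted, w) - feature(f, w))) for w in (1, 16, 256)}
```

Width 1 applies no averaging, and width `N` averages over the whole signal, which makes the feature exactly translation invariant up to rounding. Intermediate widths cannot exceed the width-1 gap, since $\chi$ has unit $\ell_1$ norm.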
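  • 53. Generalized Scattering Network (Wiatowski, 2015) $\Phi(f) = \bigcup_n \big( \underbrace{\cdots |f * g_{\lambda^{(j)}}| * g_{\lambda^{(k)}} \cdots * g_{\lambda^{(p)}}}_{n\text{-fold convolution}} * \chi_n \big)_{\lambda^{(j)}, \cdots, \lambda^{(p)}}$ Theorem: for $F_{\tau,\omega}[f](x) = e^{2\pi i \omega(x)} f(x - \tau(x))$, $|||\Phi(F_{\tau,\omega}[f]) - \Phi(f)||| \le C(\|\tau\|_\infty + \|\omega\|_\infty)\|f\|_{L_2}$ © Philip Scott Johnson
  • 54. Generalized Scattering Network (Wiatowski, 2015) Theorem: for $F_{\tau,\omega}[f](x) = e^{2\pi i \omega(x)} f(x - \tau(x))$, $|||\Phi(F_{\tau,\omega}[f]) - \Phi(f)||| \le C(\|\tau\|_\infty + \|\omega\|_\infty)\|f\|_{L_2}$. Multi-layer convolutions linearize features, i.e. the features are stable to deformations © Philip Scott Johnson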
  • 55. Generalized Scattering Network (Wiatowski, 2015) © Philip Scott Johnson
  • 56. Ergodic Reconstructions © Philip Scott Johnson © S. Mallat
  • 57. David Hilbert: Wir müssen wissen. Wir werden wissen. (We must know. We will know.)
  • 58. Q&A