Low Complexity Regularization of Inverse Problems
Course #3: Proximal Splitting Methods
Gabriel Peyré
www.numerical-tours.com
Overview of the Course

• Course #1: Inverse Problems

• Course #2: Recovery Guarantees

• Course #3: Proximal Splitting Methods
Convex Optimization

Setting: G : H → R ∪ {+∞}, where H is a Hilbert space. Here: H = R^N.

Problem:   min_{x ∈ H} G(x)

Class of functions:
   Convex:  G(t x + (1−t) y) ≤ t G(x) + (1−t) G(y)  for all t ∈ [0, 1].
   Lower semi-continuous:  liminf_{x → x0} G(x) ≥ G(x0).
   Proper:  {x ∈ H : G(x) ≠ +∞} ≠ ∅.

Indicator of a set C (C closed and convex):
   ι_C(x) = 0 if x ∈ C,  +∞ otherwise.
Example: ℓ1 Regularization

Inverse problem: measurements  y = K f0 + w,  with  K : R^N → R^P,  P ≤ N.

Model: f0 = Ψ x0, with x0 sparse in the dictionary Ψ:
   x ∈ R^Q (coefficients)  →  f = Ψ x ∈ R^N (image)  →  y = K f ∈ R^P (observations)
   Φ = K Ψ ∈ R^{P×Q},   Q ≥ N ≥ P.

Sparse recovery: f* = Ψ x* where x* solves
   min_{x ∈ R^Q}  (1/2) ||y − Φ x||^2  +  λ ||x||_1
                      fidelity            regularization
Example: ℓ1 Regularization

Inpainting: masking operator K : R^N → R^P,  P = |Ω|,
   (K f)_i = f_i if i ∈ Ω,  0 otherwise.
Ψ ∈ R^{N×Q}: translation-invariant wavelet frame.

(Figure: original f0 = Ψ x0, observations y = Φ x0 + w, recovery Ψ x*.)
Overview
• Subdifferential Calculus
• Proximal Calculus
• Forward Backward
• Douglas Rachford
• Generalized Forward-Backward
• Duality
Sub-differential

Sub-differential:  ∂G(x) = { u ∈ H : ∀ z, G(z) ≥ G(x) + ⟨u, z − x⟩ }.
Example: G(x) = |x|,  ∂G(0) = [−1, 1].

Smooth functions: if F is C^1, then ∂F(x) = { ∇F(x) }.

First-order conditions:
   x* ∈ argmin_{x ∈ H} G(x)   ⟺   0 ∈ ∂G(x*).

Monotone operator: U(x) = ∂G(x) satisfies
   ∀ (u, v) ∈ U(x) × U(y),   ⟨y − x, v − u⟩ ≥ 0.
Example: ℓ1 Regularization

   x* ∈ argmin_{x ∈ R^Q}  G(x) = (1/2) ||y − Φ x||^2 + λ ||x||_1

∂G(x) = Φ^*(Φ x − y) + λ ∂||·||_1(x),
   ∂||·||_1(x)_i = { sign(x_i) }  if x_i ≠ 0,   [−1, 1]  if x_i = 0.

Support of the solution:  I = { i ∈ {0, ..., N−1} : x*_i ≠ 0 }.

First-order conditions: there exists s ∈ R^N such that
   Φ^*(Φ x* − y) + λ s = 0,   with   s_I = sign(x*_I)  and  ||s_{I^c}||_∞ ≤ 1.
Example: Total Variation Denoising

Important: the optimization variable is f.
   f* ∈ argmin_{f ∈ R^N}  (1/2) ||y − f||^2 + λ J(f)        (λ = 0: noisy)

Finite difference gradient:  ∇ : R^N → R^{N×2},  (∇f)_i ∈ R^2.
Discrete TV norm:  J(f) = Σ_i ||(∇f)_i||.

Composition by linear maps:  ∂(J ∘ A) = A^* ∘ (∂J) ∘ A.
Here J(f) = G(∇f) with G(u) = Σ_i ||u_i||, so  ∂J(f) = −div( ∂G(∇f) ),  where
   ∂G(u)_i = { u_i / ||u_i|| }  if u_i ≠ 0,   { v ∈ R^2 : ||v|| ≤ 1 }  if u_i = 0.

First-order conditions: there exists v ∈ R^{N×2} such that  f* = y + λ div(v)  and
   ∀ i ∈ I,  v_i = (∇f*)_i / ||(∇f*)_i||,    ∀ i ∈ I^c,  ||v_i|| ≤ 1,
where I = { i : (∇f*)_i ≠ 0 }.
Overview
• Subdifferential Calculus
• Proximal Calculus
• Forward Backward
• Douglas Rachford
• Generalized Forward-Backward
• Duality
Proximal Operators

Proximal operator of G:
   Prox_{γG}(x) = argmin_z  (1/2) ||x − z||^2 + γ G(z)

Examples (a sketch of the first two follows below):
   G(x) = ||x||_1 = Σ_i |x_i|:
      Prox_{γG}(x)_i = max(0, 1 − γ/|x_i|) x_i                      (soft thresholding)
   G(x) = ||x||_0 = |{ i : x_i ≠ 0 }|:
      Prox_{γG}(x)_i = x_i if |x_i| ≥ √(2γ),  0 otherwise            (hard thresholding)
   G(x) = Σ_i log(1 + |x_i|^2):
      Prox_{γG}(x)_i is the root of a 3rd order polynomial.

(Figure: the penalties |x|, ||x||_0, log(1 + x^2) and the graphs of Prox_{γG}.)
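A minimal NumPy sketch (added here, not part of the slides) of the two closed-form proximal maps quoted above; gamma plays the role of γ in Prox_{γG}:

import numpy as np

def prox_l1(x, gamma):
    # Soft thresholding: prox of gamma * ||x||_1, applied entrywise.
    return np.sign(x) * np.maximum(np.abs(x) - gamma, 0)

def prox_l0(x, gamma):
    # Hard thresholding: prox of gamma * ||x||_0, keeps entries with |x_i| >= sqrt(2 gamma).
    return np.where(np.abs(x) >= np.sqrt(2 * gamma), x, 0)

x = np.array([-3.0, -0.5, 0.0, 0.2, 2.0])
print(prox_l1(x, 1.0))   # shrinks every entry towards 0 by 1
print(prox_l0(x, 1.0))   # keeps only the entries larger than sqrt(2) in magnitude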
Proximal Calculus

Separability:  G(x) = G_1(x_1) + ... + G_n(x_n)
   ⟹  Prox_G(x) = ( Prox_{G_1}(x_1), ..., Prox_{G_n}(x_n) ).

Quadratic functionals:  G(x) = (1/2) ||Φ x − y||^2
   ⟹  Prox_{γG}(x) = (Id + γ Φ^* Φ)^{−1} ( x + γ Φ^* y ).

Composition by a tight frame (A A^* = Id):
   Prox_{γ G∘A}(x) = x + A^*( Prox_{γG}(A x) − A x )
                   = ( A^* ∘ Prox_{γG} ∘ A + Id − A^* A )(x).

Indicators:  G(x) = ι_C(x)
   ⟹  Prox_{γG}(x) = Proj_C(x) = argmin_{z ∈ C} ||x − z||.
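A minimal sketch (an illustration I added, on assumed random data) of two of these calculus rules: the prox of a quadratic via a linear solve, and the tight-frame composition rule with an orthogonal A (so A A^* = Id):

import numpy as np

def prox_quadratic(x, gamma, Phi, y):
    # Prox of gamma/2 ||Phi z - y||^2 : solve (Id + gamma Phi^T Phi) z = x + gamma Phi^T y.
    Q = np.eye(Phi.shape[1]) + gamma * Phi.T @ Phi
    return np.linalg.solve(Q, x + gamma * Phi.T @ y)

def prox_l1(x, gamma):
    return np.sign(x) * np.maximum(np.abs(x) - gamma, 0)

def prox_compose_tight_frame(x, gamma, A, prox_G):
    # Prox_{gamma G o A}(x) = x + A^*( Prox_{gamma G}(A x) - A x ), valid when A A^* = Id.
    Ax = A @ x
    return x + A.T @ (prox_G(Ax, gamma) - Ax)

rng = np.random.default_rng(0)
Phi, y, x = rng.standard_normal((5, 8)), rng.standard_normal(5), rng.standard_normal(8)
A, _ = np.linalg.qr(rng.standard_normal((8, 8)))      # orthogonal matrix: a tight frame
print(prox_quadratic(x, 0.5, Phi, y))
print(prox_compose_tight_frame(x, 0.5, A, prox_l1))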
Prox and Subdifferential

Resolvent of ∂G:
   z = Prox_{γG}(x)  ⟺  0 ∈ (z − x) + γ ∂G(z)  ⟺  x ∈ (Id + γ ∂G)(z)
                     ⟺  z = (Id + γ ∂G)^{−1}(x).

Inverse of a set-valued mapping U:  x ∈ U^{−1}(y)  ⟺  y ∈ U(x).
Prox_{γG} = (Id + γ ∂G)^{−1} is a single-valued mapping.

Fixed point:
   x* ∈ argmin_x G(x)  ⟺  0 ∈ γ ∂G(x*)  ⟺  x* ∈ (Id + γ ∂G)(x*)
                       ⟺  x* = (Id + γ ∂G)^{−1}(x*) = Prox_{γG}(x*).
Gradient and Proximal Descents

Gradient descent:  x^(ℓ+1) = x^(ℓ) − τ ∇G(x^(ℓ))                        [explicit]
   (G is C^1 and ∇G is L-Lipschitz)
   Theorem: if 0 < τ < 2/L, then x^(ℓ) → x*, a solution.

Sub-gradient descent:  x^(ℓ+1) = x^(ℓ) − τ_ℓ v^(ℓ),   v^(ℓ) ∈ ∂G(x^(ℓ))
   Theorem: if τ_ℓ ∼ 1/ℓ, then x^(ℓ) → x*, a solution.
   Problem: slow.

Proximal-point algorithm:  x^(ℓ+1) = Prox_{γ_ℓ G}(x^(ℓ))                [implicit]
   Theorem: if γ_ℓ ≥ c > 0, then x^(ℓ) → x*, a solution.
   Problem: Prox_{γG} hard to compute.
Overview
• Subdifferential Calculus
• Proximal Calculus
• Forward Backward
• Douglas Rachford
• Generalized Forward-Backward
• Duality
Proximal Splitting Methods

Solve   min_{x ∈ H} E(x).
Problem: Prox_{γE} is not available.

Splitting:  E(x) = F(x) + Σ_i G_i(x),   F smooth,  G_i simple.

Iterative algorithms using only ∇F(x) and Prox_{γ G_i}(x):
   Forward-Backward:               solves  F + G
   Douglas-Rachford:               solves  Σ_i G_i
   Primal-Dual:                    solves  Σ_i G_i ∘ A_i
   Generalized Forward-Backward:   solves  F + Σ_i G_i
Smooth + Simple Splitting

Inverse problem:  y = K f0 + w,   K : R^N → R^P,   P ≤ N.
Model:  f0 = Ψ x0  with x0 sparse in the dictionary Ψ,   Φ = K Ψ.

Sparse recovery: f* = Ψ x* where x* solves
   min_{x ∈ R^Q}  F(x) + G(x),   F smooth,  G simple.
Data fidelity:   F(x) = (1/2) ||y − Φ x||^2.
Regularization:  G(x) = λ ||x||_1 = λ Σ_i |x_i|.
Forward-Backward

Fixed point equation:
   x* ∈ argmin_x F(x) + G(x)      (*)
   ⟺  0 ∈ ∇F(x*) + ∂G(x*)
   ⟺  ( x* − γ ∇F(x*) ) ∈ x* + γ ∂G(x*)
   ⟺  x* = Prox_{γG}( x* − γ ∇F(x*) ).

Forward-backward iterations:
   x^(ℓ+1) = Prox_{γG}( x^(ℓ) − γ ∇F(x^(ℓ)) ).

Projected gradient descent: the special case G = ι_C.

Theorem: let ∇F be L-Lipschitz. If γ < 2/L, then x^(ℓ) → x*, a solution of (*).
Example: ℓ1 Regularization

   min_x  (1/2) ||Φ x − y||^2 + λ ||x||_1   ⟺   min_x F(x) + G(x)

F(x) = (1/2) ||Φ x − y||^2,   ∇F(x) = Φ^*(Φ x − y),   L = ||Φ^* Φ||.
G(x) = λ ||x||_1,   Prox_{γG}(x)_i = max(0, 1 − γλ/|x_i|) x_i.

Forward-backward  ⟺  iterative soft thresholding (a sketch follows below).
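A minimal sketch (toy data I generated, not the slides' experiment) of this iterative soft thresholding, i.e. forward-backward applied to the ℓ1-regularized least squares above:

import numpy as np

def ista(Phi, y, lam, n_iter=500):
    L = np.linalg.norm(Phi, 2) ** 2               # Lipschitz constant of grad F
    gamma = 1.0 / L                               # any step gamma < 2/L converges
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)              # forward (explicit) gradient step
        x = x - gamma * grad
        x = np.sign(x) * np.maximum(np.abs(x) - gamma * lam, 0)   # backward (prox) step
    return x

rng = np.random.default_rng(0)
Phi = rng.standard_normal((50, 200))
x0 = np.zeros(200); x0[rng.choice(200, 10, replace=False)] = rng.standard_normal(10)
y = Phi @ x0 + 0.01 * rng.standard_normal(50)
print(np.count_nonzero(ista(Phi, y, lam=0.1)))    # a sparse estimate of x0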
Convergence Speed

   min_x E(x) = F(x) + G(x),   ∇F L-Lipschitz,  G simple.

Theorem: if L > 0, the FB iterates x^(ℓ) satisfy
   E(x^(ℓ)) − E(x*) ≤ C/ℓ → 0.
The constant C degrades with L.
Multi-step Accelerations

Beck-Teboulle accelerated FB, with t^(0) = 1:
   x^(ℓ+1) = Prox_{G/L}( y^(ℓ) − (1/L) ∇F(y^(ℓ)) )
   t^(ℓ+1) = ( 1 + √(1 + 4 (t^(ℓ))^2) ) / 2
   y^(ℓ+1) = x^(ℓ+1) + ((t^(ℓ) − 1)/t^(ℓ+1)) ( x^(ℓ+1) − x^(ℓ) )
(see also Nesterov's method)

Theorem: if L > 0,   E(x^(ℓ)) − E(x*) ≤ C/ℓ^2.
Complexity theory: optimal in a worst-case sense.
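A minimal sketch (same assumed ℓ1 setup as the forward-backward sketch above) of this accelerated scheme; the extrapolated point is called z here to avoid clashing with the data y:

import numpy as np

def fista(Phi, y, lam, n_iter=500):
    L = np.linalg.norm(Phi, 2) ** 2
    x = np.zeros(Phi.shape[1]); z = x.copy(); t = 1.0
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ z - y)
        x_new = z - grad / L
        x_new = np.sign(x_new) * np.maximum(np.abs(x_new) - lam / L, 0)   # Prox_{G/L}
        t_new = (1 + np.sqrt(1 + 4 * t ** 2)) / 2
        z = x_new + ((t - 1) / t_new) * (x_new - x)                       # extrapolation
        x, t = x_new, t_new
    return x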
Overview
• Subdifferential Calculus
• Proximal Calculus
• Forward Backward
• Douglas Rachford
• Generalized Forward-Backward
• Duality
Douglas-Rachford Scheme

   min_x G_1(x) + G_2(x),   G_1 and G_2 simple.      (*)

Douglas-Rachford iterations:
   z^(ℓ+1) = (1 − α/2) z^(ℓ) + (α/2) RProx_{γG_2}( RProx_{γG_1}( z^(ℓ) ) )
   x^(ℓ+1) = Prox_{γG_1}( z^(ℓ+1) )

Reflexive prox:  RProx_{γG}(x) = 2 Prox_{γG}(x) − x.

Theorem: if 0 < α < 2 and γ > 0, then x^(ℓ) → x*, a solution of (*).
DR Fixed Point Equation

   min_x G_1(x) + G_2(x)   ⟺   0 ∈ ∂(G_1 + G_2)(x)
   ⟺  ∃ z,  z − x ∈ γ ∂G_1(x)  and  x − z ∈ γ ∂G_2(x)
   ⟺  x = Prox_{γG_1}(z)  and  (2x − z) − x ∈ γ ∂G_2(x)
   ⟺  x = Prox_{γG_1}(z)  and  x = Prox_{γG_2}(2x − z) = Prox_{γG_2}( RProx_{γG_1}(z) )
   ⟺  z = 2 Prox_{γG_2}( RProx_{γG_1}(z) ) − RProx_{γG_1}(z) = RProx_{γG_2}( RProx_{γG_1}(z) )
   ⟺  z = (1 − α/2) z + (α/2) RProx_{γG_2}( RProx_{γG_1}(z) ).
Example: Constrained ℓ1

   min_{Φx = y} ||x||_1   ⟺   min_x G_1(x) + G_2(x)

G_1(x) = ι_C(x),  C = { x : Φ x = y }:
   Prox_{γG_1}(x) = Proj_C(x) = x + Φ^* (Φ Φ^*)^{−1} (y − Φ x)
   (efficient if Φ Φ^* is easy to invert).
G_2(x) = ||x||_1:
   Prox_{γG_2}(x)_i = max(0, 1 − γ/|x_i|) x_i.

Example: compressed sensing,  Φ ∈ R^{100×400} Gaussian matrix,  y = Φ x0,  ||x0||_0 = 17.
(Plot: log10( ||x^(ℓ)||_1 − ||x*||_1 ) versus iteration, for γ = 0.01, 1, 10.)
A sketch of these iterations follows below.
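A minimal sketch (with a toy Gaussian instance in the same regime as the slide's example) of Douglas-Rachford applied to this constrained ℓ1 problem:

import numpy as np

def dr_constrained_l1(Phi, y, gamma=1.0, alpha=1.0, n_iter=500):
    pinv = Phi.T @ np.linalg.inv(Phi @ Phi.T)               # Phi^* (Phi Phi^*)^{-1}
    proj_C = lambda x: x + pinv @ (y - Phi @ x)             # Prox of G1 = Proj_C
    prox_l1 = lambda x: np.sign(x) * np.maximum(np.abs(x) - gamma, 0)   # Prox of gamma G2
    rprox = lambda prox, x: 2 * prox(x) - x
    z = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = (1 - alpha / 2) * z + (alpha / 2) * rprox(prox_l1, rprox(proj_C, z))
    return proj_C(z)                                        # x = Prox_{gamma G1}(z)

rng = np.random.default_rng(1)
Phi = rng.standard_normal((100, 400))                       # Gaussian measurements
x0 = np.zeros(400); x0[rng.choice(400, 17, replace=False)] = rng.standard_normal(17)
x = dr_constrained_l1(Phi, Phi @ x0)
print(np.linalg.norm(x - x0))                               # should be small: x0 is recovered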
More than 2 Functionals

   min_x  G_1(x) + ... + G_k(x),   each G_i simple
   ⟺  min_{(x_1, ..., x_k)}  G(x_1, ..., x_k) + ι_C(x_1, ..., x_k)
   where  G(x_1, ..., x_k) = G_1(x_1) + ... + G_k(x_k),
          C = { (x_1, ..., x_k) ∈ H^k : x_1 = ... = x_k }.

G and ι_C are simple:
   Prox_{γG}(x_1, ..., x_k) = ( Prox_{γG_i}(x_i) )_i,
   Prox_{γ ι_C}(x_1, ..., x_k) = (x̃, ..., x̃),   where  x̃ = (1/k) Σ_i x_i.
Auxiliary Variables: DR

   min_x  G_1(x) + G_2 ∘ A(x),   A : H → E linear,  G_1, G_2 simple
   ⟺  min_{z ∈ H × E}  G(z) + ι_C(z),
   where  G(x, y) = G_1(x) + G_2(y),   C = { (x, y) ∈ H × E : A x = y }.

Prox_{γG}(x, y) = ( Prox_{γG_1}(x), Prox_{γG_2}(y) ).
Prox_{γ ι_C}(x, y) = Proj_C(x, y) = (x̃, A x̃) = (x + A^* ỹ, y − ỹ),
   where  x̃ = (Id + A^* A)^{−1}(A^* y + x),   ỹ = (Id + A A^*)^{−1}(y − A x).
Efficient if Id + A A^* or Id + A^* A is easy to invert.
Example: TV Regularization

   min_f  (1/2) ||K f − y||^2 + λ ||∇f||_1,    ||u||_1 = Σ_i ||u_i||
   ⟺  min_{(f, u)}  G_2(f) + G_1(u) + ι_C(f, u)    (DR with the auxiliary variable u = ∇f)

G_1(u) = λ ||u||_1:            Prox_{γG_1}(u)_i = max(0, 1 − γλ/||u_i||) u_i.
G_2(f) = (1/2) ||K f − y||^2:  Prox_{γG_2}(f) = (Id + γ K^* K)^{−1} ( f + γ K^* y ).
C = { (f, u) ∈ R^N × R^{N×2} : u = ∇f }:   Proj_C(f, u) = (f̃, ∇f̃),
   where f̃ solves  (Id + ∇^* ∇) f̃ = f − div(u),
   computed in O(N log N) operations using the FFT (a sketch follows below).

(Figure: original f0, observations y = K f0 + w, recovery f*, energy decay with iterations.)
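A minimal sketch (assuming periodic boundary conditions, which the slides leave implicit) of the projection Proj_C onto C = {(f, u) : u = ∇f}: the linear system (Id + ∇^*∇) f̃ = f − div(u) is diagonalized by the 2-D FFT:

import numpy as np

def grad(f):
    # Forward finite differences with periodic boundaries, (n,n) -> (n,n,2).
    return np.stack([np.roll(f, -1, axis=0) - f, np.roll(f, -1, axis=1) - f], axis=-1)

def div(u):
    # Defined so that div = -grad^* (hence grad^* u = -div(u)).
    return (u[..., 0] - np.roll(u[..., 0], 1, axis=0)) + (u[..., 1] - np.roll(u[..., 1], 1, axis=1))

def proj_C(f, u):
    n = f.shape[0]
    s = 4 * np.sin(np.pi * np.arange(n) / n) ** 2
    lap = s[:, None] + s[None, :]                  # eigenvalues of grad^* grad under the DFT
    rhs = f - div(u)                               # f + grad^* u
    f_tilde = np.real(np.fft.ifft2(np.fft.fft2(rhs) / (1 + lap)))
    return f_tilde, grad(f_tilde)

f = np.random.default_rng(0).standard_normal((64, 64))
f_tilde, u_tilde = proj_C(f, grad(f))
print(np.allclose(f_tilde, f))                     # a point already in C is left unchanged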
Overview
• Subdifferential Calculus
• Proximal Calculus
• Forward Backward
• Douglas Rachford
• Generalized Forward-Backward
• Duality
GFB Splitting

   min_{x ∈ R^N}  F(x) + Σ_{i=1}^n G_i(x),   F smooth,  G_i simple.      (*)

Iterations: for i = 1, ..., n,
   z_i^(ℓ+1) = z_i^(ℓ) + Prox_{nγ G_i}( 2 x^(ℓ) − z_i^(ℓ) − γ ∇F(x^(ℓ)) ) − x^(ℓ)
   x^(ℓ+1) = (1/n) Σ_{i=1}^n z_i^(ℓ+1)

Theorem: let ∇F be L-Lipschitz. If γ < 2/L, then x^(ℓ) → x*, a solution of (*).

Special cases:   n = 1  ⟹  forward-backward;    F = 0  ⟹  Douglas-Rachford.
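A minimal generic sketch of these iterations (the interfaces are my own assumptions): grad_F returns ∇F(x), and each entry of prox_list computes Prox_{γ' G_i}(x) for a given step γ':

import numpy as np

def gfb(grad_F, prox_list, x0, gamma, n_iter=200):
    n = len(prox_list)
    x = x0.copy()
    z = [x0.copy() for _ in range(n)]
    for _ in range(n_iter):
        g = grad_F(x)
        for i in range(n):
            # z_i <- z_i + Prox_{n gamma G_i}( 2x - z_i - gamma grad F(x) ) - x
            z[i] = z[i] + prox_list[i](2 * x - z[i] - gamma * g, n * gamma) - x
        x = sum(z) / n
    return x

# With n = 1 this reduces to forward-backward, e.g. for the l1 problem above:
# gfb(lambda x: Phi.T @ (Phi @ x - y),
#     [lambda x, g: np.sign(x) * np.maximum(np.abs(x) - g * lam, 0)],
#     np.zeros(Phi.shape[1]), gamma=1.0 / np.linalg.norm(Phi, 2) ** 2)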
GFB Fixed Point

   x* ∈ argmin_x  F(x) + Σ_i G_i(x)
   ⟺  0 ∈ ∇F(x*) + Σ_i ∂G_i(x*)
   ⟺  ∃ (y_i)_i,  y_i ∈ ∂G_i(x*)  and  ∇F(x*) + Σ_i y_i = 0
   ⟺  ∃ (z_i)_{i=1}^n,  ∀ i,  x* − z_i − γ ∇F(x*) ∈ nγ ∂G_i(x*)  and  x* = (1/n) Σ_i z_i
       (take z_i = x* − γ ∇F(x*) − nγ y_i)
   ⟺  ∀ i,  x* = Prox_{nγ G_i}( 2x* − z_i − γ ∇F(x*) )  and  x* = (1/n) Σ_i z_i
   ⟺  ∀ i,  z_i = z_i + Prox_{nγ G_i}( 2x* − z_i − γ ∇F(x*) ) − x*  and  x* = (1/n) Σ_i z_i.

Fixed point equation on (x*, z_1, ..., z_n).
Block Regularization

ℓ1−ℓ2 block sparsity:   G(x) = Σ_{b ∈ B} ||x_[b]||,    ||x_[b]||^2 = Σ_{m ∈ b} x_m^2.

Non-overlapping decomposition:  B = B_1 ∪ ... ∪ B_n,
   G(x) = Σ_{i=1}^n G_i(x),    G_i(x) = Σ_{b ∈ B_i} ||x_[b]||.

Each G_i is simple (a sketch follows below):
   ∀ b ∈ B_i,  ∀ m ∈ b,    Prox_{γ G_i}(x)_m = max(0, 1 − γ/||x_[b]||) x_m.

(Illustration: image f = Ψ x, coefficients x, overlapping blocks split into
 non-overlapping families B_1 and B_2.)
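A minimal sketch (the block structure is an assumed toy partition) of this block soft-thresholding prox for one non-overlapping family B_i:

import numpy as np

def prox_block_l1l2(x, gamma, blocks):
    # blocks: list of index arrays forming one non-overlapping family B_i.
    out = x.copy()
    for b in blocks:
        nrm = np.linalg.norm(x[b])
        out[b] = max(0.0, 1.0 - gamma / nrm) * x[b] if nrm > 0 else 0.0
    return out

x = np.arange(8, dtype=float)
print(prox_block_l1l2(x, gamma=2.0, blocks=[np.arange(0, 4), np.arange(4, 8)]))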
Numerical Experiments

Deconvolution:                min_x  (1/2) ||y − K Ψ x||^2 + λ_2 Σ_{k=1}^{4}  ||x||_{B_k}
   (ℓ1−ℓ2 block norms over families B_k; Ψ: translation-invariant wavelets; K: convolution
    of width 2; noise 0.025; N = 256^2; λ_2 = 1.30e−03;
    timings for 50 iterations: EFB 161s, PR 173s, CP 190s; SNR 22.49 dB.)

Deconvolution + inpainting:   min_x  (1/2) ||y − P K Ψ x||^2 + λ_4 Σ_{k=1}^{16} ||x||_{B_k}
   (P: masking with degradation 0.4; same noise and convolution; λ_4 = 1.00e−03;
    timings: EFB 283s, PR 298s, CP 368s; SNR 21.80 dB.)

(Plots: log10( E(x^(ℓ)) − E_min ) versus iteration for EFB, PR and CP on both problems.)
Overview
• Subdifferential Calculus
• Proximal Calculus
• Forward Backward
• Douglas Rachford
• Generalized Forward-Backward
• Duality
Legendre-Fenchel Duality

Legendre-Fenchel transform:
   G^*(u) = sup_{x ∈ dom(G)}  ⟨u, x⟩ − G(x).
(Figure: geometric interpretation of G(x) and G^*(u) via the supporting line of slope u.)

Example: quadratic functional
   G(x) = (1/2) ⟨A x, x⟩ + ⟨x, b⟩    ⟹    G^*(u) = (1/2) ⟨u − b, A^{−1}(u − b)⟩.

Moreau's identity:
   Prox_{γ G^*}(x) = x − γ Prox_{G/γ}(x/γ)
   (G simple  ⟺  G^* simple).
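A minimal numerical check I added: via Moreau's identity, the soft-thresholding prox of G = ||·||_1 yields the prox of its conjugate G^* = ι_{||·||_∞ ≤ 1}, i.e. entrywise clipping to [−1, 1]:

import numpy as np

def prox_l1(x, gamma):
    return np.sign(x) * np.maximum(np.abs(x) - gamma, 0)

def prox_conjugate(x, gamma, prox_G):
    # Moreau: Prox_{gamma G^*}(x) = x - gamma * Prox_{G/gamma}(x / gamma)
    return x - gamma * prox_G(x / gamma, 1.0 / gamma)

x = np.array([-2.5, -0.3, 0.7, 4.0])
print(prox_conjugate(x, 0.5, prox_l1))      # equals np.clip(x, -1, 1), for any gamma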
Indicator and Homogeneous

Positively 1-homogeneous functional:  G(λ x) = |λ| G(x).   Example: a norm G(x) = ||x||.

Duality:
   G^*(x) = ι_{ G^∘(·) ≤ 1 }(x),    where   G^∘(y) = max_{G(x) ≤ 1} ⟨x, y⟩.

ℓ^p norms:  G(x) = ||x||_p  ⟹  G^∘(x) = ||x||_q,   1/p + 1/q = 1,   p, q ∈ [1, +∞].

Example: proximal operator of the ℓ^∞ norm,
   Prox_{γ ||·||_∞} = Id − Proj_{||·||_1 ≤ γ},
   Proj_{||·||_1 ≤ γ}(x)_i = max(0, 1 − λ*/|x_i|) x_i   for a well-chosen λ* = λ*(x, γ).
Primal-dual Formulation

Fenchel-Rockafellar duality (A : H → L linear):
   min_{x ∈ H}  G_1(x) + G_2(A x)  =  min_x  G_1(x) + sup_{u ∈ L}  ⟨A x, u⟩ − G_2^*(u).

Strong duality (min ↔ max) holds if  0 ∈ ri(dom(G_2)) − A ri(dom(G_1)):
   = max_u  − G_2^*(u) + min_x  G_1(x) + ⟨x, A^* u⟩
   = max_u  − G_2^*(u) − G_1^*(− A^* u).

Recovering x* from a dual solution u*:
   x* = argmin_x  G_1(x) + ⟨x, A^* u*⟩
   ⟺  − A^* u* ∈ ∂G_1(x*)
   ⟺  x* ∈ (∂G_1)^{−1}(− A^* u*) = ∂G_1^*(− A^* u*).
Forward-Backward on the Dual

If G_1 is strongly convex (e.g. ∇^2 G_1 ≥ c Id):
   G_1(t x + (1−t) y) ≤ t G_1(x) + (1−t) G_1(y) − (c/2) t (1−t) ||x − y||^2
   ⟹  x* is uniquely defined,  G_1^* is of class C^1,  and  x* = ∇G_1^*(− A^* u*).

FB on the dual: the dual of  min_{x ∈ H} G_1(x) + G_2(A x)  is
   min_{u ∈ L}  G_1^*(− A^* u) + G_2^*(u),       (smooth + simple)
solved by
   u^(ℓ+1) = Prox_{τ G_2^*}( u^(ℓ) + τ A ∇G_1^*(− A^* u^(ℓ)) ).
Example: TV Denoising

   min_{f ∈ R^N}  (1/2) ||f − y||^2 + λ ||∇f||_1,    ||u||_1 = Σ_i ||u_i||

Dual solution:   u* ∈ argmin_{||u||_∞ ≤ λ}  ||y + div(u)||^2,    ||u||_∞ = max_i ||u_i||.
Primal solution:  f* = y + div(u*).                      [Chambolle 2004]

FB on the dual (aka projected gradient descent; a sketch follows below):
   u^(ℓ+1) = Proj_{||·||_∞ ≤ λ}( u^(ℓ) + τ ∇( y + div(u^(ℓ)) ) ),
   with   Proj_{||·||_∞ ≤ λ}(u)_i = u_i / max(||u_i||/λ, 1).
Convergence if  τ < 2/||div ∘ ∇|| = 1/4.
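A minimal sketch (periodic finite differences and toy data are my assumptions) of this projected gradient descent on the dual:

import numpy as np

def grad(f):
    return np.stack([np.roll(f, -1, axis=0) - f, np.roll(f, -1, axis=1) - f], axis=-1)

def div(u):
    return (u[..., 0] - np.roll(u[..., 0], 1, axis=0)) + (u[..., 1] - np.roll(u[..., 1], 1, axis=1))

def tv_denoise(y, lam, tau=0.24, n_iter=300):
    u = np.zeros(y.shape + (2,))
    for _ in range(n_iter):
        u = u + tau * grad(y + div(u))                                   # gradient step
        nrm = np.maximum(np.linalg.norm(u, axis=-1, keepdims=True) / lam, 1.0)
        u = u / nrm                                                      # projection on {||u_i|| <= lam}
    return y + div(u)                                                    # primal solution

rng = np.random.default_rng(0)
f0 = np.zeros((64, 64)); f0[16:48, 16:48] = 1.0                          # piecewise-constant image
y = f0 + 0.1 * rng.standard_normal(f0.shape)
f = tv_denoise(y, lam=0.2)
print(np.linalg.norm(f - f0) < np.linalg.norm(y - f0))                   # denoising should reduce the error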
Primal-Dual Algorithm

   min_{x ∈ H}  G_1(x) + G_2(A x)   ⟺   min_x max_z  G_1(x) − G_2^*(z) + ⟨A x, z⟩.

Iterations:
   z^(ℓ+1) = Prox_{σ G_2^*}( z^(ℓ) + σ A x̃^(ℓ) )
   x^(ℓ+1) = Prox_{τ G_1}( x^(ℓ) − τ A^* z^(ℓ+1) )
   x̃^(ℓ+1) = x^(ℓ+1) + θ ( x^(ℓ+1) − x^(ℓ) )

θ = 0: Arrow-Hurwicz algorithm.
θ = 1: convergence speed on the duality gap.

Theorem [Chambolle-Pock 2011]: if 0 ≤ θ ≤ 1 and σ τ ||A||^2 < 1, then
   x^(ℓ) → x*, a minimizer of G_1 + G_2 ∘ A.
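A minimal generic sketch (the interfaces are assumptions) of these primal-dual iterations; prox_tau_G1 and prox_sigma_G2s are callables for Prox_{τG_1} and Prox_{σG_2^*}, and A, At implement the linear map and its adjoint:

import numpy as np

def primal_dual(prox_tau_G1, prox_sigma_G2s, A, At, x0, z0,
                tau, sigma, theta=1.0, n_iter=300):
    x, z, x_bar = x0.copy(), z0.copy(), x0.copy()
    for _ in range(n_iter):
        z = prox_sigma_G2s(z + sigma * A(x_bar))      # dual step
        x_new = prox_tau_G1(x - tau * At(z))          # primal step
        x_bar = x_new + theta * (x_new - x)           # extrapolation
        x = x_new
    return x

# For the TV denoising example above one could take A = grad, At = -div,
# prox_tau_G1 the prox of the quadratic fidelity, and prox_sigma_G2s the
# projection on {||z_i|| <= lam}, with sigma * tau * ||A||^2 < 1.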
Conclusion

Inverse problems in imaging:
   Large scale, N ∼ 10^6.
   Non-smooth (sparsity, TV, ...).
   (Sometimes) convex.
   Highly structured (separability, ℓ^p norms, ...).

Proximal splitting:
   Unravels the structure of problems.
   Parallelizable.
   Decomposition G = Σ_k G_k.

Open problems:
   Less structured problems without smoothness.
   Non-convex optimization.
More Related Content

What's hot

Geodesic Method in Computer Vision and Graphics
Geodesic Method in Computer Vision and GraphicsGeodesic Method in Computer Vision and Graphics
Geodesic Method in Computer Vision and GraphicsGabriel Peyré
 
Learning Sparse Representation
Learning Sparse RepresentationLearning Sparse Representation
Learning Sparse RepresentationGabriel Peyré
 
Mesh Processing Course : Multiresolution
Mesh Processing Course : MultiresolutionMesh Processing Course : Multiresolution
Mesh Processing Course : MultiresolutionGabriel Peyré
 
Signal Processing Course : Sparse Regularization of Inverse Problems
Signal Processing Course : Sparse Regularization of Inverse ProblemsSignal Processing Course : Sparse Regularization of Inverse Problems
Signal Processing Course : Sparse Regularization of Inverse ProblemsGabriel Peyré
 
Open GL 04 linealgos
Open GL 04 linealgosOpen GL 04 linealgos
Open GL 04 linealgosRoziq Bahtiar
 
A series of maximum entropy upper bounds of the differential entropy
A series of maximum entropy upper bounds of the differential entropyA series of maximum entropy upper bounds of the differential entropy
A series of maximum entropy upper bounds of the differential entropyFrank Nielsen
 
Bregman divergences from comparative convexity
Bregman divergences from comparative convexityBregman divergences from comparative convexity
Bregman divergences from comparative convexityFrank Nielsen
 
Classification with mixtures of curved Mahalanobis metrics
Classification with mixtures of curved Mahalanobis metricsClassification with mixtures of curved Mahalanobis metrics
Classification with mixtures of curved Mahalanobis metricsFrank Nielsen
 
Lecture 2: linear SVM in the dual
Lecture 2: linear SVM in the dualLecture 2: linear SVM in the dual
Lecture 2: linear SVM in the dualStéphane Canu
 
Adaptive Signal and Image Processing
Adaptive Signal and Image ProcessingAdaptive Signal and Image Processing
Adaptive Signal and Image ProcessingGabriel Peyré
 
Mesh Processing Course : Mesh Parameterization
Mesh Processing Course : Mesh ParameterizationMesh Processing Course : Mesh Parameterization
Mesh Processing Course : Mesh ParameterizationGabriel Peyré
 
The dual geometry of Shannon information
The dual geometry of Shannon informationThe dual geometry of Shannon information
The dual geometry of Shannon informationFrank Nielsen
 
Andreas Eberle
Andreas EberleAndreas Eberle
Andreas EberleBigMC
 
Image Processing 3
Image Processing 3Image Processing 3
Image Processing 3jainatin
 
Levitan Centenary Conference Talk, June 27 2014
Levitan Centenary Conference Talk, June 27 2014Levitan Centenary Conference Talk, June 27 2014
Levitan Centenary Conference Talk, June 27 2014Nikita V. Artamonov
 
Mesh Processing Course : Geodesics
Mesh Processing Course : GeodesicsMesh Processing Course : Geodesics
Mesh Processing Course : GeodesicsGabriel Peyré
 
Lecture3 linear svm_with_slack
Lecture3 linear svm_with_slackLecture3 linear svm_with_slack
Lecture3 linear svm_with_slackStéphane Canu
 
Lecture 1: linear SVM in the primal
Lecture 1: linear SVM in the primalLecture 1: linear SVM in the primal
Lecture 1: linear SVM in the primalStéphane Canu
 

What's hot (20)

Geodesic Method in Computer Vision and Graphics
Geodesic Method in Computer Vision and GraphicsGeodesic Method in Computer Vision and Graphics
Geodesic Method in Computer Vision and Graphics
 
Learning Sparse Representation
Learning Sparse RepresentationLearning Sparse Representation
Learning Sparse Representation
 
Mesh Processing Course : Multiresolution
Mesh Processing Course : MultiresolutionMesh Processing Course : Multiresolution
Mesh Processing Course : Multiresolution
 
Signal Processing Course : Sparse Regularization of Inverse Problems
Signal Processing Course : Sparse Regularization of Inverse ProblemsSignal Processing Course : Sparse Regularization of Inverse Problems
Signal Processing Course : Sparse Regularization of Inverse Problems
 
Open GL 04 linealgos
Open GL 04 linealgosOpen GL 04 linealgos
Open GL 04 linealgos
 
A series of maximum entropy upper bounds of the differential entropy
A series of maximum entropy upper bounds of the differential entropyA series of maximum entropy upper bounds of the differential entropy
A series of maximum entropy upper bounds of the differential entropy
 
Bregman divergences from comparative convexity
Bregman divergences from comparative convexityBregman divergences from comparative convexity
Bregman divergences from comparative convexity
 
Classification with mixtures of curved Mahalanobis metrics
Classification with mixtures of curved Mahalanobis metricsClassification with mixtures of curved Mahalanobis metrics
Classification with mixtures of curved Mahalanobis metrics
 
Lecture 2: linear SVM in the dual
Lecture 2: linear SVM in the dualLecture 2: linear SVM in the dual
Lecture 2: linear SVM in the dual
 
Adaptive Signal and Image Processing
Adaptive Signal and Image ProcessingAdaptive Signal and Image Processing
Adaptive Signal and Image Processing
 
Mesh Processing Course : Mesh Parameterization
Mesh Processing Course : Mesh ParameterizationMesh Processing Course : Mesh Parameterization
Mesh Processing Course : Mesh Parameterization
 
The dual geometry of Shannon information
The dual geometry of Shannon informationThe dual geometry of Shannon information
The dual geometry of Shannon information
 
Andreas Eberle
Andreas EberleAndreas Eberle
Andreas Eberle
 
Lecture5 kernel svm
Lecture5 kernel svmLecture5 kernel svm
Lecture5 kernel svm
 
Image Processing 3
Image Processing 3Image Processing 3
Image Processing 3
 
2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...
2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...
2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...
 
Levitan Centenary Conference Talk, June 27 2014
Levitan Centenary Conference Talk, June 27 2014Levitan Centenary Conference Talk, June 27 2014
Levitan Centenary Conference Talk, June 27 2014
 
Mesh Processing Course : Geodesics
Mesh Processing Course : GeodesicsMesh Processing Course : Geodesics
Mesh Processing Course : Geodesics
 
Lecture3 linear svm_with_slack
Lecture3 linear svm_with_slackLecture3 linear svm_with_slack
Lecture3 linear svm_with_slack
 
Lecture 1: linear SVM in the primal
Lecture 1: linear SVM in the primalLecture 1: linear SVM in the primal
Lecture 1: linear SVM in the primal
 

Similar to Low Complexity Regularization of Inverse Problems - Course #3 Proximal Splitting Methods

functions limits and continuity
functions limits and continuityfunctions limits and continuity
functions limits and continuityPume Ananda
 
Hyperfunction method for numerical integration and Fredholm integral equation...
Hyperfunction method for numerical integration and Fredholm integral equation...Hyperfunction method for numerical integration and Fredholm integral equation...
Hyperfunction method for numerical integration and Fredholm integral equation...HidenoriOgata
 
2. Fixed Point Iteration.pptx
2. Fixed Point Iteration.pptx2. Fixed Point Iteration.pptx
2. Fixed Point Iteration.pptxsaadhaq6
 
03 convexfunctions
03 convexfunctions03 convexfunctions
03 convexfunctionsSufyan Sahoo
 
Chapter 1 (math 1)
Chapter 1 (math 1)Chapter 1 (math 1)
Chapter 1 (math 1)Amr Mohamed
 
Operation on functions
Operation on functionsOperation on functions
Operation on functionsJeralyn Obsina
 
Quantitative norm convergence of some ergodic averages
Quantitative norm convergence of some ergodic averagesQuantitative norm convergence of some ergodic averages
Quantitative norm convergence of some ergodic averagesVjekoslavKovac1
 
Function evaluation, termination, vertical line test etc
Function evaluation, termination, vertical line test etcFunction evaluation, termination, vertical line test etc
Function evaluation, termination, vertical line test etcsurprisesibusiso07
 
Modeling the Dynamics of SGD by Stochastic Differential Equation
Modeling the Dynamics of SGD by Stochastic Differential EquationModeling the Dynamics of SGD by Stochastic Differential Equation
Modeling the Dynamics of SGD by Stochastic Differential EquationMark Chang
 
Absolute value functions
Absolute value functionsAbsolute value functions
Absolute value functionsAlexander Nwatu
 

Similar to Low Complexity Regularization of Inverse Problems - Course #3 Proximal Splitting Methods (20)

Functions limits and continuity
Functions limits and continuityFunctions limits and continuity
Functions limits and continuity
 
functions limits and continuity
functions limits and continuityfunctions limits and continuity
functions limits and continuity
 
Hyperfunction method for numerical integration and Fredholm integral equation...
Hyperfunction method for numerical integration and Fredholm integral equation...Hyperfunction method for numerical integration and Fredholm integral equation...
Hyperfunction method for numerical integration and Fredholm integral equation...
 
Ece3075 a 8
Ece3075 a 8Ece3075 a 8
Ece3075 a 8
 
2.1 Calculus 2.formulas.pdf.pdf
2.1 Calculus 2.formulas.pdf.pdf2.1 Calculus 2.formulas.pdf.pdf
2.1 Calculus 2.formulas.pdf.pdf
 
2. Fixed Point Iteration.pptx
2. Fixed Point Iteration.pptx2. Fixed Point Iteration.pptx
2. Fixed Point Iteration.pptx
 
03 convexfunctions
03 convexfunctions03 convexfunctions
03 convexfunctions
 
Chapter 1 (math 1)
Chapter 1 (math 1)Chapter 1 (math 1)
Chapter 1 (math 1)
 
Operation on functions
Operation on functionsOperation on functions
Operation on functions
 
0210 ch 2 day 10
0210 ch 2 day 100210 ch 2 day 10
0210 ch 2 day 10
 
Quantitative norm convergence of some ergodic averages
Quantitative norm convergence of some ergodic averagesQuantitative norm convergence of some ergodic averages
Quantitative norm convergence of some ergodic averages
 
Function evaluation, termination, vertical line test etc
Function evaluation, termination, vertical line test etcFunction evaluation, termination, vertical line test etc
Function evaluation, termination, vertical line test etc
 
Modeling the Dynamics of SGD by Stochastic Differential Equation
Modeling the Dynamics of SGD by Stochastic Differential EquationModeling the Dynamics of SGD by Stochastic Differential Equation
Modeling the Dynamics of SGD by Stochastic Differential Equation
 
Tabela derivada
Tabela derivadaTabela derivada
Tabela derivada
 
Absolute value functions
Absolute value functionsAbsolute value functions
Absolute value functions
 
stoch41.pdf
stoch41.pdfstoch41.pdf
stoch41.pdf
 
Number theory lecture (part 2)
Number theory lecture (part 2)Number theory lecture (part 2)
Number theory lecture (part 2)
 
QMC: Operator Splitting Workshop, Boundedness of the Sequence if Iterates Gen...
QMC: Operator Splitting Workshop, Boundedness of the Sequence if Iterates Gen...QMC: Operator Splitting Workshop, Boundedness of the Sequence if Iterates Gen...
QMC: Operator Splitting Workshop, Boundedness of the Sequence if Iterates Gen...
 
maths basics
maths basicsmaths basics
maths basics
 
Dif int
Dif intDif int
Dif int
 

More from Gabriel Peyré

Mesh Processing Course : Introduction
Mesh Processing Course : IntroductionMesh Processing Course : Introduction
Mesh Processing Course : IntroductionGabriel Peyré
 
Mesh Processing Course : Geodesic Sampling
Mesh Processing Course : Geodesic SamplingMesh Processing Course : Geodesic Sampling
Mesh Processing Course : Geodesic SamplingGabriel Peyré
 
Mesh Processing Course : Differential Calculus
Mesh Processing Course : Differential CalculusMesh Processing Course : Differential Calculus
Mesh Processing Course : Differential CalculusGabriel Peyré
 
Signal Processing Course : Theory for Sparse Recovery
Signal Processing Course : Theory for Sparse RecoverySignal Processing Course : Theory for Sparse Recovery
Signal Processing Course : Theory for Sparse RecoveryGabriel Peyré
 
Signal Processing Course : Presentation of the Course
Signal Processing Course : Presentation of the CourseSignal Processing Course : Presentation of the Course
Signal Processing Course : Presentation of the CourseGabriel Peyré
 
Signal Processing Course : Orthogonal Bases
Signal Processing Course : Orthogonal BasesSignal Processing Course : Orthogonal Bases
Signal Processing Course : Orthogonal BasesGabriel Peyré
 
Signal Processing Course : Fourier
Signal Processing Course : FourierSignal Processing Course : Fourier
Signal Processing Course : FourierGabriel Peyré
 
Signal Processing Course : Denoising
Signal Processing Course : DenoisingSignal Processing Course : Denoising
Signal Processing Course : DenoisingGabriel Peyré
 
Signal Processing Course : Compressed Sensing
Signal Processing Course : Compressed SensingSignal Processing Course : Compressed Sensing
Signal Processing Course : Compressed SensingGabriel Peyré
 
Signal Processing Course : Approximation
Signal Processing Course : ApproximationSignal Processing Course : Approximation
Signal Processing Course : ApproximationGabriel Peyré
 
Signal Processing Course : Wavelets
Signal Processing Course : WaveletsSignal Processing Course : Wavelets
Signal Processing Course : WaveletsGabriel Peyré
 
Sparsity and Compressed Sensing
Sparsity and Compressed SensingSparsity and Compressed Sensing
Sparsity and Compressed SensingGabriel Peyré
 
Optimal Transport in Imaging Sciences
Optimal Transport in Imaging SciencesOptimal Transport in Imaging Sciences
Optimal Transport in Imaging SciencesGabriel Peyré
 
An Introduction to Optimal Transport
An Introduction to Optimal TransportAn Introduction to Optimal Transport
An Introduction to Optimal TransportGabriel Peyré
 
A Review of Proximal Methods, with a New One
A Review of Proximal Methods, with a New OneA Review of Proximal Methods, with a New One
A Review of Proximal Methods, with a New OneGabriel Peyré
 

More from Gabriel Peyré (15)

Mesh Processing Course : Introduction
Mesh Processing Course : IntroductionMesh Processing Course : Introduction
Mesh Processing Course : Introduction
 
Mesh Processing Course : Geodesic Sampling
Mesh Processing Course : Geodesic SamplingMesh Processing Course : Geodesic Sampling
Mesh Processing Course : Geodesic Sampling
 
Mesh Processing Course : Differential Calculus
Mesh Processing Course : Differential CalculusMesh Processing Course : Differential Calculus
Mesh Processing Course : Differential Calculus
 
Signal Processing Course : Theory for Sparse Recovery
Signal Processing Course : Theory for Sparse RecoverySignal Processing Course : Theory for Sparse Recovery
Signal Processing Course : Theory for Sparse Recovery
 
Signal Processing Course : Presentation of the Course
Signal Processing Course : Presentation of the CourseSignal Processing Course : Presentation of the Course
Signal Processing Course : Presentation of the Course
 
Signal Processing Course : Orthogonal Bases
Signal Processing Course : Orthogonal BasesSignal Processing Course : Orthogonal Bases
Signal Processing Course : Orthogonal Bases
 
Signal Processing Course : Fourier
Signal Processing Course : FourierSignal Processing Course : Fourier
Signal Processing Course : Fourier
 
Signal Processing Course : Denoising
Signal Processing Course : DenoisingSignal Processing Course : Denoising
Signal Processing Course : Denoising
 
Signal Processing Course : Compressed Sensing
Signal Processing Course : Compressed SensingSignal Processing Course : Compressed Sensing
Signal Processing Course : Compressed Sensing
 
Signal Processing Course : Approximation
Signal Processing Course : ApproximationSignal Processing Course : Approximation
Signal Processing Course : Approximation
 
Signal Processing Course : Wavelets
Signal Processing Course : WaveletsSignal Processing Course : Wavelets
Signal Processing Course : Wavelets
 
Sparsity and Compressed Sensing
Sparsity and Compressed SensingSparsity and Compressed Sensing
Sparsity and Compressed Sensing
 
Optimal Transport in Imaging Sciences
Optimal Transport in Imaging SciencesOptimal Transport in Imaging Sciences
Optimal Transport in Imaging Sciences
 
An Introduction to Optimal Transport
An Introduction to Optimal TransportAn Introduction to Optimal Transport
An Introduction to Optimal Transport
 
A Review of Proximal Methods, with a New One
A Review of Proximal Methods, with a New OneA Review of Proximal Methods, with a New One
A Review of Proximal Methods, with a New One
 

Recently uploaded

A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Recently uploaded (20)

A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 

Low Complexity Regularization of Inverse Problems - Course #3 Proximal Splitting Methods

  • 1. Low Complexity Regularization of Inverse Problems Cours #3 Proximal Splitting Methods Gabriel Peyré www.numerical-tours.com
  • 2. Overview of the Course • Course #1: Inverse Problems • Course #2: Recovery Guarantees • Course #3: Proximal Splitting Methods
  • 3. Convex Optimization Setting: G : H R ⇤ {+⇥} H: Hilbert space. Here: H = RN . Problem: min G(x) x H
  • 4. Convex Optimization Setting: G : H R ⇤ {+⇥} H: Hilbert space. Here: H = RN . Problem: Class of functions: Convex: G(tx + (1 min G(x) x H y x t)y) tG(x) + (1 t)G(y) t [0, 1]
  • 5. Convex Optimization Setting: G : H R ⇤ {+⇥} H: Hilbert space. Here: H = RN . Problem: Class of functions: Convex: G(tx + (1 min G(x) x H y x t)y) Lower semi-continuous: tG(x) + (1 t)G(y) lim inf G(x) G(x0 ) x x0 Proper: {x ⇥ H G(x) ⇤= + } = ⌅ ⇤ t [0, 1]
  • 6. Convex Optimization Setting: G : H R ⇤ {+⇥} H: Hilbert space. Here: H = RN . min G(x) Problem: Class of functions: Convex: G(tx + (1 x H t)y) Lower semi-continuous: tG(x) + (1 t)G(y) lim inf G(x) G(x0 ) x x0 Proper: {x ⇥ H G(x) ⇤= + } = ⌅ ⇤ Indicator: y x C (x) = (C closed and convex) 0 if x ⇥ C, + otherwise. t [0, 1]
  • 8. Example: Inverse problem: f0 Model: f0 = x RQ coe cients K 1 Regularization measurements Kf0 y = Kf0 + w K : RN x0 sparse in dictionary f= x R image N = K ⇥ ⇥ RP K Q RP , RN Q ,Q P N N. y = Kf RP observations
  • 9. Example: Inverse problem: f0 Model: f0 = x RQ coe cients K 1 Regularization measurements Kf0 y = Kf0 + w K : RN x0 sparse in dictionary f= x R image N = K ⇥ ⇥ RP K Q Sparse recovery: f = x where x solves 1 min ||y x||2 + ||x||1 x RN 2 Fidelity Regularization RP , RN Q ,Q P N N. y = Kf RP observations
  • 10. Example: 1 Regularization Inpainting: masking operator K fi if i , (Kf )i = 0 otherwise. K : RN RN Q RP c P =| | translation invariant wavelet frame. Orignal f0 = x0 y = x0 + w Recovery x
  • 11. Overview • Subdifferential Calculus • Proximal Calculus • Forward Backward • Douglas Rachford • Generalized Forward-Backward • Duality
  • 12. Sub-differential Sub-di erential: G(x) = {u ⇥ H ⇤ z, G(z) G(x) + ⌅u, z x⇧} G(x) = |x| G(0) = [ 1, 1]
  • 13. Sub-differential Sub-di erential: G(x) = {u ⇥ H ⇤ z, G(z) Smooth functions: G(x) + ⌅u, z x⇧} G(x) = |x| If F is C 1 , F (x) = { F (x)} G(0) = [ 1, 1]
  • 14. Sub-differential Sub-di erential: G(x) = {u ⇥ H ⇤ z, G(z) G(x) + ⌅u, z x⇧} G(x) = |x| Smooth functions: If F is C 1 , F (x) = { F (x)} G(0) = [ 1, 1] First-order conditions: x argmin G(x) x H 0 G(x )
  • 15. Sub-differential Sub-di erential: G(x) = {u ⇥ H ⇤ z, G(z) G(x) + ⌅u, z x⇧} G(x) = |x| Smooth functions: If F is C 1 , F (x) = { F (x)} G(0) = [ 1, 1] First-order conditions: x argmin G(x) 0 x H Monotone operator: (u, v) U (x) G(x ) U (x) x U (x) = G(x) U (y), y x, v u 0
  • 16. Example: 1 Regularization 1 x ⇥ argmin G(x) = ||y 2 x RQ ⇥G(x) = || · ||1 (x)i = ( x y) + ⇥|| · ||1 (x) x||2 + ||x||1 sign(xi ) if xi ⇥= 0, [ 1, 1] if xi = 0.
  • 17. Example: 1 Regularization 1 x ⇥ argmin G(x) = ||y 2 x RQ ⇥G(x) = || · ||1 (x)i = ( x x||2 + ||x||1 y) + ⇥|| · ||1 (x) sign(xi ) if xi ⇥= 0, [ 1, 1] if xi = 0. Support of the solution: I = {i ⇥ {0, . . . , N 1} xi ⇤= 0} xi i
  • 18. Example: 1 Regularization 1 x ⇥ argmin G(x) = ||y 2 x RQ ⇥G(x) = || · ||1 (x)i = ( x x||2 + ||x||1 y) + ⇥|| · ||1 (x) sign(xi ) if xi ⇥= 0, [ 1, 1] if xi = 0. xi i Support of the solution: I = {i ⇥ {0, . . . , N 1} xi ⇤= 0} First-order conditions: s RN , ( x i, y) + s = 0 sI = sign(xI ), ||sI c || 1. y x i
  • 19. Example: Total Variation Denoising Important: the optimization variable is f . 1 f ⇥ argmin ||y f ||2 + J(f ) f RN 2 Finite di erence gradient: Discrete TV norm: :R J(f ) = i = 0 (noisy) N R N 2 ||( f )i || ( f )i R2
  • 20. Example: Total Variation Denoising 1 f ⇥ argmin ||y f RN 2 J(f ) = G( f ) f ||2 + J(f ) G(u) = i Composition by linear maps: J(f ) = ⇥G(u)i = (J ||ui || A) = A ( J) A div ( G( f )) ui ||ui || if ui ⇥= 0, R2 || || 1 if ui = 0.
  • 21. Example: Total Variation Denoising 1 f ⇥ argmin ||y f RN 2 J(f ) = G( f ) f ||2 + J(f ) G(u) = i (J Composition by linear maps: J(f ) = ⇥G(u)i = A) = A ( J) A div ( G( f )) if ui ⇥= 0, R2 || || 1 ui ||ui || First-order conditions: ⇥i ⇥i ||ui || I, vi = I c , ||vi || v fi || fi || , 1 RN if ui = 0. 2 , f = y + div(v) I = {i (⇥f )i = 0}
  • 22. Overview • Subdifferential Calculus • Proximal Calculus • Forward Backward • Douglas Rachford • Generalized Forward-Backward • Duality
  • 23. Proximal Operators Proximal operator of G: 1 Prox G (x) = argmin ||x 2 z z||2 + G(z)
  • 24. Proximal Operators Proximal operator of G: 1 Prox G (x) = argmin ||x 2 z G(x) = ||x||1 = z||2 + G(z) log(1 + x2 ) |x| ||x||0 12 i |xi | 10 8 6 4 2 0 G(x) = ||x||0 = | {i xi = 0} | G(x) = i log(1 + |xi |2 ) G(x) −2 −10 −8 −6 −4 −2 0 2 4 6 8 10
  • 25. Proximal Operators Proximal operator of G: 1 Prox G (x) = argmin ||x 2 z G(x) = ||x||1 = Prox G (x)i z||2 + G(z) 12 i |xi | = max 0, 1 10 8 |xi | G(x) = ||x||0 = | {i xi = 0} | Prox G (x)i log(1 + x2 ) |x| ||x||0 = xi if |xi | 0 otherwise. xi 6 4 2 0 G(x) −2 −10 2 , −8 −6 −4 −2 0 2 4 6 8 10 10 8 6 4 2 0 G(x) = i log(1 + |xi |2 ) 3rd order polynomial root. −2 −4 −6 ProxG (x) −8 −10 −10 −8 −6 −4 −2 0 2 4 6 8 10
  • 26. Proximal calculus. Separability: G(x) = G₁(x₁) + ... + G_n(x_n) ⟹ Prox_G(x) = (Prox_{G₁}(x₁), ..., Prox_{G_n}(x_n)).
  • 27. Proximal calculus. Separability: G(x) = G₁(x₁) + ... + G_n(x_n) ⟹ Prox_G(x) = (Prox_{G₁}(x₁), ..., Prox_{G_n}(x_n)). Quadratic functionals: G(x) = ½‖Φx − y‖² ⟹ Prox_{γG}(x) = (Id + γΦ*Φ)^{-1}(x + γΦ*y).
  • 28. Proximal calculus. Separability: Prox_G(x) = (Prox_{G₁}(x₁), ..., Prox_{G_n}(x_n)). Quadratic functionals: Prox_{γG}(x) = (Id + γΦ*Φ)^{-1}(x + γΦ*y). Composition by a tight frame (A A* = Id): Prox_{G∘A} = A* ∘ Prox_G ∘ A + Id − A*A.
  • 29. Proximal calculus. Separability: Prox_G(x) = (Prox_{G₁}(x₁), ..., Prox_{G_n}(x_n)). Quadratic functionals: Prox_{γG}(x) = (Id + γΦ*Φ)^{-1}(x + γΦ*y). Composition by a tight frame (A A* = Id): Prox_{G∘A} = A* ∘ Prox_G ∘ A + Id − A*A. Indicators: G(x) = ι_C(x) with C closed and convex ⟹ Prox_{γG}(x) = Proj_C(x) = argmin_{z ∈ C} ‖x − z‖.
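A small numerical check of two of these rules, assuming a finite-dimensional dense Φ (the sizes and the non-negativity constraint used for the indicator example are illustrative choices, not the slides' data):

```python
import numpy as np

rng = np.random.default_rng(0)
P, N = 5, 8
Phi = rng.standard_normal((P, N))
y = rng.standard_normal(P)
gamma = 0.7
x = rng.standard_normal(N)

# Prox of G = 1/2 ||Phi . - y||^2: solve (Id + gamma Phi^T Phi) z = x + gamma Phi^T y.
z = np.linalg.solve(np.eye(N) + gamma * Phi.T @ Phi, x + gamma * Phi.T @ y)
# z must cancel the gradient of z -> 1/2 ||x - z||^2 + gamma/2 ||Phi z - y||^2.
print(np.allclose((z - x) + gamma * Phi.T @ (Phi @ z - y), 0.0))   # True

# Prox of an indicator is a projection, e.g. C = {z : z >= 0} gives entrywise clipping.
print(np.maximum(x, 0.0))
```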
  • 30. Prox and sub-differential. Resolvent of γ∂G: z = Prox_{γG}(x) ⟺ 0 ∈ z − x + γ∂G(z) ⟺ x ∈ (Id + γ∂G)(z) ⟺ z = (Id + γ∂G)^{-1}(x). Inverse of a set-valued mapping: U^{-1}(x) = { y : x ∈ U(y) }. Prox_{γG} = (Id + γ∂G)^{-1} is a single-valued mapping.
  • 31. Prox and sub-differential. Resolvent of γ∂G: Prox_{γG} = (Id + γ∂G)^{-1} is a single-valued mapping. Fixed point: x⋆ ∈ argmin_x G(x) ⟺ 0 ∈ γ∂G(x⋆) ⟺ x⋆ ∈ (Id + γ∂G)(x⋆) ⟺ x⋆ = (Id + γ∂G)^{-1}(x⋆) = Prox_{γG}(x⋆).
  • 32. Gradient and proximal descents. Gradient descent: x^(ℓ+1) = x^(ℓ) − τ ∇G(x^(ℓ)) [explicit], for G of class C¹ with ∇G L-Lipschitz. Theorem: if 0 < τ < 2/L, x^(ℓ) → x⋆ a solution.
  • 33. Gradient and proximal descents. Gradient descent: x^(ℓ+1) = x^(ℓ) − τ ∇G(x^(ℓ)) [explicit]; converges for 0 < τ < 2/L. Sub-gradient descent: x^(ℓ+1) = x^(ℓ) − τ_ℓ v^(ℓ), v^(ℓ) ∈ ∂G(x^(ℓ)). Theorem: if τ_ℓ ~ 1/ℓ, x^(ℓ) → x⋆ a solution. Problem: slow.
  • 34. Gradient and proximal descents. Gradient descent [explicit] and sub-gradient descent as above. Proximal-point algorithm: x^(ℓ+1) = Prox_{γ_ℓ G}(x^(ℓ)) [implicit]. Theorem: if γ_ℓ ≥ c > 0, x^(ℓ) → x⋆ a solution. Problem: Prox_{γG} is hard to compute.
  • 35. Overview • Subdifferential Calculus • Proximal Calculus • Forward Backward • Douglas Rachford • Generalized Forward-Backward • Duality
  • 37. Proximal splitting methods. Solve min_{x ∈ H} E(x). Problem: Prox_{γE} is not available. Splitting: E(x) = F(x) + Σ_i G_i(x), with F smooth and each G_i simple.
  • 38. Proximal splitting methods. Splitting E(x) = F(x) + Σ_i G_i(x), F smooth, G_i simple. Iterative algorithms using only ∇F(x) and Prox_{γG_i}(x): Forward-Backward solves F + G; Douglas-Rachford solves Σ_i G_i; Primal-Dual solves Σ_i G_i ∘ A_i; Generalized FB solves F + Σ_i G_i.
  • 39. Smooth + simple splitting. Inverse problem: y = Kf₀ + w, K : R^N → R^P, P ≪ N. Model: f₀ = Ψ x₀ with x₀ sparse in the dictionary Ψ. Sparse recovery: f⋆ = Ψ x⋆ where x⋆ solves min_{x ∈ R^Q} F(x) + G(x), with Φ = K Ψ. Data fidelity (smooth): F(x) = ½‖y − Φx‖². Regularization (simple): G(x) = λ‖x‖₁ = λ Σ_i |x_i|.
  • 40. Forward-Backward. Fixed-point equation: x⋆ ∈ argmin_x F(x) + G(x) ⟺ 0 ∈ ∇F(x⋆) + ∂G(x⋆) ⟺ (x⋆ − γ∇F(x⋆)) ∈ x⋆ + γ∂G(x⋆) ⟺ x⋆ = Prox_{γG}(x⋆ − γ∇F(x⋆)).
  • 41. Forward-Backward. Fixed-point equation: x⋆ = Prox_{γG}(x⋆ − γ∇F(x⋆)). Forward-Backward iteration: x^(ℓ+1) = Prox_{γG}(x^(ℓ) − γ∇F(x^(ℓ))).
  • 42. Forward-Backward. Iteration: x^(ℓ+1) = Prox_{γG}(x^(ℓ) − γ∇F(x^(ℓ))). Projected gradient descent: the special case G = ι_C.
  • 43. Forward-Backward. Iteration: x^(ℓ+1) = Prox_{γG}(x^(ℓ) − γ∇F(x^(ℓ))). Theorem: let ∇F be L-Lipschitz; if γ < 2/L, then x^(ℓ) → x⋆ a solution of (⋆).
  • 44. Example: ℓ¹ regularization. min_x ½‖Φx − y‖² + λ‖x‖₁ = min_x F(x) + G(x) with F(x) = ½‖Φx − y‖², ∇F(x) = Φ*(Φx − y), L = ‖Φ*Φ‖, and G(x) = λ‖x‖₁, Prox_{γG}(x)_i = max(0, 1 − λγ/|x_i|) x_i. Forward-Backward ⟹ iterative soft thresholding.
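A minimal sketch of the resulting iterative soft-thresholding (ISTA) scheme, with a random Φ and a synthetic sparse x₀ (sizes, seed and λ are illustrative assumptions):

```python
import numpy as np

def ista(Phi, y, lam, n_iter=500):
    L = np.linalg.norm(Phi, 2) ** 2              # Lipschitz constant of grad F = Phi^T (Phi x - y)
    gamma = 1.0 / L                              # any 0 < gamma < 2/L works
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = x - gamma * (Phi.T @ (Phi @ x - y))  # forward (explicit) gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - gamma * lam, 0.0)  # backward (prox) step
    return x

rng = np.random.default_rng(1)
Phi = rng.standard_normal((60, 200))
x0 = np.zeros(200); x0[rng.choice(200, 8, replace=False)] = rng.standard_normal(8)
y = Phi @ x0 + 0.01 * rng.standard_normal(60)
x_hat = ista(Phi, y, lam=0.05)
print(np.count_nonzero(np.abs(x_hat) > 1e-6))    # number of retained coefficients
```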
  • 45. Convergence speed. min_x E(x) = F(x) + G(x), ∇F L-Lipschitz, G simple. Theorem: if L > 0, the FB iterates x^(ℓ) satisfy E(x^(ℓ)) − E(x⋆) ≤ C/ℓ → 0, where the constant C degrades (grows) with L.
  • 46. Multi-step accelerations. Beck-Teboulle accelerated FB (FISTA): t^(0) = 1; x^(ℓ+1) = Prox_{G/L}( y^(ℓ) − (1/L) ∇F(y^(ℓ)) ); t^(ℓ+1) = (1 + √(1 + 4 (t^(ℓ))²)) / 2; y^(ℓ+1) = x^(ℓ+1) + ((t^(ℓ) − 1)/t^(ℓ+1)) (x^(ℓ+1) − x^(ℓ)) (see also Nesterov's method). Theorem: if L > 0, E(x^(ℓ)) − E(x⋆) ≤ C/ℓ². Complexity theory: optimal in a worst-case sense.
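A sketch of this accelerated scheme for the same ℓ¹ problem as the previous example (Φ, y and λ are assumed to be defined as there; variable names are mine):

```python
import numpy as np

def fista(Phi, y, lam, n_iter=500):
    L = np.linalg.norm(Phi, 2) ** 2
    x = np.zeros(Phi.shape[1]); x_prev = x.copy()
    z = x.copy()                                   # the extrapolated point y^(l)
    t = 1.0
    for _ in range(n_iter):
        g = z - (1.0 / L) * (Phi.T @ (Phi @ z - y))            # gradient step at z
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # Prox_{(lam/L) ||.||_1}
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t ** 2)) / 2.0
        z = x + ((t - 1.0) / t_next) * (x - x_prev)            # momentum / extrapolation
        x_prev, t = x.copy(), t_next
    return x
```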
  • 47. Overview • Subdifferential Calculus • Proximal Calculus • Forward Backward • Douglas Rachford • Generalized Forward-Backward • Duality
  • 48. Douglas-Rachford scheme. min_x G₁(x) + G₂(x), with G₁ and G₂ simple (⋆). Douglas-Rachford iterations: z^(ℓ+1) = (1 − α/2) z^(ℓ) + (α/2) RProx_{γG₂}(RProx_{γG₁}(z^(ℓ))); x^(ℓ+1) = Prox_{γG₁}(z^(ℓ+1)). Reflexive prox: RProx_{γG}(x) = 2 Prox_{γG}(x) − x.
  • 49. Douglas-Rachford scheme. Iterations: z^(ℓ+1) = (1 − α/2) z^(ℓ) + (α/2) RProx_{γG₂}(RProx_{γG₁}(z^(ℓ))); x^(ℓ+1) = Prox_{γG₁}(z^(ℓ+1)); RProx_{γG}(x) = 2 Prox_{γG}(x) − x. Theorem: if 0 < α < 2 and γ > 0, x^(ℓ) → x⋆ a solution of (⋆).
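A minimal sketch of this iteration, written for two arbitrary prox operators passed as functions (α ∈ (0,2) and the choice of γ, which is baked into the supplied prox maps, are left to the caller):

```python
import numpy as np

def douglas_rachford(prox_g1, prox_g2, z0, alpha=1.0, n_iter=200):
    z = z0.copy()
    for _ in range(n_iter):
        x = prox_g1(z)
        r1 = 2.0 * x - z                     # RProx_{gamma G1}(z)
        r2 = 2.0 * prox_g2(r1) - r1          # RProx_{gamma G2}(RProx_{gamma G1}(z))
        z = (1.0 - alpha / 2.0) * z + (alpha / 2.0) * r2
    return prox_g1(z)                        # the primal iterate x^(l)
```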
  • 50. DR fixed-point equation. x⋆ ∈ argmin_x G₁(x) + G₂(x) ⟺ 0 ∈ ∂(γG₁)(x⋆) + ∂(γG₂)(x⋆) ⟺ ∃ z⋆, z⋆ − x⋆ ∈ ∂(γG₁)(x⋆) and x⋆ − z⋆ ∈ ∂(γG₂)(x⋆) ⟺ x⋆ = Prox_{γG₁}(z⋆) and (2x⋆ − z⋆) − x⋆ ∈ ∂(γG₂)(x⋆).
  • 51. DR fixed-point equation. x⋆ = Prox_{γG₁}(z⋆) and x⋆ = Prox_{γG₂}(2x⋆ − z⋆) ⟺ 2 Prox_{γG₂}(RProx_{γG₁}(z⋆)) − RProx_{γG₁}(z⋆) = z⋆ ⟺ z⋆ = RProx_{γG₂}(RProx_{γG₁}(z⋆)) ⟺ z⋆ = (1 − α/2) z⋆ + (α/2) RProx_{γG₂}(RProx_{γG₁}(z⋆)).
  • 52. Example: constrained ℓ¹. min_{Φx = y} ‖x‖₁ = min_x G₁(x) + G₂(x) with C = { x : Φx = y }, G₁(x) = ι_C(x), Prox_{γG₁}(x) = Proj_C(x) = x + Φ*(ΦΦ*)^{-1}(y − Φx), and G₂(x) = ‖x‖₁, Prox_{γG₂}(x)_i = max(0, 1 − γ/|x_i|) x_i. Efficient when ΦΦ* is easy to invert.
  • 53. Example: constrained ℓ¹. Same splitting as above. Example: compressed sensing, Φ ∈ R^{100×400} a Gaussian matrix, y = Φx₀, ‖x₀‖₀ = 17. [Figure: log₁₀(‖x^(ℓ)‖₁ − ‖x⋆‖₁) versus iterations for γ = 0.01, 1, 10.]
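A sketch of this compressed-sensing experiment, reusing the douglas_rachford routine sketched after slide 49 (the sizes match the slide; the seed, γ and the number of iterations are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
Phi = rng.standard_normal((100, 400))
x0 = np.zeros(400); x0[rng.choice(400, 17, replace=False)] = rng.standard_normal(17)
y = Phi @ x0

pinv = Phi.T @ np.linalg.inv(Phi @ Phi.T)         # Phi^+ = Phi^T (Phi Phi^T)^{-1}
gamma = 1.0
prox_g1 = lambda x: x + pinv @ (y - Phi @ x)      # projection onto {x : Phi x = y}
prox_g2 = lambda x: np.sign(x) * np.maximum(np.abs(x) - gamma, 0.0)  # Prox_{gamma ||.||_1}

x_hat = douglas_rachford(prox_g1, prox_g2, np.zeros(400), n_iter=500)
print(np.linalg.norm(x_hat - x0) / np.linalg.norm(x0))   # relative recovery error
```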
  • 54. More than two functionals. min_x G₁(x) + ... + G_k(x), each G_i simple ⟺ min_{(x₁,...,x_k)} G(x₁,...,x_k) + ι_C(x₁,...,x_k), with G(x₁,...,x_k) = G₁(x₁) + ... + G_k(x_k) and C = { (x₁,...,x_k) ∈ H^k : x₁ = ... = x_k }.
  • 55. More than two functionals. With the splitting above, G and ι_C are simple: Prox_{γG}(x₁,...,x_k) = (Prox_{γG_i}(x_i))_i and Proj_C(x₁,...,x_k) = (x̃, ..., x̃) where x̃ = (1/k) Σ_i x_i.
  • 56. Auxiliary variables: DR. Linear map A : H → E, G₁ and G₂ simple. min_x G₁(x) + G₂(A x) ⟺ min_{z ∈ H×E} G(z) + ι_C(z), with G(x, y) = G₁(x) + G₂(y) and C = { (x, y) ∈ H × E : Ax = y }.
  • 57. Auxiliary variables: DR. Prox_{γG}(x, y) = (Prox_{γG₁}(x), Prox_{γG₂}(y)), and Proj_C(x, y) = (x + A*ỹ, y − ỹ) = (x̃, A x̃) where ỹ = (Id + AA*)^{-1}(Ax − y) and x̃ = (Id + A*A)^{-1}(A*y + x). Efficient when Id + AA* or Id + A*A is easy to invert.
  • 58. Example: TV regularization. min_f ½‖Kf − y‖² + λ‖∇f‖₁, with ‖u‖₁ = Σ_i ‖u_i‖. Splitting with the auxiliary variable u = ∇f: G₁(u) = λ‖u‖₁, G₂(f) = ½‖Kf − y‖², C = { (f, u) ∈ R^N × R^{N×2} : u = ∇f }. Prox_{γG₁}(u)_i = max(0, 1 − λγ/‖u_i‖) u_i, Prox_{γG₂}(f) = (Id + γK*K)^{-1}(f + γK*y), Proj_C(f, u) = (f̃, ∇f̃).
  • 59. Example: TV regularization. Proj_C(f, u) = (f̃, ∇f̃), where f̃ is the solution of (Id + ∇*∇) f̃ = f + ∇*u; it can be computed in O(N log(N)) operations using the FFT.
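A sketch of that FFT solve, assuming a square image and periodic boundary conditions for the finite-difference gradient (both assumptions, like the helper names grad, grad_adj and solve_id_plus_lap, are mine): with periodic differences, Id + ∇*∇ diagonalizes in the Fourier domain.

```python
import numpy as np

def grad(f):
    return np.stack([np.roll(f, -1, 0) - f, np.roll(f, -1, 1) - f], axis=-1)

def grad_adj(u):  # adjoint of grad (equal to -div for this discretization)
    return (np.roll(u[..., 0], 1, 0) - u[..., 0]) + (np.roll(u[..., 1], 1, 1) - u[..., 1])

def solve_id_plus_lap(rhs):
    n = rhs.shape[0]
    fx, fy = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n), indexing="ij")
    eig = 4.0 * np.sin(np.pi * fx) ** 2 + 4.0 * np.sin(np.pi * fy) ** 2  # eigenvalues of grad^T grad
    return np.real(np.fft.ifft2(np.fft.fft2(rhs) / (1.0 + eig)))

n = 64
rng = np.random.default_rng(0)
f0, u = rng.standard_normal((n, n)), rng.standard_normal((n, n, 2))
f_tilde = solve_id_plus_lap(f0 + grad_adj(u))        # Proj_C(f0, u) = (f_tilde, grad(f_tilde))
print(np.allclose(f_tilde + grad_adj(grad(f_tilde)), f0 + grad_adj(u)))  # True
```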
  • 60. Example: TV regularization. [Figure: original f₀; observations y = Kf₀ + w; recovered f⋆; evolution along the iterations.]
  • 61. Overview • Subdifferential Calculus • Proximal Calculus • Forward Backward • Douglas Rachford • Generalized Forward-Backward • Duality
  • 62. GFB splitting. min_{x ∈ R^N} F(x) + Σ_{i=1}^n G_i(x), F smooth, each G_i simple (⋆). Iterations: for i = 1, ..., n, z_i^(ℓ+1) = z_i^(ℓ) + Prox_{nγG_i}(2x^(ℓ) − z_i^(ℓ) − γ∇F(x^(ℓ))) − x^(ℓ); then x^(ℓ+1) = (1/n) Σ_{i=1}^n z_i^(ℓ+1).
  • 63. GFB splitting. Iterations as above. Theorem: let ∇F be L-Lipschitz; if γ < 2/L, x^(ℓ) → x⋆ a solution of (⋆).
  • 64. GFB splitting. Iterations and convergence as above. Special cases: n = 1 ⟹ Forward-Backward; F = 0 ⟹ Douglas-Rachford.
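A minimal sketch of the generalized forward-backward iteration above, for a smooth F given by its gradient grad_F and Lipschitz constant L, and a list of simple terms given by their prox operators prox_list[i](x, step) = Prox_{step·G_i}(x) (the interface is an assumption of this sketch):

```python
import numpy as np

def gfb(grad_F, L, prox_list, x0, n_iter=200):
    n = len(prox_list)
    gamma = 1.0 / L                       # any 0 < gamma < 2/L
    x = x0.copy()
    z = [x0.copy() for _ in range(n)]
    for _ in range(n_iter):
        g = grad_F(x)
        for i in range(n):
            z[i] = z[i] + prox_list[i](2.0 * x - z[i] - gamma * g, n * gamma) - x
        x = sum(z) / n                    # average the auxiliary variables
    return x
```

With n = 1 this reduces to forward-backward, and with grad_F = 0 (and L replaced by any positive constant) to Douglas-Rachford, as stated on the slide.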
  • 65. GFB fixed point. x⋆ ∈ argmin_x F(x) + Σ_i G_i(x) ⟺ 0 ∈ ∇F(x⋆) + Σ_i ∂G_i(x⋆) ⟺ ∃ (y_i)_i, y_i ∈ ∂G_i(x⋆), ∇F(x⋆) + Σ_i y_i = 0.
  • 66. GFB fixed point. ⟺ ∃ (z_i)_{i=1}^n with x⋆ = (1/n) Σ_i z_i and, for each i, x⋆ − z_i − γ∇F(x⋆) ∈ nγ ∂G_i(x⋆) (take z_i = x⋆ − nγ y_i − γ∇F(x⋆)).
  • 67. GFB fixed point. ⟺ for each i, (2x⋆ − z_i − γ∇F(x⋆)) − x⋆ ∈ nγ ∂G_i(x⋆) ⟺ x⋆ = Prox_{nγG_i}(2x⋆ − z_i − γ∇F(x⋆)) ⟺ z_i = z_i + Prox_{nγG_i}(2x⋆ − z_i − γ∇F(x⋆)) − x⋆.
  • 68. GFB fixed point. This is a fixed-point equation on (x⋆, z₁, ..., z_n), matched by the GFB iterations.
  • 69. Block regularization. ℓ¹–ℓ² block sparsity: G(x) = Σ_{b ∈ B} ‖x[b]‖, where ‖x[b]‖² = Σ_{m ∈ b} x_m². [Figure: image f = Ψx, coefficients x, and two block structures B₁, B₂, compared with ‖x‖₁ = Σ_i |x_i|.]
  • 70. Block regularization. G(x) = Σ_{b ∈ B} ‖x[b]‖. Non-overlapping decomposition: B = B₁ ∪ ... ∪ B_n, so that G(x) = Σ_{i=1}^n G_i(x) with G_i(x) = Σ_{b ∈ B_i} ‖x[b]‖.
  • 71. Block regularization. Each G_i is simple: for b ∈ B_i and m ∈ b, Prox_{γG_i}(x)_m = max(0, 1 − γ/‖x[b]‖) x_m.
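A sketch of this block soft-thresholding prox, for non-overlapping blocks given as a list of index arrays (the block list below is an illustrative assumption):

```python
import numpy as np

def prox_block_l1l2(x, gamma, blocks):
    z = x.copy()
    for b in blocks:
        norm_b = np.linalg.norm(x[b])
        z[b] = max(0.0, 1.0 - gamma / max(norm_b, 1e-12)) * x[b]   # shrink the whole block
    return z

x = np.arange(8, dtype=float)
print(prox_block_l1l2(x, 2.0, [np.arange(0, 4), np.arange(4, 8)]))
```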
  • 72. Numerical experiments. Deconvolution (Φ = convolution) and deconvolution + inpainting (Φ = inpainting ∘ convolution) with translation-invariant wavelets, N = 256², and ℓ¹/ℓ² block regularization min_x ½‖y − Φx‖² + λ Σ_k ‖x‖_{B_k,1,2}. [Figures: decay of log₁₀(E(x^(ℓ)) − E(x⋆)) versus iterations for the EFB, PR and CP algorithms; reported timings t_EFB = 161s, t_PR = 173s, t_CP = 190s (deconvolution, SNR 22.49 dB at iteration #50) and t_EFB = 283s, t_PR = 298s, t_CP = 368s (deconvolution + inpainting, SNR 21.80 dB at iteration #50); noise 0.025, convolution width 2, degradation 0.4.]
  • 73. Overview • Subdifferential Calculus • Proximal Calculus • Forward Backward • Douglas Rachford • Generalized Forward-Backward • Duality
  • 74. Legendre-Fenchel duality. Legendre-Fenchel transform: G*(u) = sup_{x ∈ dom(G)} ⟨u, x⟩ − G(x). [Figure: a supporting line of slope u under the graph of G.]
  • 75. Legendre-Fenchel duality. G*(u) = sup_{x ∈ dom(G)} ⟨u, x⟩ − G(x). Example: quadratic functional G(x) = ½⟨Ax, x⟩ + ⟨x, b⟩ ⟹ G*(u) = ½⟨u − b, A^{-1}(u − b)⟩.
  • 76. Legendre-Fenchel duality. G*(u) = sup_{x ∈ dom(G)} ⟨u, x⟩ − G(x). Moreau's identity: Prox_{γG*}(x) = x − γ Prox_{G/γ}(x/γ), so G is simple ⟺ G* is simple.
  • 77. Indicator and homogeneous. Positively 1-homogeneous functional: G(λx) = |λ| G(x). Example: a norm, G(x) = ‖x‖. Duality: G*(x) = ι_{{G∗(·) ≤ 1}}(x), where G∗(y) = max_{G(x) ≤ 1} ⟨x, y⟩ is the dual functional.
  • 78. Indicator and homogeneous. Duality G*(x) = ι_{{G∗(·) ≤ 1}}(x) as above. ℓ^p norms: G(x) = ‖x‖_p ⟹ G∗(x) = ‖x‖_q with 1/p + 1/q = 1, p, q ∈ [1, +∞].
  • 79. Indicator and homogeneous. Example: proximal operator of the ℓ^∞ norm, via Moreau's identity: Prox_{γ‖·‖_∞}(x) = x − γ Proj_{‖·‖₁ ≤ 1}(x/γ), with Proj_{‖·‖₁ ≤ 1}(x)_i = max(0, 1 − λ/|x_i|) x_i for a well-chosen λ = λ(x, γ).
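A sketch combining Moreau's identity with the ℓ¹-ball projection: the "well-chosen" threshold λ(x) is computed here by the standard sorting trick for simplex/ℓ¹-ball projection, which is one possible implementation, not necessarily the slides' own.

```python
import numpy as np

def proj_l1_ball(x, radius=1.0):
    if np.sum(np.abs(x)) <= radius:
        return x.copy()
    a = np.sort(np.abs(x))[::-1]                 # sorted magnitudes, decreasing
    cssv = np.cumsum(a)
    k = np.nonzero(a * np.arange(1, len(a) + 1) > (cssv - radius))[0][-1]
    lam = (cssv[k] - radius) / (k + 1.0)         # the well-chosen soft threshold
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def prox_linf(x, gamma):
    # Moreau: Prox_{gamma ||.||_inf}(x) = x - gamma * Proj_{||.||_1 <= 1}(x / gamma)
    return x - gamma * proj_l1_ball(x / gamma)

x = np.array([3.0, -1.0, 0.5, 2.5])
print(prox_linf(x, 1.0))   # clips the largest entries to a common level
```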
  • 80. Primal-dual formulation. Fenchel-Rockafellar duality, A : H → L linear: min_{x ∈ H} G₁(x) + G₂(A x) = min_{x ∈ H} G₁(x) + sup_{u ∈ L} ⟨Ax, u⟩ − G₂*(u).
  • 81. Primal-dual formulation. Strong duality (min ↔ max) holds if 0 ∈ ri(dom(G₂)) − A ri(dom(G₁)): the problem equals max_u −G₂*(u) + min_x G₁(x) + ⟨x, A*u⟩ = max_u −G₂*(u) − G₁*(−A*u).
  • 82. Primal-dual formulation. Recovering x⋆ from some dual solution u⋆: x⋆ = argmin_x G₁(x) + ⟨x, A*u⋆⟩.
  • 83. Primal-dual formulation. x⋆ = argmin_x G₁(x) + ⟨x, A*u⋆⟩ ⟺ −A*u⋆ ∈ ∂G₁(x⋆) ⟺ x⋆ ∈ (∂G₁)^{-1}(−A*u⋆) = ∂G₁*(−A*u⋆).
  • 84. Forward-Backward on the dual. If G₁ is strongly convex (∇²G₁ ≻ c Id): G₁(tx + (1−t)y) ≤ t G₁(x) + (1−t) G₁(y) − (c/2) t(1−t) ‖x − y‖².
  • 85. Forward-Backward on the dual. If G₁ is strongly convex, x⋆ is uniquely defined, G₁* is of class C¹, and x⋆ = ∇G₁*(−A*u⋆).
  • 86. Forward-Backward on the dual. min_{x ∈ H} G₁(x) + G₂(A x) ⟺ min_{u ∈ L} G₁*(−A*u) [smooth] + G₂*(u) [simple], solved by FB on the dual: u^(ℓ+1) = Prox_{τG₂*}( u^(ℓ) + τ A ∇G₁*(−A*u^(ℓ)) ).
  • 87. Example: TV denoising. min_{f ∈ R^N} ½‖f − y‖² + λ‖∇f‖₁, with ‖u‖₁ = Σ_i ‖u_i‖. Dual problem: u⋆ solves min_{‖u‖_∞ ≤ λ} ‖y + div(u)‖², where ‖u‖_∞ = max_i ‖u_i‖. Primal solution: f⋆ = y + div(u⋆). [Chambolle 2004]
  • 88. Example: TV denoising. FB on the dual (aka projected gradient descent): u^(ℓ+1) = Proj_{‖·‖_∞ ≤ λ}( u^(ℓ) + τ ∇(y + div(u^(ℓ))) ), with v = Proj_{‖·‖_∞ ≤ λ}(u) given by v_i = u_i / max(‖u_i‖/λ, 1). Convergence if τ < 2/‖div ∘ ∇‖ = 1/4.
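A sketch of this projected gradient scheme on the dual TV problem; it reuses the periodic-difference grad/grad_adj helpers from the earlier FFT sketch (with grad_adj = −div), and the test image, noise level and λ are illustrative assumptions.

```python
import numpy as np

def tv_denoise_dual(y, lam, tau=0.24, n_iter=300):
    u = np.zeros(y.shape + (2,))
    for _ in range(n_iter):
        f = y - grad_adj(u)                        # current primal iterate f = y + div(u)
        u = u + tau * grad(f)                      # gradient step on the dual objective
        norms = np.maximum(np.sqrt(np.sum(u ** 2, axis=-1)) / lam, 1.0)
        u = u / norms[..., None]                   # projection onto {||u_i|| <= lam}
    return y - grad_adj(u)

rng = np.random.default_rng(0)
f0 = np.zeros((64, 64)); f0[16:48, 16:48] = 1.0    # a piecewise-constant test image
y = f0 + 0.1 * rng.standard_normal((64, 64))
f_hat = tv_denoise_dual(y, lam=0.2)
```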
  • 89. Primal-dual algorithm. min_{x ∈ H} G₁(x) + G₂(A x) ⟺ min_x max_z G₁(x) − G₂*(z) + ⟨Ax, z⟩.
  • 90. Primal-dual algorithm. Iterations: z^(ℓ+1) = Prox_{σG₂*}(z^(ℓ) + σ A x̃^(ℓ)); x^(ℓ+1) = Prox_{τG₁}(x^(ℓ) − τ A* z^(ℓ+1)); x̃^(ℓ+1) = x^(ℓ+1) + θ (x^(ℓ+1) − x^(ℓ)). θ = 0: Arrow-Hurwicz algorithm; θ = 1: convergence speed guarantee on the duality gap.
  • 91. Primal-dual algorithm. Theorem [Chambolle-Pock 2011]: if 0 ≤ θ ≤ 1 and στ‖A‖² < 1, then x^(ℓ) → x⋆, a minimizer of G₁ + G₂ ∘ A.
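A minimal sketch of this primal-dual (Chambolle-Pock) iteration, with A given as a dense matrix and the two proximal maps passed as functions prox_g1(x, step) = Prox_{step·G₁}(x) and prox_g2_star(z, step) = Prox_{step·G₂*}(z); the condition στ‖A‖² < 1 is left to the caller (the interface is my assumption, not the slides').

```python
import numpy as np

def chambolle_pock(A, prox_g1, prox_g2_star, x0, sigma, tau, theta=1.0, n_iter=300):
    x = x0.copy(); x_bar = x0.copy()
    z = np.zeros(A.shape[0])
    for _ in range(n_iter):
        z = prox_g2_star(z + sigma * (A @ x_bar), sigma)   # dual ascent step
        x_new = prox_g1(x - tau * (A.T @ z), tau)          # primal descent step
        x_bar = x_new + theta * (x_new - x)                # extrapolation
        x = x_new
    return x
```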
  • 92. Conclusion. Inverse problems in imaging: large scale (N ~ 10⁶); non-smooth (sparsity, TV, ...); (sometimes) convex; highly structured (separability, ℓ^p norms, ...).
  • 93. Conclusion. Inverse problems in imaging: large scale, non-smooth, (sometimes) convex, highly structured. Proximal splitting: unravels the structure of the problems; parallelizable; decomposition G = Σ_k G_k. [Figure: towards more complex penalizations (block structures).]
  • 94. Conclusion. Proximal splitting: unravels the structure of the problems; parallelizable; decomposition G = Σ_k G_k. Open problems: less structured problems without smoothness; non-convex optimization.