Exact Matrix Completion via Convex Optimization
Emmanuel J. Candès and Benjamin Recht
Minsu Kim¹, Joonyoung Yi¹
¹School of Computing, KAIST
CS592, Advanced Machine Learning:
High-Dimensional Data Analysis
April, 2018
CS592 Exact Matrix Completion 2018. 4. 24 1 / 51
Table of Contents
1 Introduction
2 Results of Paper
3 Rationality of Incoherence Assumption
4 Proofs
5 Experiments
6 Discussion
CS592 Exact Matrix Completion 2018. 4. 24 2 / 51
Matrix Completion Problem (Informal)
Given: Matrix M with some entries missing
Goal: Complete the matrix M
In general, this is impossible.
But it becomes possible under a low-rank assumption.
CS592 Exact Matrix Completion 2018. 4. 24 4 / 51
Low-rank Assumption
Low-rank assumption
For an n1 × n2 matrix M of rank r, assume r ≪ min(n1, n2).
Why is this assumption needed?
For simplicity, think about an n × n matrix M of rank r.
It has (2n − r)r degrees of freedom.
The degrees of freedom are counted from the parameters in the SVD:
(number of singular values)
+ (degrees of freedom of the left singular vectors)
+ (degrees of freedom of the right singular vectors)
= r + (2n − r − 1)r/2 + (2n − r − 1)r/2 = (2n − r)r
This is considerably smaller than n².
With the low-rank assumption, the number of free parameters is only about a 2r/n fraction of the n² entries.
This assumption plays the same role as the sparsity assumption in the Lasso paper we've studied.
CS592 Exact Matrix Completion 2018. 4. 24 5 / 51
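As a quick numeric sanity check (our own minimal sketch, not from the paper), the parameter count (2n − r)r can be compared against the n² entries for a few example sizes:

```python
# Sanity check of the degrees-of-freedom count for an n x n matrix of rank r.
# (2n - r) * r parameters vs. n^2 entries; the fraction is roughly 2r/n when r << n.
for n, r in [(1000, 10), (1000, 50), (5000, 10)]:
    dof = (2 * n - r) * r                      # parameters in the rank-r SVD
    print(f"n={n}, r={r}: dof={dof}, dof/n^2={dof / n**2:.4f}, 2r/n={2 * r / n:.4f}")
```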
Which Matrices?
Consider the rank-1 matrix M whose only nonzero entry is a 1 in the top-right corner:
M = e1 en∗ =
⎡ 0 0 · · · 0 1 ⎤
⎢ 0 0 · · · 0 0 ⎥
⎢ ⋮ ⋮       ⋮ ⋮ ⎥
⎣ 0 0 · · · 0 0 ⎦
Let m be the number of observed entries of M.
Then, with probability 1 − m/n², the sample misses the single nonzero entry and we see only zeros.
If the sample set does not contain the 1, we cannot complete the matrix exactly.
So it is impossible to recover all low-rank matrices.
CS592 Exact Matrix Completion 2018. 4. 24 6 / 51
Which Matrices?
Consider the SVD of a matrix M = Σ_{k=1}^r σk uk vk∗, where the uk's and vk's are the left and right singular vectors, and the σk's are the singular values.
Random orthogonal model
The family {uk}1≤k≤r is selected uniformly at random among all families of r orthonormal vectors, and similarly for the family {vk}1≤k≤r.
With this model, we can see, intuitively, that the exceptional case mentioned above occurs with small probability.
Formal proofs are provided by Lemmas 2.1 and 2.2.
CS592 Exact Matrix Completion 2018. 4. 24 7 / 51
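For illustration, a standard way to draw from the random orthogonal model is to orthonormalize Gaussian matrices via QR; the sketch below is our own (the singular values are arbitrary and not part of the model):

```python
import numpy as np

def random_orthogonal_rank_r(n1, n2, r, rng=np.random.default_rng(0)):
    """Sample U (n1 x r) and V (n2 x r) with uniformly random orthonormal
    columns (QR of a Gaussian matrix), then form M = U diag(sigma) V*."""
    U, _ = np.linalg.qr(rng.standard_normal((n1, r)))   # left singular vectors
    V, _ = np.linalg.qr(rng.standard_normal((n2, r)))   # right singular vectors
    sigma = rng.uniform(1.0, 2.0, size=r)               # arbitrary positive singular values
    return U @ np.diag(sigma) @ V.T

M = random_orthogonal_rank_r(50, 40, 5)
print(np.linalg.matrix_rank(M))   # 5
```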
Which Sampling Sets?
Clearly, we cannot recover Mij if the sampling set avoids the i-th row or the j-th column.
So we have to specify assumptions on the sampling set.
Definition of the sampling set Ω
If Mij is observed, then (i, j) ∈ Ω. The cardinality of Ω is m.
Uniform sampling assumption
The set Ω is sampled uniformly at random among all sets of cardinality m.
Additionally, this paper introduces the sampling operator RΩ for convenience.
Sampling operator RΩ
Let RΩ : R^{n1×n2} → R^{|Ω|} be the sampling operator which extracts the observed entries, RΩ(X) = (Xij)_{(i,j)∈Ω}.
CS592 Exact Matrix Completion 2018. 4. 24 8 / 51
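In code, RΩ is just index-set extraction; a minimal sketch (ours, storing Ω as row/column index arrays):

```python
import numpy as np

def sample_omega(n1, n2, m, rng=np.random.default_rng(0)):
    """Pick m entry positions uniformly at random (without replacement)."""
    idx = rng.choice(n1 * n2, size=m, replace=False)
    return np.unravel_index(idx, (n1, n2))   # (rows, cols) defining Omega

def R_omega(X, omega):
    """Sampling operator: extract the observed entries X_ij for (i, j) in Omega."""
    rows, cols = omega
    return X[rows, cols]

X = np.arange(12.0).reshape(3, 4)
omega = sample_omega(3, 4, 5)
print(R_omega(X, omega))
```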
Which Algorithm?
Problem 1.3: Formal Matrix Completion Problem
minimize rank(X)
subject to RΩ(X) = RΩ(M)
Unfortunately, this problem is NP-hard and non-convex.
The rank function is not convex.
All known algorithms require time exponential in n.
Problem 1.5: Convex Relaxation of the Matrix Completion Problem
minimize ‖X‖∗
subject to RΩ(X) = RΩ(M)
The nuclear norm ‖X‖∗ is a surrogate for the rank.
Also, the nuclear norm is a convex function.
This heuristic was introduced by [Fazel et al. 2002].
CS592 Exact Matrix Completion 2018. 4. 24 9 / 51
Is the Nuclear Norm Relaxation Reasonable? (Informal)
We already know from the Lasso paper that the l1-norm is a surrogate for the l0-norm.
Figure: http://public.lanl.gov/mewall/kluwer2002.html
Let σ = (σ1, σ2, · · · , σn), where σi is the i-th singular value of the matrix M.
The rank function is the l0-norm of the vector σ.
The nuclear norm is the l1-norm of the vector σ.
CS592 Exact Matrix Completion 2018. 4. 24 10 / 51
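A small numerical illustration (ours, using NumPy) of the rank as the l0-norm and the nuclear norm as the l1-norm of the singular-value vector:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))   # a rank-2 matrix

sigma = np.linalg.svd(M, compute_uv=False)   # singular values of M
rank = np.sum(sigma > 1e-10)                 # l0-"norm" of sigma (up to numerical tolerance)
nuclear_norm = np.sum(sigma)                 # l1-norm of sigma

print(rank, nuclear_norm)                    # rank = 2
```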
How to Solve Nuclear Norm Minimization?
This method was also suggested by Fazel.
The convex relaxation of the matrix completion problem (nuclear norm minimization) can be solved by semidefinite programming (SDP).
Let the SVD of the matrix X be X = UΣV∗.
Define W1 = UΣU∗, W2 = V ΣV∗, and the block matrix X̃ = [W1, X; X∗, W2].
Under the above settings, the following properties hold:
‖X‖∗ = ‖W1‖∗ = ‖W2‖∗
X̃ ⪰ 0 (positive semidefinite), because
X̃ = [U; V] Σ [U; V]∗ = ([U; V] √Σ)([U; V] √Σ)∗ ⪰ 0
(for any matrix A, AA∗ ⪰ 0)
For any symmetric positive semidefinite matrix X, trace(X) = ‖X‖∗.
https://chrischoy.github.io/research/matrix-norms/
CS592 Exact Matrix Completion 2018. 4. 24 11 / 51
How to Solve Nuclear Norm Minimization?
Therefore, Problem 1.5 can be reduced to
Reduced form of nuclear norm minimization
minimize trace(X̃)
subject to RΩ(X) = RΩ(M), X̃ = [W1, X; X∗, W2] ⪰ 0
trace(X̃) = trace(W1) + trace(W2) = ‖X‖∗ + ‖X‖∗ = 2‖X‖∗
CS592 Exact Matrix Completion 2018. 4. 24 12 / 51
How to Solve Nuclear Norm Minimization?
Now the reduced form can be solved by SDP.
Many algorithms for solving SDPs already exist.
Semidefinite Programming (SDP)
minimize_{X ∈ Sⁿ} ⟨C, X⟩
subject to ⟨Ak, X⟩ = bk, k = 1, . . . , m
X ⪰ 0
Sⁿ: the set of n × n symmetric matrices.
⟨X, Y⟩ = trace(X∗Y) is the element-wise (trace) inner product.
Each Ak is all zeros except for a single entry equal to 1, one for each (i, j) ∈ Ω.
CS592 Exact Matrix Completion 2018. 4. 24 13 / 51
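For illustration only, here is a minimal sketch (ours) of Problem 1.5 posed in a generic convex-optimization modeling tool; we use CVXPY for brevity (the reproduction later in this deck used CVXOPT instead), and the solver internally reduces the program to an SDP like the one above:

```python
import cvxpy as cp
import numpy as np

def complete_matrix(M, mask):
    """Nuclear norm minimization (Problem 1.5):
    minimize ||X||_* subject to X agreeing with M on the observed entries.
    `mask` is a 0/1 array with mask_ij = 1 iff (i, j) is in Omega."""
    X = cp.Variable(M.shape)
    constraints = [cp.multiply(mask, X) == cp.multiply(mask, M)]   # R_Omega(X) = R_Omega(M)
    prob = cp.Problem(cp.Minimize(cp.norm(X, "nuc")), constraints)
    prob.solve()
    return X.value
```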
Is the Nuclear Norm Relaxation Reasonable? (Formal)
A first typical result:
M: n1 × n2 matrix of rank r, n = max(n1, n2).
Theorem 1.1 (General)
- M obeys the random orthogonal model.
- With the uniformly random sampling assumption.
Then ∃ constants C, c such that if
m ≥ C n^{5/4} r log n
the minimizer to Problem 1.5 is unique and equal to M with probability at least 1 − c n^{−3}.
For the low-rank case, a tighter bound is given.
Theorem 1.1 (Low-rank)
If r ≤ n^{1/5}, the recovery is exact with the same probability provided that
m ≥ C n^{6/5} r log n
CS592 Exact Matrix Completion 2018. 4. 24 15 / 51
Is the Nuclear Norm Relaxation Reasonable? (Formal)
Asymptotically,
Theorem 1.1 (General): m = Ω(n^{1.25} r log n)
Theorem 1.1 (Low-rank): m = Ω(n^{1.2} r log n)
This bound is not tight (plug r = n into the general case of Theorem 1.1: the requirement exceeds the n² entries).
This theorem tells us:
With some conditions, we can recover the matrix exactly.
Problem 1.3 and Problem 1.5 are formally equivalent under these conditions.
This justifies the nuclear norm relaxation.
CS592 Exact Matrix Completion 2018. 4. 24 16 / 51
Relaxation of the Random Orthogonal Model
Recall why we introduced the random orthogonal model:
M = e1 en∗ =
⎡ 0 0 · · · 0 1 ⎤
⎢ 0 0 · · · 0 0 ⎥
⎢ ⋮ ⋮       ⋮ ⋮ ⎥
⎣ 0 0 · · · 0 0 ⎦
More generally, it is hard to recover M if its singular vectors are similar to the standard basis, because the information is then highly concentrated in a specific region.
Hence, the singular vectors need to be sufficiently spread out.
The random orthogonal model is one of the models that tackles this problem.
We want to relax this model assumption. Why?
To get a more general theorem that covers a much larger set of matrices M.
CS592 Exact Matrix Completion 2018. 4. 24 17 / 51
Coherence
Definition 1.2
Let U be a subspace of Rⁿ of dimension r and PU be the orthogonal projection onto U. Then the coherence of U is defined to be
µ(U) ≡ (n/r) max_{1≤i≤n} ‖PU ei‖²
Examples
Incoherence example (minimum case): if U is spanned by vectors whose entries all have magnitude 1/√n, then µ(U) = 1.
Coherence example (maximum case): if U contains a standard basis element, then µ(U) = n/r.
The closer U is to the standard basis elements, the larger µ(U); the farther away, the smaller µ(U).
CS592 Exact Matrix Completion 2018. 4. 24 18 / 51
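A small sketch (ours) of Definition 1.2: given an orthonormal basis matrix of U, ‖PU ei‖² is just the squared norm of the i-th row of that matrix, so the coherence is n/r times the largest squared row norm.

```python
import numpy as np

def coherence(U):
    """mu(U) = (n/r) * max_i ||P_U e_i||^2, where the columns of U are an
    orthonormal basis of the subspace (so ||P_U e_i||^2 is the squared i-th row norm)."""
    n, r = U.shape
    row_norms_sq = np.sum(U**2, axis=1)
    return (n / r) * np.max(row_norms_sq)

n, r = 100, 5
# Incoherent case: orthonormalized Gaussian columns -> mu(U) close to its minimum 1.
U_random, _ = np.linalg.qr(np.random.default_rng(0).standard_normal((n, r)))
# Coherent case: U contains standard basis vectors -> mu(U) = n/r.
U_spiky = np.eye(n)[:, :r]
print(coherence(U_random), coherence(U_spiky))   # small vs. n/r = 20
```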
Coherence
From Definition 1.2, we define two assumptions A0 and A1.
Assumption A0
The coherences obey max(µ(U), µ(V)) ≤ µ0 for some positive µ0.
Assumption A1
The n1 × n2 matrix Σ_{1≤k≤r} uk vk∗ has maximum entry bounded by µ1 √(r/(n1 n2)) in absolute value, for some positive µ1.
A1 always holds with µ1 = µ0 √r.
This paper argues that A0 and A1 are more general than the random orthogonal model.
Lemma 2.2 proves that the random orthogonal model is a special case of A0 and A1.
CS592 Exact Matrix Completion 2018. 4. 24 19 / 51
Coherence
Let's investigate a quite natural assumption:
the entries of the left and right singular vectors are bounded.
Assumption 1.12
Assume that the uj's and vj's obey
max_{i,j} |⟨ei, uj⟩|² ≤ µB/n, max_{i,j} |⟨ei, vj⟩|² ≤ µB/n,
for some value of µB = O(1).
With this assumption, assumptions A0 and A1 hold:
µ(U), µ(V) ≤ µB
µ1 = O(√(log n)) (proved by Lemma 2.1)
CS592 Exact Matrix Completion 2018. 4. 24 20 / 51
Main Result
With the incoherence conditions, Theorem 1.1 is relaxed to Theorem 1.3.
Theorem 1.3 (General)
- M obeys A0 and A1.
- With the uniformly random sampling assumption.
Then ∃ constants C, c such that if
m ≥ C max(µ1², µ0^{1/2} µ1, µ0 n^{1/4}) n r (β log n)
for some β > 2, then the minimizer to Problem (1.5) is unique and equal to M with probability at least 1 − c n^{−β}.
Theorem 1.3 (Low-rank)
If r ≤ µ0^{−1} n^{1/5}, the recovery is exact with the same probability provided that
m ≥ C µ0 n^{6/5} r (β log n).
Asymptotically similar to Theorem 1.1.
CS592 Exact Matrix Completion 2018. 4. 24 21 / 51
Connections to Lasso
Recall: the Lasso paper
Compressive sampling (or compressed sensing, matrix sensing)
Ax = b (A is a design matrix, b is an observation vector)
In the Lasso paper, they add a sparsity assumption on x to tackle the setting where the problem dimension is much larger than the number of observations (p ≫ n).
If x is k-sparse in the Fourier domain, it can be perfectly recovered with high probability by l1 minimization when m = Ω(k log n).
Also, this paper claims to generalize the notion of incoherence to problems beyond the setting of sparse signal recovery.
CS592 Exact Matrix Completion 2018. 4. 24 22 / 51
Connections to the Matrix Sensing Problem
Fazel's original problem was
solving the matrix sensing problem with a low-rank assumption (nuclear norm heuristic) and other assumptions.
Contributions of this paper compared to Fazel's original work:
Extend Fazel's work to the matrix completion problem.
Define conditions (including incoherence) under which the matrix can be completed exactly.
Provide a theoretical bound for the convex relaxation (nuclear norm minimization).
CS592 Exact Matrix Completion 2018. 4. 24 23 / 51
Which Matrices are Incoherent?
Theorem 1.3 says that completing the matrix is possible if the incoherence conditions hold.
Now, we want to find out how reasonable the incoherence assumptions are.
In this section, we'll show that most random matrices are incoherent.
Lemma 2.1 shows the incoherence assumptions are reasonable.
Lemma 2.2 shows that a matrix from the random orthogonal model satisfies the incoherence conditions.
CS592 Exact Matrix Completion 2018. 4. 24 25 / 51
Incoherent Bases Span Incoherent Subspaces
Recall: Assumption 1.12
Assume that the uj's and vj's obey
max_{i,j} |⟨ei, uj⟩|² ≤ µB/n, max_{i,j} |⟨ei, vj⟩|² ≤ µB/n,
for some value of µB = O(1).
If Assumption 1.12 holds,
A0 holds with µ0 = µB (this is trivial).
A1 holds with µ1 = C µB √(log n) (this is not trivial).
A1 always holds with µ1 = µB √r, but we cannot assume that r = O(log n).
So we'll prove that A1 holds with µ1 = C µB √(log n).
CS592 Exact Matrix Completion 2018. 4. 24 26 / 51
Incoherent Bases Span Incoherent Subspaces
Lemma 2.1
If Assumption 1.12 holds, then A1 holds with µ1 = C µB √(log n) with high probability.
(Proof) Consider the matrix M = Σ_{k=1}^r εk uk vk∗, where {εk}1≤k≤r is an arbitrary sign sequence (so εk uk = ±uk).
From this setting, simply applying Hoeffding's inequality,
P(|Mij| ≥ λ µB √r / n) ≤ 2 e^{−λ²/2}.
Setting λ = √(2β log n) and applying the union bound over the n² entries,
P(max_{ij} |Mij| ≥ µB √(2β r log n) / n) ≤ 2 n² n^{−β} = 2 n^{2−β}.
Plug in β ← β + 2; then Lemma 2.1 holds.
CS592 Exact Matrix Completion 2018. 4. 24 27 / 51
Random Subspaces Span Incoherent Subspaces
Now we prove that the random orthogonal model obeys the two assumptions A0 and A1 (with appropriate values for the µ's) with large probability.
Lemma 2.3 is one of the results of [Laurent et al. 2000]. Using that result, we can see that Lemma 2.2 follows from Lemma 2.3.
Lemma 2.2
Set r̄ = max(r, log n). Then ∃ constants C, c such that the random orthogonal model obeys, with high probability:
1. µ(U) = (n/r) max_i ‖PU ei‖² ≤ C r̄/r = µ0,
2. (n/√r) ‖Σ_{1≤k≤r} uk vk∗‖_∞ ≤ C √((r̄/r) log n) = µ1.
µ0 = O(1), µ1 = O(log n).
As a result of Lemma 2.2, we can see that random subspaces span incoherent subspaces with high probability.
CS592 Exact Matrix Completion 2018. 4. 24 28 / 51
Proof Strategy
Problem 1.5: Convex relaxation of the matrix completion problem
minimize ‖X‖∗
subject to RΩ(X) = RΩ(M)
We will prove that the original matrix M is the unique solution to Problem 1.5.
What are the conditions for a matrix X to be the unique minimizer?
CS592 Exact Matrix Completion 2018. 4. 24 30 / 51
Subgradient of the Nuclear Norm
Convex optimization theory says:
X is a solution to (1.5) ⇔ ∃ λ ∈ R^{|Ω|} s.t. RΩ∗ λ ∈ ∂‖X‖∗
We know the nuclear norm of a matrix is the sum of its singular values.
We may derive the subgradient of the nuclear norm from this.
Preliminary: subgradient of a matrix norm (from [36])
∂‖A‖ = conv{UDV∗ : A = UΣV∗, d ∈ ∂φ(σ)}
where D, Σ are m × n diagonal matrices with d, σ as diagonal entries.
Since φ should be a symmetric gauge function, we put φ(σ) = ‖σ‖₁ = Σ|σi|.
CS592 Exact Matrix Completion 2018. 4. 24 31 / 51
Subgradient of the Nuclear Norm (cont.)
For a rank-r, n1 × n2 matrix A, there are r non-zero singular values. Then we know
∂‖σ‖₁ = {x ∈ R^{min(n1,n2)} : xi = 1 for i = 1..r, |xi| ≤ 1 otherwise}.
If we partition the singular vectors U, V as
U = [U^{(1)} | U^{(2)}], V = [V^{(1)} | V^{(2)}], where U^{(1)}, V^{(1)} have r columns,
we get the result
∂‖A‖∗ = {U^{(1)}V^{(1)∗} + U^{(2)}RV^{(2)∗} for all R ∈ R^{(n1−r)×(n2−r)}, ‖R‖₂ ≤ 1}
using the definition of the convex hull (all convex combinations of elements).
CS592 Exact Matrix Completion 2018. 4. 24 32 / 51
Subgradient of the Nuclear Norm (cont.)
∂‖A‖∗ = {U^{(1)}V^{(1)∗} + U^{(2)}RV^{(2)∗} for all R ∈ R^{(n1−r)×(n2−r)}, ‖R‖₂ ≤ 1}
was expressed in the paper as
Y = Σ_{1≤k≤r} uk vk∗ + W (3.4)
where W obeys two properties:
the column space of W is orthogonal to U ≡ span(u1, ..., ur), and the row space of W is orthogonal to V ≡ span(v1, ..., vr);
the spectral norm of W is less than or equal to 1.
This says we can decompose the space into two orthogonal subspaces, T and T⊥ (shown in the slide figure). We define PT, PT⊥ as the projections onto each space.
CS592 Exact Matrix Completion 2018. 4. 24 33 / 51
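For concreteness, a minimal sketch (ours) of the projections PT and PT⊥ using the standard closed forms PT(Z) = PU Z + Z PV − PU Z PV and PT⊥(Z) = (I − PU) Z (I − PV), where PU and PV are the orthogonal projectors onto the column and row spaces:

```python
import numpy as np

def make_projectors(U, V):
    """U: n1 x r, V: n2 x r orthonormal bases of the column/row spaces of M."""
    PU = U @ U.T                      # orthogonal projector onto span(u_1, ..., u_r)
    PV = V @ V.T                      # orthogonal projector onto span(v_1, ..., v_r)

    def P_T(Z):                       # projection onto the tangent space T
        return PU @ Z + Z @ PV - PU @ Z @ PV

    def P_T_perp(Z):                  # projection onto T-perp: (I - PU) Z (I - PV)
        return Z - P_T(Z)

    return P_T, P_T_perp
```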
Conditions for a Unique Minimizer
Lemma 3.1: conditions for a unique minimizer
Consider a matrix X0 = Σ_{k=1}^r σk uk vk∗ of rank r which is feasible for problem (1.5), and suppose that the following two conditions hold:
1. there exists a dual point λ such that Y = RΩ∗ λ obeys
PT(Y) = Σ_{k=1}^r uk vk∗, ‖PT⊥(Y)‖₂ < 1;
2. the sampling operator RΩ restricted to elements in T is injective.
Then X0 is the unique minimizer.
Making the spectral norm bound strict (< 1 instead of ≤ 1) and adding condition 2 gives you a unique minimizer! Now, constructing such a Y becomes important.
CS592 Exact Matrix Completion 2018. 4. 24 34 / 51
Construction of the Subgradient
Define PΩ(X)ij = Xij if (i, j) ∈ Ω, and 0 otherwise. Then set the matrix Y as the solution to the least squares problem
minimize ‖X‖F
subject to (PT PΩ)(X) = Σ_{k=1}^r uk vk∗.
This will make Y vanish on Ω^C and have a smaller ‖PT⊥(Y)‖₂ as well.
Statement 4.2: specification of Y
Denote AΩT(M) = PΩ PT(M). Then, if A∗ΩT AΩT = PT PΩ PT has full rank when restricted to T, the minimizer is given by
Y = AΩT (A∗ΩT AΩT)^{−1}(E), where E = Σ_{k=1}^r uk vk∗.
CS592 Exact Matrix Completion 2018. 4. 24 35 / 51
Injectivity
We study the injectivity of AΩT. For convenience, we use the Bernoulli model instead of uniform sampling: P(δij = 1) = p ≡ m/(n1 n2), Ω = {(i, j) : δij = 1}.
Theorem 4.1: small operator norm
Suppose Ω is sampled according to the Bernoulli model and put n = max(n1, n2). Suppose that the coherences obey max(µ(U), µ(V)) ≤ µ0. Then there is a numerical constant CR such that for all β > 1,
‖p^{−1} PT PΩ PT − PT‖₂ ≤ CR √(µ0 n r (β log n) / m)
with probability at least 1 − 3n^{−β}, provided that CR √(µ0 n r (β log n) / m) < 1.
CS592 Exact Matrix Completion 2018. 4. 24 36 / 51
Injectivity (cont.)
With m large enough so that CR √(µ0 (nr/m) log n) ≤ 1/2,
(p/2) ‖PT(X)‖F ≤ ‖(PT PΩ PT)(X)‖F ≤ (3p/2) ‖PT(X)‖F (4.11)
Corollary 4.3: injectivity of RΩ on T
Assume that CR √(µ0 n r (log n) / m) ≤ 1/2. With the same probability as in Theorem 4.1, we have
‖PΩ PT(X)‖F ≤ √(3p/2) ‖PT(X)‖F.
This provides the second condition of the unique minimizer.
CS592 Exact Matrix Completion 2018. 4. 24 37 / 51
Size of the Spectral Norm
We will investigate the probability that such a Y satisfies ‖PT⊥(Y)‖₂ < 1.
Denote H ≡ PT − p^{−1} PT PΩ PT. Then we can decompose PT⊥(Y) as
PT⊥(Y) = p^{−1} (PT⊥ PΩ PT)(E + H(E) + H²(E) + · · ·), where E = Σ_{1≤k≤r} uk vk∗.
Then Lemmas 4.4-4.8 give us upper bounds on each of these terms:
‖p^{−1} (PT⊥ PΩ PT) E‖₂
‖p^{−1} (PT⊥ PΩ PT) H(E)‖₂
‖p^{−1} (PT⊥ PΩ PT) H²(E)‖₂
· · ·
‖p^{−1} (PT⊥ PΩ PT) Σ_{k≥k0} H^k(E)‖₂
CS592 Exact Matrix Completion 2018. 4. 24 38 / 51
Size of the Spectral Norm (cont.)
If we set k0 = 3 in Lemma 4.8, we can bound the spectral norm as ‖PT⊥(Y)‖₂ < 1 with probability at least 1 − c n^{−β} provided
m ≥ C max(µ1², µ0^{1/2} µ1, µ0^{4/3} r^{1/3}, µ0 n^{1/4}) n r β log n
If µ0^{4/3} r^{1/3} is the maximum, it leads to the trivial case (the requirement is greater than n²).
Without this term, it is the first result of Theorem 1.3.
If we set k0 = 4 in Lemma 4.8, we can bound m as
m ≥ C max(µ0² r, µ0 n^{1/5}) n r β log n
which is the second result of Theorem 1.3.
CS592 Exact Matrix Completion 2018. 4. 24 39 / 51
Overall Structure of the Paper
CS592 Exact Matrix Completion 2018. 4. 24 40 / 51
Experiment Overview
1. Experiment on Theorem 1.3.
2. Experiment on Theorem 1.3 with the additional assumption that M is a positive semidefinite matrix.
3. Experiment on Fazel's original work.
Method for generating the matrix M (n × n, rank r):
Sample two n × r factors ML and MR with i.i.d. Gaussian entries.
Set M = ML MR∗.
Sample a subset Ω of m entries uniformly at random.
Determining exactness:
M is declared to be recovered if the solution Xopt returned by the SDP satisfies ‖Xopt − M‖F / ‖M‖F < 10^{−3}.
The same procedure is repeated 50 times for experiments 1 and 2, and 10 times for experiment 3.
CS592 Exact Matrix Completion 2018. 4. 24 42 / 51
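Putting the protocol together, a minimal sketch (ours) of one Monte Carlo trial; it reuses the sample_omega and complete_matrix helpers from the earlier sketches, which are our own illustrative names rather than the paper's:

```python
import numpy as np

def one_trial(n, r, m, seed=0):
    rng = np.random.default_rng(seed)
    # M = M_L M_R^* with i.i.d. Gaussian n x r factors.
    M = rng.standard_normal((n, r)) @ rng.standard_normal((n, r)).T
    # Sample m observed entries uniformly at random and build a 0/1 mask.
    rows, cols = sample_omega(n, n, m, rng)
    mask = np.zeros((n, n))
    mask[rows, cols] = 1.0
    # Solve the nuclear norm relaxation and apply the exactness criterion.
    X_opt = complete_matrix(M, mask)
    rel_err = np.linalg.norm(X_opt - M, "fro") / np.linalg.norm(M, "fro")
    return rel_err < 1e-3   # "recovered" as declared in the paper's experiments
```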
Experiment 1
Figure: success in white, failure in black; (a) n = 40, (b) n = 50.
The plots for the two values of n are very similar.
The results of this paper may be conservative.
CS592 Exact Matrix Completion 2018. 4. 24 43 / 51
Experiment 2
For the positive semidefinite case, the recovery region is much larger.
Future work is needed to investigate this.
CS592 Exact Matrix Completion 2018. 4. 24 44 / 51
Experiment 3
Checks the theoretical bound for the matrix sensing problem (Fazel's original problem).
CS592 Exact Matrix Completion 2018. 4. 24 45 / 51
Reproducing the Experiments
We've reproduced these experiments using Python, using the CVXOPT package for the SDP.
https://github.com/JoonyoungYi/exact-mc
We plotted the relation between n and m/n.
rank = 2, trials = 1; the exactness metric is the same as in the paper's experiments:
Xopt satisfies ‖Xopt − M‖F / ‖M‖F < 10^{−3}.
CS592 Exact Matrix Completion 2018. 4. 24 46 / 51
Improvements
The main result of this paper:
m = Ω(n^{1.25} r log n)
With the low-rank assumption (r ≤ n^{1/5}),
m = Ω(n^{1.2} r log n)
Can we find a tighter bound?
The authors argue that it would be hard as long as we follow this approach.
The Ω(n^{1.2}) bound is mainly determined by the terms (PT⊥ PΩ PT) H^k(E) for k = 1, 2, 3 in the series (4.13).
If we expand to k = 4, the bound becomes Ω(n^{7/6}).
If we expand to k = 5, the bound becomes Ω(n^{8/7}).
This can be done up to k of size about log n, but the size of the decoupling constant CD varies with k.
This is why stronger results are hard to obtain this way.
However, [Candès et al. 2010] showed that m = Ω(n r log n) suffices under additional assumptions.
CS592 Exact Matrix Completion 2018. 4. 24 48 / 51
Further directions
(CASE 1) Noise handling
The observation is not Mij but Yij:
Yij = Mij + zij, (i, j) ∈ Ω, where z is a deterministic or stochastic perturbation.
CS592 Exact Matrix Completion 2018. 4. 24 49 / 51
Further directions
(CASE 2) Low-rank matrix fitting
Let M = Σ_{1≤k≤n} σk uk vk∗ where σ1 ≥ σ2 ≥ · · · ≥ 0, and let the truncated SVD of M be Mr = Σ_{1≤k≤r} σk uk vk∗, where the sum extends over the r largest singular values. Here r is the number of singular values that are not negligible.
This is similar to the process of PCA.
Low-rank matrix fitting (or low-rank matrix approximation)
For M of general rank,
minimize ‖X‖∗
subject to ‖X − M‖∗ ≤ ‖X − Mr‖∗
It is possible to try to obtain a theoretical bound for the low-rank matrix fitting problem.
CS592 Exact Matrix Completion 2018. 4. 24 50 / 51
Further directions
(CASE 3) Slow SDP
This direction can be pursued after the previous cases.
The SDP is too slow, so the algorithm described in this paper is rarely used in practice.
In practice, the SVD-based (alternating minimization) method is widely used.
But it is really hard to prove theoretical bounds for the SVD-based method.
However, [Jain et al. 2012] showed that m = Ω(n r^{2.5} log n) suffices.
This bound is slightly higher than in the convex optimization case, but [Jain et al. 2012] argued that this bound is not tight.
CS592 Exact Matrix Completion 2018. 4. 24 51 / 51
  • 6. Low-rank Assumption Low-rank assumption For n1 × n2 matrix M of rank r, assume the min(n1, n2) r. Why this assumption is needed? CS592 Exact Matrix Completion 2018. 4. 24 5 / 51
  • 7. Low-rank Assumption Low-rank assumption For n1 × n2 matrix M of rank r, assume the min(n1, n2) r. Why this assumption is needed? For simplicity, think about n × n matrix M of rank r. It has (2n − r)r degrees of freedom. Degree of freedom is calculated by counting parameters in the SVD. (The number of singular values) + (degree of freedom of left singular vectors) + (degree of freedom of right singular vectors) = r + ((2n − r − 1) × r)/2 + ((2n − r − 1) × r)/2 Considerably smaller than n2 . With low-rank assumption, degree of freedom is reduced by about 2r/n. CS592 Exact Matrix Completion 2018. 4. 24 5 / 51
  • 8. Low-rank Assumption Low-rank assumption For n1 × n2 matrix M of rank r, assume the min(n1, n2) r. Why this assumption is needed? For simplicity, think about n × n matrix M of rank r. It has (2n − r)r degrees of freedom. Degree of freedom is calculated by counting parameters in the SVD. (The number of singular values) + (degree of freedom of left singular vectors) + (degree of freedom of right singular vectors) = r + ((2n − r − 1) × r)/2 + ((2n − r − 1) × r)/2 Considerably smaller than n2 . With low-rank assumption, degree of freedom is reduced by about 2r/n. This assumption is similar to sparsity assumption in the Lasso paper we’ve learned. CS592 Exact Matrix Completion 2018. 4. 24 5 / 51
  • 9. Which Matrices? Consider the rank-1 matrix M: M = e1e∗ n =      0 0 · · · 0 1 0 0 · · · 0 0 ... ... ... ... ... 0 0 · · · 0 0      Let m be the number of observed entries of M. CS592 Exact Matrix Completion 2018. 4. 24 6 / 51
  • 10. Which Matrices? Consider the rank-1 matrix M: M = e1e∗ n =      0 0 · · · 0 1 0 0 · · · 0 0 ... ... ... ... ... 0 0 · · · 0 0      Let m be the number of observed entries of M. Then, we can see only 0 with probability 1 − m/n2 . CS592 Exact Matrix Completion 2018. 4. 24 6 / 51
  • 11. Which Matrices? Consider the rank-1 matrix M: M = e1e∗ n =      0 0 · · · 0 1 0 0 · · · 0 0 ... ... ... ... ... 0 0 · · · 0 0      Let m be the number of observed entries of M. Then, we can see only 0 with probability 1 − m/n2 . If sample set doesn’t contain 1, we can not complete matrix exactly. CS592 Exact Matrix Completion 2018. 4. 24 6 / 51
  • 12. Which Matrices? Consider the rank-1 matrix M: M = e1e∗ n =      0 0 · · · 0 1 0 0 · · · 0 0 ... ... ... ... ... 0 0 · · · 0 0      Let m be the number of observed entries of M. Then, we can see only 0 with probability 1 − m/n2 . If sample set doesn’t contain 1, we can not complete matrix exactly. So, impossible to recover all low-rank matrices. CS592 Exact Matrix Completion 2018. 4. 24 6 / 51
  • 13. Which Matrices? Consider the SVD of a matrix M = Σ_{k=1}^r σ_k u_k v_k^*, where the u_k's and v_k's are the left and right singular vectors, and the σ_k's are the singular values. Random orthogonal model: the family {u_k}_{1≤k≤r} is selected uniformly at random among all families of r orthonormal vectors, and similarly for the family {v_k}_{1≤k≤r}. With this model, we can see, intuitively, that exceptions like the one above occur only with small probability. Formal proofs are provided by Lemmas 2.1 and 2.2. CS592 Exact Matrix Completion 2018. 4. 24 7 / 51
  • 14. Which Sampling Sets? Clearly, we cannot recover M_ij if the sampling set avoids the i-th row or the j-th column. So, we have to specify which sampling sets allow recovery. CS592 Exact Matrix Completion 2018. 4. 24 8 / 51
  • 15. Which Sampling Sets? Clearly, we cannot recover M_ij if the sampling set avoids the i-th row or the j-th column. So, we have to specify which sampling sets allow recovery. Definition of the sampling set Ω: if M_ij is observed, then (i, j) ∈ Ω. The cardinality of Ω is m. Uniform sampling assumption: the set Ω is sampled uniformly at random. CS592 Exact Matrix Completion 2018. 4. 24 8 / 51
  • 16. Which Sampling Sets? Clearly, we cannot recover M_ij if the sampling set avoids the i-th row or the j-th column. So, we have to specify which sampling sets allow recovery. Definition of the sampling set Ω: if M_ij is observed, then (i, j) ∈ Ω. The cardinality of Ω is m. Uniform sampling assumption: the set Ω is sampled uniformly at random. Additionally, this paper introduces the sampling operator R_Ω for convenience. Sampling operator R_Ω: let R_Ω : R^{n1×n2} → R^{|Ω|} be the sampling operator which extracts the observed entries, R_Ω(X) = (X_ij)_{(i,j)∈Ω}. CS592 Exact Matrix Completion 2018. 4. 24 8 / 51
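Note (ours, not part of the slides): the sampling model above is easy to make concrete in a few lines of NumPy. The function names sample_omega and R_omega are ours, chosen only for illustration.

```python
import numpy as np

def sample_omega(n1, n2, m, rng):
    """Pick m entry indices uniformly at random without replacement (the set Omega)."""
    flat = rng.choice(n1 * n2, size=m, replace=False)
    return np.stack(np.unravel_index(flat, (n1, n2)), axis=1)  # rows of (i, j) pairs

def R_omega(X, omega):
    """Sampling operator R_Omega: extract the observed entries of X in the order of omega."""
    return X[omega[:, 0], omega[:, 1]]

# toy usage
rng = np.random.default_rng(0)
M = np.arange(12, dtype=float).reshape(3, 4)
omega = sample_omega(3, 4, m=5, rng=rng)
print(R_omega(M, omega))  # the 5 observed entries
```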
  • 17. Which Algorithm? Problem 1.3: Formal Matrix Completion Problem minimize rank(X) subject to RΩ(X) = RΩ(M) CS592 Exact Matrix Completion 2018. 4. 24 9 / 51
  • 18. Which Algorithm? Problem 1.3: Formal Matrix Completion Problem: minimize rank(X) subject to R_Ω(X) = R_Ω(M). Unfortunately, this problem is NP-hard and non-convex. The rank function is not convex. All known algorithms require time exponential in n. CS592 Exact Matrix Completion 2018. 4. 24 9 / 51
  • 19. Which Algorithm? Problem 1.3: Formal Matrix Completion Problem: minimize rank(X) subject to R_Ω(X) = R_Ω(M). Unfortunately, this problem is NP-hard and non-convex. The rank function is not convex. All known algorithms require time exponential in n. Problem 1.5: Convex Relaxation of the Matrix Completion Problem: minimize ‖X‖_* subject to R_Ω(X) = R_Ω(M). CS592 Exact Matrix Completion 2018. 4. 24 9 / 51
  • 20. Which Algorithm? Problem 1.3: Formal Matrix Completion Problem: minimize rank(X) subject to R_Ω(X) = R_Ω(M). Unfortunately, this problem is NP-hard and non-convex. The rank function is not convex. All known algorithms require time exponential in n. Problem 1.5: Convex Relaxation of the Matrix Completion Problem: minimize ‖X‖_* subject to R_Ω(X) = R_Ω(M). The nuclear norm ‖X‖_* is a surrogate for the rank, and it is a convex function. This heuristic was introduced by [Fazel et al. 2002]. CS592 Exact Matrix Completion 2018. 4. 24 9 / 51
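Note (ours): for concreteness, Problem 1.5 can be handed to a generic convex solver in a few lines. This is a hedged sketch, not the paper's code: it assumes the cvxpy package (our reproduction used CVXOPT), and the helper name complete_nuclear_norm is ours.

```python
import numpy as np
import cvxpy as cp

def complete_nuclear_norm(M, mask):
    """Problem 1.5: minimize ||X||_* subject to X agreeing with M on the observed entries.

    mask is a 0/1 array marking the observed entries (the set Omega)."""
    X = cp.Variable(M.shape)
    constraints = [cp.multiply(mask, X) == cp.multiply(mask, M)]
    cp.Problem(cp.Minimize(cp.normNuc(X)), constraints).solve()
    return X.value
```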
  • 21. Is the Nuclear Norm Relaxation Reasonable? (Informal) We already know that the l1-norm is a surrogate for the l0-norm from the Lasso paper. CS592 Exact Matrix Completion 2018. 4. 24 10 / 51
  • 22. Is the Nuclear Norm Relaxation Reasonable? (Informal) We already know that the l1-norm is a surrogate for the l0-norm from the Lasso paper. Figure: http://public.lanl.gov/mewall/kluwer2002.html Let the vector σ = (σ1, σ2, · · · , σn), where σ_i is the i-th singular value of the matrix M. The rank function is the l0-norm of σ. The nuclear norm is the l1-norm of σ. CS592 Exact Matrix Completion 2018. 4. 24 10 / 51
  • 23. How to Solve Nuclear Norm Minimization? This method is also suggested by Fazel. Convex Relaxation to Matrix Completion Problem(Nuclear Norm Minimization) can be solved by Semi-definite Programming(SDP). CS592 Exact Matrix Completion 2018. 4. 24 11 / 51
  • 24. How to Solve Nuclear Norm Minimization? This method was also suggested by Fazel. The convex relaxation of the matrix completion problem (nuclear norm minimization) can be solved by semidefinite programming (SDP). Let the SVD of the matrix be X = UΣV^*. Define W1 = UΣU^*, W2 = V ΣV^*, and the block matrix X′ = [[W1, X], [X^*, W2]]. CS592 Exact Matrix Completion 2018. 4. 24 11 / 51
  • 25. How to Solve Nuclear Norm Minimization? This method was also suggested by Fazel. The convex relaxation of the matrix completion problem (nuclear norm minimization) can be solved by semidefinite programming (SDP). Let the SVD of the matrix be X = UΣV^*. Define W1 = UΣU^*, W2 = V ΣV^*, and the block matrix X′ = [[W1, X], [X^*, W2]]. Under the above settings, the following properties hold: ‖X‖_* = ‖W1‖_* = ‖W2‖_*; X′ ⪰ 0 (positive semidefinite), since X′ = [U; V] Σ [U; V]^* = ([U; V] Σ^{1/2})([U; V] Σ^{1/2})^* ⪰ 0 and AA^* ⪰ 0 for every matrix A; for every symmetric positive semidefinite matrix X, trace(X) = ‖X‖_*. https://chrischoy.github.io/research/matrix-norms/ CS592 Exact Matrix Completion 2018. 4. 24 11 / 51
  • 26. How to Solve Nuclear Norm Minimization? Therefore, Problem 1.5 can be reduced to the following. Reduced form of nuclear norm minimization: minimize trace(X′) subject to R_Ω(X) = R_Ω(M), X′ ⪰ 0, where X′ = [[W1, X], [X^*, W2]]. Note that trace(X′) = trace(W1) + trace(W2) = ‖X‖_* + ‖X‖_* = 2‖X‖_*. CS592 Exact Matrix Completion 2018. 4. 24 12 / 51
  • 27. How to Solve Nuclear Norm Minimization? Now, the reduced form can be solved as an SDP, and many algorithms for solving SDPs already exist. Semidefinite programming (SDP): minimize_{X∈S^n} ⟨C, X⟩ subject to ⟨A_k, X⟩ = b_k, k = 1, . . . , m, X ⪰ 0. Here S^n is the set of n × n symmetric matrices, ⟨X, Y⟩ = trace(X^* Y) is the element-wise (trace) inner product, and each A_k is all zeros except for a single entry equal to 1 at some (i, j) ∈ Ω. CS592 Exact Matrix Completion 2018. 4. 24 13 / 51
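Note (ours): the lift on the last few slides can be written directly as an SDP. The sketch below is not the paper's code and assumes cvxpy; it declares the block matrix X′ = [[W1, X], [X^*, W2]] as a single symmetric PSD variable and minimizes its trace.

```python
import numpy as np
import cvxpy as cp

def complete_via_sdp(M, mask):
    """Fazel's SDP form: minimize trace(W1) + trace(W2) over the PSD block matrix [[W1, X], [X^T, W2]]."""
    n1, n2 = M.shape
    Z = cp.Variable((n1 + n2, n1 + n2), symmetric=True)  # Z plays the role of X' = [[W1, X], [X^T, W2]]
    X = Z[:n1, n1:]                                      # the upper-right block is the completed matrix
    constraints = [Z >> 0,                               # positive semidefiniteness of the lift
                   cp.multiply(mask, X) == cp.multiply(mask, M)]
    cp.Problem(cp.Minimize(cp.trace(Z)), constraints).solve()
    return X.value
```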
  • 28. Table of Contents 1 Introduction 2 Results of Paper 3 Rationality of Incoherence Assumption 4 Proofs 5 Experiments 6 Discussion CS592 Exact Matrix Completion 2018. 4. 24 14 / 51
  • 29. Is the Nuclear Norm Relaxation Reasonable? (Formal) A first typical result: M is an n1 × n2 matrix of rank r, n = max(n1, n2). Theorem 1.1 (General) - M obeys the random orthogonal model. - With the uniformly random sampling assumption. Then, ∃ constants C, c such that if m ≥ C n^{5/4} r log n, the minimizer to Problem 1.5 is unique and equal to M with probability at least 1 − c n^{−3}. CS592 Exact Matrix Completion 2018. 4. 24 15 / 51
  • 30. Is the Nuclear Norm Relaxation Reasonable? (Formal) A first typical result: M is an n1 × n2 matrix of rank r, n = max(n1, n2). Theorem 1.1 (General) - M obeys the random orthogonal model. - With the uniformly random sampling assumption. Then, ∃ constants C, c such that if m ≥ C n^{5/4} r log n, the minimizer to Problem 1.5 is unique and equal to M with probability at least 1 − c n^{−3}. For the low-rank case, a tighter bound is given. Theorem 1.1 (Low-rank): if r ≤ n^{1/5}, the recovery is exact with the same probability provided that m ≥ C n^{6/5} r log n. CS592 Exact Matrix Completion 2018. 4. 24 15 / 51
  • 31. Is the Nuclear Norm Relaxation Reasonable? (Formal) Asymptotically, Theorem 1.1 (General): m = Ω(n^{1.25} r log n); Theorem 1.1 (Low-rank): m = Ω(n^{1.2} r log n). This bound is not tight (plug r ← n into the general case of Theorem 1.1). CS592 Exact Matrix Completion 2018. 4. 24 16 / 51
  • 32. Is the Nuclear Norm Relaxation Reasonable? (Formal) Asymptotically, Theorem 1.1 (General): m = Ω(n^{1.25} r log n); Theorem 1.1 (Low-rank): m = Ω(n^{1.2} r log n). This bound is not tight (plug r ← n into the general case of Theorem 1.1). This theorem tells us: with some conditions, we can recover the matrix exactly; Problems 1.3 and 1.5 are formally equivalent under these conditions; this justifies the nuclear norm relaxation. CS592 Exact Matrix Completion 2018. 4. 24 16 / 51
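Note (ours): to get a feel for how far these sample-complexity bounds are from the n^2 entries of the full matrix, here is a tiny back-of-the-envelope computation. The constant C in Theorem 1.1 is unspecified, so we set it to 1 and compare scalings only; the numbers are illustrative, not from the paper.

```python
import numpy as np

n, r = 10_000, 10                       # illustrative sizes; the constant C is taken to be 1
m_general = n**1.25 * r * np.log(n)     # Theorem 1.1, general case
m_lowrank = n**1.20 * r * np.log(n)     # Theorem 1.1, low-rank case (r <= n^(1/5))
dof = (2 * n - r) * r                   # degrees of freedom of a rank-r n x n matrix
print(m_general / n**2, m_lowrank / n**2, dof / n**2)   # fractions of all n^2 entries
```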
  • 33. Relaxation of the Random Orthogonal Model Recall why we introduced the random orthogonal model: M = e1 e_n^*, the matrix whose only nonzero entry is a 1 in position (1, n). CS592 Exact Matrix Completion 2018. 4. 24 17 / 51
  • 34. Relaxation of the Random Orthogonal Model Recall why we introduced the random orthogonal model: M = e1 e_n^*, the matrix whose only nonzero entry is a 1 in position (1, n). More generally, it is hard to recover M if the singular vectors of the matrix M are similar to the standard basis, because the information is highly concentrated in a specific region. Hence, the singular vectors need to be sufficiently spread. CS592 Exact Matrix Completion 2018. 4. 24 17 / 51
  • 35. Relaxation of the Random Orthogonal Model Recall why we introduced the random orthogonal model: M = e1 e_n^*, the matrix whose only nonzero entry is a 1 in position (1, n). More generally, it is hard to recover M if the singular vectors of the matrix M are similar to the standard basis, because the information is highly concentrated in a specific region. Hence, the singular vectors need to be sufficiently spread. The random orthogonal model is one of the models to tackle this problem. We want to relax this model assumption. Why? CS592 Exact Matrix Completion 2018. 4. 24 17 / 51
  • 36. Relaxation of the Random Orthogonal Model Recall why we introduced the random orthogonal model: M = e1 e_n^*, the matrix whose only nonzero entry is a 1 in position (1, n). More generally, it is hard to recover M if the singular vectors of the matrix M are similar to the standard basis, because the information is highly concentrated in a specific region. Hence, the singular vectors need to be sufficiently spread. The random orthogonal model is one of the models to tackle this problem. We want to relax this model assumption. Why? To get a more general theorem that covers a much larger set of matrices M. CS592 Exact Matrix Completion 2018. 4. 24 17 / 51
  • 37. Coherence Definition 1.2: Let U be a subspace of R^n of dimension r and P_U be the orthogonal projection onto U. Then the coherence of U is defined to be µ(U) ≡ (n/r) max_{1≤i≤n} ‖P_U e_i‖^2. CS592 Exact Matrix Completion 2018. 4. 24 18 / 51
  • 38. Coherence Definition 1.2: Let U be a subspace of R^n of dimension r and P_U be the orthogonal projection onto U. Then the coherence of U is defined to be µ(U) ≡ (n/r) max_{1≤i≤n} ‖P_U e_i‖^2. Examples. Incoherence example (minimum case): if U is spanned by vectors whose entries all have magnitude 1/√n, then µ(U) = 1. Coherence example (maximum case): if U contains a standard basis element, then µ(U) = n/r. The more similar U is to the standard basis, the larger µ(U); the farther U is from the standard basis, the smaller µ(U). CS592 Exact Matrix Completion 2018. 4. 24 18 / 51
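Note (ours): the two extreme cases above are easy to check numerically. In the sketch below, coherence() implements Definition 1.2 for a subspace given by an orthonormal basis, a Hadamard-type basis realizes the minimum µ(U) = 1, and a subspace containing standard basis vectors realizes the maximum n/r. The helper names are ours.

```python
import numpy as np

def coherence(U):
    """mu(U) = (n/r) * max_i ||P_U e_i||^2, where the columns of U are an orthonormal basis of the subspace."""
    n, r = U.shape
    return (n / r) * np.max(np.sum(U**2, axis=1))  # the i-th row norm of U equals ||P_U e_i||

n, r = 128, 4
H = np.array([[1.0]])
while H.shape[0] < n:                    # Sylvester construction of a Hadamard matrix (n a power of 2)
    H = np.block([[H, H], [H, -H]])
U_spread = H[:, :r] / np.sqrt(n)         # orthonormal columns, all entries of magnitude 1/sqrt(n)
U_spiky = np.eye(n)[:, :r]               # subspace spanned by standard basis vectors

print(coherence(U_spread))               # 1.0, the minimum
print(coherence(U_spiky))                # n / r = 32.0, the maximum
```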
  • 39. Coherence From Definition 1.2, we define two assumptions A0 and A1. Assumption A0: the coherences obey max(µ(U), µ(V)) ≤ µ0 for some positive µ0. Assumption A1: the n1 × n2 matrix Σ_{1≤k≤r} u_k v_k^* has maximum entry bounded by µ1 √(r/(n1 n2)) in absolute value, for some positive µ1. A1 always holds with µ1 = µ0 √r. CS592 Exact Matrix Completion 2018. 4. 24 19 / 51
  • 40. Coherence From Definition 1.2, we define two assumptions A0 and A1. Assumption A0: the coherences obey max(µ(U), µ(V)) ≤ µ0 for some positive µ0. Assumption A1: the n1 × n2 matrix Σ_{1≤k≤r} u_k v_k^* has maximum entry bounded by µ1 √(r/(n1 n2)) in absolute value, for some positive µ1. A1 always holds with µ1 = µ0 √r. This paper argues that A0 and A1 are more general than the random orthogonal model. Lemma 2.2 proves that the random orthogonal model is a special case of A0 and A1. CS592 Exact Matrix Completion 2018. 4. 24 19 / 51
  • 41. Coherence Let's investigate a quite natural assumption: the maximum entries of the left and right singular vectors are bounded. CS592 Exact Matrix Completion 2018. 4. 24 20 / 51
  • 42. Coherence Let's investigate a quite natural assumption: the maximum entries of the left and right singular vectors are bounded. Assumption 1.12: assume that the u_j's and v_j's obey max_{i,j} |⟨e_i, u_j⟩|^2 ≤ µ_B/n and max_{i,j} |⟨e_i, v_j⟩|^2 ≤ µ_B/n, for some value of µ_B = O(1). CS592 Exact Matrix Completion 2018. 4. 24 20 / 51
  • 43. Coherence Let's investigate a quite natural assumption: the maximum entries of the left and right singular vectors are bounded. Assumption 1.12: assume that the u_j's and v_j's obey max_{i,j} |⟨e_i, u_j⟩|^2 ≤ µ_B/n and max_{i,j} |⟨e_i, v_j⟩|^2 ≤ µ_B/n, for some value of µ_B = O(1). With this assumption, assumptions A0 and A1 hold: µ(U), µ(V) ≤ µ_B, and µ1 = O(√(log n)) (proved by Lemma 2.1). CS592 Exact Matrix Completion 2018. 4. 24 20 / 51
  • 44. Main Result With the incoherence conditions, Theorem 1.1 is relaxed to Theorem 1.3. Theorem 1.3 (General) - M obeys A0 and A1. - With the uniformly random sampling assumption. Then, ∃ constants C, c such that if m ≥ C max(µ1^2, µ0^{1/2} µ1, µ0 n^{1/4}) n r (β log n) for some β > 2, then the minimizer to Problem (1.5) is unique and equal to M with probability at least 1 − c n^{−β}. CS592 Exact Matrix Completion 2018. 4. 24 21 / 51
  • 45. Main Result With the incoherence conditions, Theorem 1.1 is relaxed to Theorem 1.3. Theorem 1.3 (General) - M obeys A0 and A1. - With the uniformly random sampling assumption. Then, ∃ constants C, c such that if m ≥ C max(µ1^2, µ0^{1/2} µ1, µ0 n^{1/4}) n r (β log n) for some β > 2, then the minimizer to Problem (1.5) is unique and equal to M with probability at least 1 − c n^{−β}. Theorem 1.3 (Low-rank): if r ≤ µ0^{−1} n^{1/5}, the recovery is exact with the same probability provided that m ≥ C µ0 n^{6/5} r (β log n). CS592 Exact Matrix Completion 2018. 4. 24 21 / 51
  • 46. Main Result With the incoherence conditions, Theorem 1.1 is relaxed to Theorem 1.3. Theorem 1.3 (General) - M obeys A0 and A1. - With the uniformly random sampling assumption. Then, ∃ constants C, c such that if m ≥ C max(µ1^2, µ0^{1/2} µ1, µ0 n^{1/4}) n r (β log n) for some β > 2, then the minimizer to Problem (1.5) is unique and equal to M with probability at least 1 − c n^{−β}. Theorem 1.3 (Low-rank): if r ≤ µ0^{−1} n^{1/5}, the recovery is exact with the same probability provided that m ≥ C µ0 n^{6/5} r (β log n). Asymptotically similar to Theorem 1.1. CS592 Exact Matrix Completion 2018. 4. 24 21 / 51
  • 47. Connections to Lasso Recall the Lasso paper. Compressive sampling (or compressed sensing, matrix sensing): Ax = b (A is a design matrix, b is an observation vector). In the Lasso paper, they add a sparsity assumption on x to tackle the problem that the problem dimension is much larger than the number of observations (n ≪ p). If x is k-sparse in the Fourier domain, it can be perfectly recovered with high probability by l1 minimization when m = Ω(k log n). CS592 Exact Matrix Completion 2018. 4. 24 22 / 51
  • 48. Connections to Lasso Recall the Lasso paper. Compressive sampling (or compressed sensing, matrix sensing): Ax = b (A is a design matrix, b is an observation vector). In the Lasso paper, they add a sparsity assumption on x to tackle the problem that the problem dimension is much larger than the number of observations (n ≪ p). If x is k-sparse in the Fourier domain, it can be perfectly recovered with high probability by l1 minimization when m = Ω(k log n). Also, this paper claims that it generalizes the notion of incoherence to problems beyond the setting of sparse signal recovery. CS592 Exact Matrix Completion 2018. 4. 24 22 / 51
  • 49. Connections to the Matrix Sensing Problem Fazel's original problem was solving the matrix sensing problem with a low-rank assumption (the nuclear norm heuristic) and other assumptions. CS592 Exact Matrix Completion 2018. 4. 24 23 / 51
  • 50. Connections to the Matrix Sensing Problem Fazel's original problem was solving the matrix sensing problem with a low-rank assumption (the nuclear norm heuristic) and other assumptions. Contributions of this paper compared to Fazel's original work: it extends Fazel's work to the matrix completion problem; it defines conditions (including incoherence) under which the matrix can be completed exactly; it provides a theoretical bound for the convex relaxation (nuclear norm minimization). CS592 Exact Matrix Completion 2018. 4. 24 23 / 51
  • 51. Table of Contents 1 Introduction 2 Results of Paper 3 Rationality of Incoherence Assumption 4 Proofs 5 Experiments 6 Discussion CS592 Exact Matrix Completion 2018. 4. 24 24 / 51
  • 52. Which Matrices are Incoherent? Theorem 1.3 says completing the matrix is possible if the incoherence conditions hold. Now, we want to find out how reasonable the incoherence assumptions are. CS592 Exact Matrix Completion 2018. 4. 24 25 / 51
  • 53. Which Matrices are Incoherent? Theorem 1.3 says completing the matrix is possible if the incoherence conditions hold. Now, we want to find out how reasonable the incoherence assumptions are. In this section, we'll show that most random matrices are incoherent. Lemma 2.1 shows the incoherence assumptions are reasonable. Lemma 2.2 shows a matrix from the random orthogonal model satisfies the incoherence conditions. CS592 Exact Matrix Completion 2018. 4. 24 25 / 51
  • 54. Incoherent Bases Span Incoherent Subspaces Recall Assumption 1.12: assume that the u_j's and v_j's obey max_{i,j} |⟨e_i, u_j⟩|^2 ≤ µ_B/n and max_{i,j} |⟨e_i, v_j⟩|^2 ≤ µ_B/n, for some value of µ_B = O(1). CS592 Exact Matrix Completion 2018. 4. 24 26 / 51
  • 55. Incoherent Bases Span Incoherent Subspaces Recall Assumption 1.12: assume that the u_j's and v_j's obey max_{i,j} |⟨e_i, u_j⟩|^2 ≤ µ_B/n and max_{i,j} |⟨e_i, v_j⟩|^2 ≤ µ_B/n, for some value of µ_B = O(1). If Assumption 1.12 holds: A0 holds with µ0 = µ_B (trivial); A1 holds with µ1 = C µ_B √(log n) (not trivial). A1 always holds with µ1 = µ_B √r, but we cannot assume that r = O(log n). So, we'll prove that A1 holds with µ1 = C µ_B √(log n). CS592 Exact Matrix Completion 2018. 4. 24 26 / 51
  • 56. Incoherent Bases Span Incoherent Subspaces Lemma 2.1: if Assumption 1.12 holds, A1 holds with µ1 = C µ_B √(log n) with high probability. CS592 Exact Matrix Completion 2018. 4. 24 27 / 51
  • 57. Incoherent Bases Span Incoherent Subspaces Lemma 2.1: if Assumption 1.12 holds, A1 holds with µ1 = C µ_B √(log n) with high probability. (Proof) Consider the matrix M = Σ_{k=1}^r ε_k u_k v_k^*, where {ε_k}_{1≤k≤r} is an arbitrary sign sequence and ε_k u_k = ±u_k. CS592 Exact Matrix Completion 2018. 4. 24 27 / 51
  • 58. Incoherent Bases Span Incoherent Subspaces Lemma 2.1: if Assumption 1.12 holds, A1 holds with µ1 = C µ_B √(log n) with high probability. (Proof) Consider the matrix M = Σ_{k=1}^r ε_k u_k v_k^*, where {ε_k}_{1≤k≤r} is an arbitrary sign sequence and ε_k u_k = ±u_k. From this setting, simply applying Hoeffding's inequality, P(|M_ij| ≥ λ µ_B √r / n) ≤ 2 e^{−λ²/2}. Setting λ = √(2β log n) and applying the union bound, P(max_{ij} |M_ij| ≥ µ_B √(2βr log n) / n) ≤ 2n² e^{−β} e^{−β log n} = 2n² n^{−β} e^{−β} ≤ 2n² n^{−β}. Plug β ← β + 2; then Lemma 2.1 holds. CS592 Exact Matrix Completion 2018. 4. 24 27 / 51
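Note (ours): the Hoeffding step above is easy to sanity-check by simulation for one fixed entry (i, j). The snippet below draws random signs ε_k, forms M_ij = Σ_k ε_k u_k(i) v_k(j) with each product bounded by µ_B/n, and compares the empirical tail probability with the bound 2 n^{−β}. All names and the chosen sizes are ours, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, mu_B, beta = 200, 10, 1.0, 3.0

a = rng.uniform(-1, 1, r) * mu_B / n              # fixed products u_k(i) v_k(j), each |a_k| <= mu_B / n
eps = rng.choice([-1.0, 1.0], size=(200_000, r))  # 200k independent draws of the sign sequence
samples = eps @ a                                 # realizations of M_ij

t = np.sqrt(2 * beta * np.log(n)) * mu_B * np.sqrt(r) / n   # threshold with lambda = sqrt(2 beta log n)
print((np.abs(samples) >= t).mean())              # empirical tail probability (essentially 0 here)
print(2 * n ** (-beta))                           # Hoeffding bound 2 exp(-lambda^2 / 2) = 2 n^(-beta)
```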
  • 59. Random Subspaces Span Incoherent Subspaces Now, we prove that the random orthogonal model obeys the two assumptions A0 and A1 (with appropriate values for the µ's) with large probability. CS592 Exact Matrix Completion 2018. 4. 24 28 / 51
  • 60. Random Subspaces Span Incoherent Subspaces Now, we prove that the random orthogonal model obeys the two assumptions A0 and A1 (with appropriate values for the µ's) with large probability. Lemma 2.3 is one of the results of [Laurent et al. 2000]. Using that result, we can see that Lemma 2.2 follows from Lemma 2.3. Lemma 2.2: set r̄ = max(r, log n). Then ∃ constants C, c such that the random orthogonal model obeys: 1. µ(U) = (n/r) max_i ‖P_U e_i‖^2 ≤ C r̄/r = µ0, 2. (n/√r) ‖Σ_{1≤k≤r} u_k v_k^*‖_∞ ≤ C √(log n · r̄/r) = µ1, with high probability. CS592 Exact Matrix Completion 2018. 4. 24 28 / 51
  • 61. Random Subspaces Span Incoherent Subspaces Now, we prove that the random orthogonal model obeys the two assumptions A0 and A1 (with appropriate values for the µ's) with large probability. Lemma 2.3 is one of the results of [Laurent et al. 2000]. Using that result, we can see that Lemma 2.2 follows from Lemma 2.3. Lemma 2.2: set r̄ = max(r, log n). Then ∃ constants C, c such that the random orthogonal model obeys: 1. µ(U) = (n/r) max_i ‖P_U e_i‖^2 ≤ C r̄/r = µ0, 2. (n/√r) ‖Σ_{1≤k≤r} u_k v_k^*‖_∞ ≤ C √(log n · r̄/r) = µ1, with high probability. So µ0 = O(1) and µ1 = O(log n). As a result of Lemma 2.2, we can see that random subspaces span incoherent subspaces with high probability. CS592 Exact Matrix Completion 2018. 4. 24 28 / 51
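Note (ours): Lemma 2.2 can also be checked empirically. In the sketch below, a random r-dimensional subspace is drawn by orthonormalizing a Gaussian matrix, which is one way to realize the random orthogonal model; both the coherence µ(U) and the rescaled sup-norm of Σ_k u_k v_k^* stay small as n grows. The helper names are ours.

```python
import numpy as np

rng = np.random.default_rng(1)

def coherence(U):
    n, r = U.shape
    return (n / r) * np.max(np.sum(U**2, axis=1))

r = 10
for n in (200, 1000, 5000):
    U, _ = np.linalg.qr(rng.standard_normal((n, r)))   # random r-dimensional subspace of R^n
    V, _ = np.linalg.qr(rng.standard_normal((n, r)))
    E = U @ V.T                                        # sum_k u_k v_k^*
    mu_U = coherence(U)                                # stays O(max(r, log n) / r), far below n / r
    mu_1 = np.abs(E).max() * n / np.sqrt(r)            # the mu_1 of assumption A1, grows only slowly with n
    print(n, round(mu_U, 2), round(mu_1, 2), n / r)
```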
  • 62. Table of Contents 1 Introduction 2 Results of Paper 3 Rationality of Incoherence Assumption 4 Proofs 5 Experiments 6 Discussion CS592 Exact Matrix Completion 2018. 4. 24 29 / 51
  • 63. Proof Strategy Problem 1.5: Convex relaxation of the matrix completion problem: minimize ‖X‖_* subject to R_Ω(X) = R_Ω(M). We will prove the original matrix M is the unique solution to Problem 1.5. What are the conditions for a matrix X to be the unique minimizer? CS592 Exact Matrix Completion 2018. 4. 24 30 / 51
  • 64. Subgradient of the Nuclear Norm Convex optimization theory says: X is a solution to (1.5) ⇔ ∃ λ ∈ R^{|Ω|} s.t. R_Ω^* λ ∈ ∂‖X‖_*. We know the nuclear norm of a matrix is the sum of its singular values. We may derive the subgradient of the nuclear norm from this. CS592 Exact Matrix Completion 2018. 4. 24 31 / 51
  • 65. Subgradient of the Nuclear Norm Convex optimization theory says: X is a solution to (1.5) ⇔ ∃ λ ∈ R^{|Ω|} s.t. R_Ω^* λ ∈ ∂‖X‖_*. We know the nuclear norm of a matrix is the sum of its singular values. We may derive the subgradient of the nuclear norm from this. Preliminary: subgradient of a matrix norm (from [36]): ∂‖A‖ = conv{U D V^* : A = UΣV^*, d ∈ ∂φ(σ)}, where D, Σ are m × n diagonal matrices with d, σ as diagonal entries. Since φ should be a symmetric gauge function, we put φ(σ) = ‖σ‖_1 = Σ_i |σ_i|. CS592 Exact Matrix Completion 2018. 4. 24 31 / 51
  • 66. Subgradient of the Nuclear Norm (cont.) For a rank-r, n1 × n2 matrix A, there are r non-zero singular values. Then, we know ∂‖σ‖_1 = {x ∈ R^{min(n1,n2)} | x_i = 1 for i = 1..r, |x_i| ≤ 1 otherwise}. CS592 Exact Matrix Completion 2018. 4. 24 32 / 51
  • 67. Subgradient of the Nuclear Norm (cont.) For a rank-r, n1 × n2 matrix A, there are r non-zero singular values. Then, we know ∂‖σ‖_1 = {x ∈ R^{min(n1,n2)} | x_i = 1 for i = 1..r, |x_i| ≤ 1 otherwise}. If we partition the singular vectors U, V as U = [U^{(1)} | U^{(2)}], V = [V^{(1)} | V^{(2)}], where U^{(1)}, V^{(1)} have r columns, we get the result ∂‖A‖_* = {U^{(1)} V^{(1)*} + U^{(2)} R V^{(2)*} for all R ∈ R^{(n1−r)×(n2−r)}, ‖R‖_2 ≤ 1}, using the definition of the convex hull (all convex combinations of elements). CS592 Exact Matrix Completion 2018. 4. 24 32 / 51
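Note (ours): this characterization of ∂‖A‖_* is easy to verify numerically. The sketch below builds one element Y = U^{(1)} V^{(1)*} + U^{(2)} R V^{(2)*} with ‖R‖_2 ≤ 1 and checks the subgradient inequality ‖B‖_* ≥ ‖A‖_* + ⟨Y, B − A⟩ on random matrices B; the sizes and names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, r = 8, 6, 2
A = rng.standard_normal((n1, r)) @ rng.standard_normal((r, n2))   # a rank-r matrix

U, s, Vt = np.linalg.svd(A)
U1, U2 = U[:, :r], U[:, r:]
V1, V2 = Vt[:r, :].T, Vt[r:, :].T

R = rng.standard_normal((n1 - r, n2 - r))
R /= max(1.0, np.linalg.norm(R, 2))                 # enforce spectral norm <= 1
Y = U1 @ V1.T + U2 @ R @ V2.T                       # a candidate subgradient of ||.||_* at A

# Subgradient inequality: ||B||_* >= ||A||_* + <Y, B - A> for every B (checked on random B's)
nuc = lambda X: np.linalg.norm(X, "nuc")
for _ in range(5):
    B = rng.standard_normal((n1, n2))
    assert nuc(B) >= nuc(A) + np.sum(Y * (B - A)) - 1e-8
print("subgradient inequality holds on the sampled B's")
```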
  • 68. Subgradient of the Nuclear Norm (cont.) ∂‖A‖_* = {U^{(1)} V^{(1)*} + U^{(2)} R V^{(2)*} for all R ∈ R^{(n1−r)×(n2−r)}, ‖R‖_2 ≤ 1} was expressed in the paper as Y = Σ_{1≤k≤r} u_k v_k^* + W (3.4), where W obeys two properties: the column space of W is orthogonal to U ≡ span(u_1, ..., u_r) and the row space of W is orthogonal to V ≡ span(v_1, ..., v_r); the spectral norm of W is less than or equal to 1. This says we can decompose the subgradient into two orthogonal spaces, T and T⊥. We define P_T, P_T⊥ as the projections onto each space. CS592 Exact Matrix Completion 2018. 4. 24 33 / 51
  • 69. Conditions for the Unique Minimizer Lemma 3.1 (conditions for the unique minimizer): Consider a matrix X0 = Σ_{k=1}^r σ_k u_k v_k^* of rank r which is feasible for the problem (1.5), and suppose that the following two conditions hold: 1. there exists a dual point λ such that Y = R_Ω^* λ obeys P_T(Y) = Σ_{k=1}^r u_k v_k^* and ‖P_T⊥(Y)‖_2 < 1; 2. the sampling operator R_Ω restricted to elements in T is injective. Then X0 is the unique minimizer. Removing the equality from the spectral norm bound and adding condition 2 gives you the unique minimizer! Now, constructing such a Y becomes important. CS592 Exact Matrix Completion 2018. 4. 24 34 / 51
  • 70. Construction of the Subgradient Define [P_Ω(X)]_ij = X_ij if (i, j) ∈ Ω, and 0 otherwise. Then set the matrix Y as the solution to the least squares problem: minimize ‖X‖_F subject to (P_T P_Ω)(X) = Σ_{k=1}^r u_k v_k^*. This will make Y vanish on Ω^C and have a smaller ‖P_T⊥(Y)‖_2 as well. Statement 4.2 (specification of Y): denote A_ΩT(M) = P_Ω P_T(M). Then if A_ΩT^* A_ΩT = P_T P_Ω P_T has full rank when restricted to T, the minimizer is given by Y = A_ΩT (A_ΩT^* A_ΩT)^{−1}(E), where E = Σ_{k=1}^r u_k v_k^*. CS592 Exact Matrix Completion 2018. 4. 24 35 / 51
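Note (ours): for small n, the certificate Y of Statement 4.2 can be computed explicitly by representing P_T and P_Ω as matrices acting on vec(X). The sketch below is only meant to make the construction concrete; the pseudo-inverse plays the role of (A_ΩT^* A_ΩT)^{−1} restricted to T, and all names and sizes are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, p = 20, 2, 0.6
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))   # a rank-r matrix
U, _, Vt = np.linalg.svd(M); U, V = U[:, :r], Vt[:r, :].T
E = U @ V.T                                                     # E = sum_k u_k v_k^*

PU, PV, I = U @ U.T, V @ V.T, np.eye(n)
# Column-major vec: P_T(X) = P_U X + X P_V - P_U X P_V  becomes the matrix below acting on vec(X)
PT = np.kron(I, PU) + np.kron(PV, I) - np.kron(PV, PU)
mask = rng.random((n, n)) < p                                   # Bernoulli sampling of Omega
POmega = np.diag(mask.flatten(order="F").astype(float))

A_OT = POmega @ PT                                              # A_{Omega,T} = P_Omega P_T
y = A_OT @ np.linalg.pinv(A_OT.T @ A_OT) @ E.flatten(order="F") # Y = A_OT (A_OT^* A_OT)^{-1} E
Y = y.reshape((n, n), order="F")

PT_Y = PU @ Y + Y @ PV - PU @ Y @ PV
print(np.abs(Y[~mask]).max())          # ~0: Y vanishes off Omega
print(np.abs(PT_Y - E).max())          # ~0: P_T(Y) = sum_k u_k v_k^*
print(np.linalg.norm(Y - PT_Y, 2))     # ||P_{T^perp}(Y)||_2, should be below 1 when m is large enough
```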
  • 71. Injectivity We study the injectivity of A_ΩT. For convenience, we use the Bernoulli model instead of uniform sampling: P(δ_ij = 1) = p ≡ m/(n1 n2), Ω = {(i, j) : δ_ij = 1}. Theorem 4.1 (small operator norm): Suppose Ω is sampled according to the Bernoulli model and put n = max(n1, n2). Suppose that the coherences obey max(µ(U), µ(V)) ≤ µ0. Then, there is a numerical constant C_R such that for all β > 1, p^{−1} ‖P_T P_Ω P_T − p P_T‖_2 ≤ C_R √(µ0 n r (β log n) / m) with probability at least 1 − 3n^{−β}, provided that C_R √(µ0 n r (β log n) / m) < 1. CS592 Exact Matrix Completion 2018. 4. 24 36 / 51
  • 72. Injectivity (cont.) With m large enough so that C_R √(µ0 (nr/m) log n) ≤ 1/2, (p/2) ‖P_T(X)‖_F ≤ ‖(P_T P_Ω P_T)(X)‖_F ≤ (3p/2) ‖P_T(X)‖_F (4.11). Corollary 4.3 (injectivity of R_Ω on T): Assume that C_R √(µ0 n r (log n)/m) ≤ 1/2. With the same probability as in Theorem 4.1, we have ‖P_Ω P_T(X)‖_F^2 ≤ (3p/2) ‖P_T(X)‖_F^2. This provides the second condition of the unique minimizer. CS592 Exact Matrix Completion 2018. 4. 24 37 / 51
  • 73. Size of the Spectral Norm We will investigate the probability that such a Y satisfies ‖P_T⊥(Y)‖_2 < 1. Denote H ≡ P_T − p^{−1} P_T P_Ω P_T; then we can decompose P_T⊥(Y) as P_T⊥(Y) = p^{−1}(P_T⊥ P_Ω P_T)(E + H(E) + H^2(E) + · · · ), with E = Σ_{1≤k≤r} u_k v_k^*. Then Lemmas 4.4-4.8 will give us upper bounds on these terms: ‖p^{−1}(P_T⊥ P_Ω P_T) E‖_2, ‖p^{−1}(P_T⊥ P_Ω P_T) H^2(E)‖_2, ‖p^{−1}(P_T⊥ P_Ω P_T) H^3(E)‖_2, · · · , ‖p^{−1}(P_T⊥ P_Ω P_T) Σ_{k≥k0} H^k(E)‖_2. CS592 Exact Matrix Completion 2018. 4. 24 38 / 51
  • 74. Size of the Spectral Norm (cont.) If we set k0 = 3 in Lemma 4.8, we can bound the spectral norm ‖P_T⊥(Y)‖_2 < 1 with probability at least 1 − c n^{−β} provided m ≥ C max(µ1^2, µ0^{1/2} µ1, µ0^{4/3} r^{1/3}, µ0 n^{1/4}) n r β log n. If µ0^{4/3} r^{1/3} is the maximum, it leads to a trivial case (greater than n^2). Without this term, it is the first result of Theorem 1.3. If we set k0 = 4 in Lemma 4.8, we can bound m by m ≥ C max(µ0^2 r, µ0 n^{1/5}) n r β log n, which is the second result of Theorem 1.3. CS592 Exact Matrix Completion 2018. 4. 24 39 / 51
  • 75. Overall Structure of the Paper CS592 Exact Matrix Completion 2018. 4. 24 40 / 51
  • 76. Table of Contents 1 Introduction 2 Results of Paper 3 Rationality of Incoherence Assumption 4 Proofs 5 Experiments 6 Discussion CS592 Exact Matrix Completion 2018. 4. 24 41 / 51
  • 77. Experiment Overview 1. Experiment for Theorem 1.3. 2. Experiment for Theorem 1.3 with the assumption that M is a positive semidefinite matrix. 3. Experiment for Fazel's work. CS592 Exact Matrix Completion 2018. 4. 24 42 / 51
  • 78. Experiment Overview 1. Experiment for Theorem 1.3. 2. Experiment for Theorem 1.3 with the assumption that M is a positive semidefinite matrix. 3. Experiment for Fazel's work. Method for generating the matrix M (n × n, rank r): sample two n × r factors M_L and M_R with i.i.d. Gaussian entries; set M = M_L M_R^*; sample a subset Ω of m entries uniformly at random. CS592 Exact Matrix Completion 2018. 4. 24 42 / 51
  • 79. Experiment Overview 1. Experiment for Theorem 1.3. 2. Experiment for Theorem 1.3 with the assumption that M is a positive semidefinite matrix. 3. Experiment for Fazel's work. Method for generating the matrix M (n × n, rank r): sample two n × r factors M_L and M_R with i.i.d. Gaussian entries; set M = M_L M_R^*; sample a subset Ω of m entries uniformly at random. Determining exactness: M is declared recovered if the solution X_opt returned by the SDP satisfies ‖X_opt − M‖_F / ‖M‖_F < 10^{−3}. CS592 Exact Matrix Completion 2018. 4. 24 42 / 51
  • 80. Experiment Overview 1. Experiment for Theorem 1.3. 2. Experiment for Theorem 1.3 with the assumption that M is a positive semidefinite matrix. 3. Experiment for Fazel's work. Method for generating the matrix M (n × n, rank r): sample two n × r factors M_L and M_R with i.i.d. Gaussian entries; set M = M_L M_R^*; sample a subset Ω of m entries uniformly at random. Determining exactness: M is declared recovered if the solution X_opt returned by the SDP satisfies ‖X_opt − M‖_F / ‖M‖_F < 10^{−3}. The same procedure was repeated 50 times for experiments 1 and 2, and 10 times for experiment 3. CS592 Exact Matrix Completion 2018. 4. 24 42 / 51
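Note (ours): a single trial of this experimental procedure looks roughly as follows. This is a hedged sketch of our reproduction, not the exact code in the repository below: it uses cvxpy's nuclear-norm atom rather than the CVXOPT SDP, and the helper name trial is ours.

```python
import numpy as np
import cvxpy as cp

def trial(n, r, m, rng):
    """One recovery trial: sample M = M_L M_R^*, observe m random entries, solve (1.5), test exactness."""
    M_L = rng.standard_normal((n, r))
    M_R = rng.standard_normal((n, r))
    M = M_L @ M_R.T

    mask = np.zeros((n, n))
    mask.flat[rng.choice(n * n, size=m, replace=False)] = 1.0   # Omega, sampled uniformly at random

    X = cp.Variable((n, n))
    cp.Problem(cp.Minimize(cp.normNuc(X)),
               [cp.multiply(mask, X) == cp.multiply(mask, M)]).solve()
    return np.linalg.norm(X.value - M, "fro") / np.linalg.norm(M, "fro") < 1e-3

rng = np.random.default_rng(0)
print(trial(n=30, r=2, m=500, rng=rng))   # True when m is comfortably above the recovery threshold
```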
  • 81. Experiment 1 Success: white, failure: black. (a): n = 40, (b): n = 50. The two experiments give very similar plots for different n. The results of this paper may be conservative. CS592 Exact Matrix Completion 2018. 4. 24 43 / 51
  • 82. Experiment 2 For the positive semidefinite case, the recovery region is much larger. Future work is needed to investigate this. CS592 Exact Matrix Completion 2018. 4. 24 44 / 51
  • 83. Experiment 3 To check the theoretical bound for the matrix sensing problem (Fazel's original problem). CS592 Exact Matrix Completion 2018. 4. 24 45 / 51
  • 84. Reproducing the Experiments We've reproduced these experiments using Python, using the CVXOPT package to solve the SDP. https://github.com/JoonyoungYi/exact-mc. CS592 Exact Matrix Completion 2018. 4. 24 46 / 51
  • 85. Reproducing the Experiments We've reproduced these experiments using Python, using the CVXOPT package to solve the SDP. https://github.com/JoonyoungYi/exact-mc. We plotted the relation between n and m/n, with rank = 2 and one trial per point. The exactness metric is the same as in the experiments of this paper: X_opt satisfies ‖X_opt − M‖_F / ‖M‖_F < 10^{−3}. CS592 Exact Matrix Completion 2018. 4. 24 46 / 51
  • 86. Table of Contents 1 Introduction 2 Results of Paper 3 Rationality of Incoherence Assumption 4 Proofs 5 Experiments 6 Discussion CS592 Exact Matrix Completion 2018. 4. 24 47 / 51
  • 87. Improvements The main result of this paper: m = Ω(n^{1.25} r log n); with the low-rank assumption (r ≤ n^{1/5}), m = Ω(n^{1.2} r log n). CS592 Exact Matrix Completion 2018. 4. 24 48 / 51
  • 88. Improvements The main result of this paper: m = Ω(n^{1.25} r log n); with the low-rank assumption (r ≤ n^{1/5}), m = Ω(n^{1.2} r log n). Can we find a tighter bound? The authors argued that it would be hard with this proof approach. The bound Ω(n^{1.2}) is mainly determined by (P_T⊥ P_Ω P_T) H^k(E) for k = 1, 2, 3 in the series (4.13). If we expand k up to 4, the bound becomes Ω(n^{7/6}); if we expand k up to 5, the bound becomes Ω(n^{8/7}). This can be done up to k of size about log n, but the size of the decoupling constant C_D varies with k. This is why stronger results are hard to obtain this way. CS592 Exact Matrix Completion 2018. 4. 24 48 / 51
  • 89. Improvements The main result of this paper: m = Ω(n^{1.25} r log n); with the low-rank assumption (r ≤ n^{1/5}), m = Ω(n^{1.2} r log n). Can we find a tighter bound? The authors argued that it would be hard with this proof approach. The bound Ω(n^{1.2}) is mainly determined by (P_T⊥ P_Ω P_T) H^k(E) for k = 1, 2, 3 in the series (4.13). If we expand k up to 4, the bound becomes Ω(n^{7/6}); if we expand k up to 5, the bound becomes Ω(n^{8/7}). This can be done up to k of size about log n, but the size of the decoupling constant C_D varies with k. This is why stronger results are hard to obtain this way. However, [Candès et al. 2010] showed that m = Ω(n r log n) suffices with additional assumptions. CS592 Exact Matrix Completion 2018. 4. 24 48 / 51
  • 90. Further Directions (CASE 1) Noise handling: the observation is not M_ij but Y_ij, where Y_ij = M_ij + z_ij, (i, j) ∈ Ω, and z is a deterministic or stochastic perturbation. CS592 Exact Matrix Completion 2018. 4. 24 49 / 51
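Note (ours): one standard way to formulate the noisy variant is to ask only that the observed entries be fit up to the noise level. The sketch below is not the formulation studied in this paper; it assumes cvxpy, and the function name and the user-supplied bound delta on the size of R_Ω(z) are ours.

```python
import numpy as np
import cvxpy as cp

def complete_noisy(Y_obs, mask, delta):
    """Noisy completion sketch: minimize ||X||_* subject to fitting the observed entries within delta."""
    X = cp.Variable(Y_obs.shape)
    data_fit = cp.norm(cp.multiply(mask, X - Y_obs), "fro") <= delta
    cp.Problem(cp.Minimize(cp.normNuc(X)), [data_fit]).solve()
    return X.value
```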
  • 91. Further Directions (CASE 2) Low-rank Matrix Fitting Let M = Σ_{1≤k≤n} σ_k u_k v_k^*, where σ1 ≥ σ2 ≥ · · · ≥ 0, and let the truncated SVD of the matrix M be M_r = Σ_{1≤k≤r} σ_k u_k v_k^*, where the sum extends over the r largest singular values. Here r is the number of singular values that we cannot neglect. This is similar to the process of PCA. Low-rank matrix fitting (or low-rank matrix approximation): for a general-rank M, minimize ‖X‖_* subject to ‖X − M‖_* ≤ ‖X − M_r‖_*. It is possible to try to derive a theoretical bound for the low-rank matrix fitting problem. CS592 Exact Matrix Completion 2018. 4. 24 50 / 51
  • 92. Further Directions (CASE 3) Slow SDP This direction can be pursued after (CASE 2). SDP is too slow, so the algorithm described in this paper is rarely used in practice. CS592 Exact Matrix Completion 2018. 4. 24 51 / 51
  • 93. Further Directions (CASE 3) Slow SDP This direction can be pursued after (CASE 2). SDP is too slow, so the algorithm described in this paper is rarely used in practice. In practice, SVD-based (alternating minimization) methods are widely used, but it is much harder to prove theoretical bounds for them. However, [Jain et al. 2012] showed that m = Ω(n r^{2.5} log n). This bound is slightly higher than in the convex optimization case, but [Jain et al. 2012] argued that it is not tight. CS592 Exact Matrix Completion 2018. 4. 24 51 / 51
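Note (ours): for reference, the practical alternative mentioned above is alternating minimization: fix one factor and solve a small least-squares problem for the other, restricted to the observed entries. The sketch below is a generic alternating least squares loop, not the specific algorithm analyzed by [Jain et al. 2012]; all names are ours.

```python
import numpy as np

def altmin_complete(M_obs, mask, r, iters=50, seed=0):
    """Alternating least squares for matrix completion. mask is a boolean array of observed entries."""
    rng = np.random.default_rng(seed)
    n1, n2 = M_obs.shape
    U = rng.standard_normal((n1, r))
    V = rng.standard_normal((n2, r))
    for _ in range(iters):
        # Fix U, solve a small least-squares problem for each column of the matrix
        for j in range(n2):
            rows = mask[:, j]
            if rows.any():
                V[j] = np.linalg.lstsq(U[rows], M_obs[rows, j], rcond=None)[0]
        # Fix V, solve for each row of the matrix
        for i in range(n1):
            cols = mask[i, :]
            if cols.any():
                U[i] = np.linalg.lstsq(V[cols], M_obs[i, cols], rcond=None)[0]
    return U @ V.T
```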