Dictionary Learning for
Massive Matrix Factorization
Arthur Mensch, Julien Mairal
Gaël Varoquaux, Bertrand Thirion
Inria Parietal, Inria Thoth
October 6, 2016
Introduction
Why am I here ?
Inria Parietal: machine learning for neuro-imaging
(fMRI data)
Matrix factorization: major ingredient in fMRI analysis
Very large datasets (2 TB): we designed faster algorithms
These algorithms can be used in collaborative filtering
[Figure: X (voxels × time) = D (k spatial maps) × A (time)]
Work presented at ICML 2016
Arthur Mensch Dictionary Learning for Massive Matrix Factorization 1 / 28
Outline
1 Matrix factorization for recommender systems
Collaborative filtering
Matrix factorization formulation
Existing methods
2 Subsampled online dictionary learning
Dictionary learning – existing methods
Handling missing values efficiently
New algorithm
3 Results
Setting
Benchmarks
Parameter setting
Collaborative filtering
Collaborative platform
n users rate a fraction of
p items
e.g. movies, restaurants
Estimate ratings for
recommendation
Use the ratings of other users for recommendation
How to predict ratings ?
Credit: [Bell and Koren, 2007]
Joe likes We Were Soldiers and Black Hawk Down.
Bob and Alice like the same films, and also like Saving Private Ryan.
Joe should watch Saving Private Ryan, because all three of them like war films.
Need to uncover topics in items
Predicting ratings with scalar products
Embeddings to model the existence of genres/categories/topics
Representative vectors for users and items:
$(\alpha_j)_{1 \le j \le n},\ (d_i)_{1 \le i \le p} \in \mathbb{R}^k$
q-th coefficient of $d_i$, $\alpha_j$ = affinity with the "topic" q
Ratings $x_{ij}$ (item i, user j):
$x_{ij} = d_i^\top \alpha_j$ (+ biases)
= Common affinity for topics
[Figure: rating $x_{ij}$ as the scalar product of $d_i$ and $\alpha_j$ over k topics]
Learning problem: estimate D and A with known ratings
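A minimal sketch of the prediction rule above, assuming item vectors are stored as rows of D and user vectors as columns of A. The sizes, the random factors and the helper name `predict_rating` are illustrative, not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, k = 5, 4, 3            # items, users, topics (illustrative sizes)

D = rng.normal(size=(p, k))  # row i is the item vector d_i
A = rng.normal(size=(k, n))  # column j is the user vector alpha_j


def predict_rating(D, A, i, j):
    """Predicted rating of item i by user j: scalar product d_i . alpha_j."""
    return float(D[i] @ A[:, j])


X_hat = D @ A                # all predicted ratings at once, shape (p, n)
```

Biases are left out here; they would simply be added to the scalar product.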
Matrix factorization
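[Figure: masked matrix $1_\Omega X$ (p × n) ≈ D (p × k) times A (k × n)]
$X \in \mathbb{R}^{p \times n} \approx DA, \quad D \in \mathbb{R}^{p \times k},\ A \in \mathbb{R}^{k \times n}$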
Constraints / penalty on factors D and A
We only observe $1_\Omega X$, where $\Omega$ is the set of ratings provided by users
Recommender systems: millions of users, millions of items
How to scale matrix factorization to very large datasets?
Formalism
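Finding representations in $\mathbb{R}^k$ for items and users:
$$\min_{D \in \mathbb{R}^{p \times k},\, A \in \mathbb{R}^{k \times n}} \sum_{(i,j) \in \Omega} (x_{ij} - d_i^\top \alpha_j)^2 + \lambda \left( \|D\|_F^2 + \|A\|_F^2 \right)$$
$$= \|1_\Omega (X - DA)\|_2^2 + \lambda \left( \|D\|_F^2 + \|A\|_F^2 \right)$$
$1_\Omega$: restriction to the set of known ratings
$\ell_2$ reconstruction loss, $\ell_2$ penalty for generalization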
Existing methods
Alternating minimization
Stochastic gradient descent
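To make the objective concrete, here is a small sketch that evaluates it for given factors. It assumes the known ratings are described by a boolean mask of the same shape as X. The function name `masked_objective` is hypothetical.

```python
import numpy as np


def masked_objective(X, mask, D, A, lam):
    """Squared error on observed entries plus Frobenius penalties.

    X    : (p, n) ratings, arbitrary values where mask is False
    mask : (p, n) boolean, True where the rating (i, j) is in Omega
    """
    # Entries outside Omega contribute nothing to the loss.
    residual = np.where(mask, X - D @ A, 0.0)
    return np.sum(residual ** 2) + lam * (np.sum(D ** 2) + np.sum(A ** 2))
```

When X equals DA exactly on the observed entries, only the penalty term remains.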
Existing methods
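Alternating minimization
Minimize over A, D: alternate between
$$D = \operatorname*{argmin}_{D \in \mathbb{R}^{p \times k}} \sum_{(i,j) \in \Omega} (x_{ij} - d_i^\top \alpha_j)^2 + \lambda \|D\|_F^2$$
$$A = \operatorname*{argmin}_{A \in \mathbb{R}^{k \times n}} \sum_{(i,j) \in \Omega} (x_{ij} - d_i^\top \alpha_j)^2 + \lambda \|A\|_F^2$$
No optimization hyperparameters (no step size)
Slow and memory expensive: uses all ratings at each iteration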
a.k.a. coordinate descent (variation in parameter update order)
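The two alternating updates are ridge regressions with closed-form solutions, one per user and one per item. The following is a minimal dense sketch under that reading, not the solver benchmarked in the talk. The function name `als`, the boolean `mask` and the default values are assumptions made for illustration.

```python
import numpy as np


def als(X, mask, k, lam=0.1, n_iter=20, seed=0):
    """Alternate exact minimization over A (users) and D (items)."""
    rng = np.random.default_rng(seed)
    p, n = X.shape
    D = rng.normal(scale=0.1, size=(p, k))
    A = rng.normal(scale=0.1, size=(k, n))
    reg = lam * np.eye(k)
    for _ in range(n_iter):
        # Update each user vector alpha_j from the items that user rated.
        for j in range(n):
            rows = mask[:, j]
            Dj = D[rows]                       # (s_j, k)
            A[:, j] = np.linalg.solve(Dj.T @ Dj + reg, Dj.T @ X[rows, j])
        # Update each item vector d_i from the users who rated it.
        for i in range(p):
            cols = mask[i]
            Ai = A[:, cols]                    # (k, s_i)
            D[i] = np.linalg.solve(Ai @ Ai.T + reg, Ai @ X[i, cols])
    return D, A
```

Every sweep reads all known ratings, which is the cost pointed out above.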
Existing methods
Stochastic gradient descent
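$$\min_{A, D} \sum_{(i,j) \in \Omega} f_{ij}(A, D), \qquad f_{ij}(A, D) \overset{\text{def}}{=} (x_{ij} - d_i^\top \alpha_j)^2 + \frac{1}{c_j} \lambda \|\alpha_j\|_2^2 + \frac{1}{c_i} \lambda \|d_i\|_2^2$$
Gradient step for each rating:
$$(A_t, D_t) \leftarrow (A_{t-1}, D_{t-1}) - \frac{1}{c_t} \nabla_{(A,D)} f_{ij}(A_{t-1}, D_{t-1})$$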
Fast and memory efficient – won the Netflix prize
Very sensitive to step sizes (ct) – need to cross-validate
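A sketch of one SGD epoch for this objective. Two assumptions are made here: $c_i$ and $c_j$ are read as the number of ratings of item i and user j (so that the per-rating penalties sum to the Frobenius penalties), and a constant `step` stands in for $1/c_t$. The function name `sgd_epoch` is hypothetical.

```python
import numpy as np


def sgd_epoch(ratings, D, A, lam, step, c_item, c_user, rng):
    """One pass over the known ratings, in random order.

    ratings        : list of (i, j, x_ij) triplets
    c_item, c_user : ratings per item / per user (assumed meaning of c_i, c_j)
    """
    for idx in rng.permutation(len(ratings)):
        i, j, x = ratings[idx]
        d, a = D[i].copy(), A[:, j].copy()
        err = x - d @ a
        # Gradient of f_ij with respect to d_i and alpha_j.
        grad_d = -2 * err * a + 2 * lam / c_item[i] * d
        grad_a = -2 * err * d + 2 * lam / c_user[j] * a
        D[i] = d - step * grad_d
        A[:, j] = a - step * grad_a
    return D, A
```

Each update touches a single rating, so memory use stays small, but the result depends strongly on `step`.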
Towards a new algorithm
Best of both worlds ?
Fast and memory efficient algorithm
Not very sensitive to hyperparameter settings
Subsampled online dictionary learning
Builds upon the online dictionary learning algorithm
popular in computer vision and interpretable learning (fMRI)
Adapt it to handle missing values efficiently
Dictionary learning
Recall: recommender system formalism
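Non-masked matrix factorization with $\ell_2$ penalty:
$$\min_{D \in \mathbb{R}^{p \times k},\, A \in \mathbb{R}^{k \times n}} \sum_{j=1}^{n} \|x_j - D \alpha_j\|_2^2 + \lambda \left( \|D\|_F^2 + \|A\|_F^2 \right)$$
Penalties can be richer, and made into constraints
Dictionary learning
Learn the left side factor [Olshausen and Field, 1997]
$$\min_{D \in \mathcal{C}} \sum_{j=1}^{n} \|x_j - D \alpha_j\|_2^2 + \lambda \Omega(\alpha_j), \qquad \alpha_j = \operatorname*{argmin}_{\alpha \in \mathbb{R}^k} \|x_j - D \alpha\|_2^2 + \lambda \Omega(\alpha)$$
Naive approach: alternating minimization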
Online dictionary learning [Mairal et al., 2010]
At iteration t, select $x_t$ in $\{x_j\}_j$ (user ratings), improve D
Single iteration complexity ∝ sample dimension: O(p)
$(D_t)_t$ converges in a few epochs (one for large n)
[Figure: samples $x_t$ streamed from X (p × n) and factorized as $D \alpha_t$, with D of size p × k]
Very efficient in computer vision / networks / fMRI / hyperspectral images
Can we use it efficiently for recommender systems ?
In short: Handling missing values
Handle large n: batch → online
Handle missing values: online → online + partial
[Figure: matrix X (p × n) streamed one sample $x_t$ at a time, then one masked sample $M_t x_t$ at a time; legend: stream, ignore, unknown, unaccessed]
Leverage streaming + partial access to samples
In detail: online dictionary learning
Objective function involves latent codes (right side factor)
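$$\min_{D \in \mathcal{C}} \frac{1}{t} \sum_{i=1}^{t} \|x_i - D \alpha_i^*(D)\|_2^2, \qquad \alpha_i^*(D) = \operatorname*{argmin}_{\alpha} \frac{1}{2} \|x_i - D \alpha\|_2^2 + \lambda \Omega(\alpha)$$
Replace latent codes by codes computed with old dictionaries
Build an upper-bounding surrogate function
$$\min_{D \in \mathcal{C}} \frac{1}{t} \sum_{i=1}^{t} \|x_i - D \alpha_i\|_2^2, \qquad \alpha_i = \operatorname*{argmin}_{\alpha} \frac{1}{2} \|x_i - D_{i-1} \alpha\|_2^2 + \lambda \Omega(\alpha)$$
Minimize the surrogate: it can be updated online at low cost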
In detail: online dictionary learning
Algorithm outline
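1 Compute code: involves $x_t$, so complexity depends on p
$$\alpha_t = \operatorname*{argmin}_{\alpha \in \mathbb{R}^k} \|x_t - D_{t-1} \alpha\|_2^2 + \lambda \Omega(\alpha)$$
2 Update the surrogate function: complexity in O(p)
$$g_t(D) = \frac{1}{t} \sum_{i=1}^{t} \frac{1}{2} \|x_i - D \alpha_i\|_2^2 = \operatorname{Tr}\left( \frac{1}{2} D^\top D A_t - D^\top B_t \right) + \text{const}$$
$$A_t = \left(1 - \frac{1}{t}\right) A_{t-1} + \frac{1}{t} \alpha_t \alpha_t^\top, \qquad B_t = \left(1 - \frac{1}{t}\right) B_{t-1} + \frac{1}{t} x_t \alpha_t^\top$$
3 Minimize surrogate: complexity in O(p)
$$D_t = \operatorname*{argmin}_{D \in \mathcal{C}} g_t(D), \qquad \nabla g_t(D) = D A_t - B_t$$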
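The three steps can be sketched as a single pass over fully observed samples. This is an illustration under stated assumptions rather than the reference implementation: the penalty is taken to be $\Omega(\alpha) = \|\alpha\|_2^2$ so that the code has a closed form, the constraint set $\mathcal{C}$ is taken to be columns of $\ell_2$ norm at most 1, and the surrogate is minimized by one pass of block coordinate descent over the columns of D. The function name `online_dictionary_learning` is hypothetical.

```python
import numpy as np


def online_dictionary_learning(X, k, lam=0.1, seed=0):
    """One pass over the columns of X (p, n); returns D of shape (p, k)."""
    rng = np.random.default_rng(seed)
    p, n = X.shape
    D = rng.normal(size=(p, k))
    D /= np.maximum(np.linalg.norm(D, axis=0), 1.0)   # columns into the unit ball
    A = np.zeros((k, k))     # running average of alpha alpha^T
    B = np.zeros((p, k))     # running average of x alpha^T
    for t, j in enumerate(rng.permutation(n), start=1):
        x = X[:, j]
        # 1. Code computation (ridge penalty assumed, closed form).
        alpha = np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ x)
        # 2. Surrogate update with weights 1/t.
        A = (1 - 1 / t) * A + np.outer(alpha, alpha) / t
        B = (1 - 1 / t) * B + np.outer(x, alpha) / t
        # 3. Surrogate minimization: one pass of block coordinate descent
        #    over the columns of D, each followed by a projection.
        for q in range(k):
            if A[q, q] > 1e-12:
                D[:, q] -= (D @ A[:, q] - B[:, q]) / A[q, q]
                D[:, q] /= max(np.linalg.norm(D[:, q]), 1.0)
    return D
```

Every step reads the full sample x of dimension p, which is the cost the masked variant aims to remove.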
Specification for a new algorithm
[Figure: matrix (p × n) streamed by columns; masked sample $M_t x_t$; unrated entries ignored]
Constrained: use only known ratings from Ω
Efficient: single iteration in O(s), where s is the number of ratings provided by user t
Principled: follows the online matrix factorization algorithm as much as possible
Missing values in practice
Data stream: $(x_t)_t$ → masked $(M_t x_t)_t$ = ratings from user t
Dimension: p (all items) → s (rated items)
Use only $M_t x_t$ in algorithm computation → complexity in O(s)
[Figure: matrix (p × n) streamed by columns; masked sample $M_t x_t$; unrated entries ignored]
Adaptation to make
Modify all parts of the algorithm to obtain O(s) complexity:
1 Code computation
2 Surrogate update
3 Surrogate minimization
Subsampled online dictionary learning
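Check out the paper!
Original online MF
1 Code computation
$$\alpha_t = \operatorname*{argmin}_{\alpha \in \mathbb{R}^k} \|x_t - D_{t-1} \alpha\|_2^2 + \lambda \Omega(\alpha)$$
2 Surrogate aggregation
$$A_t = \frac{1}{t} \sum_{i=1}^{t} \alpha_i \alpha_i^\top, \qquad B_t = B_{t-1} + \frac{1}{t} \left( x_t \alpha_t^\top - B_{t-1} \right)$$
3 Surrogate minimization
$$D^j \leftarrow p^\perp_{\mathcal{C}^r_j} \left( D^j - \frac{1}{(A_t)_{j,j}} \left( D A_t^j - B_t^j \right) \right)$$
Our algorithm
1 Code computation: masked loss
$$\alpha_t = \operatorname*{argmin}_{\alpha \in \mathbb{R}^k} \|M_t (x_t - D_{t-1} \alpha)\|_2^2 + \lambda \frac{\operatorname{rk} M_t}{p} \Omega(\alpha)$$
2 Surrogate aggregation
$$A_t = \frac{1}{t} \sum_{i=1}^{t} \alpha_i \alpha_i^\top, \qquad B_t = B_{t-1} + \frac{1}{\sum_{i=1}^{t} M_i} \left( M_t x_t \alpha_t^\top - M_t B_{t-1} \right)$$
3 Surrogate minimization
$$M_t D^j \leftarrow p^\perp_{\mathcal{C}_j} \left( M_t D^j - \frac{1}{(A_t)_{j,j}} M_t \left( D A_t^j - B_t^j \right) \right)$$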
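A rough sketch of one iteration of the masked variant, restricted to the s rows rated by user t. It follows the three steps above under simplifying assumptions: the penalty is taken to be $\Omega(\alpha) = \|\alpha\|_2^2$, the mask $M_t$ is represented by an index array `rows`, the sum $\sum_i M_i$ by a per-row counter `counts`, and the projection $p^\perp$ is left out. The function name `sodl_step` is hypothetical; see the paper for the actual algorithm.

```python
import numpy as np


def sodl_step(x_obs, rows, D, A, B, counts, t, lam=0.1):
    """One iteration touching only the s observed rows of D and B.

    x_obs  : (s,) ratings given by user t
    rows   : (s,) distinct integer indices of the rated items (support of M_t)
    counts : (p,) number of times each row has been observed so far
    """
    p, k = D.shape
    s = len(rows)
    Ds = D[rows]                                   # M_t D, shape (s, k)
    # 1. Masked code computation, penalty scaled by rk(M_t) / p = s / p.
    alpha = np.linalg.solve(Ds.T @ Ds + lam * s / p * np.eye(k), Ds.T @ x_obs)
    # 2. Surrogate aggregation: A as a running average, B on observed rows only.
    A += (np.outer(alpha, alpha) - A) / t
    counts[rows] += 1
    B[rows] += (np.outer(x_obs, alpha) - B[rows]) / counts[rows][:, None]
    # 3. Partial surrogate minimization on the observed rows (no projection).
    for q in range(k):
        if A[q, q] > 1e-12:
            Ds[:, q] -= (Ds @ A[:, q] - B[rows, q]) / A[q, q]
    D[rows] = Ds
    return alpha
```

All array operations involve s rows and k columns, so the cost of an iteration does not depend on p.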
Experiments
Validation: test RMSE (rating prediction) vs CPU time
Baseline: coordinate descent solver [Yu et al., 2012] for
$$\min_{D \in \mathbb{R}^{p \times k},\, A \in \mathbb{R}^{k \times n}} \sum_{(i,j) \in \Omega} (x_{ij} - d_i^\top \alpha_j)^2 + \lambda \left( \|D\|_F^2 + \|A\|_F^2 \right)$$
Fastest solver available apart from SGD, which has hyperparameters to tune
Our method also has a learning rate, but it has little influence
Datasets: Movielens, Netflix
Publicly available
Larger ones exist in industry...
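For reference, the test RMSE used for validation can be computed from held-out ratings as follows. This is a small sketch assuming the held-out ratings come as (item, user, rating) triplets. The function name `rmse_on_test_set` is hypothetical.

```python
import numpy as np


def rmse_on_test_set(D, A, test_ratings):
    """Root mean squared error on held-out (i, j, x_ij) triplets."""
    i, j, x = (np.asarray(col) for col in zip(*test_ratings))
    pred = np.sum(D[i] * A[:, j].T, axis=1)    # d_i . alpha_j for each triplet
    return float(np.sqrt(np.mean((x - pred) ** 2)))
```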
Results
Scalable algorithm: speed-up improves with size
Performance
Dataset      Test RMSE         Convergence time    Speed-up
             CD      SODL      CD        SODL
ML 1M        0.872   0.866     6 s       8 s       ×0.75
ML 10M       0.802   0.799     223 s     60 s      ×3.7
NF (140M)    0.938   0.934     1714 s    256 s     ×6.8
Outperforms coordinate descent beyond 10M ratings
Same prediction performance
Speed-up 6.8× on Netflix
Simple model: RMSE is not state-of-the-art
Robustness to learning rate
Learning rate in algorithm to be set in [0.75, 1] (← theory)
In practice: Just set it in [0.8, 1]
[Figure: RMSE on test set vs. epoch for learning rates β between 0.75 and 1.00; panels: MovieLens 10M and Netflix]
Conclusion
Take-home message
Online matrix factorization can be adapted
to handle missing values efficiently, with very
good performance in recommender systems
[Figure: matrix (p × n) streamed by columns; masked sample $M_t x_t$; unrated entries ignored]
Algorithm usable in any rich model involving matrix factorization
Python package http://github.com/arthurmensch/modl
Article/slides at http://amensch.fr/publications
Questions ?
Appendix: Resting-state fMRI
Online dictionary learning: 235 h run time, 1 full epoch
Online dictionary learning: 10 h run time, 1/24 epoch
Proposed method: 10 h run time, 1/2 epoch, reduction r = 12
Qualitatively, usable maps are obtained 10× faster
Bibliography I
[Bell and Koren, 2007] Bell, R. M. and Koren, Y. (2007).
Lessons from the Netflix prize challenge.
ACM SIGKDD Explorations Newsletter, 9(2):75–79.
[Mairal et al., 2010] Mairal, J., Bach, F., Ponce, J., and Sapiro, G. (2010).
Online learning for matrix factorization and sparse coding.
The Journal of Machine Learning Research, 11:19–60.
[Olshausen and Field, 1997] Olshausen, B. A. and Field, D. J. (1997).
Sparse coding with an overcomplete basis set: A strategy employed by V1?
Vision Research, 37(23):3311–3325.
[Yu et al., 2012] Yu, H.-F., Hsieh, C.-J., and Dhillon, I. (2012).
Scalable coordinate descent approaches to parallel matrix factorization for
recommender systems.
In Proceedings of the International Conference on Data Mining, pages
765–774. IEEE.
Arthur Mensch Dictionary Learning for Massive Matrix Factorization 28 / 28

More Related Content

What's hot

Max Entropy
Max EntropyMax Entropy
Max Entropyjianingy
 
Machine learning
Machine learningMachine learning
Machine learningShreyas G S
 
Safe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement LearningSafe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement Learningmooopan
 
Additive model and boosting tree
Additive model and boosting treeAdditive model and boosting tree
Additive model and boosting treeDong Guo
 
MS CS - Selecting Machine Learning Algorithm
MS CS - Selecting Machine Learning AlgorithmMS CS - Selecting Machine Learning Algorithm
MS CS - Selecting Machine Learning AlgorithmKaniska Mandal
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Toru Fujino
 
Learning to Reconstruct
Learning to ReconstructLearning to Reconstruct
Learning to ReconstructJonas Adler
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learningAkshay Kanchan
 
Interaction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsInteraction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsKen Kuroki
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelstuxette
 
Random Matrix Theory in Array Signal Processing: Application Examples
Random Matrix Theory in Array Signal Processing: Application ExamplesRandom Matrix Theory in Array Signal Processing: Application Examples
Random Matrix Theory in Array Signal Processing: Application ExamplesFörderverein Technische Fakultät
 
Beginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix FactorizationBeginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix FactorizationBenjamin Bengfort
 
Backpropagation (DLAI D3L1 2017 UPC Deep Learning for Artificial Intelligence)
Backpropagation (DLAI D3L1 2017 UPC Deep Learning for Artificial Intelligence)Backpropagation (DLAI D3L1 2017 UPC Deep Learning for Artificial Intelligence)
Backpropagation (DLAI D3L1 2017 UPC Deep Learning for Artificial Intelligence)Universitat Politècnica de Catalunya
 
Accelerating Random Forests in Scikit-Learn
Accelerating Random Forests in Scikit-LearnAccelerating Random Forests in Scikit-Learn
Accelerating Random Forests in Scikit-LearnGilles Louppe
 
Uncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game TheoryUncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game TheoryRikiya Takahashi
 

What's hot (18)

Max Entropy
Max EntropyMax Entropy
Max Entropy
 
Machine learning
Machine learningMachine learning
Machine learning
 
Safe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement LearningSafe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement Learning
 
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
 
Additive model and boosting tree
Additive model and boosting treeAdditive model and boosting tree
Additive model and boosting tree
 
MS CS - Selecting Machine Learning Algorithm
MS CS - Selecting Machine Learning AlgorithmMS CS - Selecting Machine Learning Algorithm
MS CS - Selecting Machine Learning Algorithm
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)
 
Learning to Reconstruct
Learning to ReconstructLearning to Reconstruct
Learning to Reconstruct
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learning
 
Interaction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsInteraction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and Physics
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction models
 
Machine learning
Machine learningMachine learning
Machine learning
 
Random Matrix Theory in Array Signal Processing: Application Examples
Random Matrix Theory in Array Signal Processing: Application ExamplesRandom Matrix Theory in Array Signal Processing: Application Examples
Random Matrix Theory in Array Signal Processing: Application Examples
 
Beginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix FactorizationBeginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix Factorization
 
Estimating Space-Time Covariance from Finite Sample Sets
Estimating Space-Time Covariance from Finite Sample SetsEstimating Space-Time Covariance from Finite Sample Sets
Estimating Space-Time Covariance from Finite Sample Sets
 
Backpropagation (DLAI D3L1 2017 UPC Deep Learning for Artificial Intelligence)
Backpropagation (DLAI D3L1 2017 UPC Deep Learning for Artificial Intelligence)Backpropagation (DLAI D3L1 2017 UPC Deep Learning for Artificial Intelligence)
Backpropagation (DLAI D3L1 2017 UPC Deep Learning for Artificial Intelligence)
 
Accelerating Random Forests in Scikit-Learn
Accelerating Random Forests in Scikit-LearnAccelerating Random Forests in Scikit-Learn
Accelerating Random Forests in Scikit-Learn
 
Uncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game TheoryUncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game Theory
 

Viewers also liked

CONTENT2VEC: a Joint Architecture to use Product Image and Text for the task ...
CONTENT2VEC: a Joint Architecture to use Product Image and Text for the task ...CONTENT2VEC: a Joint Architecture to use Product Image and Text for the task ...
CONTENT2VEC: a Joint Architecture to use Product Image and Text for the task ...recsysfr
 
Predictive quality metrics @ tinyclues - Artem Kozhevnikov - Tinyclues
Predictive quality metrics @ tinyclues - Artem Kozhevnikov - TinycluesPredictive quality metrics @ tinyclues - Artem Kozhevnikov - Tinyclues
Predictive quality metrics @ tinyclues - Artem Kozhevnikov - Tinycluesrecsysfr
 
Sequential Learning in the Position-Based Model
Sequential Learning in the Position-Based ModelSequential Learning in the Position-Based Model
Sequential Learning in the Position-Based Modelrecsysfr
 
Recommendation @ Meetic
Recommendation @ MeeticRecommendation @ Meetic
Recommendation @ Meeticrecsysfr
 
What can bring library metadata to the web? Trust, links and love
What can bring library metadata to the web? Trust, links and loveWhat can bring library metadata to the web? Trust, links and love
What can bring library metadata to the web? Trust, links and loverecsysfr
 
Meta-Prod2Vec: Simple Product Embeddings with Side-Information
Meta-Prod2Vec: Simple Product Embeddings with Side-InformationMeta-Prod2Vec: Simple Product Embeddings with Side-Information
Meta-Prod2Vec: Simple Product Embeddings with Side-Informationrecsysfr
 
RecsysFR: Criteo presentation
RecsysFR: Criteo presentationRecsysFR: Criteo presentation
RecsysFR: Criteo presentationrecsysfr
 
Injecting semantic links into a graph-based recommender system
Injecting semantic links into a graph-based recommender systemInjecting semantic links into a graph-based recommender system
Injecting semantic links into a graph-based recommender systemrecsysfr
 
Pulpix - Video Recommendation at Scale
Pulpix - Video Recommendation at ScalePulpix - Video Recommendation at Scale
Pulpix - Video Recommendation at Scalerecsysfr
 
Highlights on most interesting RecSys papers - Elena Smirnova, Lowik Chanusso...
Highlights on most interesting RecSys papers - Elena Smirnova, Lowik Chanusso...Highlights on most interesting RecSys papers - Elena Smirnova, Lowik Chanusso...
Highlights on most interesting RecSys papers - Elena Smirnova, Lowik Chanusso...recsysfr
 

Viewers also liked (10)

CONTENT2VEC: a Joint Architecture to use Product Image and Text for the task ...
CONTENT2VEC: a Joint Architecture to use Product Image and Text for the task ...CONTENT2VEC: a Joint Architecture to use Product Image and Text for the task ...
CONTENT2VEC: a Joint Architecture to use Product Image and Text for the task ...
 
Predictive quality metrics @ tinyclues - Artem Kozhevnikov - Tinyclues
Predictive quality metrics @ tinyclues - Artem Kozhevnikov - TinycluesPredictive quality metrics @ tinyclues - Artem Kozhevnikov - Tinyclues
Predictive quality metrics @ tinyclues - Artem Kozhevnikov - Tinyclues
 
Sequential Learning in the Position-Based Model
Sequential Learning in the Position-Based ModelSequential Learning in the Position-Based Model
Sequential Learning in the Position-Based Model
 
Recommendation @ Meetic
Recommendation @ MeeticRecommendation @ Meetic
Recommendation @ Meetic
 
What can bring library metadata to the web? Trust, links and love
What can bring library metadata to the web? Trust, links and loveWhat can bring library metadata to the web? Trust, links and love
What can bring library metadata to the web? Trust, links and love
 
Meta-Prod2Vec: Simple Product Embeddings with Side-Information
Meta-Prod2Vec: Simple Product Embeddings with Side-InformationMeta-Prod2Vec: Simple Product Embeddings with Side-Information
Meta-Prod2Vec: Simple Product Embeddings with Side-Information
 
RecsysFR: Criteo presentation
RecsysFR: Criteo presentationRecsysFR: Criteo presentation
RecsysFR: Criteo presentation
 
Injecting semantic links into a graph-based recommender system
Injecting semantic links into a graph-based recommender systemInjecting semantic links into a graph-based recommender system
Injecting semantic links into a graph-based recommender system
 
Pulpix - Video Recommendation at Scale
Pulpix - Video Recommendation at ScalePulpix - Video Recommendation at Scale
Pulpix - Video Recommendation at Scale
 
Highlights on most interesting RecSys papers - Elena Smirnova, Lowik Chanusso...
Highlights on most interesting RecSys papers - Elena Smirnova, Lowik Chanusso...Highlights on most interesting RecSys papers - Elena Smirnova, Lowik Chanusso...
Highlights on most interesting RecSys papers - Elena Smirnova, Lowik Chanusso...
 

Similar to Dictionary Learning for Massive Matrix Factorization

Dictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationDictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationArthur Mensch
 
Matrix Factorizations for Recommender Systems
Matrix Factorizations for Recommender SystemsMatrix Factorizations for Recommender Systems
Matrix Factorizations for Recommender SystemsDmitriy Selivanov
 
Talk icml
Talk icmlTalk icml
Talk icmlBo Li
 
Jörg Stelzer
Jörg StelzerJörg Stelzer
Jörg Stelzerbutest
 
Oleksandr Frei and Murat Apishev - Parallel Non-blocking Deterministic Algori...
Oleksandr Frei and Murat Apishev - Parallel Non-blocking Deterministic Algori...Oleksandr Frei and Murat Apishev - Parallel Non-blocking Deterministic Algori...
Oleksandr Frei and Murat Apishev - Parallel Non-blocking Deterministic Algori...AIST
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for SearchBhaskar Mitra
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for SearchBhaskar Mitra
 
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...Yuko Kuroki (黒木祐子)
 
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAI
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAIDeep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAI
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAIJack Clark
 
Simple representations for learning: factorizations and similarities
Simple representations for learning: factorizations and similarities Simple representations for learning: factorizations and similarities
Simple representations for learning: factorizations and similarities Gael Varoquaux
 
Machine learning in science and industry — day 1
Machine learning in science and industry — day 1Machine learning in science and industry — day 1
Machine learning in science and industry — day 1arogozhnikov
 
SURF 2012 Final Report(1)
SURF 2012 Final Report(1)SURF 2012 Final Report(1)
SURF 2012 Final Report(1)Eric Zhang
 
ENBIS 2018 presentation on Deep k-Means
ENBIS 2018 presentation on Deep k-MeansENBIS 2018 presentation on Deep k-Means
ENBIS 2018 presentation on Deep k-Meanstthonet
 
slides-defense-jie
slides-defense-jieslides-defense-jie
slides-defense-jiejie ren
 
Introduction to Big Data Science
Introduction to Big Data ScienceIntroduction to Big Data Science
Introduction to Big Data ScienceAlbert Bifet
 
Automatic Task-based Code Generation for High Performance DSEL
Automatic Task-based Code Generation for High Performance DSELAutomatic Task-based Code Generation for High Performance DSEL
Automatic Task-based Code Generation for High Performance DSELJoel Falcou
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
SFSCON23 - Emily Bourne Yaman Güçlü - Pyccel write Python code, get Fortran ...
SFSCON23 - Emily Bourne Yaman Güçlü - Pyccel  write Python code, get Fortran ...SFSCON23 - Emily Bourne Yaman Güçlü - Pyccel  write Python code, get Fortran ...
SFSCON23 - Emily Bourne Yaman Güçlü - Pyccel write Python code, get Fortran ...South Tyrol Free Software Conference
 

Similar to Dictionary Learning for Massive Matrix Factorization (20)

Dictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationDictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix Factorization
 
Matrix Factorizations for Recommender Systems
Matrix Factorizations for Recommender SystemsMatrix Factorizations for Recommender Systems
Matrix Factorizations for Recommender Systems
 
Talk icml
Talk icmlTalk icml
Talk icml
 
Jörg Stelzer
Jörg StelzerJörg Stelzer
Jörg Stelzer
 
Oleksandr Frei and Murat Apishev - Parallel Non-blocking Deterministic Algori...
Oleksandr Frei and Murat Apishev - Parallel Non-blocking Deterministic Algori...Oleksandr Frei and Murat Apishev - Parallel Non-blocking Deterministic Algori...
Oleksandr Frei and Murat Apishev - Parallel Non-blocking Deterministic Algori...
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for Search
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for Search
 
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
 
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAI





Dictionary Learning for Massive Matrix Factorization

  • 1. Dictionary Learning for Massive Matrix Factorization. Arthur Mensch, Julien Mairal, Gaël Varoquaux, Bertrand Thirion. Inria Parietal, Inria Thoth. October 6, 2016.
  • 2. Introduction. Why am I here? Inria Parietal: machine learning for neuro-imaging (fMRI data). Matrix factorization: major ingredient in fMRI analysis. Very large datasets (2 TB): we designed faster algorithms. These algorithms can be used in collaborative filtering.
      [Figure: data matrix X (voxels × time) = D (k spatial maps) × A (time).]
      Work presented at ICML 2016. Arthur Mensch Dictionary Learning for Massive Matrix Factorization 1 / 28
  • 3. Outline
      1. Matrix factorization for recommender systems: collaborative filtering; matrix factorization formulation; existing methods.
      2. Subsampled online dictionary learning: dictionary learning, existing methods; handling missing values efficiently; new algorithm.
      3. Results: setting; benchmarks; parameter setting.
  • 5. Collaborative filtering. Collaborative platform: n users rate a fraction of p items (e.g. movies, restaurants). Estimate ratings for recommendation. Use the ratings of other users for recommendation.
  • 6. How to predict ratings? Credit: [Bell and Koren, 2007]. Joe likes We Were Soldiers and Black Hawk Down. Bob and Alice like the same films, and also like Saving Private Ryan. Joe should watch Saving Private Ryan, because all of them like war films. Need to uncover topics in items.
  • 9. Predicting ratings with scalar products. Embeddings model the existence of genres/categories/topics.
      Representative vectors for users and items: $(\alpha_j)_{1 \le j \le n}$, $(d_i)_{1 \le i \le p} \in \mathbb{R}^k$.
      The q-th coefficient of $d_i$, $\alpha_j$ = affinity with the "topic" q.
      Ratings $x_{ij}$ (item i, user j): $x_{ij} = d_i^\top \alpha_j$ (+ biases) = common affinity for topics.
      [Figure: rating $x_{ij}$ as the scalar product of $d_i$ and $\alpha_j$ over k topics.]
      Learning problem: estimate D and A from known ratings.
  • 10. Matrix factorization.
      [Figure: masked matrix $1_\Omega \odot X$ (p × n) ≈ D (p × k) × A (k × n).]
      $X \in \mathbb{R}^{p \times n} \approx DA \in \mathbb{R}^{p \times k} \times \mathbb{R}^{k \times n}$
      Constraints / penalty on factors D and A. We only observe $1_\Omega \odot X$, where Ω is the set of ratings provided by users.
      Recommender systems: millions of users, millions of items. How to scale matrix factorization to very large datasets?
  • 11. Formalism. Finding a representation in $\mathbb{R}^k$ for items and users:
      $\min_{D \in \mathbb{R}^{p \times k},\, A \in \mathbb{R}^{k \times n}} \sum_{(i,j) \in \Omega} (x_{ij} - d_i^\top \alpha_j)^2 + \lambda (\|D\|_F^2 + \|A\|_F^2) = \|1_\Omega \odot (X - DA)\|_2^2 + \lambda (\|D\|_F^2 + \|A\|_F^2)$
      $1_\Omega$: indicator of the set of known ratings. $\ell_2$ reconstruction loss, $\ell_2$ penalty for generalization.
      Existing methods: alternated minimization; stochastic gradient descent.
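The objective on this slide can be evaluated directly from the list of known ratings. The sketch below is only an illustration: the data layout (a dict keyed by `(i, j)`, one list per item vector $d_i$ and per user vector $\alpha_j$) and the name `masked_objective` are assumptions, not taken from the talk or from the authors' package.

```python
def masked_objective(ratings, D, A, lam):
    """Regularized squared loss over the known ratings only.

    ratings: dict mapping (i, j) -> x_ij for (i, j) in Omega (hypothetical layout)
    D: list of p item vectors d_i, each of length k
    A: list of n user vectors alpha_j, each of length k (columns of the matrix A)
    """
    loss = 0.0
    for (i, j), x_ij in ratings.items():
        pred = sum(d * a for d, a in zip(D[i], A[j]))  # d_i^T alpha_j
        loss += (x_ij - pred) ** 2
    frob_D = sum(v * v for row in D for v in row)  # ||D||_F^2
    frob_A = sum(v * v for row in A for v in row)  # ||A||_F^2
    return loss + lam * (frob_D + frob_A)


# Toy example: 2 items, 2 users, k = 1, rating (1, 1) is missing.
ratings = {(0, 0): 4.0, (0, 1): 2.0, (1, 0): 2.0}
D = [[2.0], [1.0]]
A = [[2.0], [1.0]]
print(masked_objective(ratings, D, A, lam=0.0))
```

Only the entries in Ω are visited, so the cost of one evaluation scales with the number of known ratings rather than with p × n.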
  • 12. Existing methods: alternated minimization. Minimize over A, D by alternating between
      $D = \operatorname{argmin}_{D \in \mathbb{R}^{p \times k}} \sum_{(i,j) \in \Omega} (x_{ij} - d_i^\top \alpha_j)^2 + \lambda \|D\|_F^2$
      $A = \operatorname{argmin}_{A \in \mathbb{R}^{k \times n}} \sum_{(i,j) \in \Omega} (x_{ij} - d_i^\top \alpha_j)^2 + \lambda \|A\|_F^2$
      No hyperparameters. Slow and memory expensive: uses all ratings at each iteration. Also known as coordinate descent (variation in parameter update order).
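A minimal NumPy sketch of alternated minimization for the masked, $\ell_2$-penalized objective follows. Each half-step is solved exactly as one small ridge regression per user or per item. The function name `als`, the dense boolean `mask` and the random initialization are illustrative assumptions; this is not the coordinate descent solver of [Yu et al., 2012].

```python
import numpy as np


def als(X, mask, k, lam, n_iter=20, seed=0):
    """Alternated minimization of the masked, l2-penalized objective.

    X: (p, n) ratings; mask: (p, n) boolean array, True where a rating is known.
    """
    p, n = X.shape
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((p, k))
    A = rng.standard_normal((k, n))
    reg = lam * np.eye(k)
    for _ in range(n_iter):
        # Fix D: one ridge regression per user (column of A).
        for j in range(n):
            rows = mask[:, j]
            Dj = D[rows]
            A[:, j] = np.linalg.solve(Dj.T @ Dj + reg, Dj.T @ X[rows, j])
        # Fix A: one ridge regression per item (row of D).
        for i in range(p):
            cols = mask[i]
            Ai = A[:, cols]
            D[i] = np.linalg.solve(Ai @ Ai.T + reg, Ai @ X[i, cols])
    return D, A
```

Every sweep touches all known ratings, which is the cost the slide points out.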
  • 13. Existing methods: stochastic gradient descent.
      $\min_{A, D} \sum_{(i,j) \in \Omega} f_{ij}(A, D)$, with $f_{ij}(A, D) \overset{\text{def}}{=} (x_{ij} - d_i^\top \alpha_j)^2 + \frac{1}{c_j} \lambda \|\alpha_j\|_2^2 + \frac{1}{c_i} \lambda \|d_i\|_2^2$
      Gradient step for each rating: $(A_t, D_t) \leftarrow (A_{t-1}, D_{t-1}) - \frac{1}{c_t} \nabla_{(A,D)} f_{ij}(A_{t-1}, D_{t-1})$
      Fast and memory efficient: won the Netflix prize. Very sensitive to step sizes ($c_t$): need to cross-validate.
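One stochastic gradient step on a single rating can be sketched as below. The slide does not define $c_i$ and $c_j$; the sketch treats them as given positive weights (for instance the number of ratings of item i and of user j, which is an assumption). The step size is passed as a plain `step` argument, and the name `sgd_step` is hypothetical.

```python
def sgd_step(d_i, alpha_j, x_ij, lam, c_i, c_j, step):
    """One stochastic gradient step on f_ij; returns the updated (d_i, alpha_j)."""
    err = x_ij - sum(d * a for d, a in zip(d_i, alpha_j))
    # Gradient w.r.t. d_i: -2 * err * alpha_j + (2 * lam / c_i) * d_i
    new_d = [d - step * (-2.0 * err * a + 2.0 * lam / c_i * d)
             for d, a in zip(d_i, alpha_j)]
    # Gradient w.r.t. alpha_j: -2 * err * d_i + (2 * lam / c_j) * alpha_j
    new_alpha = [a - step * (-2.0 * err * d + 2.0 * lam / c_j * a)
                 for d, a in zip(d_i, alpha_j)]
    return new_d, new_alpha
```

Only the two vectors involved in the rating are touched, which is why the method is memory efficient; the dependence on `step` is the sensitivity the slide mentions.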
  • 14. Towards a new algorithm. Best of both worlds? A fast and memory-efficient algorithm, with little sensitivity to hyperparameter settings. Subsampled online dictionary learning: builds upon the online dictionary learning algorithm popular in computer vision and interpretable learning (fMRI), and adapts it to handle missing values efficiently.
  • 16. Dictionary learning. Recall the recommender system formalism, non-masked matrix factorization with $\ell_2$ penalty:
      $\min_{D \in \mathbb{R}^{p \times k},\, A \in \mathbb{R}^{k \times n}} \sum_{j=1}^{n} \|x_j - D \alpha_j\|_2^2 + \lambda (\|D\|_F^2 + \|A\|_F^2)$
      Penalties can be richer, and made into constraints.
      Dictionary learning: learn the left side factor [Olshausen and Field, 1997]:
      $\min_{D \in \mathcal{C}} \sum_{j=1}^{n} \|x_j - D \alpha_j\|_2^2 + \lambda \Omega(\alpha_j)$, with $\alpha_j = \operatorname{argmin}_{\alpha \in \mathbb{R}^k} \|x_j - D \alpha\|_2^2 + \lambda \Omega(\alpha)$
      Naive approach: alternated minimization.
  • 17. Online dictionary learning [Mairal et al., 2010]. At iteration t, select $x_t$ in $\{x_j\}_j$ (user ratings) and improve D. Single-iteration complexity is proportional to the sample dimension: O(p). $(D_t)_t$ converges in a few epochs (one for large n).
      [Figure: stream of samples $x_t$ (dimension p), each approximated as D (p × k) times $\alpha_t$.]
      Very efficient in computer vision / networks / fMRI / hyperspectral images. Can we use it efficiently for recommender systems?
  • 18. In short: handling missing values. Handle large n: batch → online. Handle missing values: online → online + partial.
      [Figure: X (p × n) streamed sample by sample as $x_t$, then as masked samples $M_t x_t$; unknown and unaccessed entries are ignored.]
      Leverage streaming + partial access to samples.
  • 19. In detail: online dictionary learning. The objective function involves latent codes (right side factor):
      $\min_{D \in \mathcal{C}} \frac{1}{t} \sum_{i=1}^{t} \|x_i - D \alpha_i^*(D)\|_2^2$, with $\alpha_i^*(D) = \operatorname{argmin}_{\alpha} \frac{1}{2} \|x_i - D \alpha\|_2^2 + \lambda \Omega(\alpha)$
      Replace latent codes by codes computed with old dictionaries. Build an upper-bounding surrogate function:
      $\min_{D} \frac{1}{t} \sum_{i=1}^{t} \|x_i - D \alpha_i\|_2^2$, with $\alpha_i = \operatorname{argmin}_{\alpha} \frac{1}{2} \|x_i - D_{i-1} \alpha\|_2^2 + \lambda \Omega(\alpha)$
      Minimize the surrogate, which can be updated online at low cost.
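  • 21. In detail: online dictionary learning. Algorithm outline:
      1. Compute code ($x_t$ → complexity depends on p):
         $\alpha_t = \operatorname{argmin}_{\alpha \in \mathbb{R}^k} \|x_t - D_{t-1} \alpha\|_2^2 + \lambda \Omega(\alpha)$
      2. Update the surrogate function (complexity in O(p)):
         $g_t(D) = \frac{1}{t} \sum_{i=1}^{t} \|x_i - D \alpha_i\|_2^2 = \operatorname{Tr}(\frac{1}{2} D^\top D A_t - D^\top B_t)$ (up to a factor 2 and an additive constant independent of D)
         $A_t = (1 - \frac{1}{t}) A_{t-1} + \frac{1}{t} \alpha_t \alpha_t^\top$, $B_t = (1 - \frac{1}{t}) B_{t-1} + \frac{1}{t} x_t \alpha_t^\top$
      3. Minimize surrogate (complexity in O(p)):
         $D_t = \operatorname{argmin}_{D \in \mathcal{C}} g_t(D)$, with $\nabla g_t(D) = D A_t - B_t$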
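The three steps of the outline can be sketched in NumPy as follows. Several choices are assumptions made for brevity and are not stated on the slide: the penalty Ω is taken as the squared $\ell_2$ norm so that the code has a closed form, the constraint set $\mathcal{C}$ is taken as columns of norm at most 1, and the surrogate is minimized with a single pass of block coordinate descent per sample. The name `online_dict_learning` is hypothetical.

```python
import numpy as np


def online_dict_learning(X, k, lam, seed=0):
    """One pass over the columns of X (p, n); returns the dictionary D (p, k)."""
    p, n = X.shape
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((p, k))
    D /= np.linalg.norm(D, axis=0)  # start with unit-norm columns
    A = np.zeros((k, k))
    B = np.zeros((p, k))
    for t in range(1, n + 1):
        x = X[:, t - 1]
        # 1. Code computation (ridge penalty assumed, closed form).
        alpha = np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ x)
        # 2. Surrogate update: running averages A_t and B_t.
        A = (1 - 1 / t) * A + np.outer(alpha, alpha) / t
        B = (1 - 1 / t) * B + np.outer(x, alpha) / t
        # 3. Surrogate minimization: one block coordinate descent pass.
        for j in range(k):
            if A[j, j] > 1e-12:
                D[:, j] -= (D @ A[:, j] - B[:, j]) / A[j, j]
                # Project the column back onto the unit l2 ball.
                D[:, j] /= max(1.0, np.linalg.norm(D[:, j]))
    return D
```

The statistics `A` and `B` summarize all past samples, so each iteration only needs the current sample $x_t$.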
  • 22. Specification for a new algorithm.
      [Figure: stream of masked samples $M_t x_t$; unrated entries are ignored.]
      Constrained: use only known ratings from Ω. Efficient: single iteration in O(s), where s is the number of ratings provided by user t. Principled: follows the online matrix factorization algorithm as much as possible.
  • 24. Missing values in practice. Data stream: $(x_t)_t$ → masked $(M_t x_t)_t$ = ratings from user t. Dimension: p (all items) → s (rated items). Use only $M_t x_t$ in algorithm computation → complexity in O(s).
      [Figure: stream of masked samples $M_t x_t$; unrated entries are ignored.]
      Adaptation to make: modify all parts of the algorithm to obtain O(s) complexity: 1. code computation; 2. surrogate update; 3. surrogate minimization.
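  • 25. Subsampled online dictionary learning. Check out the paper!
      Original online MF:
      1. Code computation: $\alpha_t = \operatorname{argmin}_{\alpha \in \mathbb{R}^k} \|x_t - D_{t-1} \alpha\|_2^2 + \lambda \Omega(\alpha)$
      2. Surrogate aggregation: $A_t = \frac{1}{t} \sum_{i=1}^{t} \alpha_i \alpha_i^\top$, $B_t = B_{t-1} + \frac{1}{t} (x_t \alpha_t^\top - B_{t-1})$
      3. Surrogate minimization: $D^j \leftarrow p^\perp_{\mathcal{C}^r_j}\big(D^j - \frac{1}{(A_t)_{j,j}} (D A_t^j - B_t^j)\big)$
      Our algorithm:
      1. Code computation, masked loss: $\alpha_t = \operatorname{argmin}_{\alpha \in \mathbb{R}^k} \|M_t (x_t - D_{t-1} \alpha)\|_2^2 + \lambda \frac{\operatorname{rk} M_t}{p} \Omega(\alpha)$
      2. Surrogate aggregation: $A_t = \frac{1}{t} \sum_{i=1}^{t} \alpha_i \alpha_i^\top$, $B_t = B_{t-1} + \frac{1}{\sum_{i=1}^{t} M_i} (M_t x_t \alpha_t^\top - M_t B_{t-1})$
      3. Surrogate minimization: $M_t D^j \leftarrow p^\perp_{\mathcal{C}_j}\big(M_t D^j - \frac{1}{(A_t)_{j,j}} M_t (D A_t^j - B_t^j)\big)$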
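Steps 1 and 2 of the masked variant can be sketched as below, working only on the s rows rated by user t. The sketch assumes a squared $\ell_2$ penalty for Ω, represents the mask $M_t$ by an integer index array `idx` without duplicates, and stores the diagonal of $\sum_i M_i$ in a vector `counts`. All of these names and choices are illustrative and do not reflect the API of the authors' `modl` package.

```python
import numpy as np


def masked_step(D, B, counts, x_obs, idx, lam):
    """Code computation and B update restricted to the s rated items `idx`.

    counts: (p,) number of times each row has been observed so far
            (diagonal of sum_i M_i); updated in place, as is B.
    """
    p, k = D.shape
    s = len(idx)
    D_obs = D[idx]  # M_t D, shape (s, k)
    # 1. Masked loss, penalty rescaled by rk(M_t) / p = s / p (ridge assumed).
    G = D_obs.T @ D_obs + lam * s / p * np.eye(k)
    alpha = np.linalg.solve(G, D_obs.T @ x_obs)
    # 2. Update only the observed rows of B, each with its own averaging weight.
    counts[idx] += 1
    B[idx] += (np.outer(x_obs, alpha) - B[idx]) / counts[idx][:, None]
    return alpha
```

Both steps read and write s rows only; the remaining k × k work does not depend on p.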
  • 27. Experiments. Validation: test RMSE (rating prediction) vs CPU time.
      Baseline: coordinate descent solver [Yu et al., 2012] for
      $\min_{D \in \mathbb{R}^{p \times k},\, A \in \mathbb{R}^{k \times n}} \sum_{(i,j) \in \Omega} (x_{ij} - d_i^\top \alpha_j)^2 + \lambda (\|D\|_F^2 + \|A\|_F^2)$
      Fastest solver available apart from SGD, which comes with more hyperparameters. Our method has a learning rate with little influence.
      Datasets: Movielens, Netflix. Publicly available. Larger ones in the industry...
  • 28. Results. Scalable algorithm: speed-up improves with size.
  • 29. Performance.
      Dataset      Test RMSE (CD)   Test RMSE (SODL)   Convergence time (CD)   Convergence time (SODL)   Speed-up
      ML 1M        0.872            0.866              6 s                     8 s                       ×0.75
      ML 10M       0.802            0.799              223 s                   60 s                      ×3.7
      NF (140M)    0.938            0.934              1714 s                  256 s                     ×6.8
      Outperforms coordinate descent beyond 10M ratings. Same prediction performance. Speed-up 6.8× on Netflix. Simple model: RMSE is not state-of-the-art.
  • 30. Robustness to learning rate. Learning rate in the algorithm to be set in [0.75, 1] (← theory). In practice: just set it in [0.8, 1].
      [Figure: RMSE on test set vs. epoch for learning rates β from 0.75 to 1.00; panels: MovieLens 10M and Netflix.]
  • 32. Conclusion. Take-home message: online matrix factorization can be adapted to handle missing values efficiently, with very good performance in recommender systems. The algorithm is usable in any rich model involving matrix factorization.
      Python package: http://github.com/arthurmensch/modl
      Article/slides at http://amensch.fr/publications
      Questions?
  • 33. Appendix: resting-state fMRI. Online dictionary learning: 235 h run time for 1 full epoch; 10 h run time for 1/24 epoch. Proposed method: 10 h run time for 1/2 epoch, reduction r = 12. Qualitatively, usable maps are obtained 10× faster.
  • 34. Bibliography
      [Bell and Koren, 2007] Bell, R. M. and Koren, Y. (2007). Lessons from the Netflix prize challenge. ACM SIGKDD Explorations Newsletter, 9(2):75–79.
      [Mairal et al., 2010] Mairal, J., Bach, F., Ponce, J., and Sapiro, G. (2010). Online learning for matrix factorization and sparse coding. The Journal of Machine Learning Research, 11:19–60.
      [Olshausen and Field, 1997] Olshausen, B. A. and Field, D. J. (1997). Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37(23):3311–3325.
      [Yu et al., 2012] Yu, H.-F., Hsieh, C.-J., and Dhillon, I. (2012). Scalable coordinate descent approaches to parallel matrix factorization for recommender systems. In Proceedings of the International Conference on Data Mining, pages 765–774. IEEE.