Matrix Factorizations for Recommender Systems
Dmitriy Selivanov
selivanov.dmitriy@gmail.com
2017-11-16
Recommender systems are everywhere
Figures 1-4: [images not preserved]
Goals
Propose “relevant” items to customers
Retention
Exploration
Up-sell
Personalized offers
recommended items for a customer given their history of activities (transactions, browsing history, favourites)
Similar items
substitutions
bundles - frequently bought together
. . .
Live demo
Dataset - LastFM-360K:
360k users
160k artists
17M observations
sparsity ≈ 0.9997 (17M observed out of 360k × 160k ≈ 5.8 × 10^10 cells)
Explicit feedback
Ratings, likes/dislikes, purchases:
cleaner data
smaller
hard to collect
$RMSE^2 = \frac{1}{|D|} \sum_{(u,i) \in D} (r_{ui} - \hat{r}_{ui})^2$
Netflix prize
~ 480k users, 18k movies, 100m ratings
sparsity ~ 98.8% (100m ratings out of 480k × 18k ≈ 8.6 × 10^9 cells)
goal is to reduce RMSE by 10% - from 0.9514 to 0.8563
Implicit feedback
noisy feedback (click, likes, purchases, search, . . . )
much easier to collect
wider user/item coverage
usually sparsity > 99.9%
One-Class Collaborative Filtering
observed entries are positive preferences
should have high confidence
missing entries in the matrix are a mix of negative preferences and unobserved positive preferences
consider them as negative with low confidence
we cannot really tell whether a user did not click a banner because of a lack of interest or a lack of awareness
Evaluation
Recap: we only care about producing a small set of highly relevant items.
RMSE is a bad metric - it has a very weak connection to business goals.
We are only interested in the relevance (precision) of the retrieved items:
space on the screen is limited
only order matters - the most relevant items should be at the top
Ranking - Mean average precision
$AveragePrecision = \frac{\sum_{k=1}^{n} P(k) \times rel(k)}{\text{number of relevant documents}}$
##    index relevant precision_at_k
## 1:     1        0      0.0000000
## 2:     2        0      0.0000000
## 3:     3        1      0.3333333
## 4:     4        0      0.2500000
## 5:     5        0      0.2000000
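map@5 = 0.3333333 (only the item at k = 3 is relevant, so the sum reduces to P(3) × 1 = 0.3333333, divided by 1 relevant document)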
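The average precision formula can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used for the slides: the function names are hypothetical, and "number of relevant documents" is assumed to mean the relevant items present in the ranked list.

```python
def precision_at_k(relevant, k):
    """Fraction of the first k recommended items that are relevant."""
    return sum(relevant[:k]) / k


def average_precision(relevant):
    """Sum of P(k) * rel(k) over the list, divided by the number of relevant items.

    Assumption: only relevant items that appear in the list are counted.
    """
    n_relevant = sum(relevant)
    if n_relevant == 0:
        return 0.0
    total = sum(precision_at_k(relevant, k) * rel
                for k, rel in enumerate(relevant, start=1))
    return total / n_relevant


def mean_average_precision(lists):
    """MAP: average precision averaged over users (one 0/1 list per user)."""
    return sum(average_precision(r) for r in lists) / len(lists)


# The five-item ranked list from the table: only position 3 is relevant.
relevant = [0, 0, 1, 0, 0]
p_at_k = [precision_at_k(relevant, k) for k in range(1, 6)]
ap = average_precision(relevant)
```

For this list `p_at_k` reproduces the `precision_at_k` column, and `ap` equals P(3) because the only relevant item sits at position 3.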
Ranking - Normalized Discounted Cumulative Gain
Intuition is the same as for MAP@K, but also takes into account value of relevance:
$DCG_p = \sum_{i=1}^{p} \frac{2^{rel_i} - 1}{\log_2(i + 1)}$
$nDCG_p = \frac{DCG_p}{IDCG_p}$
$IDCG_p = \sum_{i=1}^{|REL|} \frac{2^{rel_i} - 1}{\log_2(i + 1)}$
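A small Python sketch of the same definitions. The helper names are hypothetical, and the ideal ordering is assumed to be the given relevance list sorted in descending order and cut at p.

```python
import math


def dcg(rels):
    """Discounted cumulative gain of graded relevances listed in ranked order."""
    return sum((2 ** rel - 1) / math.log2(i + 1)
               for i, rel in enumerate(rels, start=1))


def ndcg(rels, p):
    """nDCG@p: DCG of the first p items divided by the DCG of the ideal ordering.

    Assumption: the ideal list is the same relevances sorted in descending order.
    """
    ideal = sorted(rels, reverse=True)[:p]
    idcg = dcg(ideal)
    return dcg(rels[:p]) / idcg if idcg > 0 else 0.0


perfect = ndcg([3, 2, 1], p=3)   # already ideally ordered
late_hit = ndcg([0, 0, 1], p=3)  # the single relevant item is ranked last
```

Unlike MAP, the gain term grows with the relevance grade, so placing a highly relevant item low in the list costs more than misplacing a marginal one.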
Approaches
Content based
good for cold start
not personalized
Collaborative filtering
vanilla collaborative filtering
matrix factorizations
. . .
Hybrid and context aware recommender systems
best of two worlds
Focus today
WRMF (Weighted Regularized Matrix Factorization) - Collaborative Filtering for Implicit Feedback Datasets (2008)
efficient learning with accelerated approximate Alternating Least Squares
inference time
Linear-Flow - Practical Linear Models for Large-Scale One-Class Collaborative Filtering (2016)
efficient truncated SVD
cheap cross-validation with full regularization path
Matrix Factorizations
Users can be described by a small number of latent factors $p_{uk}$
Items can be described by a small number of latent factors $q_{ki}$
Sparse data: R (users × items)
Low rank matrix factorization
$R = P \times Q$, with P (users × factors) and Q (factors × items)
Reconstruction: R (users × items) ≈ P × Q (users × items)
Truncated SVD
Take the k largest singular values:
$X \approx U_k D_k V_k^T$
- $X_k = U_k D_k V_k^T \in \mathbb{R}^{m \times n}$
- $U_k$, $V_k$ - columns are orthonormal bases (dot product of any 2 distinct columns is zero, each column has unit norm)
- $D_k$ - diagonal matrix with singular values on the diagonal
Truncated SVD is the best rank k approximation of the matrix X in terms of Frobenius norm:
$\|X - U_k D_k V_k^T\|_F$
$P = U_k \sqrt{D_k}$
$Q = \sqrt{D_k} V_k^T$
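The truncated SVD factorization can be checked numerically with NumPy. This is a sketch on a small dense random matrix (real rating matrices are sparse and need a sparse truncated SVD solver). Splitting the singular values as square roots between the two factors is an assumption made here so that P × Q reproduces the rank-k approximation.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((6, 5))   # toy dense "users x items" matrix
k = 2

# Full (thin) SVD, then keep the k largest singular triplets.
U, d, Vt = np.linalg.svd(X, full_matrices=False)
U_k, d_k, Vt_k = U[:, :k], d[:k], Vt[:k, :]

P = U_k * np.sqrt(d_k)              # users x factors
Q = np.sqrt(d_k)[:, None] * Vt_k    # factors x items
X_k = P @ Q                         # rank-k reconstruction

# Frobenius error of the best rank-k approximation equals the
# root of the sum of squared discarded singular values.
err = np.linalg.norm(X - X_k, "fro")
expected_err = np.sqrt(np.sum(d[k:] ** 2))
```

The equality of `err` and `expected_err` is the Eckart-Young property that the slide refers to as "best rank k approximation".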
Issue with truncated SVD for “explicit” feedback
Optimal in terms of Frobenius norm - takes into account zeros in ratings:
$RMSE = \sqrt{\frac{1}{|users| \times |items|} \sum_{u \in users, i \in items} (r_{ui} - \hat{r}_{ui})^2}$
Overfits data
Objective = error only in “observed” ratings:
$RMSE = \sqrt{\frac{1}{|Observed|} \sum_{(u,i) \in Observed} (r_{ui} - \hat{r}_{ui})^2}$
SVD-like matrix factorization with ALS
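$J = \sum_{(u,i) \in Observed} (r_{ui} - p_u^T q_i)^2 + \lambda (\|Q\|^2 + \|P\|^2)$
Given Q fixed, solve for each $p_u$:
$\min_{p_u} \sum_{i \in Observed_u} (r_{ui} - p_u^T q_i)^2 + \lambda \sum_{j=1}^{k} p_{uj}^2$
Given P fixed, solve for each $q_i$:
$\min_{q_i} \sum_{u \in Observed_i} (r_{ui} - p_u^T q_i)^2 + \lambda \sum_{j=1}^{k} q_{ij}^2$
Ridge regression: $p_u = (Q_u^T Q_u + \lambda I)^{-1} Q_u^T r_u$, $q_i = (P_i^T P_i + \lambda I)^{-1} P_i^T r_i$
Here the rows of $Q_u$ are the factors of the items rated by user u, the rows of $P_i$ are the factors of the users who rated item i, and $r_u$, $r_i$ are the corresponding observed ratings.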
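The alternating ridge regressions can be sketched with NumPy as follows. This is a toy dense illustration under stated assumptions, not a production implementation: `als_explicit`, the boolean `mask` of observed entries, and the hyperparameter values are all hypothetical, and item factors are stored row-wise (items × factors).

```python
import numpy as np


def als_explicit(R, mask, k=2, lam=0.1, n_iter=10, seed=0):
    """ALS on observed ratings only; returns P, Q and the objective after each sweep."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.normal(scale=0.1, size=(n_users, k))   # row u holds p_u
    Q = rng.normal(scale=0.1, size=(n_items, k))   # row i holds q_i
    I = np.eye(k)
    history = []
    for _ in range(n_iter):
        # Fix Q, solve one small ridge regression per user.
        for u in range(n_users):
            idx = np.flatnonzero(mask[u])
            if idx.size:
                Qu = Q[idx]
                P[u] = np.linalg.solve(Qu.T @ Qu + lam * I, Qu.T @ R[u, idx])
        # Fix P, solve one small ridge regression per item.
        for i in range(n_items):
            idx = np.flatnonzero(mask[:, i])
            if idx.size:
                Pi = P[idx]
                Q[i] = np.linalg.solve(Pi.T @ Pi + lam * I, Pi.T @ R[idx, i])
        err = (R - P @ Q.T)[mask]
        history.append(np.sum(err ** 2) + lam * (np.sum(P ** 2) + np.sum(Q ** 2)))
    return P, Q, history


R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)
mask = R > 0   # assumption: zeros mean "not rated"
P, Q, history = als_explicit(R, mask)
```

Each inner solve minimizes the objective exactly for one user or item, so the recorded objective never increases from sweep to sweep.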
“Collaborative Filtering for Implicit Feedback Datasets”
WRMF - Weighted Regularized Matrix Factorization
“Default” approach
Proposed in 2008, but still widely used in industry (even at YouTube)
several high-quality open-source implementations
$J = \sum_{u,i} C_{ui} (P_{ui} - x_u^T y_i)^2 + \lambda (\|X\|_F^2 + \|Y\|_F^2)$
Preferences - binary:
$P_{ui} = 1$ if $R_{ui} > 0$, $0$ otherwise
Confidence - $C_{ui} = 1 + f(R_{ui})$
Alternating Least Squares for implicit feedback
For fixed Y:
$dL/dx_u = -2 \sum_{i \in items} c_{ui} (p_{ui} - x_u^T y_i) y_i + 2 \lambda x_u$
$= -2 \sum_{i \in items} c_{ui} (p_{ui} - y_i^T x_u) y_i + 2 \lambda x_u$
$= -2 Y^T C^u p(u) + 2 Y^T C^u Y x_u + 2 \lambda x_u$
Setting $dL/dx_u = 0$ for the optimal solution gives us $(Y^T C^u Y + \lambda I) x_u = Y^T C^u p(u)$
$x_u$ can be obtained by solving a system of linear equations:
$x_u = solve(Y^T C^u Y + \lambda I, Y^T C^u p(u))$
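The user update above maps directly onto a linear solve. In this NumPy sketch the function name, the toy data and the confidence function f(r) = alpha × r with alpha = 40 are assumptions chosen for illustration; the slides only state C = 1 + f(R).

```python
import numpy as np


def update_user(Y, r_u, lam=0.1, alpha=40.0):
    """Solve (Y^T C^u Y + lam I) x_u = Y^T C^u p(u) for a single user."""
    p_u = (r_u > 0).astype(float)   # binary preferences
    c_u = 1.0 + alpha * r_u         # confidence, assuming f(r) = alpha * r
    k = Y.shape[1]
    # (Y.T * c_u) scales column i of Y^T by c_ui, i.e. Y^T C^u without building diag(C^u).
    A = (Y.T * c_u) @ Y + lam * np.eye(k)
    b = Y.T @ (c_u * p_u)
    return np.linalg.solve(A, b)


rng = np.random.default_rng(0)
Y = rng.normal(size=(6, 3))                       # items x factors
r_u = np.array([0.0, 3.0, 0.0, 0.0, 1.0, 0.0])    # raw interaction counts for one user
lam, alpha = 0.1, 40.0
x_u = update_user(Y, r_u, lam=lam, alpha=alpha)

# The gradient from the derivation should vanish at the solution.
p_u = (r_u > 0).astype(float)
c_u = 1.0 + alpha * r_u
grad = -2 * Y.T @ (c_u * (p_u - Y @ x_u)) + 2 * lam * x_u
```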
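Similarly for fixed X:
$dL/dy_i = -2 X^T C^i p(i) + 2 X^T C^i X y_i + 2 \lambda y_i$
$y_i = solve(X^T C^i X + \lambda I, X^T C^i p(i))$
Another optimization:
$X^T C^i X = X^T X + X^T (C^i - I) X$
$Y^T C^u Y = Y^T Y + Y^T (C^u - I) Y$
$X^T X$ and $Y^T Y$ can be precomputed; $C^u - I$ is non-zero only for the items user u interacted with (assuming f(0) = 0), so the second term is cheap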
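A quick NumPy check of this identity, as a sketch: the toy sizes and the confidence function f(r) = 40 × r are assumptions. The shared Gram matrix is computed once, and the per-user correction only touches the rows of Y for items the user interacted with.

```python
import numpy as np

rng = np.random.default_rng(1)
n_items, k = 8, 3
Y = rng.normal(size=(n_items, k))

r_u = np.zeros(n_items)
r_u[[1, 5]] = [3.0, 1.0]        # the user interacted with two items only
c_u = 1.0 + 40.0 * r_u          # assumed confidence: 1 + alpha * r

YtY = Y.T @ Y                   # shared by all users, computed once
nz = np.flatnonzero(r_u)        # items with c_ui - 1 != 0
Y_nz = Y[nz]
fast = YtY + (Y_nz.T * (c_u[nz] - 1.0)) @ Y_nz

full = Y.T @ np.diag(c_u) @ Y   # naive version with the dense diagonal matrix
```

The naive product costs time proportional to the number of all items for every user, while the corrected form costs time proportional to the number of that user's interactions.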
Accelerated Approximate Alternating Least Squares
$y_i = solve(X^T C^i X + \lambda I, X^T C^i p(i))$
Iterative methods
Conjugate Gradient
Coordinate Descent
A fixed number of steps is run instead of an exact solve (usually 3-4 is enough)
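To illustrate the approximate solve, here is a textbook conjugate gradient routine run for a fixed number of steps on one item's system. This is a sketch, not the implementation of any particular library: the function name, the toy data and f(r) = 40 × r are assumptions, and the iteration starts from zero, whereas in ALS the factors from the previous sweep would be the natural starting point.

```python
import numpy as np


def conjugate_gradient(A, b, x0, n_steps=3):
    """Run at most n_steps of conjugate gradient on the SPD system A x = b."""
    x = x0.copy()
    r = b - A @ x          # residual
    p = r.copy()           # search direction
    rs_old = r @ r
    for _ in range(n_steps):
        if rs_old < 1e-20:  # already converged
            break
        Ap = A @ p
        step = rs_old / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x


rng = np.random.default_rng(2)
X = rng.normal(size=(50, 4))            # users x factors
r_i = np.zeros(50)
r_i[[3, 17]] = [2.0, 1.0]               # two users interacted with item i
c_i = 1.0 + 40.0 * r_i                  # assumed confidence function
p_i = (r_i > 0).astype(float)
lam = 0.1

A = (X.T * c_i) @ X + lam * np.eye(4)   # X^T C^i X + lam I
b = X.T @ (c_i * p_i)                   # X^T C^i p(i)

exact = np.linalg.solve(A, b)
x1 = conjugate_gradient(A, b, np.zeros(4), n_steps=1)
x3 = conjugate_gradient(A, b, np.zeros(4), n_steps=3)
x4 = conjugate_gradient(A, b, np.zeros(4), n_steps=4)
```

With k factors the exact solve costs O(k^3) per user or item, while each conjugate gradient step is only a matrix-vector product, which is why a handful of steps is attractive for large k.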
Inference time
How to make recommendations for new users?
There are no user embeddings since these users are not in the original matrix!
Make one step of ALS with the item embeddings matrix fixed => get new user embeddings:
given Y fixed, $C^{u_{new}}$ - new user-item interactions confidence
$x_{u_{new}} = solve(Y^T C^{u_{new}} Y + \lambda I, Y^T C^{u_{new}} p(u_{new}))$
$scores = X_{new} Y^T$
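The fold-in step for a new user can be sketched like this. The function name, f(r) = alpha × r, and the choice to mask already seen items before taking the top N are all assumptions added for illustration.

```python
import numpy as np


def recommend_new_user(Y, r_new, lam=0.1, alpha=40.0, top_n=3):
    """One ALS half-step for an unseen user with item factors Y fixed, then rank items."""
    p = (r_new > 0).astype(float)       # binary preferences of the new user
    c = 1.0 + alpha * r_new             # assumed confidence function
    A = (Y.T * c) @ Y + lam * np.eye(Y.shape[1])
    x_new = np.linalg.solve(A, Y.T @ (c * p))
    scores = Y @ x_new                  # one score per item
    scores[r_new > 0] = -np.inf         # assumption: never re-recommend seen items
    top = np.argsort(-scores)[:top_n]
    return x_new, top


rng = np.random.default_rng(4)
Y = rng.normal(size=(6, 3))             # item factors from a trained model (random here)
r_new = np.array([2.0, 0.0, 0.0, 1.0, 0.0, 0.0])
x_new, top = recommend_new_user(Y, r_new)
```

No retraining is involved: the cost is one k × k solve plus one pass over the item factors, so this can run at request time.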
WRMF Implementations
python implicit - implements Conjugate Gradient. With GPU support recently!
R reco - implements Conjugate Gradient
Spark ALS
Quora qmf
Google tensorflow
Linear-Flow
Idea is to learn item-item similarity matrix W from the data.
First
$\min J = \|X - X W_k\|_F^2 + \lambda \|W_k\|_F^2$
With constraint:
$rank(W_k) \le k$
Linear-Flow observations
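1. Without L2 regularization the optimal solution is $W_k = Q_k Q_k^T$ where $SVD_k(X) = P_k \Sigma_k Q_k^T$
2. Without the $rank(W) \le k$ constraint the optimal solution is just the solution of ridge regression: $W = (X^T X + \lambda I)^{-1} X^T X$ - infeasible (it requires inverting a dense items × items matrix).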
Linear-Flow reparametrization
$SVD_k(X) = P_k \Sigma_k Q_k^T$
Let $W = Q_k Y$:
$argmin_Y \|X - X Q_k Y\|_F^2 + \lambda \|Q_k Y\|_F^2$
Motivation
$\lambda = 0 \Rightarrow W = Q_k Q_k^T$, and the solution for the current problem is $Y = Q_k^T$
Linear-Flow closed-form solution
Notice that if $Q_k$ has orthonormal columns then $\|Q_k Y\|_F = \|Y\|_F$
Solve $\|X - X Q_k Y\|_F^2 + \lambda \|Y\|_F^2$
Simple ridge regression with a closed-form solution
$Y = (Q_k^T X^T X Q_k + \lambda I)^{-1} Q_k^T X^T X$
Very cheap inversion of a k × k matrix!
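The closed-form solution fits in a few lines of NumPy. This is a sketch on a small dense binary matrix; `linear_flow` is a hypothetical helper name, and a dense SVD stands in for the sparse truncated SVD a real dataset would need.

```python
import numpy as np


def linear_flow(X, k, lam):
    """Return Q_k and Y = (Q_k^T X^T X Q_k + lam I)^{-1} Q_k^T X^T X."""
    _, _, Qt = np.linalg.svd(X, full_matrices=False)
    Q_k = Qt[:k].T                      # items x k, top-k right singular vectors
    XQ = X @ Q_k                        # users x k
    A = XQ.T @ XQ + lam * np.eye(k)     # k x k system matrix
    B = XQ.T @ X                        # k x items right-hand side
    Y = np.linalg.solve(A, B)
    return Q_k, Y


rng = np.random.default_rng(5)
X = (rng.random((20, 6)) < 0.4).astype(float)   # toy binary user-item matrix

Q_k, Y = linear_flow(X, k=3, lam=10.0)
W = Q_k @ Y        # low-rank item-item similarity matrix
scores = X @ W     # predicted user-item scores

# Without regularization the solution collapses to Y = Q_k^T, i.e. W = Q_k Q_k^T.
_, Y0 = linear_flow(X, k=3, lam=0.0)
```

Only a k × k system is solved, and W never needs to be materialized in practice because scores can be computed as (X Q_k) Y.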
Linear-Flow hassle-free cross-validation
$Y = (Q_k^T X^T X Q_k + \lambda I)^{-1} Q_k^T X^T X$
How to find lambda with cross-validation?
pre-compute $Z = Q_k^T X^T X$ so $Y = (Z Q_k + \lambda I)^{-1} Z$
pre-compute $Z Q_k$
notice that the value of lambda affects only the diagonal of $Z Q_k$
generate a sequence of lambdas (say of length 50) based on the min/max diagonal values
solving 50 ridge regressions of a small rank is super-fast
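The regularization path can be sketched as follows. The toy data, the geometric spacing of the lambda grid and its exact end points are assumptions; the slides only say the sequence is based on the min/max diagonal values.

```python
import numpy as np

rng = np.random.default_rng(3)
X = (rng.random((30, 8)) < 0.3).astype(float)   # toy binary user-item matrix
k = 3

_, _, Qt = np.linalg.svd(X, full_matrices=False)
Q_k = Qt[:k].T                  # items x k

Z = Q_k.T @ X.T @ X             # k x items, computed once
ZQ = Z @ Q_k                    # k x k, computed once
diag = np.diag(ZQ)              # the squared top-k singular values

# Assumed grid: 50 values spaced geometrically between the min and max diagonal entries.
lambdas = np.geomspace(diag.min(), diag.max(), num=50)

# Each lambda only shifts the diagonal of ZQ, so every fit is a single k x k solve.
solutions = [np.linalg.solve(ZQ + lam * np.eye(k), Z) for lam in lambdas]
```

In a real cross-validation loop each `Y` in `solutions` would be scored on held-out interactions with a ranking metric such as MAP@K or NDCG@K, and the best lambda kept.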
Figure 7: [image not preserved]
Suggestions
start simple - SVD, WRMF
design proper cross-validation - both objective and data split
think about how to incorporate business logic (for example, how to exclude something)
use single machine implementations
think about inference time
don’t waste time with libraries/articles/blogposts which demonstrate MF with dense matrices
Questions?
http://dsnotes.com/tags/recommender-systems/
https://github.com/dselivanov/reco
Contacts:
selivanov.dmitriy@gmail.com
https://github.com/dselivanov
https://www.linkedin.com/in/dselivanov1

Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 

Matrix Factorizations for Recommender Systems Explained

  • 1. Matrix Factorizations for Recommender Systems Dmitriy Selivanov selivanov.dmitriy@gmail.com 2017-11-16
  • 2. Recommender systems are everywhere Figure 1:
  • 3. Recommender systems are everywhere Figure 2:
  • 4. Recommender systems are everywhere Figure 3:
  • 5. Recommender systems are everywhere Figure 4:
  • 6. Goals: propose “relevant” items to customers (retention, exploration, up-sale). Personalized offers: recommended items for a customer given history of activities (transactions, browsing history, favourites). Similar items: substitutions; bundles (frequently bought together); . . .
  • 7. Live demo. Dataset - LastFM-360K: 360k users, 160k artists, 17M observations, sparsity ≈ 0.9997 (1 − 17M / (360k × 160k))
  • 8. Explicit feedback. Ratings, likes/dislikes, purchases: cleaner data, smaller, hard to collect. $RMSE^2 = \frac{1}{|D|} \sum_{(u,i) \in D} (r_{ui} - \hat{r}_{ui})^2$
  • 9. Netflix prize: ~ 480k users, 18k movies, 100m ratings; sparsity ~ 98.8% (100m ratings out of 480k × 18k possible); goal is to reduce RMSE by 10% - from 0.9514 to 0.8563
  • 10. Implicit feedback: noisy feedback (clicks, likes, purchases, search, . . . ); much easier to collect; wider user/item coverage; usually sparsity > 99.9%. One-Class Collaborative Filtering: observed entries are positive preferences and should have high confidence; missing entries in the matrix are a mix of negative and positive preferences, so consider them negative with low confidence; we cannot really tell whether a user did not click a banner because of a lack of interest or a lack of awareness.
  • 11. Evaluation. Recap: we only care about how to produce a small set of highly relevant items. RMSE is a bad metric - very weak connection to business goals. We are only interested in the relevance (precision) of retrieved items: space on the screen is limited; only order matters - most relevant items should be at the top.
  • 12. Ranking - Mean average precision. $AveragePrecision = \frac{\sum_{k=1}^{n} P(k) \times rel(k)}{\text{number of relevant documents}}$
    ##    index relevant precision_at_k
    ## 1:     1        0      0.0000000
    ## 2:     2        0      0.0000000
    ## 3:     3        1      0.3333333
    ## 4:     4        0      0.2500000
    ## 5:     5        0      0.2000000
    map@5 = 0.1566667 (the mean of the precision_at_k column)
  • 13. Ranking - Normalized Discounted Cumulative Gain. Intuition is the same as for MAP@K, but it also takes into account the value of relevance: $DCG_p = \sum_{i=1}^{p} \frac{2^{rel_i} - 1}{\log_2(i + 1)}$, $nDCG_p = \frac{DCG_p}{IDCG_p}$, $IDCG_p = \sum_{i=1}^{|REL|} \frac{2^{rel_i} - 1}{\log_2(i + 1)}$
  • 14. Approaches. Content based: good for cold start; not personalized. Collaborative filtering: vanilla collaborative filtering; matrix factorizations; . . . Hybrid and context aware recommender systems: best of two worlds.
  • 15. Focus today. WRMF (Weighted Regularized Matrix Factorization) - Collaborative Filtering for Implicit Feedback Datasets (2008): efficient learning with accelerated approximate Alternating Least Squares; inference time. Linear-Flow - Practical Linear Models for Large-Scale One-Class Collaborative Filtering (2016): efficient truncated SVD; cheap cross-validation with full path regularization.
  • 16. Matrix Factorizations. Users can be described by a small number of latent factors $p_{uk}$. Items can be described by a small number of latent factors $q_{ki}$.
  • 18. Low rank matrix factorization: $R = P \times Q$, where $P$ is users × factors and $Q$ is factors × items.
  • 20. Truncated SVD. Take the k largest singular values: $X \approx U_k D_k V_k^T$, where $X_k \in \mathbb{R}^{m \times n}$; the columns of $U_k$, $V_k$ are orthonormal bases (the dot product of any 2 columns is zero, each column has unit norm); $D_k$ is the matrix with singular values on the diagonal. Truncated SVD is the best rank k approximation of the matrix X in terms of Frobenius norm: it minimizes $\|X - U_k D_k V_k^T\|_F$. $P = U_k \sqrt{D_k}$, $Q = \sqrt{D_k} V_k^T$
  • 21. Issue with truncated SVD for “explicit” feedback. It is optimal in terms of Frobenius norm, which takes into account zeros in ratings: $RMSE = \sqrt{\frac{1}{|users| \times |items|} \sum_{u \in users, i \in items} (r_{ui} - \hat{r}_{ui})^2}$. Overfits data. Objective = error only in “observed” ratings: $RMSE = \sqrt{\frac{1}{|Observed|} \sum_{(u,i) \in Observed} (r_{ui} - \hat{r}_{ui})^2}$
  • 22. SVD-like matrix factorization with ALS. $J = \sum_{(u,i) \in Observed} (r_{ui} - p_u \times q_i)^2 + \lambda(\|Q\|^2 + \|P\|^2)$. Given Q fixed solve for p: $\min \sum_{i \in Observed} (r_i - q_i \times P)^2 + \lambda \sum_{j=1}^{u} p_j^2$. Given P fixed solve for q: $\min \sum_{u \in Observed} (r_u - p_u \times Q)^2 + \lambda \sum_{j=1}^{i} q_j^2$. Ridge regression: $P = (Q^T Q + \lambda I)^{-1} Q^T r$, $Q = (P^T P + \lambda I)^{-1} P^T r$
  • 23. “Collaborative Filtering for Implicit Feedback Datasets”. WRMF - Weighted Regularized Matrix Factorization. “Default” approach. Proposed in 2008, but still widely used in industry (even at YouTube). Several high-quality open-source implementations. $J = \sum_{u,i} C_{ui} (P_{ui} - X_u Y_i)^2 + \lambda(\|X\|_F + \|Y\|_F)$. Preferences are binary: $P_{ui} = 1$ if $R_{ui} > 0$, 0 otherwise. Confidence: $C_{ui} = 1 + f(R_{ui})$
  • 24. Alternating Least Squares for implicit feedback. For fixed Y: $dL/dx_u = -2 \sum_{i=item} c_{ui} (p_{ui} - x_u^T y_i) y_i + 2\lambda x_u = -2 \sum_{i=item} c_{ui} (p_{ui} - y_i^T x_u) y_i + 2\lambda x_u = -2 Y^T C^u p(u) + 2 Y^T C^u Y x_u + 2\lambda x_u$. Setting $dL/dx_u = 0$ for the optimal solution gives $(Y^T C^u Y + \lambda I) x_u = Y^T C^u p(u)$. $x_u$ can be obtained by solving a system of linear equations: $x_u = solve(Y^T C^u Y + \lambda I, Y^T C^u p(u))$
  • 25. Alternating Least Squares for implicit feedback. Similarly for fixed X: $dL/dy_i = -2 X^T C^i p(i) + 2 X^T C^i X y_i + 2\lambda y_i$, so $y_i = solve(X^T C^i X + \lambda I, X^T C^i p(i))$. Another optimization: $X^T C^i X = X^T X + X^T (C^i - I) X$ and $Y^T C^u Y = Y^T Y + Y^T (C^u - I) Y$. $X^T X$ and $Y^T Y$ can be precomputed.
  • 26. Accelerated Approximate Alternating Least Squares. $y_i = solve(X^T C^i X + \lambda I, X^T C^i p(i))$. Iterative methods: Conjugate Gradient, Coordinate Descent. A fixed number of steps (usually 3-4) is enough.
  • 27. Inference time How to make recommendations for new users? There are no user embeddings since users are not in original matrix!
  • 28. Inference time. Make one step of ALS with the item embeddings matrix fixed => get new user embeddings: given Y fixed and $C^{u_{new}}$, the confidence of the new user-item interactions, $x_{u_{new}} = solve(Y^T C^{u_{new}} Y + \lambda I, Y^T C^{u_{new}} p(u_{new}))$, $scores = X_{new} Y^T$
  • 29. WRMF Implementations: python implicit - implements Conjugate Gradient, with GPU support recently! R reco - implements Conjugate Gradient. Spark ALS. Quora qmf. Google tensorflow.
  • 30. Linear-Flow. Idea is to learn an item-item similarity matrix W from the data: $\min J = \|X - X W_k\|_F + \lambda \|W_k\|_F$ with constraint $rank(W) \le k$
  • 31. Linear-Flow observations. 1. Without L2 regularization the optimal solution is $W_k = Q_k Q_k^T$ where $SVD_k(X) = P_k \Sigma_k Q_k^T$. 2. Without the constraint $rank(W) \le k$ the optimal solution is just the solution of ridge regression: $W = (X^T X + \lambda I)^{-1} X^T X$ - infeasible.
  • 32. Linear-Flow reparametrization. $SVD_k(X) = P_k \Sigma_k Q_k^T$. Let $W = Q_k Y$: $\arg\min_Y \|X - X Q_k Y\|_F + \lambda \|Q_k Y\|_F$. Motivation: $\lambda = 0$ => $W = Q_k Q_k^T$, which is also the solution of the current problem with $Y = Q_k^T$
  • 33. Linear-Flow closed-form solution. Notice that if $Q_k$ is orthogonal then $\|Q_k Y\|_F = \|Y\|_F$. Solve $\|X - X Q_k Y\|_F + \lambda \|Y\|_F$: simple ridge regression with closed-form solution $Y = (Q_k^T X^T X Q_k + \lambda I)^{-1} Q_k^T X^T X$. Very cheap inversion of the matrix of rank k!
  • 34. Linear-Flow hassle-free cross-validation. $Y = (Q_k^T X^T X Q_k + \lambda I)^{-1} Q_k^T X^T X$. How to find lambda with cross-validation? Pre-compute $Z = Q_k^T X^T X$, so $Y = (Z Q_k + \lambda I)^{-1} Z$; pre-compute $Z Q_k$; notice that the value of lambda affects only the diagonal of $Z Q_k$; generate a sequence of lambdas (say of length 50) based on min/max diagonal values; solving 50 ridge regressions of a small rank is super-fast.
  • 36. Suggestions: start simple - SVD, WRMF; design proper cross-validation - both objective and data split; think about how to incorporate business logic (for example how to exclude something); use single machine implementations; think about inference time; don’t waste time with libraries/articles/blogposts which demonstrate MF with dense matrices.
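
Sketch for slide 12 (average precision). This is a minimal illustration of the formula as written on the slide, not the author's code; the function name and the choice to normalize by the number of relevant items found in the top k are assumptions. For the slide's ranking (one relevant item at position 3) the formula gives 1/3, while the slide's 0.1566667 matches the mean of the precision_at_k column.

```python
def average_precision_at_k(relevant, k):
    """AP@k for a ranked list of 0/1 relevance flags (best-ranked item first)."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevant[:k], start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # P(rank) * rel(rank)
    # Assumption: divide by the relevant items seen in the top k;
    # other definitions divide by min(k, total number of relevant items).
    return precision_sum / hits if hits else 0.0


ranked = [0, 0, 1, 0, 0]  # the slide's example ranking
ap = average_precision_at_k(ranked, 5)
# precision_at_k column from the slide: hits so far / position
precisions = [sum(ranked[:i]) / i for i in range(1, 6)]
mean_precision = sum(precisions) / len(precisions)
```

MAP@k is then the mean of this value over all users.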
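
Sketch for slide 13 (nDCG). A small pure-Python version of the DCG and nDCG formulas, assuming graded relevance values listed in ranked order; the function names are illustrative, and the ideal ordering is taken as the same relevance values sorted in descending order.

```python
import math


def dcg_at_p(relevances, p):
    """DCG_p = sum over i = 1..p of (2^rel_i - 1) / log2(i + 1)."""
    return sum((2 ** rel - 1) / math.log2(i + 1)
               for i, rel in enumerate(relevances[:p], start=1))


def ndcg_at_p(relevances, p):
    """nDCG_p = DCG_p / IDCG_p, where IDCG_p is the DCG of the ideal ordering."""
    ideal = sorted(relevances, reverse=True)
    idcg = dcg_at_p(ideal, p)
    return dcg_at_p(relevances, p) / idcg if idcg > 0 else 0.0


perfect = ndcg_at_p([3, 2, 1], 3)  # already in the ideal order
late_hit = ndcg_at_p([0, 0, 1], 3)  # the only relevant item is ranked last
```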
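
Sketch for slide 20 (truncated SVD). A toy NumPy example, assuming the singular values are split evenly between the two factors ($P = U_k \sqrt{D_k}$, $Q = \sqrt{D_k} V_k^T$) so that $P Q = U_k D_k V_k^T$. A small dense matrix is used only to keep the example short; real interaction matrices should stay sparse and use a sparse truncated solver.

```python
import numpy as np


def truncated_svd_factors(X, k):
    """Return P (users x k) and Q (k x items) with P @ Q = U_k D_k V_k^T."""
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    U_k, d_k, Vt_k = U[:, :k], d[:k], Vt[:k, :]
    sqrt_d = np.sqrt(d_k)
    P = U_k * sqrt_d             # scale each column of U_k
    Q = sqrt_d[:, None] * Vt_k   # scale each row of V_k^T
    return P, Q


rng = np.random.default_rng(0)
X = rng.random((6, 5))           # toy "ratings" matrix
P, Q = truncated_svd_factors(X, k=2)
X_k = P @ Q                      # rank-2 approximation of X

# The Frobenius error equals the norm of the discarded singular values.
d = np.linalg.svd(X, compute_uv=False)
error = np.linalg.norm(X - X_k)
```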
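
Sketch for slide 22 (ALS on observed ratings). The alternating ridge regressions can be illustrated as below, assuming a small dense rating matrix with a boolean mask of observed entries. Names and hyperparameters are illustrative, and item factors are stored as items × k (the transpose of the slide's Q).

```python
import numpy as np


def als_explicit(R, mask, k=2, lam=0.1, n_iter=20, seed=0):
    """Alternate ridge regressions over observed entries only."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.normal(scale=0.1, size=(n_users, k))  # user factors
    Q = rng.normal(scale=0.1, size=(n_items, k))  # item factors
    I = np.eye(k)
    for _ in range(n_iter):
        for u in range(n_users):       # Q fixed, solve for p_u
            obs = mask[u]
            if obs.any():
                Qo = Q[obs]
                P[u] = np.linalg.solve(Qo.T @ Qo + lam * I, Qo.T @ R[u, obs])
        for i in range(n_items):       # P fixed, solve for q_i
            obs = mask[:, i]
            if obs.any():
                Po = P[obs]
                Q[i] = np.linalg.solve(Po.T @ Po + lam * I, Po.T @ R[obs, i])
    return P, Q


def regularized_loss(R, mask, P, Q, lam):
    """J = squared error on observed entries + lam * (||P||^2 + ||Q||^2)."""
    err = (R - P @ Q.T)[mask]
    return (err ** 2).sum() + lam * ((P ** 2).sum() + (Q ** 2).sum())


R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
mask = R > 0                           # treat zeros as unobserved
P, Q = als_explicit(R, mask)
```

Each row update minimizes J exactly with the other factors held fixed, so J never increases from one sweep to the next.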
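
Sketch for slides 23-25 (WRMF user update with a precomputed Gram matrix). This assumes the linear confidence $f(R_{ui}) = \alpha R_{ui}$; the slides leave $f$ unspecified, so both that form and the value of `alpha` are assumptions, as are the function and variable names. Only the items a user interacted with contribute to $Y^T (C^u - I) Y$, which is what makes the update cheap.

```python
import numpy as np


def wrmf_user_update(Y, r_u, lam=0.1, alpha=40.0, YtY=None):
    """Solve (Y^T C^u Y + lam I) x_u = Y^T C^u p(u) for one user.

    Y is items x k; r_u is the user's raw interaction counts per item.
    """
    k = Y.shape[1]
    if YtY is None:
        YtY = Y.T @ Y                   # shared by all users, precompute once
    nz = np.flatnonzero(r_u)            # items the user interacted with
    c_minus_1 = alpha * r_u[nz]         # c_ui - 1 (assumed f(R) = alpha * R)
    Y_nz = Y[nz]
    # Y^T C^u Y = Y^T Y + Y^T (C^u - I) Y; the second term needs only nz rows
    A = YtY + (Y_nz.T * c_minus_1) @ Y_nz + lam * np.eye(k)
    # Y^T C^u p(u): p_ui is 1 on nz and 0 elsewhere
    b = Y_nz.T @ (1.0 + c_minus_1)
    return np.linalg.solve(A, b)


rng = np.random.default_rng(0)
Y = rng.normal(size=(6, 3))             # toy item factors
r_u = np.array([0.0, 2.0, 0.0, 0.0, 1.0, 0.0])
x_u = wrmf_user_update(Y, r_u)
```

The item update is the same function with the roles of X and Y swapped.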
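
Sketch for slide 26 (a few conjugate gradient steps instead of an exact solve). This is a generic textbook conjugate gradient on a dense symmetric positive definite system, not the code of any particular library. In ALS the starting point would be the factor vector from the previous iteration (an assumption about typical usage); real implementations generally avoid forming `A` explicitly.

```python
import numpy as np


def conjugate_gradient(A, b, x0, n_steps=3):
    """Run a fixed number of CG steps on A x = b (A symmetric positive definite)."""
    x = x0.astype(float)                # working copy of the starting point
    r = b - A @ x                       # residual
    p = r.copy()                        # search direction
    rs_old = r @ r
    for _ in range(n_steps):
        if rs_old < 1e-20:              # already converged
            break
        Ap = A @ p
        step = rs_old / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x


rng = np.random.default_rng(0)
M = rng.normal(size=(8, 4))
A = M.T @ M + np.eye(4)                 # stands in for Y^T C^u Y + lam I
b = rng.normal(size=4)
x_approx = conjugate_gradient(A, b, np.zeros(4), n_steps=3)
x_exact = np.linalg.solve(A, b)
```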
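
Sketch for slide 28 (recommendations for a new user). One ALS step with the item factors held fixed gives the new user's embedding, and scores are its dot products with all items. The confidence form $1 + \alpha R$, the exclusion of already-seen items, and all names are assumptions made for illustration.

```python
import numpy as np


def embed_new_user(Y, r_new, lam=0.1, alpha=40.0):
    """x_new = solve(Y^T C Y + lam I, Y^T C p) with Y (items x k) fixed."""
    k = Y.shape[1]
    c = 1.0 + alpha * r_new             # confidence per item (assumed form)
    p = (r_new > 0).astype(float)       # binary preferences
    A = (Y.T * c) @ Y + lam * np.eye(k)
    b = Y.T @ (c * p)
    return np.linalg.solve(A, b)


def recommend(Y, r_new, top_n=3, **kwargs):
    """Indices of the top_n highest-scoring items the user has not seen."""
    x_new = embed_new_user(Y, r_new, **kwargs)
    scores = Y @ x_new                  # one row of X_new Y^T
    scores[r_new > 0] = -np.inf         # business rule: drop seen items
    return np.argsort(-scores)[:top_n]


rng = np.random.default_rng(0)
Y = rng.normal(size=(6, 3))             # item factors from a trained model
r_new = np.array([3.0, 0.0, 0.0, 1.0, 0.0, 0.0])
top_items = recommend(Y, r_new, top_n=3)
```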
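
Sketch for slides 33-34 (Linear-Flow closed form and the lambda path). A toy dense example, assuming $Q_k$ holds the top-k right singular vectors of X; the geometric spacing of the lambda grid between the smallest and largest diagonal values of $Z Q_k$ is one possible reading of "based on min/max diagonal values", and the function name is illustrative.

```python
import numpy as np


def linear_flow_path(X, k, lambdas=None, n_lambda=50):
    """Return Q_k and a list of (lambda, Y) with Y = (Z Q_k + lambda I)^-1 Z."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Q_k = Vt[:k].T                      # items x k, orthonormal columns
    Z = Q_k.T @ (X.T @ X)               # k x items, computed once
    ZQ = Z @ Q_k                        # k x k, computed once
    if lambdas is None:
        diag = np.diag(ZQ)
        # Assumed grid: geometric spacing between min and max diagonal values
        lambdas = np.geomspace(diag.min(), diag.max(), n_lambda)
    I = np.eye(k)
    # lambda only shifts the diagonal of ZQ, so each solve is a tiny k x k system
    path = [(lam, np.linalg.solve(ZQ + lam * I, Z)) for lam in lambdas]
    return Q_k, path


rng = np.random.default_rng(0)
X = rng.random((8, 5))                  # toy user-item matrix
Q_k, path = linear_flow_path(X, k=2)
lam, Y = path[0]
scores = (X @ Q_k) @ Y                  # X W with W = Q_k Y, never forming W
```

With `lambdas=[0.0]` the result reduces to $Y = Q_k^T$, matching the motivation on slide 32.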