Recommenders
Shallow / Deep
SUDEEP DAS
Frontiers and Advances in Data Sciences Conference,
Xi’an, China, 2017
Recommendations
guide our
experiences almost
everywhere!
Personalization in my
typical day
Morning: News/ Workout/ Getting ready
Commute hours:
Music/ YouTube Lectures/ Books
Now and then:
Social Media/ Shopping online
Evenings are for Netflix, of course!
ORIGINS
● 2006-2009: Netflix Prize:
○ >10% improvement, win $1,000,000
● Top performing model(s) ended up being a
variation of Matrix Factorization (SVD++,
Koren et al.)
● Although Netflix’s rec system has moved on,
MF is still the foundational method on which
most collaborative filtering systems are
based
Background
Matrix
Factorization
Singular Value Decomposition (Origins)
$R = U \Sigma V^T$
● R: ratings matrix (users × items)
● U, V^T: left/right singular vectors (orthonormal basis)
● Σ: singular values (scaling)
● Low-rank approximation
● Eckart-Young theorem:
SVD: Largest SV’s for approximation
$R \approx U' \Sigma' V'^T$
$[U', \Sigma', V'^T] = \arg\min_{U, \Sigma, V} \| R - U \Sigma V^T \|_F^2$ (Frobenius norm, minimizing over rank-k factors)
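A minimal sketch of the low-rank approximation above, assuming NumPy and a small made-up ratings matrix. It keeps the k largest singular values and checks the Eckart-Young property that the Frobenius error equals the norm of the discarded singular values. The names `truncated_svd`, `R_k`, `err` and `expected` are illustrative only.

```python
import numpy as np

# Toy ratings matrix (users x items); values are made up for illustration.
R = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])

def truncated_svd(R, k):
    """Return the rank-k approximation built from the k largest singular values."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)  # s is sorted in descending order
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :], s

k = 2
R_k, s = truncated_svd(R, k)

# Frobenius error of the rank-k approximation ...
err = np.linalg.norm(R - R_k, ord="fro")
# ... equals the norm of the singular values that were dropped.
expected = np.sqrt(np.sum(s[k:] ** 2))
print(err, expected)
```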
Low-rank Matrix Factorization
● No orthogonality requirement
● Weighted least squares (or others)
$R \approx P Q$
● The inner dimension of P and Q is the size of the latent space
● Compared with $U \Sigma V^T$, the scaling factor is absorbed into both matrices (not normalized)
● Bias terms
● Regularization, e.g. L2, L1, etc
Low-rank MF (cont…)
Terms labelled in the equation: overall bias, user bias, item bias
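A minimal sketch of low-rank MF with bias terms and L2 regularization, assuming the commonly used form in which a predicted rating is the overall bias plus a user bias plus an item bias plus the dot product of the user and item latent vectors. The toy ratings, the learning rate `lr`, the regularization weight `reg` and the plain SGD loop are all assumptions made for illustration, not the setup used in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# (user, item, rating) triples; toy data invented for this sketch.
ratings = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0), (1, 2, 1.0),
           (2, 1, 1.0), (2, 2, 5.0), (3, 2, 4.0), (3, 0, 2.0)]
n_users, n_items, k = 4, 3, 2              # k is the size of the latent space

mu = np.mean([r for _, _, r in ratings])   # overall bias
b_u = np.zeros(n_users)                    # user bias
b_i = np.zeros(n_items)                    # item bias
P = 0.1 * rng.standard_normal((n_users, k))
Q = 0.1 * rng.standard_normal((n_items, k))

def predict(u, i):
    return mu + b_u[u] + b_i[i] + P[u] @ Q[i]

def rmse():
    return np.sqrt(np.mean([(r - predict(u, i)) ** 2 for u, i, r in ratings]))

rmse_before = rmse()
lr, reg = 0.05, 0.02                       # assumed hyperparameters
for _ in range(200):
    for u, i, r in ratings:
        e = r - predict(u, i)
        b_u[u] += lr * (e - reg * b_u[u])
        b_i[i] += lr * (e - reg * b_i[i])
        # Update both factor vectors from their old values (L2-regularized step).
        P[u], Q[i] = (P[u] + lr * (e * Q[i] - reg * P[u]),
                      Q[i] + lr * (e * P[u] - reg * Q[i]))
rmse_after = rmse()
print(rmse_before, rmse_after)
```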
From Olivier Grisel, dotAI 2017
The FeedForward View
MF Extensions
● Replace user-vector with sum of item vectors
Asymmetric Matrix Factorization
$R \approx (I(R)\, Y)\, Q$
● Y: items × latent factors; Q: latent factors × items
● N(u) is the set of all items user u rated/viewed/clicked
AMF, relation to Neural Network
1-hot encoding of a user’s
play history
Single hidden layer is
equivalent to learning
a Y and Q matrix (aka
weights)
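The equivalence can be checked numerically. The sketch below assumes random Y and Q matrices and a made-up play history, and omits any normalization of the summed item vectors. It computes the scores once as "sum of item vectors times Q" and once as a linear hidden layer applied to the binary history vector.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, k = 5, 3

Y = rng.standard_normal((n_items, k))   # input-side item vectors (input -> hidden weights)
Q = rng.standard_normal((k, n_items))   # output-side item vectors (hidden -> output weights)

history = [0, 3]                        # N(u): items this user interacted with
x = np.zeros(n_items)
x[history] = 1.0                        # binary encoding of the play history

# Matrix-factorization view: user vector = sum of the Y rows for items in N(u).
p_u = Y[history].sum(axis=0)
scores_mf = p_u @ Q

# Feed-forward view: a linear hidden layer followed by a linear output layer.
hidden = x @ Y
scores_nn = hidden @ Q
```

Both views give the same scores because multiplying the binary vector by Y simply selects and sums the rows of Y for the items in the history.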
SLIM
● SLIM replaces low-rank approx by a sparse item-item matrix.
Sparsity comes from L1 regularizer.
● Equivalent to constructing a regression using user’s play history to
predict ratings
● NB: Important that diagonal is excluded. Otherwise solution is trivial.
$R \approx I(R)\, Y$
● Y: items × items, with the diagonal replaced with zeros
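A rough sketch of the SLIM idea, assuming a tiny binary play matrix. It uses plain proximal gradient descent (ISTA) as a stand-in solver, which is not necessarily how SLIM is fitted in practice, and it leaves out the non-negativity constraint and L2 term that SLIM formulations often include. The function name `fit_slim` and the value of `l1` are illustrative.

```python
import numpy as np

# Toy binary play matrix (users x items), invented for illustration.
R = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [1, 1, 0, 0],
], dtype=float)

def fit_slim(R, l1=0.1, n_iter=500):
    """ISTA sketch of: min 0.5*||R - R W||_F^2 + l1*||W||_1 subject to diag(W) = 0."""
    n_items = R.shape[1]
    G = R.T @ R
    step = 1.0 / np.linalg.norm(G, 2)          # 1 / Lipschitz constant of the gradient
    W = np.zeros((n_items, n_items))
    for _ in range(n_iter):
        grad = G @ W - G                        # gradient of the squared-error term
        W = W - step * grad
        W = np.sign(W) * np.maximum(np.abs(W) - step * l1, 0.0)  # soft-threshold (L1 prox)
        np.fill_diagonal(W, 0.0)                # exclude the trivial solution W = I
    return W

W = fit_slim(R)
scores = R @ W                                  # predicted affinity of each user for each item
```

Without the zeroed diagonal the identity matrix would reproduce R exactly, which is the trivial solution the slide warns about.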
Clustering and
PGM
Example / Motivation
Classic Example / Motivation
[Figure: items with unknown (“?”) entries; example topic proportions 0.88 and 0.12]
Users now belong to
multiple “topics”,
with some proportion
Purchases are a mix
proportional to user’s
affinity for topic, and item
affinity within topic
Latent Dirichlet Allocation (LDA)
[Plate diagram: variables α, θ, z, w, φ, β; plates K, D, W]
LDA as a generative model
What topics look like:
[Figure: example topics, with proportions 0.15, 0.63, 0.22]
Final step: Recommending from topics
● Once we’ve learnt a user’s distribution over topics, and each topic’s
distribution over items, producing a recommendation is easy.
● Score every item, i, by its probability under the user’s topic mixture, and recommend the items with the highest
probability (discarding items the user has already purchased)
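A small sketch of this final step, assuming the two distributions have already been learnt. The score of item i for a user is taken to be the sum over topics of p(topic | user) × p(i | topic). The topic-item numbers and the `purchased` set are made up, and the user proportions reuse the 0.88 / 0.12 example.

```python
import numpy as np

# Assumed already learnt (e.g. by LDA); the item probabilities are made up.
theta_u = np.array([0.88, 0.12])            # user's distribution over K = 2 topics
phi = np.array([                            # each topic's distribution over 4 items
    [0.50, 0.30, 0.15, 0.05],
    [0.05, 0.05, 0.30, 0.60],
])
purchased = {0}                             # items the user already bought

# p(i | u) = sum_k p(k | u) * p(i | k)
scores = theta_u @ phi

# Rank by descending probability, discarding already-purchased items.
ranked = [int(i) for i in np.argsort(-scores) if int(i) not in purchased]
top_n = ranked[:2]
print(top_n)
```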
Deep Learning
in Recommender Systems
Why deep?
Deep
Learning
Is Making
Waves
Everywhere!
In many domains, deep learning is achieving near-human
or super-human accuracy!
However, applications of Deep Learning in Recommender Systems are still in their infancy.
So, what is Deep Learning?
A class of machine learning algorithms:
● that use a cascade of multiple non-linear processing layers
● and complex model structures
● to learn different representations of the data in each layer
● where higher level features are derived from lower level features to form a
hierarchical representation.
Balázs Hidasi, RecSys 2016
Traditional vs Deep
● Traditional ML: Handcrafted Features → Trainable Classifier → “Socrates”
● Deep Learning: Learned/Trainable Features → Trainable Classifier → “Socrates”
Learning hierarchical representations of data
Learned Features → Trainable Classifier → “Socrates”
Each layer learns progressively complex representations from its predecessor
Raw Pixels → Edges → Parts of Objects composed from edges → Object models
Earliest adaptation: Restricted Boltzmann Machines
From recent presentation by
Alexandros Karatzoglou
One hidden layer.
User feedback on items interacted with is propagated back to all items.
Very similar to an autoencoder!
There are many ways to make this deep.
From Olivier Grisel, dotAI 2017
Deep Triplet Networks
From Olivier Grisel,
dotAI 2017
Wide + Deep Models for Recommendations
In a recommender setting, you may want to train with a wide set of
cross-product feature transformations, so that the model essentially
memorizes these sparse feature combinations (rules):
[Figure annotations: “Meh!”, “Yay!”]
Cheng et al, Google Inc. (2016)
Wide + Deep Models for Recommendations
On the other hand, you may want the ability to generalize using the
representational power of a deep network. But deep nets can
over-generalize.
Cheng et al, Google Inc.
(2016)
Wide + Deep Models for Recommendations
Best of both worlds:
Jointly train a deep + wide
network. The cross-feature
transformation in the wide
model component can
memorize all those sparse,
specific rules, while the
deep model component can
generalize to similar items
via embeddings.
Cheng et al, Google Inc.
(2016)
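A forward-pass-only sketch of how the two components might be combined, assuming random, untrained weights and hypothetical sizes. The wide part is a linear model over binary cross-product features, the deep part is an embedding followed by one ReLU layer, and both logits are summed before a single sigmoid. Joint training would backpropagate the same loss into both parts, which is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical sizes: 6 cross-product features, 10 items, 4-dim embeddings.
n_cross, n_items, emb_dim, hidden_dim = 6, 10, 4, 8

w_wide = 0.1 * rng.standard_normal(n_cross)               # one weight per cross-product feature
item_emb = 0.1 * rng.standard_normal((n_items, emb_dim))  # deep part: item embeddings
W1 = 0.1 * rng.standard_normal((emb_dim, hidden_dim))
w_out = 0.1 * rng.standard_normal(hidden_dim)
bias = 0.0

def wide_and_deep(x_cross, item_id):
    wide_logit = x_cross @ w_wide                   # memorization: weights on sparse rules
    h = np.maximum(item_emb[item_id] @ W1, 0.0)     # generalization: embedding -> ReLU layer
    deep_logit = h @ w_out
    return sigmoid(wide_logit + deep_logit + bias)  # both parts feed one logistic output

x_cross = np.zeros(n_cross)
x_cross[2] = 1.0       # a hypothetical rule (feature A AND feature B) fired for this example
p = wide_and_deep(x_cross, item_id=3)
print(p)
```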
Wide + Deep Models for Recommendations
Cheng et al, Google Inc. (2016)
Wide + Deep Model
for app
recommendations.
The YouTube Recommendation model
A two-stage approach with two deep networks:
● The candidate generation network takes events
from the user’s YouTube activity history as input and
retrieves a small subset (hundreds) of videos from a
large corpus. These candidates are intended to be
generally relevant to the user with high precision. The
candidate generation network only provides broad
personalization via collaborative filtering.
● The ranking network scores each video according to
a desired objective function using a rich set of
features describing the video and user. The highest
scoring videos are presented to the user, ranked by
their score.
Covington et al., Google Inc. (2016)
The YouTube Recommendation model
Deep candidate generation model architecture
● embedded sparse features concatenated with
dense features. Embeddings are averaged
before concatenation to transform variable
sized bags of sparse IDs into fixed-width
vectors suitable for input to the hidden layers.
● All hidden layers are fully connected.
● In training, a cross-entropy loss is minimized
with gradient descent on the output of the
sampled softmax.
● At serving, an approximate nearest neighbor
lookup is performed to generate hundreds of
candidate video recommendations.
Stage One
Covington et al., Google Inc. (2016)
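A sketch of the input handling and serving step described above, assuming random stand-in embeddings and a single ReLU layer in place of the real tower of hidden layers. It shows how averaging turns watch histories of different lengths into vectors of the same width, and uses an exact top-k dot-product search where a production system would use an approximate nearest neighbor index. Training with sampled softmax is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
n_videos, emb_dim = 50, 8

video_emb = rng.standard_normal((n_videos, emb_dim))   # hypothetical learned video embeddings

def user_input(watch_ids, dense_features):
    """Average a variable-sized bag of video IDs, then concatenate dense features."""
    watch_vec = video_emb[watch_ids].mean(axis=0)      # fixed width regardless of history length
    return np.concatenate([watch_vec, dense_features])

x_short = user_input([3, 7], np.array([0.5, 1.0]))
x_long = user_input([3, 7, 11, 20, 42], np.array([0.5, 1.0]))

# Stand-in for the hidden layers: one ReLU layer mapping into the video embedding space.
W = 0.1 * rng.standard_normal((x_long.size, emb_dim))
user_vec = np.maximum(x_long @ W, 0.0)

# Serving: exact top-k by dot product (an approximate index would replace this at scale).
k = 5
candidates = np.argsort(-(video_emb @ user_vec))[:k]
print(candidates)
```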
The YouTube Recommendation model
Stage Two
Deep ranking network
architecture
● uses embedded categorical
features (both univalent and
multivalent) with shared
embeddings and powers of
normalized continuous
features.
● All layers are fully connected.
In practice, hundreds of
features are fed into the
network.
Covington et al., Google Inc. (2016)
Autoencoders
Collaborative Denoising Auto-Encoder
Collaborative Denoising Auto-Encoders for Top-N Recommender Systems, Wu et al., WSDM 2016
● Treats the feedback on items
y that the user U has
interacted with (input layer)
as a noisy version of the
user’s preferences on all
items (output layer)
● Introduces a user specific
input node and hidden bias
node, while the item weights
are shared across all users.
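A forward-pass sketch of this architecture with random, untrained weights. The corruption shown (randomly dropping observed interactions and rescaling the rest) is one common denoising choice and is an assumption here, as are the sigmoid activations and all variable names.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 3, 6, 4

W_in = 0.1 * rng.standard_normal((n_items, k))   # item input weights, shared across users
W_out = 0.1 * rng.standard_normal((k, n_items))  # item output weights, shared across users
V = 0.1 * rng.standard_normal((n_users, k))      # weights of the user-specific input node
b_h = np.zeros(k)                                # hidden bias
b_o = np.zeros(n_items)                          # output bias

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cdae_forward(user_id, y, drop_prob=0.5, train=True):
    """Score all items from a (possibly corrupted) interaction vector y."""
    if train:
        mask = rng.random(y.shape) >= drop_prob  # randomly drop observed interactions
        y = y * mask / (1.0 - drop_prob)         # rescale so the expected input is unchanged
    hidden = sigmoid(y @ W_in + V[user_id] + b_h)
    return sigmoid(hidden @ W_out + b_o)

y_u = np.array([1.0, 0.0, 1.0, 0.0, 0.0, 1.0])  # items user 0 interacted with
scores = cdae_forward(user_id=0, y=y_u, train=False)
print(scores)
```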
Recurrent Neural Networks - Sequence Modeling
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
A recurrent neural network can be thought of as multiple copies of the same
network, each passing a message to a successor.
Session-based recommendation with Recurrent
Neural Networks (GRU4Rec)
Hidasi et al.
ICLR (2016)
● Treat each user session as a
sequence of clicks
● Predict next item in the session
sequence
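A bare-bones sketch of next-item prediction with a GRU, using random, untrained weights. It feeds item embeddings into a single GRU cell and applies a softmax over items, which is a simplification and an assumption: the input encoding, loss and training procedure of GRU4Rec are not reproduced here, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, d = 8, 5

E = 0.1 * rng.standard_normal((n_items, d))            # item embeddings (input side)
Wz, Wr, Wh = (0.1 * rng.standard_normal((d, d)) for _ in range(3))
Uz, Ur, Uh = (0.1 * rng.standard_normal((d, d)) for _ in range(3))
W_out = 0.1 * rng.standard_normal((d, n_items))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h):
    z = sigmoid(x @ Wz + h @ Uz)                        # update gate
    r = sigmoid(x @ Wr + h @ Ur)                        # reset gate
    h_cand = np.tanh(x @ Wh + (r * h) @ Uh)             # candidate state
    return (1.0 - z) * h + z * h_cand

def next_item_scores(session):
    """Run the clicks of one session through the GRU and score every item."""
    h = np.zeros(d)
    for item in session:
        h = gru_step(E[item], h)
    logits = h @ W_out
    exp = np.exp(logits - logits.max())                 # numerically stable softmax
    return exp / exp.sum()

probs = next_item_scores([2, 5, 1])
print(probs.argmax())
```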
Adding Item metadata to GRU4Rec: Parallel RNN
Hidasi et al.
RecSys (2016)
● Separate RNNs for each input
type
○ Item ID
○ Image feature vector
obtained from CNN (last
avg. pooling layer)
Convolutional Neural Nets
Fei-Fei Li & Andrej Karpathy & Justin Johnson Lecture 7 (2016) http://cs231n.stanford.edu/
VBPR: Visual Bayesian Personalized Ranking from
Implicit Feedback
He et al., AAAI (2015)
Helping cold start by augmenting item factors with visual factors
● Create an item factor that is the sum of two terms: an item visual factor, which is an embedding
of the item image from a deep CNN, and the usual collaborative item factor.
Deep content based music recommendations
http://benanne.github.io/2014/08/05/spotify-cnns.html
Cold Starting New or Less Popular
Music
● Take the Mel Spectrogram of
the song and run it through
several convolutional and
MaxPooling layers to a
compressed 1d representation.
● The training objective is to
minimize the squared error
between the collaborative item
factors of a known item and the
item factor predicted from the
CNN.
● Then for a new item, the model
can predict the item factor, and
make recommendations.
Aäron van den Oord, Sander Dieleman and Benjamin Schrauwen, NIPS 2013
The Pinterest Application: Pin2Vec Related Pins
Liu et al (2017)
https://medium.com/the-graph/applying-deep-learning-to-related-pins-a6fee3c92f5e
Learn a 128-dimensional compressed
representation of each item
(embedding). Then use a similarity
function (cosine) between them to find
similar items.
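A minimal sketch of the lookup step, assuming the embeddings are already learnt. Random vectors stand in for them here, and the exact search over all pins is for illustration only. The function name `related` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pins, dim = 100, 128

emb = rng.standard_normal((n_pins, dim))                  # stand-in for learned embeddings
emb_unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)

def related(pin_id, k=5):
    """Return the k pins with the highest cosine similarity to pin_id."""
    sims = emb_unit @ emb_unit[pin_id]                    # cosine similarity to every pin
    sims[pin_id] = -np.inf                                # never return the query itself
    return np.argsort(-sims)[:k]

neighbours = related(0)
print(neighbours)
```

Normalizing the vectors once up front means each cosine similarity reduces to a dot product.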
The Pinterest Application: Pin2Vec Related Pins
[Figure panels: Co-occurrence, Pin2Vec]
Some concluding thoughts
● Deep Learning is augmenting shallow model based recommender systems.
The main draws for DL in RecSys seem to be:
● Better generalization beyond linear models for user-item interactions.
● Embeddings: Unified representation of heterogeneous signals (e.g. add
image/audio/textual content as side information to item embeddings via
convolutional NNs).
● Exploitation of sequential information in actions leading up to recommendation
(e.g. LSTM on viewing/purchase/search history to predict what will be
watched/purchased/searched next).
● DL toolkits provide unprecedented flexibility in experimenting with loss
functions (e.g. in toolkits like TensorFlow/MXNet/Keras, switching the loss
from a classification loss to a ranking loss is trivial).
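To make the last point concrete, here is a toolkit-free sketch of the two kinds of loss, written in NumPy rather than any of the toolkits named above. The pointwise logistic loss and the pairwise BPR-style loss are chosen as representative examples and are an assumption. In a DL toolkit, swapping one for the other typically changes only the loss function, not the model that produces the scores.

```python
import numpy as np

def logistic_loss(score, label):
    """Pointwise classification loss: label is 1 (interacted) or 0 (not)."""
    return np.log1p(np.exp(-score)) if label == 1 else np.log1p(np.exp(score))

def bpr_loss(pos_score, neg_score):
    """Pairwise ranking loss: penalizes a positive item scored below a negative one."""
    return np.log1p(np.exp(-(pos_score - neg_score)))

# The same model scores can be plugged into either loss.
pos, neg = 2.0, 0.5
classification = logistic_loss(pos, 1) + logistic_loss(neg, 0)
ranking = bpr_loss(pos, neg)
print(classification, ranking)
```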
THANKS!
sdas@netflix.com
@datamusing
@netflixresearch

Crafting Recommenders: the Shallow and the Deep of it!

  • 1. Recommenders Shallow / Deep SUDEEP DAS Frontiers and Advances in Data Sciences Conference, X’ian, China 2017
  • 4. Morning: News/ Workout/ Getting ready
  • 6. Now and then: Social Media/ Shopping online
  • 7. Evenings are for Netflix, of course!
  • 9.
  • 10. ● 1999-2005: Netflix Prize: ○ >10% improvement, win $1,000,000 ● Top performing model(s) ended up be a variation of Matrix Factorization (SVD++, Koren, et al) ● Although Netflix’s rec system has moved on, MF is still the foundational method on which most collaborative filtering systems are based Background
  • 12. Singular Value Decomposition (Origins) R = U Σ VT U VT = users items Σ ratings matrix left/right singular vectors (orthonormal basis) Singular values (scaling) R
  • 13. ● Low-rank approximation ● Eckart-Young theorem: SVD: Largest SV’s for approximation ≈ [U’,Σ’,VT ’] = argmin ǁR - UΣVT ǁ2 R F Frobenius Norm
  • 14. Low-rank Matrix Factorization ● No orthogonality requirement ● Weighted least squares (or others) P≈R Q Size of latent space U Σ VT Scaling factor is absorbed into both matrices (not normalized)
  • 15. ● Bias terms ● Regularization, e.g. L2, L1, etc Low-rank MF (cont…) Overall bias User bias Item bias
  • 16. From Olivier Grisel, dotAI 2017 The FeedForward View
  • 18. ● Replace user-vector with sum of item vectors Asymmetric Matrix Factorization ( )≈R I(R) items items N(u) is all items user i rated/viewed/clicked Y Q
  • 19. AMF, relation to Neural Network 1-hot encoding of a user’s play history Single hidden layer is equivalent to learning a Y and Q matrix (aka weights)
  • 20. ● SLIM replaces low-rank approx by a sparse item-item matrix. Sparity comes from L1 regularizer. ● Equivalent to constructing a regression using user’s play history to predict ratings ● NB: Important that diagonal is excluded. Otherwise solution is trivial. SLIM ≈R I(R) Diagonal replaced with with zeros Y items items 0
  • 23. Classic Example / Motivation
  • 24. ? ? ? ? ? ? ? ? ? 0.88 Items Users now belong to multiple “topics”, with some proportion 0.12 Purchases are a mix proportional to user’s affinity for topic, and item affinity within topic
  • 25. K D W θ φz w α β Latent Dirichlet Allocation (LDA)
  • 26. LDA as a generative model
  • 27. What topics look like: 0.15 0.630.22
  • 28. Final step: Recommending from topics ● Once we’ve learnt a user’s distribution over topics, and each topic’s distribution over items. Producing a recommendation is easy. ● Score every item, i, using below, and recommend items with highest probability (discarding items the user has already purchased)
  • 31. In many domains, deep learning is achieving near-human or super-human accuracy! However, applications of Deep Learning in Recommender Systems is at its infancy.
  • 32. So, what is Deep Learning? A class of machine learning algorithms: ● that use a cascade of multiple non-linear processing layers ● and complex model structures ● to learn different representations of the data in each layer ● where higher level features are derived from lower level features to form a hierarchical representation. Balázs Hidasi, RecSys 2016
  • 33. Traditional vs Deep Handcrafted Features Learned/Trainable Features Trainable Classifier Trainable Classifier Traditional ML Deep Learning “Socrates” “Socrates”
  • 34. Learning hierarchical representations of data Learned Features Trainable Classifier Each layer learns progressively complex representations from its predecessor “Socrates” Raw Pixels Edges Parts of Objects composed from edges Object models
  • 35. Earliest adaptation: Restricted Boltzmann Machines From recent presentation by Alexandros Karatzoglou One hidden layer. User feedback on items interacted with, are propagated back to all items. Very similar to an autoencoder!
  • 36. There are many ways to make this deep. From Olivier Grisel, dotAI 2017
  • 37. From Olivier Grisel, dotAI 2017
  • 38. From Olivier Grisel, dotAI 2017
  • 39. From Olivier Grisel, dotAI 2017
  • 40. Deep Triplet Networks From Olivier Grisel, dotAI 2017
  • 41. Wide + Deep Models for Recommendations In a recommender setting, you may want to train with a wide set of cross-product feature transformations , so that the model essentially memorizes these sparse feature combinations (rules): Meh! Yay! Cheng et al, Google Inc. (2016)
  • 42. Wide + Deep Models for Recommendations On the other hand, you may want the ability to generalize using the representational power of a deep network. But deep nets can over-generalize. Cheng et al, Google Inc. (2016)
  • 43. Wide + Deep Models for Recommendations Best of both worlds: Jointly train a deep + wide network. The cross-feature transformation in the wide model component can memorize all those sparse, specific rules, while the deep model component can generalize to similar items via embeddings. Cheng et al, Google Inc. (2016)
  • 44. Wide + Deep Models for Recommendations Cheng et al, Google Inc. (2016) Wide + Deep Model for app recommendations.
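
The joint wide + deep prediction described on the slides above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the feature names (`installed=netflix`, `impression=pandora`), the single ReLU hidden layer, and all function and parameter names are assumptions made for the example. The wide part is a linear model over binary features and their cross-products; the deep part passes concatenated embeddings through a hidden layer; both logits are summed before the sigmoid.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cross_product(active, combos):
    """A cross feature fires only if every one of its parts is active."""
    return {" AND ".join(c) for c in combos if all(p in active for p in c)}

def dense_relu(W, b, x):
    """One fully connected ReLU layer; W is a list of weight rows."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def wide_and_deep(active, combos, wide_w, field_values, emb, W_h, b_h, w_out, bias):
    # Wide part: linear model over raw and crossed binary features (memorization).
    wide_feats = set(active) | cross_product(active, combos)
    wide_logit = sum(wide_w.get(f, 0.0) for f in wide_feats)
    # Deep part: one embedding per categorical field, concatenated (generalization).
    x = [v for f in field_values for v in emb[f]]
    hidden = dense_relu(W_h, b_h, x)
    deep_logit = sum(w * h for w, h in zip(w_out, hidden))
    # Joint model: both logits are summed, so both parts train on the same loss.
    return sigmoid(wide_logit + deep_logit + bias)
```

Because the two logits are added before the sigmoid, a single gradient step on the shared loss updates the wide weights and the deep weights together, which is what "jointly train" refers to.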
  • 45. The YouTube Recommendation model A two-stage approach with two deep networks: ● The candidate generation network takes events from the user’s YouTube activity history as input and retrieves a small subset (hundreds) of videos from a large corpus. These candidates are intended to be generally relevant to the user with high precision. The candidate generation network only provides broad personalization via collaborative filtering. ● The ranking network scores each video according to a desired objective function using a rich set of features describing the video and user. The highest-scoring videos are presented to the user, ranked by their score. Covington et al., Google Inc. (2016)
  • 46. The YouTube Recommendation model Stage One: deep candidate generation model architecture ● Embedded sparse features are concatenated with dense features. Embeddings are averaged before concatenation to transform variable-sized bags of sparse IDs into fixed-width vectors suitable for input to the hidden layers. ● All hidden layers are fully connected. ● In training, a cross-entropy loss is minimized with gradient descent on the output of the sampled softmax. ● At serving, an approximate nearest neighbor lookup is performed to generate hundreds of candidate video recommendations. Covington et al., Google Inc. (2016)
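
The input construction and serving lookup described above can be sketched as follows. This is a simplified illustration under stated assumptions: the embedding tables are plain dicts, the function and parameter names are hypothetical, and the serving step uses an exact brute-force dot-product search as a stand-in for the approximate nearest neighbor lookup used in practice.

```python
def average_embeddings(ids, table, dim):
    """Map a variable-sized bag of sparse IDs to one fixed-width vector."""
    if not ids:
        return [0.0] * dim
    total = [0.0] * dim
    for i in ids:
        for k, v in enumerate(table[i]):
            total[k] += v
    return [t / len(ids) for t in total]

def network_input(watch_ids, search_ids, video_emb, search_emb, dim, dense):
    """Concatenate averaged embeddings with dense features."""
    return (average_embeddings(watch_ids, video_emb, dim)
            + average_embeddings(search_ids, search_emb, dim)
            + list(dense))

def top_k(user_vec, video_vecs, k):
    """Exact dot-product search; a stand-in for approximate nearest neighbors."""
    def score(item):
        return sum(u * v for u, v in zip(user_vec, item[1]))
    ranked = sorted(video_vecs.items(), key=score, reverse=True)
    return [vid for vid, _ in ranked[:k]]
```

The point of the averaging step is that `network_input` has the same length for a user with two watches and a user with two thousand, so the hidden layers can have a fixed input width.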
  • 47. The YouTube Recommendation model Stage Two: deep ranking network architecture ● Uses embedded categorical features (both univalent and multivalent) with shared embeddings, and powers of normalized continuous features. ● All layers are fully connected. In practice, hundreds of features are fed into the network. Covington et al., Google Inc. (2016)
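
As a rough illustration of "powers of normalized continuous features": a continuous value is first mapped to roughly [0, 1] and then fed to the network together with its square and square root, so the network can easily form super- and sub-linear functions of it. The sketch below assumes normalization by an empirical CDF over a sorted sample; the function names are hypothetical.

```python
import bisect
import math

def normalize_by_cdf(value, sorted_sample):
    """Empirical CDF of `value` within a sorted sample; result lies in [0, 1]."""
    return bisect.bisect_right(sorted_sample, value) / len(sorted_sample)

def power_features(x_norm):
    """Feed the normalized value plus its square and square root."""
    return [x_norm, x_norm ** 2, math.sqrt(x_norm)]
```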
  • 49. Collaborative Denoising Auto-Encoder (Collaborative Denoising Auto-Encoders for Top-N Recommender Systems, Wu et al., WSDM 2016) ● Treats the feedback on the items y that user U has interacted with (input layer) as a noisy version of the user’s preferences on all items (output layer). ● Introduces a user-specific input node and a hidden bias node, while the item weights are shared across all users.
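
A minimal sketch of the forward pass described above, in plain Python. It assumes a single sigmoid hidden layer, drop-out style corruption with rescaling, and hypothetical names throughout (`corrupt`, `cdae_forward`, `user_vec`); it is meant to show where the user-specific node enters, not to reproduce the paper's exact setup.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def corrupt(y, drop_prob, rng):
    """Drop each input with probability drop_prob (< 1); rescale survivors."""
    keep = 1.0 - drop_prob
    return [yi / keep if rng.random() < keep else 0.0 for yi in y]

def cdae_forward(y_tilde, user_vec, W, b, W_out, b_out):
    """W and W_out are n_items x k and shared by all users; user_vec is per user."""
    k = len(b)
    # Hidden layer: shared item weights + user-specific node + hidden bias.
    hidden = [sigmoid(sum(W[i][j] * y_tilde[i] for i in range(len(y_tilde)))
                      + user_vec[j] + b[j]) for j in range(k)]
    # Output layer: one predicted preference per item.
    return [sigmoid(sum(W_out[i][j] * hidden[j] for j in range(k)) + b_out[i])
            for i in range(len(W_out))]
```

Training would compare the output against the uncorrupted feedback vector, so the network learns to recover preferences for items that were dropped from the input.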
  • 50. Recurrent Neural Networks - Sequence Modeling http://colah.github.io/posts/2015-08-Understanding-LSTMs/ A recurrent neural network can be thought of as multiple copies of the same network, each passing a message to a successor.
  • 51. Session-based recommendation with Recurrent Neural Networks (GRU4Rec) Hidasi et al. ICLR (2016) ● Treat each user session as a sequence of clicks ● Predict the next item in the session sequence
  • 52. Adding item metadata to GRU4Rec: Parallel RNN Hidasi et al. RecSys (2016) ● Separate RNNs for each input type ○ Item ID ○ Image feature vector obtained from a CNN (last avg. pooling layer)
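
The session framing above can be made concrete with a small data-preparation sketch. The helper name `next_item_examples` is hypothetical, and the recurrent model itself is omitted; the code only shows how one click session yields next-item prediction targets.

```python
def next_item_examples(session):
    """Turn one click session into (clicks so far, next click) training pairs."""
    return [(session[:t], session[t]) for t in range(1, len(session))]

# Example: a three-click session yields two training pairs.
pairs = next_item_examples(["shoes", "socks", "laces"])
```

In a recurrent model the "clicks so far" are not re-fed at each step; the hidden state carries them, and only the most recent click is given as input.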
  • 53. Convolutional Neural Nets Fei-Fei Li & Andrej Karpathy & Justin Johnson Lecture 7 (2016) http://cs231n.stanford.edu/
  • 54. VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback He et al., AAAI (2015) Helping cold start by augmenting item factors with visual factors ● Build the item representation from two parts: an item visual factor, which is a learned embedding of deep CNN features extracted from the item image, and the usual collaborative item factor.
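
A simplified sketch of how the two kinds of item factors could enter a user-item score. It keeps only an item bias, the collaborative term, and the visual term, and leaves out the other bias terms of the full model; the names `vbpr_score`, `E`, and `f_i` are chosen for this example. `E` is assumed to be a learned projection from raw CNN features into a small visual latent space.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def vbpr_score(gamma_u, gamma_i, theta_u, E, f_i, beta_i=0.0):
    """Score = item bias + collaborative term + visual term."""
    # Project the raw CNN feature vector f_i into the visual latent space.
    theta_i = [dot(row, f_i) for row in E]
    return beta_i + dot(gamma_u, gamma_i) + dot(theta_u, theta_i)
```

For a brand-new item the collaborative factor `gamma_i` carries no signal yet, but the visual term can still produce a usable score because `f_i` comes from the image alone.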
  • 55. Deep content based music recommendations http://benanne.github.io/2014/08/05/spotify-cnns.html Cold Starting New or Less Popular Music ● Take the Mel spectrogram of the song and run it through several convolutional and max-pooling layers to obtain a compressed 1-D representation. ● The training objective is to minimize the squared error between the collaborative item factors of a known item and the item factors predicted by the CNN. ● Then, for a new item, the model can predict the item factors and make recommendations. Aäron van den Oord, Sander Dieleman and Benjamin Schrauwen, NIPS 2013
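
The training objective and the cold-start use described above reduce to two small operations, sketched here without the CNN itself. The function names are hypothetical; `predicted` stands for whatever factor vector the audio model outputs for a song.

```python
def squared_error(predicted, target):
    """Squared error between a predicted and a known collaborative item factor."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target))

def batch_loss(predicted_factors, known_factors):
    """Mean squared error over a batch of songs with known factors."""
    errors = [squared_error(p, t) for p, t in zip(predicted_factors, known_factors)]
    return sum(errors) / len(errors)

def score_new_item(user_factor, predicted_item_factor):
    """Cold start: score a new song using only its predicted factor."""
    return sum(u * v for u, v in zip(user_factor, predicted_item_factor))
```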
  • 56. The Pinterest Application: Pin2Vec Related Pins Liu et al (2017) https://medium.com/the-graph/applying-deep-learning-to-related-pins-a6fee3c92f5e Learn a 128-dimensional compressed representation of each item (embedding). Then use a similarity function (cosine) between them to find similar items.
  • 57. The Pinterest Application: Pin2Vec Related Pins Liu et al (2017) https://medium.com/the-graph/applying-deep-learning-to-related-pins-a6fee3c92f5e Co-occurrence Pin2Vec
  • 58. The Pinterest Application: Pin2Vec Related Pins Liu et al (2017) https://medium.com/the-graph/applying-deep-learning-to-related-pins-a6fee3c92f5e
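
The "embed, then compare by cosine" step from the Pin2Vec slides can be sketched as below. The embeddings here are tiny made-up vectors rather than 128-dimensional learned ones, and `related` is a hypothetical brute-force helper, not the production lookup.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def related(query_id, embeddings, k):
    """Return the k items whose embeddings are most similar to the query's."""
    q = embeddings[query_id]
    scored = [(cosine(q, v), pid) for pid, v in embeddings.items() if pid != query_id]
    scored.sort(reverse=True)
    return [pid for _, pid in scored[:k]]
```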
  • 59. Some concluding thoughts ● Deep Learning is augmenting shallow-model-based recommender systems. The main draws for DL in RecSys seem to be: ● Better generalization beyond linear models for user-item interactions. ● Embeddings: Unified representation of heterogeneous signals (e.g. add image/audio/textual content as side information to item embeddings via convolutional NNs). ● Exploitation of sequential information in actions leading up to recommendation (e.g. LSTM on viewing/purchase/search history to predict what will be watched/purchased/searched next). ● DL toolkits provide unprecedented flexibility in experimenting with loss functions (e.g. in toolkits like TensorFlow/MXNet/Keras, switching the loss from a classification loss to a ranking loss is trivial).
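
To illustrate the last point about loss functions, here is a framework-free sketch of the two kinds of loss applied to the same model scores. The ranking loss shown is a BPR-style pairwise loss, chosen here as one common example; the function names are hypothetical and no toolkit API is implied.

```python
import math

def softplus(x):
    """log(1 + exp(x)), computed without overflow for large |x|."""
    return x + math.log1p(math.exp(-x)) if x > 0 else math.log1p(math.exp(x))

def classification_loss(score, label):
    """Log loss on one (user, item) score; label is 1 (positive) or 0 (negative)."""
    return softplus(-score) if label == 1 else softplus(score)

def ranking_loss(pos_score, neg_score):
    """Pairwise loss: penalize a negative item scoring close to or above a positive."""
    return softplus(neg_score - pos_score)
```

Only the loss changes between the two settings; the model that produces `score` can stay the same, which is why the switch is cheap in practice.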