SlideShare a Scribd company logo
1 of 58
Download to read offline
October 17, 2015
Data Pipelines for
Music Recommendations
@
Spotify
Vidhya Murali
@vid052
Vidhya Murali
Who Am I?
2
•Areas of Interest: Data & Machine Learning
•Data Engineer @Spotify
•Masters Student from the University of Wisconsin Madison
aka Happy Badger for life!
“Torture the data, and it will
confess!”
3
– Ronald Coase, Nobel Prize Laureate
Spotify’s Big Data
4
•Started in 2006, now available in 58 countries
• 70+ million active users, 20+ million paid subscribers
• 30+ million songs in our catalog, ~20K added every day
• 1.5 billion playlists so far and counting
• 1 TB of user data logged every day
• Hadoop cluster with 1500 nodes
• ~20,000 Hadoop jobs per day
Music Recommendations at Spotify
Features:
Discover
Discover Weekly
Moments
Radio
Related Artists
5
6
30 million tracks…
What to recommend?
Approaches 7
•Manual curation by Experts
•Editorial Tagging
•Metadata (e.g. Label provided data, NLP over News,
Blogs)
•Audio Signals
•Collaborative Filtering Model
Approaches 7
•Manual curation by Experts
•Editorial Tagging
•Metadata (e.g. Label provided data, NLP over News,
Blogs)
•Audio Signals
•Collaborative Filtering Model
Collaborative Filtering Model 8
•Find patterns from user’s past behavior to generate
recommendations
•Domain independent
•Scalable
•Accuracy (Collaborative Model) >= Accuracy (Content
Based Model)
Definition of CF
9
Hey,
I like tracks P, Q, R, S!
Well,
I like tracks Q, R, S, T!
Then you should check out
track P!
Nice! Btw try track T!
Legacy Slide of Erik Bernhardsson
The YoLo Problem
10
The YoLo Problem
10
•YoLo Problem: “You Only Listen Once” to judge recommendations
•Goal: Predict if users will listen to new music (new to user)
The YoLo Problem
10
•YoLo Problem: “You Only Listen Once” to judge recommendations
•Goal: Predict if users will listen to new music (new to user)
•Challenges
•Scale of catalog (30M songs + ~20K added every day)
•Repeated consumption of music is not very uncommon
•Music is niche
•Music consumption is heavily influenced by user’s lifestyle
The YoLo Problem
10
•YoLo Problem: “You Only Listen Once” to judge recommendations
•Goal: Predict if users will listen to new music (new to user)
•Challenges
•Scale of catalog (30M songs + ~20K added every day)
•Repeated consumption of music is not very uncommon
•Music is niche
•Music consumption is heavily influenced by user’s lifestyle
•Input: Feedback is implicit through streaming behavior, collection adds,
browse history, search history etc
User Plays to Track Recs 11
User Plays to Track Recs 11
1. Weighted play counts from logs
User Plays to Track Recs 11
1. Weighted play counts from logs
2. Train Model using the input signals
User Plays to Track Recs 11
1. Weighted play counts from logs
2. Train Model using the input signals
3. Generate recs from
the trained model
User Plays to Track Recs 11
1. Weighted play counts from logs
2. Train Model using the input signals
3. Generate recs from
the trained model
4. Post process the
recommendations
12
Step 1: ETL of Logs
•Extract and transform the anonymized logs to training data set
•Case: Logs -> (user, track, wt.count)
Step 2: Construct Big Matrix! 13
Tracks(n)
Users(m)
Vidhya
Burn by Ellie Goulding
Step 2: Construct Big Matrix! 13
Tracks(n)
Users(m)
Vidhya
Burn by Ellie Goulding
Order of 70M x 30M!
Latent Factor Models 14
Vidhya
Burn
.. . . . .
.. . . . .
.. . . . .
.. . . . .
.. . . . .
•Use a “small” representation for each user and items(tracks): f-dimensional
vectors
.. .
.. .
.. .
.. .
. .
...
...
...
...
..
m m
n
m n
Latent Factor Models 14
Vidhya
Burn
.. . . . .
.. . . . .
.. . . . .
.. . . . .
.. . . . .
•Use a “small” representation for each user and items(tracks): f-dimensional
vectors
.. .
.. .
.. .
.. .
. .
...
...
...
...
..
m m
n
m n
User Track Matrix:
(m x n)
Latent Factor Models 14
Vidhya
Burn
.. . . . .
.. . . . .
.. . . . .
.. . . . .
.. . . . .
•Use a “small” representation for each user and items(tracks): f-dimensional
vectors
.. .
.. .
.. .
.. .
. .
...
...
...
...
..
m m
n
m n
User Vector Matrix:
X: (m x f)
User Track Matrix:
(m x n)
Latent Factor Models 14
Vidhya
Burn
.. . . . .
.. . . . .
.. . . . .
.. . . . .
.. . . . .
•Use a “small” representation for each user and items(tracks): f-dimensional
vectors
.. .
.. .
.. .
.. .
. .
...
...
...
...
..
m m
n
m n
User Vector Matrix:
X: (m x f)
Track Vector Matrix:
Y: (n x f)
User Track Matrix:
(m x n)
Latent Factor Models 14
Vidhya
Burn
.. . . . .
.. . . . .
.. . . . .
.. . . . .
.. . . . .
•Use a “small” representation for each user and items(tracks): f-dimensional
vectors
.. .
.. .
.. .
.. .
. .
...
...
...
...
..
(here, f = 2)
m m
n
m n
User Vector Matrix:
X: (m x f)
Track Vector Matrix:
Y: (n x f)
User Track Matrix:
(m x n)
Matrix Factorization using Implicit Feedback 15
Matrix Factorization using Implicit Feedback
User Track Play
Count Matrix
15
Matrix Factorization using Implicit Feedback
User Track Play
Count Matrix
User Track
Preference
Matrix
Binary Label:
1 => played
0 => not played
15
Matrix Factorization using Implicit Feedback
User Track Play
Count Matrix
User Track
Preference
Matrix
Binary Label:
1 => played
0 => not played
Weights
Matrix
Weights based on play count
and smoothing
15
Equation(s) Alert!
16
Implicit Matrix Factorization 17
1 0 0 0 1 0 0 1
0 0 1 0 0 1 0 0
1 0 1 0 0 0 1 1
0 1 0 0 0 1 0 0
0 0 1 0 0 1 0 0
1 0 0 0 1 0 0 1
•Aggregate all (user, track) streams into a large matrix
•Goal: Approximate binary preference matrix by the inner product of 2 smaller matrices by
minimizing the weighted RMSE (root mean squared error) using a function of total plays as weight
•Why?: Once learned, the top recommendations for a user are the top inner products between
their latent factor vector in X and the track latent factor vectors in Y.
X YUsers
Tracks
• = bias for user
• = bias for item
• = regularization parameter
• = 1 if user streamed track else 0
•
• = user latent factor vector
• = item latent factor vectoryi
Alternating Least Squares 18
1 0 0 0 1 0 0 1
0 0 1 0 0 1 0 0
1 0 1 0 0 0 1 1
0 1 0 0 0 1 0 0
0 0 1 0 0 1 0 0
1 0 0 0 1 0 0 1
X YUsers
Tracks
• = bias for user
• = bias for item
• = regularization parameter
• = 1 if user streamed track else 0
•
• = user latent factor vector
• = item latent factor vector
Fix tracks
•Aggregate all (user, track) streams into a large matrix
•Goal: Approximate binary preference matrix by the inner product of 2 smaller matrices by
minimizing the weighted RMSE (root mean squared error) using a function of total plays as weight
•Why?: Once learned, the top recommendations for a user are the top inner products between
their latent factor vector in X and the track latent factor vectors in Y.
yi
19
1 0 0 0 1 0 0 1
0 0 1 0 0 1 0 0
1 0 1 0 0 0 1 1
0 1 0 0 0 1 0 0
0 0 1 0 0 1 0 0
1 0 0 0 1 0 0 1
X YUsers
• = bias for user
• = bias for item
• = regularization parameter
• = 1 if user streamed track else 0
•
• = user latent factor vector
• = item latent factor vector
Fix tracks
Solve for users
•Aggregate all (user, track) streams into a large matrix
•Goal: Approximate binary preference matrix by the inner product of 2 smaller matrices by
minimizing the weighted RMSE (root mean squared error) using a function of total plays as weight
•Why?: Once learned, the top recommendations for a user are the top inner products between
their latent factor vector in X and the track latent factor vectors in Y.
Alternating Least Squares
yi
Tracks
20
1 0 0 0 1 0 0 1
0 0 1 0 0 1 0 0
1 0 1 0 0 0 1 1
0 1 0 0 0 1 0 0
0 0 1 0 0 1 0 0
1 0 0 0 1 0 0 1
X YUsers
• = bias for user
• = bias for item
• = regularization parameter
• = 1 if user streamed track else 0
•
• = user latent factor vector
• = item latent factor vector
Fix users
•Aggregate all (user, track) streams into a large matrix
•Goal: Approximate binary preference matrix by the inner product of 2 smaller matrices by
minimizing the weighted RMSE (root mean squared error) using a function of total plays as weight
•Why?: Once learned, the top recommendations for a user are the top inner products between
their latent factor vector in X and the track latent factor vectors in Y.
Alternating Least Squares
yi
Tracks
21
1 0 0 0 1 0 0 1
0 0 1 0 0 1 0 0
1 0 1 0 0 0 1 1
0 1 0 0 0 1 0 0
0 0 1 0 0 1 0 0
1 0 0 0 1 0 0 1
X YUsers
• = bias for user
• = bias for item
• = regularization parameter
• = 1 if user streamed track else 0
•
• = user latent factor vector
• = item latent factor vector
Fix users
Solve for tracks
•Aggregate all (user, track) streams into a large matrix
•Goal: Approximate binary preference matrix by the inner product of 2 smaller matrices by
minimizing the weighted RMSE (root mean squared error) using a function of total plays as weight
•Why?: Once learned, the top recommendations for a user are the top inner products between
their latent factor vector in X and the track latent factor vectors in Y.
Alternating Least Squares
yi
Tracks
22
1 0 0 0 1 0 0 1
0 0 1 0 0 1 0 0
1 0 1 0 0 0 1 1
0 1 0 0 0 1 0 0
0 0 1 0 0 1 0 0
1 0 0 0 1 0 0 1
X YUsers
• = bias for user
• = bias for item
• = regularization parameter
• = 1 if user streamed track else 0
•
• = user latent factor vector
• = item latent factor vector
Fix users
Solve for tracks
Repeat until convergence…
•Aggregate all (user, track) streams into a large matrix
•Goal: Approximate binary preference matrix by the inner product of 2 smaller matrices by
minimizing the weighted RMSE (root mean squared error) using a function of total plays as weight
•Why?: Once learned, the top recommendations for a user are the top inner products between
their latent factor vector in X and the track latent factor vectors in Y.
Alternating Least Squares
yi
Tracks
23
1 0 0 0 1 0 0 1
0 0 1 0 0 1 0 0
1 0 1 0 0 0 1 1
0 1 0 0 0 1 0 0
0 0 1 0 0 1 0 0
1 0 0 0 1 0 0 1
X YUsers
• = bias for user
• = bias for item
• = regularization parameter
• = 1 if user streamed track else 0
•
• = user latent factor vector
• = item latent factor vector
Fix users
Solve for tracks
Repeat until convergence…
•Aggregate all (user, track) streams into a large matrix
•Goal: Approximate binary preference matrix by the inner product of 2 smaller matrices by
minimizing the weighted RMSE (root mean squared error) using a function of total plays as weight
•Why?: Once learned, the top recommendations for a user are the top inner products between
their latent factor vector in X and the track latent factor vectors in Y.
Alternating Least Squares
yi
Tracks
Vectors
•“Compact” representation for users and items(tracks) in the same space
Why Vectors? 25
•Vectors encode higher order dependencies
•Users and Items in the same vector space!
•Use vector similarity to compute:
•Item-Item similarities
•User-Item recommendations
•Linear complexity: order of number of latent factors
•Easy to scale up
26
•Compute track similarities and track recommendations for users as a similarity
measure
Step 3: Compute Recs!
•Euclidian Distance
•Cosine Similarity
•Pearson Correlation
26
•Compute track similarities and track recommendations for users as a similarity
measure
Step 3: Compute Recs!
Recommendations via Cosine Similarity 27
Recommendations via Cosine Similarity 27
28
Annoy
•70 million users, at least 4 million tracks for candidates per user
•Brute Force Approach:
•O(70M x 4M x 10) ~= 0(3 peta-operations)!
• Approximate Nearest Neighbor Oh Yeah!
• Uses Local Sensitive Hashing
• Clone: https://github.com/spotify/annoy
29
•Apply Filters
•Interacted music
•Holiday music anyone?
•Factor for:
•Diversity
•Freshness
•Popularity
•Demographics
•Seasonality
Step 4: Post Processing
30
70 Million users x 30
Million tracks. How to
scale?
Matrix Factorization with MapReduce
31
Reduce stepMap step
u % K = 0
i % L = 0
u % K = 0
i % L = 1
...
u % K = 0
i % L = L-1
u % K = 1
i % L = 0
u % K = 1
i % L = 1
... ...
... ... ... ...
u % K = K-1
i % L = 0
... ...
u % K = K-1
i % L = L-1
item vectors
item%L=0
item vectors
item%L=1
item vectors
i % L = L-1
user vectors
u % K = 0
user vectors
u % K = 1
user vectors
u % K = K-1
all log entries
u % K = 1
i % L = 1
u % K = 0
u % K = 1
u % K = K-1
•Split the matrix up into K x L blocks.
•Each mapper gets a different block, sums up intermediate terms, then key by
user (or item) to reduce final user (or item) vector.
Matrix Factorization with MapReduce
32
One map task
Distributed
cache:
All user vectors
where u % K = x
Distributed
cache:
All item vectors
where i % L = y
Mapper Emit contributions
Map input:
tuples (u, i, count)
where
u % K = x
and
i % L = y
Reducer New vector!
•Input to Mapper is a list of (user, item, count) tuples
– user modulo K is the same for all users in block
– item modulo L is the same for all items in the block
– Mapper aggregates intermediate contributions for each user (or item)
– Eg: K=4, Mapper #1 gets user 1, 5, 9, 13 etc
– Reducer keys by user (or item), aggregates intermediate mapper sums and solves closed form for final user
(or item) vector
Music Recommendations Data Flow
33
34
Source:
Revisiting YOLO!
35
“You Only Listen Once to judge
recommendations” problem
Optimizing for the Yolo Problem
•OFFLINE TESTING:
•Experts’ Inputs
•Measure accuracy
•A/B TESTS: control vs a/b group. Some useful metrics we consider:
•DAU / WAU / MAU
•Retention
•Session Length
•Skip Rate
36
Challenge Accepted!
•Cold start problem for both users and new music/upcoming artists:
•Content based signals, real time recommendation
•Measuring recommendation quality:
•A/B test metrics
•Active forums for getting user feedback
•Scam Attacks:
•Rule based model to detect scammers
•Humans choices are not always predictable:
•Faith in humanity
37
What Next?
•Personalize user experience on Spotify for every moment:
•Right Now
•Recommend other media formats:
•Podcasts
•Video
•Power music recommendations on other platforms:
•Google Now
38
Join the
Band!
We are hiring!
39
Thank You!
You can reach me @
Email: vidhya@spotify.com
Twitter: @vid052

More Related Content

What's hot

Interactive Recommender Systems
Interactive Recommender SystemsInteractive Recommender Systems
Interactive Recommender Systems
Roelof van Zwol
 
Big Data At Spotify
Big Data At SpotifyBig Data At Spotify
Big Data At Spotify
Adam Kawa
 

What's hot (20)

From Idea to Execution: Spotify's Discover Weekly
From Idea to Execution: Spotify's Discover WeeklyFrom Idea to Execution: Spotify's Discover Weekly
From Idea to Execution: Spotify's Discover Weekly
 
Music Personalization At Spotify
Music Personalization At SpotifyMusic Personalization At Spotify
Music Personalization At Spotify
 
Spotify Discover Weekly: The machine learning behind your music recommendations
Spotify Discover Weekly: The machine learning behind your music recommendationsSpotify Discover Weekly: The machine learning behind your music recommendations
Spotify Discover Weekly: The machine learning behind your music recommendations
 
Personalizing the listening experience
Personalizing the listening experiencePersonalizing the listening experience
Personalizing the listening experience
 
Approximate nearest neighbor methods and vector models – NYC ML meetup
Approximate nearest neighbor methods and vector models – NYC ML meetupApproximate nearest neighbor methods and vector models – NYC ML meetup
Approximate nearest neighbor methods and vector models – NYC ML meetup
 
Scala Data Pipelines @ Spotify
Scala Data Pipelines @ SpotifyScala Data Pipelines @ Spotify
Scala Data Pipelines @ Spotify
 
Music recommendations @ MLConf 2014
Music recommendations @ MLConf 2014Music recommendations @ MLConf 2014
Music recommendations @ MLConf 2014
 
Music Personalization : Real time Platforms.
Music Personalization : Real time Platforms.Music Personalization : Real time Platforms.
Music Personalization : Real time Platforms.
 
Machine learning @ Spotify - Madison Big Data Meetup
Machine learning @ Spotify - Madison Big Data MeetupMachine learning @ Spotify - Madison Big Data Meetup
Machine learning @ Spotify - Madison Big Data Meetup
 
Homepage Personalization at Spotify
Homepage Personalization at SpotifyHomepage Personalization at Spotify
Homepage Personalization at Spotify
 
ML+Hadoop at NYC Predictive Analytics
ML+Hadoop at NYC Predictive AnalyticsML+Hadoop at NYC Predictive Analytics
ML+Hadoop at NYC Predictive Analytics
 
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated Recommendations
 
Learning to Rank for Recommender Systems - ACM RecSys 2013 tutorial
Learning to Rank for Recommender Systems -  ACM RecSys 2013 tutorialLearning to Rank for Recommender Systems -  ACM RecSys 2013 tutorial
Learning to Rank for Recommender Systems - ACM RecSys 2013 tutorial
 
Storm at Spotify
Storm at SpotifyStorm at Spotify
Storm at Spotify
 
Interactive Recommender Systems
Interactive Recommender SystemsInteractive Recommender Systems
Interactive Recommender Systems
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Apache kafka performance(latency)_benchmark_v0.3
Apache kafka performance(latency)_benchmark_v0.3Apache kafka performance(latency)_benchmark_v0.3
Apache kafka performance(latency)_benchmark_v0.3
 
Big Data At Spotify
Big Data At SpotifyBig Data At Spotify
Big Data At Spotify
 
Netflix talk at ML Platform meetup Sep 2019
Netflix talk at ML Platform meetup Sep 2019Netflix talk at ML Platform meetup Sep 2019
Netflix talk at ML Platform meetup Sep 2019
 
Personalizing "The Netflix Experience" with Deep Learning
Personalizing "The Netflix Experience" with Deep LearningPersonalizing "The Netflix Experience" with Deep Learning
Personalizing "The Netflix Experience" with Deep Learning
 

Viewers also liked

Viewers also liked (6)

Playlist Recommendations @ Spotify
Playlist Recommendations @ SpotifyPlaylist Recommendations @ Spotify
Playlist Recommendations @ Spotify
 
Music & interaction
Music & interactionMusic & interaction
Music & interaction
 
Music survey results (2)
Music survey results (2)Music survey results (2)
Music survey results (2)
 
Mugo one pager
Mugo one pagerMugo one pager
Mugo one pager
 
Jackdaw research music survey report
Jackdaw research music survey reportJackdaw research music survey report
Jackdaw research music survey report
 
How We Listen to Music - SXSW 2015
How We Listen to Music - SXSW 2015How We Listen to Music - SXSW 2015
How We Listen to Music - SXSW 2015
 

Similar to Building Data Pipelines for Music Recommendations at Spotify

Community analysis using graph representation learning on social networks
Community analysis using graph representation learning on social networksCommunity analysis using graph representation learning on social networks
Community analysis using graph representation learning on social networks
Marco Brambilla
 
Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...
Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...
Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...
Sc Huang
 
Scalable Similarity-Based Neighborhood Methods with MapReduce
Scalable Similarity-Based Neighborhood Methods with MapReduceScalable Similarity-Based Neighborhood Methods with MapReduce
Scalable Similarity-Based Neighborhood Methods with MapReduce
sscdotopen
 
microposts2015presentation-150518124457-lva1-app6892.pdf
microposts2015presentation-150518124457-lva1-app6892.pdfmicroposts2015presentation-150518124457-lva1-app6892.pdf
microposts2015presentation-150518124457-lva1-app6892.pdf
SunnySam26
 

Similar to Building Data Pipelines for Music Recommendations at Spotify (20)

Collaborative Filtering with Spark
Collaborative Filtering with SparkCollaborative Filtering with Spark
Collaborative Filtering with Spark
 
FitCompete
FitCompeteFitCompete
FitCompete
 
Recent advances in deep recommender systems
Recent advances in deep recommender systemsRecent advances in deep recommender systems
Recent advances in deep recommender systems
 
Recommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative FilteringRecommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative Filtering
 
Recsys 2018 overview and highlights
Recsys 2018 overview and highlightsRecsys 2018 overview and highlights
Recsys 2018 overview and highlights
 
Practical Deep Learning Using Tensor Flow - Sandeep Kath
Practical Deep Learning Using Tensor Flow - Sandeep KathPractical Deep Learning Using Tensor Flow - Sandeep Kath
Practical Deep Learning Using Tensor Flow - Sandeep Kath
 
Recommender systems
Recommender systemsRecommender systems
Recommender systems
 
Scalable Recommendation Algorithms with LSH
Scalable Recommendation Algorithms with LSHScalable Recommendation Algorithms with LSH
Scalable Recommendation Algorithms with LSH
 
A new similarity measurement based on hellinger distance for collaborating fi...
A new similarity measurement based on hellinger distance for collaborating fi...A new similarity measurement based on hellinger distance for collaborating fi...
A new similarity measurement based on hellinger distance for collaborating fi...
 
Community analysis using graph representation learning on social networks
Community analysis using graph representation learning on social networksCommunity analysis using graph representation learning on social networks
Community analysis using graph representation learning on social networks
 
Recommendation Systems
Recommendation SystemsRecommendation Systems
Recommendation Systems
 
IntroductionRecommenderSystems_Petroni.pdf
IntroductionRecommenderSystems_Petroni.pdfIntroductionRecommenderSystems_Petroni.pdf
IntroductionRecommenderSystems_Petroni.pdf
 
[EUC2016] FFWD: latency-aware event stream processing via domain-specific loa...
[EUC2016] FFWD: latency-aware event stream processing via domain-specific loa...[EUC2016] FFWD: latency-aware event stream processing via domain-specific loa...
[EUC2016] FFWD: latency-aware event stream processing via domain-specific loa...
 
Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...
Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...
Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...
 
Scalable Similarity-Based Neighborhood Methods with MapReduce
Scalable Similarity-Based Neighborhood Methods with MapReduceScalable Similarity-Based Neighborhood Methods with MapReduce
Scalable Similarity-Based Neighborhood Methods with MapReduce
 
microposts2015presentation-150518124457-lva1-app6892.pdf
microposts2015presentation-150518124457-lva1-app6892.pdfmicroposts2015presentation-150518124457-lva1-app6892.pdf
microposts2015presentation-150518124457-lva1-app6892.pdf
 
Machine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data DemystifiedMachine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data Demystified
 
LinearAlgebra_2016updatedFromwiki.ppt
LinearAlgebra_2016updatedFromwiki.pptLinearAlgebra_2016updatedFromwiki.ppt
LinearAlgebra_2016updatedFromwiki.ppt
 
LinearAlgebra_2016updatedFromwiki.ppt
LinearAlgebra_2016updatedFromwiki.pptLinearAlgebra_2016updatedFromwiki.ppt
LinearAlgebra_2016updatedFromwiki.ppt
 
A Unified Music Recommender System Using Listening Habits and Semantics of Tags
A Unified Music Recommender System Using Listening Habits and Semantics of TagsA Unified Music Recommender System Using Listening Habits and Semantics of Tags
A Unified Music Recommender System Using Listening Habits and Semantics of Tags
 

Recently uploaded

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Recently uploaded (20)

Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 

Building Data Pipelines for Music Recommendations at Spotify

  • 1. October 17, 2015 Data Pipelines for Music Recommendations @ Spotify Vidhya Murali @vid052
  • 2. Vidhya Murali Who Am I? 2 •Areas of Interest: Data & Machine Learning •Data Engineer @Spotify •Masters Student from the University of Wisconsin Madison aka Happy Badger for life!
  • 3. “Torture the data, and it will confess!” 3 – Ronald Coase, Nobel Prize Laureate
  • 4. Spotify’s Big Data 4 •Started in 2006, now available in 58 countries • 70+ million active users, 20+ million paid subscribers • 30+ million songs in our catalog, ~20K added every day • 1.5 billion playlists so far and counting • 1 TB of user data logged every day • Hadoop cluster with 1500 nodes • ~20,000 Hadoop jobs per day
  • 5. Music Recommendations at Spotify Features: Discover Discover Weekly Moments Radio Related Artists 5
  • 7. Approaches 7 •Manual curation by Experts •Editorial Tagging •Metadata (e.g. Label provided data, NLP over News, Blogs) •Audio Signals •Collaborative Filtering Model
  • 8. Approaches 7 •Manual curation by Experts •Editorial Tagging •Metadata (e.g. Label provided data, NLP over News, Blogs) •Audio Signals •Collaborative Filtering Model
  • 9. Collaborative Filtering Model 8 •Find patterns from user’s past behavior to generate recommendations •Domain independent •Scalable •Accuracy (Collaborative Model) >= Accuracy (Content Based Model)
  • 10. Definition of CF 9 Hey, I like tracks P, Q, R, S! Well, I like tracks Q, R, S, T! Then you should check out track P! Nice! Btw try track T! Legacy Slide of Erik Bernhardsson
  • 12. The YoLo Problem 10 •YoLo Problem: “You Only Listen Once” to judge recommendations •Goal: Predict if users will listen to new music (new to user)
  • 13. The YoLo Problem 10 •YoLo Problem: “You Only Listen Once” to judge recommendations •Goal: Predict if users will listen to new music (new to user) •Challenges •Scale of catalog (30M songs + ~20K added every day) •Repeated consumption of music is not very uncommon •Music is niche •Music consumption is heavily influenced by user’s lifestyle
  • 14. The YoLo Problem 10 •YoLo Problem: “You Only Listen Once” to judge recommendations •Goal: Predict if users will listen to new music (new to user) •Challenges •Scale of catalog (30M songs + ~20K added every day) •Repeated consumption of music is not very uncommon •Music is niche •Music consumption is heavily influenced by user’s lifestyle •Input: Feedback is implicit through streaming behavior, collection adds, browse history, search history etc
  • 15. User Plays to Track Recs 11
  • 16. User Plays to Track Recs 11 1. Weighted play counts from logs
  • 17. User Plays to Track Recs 11 1. Weighted play counts from logs 2. Train Model using the input signals
  • 18. User Plays to Track Recs 11 1. Weighted play counts from logs 2. Train Model using the input signals 3. Generate recs from the trained model
  • 19. User Plays to Track Recs 11 1. Weighted play counts from logs 2. Train Model using the input signals 3. Generate recs from the trained model 4. Post process the recommendations
  • 20. 12 Step 1: ETL of Logs •Extract and transform the anonymized logs to training data set •Case: Logs -> (user, track, wt.count)
  • 21. Step 2: Construct Big Matrix! 13 Tracks(n) Users(m) Vidhya Burn by Ellie Goulding
  • 22. Step 2: Construct Big Matrix! 13 Tracks(n) Users(m) Vidhya Burn by Ellie Goulding Order of 70M x 30M!
  • 23. Latent Factor Models 14 Vidhya Burn .. . . . . .. . . . . .. . . . . .. . . . . .. . . . . •Use a “small” representation for each user and items(tracks): f-dimensional vectors .. . .. . .. . .. . . . ... ... ... ... .. m m n m n
  • 24. Latent Factor Models 14 Vidhya Burn .. . . . . .. . . . . .. . . . . .. . . . . .. . . . . •Use a “small” representation for each user and items(tracks): f-dimensional vectors .. . .. . .. . .. . . . ... ... ... ... .. m m n m n User Track Matrix: (m x n)
  • 25. Latent Factor Models 14 Vidhya Burn .. . . . . .. . . . . .. . . . . .. . . . . .. . . . . •Use a “small” representation for each user and items(tracks): f-dimensional vectors .. . .. . .. . .. . . . ... ... ... ... .. m m n m n User Vector Matrix: X: (m x f) User Track Matrix: (m x n)
  • 26. Latent Factor Models 14 Vidhya Burn .. . . . . .. . . . . .. . . . . .. . . . . .. . . . . •Use a “small” representation for each user and items(tracks): f-dimensional vectors .. . .. . .. . .. . . . ... ... ... ... .. m m n m n User Vector Matrix: X: (m x f) Track Vector Matrix: Y: (n x f) User Track Matrix: (m x n)
  • 27. Latent Factor Models 14 Vidhya Burn .. . . . . .. . . . . .. . . . . .. . . . . .. . . . . •Use a “small” representation for each user and items(tracks): f-dimensional vectors .. . .. . .. . .. . . . ... ... ... ... .. (here, f = 2) m m n m n User Vector Matrix: X: (m x f) Track Vector Matrix: Y: (n x f) User Track Matrix: (m x n)
  • 28. Matrix Factorization using Implicit Feedback 15
  • 29. Matrix Factorization using Implicit Feedback User Track Play Count Matrix 15
  • 30. Matrix Factorization using Implicit Feedback User Track Play Count Matrix User Track Preference Matrix Binary Label: 1 => played 0 => not played 15
  • 31. Matrix Factorization using Implicit Feedback User Track Play Count Matrix User Track Preference Matrix Binary Label: 1 => played 0 => not played Weights Matrix Weights based on play count and smoothing 15
  • 33. Implicit Matrix Factorization 17 1 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 •Aggregate all (user, track) streams into a large matrix •Goal: Approximate binary preference matrix by the inner product of 2 smaller matrices by minimizing the weighted RMSE (root mean squared error) using a function of total plays as weight •Why?: Once learned, the top recommendations for a user are the top inner products between their latent factor vector in X and the track latent factor vectors in Y. X YUsers Tracks • = bias for user • = bias for item • = regularization parameter • = 1 if user streamed track else 0 • • = user latent factor vector • = item latent factor vectoryi
  • 34. Alternating Least Squares 18 1 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 X YUsers Tracks • = bias for user • = bias for item • = regularization parameter • = 1 if user streamed track else 0 • • = user latent factor vector • = item latent factor vector Fix tracks •Aggregate all (user, track) streams into a large matrix •Goal: Approximate binary preference matrix by the inner product of 2 smaller matrices by minimizing the weighted RMSE (root mean squared error) using a function of total plays as weight •Why?: Once learned, the top recommendations for a user are the top inner products between their latent factor vector in X and the track latent factor vectors in Y. yi
  • 35. 19 1 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 X YUsers • = bias for user • = bias for item • = regularization parameter • = 1 if user streamed track else 0 • • = user latent factor vector • = item latent factor vector Fix tracks Solve for users •Aggregate all (user, track) streams into a large matrix •Goal: Approximate binary preference matrix by the inner product of 2 smaller matrices by minimizing the weighted RMSE (root mean squared error) using a function of total plays as weight •Why?: Once learned, the top recommendations for a user are the top inner products between their latent factor vector in X and the track latent factor vectors in Y. Alternating Least Squares yi Tracks
  • 36. 20 1 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 X YUsers • = bias for user • = bias for item • = regularization parameter • = 1 if user streamed track else 0 • • = user latent factor vector • = item latent factor vector Fix users •Aggregate all (user, track) streams into a large matrix •Goal: Approximate binary preference matrix by the inner product of 2 smaller matrices by minimizing the weighted RMSE (root mean squared error) using a function of total plays as weight •Why?: Once learned, the top recommendations for a user are the top inner products between their latent factor vector in X and the track latent factor vectors in Y. Alternating Least Squares yi Tracks
  • 37. 21 1 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 X YUsers • = bias for user • = bias for item • = regularization parameter • = 1 if user streamed track else 0 • • = user latent factor vector • = item latent factor vector Fix users Solve for tracks •Aggregate all (user, track) streams into a large matrix •Goal: Approximate binary preference matrix by the inner product of 2 smaller matrices by minimizing the weighted RMSE (root mean squared error) using a function of total plays as weight •Why?: Once learned, the top recommendations for a user are the top inner products between their latent factor vector in X and the track latent factor vectors in Y. Alternating Least Squares yi Tracks
  • 38. 22 1 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 X YUsers • = bias for user • = bias for item • = regularization parameter • = 1 if user streamed track else 0 • • = user latent factor vector • = item latent factor vector Fix users Solve for tracks Repeat until convergence… •Aggregate all (user, track) streams into a large matrix •Goal: Approximate binary preference matrix by the inner product of 2 smaller matrices by minimizing the weighted RMSE (root mean squared error) using a function of total plays as weight •Why?: Once learned, the top recommendations for a user are the top inner products between their latent factor vector in X and the track latent factor vectors in Y. Alternating Least Squares yi Tracks
  • 39. 23 1 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 X YUsers • = bias for user • = bias for item • = regularization parameter • = 1 if user streamed track else 0 • • = user latent factor vector • = item latent factor vector Fix users Solve for tracks Repeat until convergence… •Aggregate all (user, track) streams into a large matrix •Goal: Approximate binary preference matrix by the inner product of 2 smaller matrices by minimizing the weighted RMSE (root mean squared error) using a function of total plays as weight •Why?: Once learned, the top recommendations for a user are the top inner products between their latent factor vector in X and the track latent factor vectors in Y. Alternating Least Squares yi Tracks
  • 40. Vectors •“Compact” representation for users and items(tracks) in the same space
  • 41. Why Vectors? 25 •Vectors encode higher order dependencies •Users and Items in the same vector space! •Use vector similarity to compute: •Item-Item similarities •User-Item recommendations •Linear complexity: order of number of latent factors •Easy to scale up
  • 42. 26 •Compute track similarities and track recommendations for users as a similarity measure Step 3: Compute Recs!
  • 43. •Euclidian Distance •Cosine Similarity •Pearson Correlation 26 •Compute track similarities and track recommendations for users as a similarity measure Step 3: Compute Recs!
  • 44. Recommendations via Cosine Similarity 27
  • 45. Recommendations via Cosine Similarity 27
  • 46. 28 Annoy •70 million users, at least 4 million tracks for candidates per user •Brute Force Approach: •O(70M x 4M x 10) ~= 0(3 peta-operations)! • Approximate Nearest Neighbor Oh Yeah! • Uses Local Sensitive Hashing • Clone: https://github.com/spotify/annoy
  • 47. 29 •Apply Filters •Interacted music •Holiday music anyone? •Factor for: •Diversity •Freshness •Popularity •Demographics •Seasonality Step 4: Post Processing
  • 48. 30 70 Million users x 30 Million tracks. How to scale?
  • 49. Matrix Factorization with MapReduce 31 Reduce stepMap step u % K = 0 i % L = 0 u % K = 0 i % L = 1 ... u % K = 0 i % L = L-1 u % K = 1 i % L = 0 u % K = 1 i % L = 1 ... ... ... ... ... ... u % K = K-1 i % L = 0 ... ... u % K = K-1 i % L = L-1 item vectors item%L=0 item vectors item%L=1 item vectors i % L = L-1 user vectors u % K = 0 user vectors u % K = 1 user vectors u % K = K-1 all log entries u % K = 1 i % L = 1 u % K = 0 u % K = 1 u % K = K-1 •Split the matrix up into K x L blocks. •Each mapper gets a different block, sums up intermediate terms, then key by user (or item) to reduce final user (or item) vector.
  • 50. Matrix Factorization with MapReduce 32 One map task Distributed cache: All user vectors where u % K = x Distributed cache: All item vectors where i % L = y Mapper Emit contributions Map input: tuples (u, i, count) where u % K = x and i % L = y Reducer New vector! •Input to Mapper is a list of (user, item, count) tuples – user modulo K is the same for all users in block – item modulo L is the same for all items in the block – Mapper aggregates intermediate contributions for each user (or item) – Eg: K=4, Mapper #1 gets user 1, 5, 9, 13 etc – Reducer keys by user (or item), aggregates intermediate mapper sums and solves closed form for final user (or item) vector
  • 52. 34
  • 53. Source: Revisiting YOLO! 35 “You Only Listen Once to judge recommendations” problem
  • 54. Optimizing for the Yolo Problem •OFFLINE TESTING: •Experts’ Inputs •Measure accuracy •A/B TESTS: control vs a/b group. Some useful metrics we consider: •DAU / WAU / MAU •Retention •Session Length •Skip Rate 36
  • 55. Challenge Accepted! •Cold start problem for both users and new music/upcoming artists: •Content based signals, real time recommendation •Measuring recommendation quality: •A/B test metrics •Active forums for getting user feedback •Scam Attacks: •Rule based model to detect scammers •Humans choices are not always predictable: •Faith in humanity 37
  • 56. What Next? •Personalize user experience on Spotify for every moment: •Right Now •Recommend other media formats: •Podcasts •Video •Power music recommendations on other platforms: •Google Now 38
  • 57. Join the Band! We are hiring! 39
  • 58. Thank You! You can reach me @ Email: vidhya@spotify.com Twitter: @vid052