Spotify uses both push and pull paradigms to match artists and fans in a personal and relevant way. The push paradigm is exemplified by Home, which surfaces personalized playlists using an algorithm called BaRT. BaRT is a multi-armed bandit algorithm that explores and exploits to select playlists based on a reward function. Research shows personalizing the reward function for each user and playlist type improves results. Search represents the pull paradigm, where users search for specific music. Understanding user intent and mindset helps improve search satisfaction. Both paradigms aim to reduce effort and increase success based on offline and online evaluation. Voice interactions may represent a hybrid paradigm.
3. Spotify’s mission is to unlock the potential of human creativity — by giving a million creative artists the opportunity to live off their art and billions of fans the opportunity to enjoy and be inspired by it.
7. “We conclude that information retrieval and information filtering are indeed two sides of the same coin. They work together to help people get the information needed to perform their tasks.”
Information filtering and information retrieval: Two sides of the same coin? NJ Belkin & WB Croft, Communications of the ACM, 1992.
8. “We can conclude that recommender systems and search are also two sides of the same coin at Spotify. They work together to help fans get the music they will enjoy listening to.”
PULL PARADIGM vs PUSH PARADIGM: is this the case?
10. Home
Home is the default screen of the mobile app for all Spotify users worldwide. It surfaces the best of what Spotify has to offer for every situation: personalized playlists, new releases, old favorites, and undiscovered gems. The goal is to help users find something they are going to enjoy listening to, quickly.
11. BaRT: machine learning algorithm for Spotify Home. [diagram: BaRT, User, Streaming]
Explore, Exploit, Explain: Personalizing Explainable Recommendations with Bandits. J McInerney, B Lacker, S Hansen, K Higley, H Bouchard, A Gruson & R Mehrotra, RecSys 2018.
12. BaRT (Bandits for Recommendations as Treatments)
First, how to rank the playlists (cards) within each shelf; then, how to rank the shelves?
13. Explore vs Exploit
BaRT: multi-armed bandit algorithm for Home.
Flip a coin with a given probability of tails:
If heads, pick the best card in M according to predicted reward r → EXPLOIT
If tails, pick a card from M at random → EXPLORE
https://hackernoon.com/reinforcement-learning-part-2-152fb510cc54
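The coin flip described above is the classic epsilon-greedy bandit policy. A minimal sketch in Python (function names and the epsilon value are illustrative, not Spotify's implementation):

```python
import random

def epsilon_greedy_pick(cards, predicted_reward, epsilon=0.1):
    """Pick one card from the candidate set M, epsilon-greedy style.

    cards: list of candidate cards (the set M)
    predicted_reward: dict mapping card -> predicted reward r
    epsilon: probability of the coin landing tails (explore)
    """
    if random.random() < epsilon:
        # EXPLORE: pick a card uniformly at random
        return random.choice(cards)
    # EXPLOIT: pick the card with the highest predicted reward
    return max(cards, key=lambda c: predicted_reward[c])
```

With epsilon=0 the policy always exploits; with epsilon=1 it always explores. Tuning epsilon trades off learning about cards against serving the best-known card.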
14. Success is captured by the reward function
Reward = binarised streaming time: success is when the user streams the playlist for at least 30 seconds. [diagram: BaRT, User, Streaming]
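The binarised streaming-time reward above is a simple threshold. A sketch (the 30-second threshold is from the slide; the function name is an assumption):

```python
def reward(stream_seconds, threshold=30):
    """Binarised streaming-time reward: 1 if the user streamed
    the recommended playlist for at least `threshold` seconds,
    0 otherwise."""
    return 1 if stream_seconds >= threshold else 0
```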
15. Is success the same for all playlists?
Consumption time of a sleep playlist is longer than average playlist consumption time. Jazz listeners consume jazz and other playlists for longer periods than average users.
16. Personalizing the reward function for BaRT
One reward function for all users and all playlists: success is independent of user and playlist.
One reward function per user x playlist: success depends on the individual user and playlist, but this is too granular: sparse, noisy, costly to generate and maintain.
One reward function per group of users x playlists: success depends on the group of users listening to the group of playlists.
17. Co-clustering using streaming time
Users and playlists are co-clustered into user groups and playlist groups (group = cluster; group of users x playlists = co-cluster).
Dhillon, Mallela & Modha, "Information-theoretic co-clustering", KDD 2003.
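One way the co-clusters can feed the reward function is to give each user-group x playlist-group pair its own success threshold instead of a single global 30-second rule. A hypothetical sketch (group names and threshold values are invented for illustration; the papers cited here learn per-cluster rewards from consumption time rather than hand-setting them):

```python
def group_reward(user_group, playlist_group, stream_seconds,
                 thresholds, default=30):
    """Binarised reward with a per co-cluster success threshold.

    thresholds: dict mapping (user_group, playlist_group) -> seconds.
    E.g. sleep playlists may need a longer stream than the default
    before counting as success. (Illustrative values only.)
    """
    t = thresholds.get((user_group, playlist_group), default)
    return 1 if stream_seconds >= t else 0
```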
19. Using playlist consumption time to inform the metric to optimise for (the Home reward function)
Optimizing for mean consumption time led to +22.24% in predicted stream rate; defining the reward per user x playlist cluster led to a further +13%.
Metrics: mean of consumption time; co-clustering user group x playlist type; importance of affinity-type features over generic (age/day) features.
Deriving User- and Content-specific Rewards for Contextual Bandits. P Dragone, R Mehrotra & M Lalmas, WWW 2019.
20. User intent
Three intent models: intent is important to interpret user interaction.
Passively Listening:
- quickly access playlists or saved music
- play music matching mood or activity
- find music to play in the background
Actively Engaging:
- discover new music to listen to now
- save new music or follow new playlists for later
- explore artists or albums more deeply
Machine learning across intents on Home is better than intent-agnostic machine learning: considering intent improves the ability to infer user satisfaction by 20%.
Jointly Leveraging Intent and Interaction Signals to Predict User Satisfaction with Slate Recommendations. R Mehrotra, M Lalmas, D Kenney, T Lim-Meng & G Hashemian, WWW 2019.
21. Diversity vs Relevance
A playlist is deemed diverse if it contains tracks from artists in different popularity groups. Very few sets have both high relevance and high diversity: a recommender system optimizing for relevance may not have a high diversity estimate.
Gains in fairness are possible without severe loss of satisfaction. Adaptive policies aware of user receptiveness perform better.
Towards a Fair Marketplace: Counterfactual Evaluation of the Trade-off between Relevance, Fairness & Satisfaction in Recommendation Systems. R Mehrotra, J McInerney, H Bouchard, M Lalmas & F Diaz, CIKM 2018.
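A minimal sketch of the diversity notion above: a playlist counts as diverse when its artists span more than one popularity group (the two-group threshold and the data layout are our assumptions, not the paper's exact estimator):

```python
def is_diverse(artists, popularity_group, min_groups=2):
    """A playlist is deemed diverse if its artists span at least
    `min_groups` distinct popularity groups.

    artists: list of artist IDs on the playlist
    popularity_group: dict mapping artist ID -> popularity bucket
    """
    groups = {popularity_group[a] for a in artists}
    return len(groups) >= min_groups
```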
22. Offline evaluation
An offline evaluation framework to launch, evaluate and archive machine learning studies, ensuring reproducibility and allowing sharing across teams.
Offline Evaluation to Make Decisions About Playlist Recommendation Algorithms. A Gruson, P Chandar, C Charbuillet, J McInerney, S Hansen, D Tardieu & B Carterette, WSDM 2019.
24. Searching for music
- Large catalog: 40M+ songs, 3B+ playlists, 2K+ microgenres
- Many languages: 79 markets
- Different modalities: typed, voice
- Heterogeneous content: music, podcasts
- Various granularities: song, artist, playlist, podcast
- Various goals: focus, discover, lean-back, mood, activity
25. Overview of the user journey in search
- TYPE/TALK: the user communicates with us
- CONSIDER: the user evaluates what we show them
- DECIDE: the user ends the search session
- INTENT: what the user wants to do
- MINDSET: how the user thinks about results
27. Search is instantaneous
The search logs for “satisfaction”: s, sa, satt, sat, sati, statis, …
From prefix to query:
→ What is the actual query?
→ What is a click vs a prefix vs a query?
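One simple heuristic for the prefix-vs-query problem, assuming a time-ordered keystroke/click log (an illustrative sketch, not Spotify's actual segmentation logic): collapse each run of keystrokes to its last string and treat that as the actual query, with a click (or abandonment) ending the query.

```python
def final_queries(events):
    """Segment an instant-search log into (query, clicked_item) pairs.

    events: list of ("key", text) or ("click", item) tuples in time
    order. Each run of keystrokes is collapsed to its last text, which
    we take to be the actual query; a click ends the current query, and
    a trailing keystroke run with no click is an abandoned query.
    """
    queries, current = [], None
    for kind, value in events:
        if kind == "key":
            current = value                  # latest prefix so far
        elif kind == "click" and current is not None:
            queries.append((current, value)) # query resolved by a click
            current = None
    if current is not None:
        queries.append((current, None))      # abandoned, no click
    return queries
```

Real logs need more care (backspacing, pauses, pagination), but this captures the core ambiguity the slide raises: intermediate strings like "satt" are prefixes, not queries.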
28. A user can approach any intent with any mindset
Mindsets:
- FOCUSED: one specific thing in mind
- OPEN: a seed of an idea in mind
- EXPLORATORY: a path to explore
Intents:
- LISTEN: have a listening session
- ORGANIZE: curate for future listening
- SHARE: connect with friends
- FACT CHECK: find specific information
The EXPLORATORY mindset seems rare and is likely better served by other features such as Browse. LISTEN and ORGANIZE are the most prominent intents, associated with lean-back vs lean-in behavior.
29. User mindsets: how the user thinks about results
FOCUSED (one specific thing in mind):
- find it or not
- quickest/easiest path to results is important
OPEN (a seed of an idea in mind):
- from nothing good enough, to good enough, to better than good enough
- willing to try things out, but still want to fulfil their intent
EXPLORATORY (a path to explore):
- difficult for users to assess how it went; may be able to answer in relative terms
- users expect to be active when in an exploratory mindset; effort is expected
Just Give Me What I Want: How People Use and Evaluate Music Search. C Hosey, L Vujović, B St. Thomas, J Garcia-Gathright & J Thom, CHI 2019.
30. Focused mindset
Understanding mindset helps us understand search satisfaction. 65% of searches were focused. When users search with a focused mindset, they:
- put MORE effort into search: scroll down and click on lower-ranked results
- click MORE on albums/tracks/artists and LESS on playlists
- are MORE likely to save/add but LESS likely to stream directly
Search Mindsets: Understanding Focused and Non-Focused Information Seeking in Music Search. A Li, J Thom, P Ravichandran, C Hosey, B St. Thomas & J Garcia-Gathright, WWW 2019.
31. Metrics
Users evaluate their experience in search based on two main factors, success and effort, across the journey: TYPE (the user communicates with us), CONSIDER (the user evaluates what we show them), DECIDE (the user ends the search session).
Success rate is more sensitive than click-through rate.
Developing Evaluation Metrics for Instant Search Using Mixed Methods. P Ravichandran, J Garcia-Gathright, C Hosey, B St. Thomas & J Thom, SIGIR 2019.
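To make the success-vs-effort framing concrete, a sketch of the two rates over logged sessions (field names and the definition of "successful" are our assumptions; success can include satisfying outcomes beyond clicks, which is one reason success rate can be more sensitive than click-through rate):

```python
def search_metrics(sessions):
    """Compute click-through rate and success rate over search sessions.

    sessions: list of dicts with boolean flags, e.g.
      {"clicked": True, "successful": True}
    "successful" marks any satisfying outcome (click, play, save, ...),
    so the two rates can move independently. (Illustrative only.)
    """
    n = len(sessions)
    ctr = sum(s["clicked"] for s in sessions) / n
    success_rate = sum(s["successful"] for s in sessions) / n
    return ctr, success_rate
```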
33. Search by voice
Users ask Spotify to play music without saying what they would like to hear → open mindset.
Example utterances: “Play Spotify”, “Play music”, “Play music from Spotify”, “Play me some music”, “Play the music”, “Play my Spotify”, “Play some music on Spotify”, “Play some music”, “Play music on Spotify”.
34. Search by voice: search as a push paradigm
Non-specific querying is a way for a user to effortlessly start a listening session via voice; it removes the burden of choice when a user is open to lean-back listening.
User education matters, as users will not engage in a use case they do not know about.
Trust and control are central to a positive experience: users need to trust the system enough to try it out.