This document describes a contextual TV recommendation system called Watch-It-Next that provides recommendations for shared smart TV devices. It uses the program currently being watched as contextual information to infer who is likely watching and narrow the recommendations. It presents a technique called "3-Way" that learns a standard matrix factorization model and scores recommendation candidates based on their agreement with both the device and context item vectors, without changing the learning algorithm. An evaluation on a large dataset of TV viewing data found the contextual recommendations outperformed non-contextual baselines, indicating the current context helps disambiguate who is watching on shared devices.
3. Challenge: Recommendations in Shared Accounts and Devices
“I am a 34 yo man who enjoys action and sci-fi movies. This is what my children have done to my netflix account”
09/10/15 3
4. Our Focus: Recommendations for Smart TVs
Main problems:
- Inferring who has consumed each item in the past
- Who is currently requesting the recommendations
- “Who” can be a subset of users
Smart TVs can track what is being watched on them, but not who was watching.
7. This Work: Contextual Personalized Recommendations
WatchItNext problem: it is 8:30pm and “House of Cards” is on. What should we recommend to be watched next on this device?
Implicit assumption: there is a good chance that whoever is in front of the set now will remain there.
8. WatchItNext Inputs and Output
Input: the available programs, a.k.a. the “line-up”. Output: ranked recommendations.
9. Collaborative Filtering
A fundamental principle in recommender systems:
- Taps similarities in patterns of consumption/enjoyment of items by users
- Recommends to a user what users with detected similar tastes have consumed/enjoyed
10. Collaborative Filtering – Mathematical Abstraction
Consider a consumption matrix R of users and items, of size |U| x |I|:
- r_{u,i} = 1 whenever person u consumed item i
- In other cases, r_{u,i} might be person u’s rating on item i
- The matrix R is typically very sparse…
- …and often very large
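Because R is so sparse, it is rarely stored as a dense |U| x |I| array. A minimal sketch (toy data, illustrative names, not from the talk) of the usual event-list representation:

```python
# Minimal sketch: a sparse consumption matrix R stored as per-user item
# sets instead of a dense |U| x |I| array (toy data, hypothetical names).
events = [("u1", "news"), ("u1", "sports"), ("u2", "sports")]

R = {}                              # R[u] = set of items user u consumed
for u, i in events:
    R.setdefault(u, set()).add(i)

def r(u, i):
    """r_{u,i} = 1 if person u consumed item i, else 0."""
    return 1 if i in R.get(u, set()) else 0
```

Only observed (user, item) pairs are stored; every absent pair is implicitly 0.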
11. Collaborative Filtering – Matrix Factorization
Latent factor models (LFM):
- Map both users and items to some f-dimensional space R^f, i.e., produce f-dimensional vectors v_u and w_i for each user and item
- Define rating estimates as inner products: q_{u,i} = <v_u, w_i>
- Main problem: finding a mapping of users and items to the latent factor space that produces “good” estimates
[Diagram: R (|U| x |I|) ≈ V (|U| x f) times W (f x |I|)]
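As a sketch of how such a factorization can be learned, here is plain SGD on a toy consumption matrix; the data, learning rate, regularization, and dimension f are all illustrative assumptions, not values from the talk:

```python
import numpy as np

# Toy consumption matrix: R[u, i] = 1 means user u consumed item i.
R = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1]], dtype=float)

n_users, n_items, f = R.shape[0], R.shape[1], 2
rng = np.random.default_rng(0)
V = rng.normal(scale=0.1, size=(n_users, f))    # user vectors v_u
W = rng.normal(scale=0.1, size=(f, n_items))    # item vectors w_i

lr, reg = 0.05, 0.01
for _ in range(500):                            # SGD over all cells
    for u in range(n_users):
        for i in range(n_items):
            err = R[u, i] - V[u] @ W[:, i]      # q_ui = <v_u, w_i>
            V[u] += lr * (err * W[:, i] - reg * V[u])
            W[:, i] += lr * (err * V[u] - reg * W[:, i])

Q = V @ W                                       # estimated matrix, Q ≈ R
```

With f smaller than the matrix rank the fit is only approximate, which is exactly the point: the latent space compresses taste patterns instead of memorizing R.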
12. Main Contribution: “3-Way” Technique
- Learn a standard matrix factorization model (LFM/LDA)
- When recommending to a device d currently watching context item c, score each target item t as: S(t follows c | d) = Σ_{j=1..f} v_d(j)·w_c(j)·w_t(j)
- May require an additive shift to get rid of negative values
- The score is high for targets that agree with both the context and the device
- This yields “Sequential LFM/LDA” – a personalized contextual recommender
- Again, there is no need to model context or change the learning algorithm: learn as usual, and only change the scoring step
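The 3-way score is a single element-wise product summed over the latent dimensions. A minimal sketch with toy vectors (the vectors would in practice come from the learned LFM/LDA model; everything here is illustrative):

```python
import numpy as np

# Toy, non-negative factor vectors (e.g., after the additive shift):
# v_d = device vector, w_c = context item vector, W_t = candidate targets.
f = 4
rng = np.random.default_rng(1)
v_d = rng.random(f)                 # device d
w_c = rng.random(f)                 # context item c (what is on right now)
W_t = rng.random((5, f))            # the "line-up" of 5 candidate items

def three_way_score(v_d, w_c, w_t):
    """S(t follows c | d) = sum_j v_d(j) * w_c(j) * w_t(j)."""
    return float(np.sum(v_d * w_c * w_t))

scores = np.array([three_way_score(v_d, w_c, w_t) for w_t in W_t])
ranking = np.argsort(-scores)       # ranked recommendations, best first
```

Note that a target only scores high when the same latent dimensions are active in all three vectors, which is what makes the context narrow the device's taste profile.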
13. Data by the Numbers
Training data: three months’ worth of viewership data
Test data: derived from one month of viewership data

Devices: 339,647    Unique items*: 17,232    Triplets: more than 19M

Setting        Test instances    Average line-up size
Habitual       ~3.8M             390
Exploratory    ~1.7M             349

* Items are {movie, sports event, series} – not at the individual-episode level
14. Metric: Avg. Rank Percentile (ARP)
Rank Percentile (RP) properties:
- Ranges in (0, 1]
- Higher is better
- Random scoring gives ~0.5 in large line-ups
[Figure: example line-ups with the true next item ranked at RP = 1.0, 0.75, 0.50, and 0.25]
Note: with large line-ups, ARP is practically equivalent to average AUC
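Under one plausible reading of the figure, the rank percentile of the true next item is its rank from the bottom of the scored line-up, normalized by the line-up size; averaging over test instances gives ARP. The tie handling below is an assumption:

```python
import numpy as np

def rank_percentile(scores, true_idx):
    """RP of the true next item: fraction of line-up items it scores
    at least as high as (including itself), so RP is in (0, 1]."""
    scores = np.asarray(scores, dtype=float)
    beaten_or_self = np.sum(scores <= scores[true_idx])
    return beaten_or_self / len(scores)

# Line-up of 4 items; the true next item got the second-highest score.
rp = rank_percentile([0.9, 0.7, 0.4, 0.1], true_idx=1)  # -> 0.75
```

A random scorer puts the true item at a uniformly random position, so its expected RP approaches 0.5 as the line-up grows, matching the slide's baseline.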
16. Contextual Personalized Recommenders
- SequentialLDA [LFM]: 3-way element-wise multiplication of the device vector, the context item vector, and the target item vector
- TemporalLDA [LFM]: regular LDA/LFM score, multiplied by temporal popularity
- TempSeqLDA [LFM]: 3-way score multiplied by temporal popularity
17. Results (1): Sequential Context Matters
Degradation when using a random item as context indicates that the correct context item reflects the current viewing session, and implicitly the current watchers of the device.
18. Results (2): Sequential Context Matters
Device entropy: the entropy of p(topic | device), as computed by LDA on the training data; high values correspond to diverse topic distributions.
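Device entropy is the standard Shannon entropy applied to a device's LDA topic distribution. A sketch with made-up distributions (the real ones come from LDA fitted on the training viewership):

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability distribution p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                    # convention: 0 * log(0) = 0
    return float(-np.sum(p * np.log2(p)))

# Illustrative p(topic | device) distributions, not from the paper's data:
focused = entropy([0.97, 0.01, 0.01, 0.01])  # one dominant topic
diverse = entropy([0.25, 0.25, 0.25, 0.25])  # uniform over 4 topics
```

A device watched by one person with narrow tastes looks like `focused`; a shared family device with varied viewing looks like `diverse`, and is the harder case the slide discusses.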
20. Conclusions
- Multi-user or shared devices pose challenging recommendation problems.
- Sequential context helps: it “narrows” the topical variety of the program to be watched next on the device.
- Intuitively, context serves to implicitly disambiguate the current user or users of the device.
- The 3-Way technique is an effective way of incorporating sequential context that has no impact on learning.
Thank you! Questions?
Please come visit the poster tomorrow.
Raz@yahoo-inc.com
Editor's Notes
Even in Carlos’ example his top recommendation row had “House of Cards” next to a couple of kids shows.
Related task on ratings data: matrix completion
Predict users’ ratings for items they have yet to rate, i.e. “complete” missing values
On last bullet – this is the main challenge for recommender systems.
Habitual setting: all line-up items are eligible for recommendation to a device
Exploratory setting: only items that were not previously watched on the device are eligible for recommendation
So now let’s look at the results. First, I’ll try to convince you that sequential context matters.
What you can see here is the ARP of vanilla LDA and of sequential LDA (the 3-way technique) as a function of device entropy. Think of the entropy as a measure of the topical variety of a device: the higher it is, the more diverse the device’s viewing habits. So we expect that as the entropy rises, it will become harder to recommend. But using the sequential context mitigates the degradation in prediction accuracy, and even for the highest-entropy devices (where vanilla LDA gives almost random predictions) we get an ARP above 70%.