Online Learning to Rank for Recommender Systems
by Daan Odijk (Blendle)
https://recsys.acm.org/recsys17/industry-session-3/
Every morning at Blendle, we face a huge cold-start problem when over 8,000 new articles from the latest editions of newspapers arrive in our system. At that moment, these articles have been read by virtually no one, and we are tasked with sending out personalised newsletters to over 1 million users. We thus cannot rely on collaborative-filtering-style recommendations, nor can we use the popularity of the articles as a clue to what our users might want to read. We overcome our cold-start problem with a mix of curation by our editorial team and an automated analysis of the content of these articles. We extract named entities, semantic links, authors, the language and plenty of stylometrics. For each of our users, we build a very fine-grained profile based on the attributes of the articles that they read. The combination of enriched articles and user profiles is fed into our machine learning pipeline. We are currently experimenting with an online learning to rank setup, where each of our users is exposed to a slightly perturbed version of our ranking model. We observe the interactions of our users to infer in which direction we should update the model.
Our editorial team gets up at around 5am every morning to read what was published overnight. They are done reading and recommending their selection of articles around 8am, which is also the time we would ideally send out the newsletter, so that our users can read it on their commute to work. These timing restrictions pose yet another challenge: our content analysis and machine learning pipeline needs to be really fast. We solve this by using a streaming infrastructure built on Kafka. In this infrastructure, an article is analysed and scored for relevance towards each of our users as soon as it arrives. This has the advantage that at 8am, when our editorial team is done reading, personalisation is much more lightweight. We use the precomputed relevance scores and balance them with diversity to arrive at a unique ranking for each of our users. In this talk, I will detail how we enrich articles in a streaming fashion and how we use online learning methods to learn a ranking model. I will also talk about how we deal with the time constraints of the problem we are trying to solve.
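The last step above, balancing precomputed relevance scores with diversity, could look roughly like the following MMR-style greedy re-ranker. This is a minimal sketch under stated assumptions: the function name, the lambda trade-off and the similarity callback are illustrative, not Blendle's actual code.

```python
# Hypothetical sketch: re-rank precomputed relevance scores with an
# MMR-style diversity penalty. All names and the lambda trade-off
# are illustrative assumptions.

def diversify(candidates, similarity, k=10, lam=0.7):
    """Greedily pick k articles, trading off relevance against
    similarity to already-selected articles.

    candidates: dict article_id -> precomputed relevance score
    similarity: f(a, b) -> similarity in [0, 1]
    """
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:
        def mmr(a):
            max_sim = max((similarity(a, s) for s in selected), default=0.0)
            return lam * remaining[a] - (1 - lam) * max_sim
        best = max(remaining, key=mmr)
        selected.append(best)
        del remaining[best]
    return selected
```

With lam close to 1 the ranking follows pure relevance; lowering it pushes near-duplicate articles down the list.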
ABOUT THE SPEAKER
Daan Odijk is lead data scientist at Blendle, a New York Times backed startup that builds a platform where users can explore and support the world’s best journalism and only pay for what they read. Daan heads a team of eight data scientists and engineers who work on personalised recommendations. Daan has a PhD in information retrieval and has worked on leveraging context when searching for news.
2. @dodijk
Mission
Help you discover and support
the world’s best journalism
International
May 2014: The Netherlands
Sept 2015: Germany
March 2016: United States
Publisher-backed
among others NY Times, Nikkei, Axel Springer
70 employees
10 journalists & 50 developers
Blendle
5. Scale at Blendle
Articles
> 6M in total
> 7K new every day
> 30% is read
Users
> 1M users
~ 1 in 5 converts to a
paying user
Events
~ 2B in total
> 2M new every day
6. Our editors select the best articles for our email newsletter every day. Our personalisation algorithms create a personal bundle from this.
12. Prioritised selection: a Random Forest classifier trained on a year of editorial picks; articles clustered based on cosine similarity of TF.IDF vectors.
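The selection step on this slide could be sketched with scikit-learn as below. The toy articles, the labels and the 0.2 similarity threshold are illustrative assumptions; only the two techniques named on the slide (Random Forest on editorial picks, TF.IDF cosine clustering) come from the source.

```python
# Sketch of the slide's selection step: a Random Forest scores
# articles against past editorial picks, and TF.IDF cosine
# similarity groups articles about the same story. Toy data and
# the threshold are illustrative assumptions.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

texts = [
    "budget vote in parliament",
    "parliament passes the budget bill",
    "local football derby ends in a draw",
]
picked = [1, 1, 0]  # did an editor pick this article? (toy labels)

X = TfidfVectorizer().fit_transform(texts)

# Random Forest trained on (a year of) editorial picks.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X.toarray(), picked)
scores = clf.predict_proba(X.toarray())[:, 1]

# Group articles whose TF.IDF cosine similarity exceeds a threshold,
# so the newsletter does not show several takes on the same story.
sim = cosine_similarity(X)
same_story = sim > 0.2  # threshold is an illustrative assumption
```

In practice the classifier's probability can serve as a prioritisation score, while the similarity clusters keep the selection diverse.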
14. Daily cold start
• >7K new articles every night
• Our newsletter is an important traffic driver
• No usage info to rank the newsletter before we send it
17. Learning to rank: preference learning
[Diagram: Enrich + Profile → Extract ML Features → Model (learning to predict)]
18. Learning to rank: preference learning
[Diagram: training — Enrich + Profile → Extract ML Features → Model (learning to predict); serving — Enrich + Profile → Extract ML Features → Rank → Ranking]
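The "learning to predict" step can be illustrated with a toy pairwise preference learner: from each observation that a user read article A but skipped article B, nudge the weights so that A scores higher. The feature vectors, loss and learning rate here are assumptions, not the production model.

```python
# Toy sketch of pairwise preference learning: learn weights w such
# that w @ x_preferred > w @ x_other for each observed preference.
# Loss and hyperparameters are illustrative assumptions.
import numpy as np

def train_pairwise(pairs, n_features, lr=0.1, epochs=50):
    """pairs: list of (x_preferred, x_other) feature-vector tuples."""
    w = np.zeros(n_features)
    for _ in range(epochs):
        for x_pos, x_neg in pairs:
            diff = x_pos - x_neg
            # logistic loss on the score difference
            p = 1.0 / (1.0 + np.exp(-(w @ diff)))
            w += lr * (1.0 - p) * diff
    return w
```

Ranking at serving time is then just sorting articles by `w @ x`.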
19. Online Learning to Rank
• Learning with a user in the loop
• Daily updates to our model
20. Dueling Bandit Gradient Descent
[Yue et al., 2009; Hofmann et al., 2011]
[Diagram: an exploitative ranker and an explorative ranker, each with weights w_Author and w_Topic, applied to a query]
For Blendle, the user is the query
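A minimal sketch of the Dueling Bandit Gradient Descent loop on this slide: perturb the current (exploitative) weights into an explorative ranker, let interleaved user clicks decide the winner, and step towards the perturbation if it wins. The `compare()` oracle and the step sizes `delta`/`gamma` are illustrative assumptions.

```python
# Minimal Dueling Bandit Gradient Descent step (Yue et al., 2009):
# perturb, compare via user feedback, move towards the winner.
import numpy as np

rng = np.random.default_rng(0)

def dbgd_step(w, compare, delta=1.0, gamma=0.1):
    """One DBGD update.

    w:       current weight vector (exploitative ranker)
    compare: f(w, w_prime) -> True if users preferred w_prime
             (in practice inferred from interleaved clicks)
    """
    u = rng.normal(size=w.shape)
    u /= np.linalg.norm(u)      # random unit direction
    w_prime = w + delta * u     # explorative ranker
    if compare(w, w_prime):
        w = w + gamma * u       # small step towards the winner
    return w
```

Run daily (as on slide 19), this gives incremental model updates driven only by relative user preferences, never absolute relevance labels.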
21. Interleaved Ranking
[Figure: TeamDraft Interleave of an exploitative ranking and an explorative ranking for a query — A, B, C, D, E, F and C, G, D, A, B, E — drafted step by step into one list]
Radlinski, F., Kurup, M., & Joachims, T. (2008). How does clickthrough data reflect retrieval quality? In CIKM '08.
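TeamDraft interleaving, as cited above, can be sketched as follows: the two rankings alternately draft their highest-ranked not-yet-picked article (coin flip on ties), and each drafted article is credited to its team so that clicks can later be compared. A toy sketch, not Blendle's implementation; the tie-breaking seed is an assumption.

```python
# Sketch of TeamDraft interleaving (Radlinski et al., CIKM '08).
import random

def team_draft(ranking_a, ranking_b, rng=None):
    """Interleave two rankings; returns (interleaved list,
    dict mapping doc -> drafting team 'A' or 'B')."""
    rng = rng or random.Random(42)
    interleaved, team, seen = [], {}, set()
    count = {'A': 0, 'B': 0}

    def next_unseen(ranking):
        for doc in ranking:
            if doc not in seen:
                return doc
        return None

    while next_unseen(ranking_a) is not None or next_unseen(ranking_b) is not None:
        # the team with fewer picks drafts next; coin flip on ties
        if count['A'] < count['B']:
            turn = 'A'
        elif count['B'] < count['A']:
            turn = 'B'
        else:
            turn = rng.choice(['A', 'B'])
        doc = next_unseen(ranking_a if turn == 'A' else ranking_b)
        if doc is None:          # this team's list is exhausted
            turn = 'B' if turn == 'A' else 'A'
            doc = next_unseen(ranking_a if turn == 'A' else ranking_b)
        seen.add(doc)
        interleaved.append(doc)
        team[doc] = turn
        count[turn] += 1
    return interleaved, team
```

Clicks on the interleaved list are credited to the team that drafted the clicked article; the ranker whose team collects more clicks wins the duel.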
22. Interleaved Ranking (TeamDraft Interleave animation continues)
23. Interleaved Ranking (TeamDraft Interleave animation continues)
39. Timing problem
• Our editors wake up at 5am and are done reading at 7am
• Which is also when we want to send our newsletter
• We simply can't wait for a batch process