Learning to Rank: From Theory to Production - Malvina Josephidou & Diego Ceccarelli, Bloomberg
1. © 2018 Bloomberg Finance L.P. All rights reserved.
Learning to Rank:
From Theory to Production
Malvina Josephidou & Diego Ceccarelli
Bloomberg
@malvijosephidou | @diegoceccarelli
#Activate18 #ActivateSearch
2.
About Us
o Software Engineers at Bloomberg
o Working on relevance of the News search engine
o Before joining, PhDs in ML and IR
3.
Bloomberg – Who are we?
o A technology company with
5,000+ software engineers
o Financial data, analytics,
communication and trading tools
o More than 325K subscribers in
170 countries
5.
How do we retrieve relevant results?
o Use relevance
functions to assign
scores to each
matching document
o Sort documents by
relevance score
[Figure: matching documents ranked by relevance scores 50.1, 35.5, 10.2]
6.
How do we design relevance functions?
score = tf(body)
+ 5.2 x tf(title)
+ 4.5 x tf(desc)
7.
Good Luck With That…
query = Solr
query = Italy
query = Facebook
query = Trump
score = tf(body)
+ 5.2 x tf(title)
+ 4.5 x tf(desc)
+ ??? x doc-length
+ ??? x freshness
+ ??? x popularity
+ ??? x author
+ ..... ?????
?
8.
How do we come up with ranking functions?
o We don’t. Hand-tuning ranking functions is
insane
o ML to the rescue: Use data to train algorithms
to distinguish relevant documents from
irrelevant documents
9.
2018 achievement
Learning-to-Rank fully deployed in production
10.
Learning-to-Rank
(aka LTR)
Use machine learning algorithms to rank
results in a way that optimizes search
relevance
11.
How does this work?
[Diagram: a user query runs top-x retrieval over the index (People, Commodities, News, and other sources); a re-ranking model then produces the top-k reranked results, with x >> k]
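The flow above can be sketched in a few lines of Python. This is illustrative only; `base_score` and `model_score` are stand-ins for Solr's relevance function and the trained re-ranking model.

```python
# Toy sketch of retrieve-then-rerank: a cheap base score selects the top-x
# candidates, then a (more expensive) model re-scores only those x documents
# to produce the final top-k, with x >> k.
def rerank_search(docs, base_score, model_score, x, k):
    candidates = sorted(docs, key=base_score, reverse=True)[:x]   # top-x retrieval
    return sorted(candidates, key=model_score, reverse=True)[:k]  # top-k reranked

docs = list(range(20))
top = rerank_search(docs,
                    base_score=lambda d: d,    # toy base ranker
                    model_score=lambda d: -d,  # toy model that reverses the order
                    x=5, k=2)
print(top)  # [15, 16] -- the model only ever sees the 5 best base-ranked docs
```

The point of the x >> k split is cost: the expensive model is applied to a small candidate set, never to the whole index.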
12.
How to Ship LTR in Production in 3 Steps
Make it Work
Make it Fast
Deploy to Production
13.
LTR steps
I. Collect query-document judgments [Offline]
II. Extract query-document features [Solr]
III. Train model with judgments + features [Offline]
IV. Deploy model [Solr]
V. Apply model [Solr]
VI. Evaluate results [Offline]
14.
I. Collect judgments

Judgment types: binary (good/bad) or graded (e.g., out of 5 stars: 3/5, 5/5, 0/5)
15.
I. Collect judgments
Explicit – judges assess search results manually
o Experts
o Crowdsourced
Implicit – infer assessments through user behavior
o Aggregated result clicks
o Query reformulation
o Dwell time
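As an illustration of the implicit route, here is one simple way to turn aggregated clicks into graded judgments. This scheme is hypothetical (the talk does not describe Bloomberg's actual labeling function).

```python
# Hypothetical sketch: map a result's aggregated click-through rate to a
# 0..max_grade relevance judgment. Not Bloomberg's actual scheme.
def clicks_to_grade(clicks, impressions, max_grade=5):
    if impressions == 0:
        return 0
    ctr = clicks / impressions
    return min(max_grade, round(ctr * max_grade))

print(clicks_to_grade(60, 100))   # CTR 0.6 -> grade 3
print(clicks_to_grade(100, 100))  # always clicked -> grade 5
```

Real pipelines must also correct for position bias: top-ranked results get clicked more regardless of relevance.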
16.
II. Extract Features
Signals that give an indication of a result’s importance
Query matches the title   Freshness   Is it from bloomberg.com?   Popularity
0                         0.7         0                           3583
1                         0.9         1                           625
0                         0.1         0                           129
17.
II. Extract Features
o Define features to extract in
myFeatures.json
o Deploy features definition file
to Solr
curl -XPUT 'http://localhost:8983/solr/myCollection/schema/feature-store' \
  --data-binary "@/path/myFeatures.json" \
  -H 'Content-type:application/json'
[
  {
    "name": "matchTitle",
    "type": "org.apache.solr.ltr.feature.SolrFeature",
    "params": {
      "q": "{!field f=title}${text}"
    }
  },
  {
    "name": "freshness",
    "type": "org.apache.solr.ltr.feature.SolrFeature",
    "params": {
      "q": "{!func}recip(ms(NOW,timestamp),3.16e-11,1,1)"
    }
  },
  { "name": "isFromBloomberg", … },
  { "name": "popularity", … }
]
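For intuition on the freshness feature, here is a Python sketch of what Solr's `recip(ms(NOW,timestamp),3.16e-11,1,1)` evaluates to; the constant 3.16e-11 is roughly one over the number of milliseconds in a year.

```python
# Sketch of the freshness feature's Solr function:
# recip(x, m, a, b) = a / (m*x + b), where x = ms(NOW,timestamp) = doc age in ms.
def freshness(age_ms, m=3.16e-11, a=1.0, b=1.0):
    return a / (m * age_ms + b)

MS_PER_YEAR = 365 * 24 * 3600 * 1000  # ~3.15e10 ms, so m ~= 1 / ms-per-year

print(freshness(0))                      # brand-new document -> 1.0
print(round(freshness(MS_PER_YEAR), 2))  # one-year-old document -> ~0.5
```

So the feature decays smoothly from 1.0 toward 0 as a document ages, halving after about a year.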
18.
o Add features transformer to Solr config
o Request features for document by adding [features] to the fl
parameter
http://localhost:8983/solr/myCollection/query?q=test&fl=title,url,[features]
II. Extract Features
<!-- Document transformer adding feature vectors with each retrieved document -->
<transformer name="features"
  class="org.apache.solr…LTRFeatureLoggerTransformerFactory" />
19.
III. Train Model
o Combine query-document judgments & features into training data
file
o Train ranking model offline
• RankSVM1 [liblinear]
• LambdaMART2 [ranklib]
Example: Linear model
score = 1.2 x matchTitle + 10 x popularity
+ 5.4 x isFromBloomberg - 2 x freshness
1 T. Joachims, "Optimizing Search Engines Using Clickthrough Data", Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD), ACM, 2002.
2 C.J.C. Burges, "From RankNet to LambdaRank to LambdaMART: An Overview", Microsoft Research Technical Report MSR-TR-2010-82, 2010.
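A minimal sketch of assembling the training file: each judged query-document pair becomes one svmlight-style line (`<grade> qid:<q> <featId>:<value> ...`), the format liblinear and RankLib both consume. The helper name is our own; the grades and feature values come from the earlier example table.

```python
# Sketch: combine judgments + extracted features into svmlight-format lines.
def to_svmlight(grade, qid, features):
    feats = " ".join(f"{i}:{v}" for i, v in enumerate(features, start=1))
    return f"{grade} qid:{qid} {feats}"

# Features: matchTitle, freshness, isFromBloomberg, popularity (slide 16 values)
rows = [(3, 1, [0, 0.7, 0, 3583]),
        (5, 1, [1, 0.9, 1, 625]),
        (0, 1, [0, 0.1, 0, 129])]
for grade, qid, feats in rows:
    print(to_svmlight(grade, qid, feats))
# first line: 3 qid:1 1:0 2:0.7 3:0 4:3583
```

Feature values at training time must come from the same extraction path as at query time, which is exactly why the talk logs features from Solr rather than recomputing them offline.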
20.
IV. Deploy Model
o Generate trained output model
in myModelName.json
o Deploy model definition file to
Solr
curl -XPUT 'http://localhost:8983/solr/techproducts/schema/model-store' \
  --data-binary "@/path/myModelName.json" \
  -H 'Content-type:application/json'
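As a sketch of what myModelName.json might contain for the linear model on the previous slide: the field layout (class/name/features/params.weights) follows the Solr LTR LinearModel documentation, and the weights are the example values from slide 19.

```python
import json

# Build a hypothetical Solr LTR LinearModel definition from trained weights.
weights = {"matchTitle": 1.2, "popularity": 10.0,
           "isFromBloomberg": 5.4, "freshness": -2.0}

model = {
    "class": "org.apache.solr.ltr.model.LinearModel",
    "name": "myModelName",
    "features": [{"name": n} for n in weights],  # must exist in the feature store
    "params": {"weights": weights},
}
print(json.dumps(model, indent=2))
```

Every feature named in the model must already be deployed in the feature store, or Solr will reject the model upload.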
21.
V. Re-rank Results
o Add LTR query parser to Solr config
o Search and re-rank results
http://localhost:8983/solr/myCollection/query?q=AAPL&rq={!ltr model="myModelName" reRankDocs=100}
<!-- Query parser used to re-rank top docs with a provided model -->
<queryParser name="ltr" class="org.apache.solr.ltr.search.LTRQParserPlugin"/>
22.
VI. Evaluate quality of search
Precision
the number of relevant results returned divided by the total number of results returned
Recall
the number of relevant results returned divided by the total number of relevant results for the query
NDCG (normalized discounted cumulative gain)
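A small sketch of the three metrics, computed over a ranked list of graded judgments. Treating any grade above 0 as "relevant" for precision and recall is our own simplifying assumption.

```python
import math

# Precision/recall over a returned ranked list (grade > 0 counts as relevant),
# plus NDCG, which rewards putting high grades near the top of the ranking.
def precision(ranked_grades):
    return sum(g > 0 for g in ranked_grades) / len(ranked_grades)

def recall(ranked_grades, total_relevant):
    return sum(g > 0 for g in ranked_grades) / total_relevant

def ndcg(ranked_grades):
    dcg = sum(g / math.log2(i + 2) for i, g in enumerate(ranked_grades))
    ideal = sorted(ranked_grades, reverse=True)
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

print(precision([3, 0, 5]))       # 2 of the 3 returned results are relevant
print(recall([3, 0, 5], 4))       # 2 of the 4 relevant docs were returned
print(round(ndcg([3, 0, 5]), 3))  # < 1.0 because the grade-5 doc is not first
```

Unlike precision and recall, NDCG is rank-sensitive: it equals 1.0 only when results are sorted by grade, which is why it is the usual target metric for LTR.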
23.
How to Ship LTR in Production in 3 Steps
Make it Work
Make it Fast
Deploy to Production
24.
Are We Good to Go?
o Mid 2017: the infrastructure bit
seemed to be done.
o We had a ‘placeholder’ feature store.
But, we needed useful features
computed in production.
o We deployed a new shiny feature
store…
25.
There’s no model without features…
[Chart: search latency when we rolled out a new set of features]
26.
Why was it slow? Which features were to blame?
27.
Metrics on feature latency
o We instrumented support in Apache Solr to log
the time it took to compute each feature on each
document.
o We added analytics on top of that.
28.
Why is it slow?
o 1.9ms is ok for 10 documents, but not for 100 docs…
[Chart: feature latencies (total feature time per doc) using the old set of features; total time per search: 19ms]
[Chart: feature latencies using the new set of features, where the SlothFeature dominates; total time per search: 145ms]
[Chart: feature latencies using the new set of features, including the FastSlothFeature; total time per search: 15ms]
29.
How do we go faster?
o Some of the features are unrelated to the query. For example: is the source reliable?
o We can precompute them and store them in the index.
30.
Index Static Features
• Add feature_is_wire_BLAH to the Solr schema
<field name="feature_is_wire_BLAH" type="tdouble" indexed="false" stored="true"
  docValues="false" required="false"/>
• Use UpdateRequestProcessors in Solr config to create new fields
<updateRequestProcessorChain>
  <processor class="com.internal.solr.update.processor.IsFoundUpdateProcessorFactory">
    <str name="source">wire</str>
    <str name="dest">feature_is_wire_BLAH</str>
    <str name="map">
      { "BLAH": 1.0 }
    </str>
    <double name="default">0.0</double>
  </processor>
</updateRequestProcessorChain>
31.
Index Static Features
• Features at runtime are produced by reading the value from the
index using the FieldValueFeature.
Would this ease all our performance troubles?
"features": [
  {
    "name": "is_wire_BLAH",
    "type": "ltr.feature.impl.FieldValueFeature",
    "params": {
      "field": "feature_is_wire_BLAH"
    },
    "default": 0.0
  },
  …
]
33.
Why is this happening to us?
34.
Reading from the index can still be slow…
o Retrieving feature values from stored fields is slow
o So we changed the implementation of FieldValueFeature to use DocValues when they are available
o DocValues record field values in a column-oriented way, mapping doc ids to values
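A toy illustration of the difference. The dictionaries below stand in for Lucene's actual storage; real stored-field access also pays per-document decompression costs that are not modeled here.

```python
# Row-oriented stored fields vs column-oriented DocValues, as plain dicts.
stored = {7: {"title": "AAPL up", "popularity": 625},
          9: {"title": "Fed cut", "popularity": 129}}   # one row per document
doc_values = {"popularity": {7: 625, 9: 129}}           # one column per field

def feature_from_stored(docid, field):
    return stored[docid][field]      # must load the whole document's fields

def feature_from_docvalues(docid, field):
    return doc_values[field][docid]  # touches only the one column needed

print(feature_from_stored(7, "popularity"),
      feature_from_docvalues(7, "popularity"))
```

When re-ranking needs one or two numeric fields from each of 100 documents, the column layout avoids deserializing 100 full documents.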
35.
From Stored Fields to Doc Values
[Chart: feature latencies when retrieving 100 docs using StoredFields; total time per search: 135ms]
[Chart: feature latencies when retrieving 100 docs using DocValues; total time per search: 25ms]

5x faster!
36.
Are we there yet?
Feature logging/computation
Document re-ranking
37.
Evaluating performance: NO-OP Model
o A linear model, performing all the computations but not
modifying the original order
Original Solr Score   Query matches the title   Freshness   Is the document from Bloomberg.com?   Popularity
2.3                   0                         0.7         0                                     3583
2.1                   1                         0.9         1                                     625
1.3                   0                         0.1         0                                     129

No-op weight vector: [1.0, 0.0, 0.0, 0.0, 0.0]

Feature matrix x no-op weights = Final Score: 2.3, 2.1, 1.3 (order unchanged)
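The no-op trick can be checked in a few lines, using the values from the table above:

```python
# NO-OP model: weight 1.0 on the original Solr score, 0.0 on every real
# feature, so the full LTR code path runs but the ranking is unchanged.
def linear_score(features, weights):
    return sum(f * w for f, w in zip(features, weights))

noop_weights = [1.0, 0.0, 0.0, 0.0, 0.0]
docs = [[2.3, 0, 0.7, 0, 3583],   # [solrScore, matchTitle, freshness,
        [2.1, 1, 0.9, 1, 625],    #  isFromBloomberg, popularity]
        [1.3, 0, 0.1, 0, 129]]

scores = [linear_score(d, noop_weights) for d in docs]
print(scores)  # [2.3, 2.1, 1.3] -- identical to the original Solr scores
```

This isolates the cost of feature extraction and re-ranking from any ranking change, making it a clean baseline for latency measurements.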
38.
Need for speed
Method                             Median search time (ms)
No LTR, retrieve 3 docs            39
LTR, retrieve 3 docs, re-rank 25   77

Why is it so slow???
39.
News needs grouping
o Similar news stories are grouped (clustered) together and only the
top-ranked story in each group is shown
40.
Grouping + Re-ranking:
Not a match made in heaven
o Regular grouping involves 3 communication rounds between coordinator and shards
o With re-ranking, we have to re-rank the groups and the documents in each group
[ SOLR-8776 Support RankQuery in grouping ]
43.
What is the Las Vegas Patch?
44.
Grouping
Three requests from the coordinator:
1. Coordinator asks each shard for its top n groups for the query and merges them into the overall top n groups.
2. Each shard computes its top m documents for those top n groups.
3. Coordinator merges the per-group top docs and retrieves them from the shards.
45.
Why? Example:
o Top 2 groups, top 2 documents, 2 shards

Machine 1:
Doc    Group    Score
doc1   group1   20.0
doc2   group2   5.0
doc3   group1   100.0
doc4   group1   120.0
doc5   group2   10.0

Machine 2:
Doc    Group    Score
doc6   group3   70.0
doc7   group2   65.0
doc8   group3   60.0
doc9   group2   60.0
doc10  group1   50.0

Machine 1 grouped:
group1: doc4 (120.0), doc1 (20.0)
group2: doc5 (10.0), doc2 (5.0)

Machine 2 grouped:
group3: doc6 (70.0), doc8 (60.0)
group2: doc7 (65.0), doc9 (60.0)

Top Groups:
Group1: doc4 doc1
Group3: doc6 doc8

Top docs should be:
Group1: doc4 doc10
Group3: doc6 doc8
46.
Las Vegas Idea
o If you want just one document per group,
you do not have this problem
o We can return the top document from each
group in the first step and skip the second
step entirely
o For LTR: Re-rank only the top document of
each group
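A sketch of the coordinator-side merge under the Las Vegas idea. The function name and data shapes are our own; the scores are the base Solr scores from the earlier example, reduced to each shard's per-group top document.

```python
# Las Vegas shortcut: with one document wanted per group, each shard returns
# (group, top_doc, score) in the first round and the coordinator merges
# directly, skipping the second grouping round-trip entirely.
def merge_top1(shard_results, k):
    """shard_results: per-shard lists of (group, doc, score), one per group."""
    best = {}
    for shard in shard_results:
        for group, doc, score in shard:
            if group not in best or score > best[group][1]:
                best[group] = (doc, score)
    top = sorted(best.items(), key=lambda kv: -kv[1][1])[:k]
    return [(g, doc) for g, (doc, score) in top]

# Each shard's per-group top document from the slide-45 example:
shard1 = [("group1", "doc4", 120.0), ("group2", "doc5", 10.0)]
shard2 = [("group3", "doc6", 70.0), ("group2", "doc7", 65.0)]
print(merge_top1([shard1, shard2], k=2))  # [('group1', 'doc4'), ('group3', 'doc6')]
```

Because only one document per group survives, the cross-shard "top docs" problem from the previous slide disappears, and LTR only has to re-rank one document per group.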
47.
Show me the numbers
o We made plain old search faster: by
about 40%!
o LTR-served searches are still faster
than they were before we did the Las
Vegas optimization
Method                     Median time   Perc95 time
Normal Grouping (No LTR)   0.20          0.35
Las Vegas (No LTR)         0.12          0.26
Las Vegas + LTR no-op      0.18          0.27
48.
How to Ship LTR in Production in 3 Steps
Make it Work
Make it Fast
Deploy to Production
49.
Where is the model?
Let the LTR hackathon start…
o Write code to process training data in svmlight format.
o Wrappers around scikit-learn to
train various linear models, do
regularization, hyperparameter
optimization and model debugging
o Evaluate the model: MAP, NDCG,
MRR
50.
A Small Model for LTR, a Giant Step for
Bloomberg
o Released initially to select
internal users
o Then, to all internal users
o Then, to 10% of our clients
o And finally, to 100% of our
clients
51.
It’s a whole new (LTR) world!
o Trying out new features, new models
and classes of models
o Experimenting with different types of
training data
o Rolling out new rankers, for different
types of queries
52.
Take home messages
o Make sure you can measure success
and failure – metrics, metrics, metrics!
o If a feature is static, index it
o Don't use stored values for static features; always rely on DocValues
o If you are not happy with the performance of search, consider a trip to Las Vegas: you may end up improving performance by 40%
53. Thank you! Eυχαριστούμε! Grazie!
And btw: we are hiring a senior search relevance engineer!
bit.ly/2Orb8bc
https://www.bloomberg.com/careers
Malvina Josephidou & Diego Ceccarelli
Bloomberg
@malvijosephidou |@diegoceccarelli
#Activate18 #ActivateSearch
Editor's Notes

Every update request in Solr is passed through a chain of events defined in the Solr config under the updateRequestProcessorChain. UpdateRequestProcessors allow us to create new fields from existing ones, or change existing fields, at index time. We use UpdateRequestProcessors to compute features at index time and write those values under new feature fields which are kept in the index. In this snippet we add to the updateRequestProcessorChain a custom UpdateRequestProcessor that looks at the wire field; if it finds the string, it creates the new feature field with value 1, otherwise it creates it with the default value of 0. We can define other such UpdateRequestProcessors with our feature computation logic; for example, a feature might involve counting terms in a multivalued field, and we can easily do that too.
Change FieldValueFeature to use DocValues to fetch a feature value when the field has DocValues; DocValues are faster for this particular case because they build a forward index from documents to values.

In cases where we do grouping and ask for group.limit=1 only, it is possible to skip the second grouping step. In our test datasets it improved speed by around 40%.
Essentially, in the first grouping step each shard returns the top K groups based on the highest-scoring document in each group. The top K groups from each shard are merged in the federator, and in the second step we ask all the shards to return the top documents from each of the top-ranking groups.
If we only want to return the highest-scoring document per group, we can return the top document id in the first step, merge results in the federator to retain the top K groups, and then skip the second grouping step entirely. This is possible provided that:
a) We do not need to know the total number of matching documents per group
b) Within-group sort and between-group sort are the same.
The LTR optimization is then to compute the LTR score on the top-ranking document per group only (rather than on all members of a group): instead of applying the model to each document in each group on each shard, apply LTR to the top document per group, where 'top' is determined by the Solr score. So documents within groups are ranked using the Solr score, and groups are ranked using LTR.
We were ready to roll out the model. What we realized at this point was that there was no model: there was something called ltr1, developed using features computed offline, but that model was broken. We had product on our backs, our code worked and was fast enough, but we didn't actually have anything to roll out, and we also hadn't written a single line of code related to training models using features logged in production.
You can't optimize what you can't measure.