Learning To Rank For Solr
Michael Nilsson – Software Engineer
Diego Ceccarelli – Software Engineer
Joshua Pantony – Software Engineer
Bloomberg LP
OUTLINE
●  Search at Bloomberg
●  Why do we need machine learning for search?
●  Learning to Rank
●  Solr Learning to Rank Plugin
8 million searches PER DAY
1 million PER DAY
400 million stories in the index
SOLR IN BLOOMBERG
●  Search engine of choice at Bloomberg
─  Large community / Well distributed committers
─  Open source Apache Project
─  Used within many commercial products
─  Large feature set and rapid growth
●  Committed to open-source
─  Ability to contribute to core engine
─  Ability to fix bugs ourselves
─  Contributions in almost every Solr release since 4.5.0
PROBLEM SETUP
score: 30
score: 1.0
PROBLEM SETUP
$\text{Score} = 100 \cdot \text{scoreOnTitle} + 10 \cdot \text{scoreOnDescription}$
score: 52.2
score: 30.8
PROBLEM SETUP
$\text{Score} = 100 \cdot \text{scoreOnTitle} + 10 \cdot \text{scoreOnDescription}$
PROBLEM SETUP
$\text{Score} = 150 \cdot \text{scoreOnTitle} + 3.14 \cdot \text{scoreOnDescription} + 42 \cdot \text{clicks}$
PROBLEM SETUP
$\text{Score} = 99.9 \cdot \text{scoreOnTitle} + 3.1114 \cdot \text{scoreOnDescription} + 42.42 \cdot \text{clicks} + 5 \cdot \text{timeElapsedFromLastUpdate}$
●  It’s hard to manually tweak the ranking
─  You must be an expert in the domain
─  … or a magician
PROBLEM SETUP
$\text{Score} = 99.9 \cdot \text{scoreOnTitle} + 3.1114 \cdot \text{scoreOnDescription} + 42.42 \cdot \text{clicks} + 5 \cdot \text{timeElapsedFromLastUpdate}$
query = solr, query = lucene, query = austin, query = bloomberg, query = …
PROBLEM SETUP
It’s easier with Machine Learning
●  2,000+ parameters (non-linear, factorially larger than linear form)
●  8,000+ queries that are regularly tuned
●  Early on we spent many days hand tuning…
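The hand-tuned formulas on the preceding slides are weighted sums of per-document signals. A minimal Python sketch of that idea follows; the feature values are made up for illustration, and only the weights 100 and 10 come from the slides.

```python
# Sketch of a hand-tuned linear ranking score.
# The feature values below are hypothetical; only the weights come from the slides.

def linear_score(features, weights):
    """Return the weighted sum of the features that have a weight."""
    return sum(weight * features.get(name, 0.0) for name, weight in weights.items())

HAND_TUNED_WEIGHTS = {"scoreOnTitle": 100.0, "scoreOnDescription": 10.0}

# Hypothetical per-document signals for one query.
example_doc = {"scoreOnTitle": 0.5, "scoreOnDescription": 0.25}

example_score = linear_score(example_doc, HAND_TUNED_WEIGHTS)
```

Every new signal (clicks, recency, ...) adds another weight to tune by hand, and the best weights can differ per query, which is the maintenance problem the slides describe.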
SEARCH PIPELINE (ONLINE)
[Diagram: User Query → Index → Top-x retrieval → ReRanking Model → Top-k reranked, where x >> k. Sources: People, Commodities, News, Other Sources]
TRAINING PIPELINE (OFFLINE)
[Diagram: Training Query-Document Pairs + Index → Feature Extraction → Learning Algorithm → Ranking Model, evaluated with Metrics. Sources: People, Commodities, News, Other Sources]
TRAINING DATA: IMPLICIT VS EXPLICIT
What is explicit data?
●  A set of judges manually assesses the search results for a given query
─  Experts
─  Crowd
Pros:
─  Data is very clean
Cons:
─  Can be very expensive!
What is implicit data?
●  Infer user preferences based on user behavior
─  Aggregated result clicks
─  Query reformulation
─  Dwell time
Pros:
─  A lot of data!
Cons:
─  Extremely noisy
─  Privacy concerns
FEATURES
●  A feature is an individual measurable property
●  Given a query and a collection, we can produce many features for each document in the collection
─  Does the query match the title?
─  Length of the document
─  Number of views
─  How old is it?
─  Can it be displayed on a mobile device?
FEATURES
Extract “features”
Features are signals that give an indication of a result’s importance
Query: APPL US
─  Was the result a cofounder? 0
─  Does the query match the document title? 0
─  Does the document have an exec. position? 1
─  Popularity (%): 0.9
FEATURES
Extract “features”
Features are signals that give an indication of a result’s importance
─  Was the result a cofounder? 0
─  Does the query match the document title? 1
─  Does the document have an exec. position? 0
─  Popularity (%): 0.6
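The two results above can be written as fixed-order feature vectors, which is the form a learning algorithm consumes. This is a small sketch only; the feature names and the ordering are illustrative assumptions, and the values are the ones shown on the slides.

```python
# Sketch: turning per-result signals into fixed-order feature vectors.
# Feature names and their order are assumptions made for illustration.

FEATURE_ORDER = ["isCofounder", "matchTitle", "isPersonAndExecutive", "popularity"]

def to_vector(features):
    """Return the feature values as a list of floats in FEATURE_ORDER."""
    return [float(features[name]) for name in FEATURE_ORDER]

# Values taken from the two example results on the slides.
first_result = {"isCofounder": 0, "matchTitle": 0, "isPersonAndExecutive": 1, "popularity": 0.9}
second_result = {"isCofounder": 0, "matchTitle": 1, "isPersonAndExecutive": 0, "popularity": 0.6}

first_vector = to_vector(first_result)
second_vector = to_vector(second_result)
```

Keeping one agreed ordering matters because the model only sees positions, not names.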
METRICS
How do we know if our model is doing better?
●  Offline metrics
─  Precision/Recall/F1 score
─  nDCG (Normalized Discounted Cumulative Gain)
─  Other metrics (e.g., ERR, MAP, …)
●  Online Metrics
─  Click-through rates → higher
─  Time to first click → lower
─  Interleaving¹
¹ O. Chapelle, T. Joachims, F. Radlinski, and Y. Yue. Large scale validation and analysis of interleaved search evaluation. ACM Transactions on Information Systems, 30(1), 2012.
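As an illustration of one of the offline metrics listed above, here is a short sketch of nDCG for a single ranked list. It assumes graded relevance labels and uses the common exponential gain (2^rel - 1) with a log2 rank discount; other gain definitions exist, and the slides do not say which variant was used.

```python
import math

# Sketch of nDCG for one ranked result list.
# Assumes graded relevance labels (0 = irrelevant) and the exponential-gain variant.

def dcg(relevances, k=None):
    """Discounted cumulative gain of the labels in ranked order, cut off at k."""
    ranked = relevances if k is None else relevances[:k]
    return sum((2 ** rel - 1) / math.log2(position + 2)
               for position, rel in enumerate(ranked))

def ndcg(relevances, k=None):
    """DCG divided by the DCG of the ideal (label-sorted) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0
```

A list already sorted by relevance scores 1.0, and any ordering that pushes relevant results down scores lower.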
LEARNING TO RANK
●  Learn how to combine the features for optimizing one or more metrics
●  Many learning algorithms
─  RankSVM¹
─  LambdaMART²
─  …
¹ T. Joachims, Optimizing Search Engines Using Clickthrough Data, Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD), ACM, 2002.
² C.J.C. Burges, "From RankNet to LambdaRank to LambdaMART: An Overview", Microsoft Research Technical Report MSR-TR-2010-82, 2010.
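Pairwise approaches such as RankSVM learn from preferences between documents returned for the same query. The sketch below shows only the standard pairwise transform that builds such training examples (feature differences labelled +1 or -1); it is not the training code used in this talk, and the sample data is hypothetical.

```python
# Sketch of the pairwise transform used by pairwise learning-to-rank methods.
# Input: documents for ONE query as (feature_vector, relevance_label) tuples.

def pairwise_examples(docs):
    """Return (feature_difference, preference) pairs for docs with different labels."""
    pairs = []
    for i, (features_i, label_i) in enumerate(docs):
        for features_j, label_j in docs[i + 1:]:
            if label_i == label_j:
                continue  # no preference between equally relevant documents
            difference = [a - b for a, b in zip(features_i, features_j)]
            pairs.append((difference, 1 if label_i > label_j else -1))
    return pairs

# Hypothetical judged results for one query: ([matchTitle, popularity], label).
judged_docs = [([1.0, 0.9], 2), ([0.0, 0.6], 0), ([0.0, 0.1], 0)]
training_pairs = pairwise_examples(judged_docs)
```

A linear classifier trained on these difference vectors yields weights that can then score individual documents.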
SEARCH PIPELINE: STANDARD
[Diagram: User Query → Solr → Index → Top-k retrieval. Sources: People, Commodities, News, Other Sources]
SEARCH PIPELINE: STANDARD
[Diagram: User Query → Solr → Index → Top-k retrieval. Offline: Training Data → Learning Algorithm → Ranking Model. Sources: People, Commodities, News, Other Sources]
SEARCH PIPELINE: STANDARD
[Diagram: User Query → Solr → Index → Top-k retrieval → Ranking Model (Online) → Top-x reranked. Sources: People, Commodities, News, Other Sources]
SEARCH PIPELINE: SOLR INTEGRATION
[Diagram: User Query → Solr → Index → Top-k retrieval → Ranking Model (Online) → Top-x reranked. Sources: People, Commodities, News, Other Sources]
SOLR RELEVANCY
●  Pros
─  Simple and quick scoring computation
─  Phrase matching
─  Function query boosting on time, distance, popularity, etc
─  Customized fields for stemming, synonyms, etc
●  Cons
─  Creating a well-tuned query takes a lot of manual effort
─  Weights are brittle and may stop working well as more documents or fields are added
LTR PLUGIN: GOALS
●  Don’t tune the relevancy manually!
─  Uses machine learning to power automatic relevancy tuning
●  Significant relevancy improvements
●  Allow comparable scores across collections
─  Collections of different sizes
●  Maintaining low latency
─  Re-use the vast Solr search functionality that is already built-in
─  Less data transport
●  Makes it simple to use domain knowledge to rapidly create features
─  Features are no longer coded but rather scripted
STANDARD SOLR SEARCH REQUEST
[Diagram: User Query → Index → Top-k retrieval. Sources: People, Commodities, News, Other Sources]
STANDARD SOLR SEARCH REQUEST
[Diagram: User Query → Solr Query: Index [10 Million] → Matches [10k] → Score [10k] → Top-10 retrieval. Sources: People, Commodities, News, Other Sources]
LTR SOLR SEARCH REQUEST
[Diagram: User Query → Solr Query: Index [10 Million] → Matches [10k] → Score [10k] → Top-1000 retrieval → LTR Query: Ranking Model → Top-10 reranked. Sources: People, Commodities, News, Other Sources]
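The two-phase request above (cheap first-pass scoring, then an expensive model applied only to the head of the list) can be sketched as follows. The function and parameter names are hypothetical, and this is not how the plugin is implemented internally; it only illustrates the retrieve-then-rerank idea.

```python
# Sketch of retrieve-then-rerank: only the top `rerank_docs` first-pass hits
# are rescored by the (more expensive) model. All names are hypothetical.

def rerank(candidates, model_score, rerank_docs, top_n):
    """candidates: list of (doc_id, first_pass_score). Returns the top_n doc ids."""
    by_first_pass = sorted(candidates, key=lambda c: c[1], reverse=True)
    head, tail = by_first_pass[:rerank_docs], by_first_pass[rerank_docs:]
    # Only the head is rescored; the tail keeps its first-pass order.
    reranked_head = sorted(head, key=lambda c: model_score(c[0]), reverse=True)
    return [doc_id for doc_id, _ in reranked_head + tail][:top_n]

# Hypothetical first-pass hits and model scores.
hits = [("a", 3.0), ("b", 2.0), ("c", 1.0), ("d", 0.5)]
model_scores = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 1.0}
top_two = rerank(hits, model_scores.get, rerank_docs=3, top_n=2)
```

Document "d" has the best model score but is never rescored because it falls outside the reranked window, which is the trade-off between latency and recall that the window size controls.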
<!-- Query parser used to rerank top docs with a provided model -->	
  
<queryParser name="ltr" class="org.apache.solr.ltr.ranking.LTRQParserPlugin" />	
  
LTR PLUGIN: RERANKING
●  LTRQuery extends Solr’s RankQuery
─  Wraps main query to fetch initial results
─  Returns custom TopDocsCollector for reranked ordered results
●  Solr rerank request parameter
rq={!ltr model=myModel1 reRankDocs=100 efi.user_query='james' efi.my_var=123}
─  !ltr – name used in the solrconfig.xml for the LTRQParserPlugin
─  model – name of deployed model to use for reranking
─  reRankDocs – total number of documents to rerank
─  efi.* – custom parameters used to pass external feature information for your
features to use
•  Query intent
•  Personalization
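A client could assemble the rerank parameter described above roughly like this. It is a sketch: the helper name is hypothetical, quoting string efi values with single quotes simply mirrors the slide, and no request is actually sent.

```python
from urllib.parse import urlencode

# Sketch: assembling the `rq` rerank parameter shown on the slide.
# build_ltr_params is a hypothetical helper, not part of Solr or the plugin.

def build_ltr_params(query, model, rerank_docs, efi=None):
    """Return request parameters with an {!ltr ...} rerank query."""
    parts = ["!ltr", f"model={model}", f"reRankDocs={rerank_docs}"]
    for key, value in (efi or {}).items():
        # String values are single-quoted, as in the slide's example.
        rendered = f"'{value}'" if isinstance(value, str) else str(value)
        parts.append(f"efi.{key}={rendered}")
    return {"q": query, "rq": "{" + " ".join(parts) + "}"}

params = build_ltr_params("james", "myModel1", 100,
                          efi={"user_query": "james", "my_var": 123})
query_string = urlencode(params)  # URL-encoded form, ready to append to a select URL
```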
SEARCH PIPELINE (ONLINE)
[Diagram: User Query → Index [10 Million] → Matches [10k] → Score [10k] → Top-1000 retrieval → Feature Extraction → Ranking Model → Top-10 reranked. Sources: People, Commodities, News, Other Sources]
{
    "name": "Tim Cook",
    "primary_position": "ceo",
    "category": "person",
    …
}
FEATURES
Extract “features”
Features are signals that give an indication of a result’s importance
─  Was the result a cofounder? 0
─  Does the query match the document title? 0
─  Does the document have an exec. position? 1
─  Popularity (%): 0.9
LTR PLUGIN: FEATURES BEFORE
[
    {
        "name": "isPersonAndExecutive",
        "type": "org.apache.solr.ltr.feature.impl.SolrFeature",
        "params": {
            "fq": [
                "{!terms f=category}person",
                "{!terms f=primary_position}ceo, cto, cfo, president"
            ]
        }
    },
    …
]
LTR PLUGIN: FEATURES AFTER
LTR PLUGIN: FUNCTION QUERIES
[
    {
        "name": "documentRecency",
        "type": "org.apache.solr.ltr.feature.impl.SolrFeature",
        "params": {
            "q": "{!func}recip( ms(NOW,publish_date), 3.16e-11, 1, 1)"
        }
    },
    …
]
1 for docs dated now, 1/2 for docs dated 1 year ago, 1/3 for docs dated 2 years ago, etc.
See http://wiki.apache.org/solr/FunctionQuery#Date_Boosting
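The arithmetic behind the documentRecency feature can be checked outside Solr. Solr's recip(x, m, a, b) computes a / (m*x + b); the sketch below reimplements that formula in Python with the slide's constants, approximating a year as 365.25 days.

```python
# Sketch: the arithmetic of Solr's recip(x, m, a, b) = a / (m * x + b),
# applied to a document's age in milliseconds with the slide's constants.

MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000  # approximate length of a year

def recip(x, m, a, b):
    """Python equivalent of the recip function query formula."""
    return a / (m * x + b)

def document_recency(age_ms):
    """Recency value for a document that is age_ms milliseconds old."""
    return recip(age_ms, 3.16e-11, 1, 1)

recency_now = document_recency(0)
recency_one_year = document_recency(MS_PER_YEAR)
recency_two_years = document_recency(2 * MS_PER_YEAR)
```

The constant 3.16e-11 is roughly one over the number of milliseconds in a year, which is why the value halves after one year and drops to about a third after two.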
  
LTR PLUGIN: FEATURE STORE
●  FeatureStore is a Solr Managed Resource
─  REST API endpoint for performing CRUD operations on Solr objects
─  Stored and maintained in ZooKeeper
●  Deploy
─  curl -XPUT 'http://yoursolrserver/solr/collection/config/fstore'
--data-binary @./features.json -H 'Content-type:application/json'
●  View
─  http://yoursolrserver/solr/collection/config/fstore
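The curl deploy command above has a direct Python standard-library equivalent. This sketch only builds the PUT request and does not send it; the helper name is hypothetical, the host and collection are the slide's placeholders, and the endpoint path is the one shown on the slide, which may differ in other versions of the plugin.

```python
import json
import urllib.request

# Sketch: building (not sending) the PUT request that uploads feature definitions.
# build_feature_upload is a hypothetical helper; the path follows the slide.

def build_feature_upload(base_url, collection, features):
    """Return a urllib Request that would PUT the features as JSON."""
    url = f"{base_url}/solr/{collection}/config/fstore"
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-type": "application/json"},
    )

features = [{
    "name": "documentRecency",
    "type": "org.apache.solr.ltr.feature.impl.SolrFeature",
    "params": {"q": "{!func}recip( ms(NOW,publish_date), 3.16e-11, 1, 1)"},
}]
request = build_feature_upload("http://yoursolrserver", "collection", features)
```

Passing the request to urllib.request.urlopen would perform the upload against a running server.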
LTR PLUGIN: FEATURES
●  Simplifies feature engineering through a configuration file
●  Utilizes rich search functionality built into Solr
─  Phrase matching
─  Synonyms, stemming, etc.
●  Extend the Feature class for specialized features
TRAINING PIPELINE (OFFLINE)
[Diagram: Training Queries → Index [10 Million] → Matches [10k] → Score [10k] → Top-1000 retrieval → Feature Extraction → Learning Algorithm → Ranking Model. Sources: People, Commodities, News, Other Sources]
{
    "name": "Tim Cook",
    "primary_position": "ceo",
    "category": "person",
    …
}
FEATURES
Extract “features”
Features are signals that give an indication of a result’s importance
─  Was the result a cofounder? 0
─  Does the query match the document title? 0
─  Does the document have an exec. position? 1
─  Popularity (%): 0.9
<!-- Document transformer adding feature vectors with each retrieved document -->	
  
<transformer name="fv" class="org.apache.solr.ltr.ranking.LTRFeatureTransformer" />
  
LTR PLUGIN: FEATURE EXTRACTION
●  Feature extraction uses Solr’s TransformerFactory
─  Returns a custom field with each document
●  fl = *, [fv]
{
    "name": "Tim Cook",
    "primary_position": "ceo",
    "category": "person",
    …
    "[fv]": "isCofounder:0.0, isPersonAndExecutive:1.0, matchTitle:0.0, popularity:0.9"
}
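For offline training, the [fv] string returned with each document has to be turned back into numbers. A minimal parsing sketch follows, assuming the comma-separated name:value layout shown in the example response; the helper name is hypothetical.

```python
# Sketch: parsing the "[fv]" field into a name -> value mapping.
# Assumes the comma-separated "name:value" layout from the example response.

def parse_feature_vector(fv):
    """Return a dict of feature name to float value."""
    features = {}
    for item in fv.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the last colon so the value is always the final part.
        name, _, value = item.rpartition(":")
        features[name] = float(value)
    return features

parsed = parse_feature_vector(
    "isCofounder:0.0, isPersonAndExecutive:1.0, matchTitle:0.0, popularity:0.9"
)
```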
  
LTR PLUGIN: MODEL
{
    "type": "org.apache.solr.ltr.ranking.LambdaMARTModel",
    "name": "mymodel1",
    "features": [
        { "name": "matchedTitle" },
        { "name": "isPersonAndExecutive" }
    ],
    "params": {
        "trees": [
            {
                "weight": 1,
                "tree": {
                    "feature": "matchedTitle",
                    "threshold": 0.5,
                    "left": { "value": -100 },
                    "right": {
                        "feature": "isPersonAndExecutive",
                        "threshold": 0.5,
                        "left": { "value": 50 },
                        "right": { "value": 75 }
                    }
                }
            }
        ]
    }
}
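To make the model definition concrete, the sketch below evaluates a tree ensemble of this shape in Python. It assumes that a feature value less than or equal to the threshold follows the left branch and anything larger follows the right branch, and that the final score is the weighted sum over trees; the slides do not spell out either rule, and a missing feature is treated as 0.0 here.

```python
# Sketch: scoring a document with a tree-ensemble model of the shape shown above.
# Assumption: value <= threshold goes left, otherwise right; score = weighted sum.

MODEL = {
    "features": [{"name": "matchedTitle"}, {"name": "isPersonAndExecutive"}],
    "params": {"trees": [{
        "weight": 1,
        "tree": {
            "feature": "matchedTitle", "threshold": 0.5,
            "left": {"value": -100},
            "right": {
                "feature": "isPersonAndExecutive", "threshold": 0.5,
                "left": {"value": 50},
                "right": {"value": 75},
            },
        },
    }]},
}

def score_tree(node, features):
    """Walk from the root to a leaf and return the leaf value."""
    while "value" not in node:
        value = features.get(node["feature"], 0.0)  # missing feature treated as 0.0
        node = node["left"] if value <= node["threshold"] else node["right"]
    return node["value"]

def score_model(model, features):
    """Weighted sum of every tree's leaf value."""
    return sum(entry["weight"] * score_tree(entry["tree"], features)
               for entry in model["params"]["trees"])
```

Under these assumptions, a document whose title does not match scores -100, while a title match on an executive's profile scores 75.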
  
LTR PLUGIN: MODEL
●  ModelStore is also a Solr Managed Resource
●  Deploy
─  curl -XPUT 'http://yoursolrserver/solr/collection/config/mstore'
--data-binary @./model.json -H 'Content-type:application/json'
●  View
─  http://yoursolrserver/solr/collection/config/mstore
●  Inherit from the model class for new scoring algorithms
─  score()
─  explain()
LTR PLUGIN: EVALUATION
●  Offline Metrics
─  nDCG increased approximately 10% after reranking
●  Online Metrics
─  Clicks @ 1 up by approximately 10%
BEFORE AND AFTER
Query: “unemployment”
[Side-by-side screenshots: Solr Ranking vs. Machine Learned Reranking]
LTR PLUGIN: EVALUATION
●  Offline Metrics
─  nDCG increased approximately 10% after reranking
●  Online Metrics
─  Clicks @ 1 up by approximately 10%
●  Performance
─  About 30% faster than previous external ranking system
10 million documents in collection
100k queries
1k features
1k documents/query reranked
LTR PLUGIN: BENEFITS
●  Simpler feature engineering, without compiling
●  Access to rich internal Solr search functionality for feature building
●  Search result relevancy improvements vs regular Solr relevance
●  Automatic relevancy tuning
●  Compatible scores across collections
●  Performance benefits vs external ranking system
FUTURE WORK
●  Continue work to open source the plugin
●  Support pipelining multiple reranking models
●  Allow a simple ranking model to be used in the first pass
QUESTIONS?

More Related Content

What's hot

Neural Search Comes to Apache Solr
Neural Search Comes to Apache SolrNeural Search Comes to Apache Solr
Neural Search Comes to Apache SolrSease
 
Tutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component pluginTutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component pluginsearchbox-com
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackRich Lee
 
System design for recommendations and search
System design for recommendations and searchSystem design for recommendations and search
System design for recommendations and searchEugene Yan Ziyou
 
Building a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineBuilding a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineTrey Grainger
 
Boosting Documents in Solr by Recency, Popularity, and User Preferences
Boosting Documents in Solr by Recency, Popularity, and User PreferencesBoosting Documents in Solr by Recency, Popularity, and User Preferences
Boosting Documents in Solr by Recency, Popularity, and User PreferencesLucidworks (Archived)
 
Large Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraphLarge Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraphP. Taylor Goetz
 
Haystack 2019 - Search with Vectors - Simon Hughes
Haystack 2019 - Search with Vectors - Simon HughesHaystack 2019 - Search with Vectors - Simon Hughes
Haystack 2019 - Search with Vectors - Simon HughesOpenSource Connections
 
Vector databases and neural search
Vector databases and neural searchVector databases and neural search
Vector databases and neural searchDmitry Kan
 
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...Julian Hyde
 
An introduction to Elasticsearch's advanced relevance ranking toolbox
An introduction to Elasticsearch's advanced relevance ranking toolboxAn introduction to Elasticsearch's advanced relevance ranking toolbox
An introduction to Elasticsearch's advanced relevance ranking toolboxElasticsearch
 
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 20190-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019confluent
 
Building a Knowledge Graph using NLP and Ontologies
Building a Knowledge Graph using NLP and OntologiesBuilding a Knowledge Graph using NLP and Ontologies
Building a Knowledge Graph using NLP and OntologiesNeo4j
 
Introduction To Kibana
Introduction To KibanaIntroduction To Kibana
Introduction To KibanaJen Stirrup
 
Neo4j GraphTour Santa Monica 2019 - Amundsen Presentation
Neo4j GraphTour Santa Monica 2019 - Amundsen PresentationNeo4j GraphTour Santa Monica 2019 - Amundsen Presentation
Neo4j GraphTour Santa Monica 2019 - Amundsen PresentationTamikaTannis
 
Incorporating Diversity in a Learning to Rank Recommender System
Incorporating Diversity in a Learning to Rank Recommender SystemIncorporating Diversity in a Learning to Rank Recommender System
Incorporating Diversity in a Learning to Rank Recommender SystemJacek Wasilewski
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...StreamNative
 
The evolution of the big data platform @ Netflix (OSCON 2015)
The evolution of the big data platform @ Netflix (OSCON 2015)The evolution of the big data platform @ Netflix (OSCON 2015)
The evolution of the big data platform @ Netflix (OSCON 2015)Eva Tse
 

What's hot (20)

Neural Search Comes to Apache Solr
Neural Search Comes to Apache SolrNeural Search Comes to Apache Solr
Neural Search Comes to Apache Solr
 
Tutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component pluginTutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component plugin
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stack
 
System design for recommendations and search
System design for recommendations and searchSystem design for recommendations and search
System design for recommendations and search
 
Building a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineBuilding a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engine
 
Boosting Documents in Solr by Recency, Popularity, and User Preferences
Boosting Documents in Solr by Recency, Popularity, and User PreferencesBoosting Documents in Solr by Recency, Popularity, and User Preferences
Boosting Documents in Solr by Recency, Popularity, and User Preferences
 
ELK Stack
ELK StackELK Stack
ELK Stack
 
Large Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraphLarge Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraph
 
Learn to Rank search results
Learn to Rank search resultsLearn to Rank search results
Learn to Rank search results
 
Haystack 2019 - Search with Vectors - Simon Hughes
Haystack 2019 - Search with Vectors - Simon HughesHaystack 2019 - Search with Vectors - Simon Hughes
Haystack 2019 - Search with Vectors - Simon Hughes
 
Vector databases and neural search
Vector databases and neural searchVector databases and neural search
Vector databases and neural search
 
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
 
An introduction to Elasticsearch's advanced relevance ranking toolbox
An introduction to Elasticsearch's advanced relevance ranking toolboxAn introduction to Elasticsearch's advanced relevance ranking toolbox
An introduction to Elasticsearch's advanced relevance ranking toolbox
 
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 20190-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
 
Building a Knowledge Graph using NLP and Ontologies
Building a Knowledge Graph using NLP and OntologiesBuilding a Knowledge Graph using NLP and Ontologies
Building a Knowledge Graph using NLP and Ontologies
 
Introduction To Kibana
Introduction To KibanaIntroduction To Kibana
Introduction To Kibana
 
Neo4j GraphTour Santa Monica 2019 - Amundsen Presentation
Neo4j GraphTour Santa Monica 2019 - Amundsen PresentationNeo4j GraphTour Santa Monica 2019 - Amundsen Presentation
Neo4j GraphTour Santa Monica 2019 - Amundsen Presentation
 
Incorporating Diversity in a Learning to Rank Recommender System
Incorporating Diversity in a Learning to Rank Recommender SystemIncorporating Diversity in a Learning to Rank Recommender System
Incorporating Diversity in a Learning to Rank Recommender System
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
 
The evolution of the big data platform @ Netflix (OSCON 2015)
The evolution of the big data platform @ Netflix (OSCON 2015)The evolution of the big data platform @ Netflix (OSCON 2015)
The evolution of the big data platform @ Netflix (OSCON 2015)
 

Similar to Learn How to Optimize Solr Search with Machine Learning

Reflected intelligence evolving self-learning data systems
Reflected intelligence  evolving self-learning data systemsReflected intelligence  evolving self-learning data systems
Reflected intelligence evolving self-learning data systemsTrey Grainger
 
SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018 SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018 CareerBuilder.com
 
Building multi billion ( dollars, users, documents ) search engines on open ...
Building multi billion ( dollars, users, documents ) search engines  on open ...Building multi billion ( dollars, users, documents ) search engines  on open ...
Building multi billion ( dollars, users, documents ) search engines on open ...Andrei Lopatenko
 
Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks
Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks
Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks Lucidworks
 
2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for Newbies
2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for Newbies2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for Newbies
2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for NewbiesTYPO3 CertiFUNcation
 
Building Search & Recommendation Engines
Building Search & Recommendation EnginesBuilding Search & Recommendation Engines
Building Search & Recommendation EnginesTrey Grainger
 
Building a real time big data analytics platform with solr
Building a real time big data analytics platform with solrBuilding a real time big data analytics platform with solr
Building a real time big data analytics platform with solrTrey Grainger
 
Building a real time, big data analytics platform with solr
Building a real time, big data analytics platform with solrBuilding a real time, big data analytics platform with solr
Building a real time, big data analytics platform with solrlucenerevolution
 
AI from your data lake: Using Solr for analytics
AI from your data lake: Using Solr for analyticsAI from your data lake: Using Solr for analytics
AI from your data lake: Using Solr for analyticsDataWorks Summit
 
The Intent Algorithms of Search & Recommendation Engines
The Intent Algorithms of Search & Recommendation EnginesThe Intent Algorithms of Search & Recommendation Engines
The Intent Algorithms of Search & Recommendation EnginesTrey Grainger
 
From Academic Papers To Production : A Learning To Rank Story
From Academic Papers To Production : A Learning To Rank StoryFrom Academic Papers To Production : A Learning To Rank Story
From Academic Papers To Production : A Learning To Rank StoryAlessandro Benedetti
 
Elasticsearch Performance Testing and Scaling @ Signal
Elasticsearch Performance Testing and Scaling @ SignalElasticsearch Performance Testing and Scaling @ Signal
Elasticsearch Performance Testing and Scaling @ SignalJoachim Draeger
 
The Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemThe Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemTrey Grainger
 
Getting Started with Splunk Breakout Session
Getting Started with Splunk Breakout SessionGetting Started with Splunk Breakout Session
Getting Started with Splunk Breakout SessionSplunk
 
Data council sf amundsen presentation
Data council sf    amundsen presentationData council sf    amundsen presentation
Data council sf amundsen presentationTao Feng
 
Webinar: Lucidworks + Thomson Reuters for Improved Investment Performance
Webinar: Lucidworks + Thomson Reuters for Improved Investment PerformanceWebinar: Lucidworks + Thomson Reuters for Improved Investment Performance
Webinar: Lucidworks + Thomson Reuters for Improved Investment PerformanceLucidworks
 
Disrupting Data Discovery
Disrupting Data DiscoveryDisrupting Data Discovery
Disrupting Data Discoverymarkgrover
 
SplunkLive Oslo/Stockholm Beginner Workshop
SplunkLive Oslo/Stockholm Beginner WorkshopSplunkLive Oslo/Stockholm Beginner Workshop
SplunkLive Oslo/Stockholm Beginner Workshopjenny_splunk
 

Similar to Learn How to Optimize Solr Search with Machine Learning (20)

Reflected intelligence evolving self-learning data systems
Reflected intelligence  evolving self-learning data systemsReflected intelligence  evolving self-learning data systems
Reflected intelligence evolving self-learning data systems
 
SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018 SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018
 
Building multi billion ( dollars, users, documents ) search engines on open ...
Building multi billion ( dollars, users, documents ) search engines  on open ...Building multi billion ( dollars, users, documents ) search engines  on open ...
Building multi billion ( dollars, users, documents ) search engines on open ...
 
Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks
Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks
Practical End-to-End Learning to Rank Using Fusion - Andy Liu, Lucidworks
 
kdd2015
kdd2015kdd2015
kdd2015
 
2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for Newbies
2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for Newbies2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for Newbies
2018 - CertiFUNcation - Olivier Dobberka: Apache Solr for Newbies
 
Building Search & Recommendation Engines
Building Search & Recommendation EnginesBuilding Search & Recommendation Engines
Building Search & Recommendation Engines
 
Building a real time big data analytics platform with solr
Building a real time big data analytics platform with solrBuilding a real time big data analytics platform with solr
Building a real time big data analytics platform with solr
 
Building a real time, big data analytics platform with solr
Building a real time, big data analytics platform with solrBuilding a real time, big data analytics platform with solr
Learn How to Optimize Solr Search with Machine Learning

  • 1. Learning To Rank For Solr Michael Nilsson – Software Engineer Diego Ceccarelli – Software Engineer Joshua Pantony – Software Engineer Bloomberg LP
  • 2. OUTLINE ●  Search at Bloomberg ●  Why do we need machine learning for search? ●  Learning to Rank ●  Solr Learning to Rank Plugin
  • 3. 8 million searches PER DAY 1 million PER DAY 400 million stories in the index
  • 4. SOLR IN BLOOMBERG ●  Search engine of choice at Bloomberg ─  Large community / Well distributed committers ─  Open source Apache Project ─  Used within many commercial products ─  Large feature set and rapid growth ●  Committed to open-source ─  Ability to contribute to core engine ─  Ability to fix bugs ourselves ─  Contributions in almost every Solr release since 4.5.0
  • 6. PROBLEM SETUP Score = 100 * scoreOnTitle + 10 * scoreOnDescription score: 52.2 score: 30.8
  • 7. PROBLEM SETUP Score = 100 * scoreOnTitle + 10 * scoreOnDescription
  • 8. PROBLEM SETUP Score = 150 * scoreOnTitle + 3.14 * scoreOnDescription + 42 * clicks
  • 9. PROBLEM SETUP Score = 99.9 * scoreOnTitle + 3.1114 * scoreOnDescription + 42.42 * clicks + 5 * timeElapsedFromLastUpdate
  • 10. PROBLEM SETUP ● It’s hard to manually tweak the ranking ─ You must be an expert in the domain ─ … or a magician Score = 99.9 * scoreOnTitle + 3.1114 * scoreOnDescription + 42.42 * clicks + 5 * timeElapsedFromLastUpdate query = solr query = lucene query = austin query = bloomberg query = …
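The hand-tuned formula on these slides is a weighted sum of per-document signals. A minimal Python sketch of that idea follows; the weights are the ones shown on the slide, while the function name, the dictionary layout, and the sample feature values are assumptions made purely for illustration.

```python
# Hand-tuned linear scoring, mirroring the weights shown on the slide.
# The feature values used below are made-up illustration data.
WEIGHTS = {
    "scoreOnTitle": 99.9,
    "scoreOnDescription": 3.1114,
    "clicks": 42.42,
    "timeElapsedFromLastUpdate": 5.0,
}


def linear_score(features, weights=WEIGHTS):
    """Return the weighted sum of feature values; missing features count as 0."""
    return sum(w * features.get(name, 0.0) for name, w in weights.items())


doc = {"scoreOnTitle": 0.5, "scoreOnDescription": 0.2, "clicks": 1.0}
print(linear_score(doc))
```

Every new signal adds another weight to tune by hand, and a change that helps one query can hurt another, which is the difficulty the slides point at.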
  • 11. PROBLEM SETUP It’s easier with Machine Learning ●  2,000+ parameters (non-linear, factorially larger than linear form) ●  8,000+ queries that are regularly tuned ●  Early on we spent many days hand tuning…
  • 12. SEARCH PIPELINE (ONLINE) [Diagram labels: User Query; Index (People, Commodities, News, Other Sources); Top-k retrieval; Top-x retrieval, x >> k; ReRanking Model; Top-k reranked]
  • 15. TRAINING DATA: IMPLICIT VS EXPLICIT What is explicit data? ● A set of judges manually assesses the search results for a given query ─ Experts ─ Crowd ─ Pros: data is very clean ─ Cons: can be very expensive! What is implicit data? ● Infer user preferences from user behavior ─ Aggregated result clicks ─ Query reformulation ─ Dwell time ─ Pros: a lot of data! ─ Cons: extremely noisy; privacy concerns
  • 17. FEATURES ● A feature is an individual measurable property ● Given a query and a collection, we can produce many features for each document in the collection ─ Does the query match the title? ─ Length of the document ─ Number of views ─ How old is it? ─ Can it be viewed on a mobile device?
  • 18. FEATURES Extract “features” Was the result a cofounder? 0 Features are signals that give an indication of a result’s importance
  • 19. FEATURES Extract “features” Features are signals that give an indication of a result’s importance Was the result a cofounder? 0 Does the document have an exec. position? 1 Query : APPL US
  • 20. FEATURES Extract “features” Features are signals that give an indication of a result’s importance Was the result a cofounder? 0 Does the query match the document title? 0 Does the document have an exec. position? 1
  • 21. FEATURES Extract “features” Features are signals that give an indication of a result’s importance Was the result a cofounder? 0 Does the query match the document title? 0 Does the document have an exec. position? 1 Popularity (%) 0.9
  • 22. FEATURES Extract “features” Features are signals that give an indication of a result’s importance Was the result a cofounder? 0 Does the query match the document title? 1 Does the document have an exec. position? 0 Popularity (%) 0.6
  • 24. METRICS How do we know if our model is doing better? ● Offline metrics ─ Precision/Recall/F1 score ─ nDCG (Normalized Discounted Cumulative Gain) ─ Other metrics (e.g., ERR, MAP, …) ● Online metrics ─ Click-through rate → higher ─ Time to first click → lower ─ Interleaving [1] [1] O. Chapelle, T. Joachims, F. Radlinski, and Y. Yue. Large-scale validation and analysis of interleaved search evaluation. ACM Transactions on Information Systems, 30(1), 2012.
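As a concrete reading of the nDCG metric listed on this slide, here is a small Python sketch. It uses the exponential-gain form of DCG, which is one common variant (a linear-gain form also exists); the function names and the graded relevance labels in the example are assumptions, not something taken from the talk.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with exponential gain (2^rel - 1)."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based, so position 1 -> log2(2)
        for rank, rel in enumerate(relevances)
    )


def ndcg(relevances, k=None):
    """DCG of the ranking as given, divided by the DCG of the ideal ordering."""
    ranked = relevances if k is None else relevances[:k]
    ideal = sorted(relevances, reverse=True)[: len(ranked)]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Relevance labels of the returned documents, in ranked order (illustrative).
print(ndcg([3, 2, 3, 0, 1], k=5))
```

A ranking that already lists documents in decreasing relevance scores 1.0, and any swap that pushes a more relevant document lower reduces the value.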
  • 26. LEARNING TO RANK ● Learn how to combine the features for optimizing one or more metrics ● Many learning algorithms ─ RankSVM [1] ─ LambdaMART [2] ─ … [1] T. Joachims, Optimizing Search Engines Using Clickthrough Data, Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD), ACM, 2002. [2] C.J.C. Burges, "From RankNet to LambdaRank to LambdaMART: An Overview", Microsoft Research Technical Report MSR-TR-2010-82, 2010.
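Pairwise methods such as RankSVM learn from preferences between two documents retrieved for the same query. The Python sketch below shows only the data preparation step of that idea, turning graded documents into feature-difference examples; it does not train an SVM, and the function name and sample data are illustrative assumptions.

```python
def pairwise_examples(docs):
    """Build pairwise training examples for a single query.

    docs: list of (feature_vector, relevance_label) tuples.
    Returns (difference_vector, +1 or -1) for every pair with different labels.
    """
    examples = []
    for i, (xi, yi) in enumerate(docs):
        for xj, yj in docs[i + 1:]:
            if yi == yj:
                continue  # equal labels express no preference
            diff = [a - b for a, b in zip(xi, xj)]
            examples.append((diff, 1 if yi > yj else -1))
    return examples


query_docs = [([1.0, 0.0], 2), ([0.0, 1.0], 1), ([0.5, 0.5], 2)]
for diff, label in pairwise_examples(query_docs):
    print(diff, label)
```

A linear classifier trained on such difference vectors yields a weight vector that can be used directly as a scoring function, which is the intuition behind the pairwise approach.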
  • 30. SEARCH PIPELINE: SOLR INTEGRATION [Diagram labels: User Query; Solr; Index (People, Commodities, News, Other Sources); Top-k retrieval; Ranking Model; Online; Top-x reranked]
  • 31. SOLR RELEVANCY ●  Pros ─  Simple and quick scoring computation ─  Phrase matching ─  Function query boosting on time, distance, popularity, etc ─  Customized fields for stemming, synonyms, etc ●  Cons ─  Lots of manual time for creating a well tuned query ─  Weights are brittle, and may not be compatible in the future with more documents or fields added
  • 32. LTR PLUGIN: GOALS ●  Don’t tune the relevancy manually! ─  Uses machine learning to power automatic relevancy tuning ●  Significant relevancy improvements ●  Allow comparable scores across collections ─  Collections of different sizes ●  Maintaining low latency ─  Re-use the vast Solr search functionality that is already built-in ─  Less data transport ●  Makes it simple to use domain knowledge to rapidly create features ─  Features are no longer coded but rather scripted
  • 33. STANDARD SOLR SEARCH REQUEST [Diagram labels: User Query; Index (People, Commodities, News, Other Sources); Top-k retrieval]
  • 34. STANDARD SOLR SEARCH REQUEST [Diagram labels: User Query; Solr Query; Index [10 Million] (People, Commodities, News, Other Sources); Matches [10k]; Score [10k]; Top-10 retrieval]
  • 35. LTR SOLR SEARCH REQUEST [Diagram labels: User Query; Solr Query; Index [10 Million] (People, Commodities, News, Other Sources); Matches [10k]; Score [10k]; Top-1000 retrieval; LTR Query; Ranking Model; Top-10 reranked]
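The request flow on this slide is a two-stage ranking: a cheap first-pass score selects a candidate set (top 1000 on the slide), and the more expensive ranking model reorders only those candidates before the top 10 are returned. A rough Python sketch under that reading follows; the function and parameter names are hypothetical and the two scoring callables stand in for Solr's score and the learned model.

```python
def rerank(docs, first_pass_score, model_score, top_x=1000, top_k=10):
    """Two-stage ranking: cheap retrieval of top_x, then model rerank to top_k."""
    # Stage 1: order all matches by the cheap score and keep the best top_x.
    candidates = sorted(docs, key=first_pass_score, reverse=True)[:top_x]
    # Stage 2: apply the expensive model only to the candidates.
    return sorted(candidates, key=model_score, reverse=True)[:top_k]


# Toy data: "cheap" plays the role of the Solr score, "ml" of the model score.
docs = [{"id": i, "cheap": i, "ml": -i} for i in range(5)]
top = rerank(docs, lambda d: d["cheap"], lambda d: d["ml"], top_x=3, top_k=2)
print([d["id"] for d in top])
```

Because the model only sees the first-pass candidates, a relevant document that the cheap score ranks below the cutoff can never be recovered, so the candidate set is kept much larger than the final result list.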
  • 36. LTR PLUGIN: RERANKING <!-- Query parser used to rerank top docs with a provided model --> <queryParser name="ltr" class="org.apache.solr.ltr.ranking.LTRQParserPlugin" /> ● LTRQuery extends Solr’s RankQuery ─ Wraps main query to fetch initial results ─ Returns custom TopDocsCollector for reranked ordered results ● Solr rerank request parameter: rq={!ltr model=myModel1 reRankDocs=100 efi.user_query='james' efi.my_var=123} ─ !ltr – name used in solrconfig.xml for the LTRQParserPlugin ─ model – name of the deployed model to use for reranking ─ reRankDocs – total number of documents to rerank ─ efi.* – custom parameters that pass external feature information to your features, e.g. query intent or personalization
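To make the shape of the rerank parameter concrete, the Python sketch below assembles the `rq` string shown on the slide and URL-encodes it alongside the main query. The helper name `ltr_query_params` is hypothetical, and how the resulting query string is sent to Solr is left out.

```python
from urllib.parse import urlencode


def ltr_query_params(user_query, model="myModel1", rerank_docs=100, efi=None):
    """Build request parameters with an {!ltr ...} rerank query (hypothetical helper)."""
    # Each external feature value becomes an efi.<name>=<value> local parameter.
    efi_part = "".join(f" efi.{name}={value}" for name, value in (efi or {}).items())
    return {
        "q": user_query,
        "rq": f"{{!ltr model={model} reRankDocs={rerank_docs}{efi_part}}}",
    }


params = ltr_query_params("james", efi={"user_query": "'james'", "my_var": 123})
print(params["rq"])
print(urlencode(params))
```

The first line printed matches the `rq` value from the slide; the second is the encoded form that would be appended to a select URL.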
  • 37. SEARCH PIPELINE (ONLINE) [Diagram labels: User Query; Index [10 Million] (People, Commodities, News, Other Sources); Matches [10k]; Score [10k]; Top-1000 retrieval; Feature Extraction; Ranking Model; Top-10 reranked]
  • 38. FEATURES Extract “features”: features are signals that give an indication of a result’s importance { "name": "Tim Cook", "primary_position": "ceo", "category": "person", … } Was the result a cofounder? 0 Does the query match the document title? 0 Does the document have an exec. position? 1 Popularity (%) 0.9
  • 40. LTR PLUGIN: FEATURES AFTER [ { "name": "isPersonAndExecutive", "type": "org.apache.solr.ltr.feature.impl.SolrFeature", "params": { "fq": [ "{!terms f=category}person", "{!terms f=primary_position}ceo, cto, cfo, president" ] } }, … ]
  • 41. LTR PLUGIN: FUNCTION QUERIES [ { "name": "documentRecency", "type": "org.apache.solr.ltr.feature.impl.SolrFeature", "params": { "q": "{!func}recip( ms(NOW,publish_date), 3.16e-11, 1, 1)" } }, … ] 1 for docs dated now, 1/2 for docs dated 1 year ago, 1/3 for docs dated 2 years ago, etc. See http://wiki.apache.org/solr/FunctionQuery#Date_Boosting
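The recency values quoted on the slide follow from Solr's `recip(x, m, a, b)` function, which computes a / (m*x + b). The Python sketch below reproduces that arithmetic outside Solr, assuming a year of about 3.16e10 milliseconds so that m*x is roughly 1 per year of age; the helper names are illustrative only.

```python
MS_PER_YEAR = 365.25 * 24 * 3600 * 1000  # about 3.16e10 milliseconds


def recip(x, m, a, b):
    """Same formula as Solr's recip function query: a / (m * x + b)."""
    return a / (m * x + b)


def document_recency(age_ms):
    """Recency feature from the slide: recip(ms(NOW, publish_date), 3.16e-11, 1, 1)."""
    return recip(age_ms, 3.16e-11, 1, 1)


for years in (0, 1, 2):
    print(years, round(document_recency(years * MS_PER_YEAR), 3))
```

With m chosen as the reciprocal of one year in milliseconds, the feature decays smoothly as 1/(1 + age in years) instead of dropping off at a hard cutoff.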
  • 42. LTR PLUGIN: FEATURE STORE ● FeatureStore is a Solr Managed Resource ─ REST API endpoint for performing CRUD operations on Solr objects ─ Stored and maintained in ZooKeeper ● Deploy ─ curl -XPUT 'http://yoursolrserver/solr/collection/config/fstore' --data-binary @./features.json -H 'Content-type:application/json' ● View ─ http://yoursolrserver/solr/collection/config/fstore
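For readers who would rather script the deploy step than call curl, the following Python sketch builds the equivalent PUT request with the standard library. It only constructs the request object and does not send it; the helper name is hypothetical, and the host and collection are the placeholders from the slide.

```python
import json
from urllib.request import Request


def feature_store_request(base_url, collection, features):
    """Build (but do not send) a PUT request that uploads feature definitions."""
    url = f"{base_url}/solr/{collection}/config/fstore"
    body = json.dumps(features).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-type": "application/json"},
    )


features = [
    {
        "name": "documentRecency",
        "type": "org.apache.solr.ltr.feature.impl.SolrFeature",
        "params": {"q": "{!func}recip( ms(NOW,publish_date), 3.16e-11, 1, 1)"},
    }
]
req = feature_store_request("http://yoursolrserver", "collection", features)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the upload against a real server; that call is omitted here because the endpoint above is a placeholder.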
  • 43. LTR PLUGIN: FEATURES ●  Simplifies feature engineering through configuration file ●  Utilizes rich search functionality built-in to Solr ─  Phrase matching ─  Synonyms, Stemming, etc ●  Inherit the Feature class for specialized features
  • 44. SEARCH PIPELINE (ONLINE) [Diagram labels: User Query; Index [10 Million] (People, Commodities, News, Other Sources); Matches [10k]; Score [10k]; Top-1000 retrieval; Feature Extraction; Ranking Model; Top-10 reranked]
  • 45. TRAINING PIPELINE (OFFLINE) [Diagram labels: Training Queries; Index [10 Million] (People, Commodities, News, Other Sources); Matches [10k]; Score [10k]; Top-1000 retrieval; Feature Extraction; Learning Algorithm; Ranking Model]
  • 46. FEATURES Extract “features”: features are signals that give an indication of a result’s importance { "name": "Tim Cook", "primary_position": "ceo", "category": "person", … } Was the result a cofounder? 0 Does the query match the document title? 0 Does the document have an exec. position? 1 Popularity (%) 0.9
  • 47. LTR PLUGIN: FEATURE EXTRACTION <!-- Document transformer adding feature vectors with each retrieved document --> <transformer name="fv" class="org.apache.solr.ltr.ranking.LTRFeatureTransformer" /> ● Feature extraction uses Solr’s TransformerFactory ─ Returns a custom field with each document ● fl = *, [fv] { "name": "Tim Cook", "primary_position": "ceo", "category": "person", … "[fv]": "isCofounder:0.0, isPersonAndExecutive:1.0, matchTitle:0.0, popularity:0.9" }
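When building a training set offline, the `[fv]` string returned with each document has to be turned back into numbers. A minimal Python sketch follows, assuming the comma-separated `name:value` layout shown on the slide; the function name is an assumption.

```python
def parse_feature_vector(fv):
    """Parse 'name:value, name:value, ...' into a dict of floats."""
    features = {}
    for pair in fv.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty input or trailing commas
        name, _, value = pair.partition(":")
        features[name.strip()] = float(value)
    return features


fv = "isCofounder:0.0, isPersonAndExecutive:1.0, matchTitle:0.0, popularity:0.9"
print(parse_feature_vector(fv))
```

Each parsed dictionary, paired with a relevance label for the same query and document, forms one row of training data for the learning algorithm.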
  • 48. LTR PLUGIN: MODEL { "type": "org.apache.solr.ltr.ranking.LambdaMARTModel", "name": "mymodel1", "features": [ { "name": "matchedTitle" }, { "name": "isPersonAndExecutive" } ], "params": { "trees": [ { "weight": 1, "tree": { "feature": "matchedTitle", "threshold": 0.5, "left": { "value": -100 }, "right": { "feature": "isPersonAndExecutive", "threshold": 0.5, "left": { "value": 50 }, "right": { "value": 75 } } } } ] } }
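To show how a model of this shape turns a feature vector into a score, the Python sketch below walks the single tree from the slide. It assumes that a feature value less than or equal to the threshold follows the left branch and that the model score is the weighted sum over all trees; both are assumptions for illustration, not a statement of the plugin's internals.

```python
MODEL = {
    "name": "mymodel1",
    "params": {
        "trees": [
            {
                "weight": 1,
                "tree": {
                    "feature": "matchedTitle",
                    "threshold": 0.5,
                    "left": {"value": -100},
                    "right": {
                        "feature": "isPersonAndExecutive",
                        "threshold": 0.5,
                        "left": {"value": 50},
                        "right": {"value": 75},
                    },
                },
            }
        ]
    },
}


def score_tree(node, features):
    """Descend until a leaf; assumed rule: value <= threshold goes left."""
    while "value" not in node:
        value = features.get(node["feature"], 0.0)
        node = node["left"] if value <= node["threshold"] else node["right"]
    return node["value"]


def score_model(model, features):
    """Weighted sum of the leaf values reached in every tree."""
    return sum(
        t["weight"] * score_tree(t["tree"], features)
        for t in model["params"]["trees"]
    )


print(score_model(MODEL, {"matchedTitle": 1.0, "isPersonAndExecutive": 1.0}))
```

Under these assumptions a document whose title does not match lands on the -100 leaf, while a title match combined with an executive position reaches the 75 leaf.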
  • 49. LTR PLUGIN: MODEL ●  ModelStore is also a Solr Managed Resource ●  Deploy ─  curl -XPUT 'http://yoursolrserver/solr/collection/config/mstore' --data-binary @./model.json -H 'Content-type:application/json' ●  View ─  http://yoursolrserver/solr/collection/config/mstore ●  Inherit from the model class for new scoring algorithms ─  score() ─  explain()
  • 50. LTR PLUGIN: EVALUATION ●  Offline Metrics ─  nDCG increased approximately 10% after reranking ●  Online Metrics ─  Clicks @ 1 up by approximately 10%
  • 51. BEFORE AND AFTER Query: “unemployment” Solr Ranking Machine Learned Reranking
  • 52. LTR PLUGIN: EVALUATION ● Offline Metrics ─ nDCG increased approximately 10% after reranking ● Online Metrics ─ Clicks @ 1 up by approximately 10% ● Performance ─ About 30% faster than previous external ranking system ─ 10 million documents in collection ─ 100k queries ─ 1k features ─ 1k documents/query reranked
  • 53. LTR PLUGIN: BENEFITS ●  Simpler feature engineering, without compiling ●  Access to rich internal Solr search functionality for feature building ●  Search result relevancy improvements vs regular Solr relevance ●  Automatic relevancy tuning ●  Compatible scores across collections ●  Performance benefits vs external ranking system
  • 54. FUTURE WORK ●  Continue work to open source the plugin ●  Support pipelining multiple reranking models ●  Allow a simple ranking model to be used in the first pass