Rated Ranking Evaluator Enterprise is an enterprise version of the open source Rated Ranking Evaluator search quality evaluation tool. It features query discovery to automatically extract queries from a search API, rating generation using both explicit ratings and implicit feedback, and an interactive UI for exploring and comparing evaluation results. The UI provides overview, exploration, and comparison views of evaluation data to meet the needs of business stakeholders and software engineers. Future work aims to improve the tool's capabilities around configuration, multimedia support, insights generation, and click modeling.
Rated Ranking Evaluator Enterprise: the next generation of free Search Quality Evaluation Tools
1. Rated Ranking Evaluator Enterprise: the Next Generation of Free Search Quality Evaluation Tools
Alessandro Benedetti, Director
Andrea Gazzarini, Co-Founder
15th September 2021
‣ Born in Tarquinia (ancient Etruscan city)
‣ R&D Software Engineer
‣ Director
‣ Master in Computer Science
‣ Apache Lucene/Solr PMC member/committer
‣ Elasticsearch expert
‣ Passionate about semantic, NLP, and machine learning technologies
‣ Beach Volleyball player and Snowboarder
Who We Are
Alessandro Benedetti
3. ‣ Born in Viterbo
‣ Hermit Software Engineer
‣ Master's in Economics
‣ Passionate about programming
‣ RRE Creator
‣ Apache Lucene/Solr Expert
‣ Elasticsearch Expert
‣ Apache Qpid Committer
‣ Father, Husband
‣ Bass player, aspiring (still frustrated at the moment) Chapman Stick player
Who We Are
Andrea Gazzarini
‣ Headquartered in London / distributed team
‣ Open Source Enthusiasts
‣ Apache Lucene/Solr experts
‣ Elasticsearch experts
‣ Community Contributors
‣ Active Researchers
‣ Hot Trends: Learning to Rank, Document Similarity, Search Quality Evaluation, Relevancy Tuning
Search Services
www.sease.io
8. ‣ Open Source library for Search Quality Evaluation
‣ Ratings are expected as input (JSON supported)
‣ Many offline metrics available out-of-the-box
(Precision@k, NDCG@k, F-Measure…)
‣ Apache Solr and Elasticsearch support
‣ Development-centric approach
‣ Evaluation on the fly and results in various formats
‣ Community building up!
‣ RRE-User mailing list:
https://groups.google.com/g/rre-user
‣ https://github.com/SeaseLtd/rated-ranking-evaluator
Rated Ranking Evaluator: RRE Open Source
What is it?
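To make the metrics mentioned above concrete, here is a simplified sketch of two of them, Precision@k and NDCG@k. These are the textbook definitions, written for illustration only; they are not RRE's actual implementation, and the document ids and judgments are invented.

```python
# Illustrative implementations of two offline metrics RRE offers
# out-of-the-box: Precision@k and NDCG@k.
import math

def precision_at_k(results, judgments, k=10):
    """Fraction of the top-k results that carry a positive rating."""
    relevant = sum(1 for doc in results[:k] if judgments.get(doc, 0) > 0)
    return relevant / k

def ndcg_at_k(results, judgments, k=10):
    """Normalized Discounted Cumulative Gain over the top-k results."""
    def dcg(ratings):
        return sum(r / math.log2(i + 2) for i, r in enumerate(ratings))
    gains = [judgments.get(doc, 0) for doc in results[:k]]
    ideal = sorted(judgments.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical judgments: document ids mapped to relevance ratings.
judgments = {"d1": 3, "d2": 0, "d3": 1}
p3 = precision_at_k(["d1", "d2", "d3"], judgments, k=3)
n3 = ndcg_at_k(["d1", "d2", "d3"], judgments, k=3)
```

Unrated documents are treated as non-relevant (rating 0), a common convention in offline evaluation.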
9. Rated Ranking Evaluator: The Genesis
2018: Search Consultancy Project. A customer explicitly asked for a rudimentary search quality evaluation tool while we were working on their search infrastructure.
Jun: "Search Quality Evaluation: A Developer Perspective" (RRE 0.9)
Oct: "Search Quality Evaluation: Tools and Techniques" (RRE 1.0)
to be continued...
10. Rated Ranking Evaluator: The Idea
RRE was conceived as a development tool that executes search quality evaluations as part of a project build process.
It is like a read–eval–print loop (REPL) built on top of an Information Retrieval subsystem, encouraging an incremental / iterative approach.
The underlying idea: whether you are building a new system (v0.1, …, v1.0) or evolving an existing one through change requests and bug fixes (v1.1, v1.2, v1.3, …, v2.0), sooner or later users complain about junk in search results. In terms of retrieval effectiveness, how can we know the system performance across the various versions?
11. Rated Ranking Evaluator: Domain Model
The RRE Domain Model is organized into a composite, tree-like structure where the relationships between entities are always one-to-many.
The top-level entity is a placeholder representing an evaluation execution.
Versioned metrics are computed at query level and then reported, using an aggregation function, at the upper levels.
The benefit of having a composite structure is clear: we can see a metric value at different levels (e.g. a single query, all queries belonging to a query group, all queries belonging to a topic, or the whole corpus).
Domain Model
Evaluation (top-level domain entity)
  1..* Corpus (test dataset / collection)
    1..* Topic (information need)
      1..* Query Group (query variants)
        1..* Query (queries)
For each version (v1.0, v1.1, v1.2, …, v1.n) every entity carries its metrics: P@10, NDCG, AP, F-MEASURE, …
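This Corpus → Topic → Query Group → Query hierarchy is what a JSON ratings file feeds into. The fragment below is an illustrative sketch only: the index, topic, and query group names are invented, and the exact field names should be double-checked against the RRE documentation.

```json
{
  "index": "products",
  "topics": [
    {
      "description": "Laptop search",
      "query_groups": [
        {
          "name": "brand queries",
          "queries": [
            { "template": "only_q.json", "placeholders": { "$query": "lenovo laptop" } }
          ],
          "relevant_documents": {
            "3": ["doc1"],
            "2": ["doc4", "doc9"]
          }
        }
      ]
    }
  ]
}
```

Here the rated documents are grouped by gain: documents under "3" are highly relevant for every query variant in the group, documents under "2" somewhat less so.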
12. Rated Ranking Evaluator: How it works
INPUT LAYER: data, configuration, ratings.
EVALUATION LAYER: uses a search platform and produces evaluation data.
OUTPUT LAYER: the evaluation data is used for generating JSON output, the RRE Console, …
13. Explainability: why is it important in Information Retrieval?
Dev, tune & build on one monitor; check evaluation results on the other (we are thinking about how to fill a third monitor).
15. RRE Enterprise: The Genesis
2018: Search Consultancy Project. A customer explicitly asked for a rudimentary search quality evaluation tool while we were working on their search infrastructure.
2019: "Rated Ranking Evaluator: An Open Source Approach". The first sketches depicting an idea for an enterprise-level version of RRE; development starts a few months later.
to be continued...
16. Rated Ranking Evaluator: The Genesis
2018: Search Consultancy Project. A customer explicitly asked for a rudimentary search quality evaluation tool while we were working on their search infrastructure.
2019, 2020, 2021: JBCP, ID Discovery, Query Discovery
to be continued...
18. RRE Open Source Recap: How it works (1/2)
INPUT LAYER: data, configuration, ratings.
EVALUATION LAYER: targets Apache Solr OR Elasticsearch and produces evaluation data.
OUTPUT LAYER: used for generating JSON output, the RRE Console, …
19. RRE Open Source Recap: How it works (2/2)
The same pipeline: the ratings and configuration target Apache Solr OR Elasticsearch directly, the evaluation produces evaluation data, and the output is used for generating JSON output, the RRE Console, …
20. RRE Enterprise: Query Discovery?
Problem: I have an intermediate Search API that builds complex Apache Solr/Elasticsearch queries.
21. RRE Open Source: No Query Discovery
(1) The ratings target Apache Solr OR Elasticsearch directly; (2) the evaluation produces evaluation data.
22. RRE Enterprise: Query Discovery & Evaluation
(1) Query discovery correlates on the requests hitting the {SEARCH API}; (2) the discovered queries target Apache Solr OR Elasticsearch; (3) the evaluation produces evaluation data.
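One way to picture query discovery is as an interception layer around the Search API's engine client, so that every Solr/Elasticsearch query the API builds is recorded and can later be replayed during evaluation. The sketch below is purely conceptual; it is not RRE Enterprise's internals, and all names in it are hypothetical.

```python
# Conceptual sketch of query discovery: record every engine query
# an intermediate Search API generates, correlated with the user
# request that produced it.

discovered = []

def instrument(engine_search):
    """Wrap the function the Search API uses to hit the engine."""
    def wrapper(engine_query):
        discovered.append(engine_query)   # record the generated query
        return engine_search(engine_query)
    return wrapper

# A toy "engine" call; a stand-in for a real Solr/Elasticsearch client.
def engine_search(engine_query):
    return ["doc1", "doc2"]

engine_search = instrument(engine_search)

# A toy Search API that builds a complex engine query from user input.
def search_api(user_query):
    engine_query = {"q": user_query, "defType": "edismax", "qf": "title^2 body"}
    return engine_search(engine_query)

search_api("laptop")
# 'discovered' now holds the engine query the API generated.
```

Once the engine query is captured, ratings collected against the Search API request can be evaluated against the actual Solr/Elasticsearch query it produced.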
23. Input Ratings: RRE Open Source vs RRE Enterprise
RRE Open Source: Topic (information need) → Query Group (query variant) → Query ((search engine) query) → Rated Document (with a rating, e.g. 3)
RRE Enterprise: Topic (information need) → Query Group (query variant) → API Request ((Search API) request) → Query ((search engine) query) → Rated Document (with a rating, e.g. 3)
25. RRE Enterprise: Rating Generation
A fundamental requirement of offline search quality evaluation is to gather <query, document, rating> triples that represent the relevance (rating) of a document given a user information need (query).
Before assessing the retrieval effectiveness of a system, it is necessary to associate a relevance rating to each <query, document> pair involved in our evaluation.
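The triples described above are typically collected into a judgment lookup before any metric can be computed. A minimal sketch, with invented queries and document ids:

```python
# Illustrative sketch: turning <query, document, rating> triples into
# a judgment lookup, the structure every offline metric consumes.
from collections import defaultdict

triples = [
    ("laptop", "doc1", 3),
    ("laptop", "doc7", 1),
    ("usb cable", "doc2", 2),
]

judgments = defaultdict(dict)
for query, doc, rating in triples:
    judgments[query][doc] = rating

def rating_of(query, doc):
    """A document missing from the judgments is treated as unrated (0)."""
    return judgments[query].get(doc, 0)
```

With this lookup in place, any ranked result list returned for "laptop" can be scored by looking up each document's rating.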
26. RRE Enterprise: Explicit Ratings (1/2)
Explicit Ratings
• Explicitly provided by domain experts
• High accuracy
• High effort / time / resources
• RRE Open Source accepts only explicit ratings
RRE Rating Structure: Topic (information need) → Query Group (query variant) → Query → Rated Document (with a rating, e.g. 3)
Question: how can we minimize the effort required to provide explicit ratings?
27. RRE Enterprise: Explicit Ratings (2/2)
Judgment Collector
• A Chrome plugin which applies an evaluation layer on top of an arbitrary website
• Ratings are generated directly on the customer website
• Lowest learning curve for users
• Generated ratings are sent to RREE through a specific endpoint
• An "ID Discovery" component translates the received data (rated web items) into RRE ratings (rated Solr/Elasticsearch documents)
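The ID Discovery step can be pictured as mapping a rated web item (a page URL judged in the browser plugin) to the search-engine document behind it. The sketch below is hypothetical: the URL scheme, field names, and the idea that an id is recoverable from the URL are all assumptions for illustration.

```python
# Hypothetical sketch of ID Discovery: a rated web item becomes a
# rated search-engine document once its id is recovered.
import re

def web_item_to_doc_id(url):
    """Extract a document id from URLs like https://shop.example.com/p/12345."""
    match = re.search(r"/p/(\d+)", url)
    return match.group(1) if match else None

# A rated web item arriving from the Judgment Collector plugin...
web_rating = {"url": "https://shop.example.com/p/12345", "query": "laptop", "rating": 3}

# ...translated into an RRE-style rated document.
doc_id = web_item_to_doc_id(web_rating["url"])
rre_rating = {"query": web_rating["query"], "document": doc_id, "rating": web_rating["rating"]}
```

In practice the mapping may instead require querying the engine on a unique field (e.g. the page URL stored in the document), but the outcome is the same: a rating on a web page becomes a rating on a Solr/Elasticsearch document.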
47. RRE Enterprise: Future Work
‣ Release with a free usage plan
‣ Configuration support
‣ Support for multimedia document properties
‣ Intelligent insights on weakly performing queries, query groups, and topics
‣ Improvements on click modelling for implicit relevance estimation