SlideShare a Scribd company logo
1 of 15
TEST DRIVEN RELEVANCY
How to Work with Content Experts to Optimize and
Maintain Search Relevancy

Doug Turnbull

Rena Morse

Search Relevancy Expert Director of Semantic Tech
OpenSource Connections Silverchair Information
Systems
Its us!
Hi Iā€Ÿm Doug!

@softwaredoug
http://www.linkedin.com/in/softwaredoug
http://bit.ly/softwaredoug

Hi Iā€Ÿm Rena!

renam@silverchair.com
How do sales/content curators collaborate with devs?
ā€œWhen doctors search for ā€žmyocardial infarctionā€Ÿ these
documents about ā€žheart attacksā€Ÿ should come upā€

ā€œMyocardial in-what-tion? Dangit Rena, Iā€Ÿm a
Solr consultant ā€“ not a doctorā€
ā€œI donā€Ÿt evenā€¦ā€
ā€œlet me work my Solr magic and get back to
you next weekā€¦ā€
How do content curators collaborate with devs?
ā€¢

Doug knows Solr

ā€¢

Bob knows his business
Rena knows her content
Supplier Pressure
Sales
Conversions
Niche customers

q={!boost b=log(numCatPictures)}bunnies
<tokenizer class="solr.WhitespaceTokenizerFactory"/>

Myocardial infarction

Renal carcinoma

This is a universal pattern ā€“ it takes different strokes!
How do content curators collaborate with devs?
ā€œRena, I fixed that myocardial in-whatever-tion relevancy
issueā€
ā€œOk but you broke this other thing I thought
was fixed!ā€
<reiterates that heā€Ÿs a search expert not a doctor>
<reiterates sheā€Ÿs a paying client>
ā€œok let me see what I can doā€¦. Iā€Ÿll get back to you in a
weekā€
Our Problems
ā€¢

People Problem: Our collaboration stinks
ā€“ ā€œThrow problems over the fenceā€¦ wait a weekā€
ā€“ Siloed by expertise (search vs content experts)
ā€“ Potential for politics, anger, frustration (all paths to the dark side)

ā€¢

Technical Problem: Search testing is hard
ā€“ Small set of relevancy rules impacts all user searches
ā€“ Therefore: Much easier to have regressions than in software
ā€“ Very easy to have regressions
Our Problems
ā€¢

In short, Broken Workflow:
ā€“ Iterations between devs and non-technical experts take a long time
ā€“ Devs need immediate access to non-technical expertise to make rapid
progress
ā€¢ Gather broken searches
ā€¢ Rate existing searches
ā€¢ Find searches that have slid backwards in quality
ā€“ Non-technical experts clearly need devs
ā€¢ Translate business rules to relevancy rules
ā€¢ Bending sophisticated token matching engine to our approximate user
intent
Our Problems
ā€¢

Our lack of collaboration means our testing stinks
ā€“ Need expert feedback to test
ā€œthis is good, this is bad, this is okā€¦.ā€

ā€¢

Search devs often donā€Ÿt know good search! They need help.
ā€œI need an army of Renas locked in a room
telling me what is good and badā€
Solutions?
ā€¢

In s/w development -- automated testing is often away to collaborate
ā€“ Devs Can sit together with content experts and ask:
ā€¢ What should happen in this case?
ā€“ Then record that in the form of a test
@Given("tab $asciiTab")
public void tab(String asciiTab) {
tab = new Tab(asciiTab, new TabParser());
}

@When("the guitar plays")
public void guitarPlays() {
guitar = new Guitar();
guitar.play(tab);
}
@Then("the following notes will be played $notes")
public void theFollowingNotesWillBePlayed(String notes) {
ensureThat(expectedNotes(notes), arePlayedBy(guitar));
}
Test Driven Development with Search
ā€¢

Collaborative testing is absolutely essential for search
ā€“ Good search is best defined by experts in the content:
ā€¢ Marketing, sales, users, etc
Iā€Ÿm a search expert! Not a Doctor!
How can I possibly measure search

ā€¢

Unfortunately thereā€Ÿs nothing to help content experts communicate with search
devs around search (frankly this is rather shocking to me)
Help me help you! I have few ways to
record, measure, and evaluate search
Test Driven Development with Search
ā€¢

Collaborative testing is absolutely essential for search
ā€“ Every change to search relevancy will cause other queries to change
ā€“ MUST know how much our relevancy has changed
ā€œI fixed your searchā€¦ Does it matter that toe fungus query
changed by 30%?ā€
ā€œYeah lets see if we can work together to balance
the two relevancy concernsā€
ā€œIā€Ÿm glad we have tests to know whatā€Ÿs changed!ā€
Test Driven Development with Search
ā€¢

Apply these ideas to search quality:
ā€“ Given query Y
ā€“ What documents are awesome/good/ok/bad/terrible?

ā€¢
ā€¢

Record these ratings somewhere
Observe the changes of all queries simultaneously as we modify relevancy params
Now Iā€Ÿve got the ultimate relevancy workbench. I can
see if my ideas are working or failing right away!
Now I cane see instantly if Dougā€Ÿs changes are making
progress or if weā€Ÿre moving backwards!
Quepid! (give your queries some love!)
ā€¢
ā€¢

We built a tool (a product!) around these ideas
Now our favorite relevancy workbench

Learn more about Quepid!

http://quepid.io

ā€œHey weā€Ÿre kicking butt and
taking names on these search
queries!ā€
Test Driven Development with Quepid
Search Quality is about Collaboration*
*and collaboration is about testing

Demo time

More Related Content

More from lucenerevolution

Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and ParboiledImplementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiledlucenerevolution
Ā 
Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs lucenerevolution
Ā 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchlucenerevolution
Ā 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Stormlucenerevolution
Ā 
Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?lucenerevolution
Ā 
Schemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APISchemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APIlucenerevolution
Ā 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucenelucenerevolution
Ā 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMlucenerevolution
Ā 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucenelucenerevolution
Ā 
Recent Additions to Lucene Arsenal
Recent Additions to Lucene ArsenalRecent Additions to Lucene Arsenal
Recent Additions to Lucene Arsenallucenerevolution
Ā 
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...lucenerevolution
Ā 
Shrinking the haystack wes caldwell - final
Shrinking the haystack   wes caldwell - finalShrinking the haystack   wes caldwell - final
Shrinking the haystack wes caldwell - finallucenerevolution
Ā 
The First Class Integration of Solr with Hadoop
The First Class Integration of Solr with HadoopThe First Class Integration of Solr with Hadoop
The First Class Integration of Solr with Hadooplucenerevolution
Ā 
A Novel methodology for handling Document Level Security in Search Based Appl...
A Novel methodology for handling Document Level Security in Search Based Appl...A Novel methodology for handling Document Level Security in Search Based Appl...
A Novel methodology for handling Document Level Security in Search Based Appl...lucenerevolution
Ā 
How Lucene Powers the LinkedIn Segmentation and Targeting Platform
How Lucene Powers the LinkedIn Segmentation and Targeting PlatformHow Lucene Powers the LinkedIn Segmentation and Targeting Platform
How Lucene Powers the LinkedIn Segmentation and Targeting Platformlucenerevolution
Ā 
Query Latency Optimization with Lucene
Query Latency Optimization with LuceneQuery Latency Optimization with Lucene
Query Latency Optimization with Lucenelucenerevolution
Ā 
10 keys to Solr's Future
10 keys to Solr's Future10 keys to Solr's Future
10 keys to Solr's Futurelucenerevolution
Ā 
Large Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and FriendsLarge Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and Friendslucenerevolution
Ā 
Hacking Lucene and Solr for Fun and Profit
Hacking Lucene and Solr for Fun and ProfitHacking Lucene and Solr for Fun and Profit
Hacking Lucene and Solr for Fun and Profitlucenerevolution
Ā 

More from lucenerevolution (20)

Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and ParboiledImplementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Ā 
Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs
Ā 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic search
Ā 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Storm
Ā 
Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?
Ā 
Schemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APISchemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST API
Ā 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucene
Ā 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Ā 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucene
Ā 
Recent Additions to Lucene Arsenal
Recent Additions to Lucene ArsenalRecent Additions to Lucene Arsenal
Recent Additions to Lucene Arsenal
Ā 
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Ā 
Shrinking the haystack wes caldwell - final
Shrinking the haystack   wes caldwell - finalShrinking the haystack   wes caldwell - final
Shrinking the haystack wes caldwell - final
Ā 
The First Class Integration of Solr with Hadoop
The First Class Integration of Solr with HadoopThe First Class Integration of Solr with Hadoop
The First Class Integration of Solr with Hadoop
Ā 
A Novel methodology for handling Document Level Security in Search Based Appl...
A Novel methodology for handling Document Level Security in Search Based Appl...A Novel methodology for handling Document Level Security in Search Based Appl...
A Novel methodology for handling Document Level Security in Search Based Appl...
Ā 
How Lucene Powers the LinkedIn Segmentation and Targeting Platform
How Lucene Powers the LinkedIn Segmentation and Targeting PlatformHow Lucene Powers the LinkedIn Segmentation and Targeting Platform
How Lucene Powers the LinkedIn Segmentation and Targeting Platform
Ā 
Query Latency Optimization with Lucene
Query Latency Optimization with LuceneQuery Latency Optimization with Lucene
Query Latency Optimization with Lucene
Ā 
10 keys to Solr's Future
10 keys to Solr's Future10 keys to Solr's Future
10 keys to Solr's Future
Ā 
Large Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and FriendsLarge Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and Friends
Ā 
The Typed Index
The Typed IndexThe Typed Index
The Typed Index
Ā 
Hacking Lucene and Solr for Fun and Profit
Hacking Lucene and Solr for Fun and ProfitHacking Lucene and Solr for Fun and Profit
Hacking Lucene and Solr for Fun and Profit
Ā 

Recently uploaded

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
Ā 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
Ā 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
Ā 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
Ā 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
Ā 
FULL ENJOY šŸ” 8264348440 šŸ” Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY šŸ” 8264348440 šŸ” Call Girls in Diplomatic Enclave | DelhiFULL ENJOY šŸ” 8264348440 šŸ” Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY šŸ” 8264348440 šŸ” Call Girls in Diplomatic Enclave | Delhisoniya singh
Ā 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
Ā 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
Ā 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
Ā 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
Ā 
WhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
Ā 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
Ā 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
Ā 
šŸ¬ The future of MySQL is Postgres šŸ˜
šŸ¬  The future of MySQL is Postgres   šŸ˜šŸ¬  The future of MySQL is Postgres   šŸ˜
šŸ¬ The future of MySQL is Postgres šŸ˜RTylerCroy
Ā 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
Ā 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
Ā 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
Ā 
Swan(sea) Song ā€“ personal research during my six years at Swansea ... and bey...
Swan(sea) Song ā€“ personal research during my six years at Swansea ... and bey...Swan(sea) Song ā€“ personal research during my six years at Swansea ... and bey...
Swan(sea) Song ā€“ personal research during my six years at Swansea ... and bey...Alan Dix
Ā 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
Ā 
Scaling API-first ā€“ The story of a global engineering organization
Scaling API-first ā€“ The story of a global engineering organizationScaling API-first ā€“ The story of a global engineering organization
Scaling API-first ā€“ The story of a global engineering organizationRadu Cotescu
Ā 

Recently uploaded (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Ā 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
Ā 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
Ā 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
Ā 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
Ā 
FULL ENJOY šŸ” 8264348440 šŸ” Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY šŸ” 8264348440 šŸ” Call Girls in Diplomatic Enclave | DelhiFULL ENJOY šŸ” 8264348440 šŸ” Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY šŸ” 8264348440 šŸ” Call Girls in Diplomatic Enclave | Delhi
Ā 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
Ā 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Ā 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
Ā 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
Ā 
WhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure service
Ā 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
Ā 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Ā 
šŸ¬ The future of MySQL is Postgres šŸ˜
šŸ¬  The future of MySQL is Postgres   šŸ˜šŸ¬  The future of MySQL is Postgres   šŸ˜
šŸ¬ The future of MySQL is Postgres šŸ˜
Ā 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Ā 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
Ā 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
Ā 
Swan(sea) Song ā€“ personal research during my six years at Swansea ... and bey...
Swan(sea) Song ā€“ personal research during my six years at Swansea ... and bey...Swan(sea) Song ā€“ personal research during my six years at Swansea ... and bey...
Swan(sea) Song ā€“ personal research during my six years at Swansea ... and bey...
Ā 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
Ā 
Scaling API-first ā€“ The story of a global engineering organization
Scaling API-first ā€“ The story of a global engineering organizationScaling API-first ā€“ The story of a global engineering organization
Scaling API-first ā€“ The story of a global engineering organization
Ā 

Test Driven Relevancy -- How to Work with Content Experts to Optimize and Maintain Search Relevancy

  • 1.
  • 2. TEST DRIVEN RELEVANCY How to Work with Content Experts to Optimize and Maintain Search Relevancy Doug Turnbull Rena Morse Search Relevancy Expert Director of Semantic Tech OpenSource Connections Silverchair Information Systems
  • 3. Its us! Hi Iā€Ÿm Doug! @softwaredoug http://www.linkedin.com/in/softwaredoug http://bit.ly/softwaredoug Hi Iā€Ÿm Rena! renam@silverchair.com
  • 4. How do sales/content curators collaborate with devs? ā€œWhen doctors search for ā€žmyocardial infarctionā€Ÿ these documents about ā€žheart attacksā€Ÿ should come upā€ ā€œMyocardial in-what-tion? Dangit Rena, Iā€Ÿm a Solr consultant ā€“ not a doctorā€ ā€œI donā€Ÿt evenā€¦ā€ ā€œlet me work my Solr magic and get back to you next weekā€¦ā€
  • 5. How do content curators collaborate with devs? ā€¢ Doug knows Solr ā€¢ Bob knows his business Rena knows her content Supplier Pressure Sales Conversions Niche customers q={!boost b=log(numCatPictures)}bunnies <tokenizer class="solr.WhitespaceTokenizerFactory"/> Myocardial infarction Renal carcinoma This is a universal pattern ā€“ it takes different strokes!
  • 6. How do content curators collaborate with devs? ā€œRena, I fixed that myocardial in-whatever-tion relevancy issueā€ ā€œOk but you broke this other thing I thought was fixed!ā€ <reiterates that heā€Ÿs a search expert not a doctor> <reiterates sheā€Ÿs a paying client> ā€œok let me see what I can doā€¦. Iā€Ÿll get back to you in a weekā€
  • 7. Our Problems ā€¢ People Problem: Our collaboration stinks ā€“ ā€œThrow problems over the fenceā€¦ wait a weekā€ ā€“ Siloed by expertise (search vs content experts) ā€“ Potential for politics, anger, frustration (all paths to the dark side) ā€¢ Technical Problem: Search testing is hard ā€“ Small set of relevancy rules impacts all user searches ā€“ Therefore: Much easier to have regressions than in software ā€“ Very easy to have regressions
  • 8. Our Problems ā€¢ In short, Broken Workflow: ā€“ Iterations between devs and non-technical experts take a long time ā€“ Devs need immediate access to non-technical expertise to make rapid progress ā€¢ Gather broken searches ā€¢ Rate existing searches ā€¢ Find searches that have slid backwards in quality ā€“ Non-technical experts clearly need devs ā€¢ Translate business rules to relevancy rules ā€¢ Bending sophisticated token matching engine to our approximate user intent
  • 9. Our Problems ā€¢ Our lack of collaboration means our testing stinks ā€“ Need expert feedback to test ā€œthis is good, this is bad, this is okā€¦.ā€ ā€¢ Search devs often donā€Ÿt know good search! They need help. ā€œI need an army of Renas locked in a room telling me what is good and badā€
  • 10. Solutions? ā€¢ In s/w development -- automated testing is often away to collaborate ā€“ Devs Can sit together with content experts and ask: ā€¢ What should happen in this case? ā€“ Then record that in the form of a test @Given("tab $asciiTab") public void tab(String asciiTab) { tab = new Tab(asciiTab, new TabParser()); } @When("the guitar plays") public void guitarPlays() { guitar = new Guitar(); guitar.play(tab); } @Then("the following notes will be played $notes") public void theFollowingNotesWillBePlayed(String notes) { ensureThat(expectedNotes(notes), arePlayedBy(guitar)); }
  • 11. Test Driven Development with Search ā€¢ Collaborative testing is absolutely essential for search ā€“ Good search is best defined by experts in the content: ā€¢ Marketing, sales, users, etc Iā€Ÿm a search expert! Not a Doctor! How can I possibly measure search ā€¢ Unfortunately thereā€Ÿs nothing to help content experts communicate with search devs around search (frankly this is rather shocking to me) Help me help you! I have few ways to record, measure, and evaluate search
  • 12. Test Driven Development with Search ā€¢ Collaborative testing is absolutely essential for search ā€“ Every change to search relevancy will cause other queries to change ā€“ MUST know how much our relevancy has changed ā€œI fixed your searchā€¦ Does it matter that toe fungus query changed by 30%?ā€ ā€œYeah lets see if we can work together to balance the two relevancy concernsā€ ā€œIā€Ÿm glad we have tests to know whatā€Ÿs changed!ā€
  • 13. Test Driven Development with Search ā€¢ Apply these ideas to search quality: ā€“ Given query Y ā€“ What documents are awesome/good/ok/bad/terrible? ā€¢ ā€¢ Record these ratings somewhere Observe the changes of all queries simultaneously as we modify relevancy params Now Iā€Ÿve got the ultimate relevancy workbench. I can see if my ideas are working or failing right away! Now I cane see instantly if Dougā€Ÿs changes are making progress or if weā€Ÿre moving backwards!
  • 14. Quepid! (give your queries some love!) ā€¢ ā€¢ We built a tool (a product!) around these ideas Now our favorite relevancy workbench Learn more about Quepid! http://quepid.io ā€œHey weā€Ÿre kicking butt and taking names on these search queries!ā€
  • 15. Test Driven Development with Quepid Search Quality is about Collaboration* *and collaboration is about testing Demo time