BETTER SEARCH
           ENGINE TESTING




     STPCON 2011 | EPUGH@O19S.COM | @DEP4B

                                             1




     WHY AM I QUALIFIED TO BE
           UP HERE?
• President of OpenSource Connections

• Contributor to CruiseControl and Continuum CI projects

• Member of Apache Software Foundation

• Presenter at conferences (OSCON, ApacheCON, jTDS, ExpoQA, STPcon 2009!)

                                             2
AUTHOR




         3




WRITER




         4
FATHER




           5




AGILISTA




           6
AGENDA
 Why is Search Becoming More
          Important?


   What is a Search Engine?

    Techniques for Testing

          Wrap Up



                               7




WHY IS SEARCH
BECOMING MORE
  IMPORTANT?


                               8
INFORMATION IS EXPLODING

   “information workers ... are each bombarded with 1.6
  gigabytes of information on average every day through
    emails, reports, blogs, text messages, calls and more”.

  • http://online.wsj.com/article/SB124252211780027326.html




                                                               9




               UNSTRUCTURED



• emails, spreadsheets, documents, presentations, images,
 databases

  • 75%   unstructured to 25% structured




                                                              10
MANAGING DATA IS
              EXPENSIVE


• 1 GB costs $0.20 to store

• 1 GB costs $3,500 to manage




                                                               11




 WHAT DOES $3,500 BUY YOU?


• 69% of respondents felt 50% or less of data could be found
 online

• Knowledge  workers spend 25% of their time engaged in
 search-related activities.




                                                               12
WHY NOT JUST USE GOOGLE


 • We don’t want 44 million results, we want 1

 • We want “the” answer, not “an” answer

 • We tolerate inefficiency in Internet search

 • We are happy to “satisfice”

As John Allen Paulos puts it: “The Internet is the world's largest library. It's just that all the books are on the floor.”
                                                     13




        WHAT IS A SEARCH
           ENGINE?




                                                     14
THREE STAGES OF SEARCH




                         20
CONTENT INDEXING
• Creating an index by crawling the content directories, databases, and other repositories using an automated process (either pushing or pulling changes)

• Create an index, which is a searchable key to a collection.

• In Enterprise Search, the indexing mechanism should be able to access company private data (with access privileges maintained)

• Control the indexing schedule: be able to index rapidly changing content quickly, other content more slowly.

• rather   than having the bot look for the data.
                                                                      21




            CONTENT INDEXING

• Indexing    may also support

  • Metadata     extraction

  • Auto-summarization, which analyzes the collection and groups its content into categories or clusters.

• Metadata in turn becomes facets that can be used to tune the query to put emphasis on that category.


                                                                      22
CONTENT INDEXING
                   23




   QUERYING




                   24
FORMATTING
                                            25




   FACETING
Faceted or "guided navigation"
leverages metadata fields and
values to provide users with visible
options for narrowing or refining
their query.
- Peter Morville, Search Patterns
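
A minimal sketch of how facets can be derived from metadata, assuming each search hit is a plain dictionary of fields. The field names, sample hits, and the helper names `facet_counts` and `apply_facet` are all hypothetical, made up for illustration rather than taken from any search engine's API.

```python
from collections import Counter

# Hypothetical result set: each hit carries metadata fields.
results = [
    {"title": "Desk lamp", "category": "lighting", "brand": "Acme"},
    {"title": "Lamp shade", "category": "lighting", "brand": "Brite"},
    {"title": "Floor lamp", "category": "lighting", "brand": "Acme"},
    {"title": "Lamp oil", "category": "fuel", "brand": "Acme"},
]


def facet_counts(hits, field):
    """Count how many hits carry each value of a metadata field."""
    return Counter(hit[field] for hit in hits if field in hit)


def apply_facet(hits, field, value):
    """Narrow the result set to hits whose field matches the chosen value."""
    return [hit for hit in hits if hit.get(field) == value]


# The counts are what a UI would show as clickable options.
category_facet = facet_counts(results, "category")
# Clicking a facet value refines the current result set.
narrowed = apply_facet(results, "brand", "Acme")
```

A test for faceting can then assert both the counts shown to the user and the size of the narrowed result set.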




                                       26
Search Stack



          User Interface

          Search Engine

               Data

                           27




  HOW DO WE TEST?

                           28
HOW DO WE TEST?


• Querying

• Formatting

• Content   Indexing

• Performance



                               29




         WHO SHOULD TEST?




                               30
CHALLENGES
• Competing   business stakeholders:

 • Tester: When I search for “lamp shades”, I used to see these
   documents, now I see a different set.

 • Business  Owner: How do I know that the new search
   engine is better?

 • User: My   pet feature “search within these results” works
   differently.

 • Marketing Guy: I want to control the results so the current
   marketing push for toilet paper brand X always shows up at
   the top.
                                                                  31




                  CHALLENGES


 • Stakeholders want a better search implementation, but
   perversely often want it all to work “the exact same way”.
   Getting agreement across all the stakeholders on the
   project vision and on the metrics is a challenge.




                                                                  32
PERFECT SEARCH TESTER
         WOULD BE ALL OF
• Mathematician                    • Business Analyst

• Librarian                        • Systems   Engineer

• UX   Expert                      • Geographer!

• Writer                           • Psychologist

• Programmer




                                                                   33




       KNOWLEDGE TRANSFER

• If you don’t have the perfect team already, bring in experts and do domain knowledge transfer.

• Learn the vocabulary of search to better communicate together

  • “auto   complete” vs “auto suggest”

• Do “Search    for Content Team” brownbag sessions!


                                                                   34
QUERY TESTING



• Often    called “relevancy testing”




                                        35




              TWO SCHOOLS OF
                THOUGHT


• “One True Answer”

• “I   know it when I see it”




                                        36
“ONE TRUE ANSWER”

• Absolute Truth / Matrix / Grid / TREC / Relevancy Assertions

 • The correct answers for each search are known ahead of time

 • Human judges often decide these correct answers, stored as Relevancy Assertions

 • Can be labor intensive to set up

• A “Numerical Grade” is produced for comparison
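
One way to picture relevancy assertions and the numerical grade, as a rough sketch only. The queries, document IDs, the stand-in `fake_search` function, and the grading formula (average fraction of expected documents found in the top N) are assumptions for illustration; the slides do not prescribe a particular formula.

```python
# Hypothetical relevancy assertions: for each query, the document IDs
# that human judges agreed should be returned.
assertions = {
    "lamp shades": {"doc1", "doc4"},
    "desk lamp": {"doc2"},
}


def fake_search(query):
    """Stand-in for the engine under test; returns ranked document IDs."""
    canned = {
        "lamp shades": ["doc1", "doc3", "doc4"],
        "desk lamp": ["doc5", "doc6"],
    }
    return canned.get(query, [])


def grade(search, judged, top_n=10):
    """Average, over all queries, of the fraction of expected docs in the top N."""
    scores = []
    for query, expected in judged.items():
        returned = set(search(query)[:top_n])
        scores.append(len(expected & returned) / len(expected))
    return sum(scores) / len(scores)


# A single number that can be compared between engine versions.
score = grade(fake_search, assertions)
```

Running the same assertions against two engine versions gives two grades to compare, which is the point of the approach.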

                                                                   37




           PROBLEMS WITH THIS
               APPROACH
• Open to gaming. TREC competition is swamped by “academic” search engine efforts that don’t work in the real world.

• Need a well understood data set with generally accepted answers.

• Is it better to have an engine that gives modestly relevant results almost all the time, or an engine that gives really good answers sometimes, better on average than the other engine, but sometimes gives back complete garbage?
                                                                   38
A/B TESTING
                                              Engine version 1 and version 2

• Tracks explicit or implicit preferences between engines A/B

• Often dispenses with the notion of the "correct" answer

• Can be easier to set up, but some fear the best answers will be missed by both engines
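
A small sketch of tallying explicit A/B preferences, assuming testers record which engine's results they preferred for each query. The vote log and the `preference_share` helper are hypothetical; implicit preferences (clicks, for example) would need a different collection step.

```python
from collections import Counter

# Hypothetical log: which engine's result list each tester preferred per query.
# "A" = engine version 1, "B" = engine version 2, "tie" = no preference.
preferences = ["A", "B", "B", "tie", "B", "A", "B"]


def preference_share(votes):
    """Return each engine's share of the votes that expressed a preference."""
    counts = Counter(v for v in votes if v != "tie")
    decided = sum(counts.values())
    if decided == 0:
        # Nobody expressed a preference, so neither engine gets a share.
        return {"A": 0.0, "B": 0.0}
    return {engine: counts[engine] / decided for engine in ("A", "B")}


share = preference_share(preferences)
```

Note that nothing here says whether either engine returned the best possible answer, only which one testers preferred.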



                                                                     39




                       RELEVANCY



• Do   we have any defined relevancy metrics?

• Relevancy   is like porn.....




                                                                     40
I KNOW IT WHEN I SEE IT!




            http://en.wikipedia.org/wiki/Les_Amants

                                                                  41




   BEYOND PRECISION AND
  RECALL: HOW ENGINES ARE
• Binary vs. Non-Binary Grading Systems

  • Early TREC had binary judgements, only Yes/No on whether each doc was related to a test search

  • More choices were later added

  • A system can use letter grades (A, B, C, D and F) or numeric grades

  • Another style asks testers to sort documents in their preferred order
                                                                  42
CLASSIC MEASUREMENTS OF
    SEARCH RELEVANCY


• Recall: "Did I find all the documents I expected to get back? What percent?"

• Precision: "Did the system bring back other documents that weren't relevant? What percent were on target?"
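
The two classic measurements can be computed from sets of document IDs. This is a minimal sketch with made-up IDs; `precision_recall` is a hypothetical helper name.

```python
def precision_recall(returned, relevant):
    """Compute (precision, recall) for one query from document ID collections."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    # Precision: what fraction of what came back was on target?
    precision = len(hits) / len(returned) if returned else 0.0
    # Recall: what fraction of the expected documents came back?
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


returned = ["doc1", "doc2", "doc3", "doc4"]  # what the engine brought back
relevant = ["doc1", "doc3", "doc7"]          # what the judges expected
precision, recall = precision_recall(returned, relevant)
```

Here two of the four returned documents are relevant, and two of the three expected documents were found.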




                                                              43




                    NEWER IDEAS


• Rank: The   order of documents that were returned

  • Generally a 1 in 20 match in the #1 spot is better than a 50% rate where all matches are on the second page.
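
One common rank-aware measure is reciprocal rank, used here as an illustration rather than as the metric the slide has in mind. The sketch assumes a page size of 10 and made-up document IDs, and scores the two situations described above.

```python
def reciprocal_rank(returned, relevant):
    """1 / position of the first relevant document, or 0.0 if none is returned."""
    for position, doc_id in enumerate(returned, start=1):
        if doc_id in relevant:
            return 1.0 / position
    return 0.0


relevant = {f"r{i}" for i in range(10)}

# 1 relevant hit in 20 results, but it sits in the #1 spot.
top_hit = ["r0"] + [f"x{i}" for i in range(19)]

# 10 relevant hits in 20 results (50%), all on the second page.
second_page = [f"x{i}" for i in range(10)] + [f"r{i}" for i in range(10)]

rr_top_hit = reciprocal_rank(top_hit, relevant)
rr_second_page = reciprocal_rank(second_page, relevant)
```

Precision alone would favor the second list; a rank-aware measure favors the first, which matches what users see without paging.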




                                                              44
INTERACTIVITY: WHAT
      NAVIGATORS OR
 VISUALIZATION WERE GIVEN
• Facets   and sorting: Clickable filters and sort options

• Unsupervised     Clustering: Related terms or phrases, or related
 searches

• Spelling   and thesaurus suggestions



                                                                      45




   SUBJECT DISAMBIGUATION,
   SENTIMENT, CONFLICTING
    INFORMATION, CROWD
            HINTS
• kidney   bean or kidney cell

• "best   football team in the UK"




                                                                      46
47




SOURCES OF VARIANCE, AKA
      "PROBLEMS"
                 Note, this is talking
                 about comparing search
                 engine A to search engine
                 B. But I am thinking
                 more in the context of
                 search engine v1 to v2.




                                             48
DIFFERENT GOALS



• Perfect/Human   vs. Best vs. Acceptable vs. Better than X

• Constrained   vs. Unconstrained Resources (time, cpu, storage)




                                                                   49




                     SAMPLE SIZE


• Amount   of Data

 • Fixed   set or growing over time

• Number    of Testers (AB or Relevancy Judgments)

• Number    of Searches



                                                                   50
VERTICAL VS. HORIZONTAL
          CONTENT


• One extreme: Specific demo may cover just one discipline, for
 example Medical Journals

• Other   extreme: Internet covers vastly disparate domains




                                                                51




                             USERS
• Experienced     vs. New Searcher

• Subject   Expert vs. Novice

• Spelling, typing   and computer proficiency

• Interface Medium (large visual display, small text display,
 audible, Braille, etc)

• Amount      of Effort to understand Search

• Willingness   to Iterate

• Searching    for specific answer vs. General Exploration
                                                                52
TYPE OF SEARCHES

• Length    / 1 or 2 words

• Full   question

• Sample     text

• Internet   Boolean

• Advanced     Boolean / Syntax / Proximity

  • Wildcard, Regex, etc.

                                              53




                    PUNCTUATION


• Chemical

• Source    Code

• Units   of Measure

• Literal   vs. Search Operator



                                              54
NOT EVEN GETTING INTO
    MULTI LINGUAL SEARCH



• How   do I test in languages I don’t understand?




                                                     55




          GROK YOUR RESULTS
                                                     56
FORMATTING TESTING



• Directly   builds on most of our existing test skills.




                                                           57




     PERSONAS & SCENARIOS




                                                           58
Persona 1: Going to be a mom (Needy)

"Oh my God I’m actually Pregnant!"
"What’s next? What am I supposed to do? Guidance please!"

To interact with people going through the same thing.

• Scenarios that typify - planned to get pregnant, but hasn't done any research
• Catch phrases - Nervous but excited, giddy, Where do I start?
• Tag lines - Wants to share, has a million questions
• Likely to say - Guide me, help me get off to a good start

Narrative

Self Introduction
Hi all, I'm very new to this but i couldn't help but share my excitement. I have just found out today that i am pregnant. It wasn't planned, me and my partner of a year and a half were going to wait until we had our own place and were married first but it looks like we have done it the other way round.

My only concern is that i don't really know how my boyfriend feels about it. I know we need to discuss the options but i have really already made up my mind about what i want to do. There is so much to consider, money, a decent place to live, being ready but i know i am ready and have been for a long time ( I get extremely broody when i see my friends kids)

Should i just tell him how i feel or go with how he feels because i don't want to lose him. He is a loving partner who would stand by me through anything i just don’t want him to feel like i am tying him down!!! I suppose i am feeling very happy but also very confused at the same time!!!

http://forum.sofeminine.co.uk/forum/maternite1/__f468_maternite1-Oh-my-god-i-m-pregnant.html



                                                                                                                                                    59




Persona 2: New Mom (Demanding)

"Are my kids sick or is this condition normal? How do I…?"
"How do I ensure my baby is latching on correctly?"
"What type of stroller should I buy? What brand of car seat is best?"

Scenarios that Typify
• Likely to say - Are my kids sick or is this condition normal?
• Describes herself - wants to be a good mother, looking for expert advice, wants to get ideas from other moms
• Narrative - could be working mom, could be stay at home mom
• Questions likely to ask - sometimes wants to ask questions/get expert advice

Picture caption: This picture captures my life perfectly: an adult beverage sitting on a book about underpants.

Narrative
I have been hearing about women who claim that thier 2, 2 1/2 or 3 year old is not ready for the potty. They claim its a nightmare and are waiting for their children to come around.

Maybe I grew up in the twilight zone, but I had always assumed that potty training was something that is just done. Its done when:
   a) The child in question can sleep through the night and stay dry.
   b) The child in question can speak to you, in full sentences. like, "apple juice, please" or "wanna go to the park" or "momma I wanna hold you..."
   c) The child in question knows they are soiled and can ask to be changed.

Barring any of those things, a child is ready to be placed on the potty. using the potty was never negotiable in my family. When we hit the above milestones my mother trained us. We just did it. If we complained she never put diapers on us, she just kept directing us back to the potty.

Her methods of redirection may be controversial (she told my brother that unless he was a big boy he would not get a happy meal. Boys who pooed on themselves got sad meals... lol!!! He straightened up and started using the potty at 2 1/2) but she was never abusive or anything she just DIDNT ASK US. it was time to potty and that was it.

The reasoning was that I used to drink from a bottle, and sleep with my mother, and such, now I don't. I also used to crap my pants, and that is no longer allowed after a certain point.

My question is this: why ask children if they are ready to use the potty, after they are clearly ready to use it (with language tools and bladder control)? Why is it treated like something that is negotiable or that the child has a choice of either coming around to it or not? I understand that children are sensitive and you have to follow their lead, at times. But allowing them to shit
                                                                                                                                                    60
Scenario 1
Find old answer: “I know I went through this before with my first child, but cannot recall the answer”

Preamble
Experienced mom has a déjà vu moment about a previous problematic experience with her first child. She has a partial recollection of a piece of information related to the answer she seeks but she needs help in pulling

Success Factors
• Speed of Comprehension
• Directness to destination
• Reduced:
   • Number of queries
   • Number of results
• Indirect Knowledge Transfer

Thinking aloud in the Family Room

Panel 1: "Hhhm I know I had the same issue with Josh, but what the heck did I do?" (baby crying: wwwaaaaaaaa wwwwaaaaaa . . . ggg wwwaaaaaaaa . . .)
After hours of frustration mother home alone has a partial epiphany as to her child’s problem.

Panel 2: "Josh had not started to cry non-stop for 3 hours when it finally dawned on me that he had not had a movement for 3 days . . . Let’s try querying that . . . “no poop” . . . Not likely . . . Uumm . . . “constipation”? Oh, might help to specify who as well . . . “baby” . . ."
Mother starts to type in query but suggest-as-you-type search box hints to her to be more specific.

Panel 3: "Very nice - lists out related concerns for constipation. Let’s see: ‘symptoms’, ‘cures’, ‘when to call the doctor’, ‘what other moms are saying’, ‘topic overview’. Ok - I’ll take ‘cures’ Alex for 300 points and my personal sanity! Water . . . fruit juice . . . high-fiber baby foods - Ahhh prune juice . . . prune juice! Now why didn’t I remember that!"
Structured results quickly tip off the mother to the assorted aspects of constipation. She focuses in on one of the aspects and has total recollection of her previous experience.




                                                                                                                                                                                     61




Scenario 2
Urgent Question: “It’s 2am and I don’t know who to ask?”

Preamble
Mother of twins finds herself panicked in the early morning hours with a new situation.

Success Factors
• Speed of Comprehension
• Directness to answer

Crying in the Kitchen

Panel 1: "Crap! Who am I supposed to ask at this hour! Why is it nobody is open when I need them?!" (babies crying: wwwaaaaaaaa wwwwaaaaaa . . . ggg wwwaaaaaaaaa . . .)
In the middle of the night, a mother of twins finds herself alone, overwhelmed, and in dire need of an answer.

Panel 2: "I don’t have to read hundreds of pages on the internet . . . I just need a quick concise answer . . . at what temperature do I need to be worried . . ? Please [BabyCenter] show me the answer . . !"
Mother starts to type in a query but notices the suggest-as-you-type search box lets her narrow her question, boosting her confidence she is going to get the answer she needs.

Panel 3: "‘102’ . . . thank god! We’re safe. Ahhh . . . that’s helpful - other conditions to know about . . . That’s thorough: ‘What will the doctor do?’ Interesting: ‘If fever is a defense against infection, is it really a good idea to try to bring it down?’ Let me bookmark this for later."
The mother zooms in on the specific answer she seeks. But then she notices collateral knowledge she takes note of for later reading.




                                                                                                                                                                                     62
CONTENT INDEXING
               TESTING


• Leverages our normal testing skills. And typically what it really
 means is “Performance Testing”.

 • Lots of “integration” testing.




                                                                      63




     PERFORMANCE TESTING




                                                                      64
LEVELS OF SCALING
• Scale    High

  • You quickly hit a point of diminishing returns!

• Scale Wide

  • The    safety valve for lots of load.

• Scale    Deep

  • Scaling Deep? You are doing some crazy stuff with huge
    indexes!!
                                     65




            SCALE WIDE (SLAVES)

• Too   many inbound queries!

• Slaves poll master for changes

• Index and config files transferred

• ALL   JAVA!

                                     66
SCALE WIDE (SHARDING)
• Too large of an index to query

• Split index over multiple Search servers

  • A -> M: Server 1, N -> Z: Server 2

  • uniqueId.hash % numServers

• Relevancy typically balanced shards

• Request split across shards, results aggregated to single response
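
The hash-based routing and result aggregation can be sketched in a few lines. This is a toy model, not how any particular search server implements sharding: the documents, the term-count scoring, and the helper names are assumptions, and `zlib.crc32` stands in for `uniqueId.hash` because Python's built-in `hash()` for strings varies between runs.

```python
import zlib

NUM_SERVERS = 2


def shard_for(unique_id, num_servers=NUM_SERVERS):
    """Pick a shard with a stable hash: hash(uniqueId) % numServers."""
    return zlib.crc32(unique_id.encode("utf-8")) % num_servers


def build_shards(docs, num_servers=NUM_SERVERS):
    """Distribute documents across shards by their unique ID."""
    shards = [[] for _ in range(num_servers)]
    for doc in docs:
        shards[shard_for(doc["id"], num_servers)].append(doc)
    return shards


def search_shard(shard, term):
    """Toy per-shard search: score = number of times the term appears."""
    hits = []
    for doc in shard:
        score = doc["text"].lower().split().count(term)
        if score:
            hits.append((score, doc["id"]))
    return hits


def distributed_search(shards, term, rows=10):
    """Send the request to every shard, then merge into a single ranked response."""
    merged = []
    for shard in shards:
        merged.extend(search_shard(shard, term))
    # Highest score first; ties broken by ID so the order is deterministic.
    merged.sort(key=lambda hit: (-hit[0], hit[1]))
    return [doc_id for _, doc_id in merged[:rows]]


docs = [
    {"id": "a", "text": "lamp shades and lamp bases"},
    {"id": "b", "text": "floor lamp"},
    {"id": "c", "text": "toilet paper"},
]
shards = build_shards(docs)
top = distributed_search(shards, "lamp")
```

A useful test for a sharded setup is that the merged response matches what a single unsharded index would return for the same query.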
                                   67




                       SCALE DEEP


• Combine  both scaling wide
 to handle number of queries
 with sharding to handle size
 of indexes!




                                   68
WRAP UP




                                                                                69




Methodology: User Interface | Search Engine | Data

Concurrent Streams of Work

• Operationalize Solr - Iteration 2 Story: Deploy Solr into BabyCenter Test Environment

• Search Analysis - Iteration 2 Story: Integrate Solr into Community UI, A/B Testing

• Search Experience - Iteration 2 Story: Conceptual Model (Personas, etc) & Mockups

 OSC APPROACH TO SEARCH
                                                                                70
OSC APPROACH TO SEARCH
                                                                     71




                     RESOURCES

• http://www.scribd.com/doc/17563004/Why-You-Cant-Just-Google-for-Enterprise-Knowledge

• http://www.searchtools.com/info/user-interface.html

• http://www.alistapart.com/articles/testing-search-for-relevancy-and-precision/



                                                                     72
SEARCHPATTERNS.ORG
                          73




                 THANK YOU!



• twitter: dep4b

• speakerrate: http://www.speakerrate.com/epugh/

• email: epugh@opensourceconnections.com

                          74

Better Search Engine Testing - Eric Pugh
