Enterprise search – how relevant is ‘relevance’?
Martin White, Managing Director, Intranet Focus Ltd
Visiting Professor, Information School, University of Sheffield
Enterprise search is the outlier in search applications. It has to work effectively with very large collections of un-curated content, often in multiple languages, to meet the requirements of employees who need to make business-critical decisions.

In this talk, I will outline the challenges of searching enterprise content. Recent research is revealing a unique pattern of search behaviour in which relevance is both very important and yet also irrelevant, and where recall is just as important as precision. This behaviour has implications for the use of standard metrics for search performance (especially in the case of federated search across multiple applications) and for the adoption of AI/ML techniques.

Slide 1. Enterprise search – how relevant is ‘relevance’?
Martin White, Managing Director, Intranet Focus Ltd
Visiting Professor, Information School, University of Sheffield
Meetup London, 23 June 2020
Martin.white@intranetfocus.com | @intranetfocus
Slide 2. My search history 1975 – 2020 (and beyond..!)
[Timeline graphic: 1975, 1982, 1999, 2002 – 2020; DECO; Unilever Computing Services]
http://intranetfocus.com/resources/reports/
Slide 3. [Image: https://unsplash.com/photos/4zbEzE87zz4]
Slide 4. What do we mean by search?
• WWW search
  – Massive volumes of everything, dominated by Google and the need for advertising revenue
  – Extensive use of SEO curation of content to improve findability
  – Inadvertently sets the ‘benchmark’ for enterprise search: “Why can’t our search be like Google?”
• Web site/intranet search
  – Highly curated content with a focus on browsing
  – Search usually ends up as an afterthought
  – Significant resources devoted to ‘being found’
• Repository search
  – Services for (primarily) academic users, with highly curated content on a library/archive management system
  – Also extensive use of external search applications such as Scopus and Lexis
  – Precision is important
• E-commerce search
  – Highly curated content and very good user tracking metrics
  – Very clear RoI based on sales revenues and repeat customers
Slide 5. Enterprise search
• Massive amounts of structured and unstructured content, very little of it curated, dominated by Microsoft 365 Search in terms of installed base and ‘best practice’
• Often multiple applications in a federated search implementation
• Multiple content languages and unpredictable language skills of users
• Security trimming plays havoc with relevance
• Users are experts in their domain, and have a substantial amount of information pushed to them by other enterprise applications (ERP, CRM, HR) and of course email and social media
• Search is therefore ‘additive’ and does not start from a zero knowledge base
• Not based around business processes, so it is very difficult to assess whether a search failure is a content issue, a query issue or a technology issue
• Rarely is there any internal content/user-focused team; search is largely managed by IT on ‘technical performance’ (uptime/traffic) metrics
• The assumption is that search is intuitive, so no training and no mentoring are available
Slide 6.
https://rgu-repository.worktribe.com/output/249111
https://www.clearbox.co.uk/diagnosing-enterprise-search/
Slide 7. Classic IR/IIR evaluation assumptions
• Access to a representative test collection
• Selection of a representative user cohort (all too often students!)
• Fluency in query and content languages
• No (or little) prior subject knowledge assumed of searchers
• No (or little) justification for undertaking the search
• No business imperative (or buy-in) to find the information
• A/B tests used on a test collection to assess ranking improvement
• Relevance assessed (usually) on item title and snippet
• No assessment of content quality
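The assumptions above underpin the standard test-collection metrics. As a minimal sketch of what such an evaluation computes, here are precision@k and recall@k with an A/B comparison of two rankers; the document identifiers and relevance judgments are invented for illustration:

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k results that are judged relevant."""
    top_k = ranking[:k]
    return sum(1 for doc in top_k if doc in relevant) / k

def recall_at_k(ranking, relevant, k):
    """Fraction of all relevant documents retrieved within the top k."""
    top_k = ranking[:k]
    return sum(1 for doc in top_k if doc in relevant) / len(relevant)

# Hypothetical relevance judgments for one query in a test collection.
relevant = {"d1", "d4", "d7"}

# Two rankers (an A/B test) return ordered result lists for the same query.
ranking_a = ["d1", "d2", "d4", "d3", "d7"]
ranking_b = ["d2", "d3", "d1", "d5", "d6"]

print(precision_at_k(ranking_a, relevant, 5))  # 0.6
print(precision_at_k(ranking_b, relevant, 5))  # 0.2
print(recall_at_k(ranking_a, relevant, 5))     # 1.0
```

The point of the slide is that every input to this calculation (the collection, the cohort, the judgments) rests on assumptions that rarely hold inside an enterprise.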
Slide 8. Information seeking – some numbers
[Bar chart: % using each option on a ‘very frequent’ or ‘frequent’ basis, scale 0–90. Options: Information service, Electronic closing binders, Internal specialists, Enterprise search, Library books, Knowledge database, Library staff, Electronic research services, DM applications, Intranet]
Global law firm, approx. 1,000 respondents out of 2,000 staff (Intranet Focus client)
Slide 9. Managing Babel
Slide 10. The state of the art
© 2020 IntraTeam Digital Workplace Benchmarking Service, www.intrateam.com
Findwise Enterprise Findability Survey 2018, https://findwise.com/en/enterprise-search
Slide 11. Impact of search tasks on search performance
“The qualitative content analysis of work-task journal study material pointed on the following work task scenarios: (1) ordinary and (2) unordinary administrative tasks; everyday professional tasks as (3) high-quality tasks, (4) “just-to-get-done” tasks and (5) regular teamwork; and unordinary professional tasks as (6) unique tasks and (7) inventive teamwork.”
“The impression we have is that the majority of current IIR research centres on Internet searching and everyday-life information needs. … However, there remains a need for IIR research on information searching in relation to information intensive work task performance with respect to optimise information searching, the various platforms used for information searching, and understanding of the conditions under which work task performance takes place.”
2019 Strix Lecture given by Pia Borlund on search and task completion
https://drive.google.com/file/d/1BosRT0sDMvVjPxLCR4dpxuYT9ip-FSWW/view
Slide 12. Foraging for information
• Information patch
• Information scent
• Information diet
Slide 13. Complex Searcher Model
Information Scent, Searching and Stopping: Modelling SERP-Level Stopping Behaviour, David Maxwell and Leif Azzopardi
https://www.cmswire.com/information-management/enterprise-search-development-start-with-the-user-interface/
Slide 14. Stopping strategies
[Diagram: stopping point plotted against increase in user effort and increase in search success]
• "I was expecting to see Document X on the first couple of pages. It’s not there. How can I trust the search application?"
• "Five irrelevant results one after another. I'll give John a call. He'll know the answer."
• "I'm on p3 of the results and I've only seen one decent result, and that's a document I already have."
• "I've found the French version of the report. Where is the English master version? I'm wasting my time."
Slide 15. Professional search – use cases are everything
Information Management and Processing 5 (2018) 1042-1057
https://strathprints.strath.ac.uk/64935/ (open access version)
[Diagram: Synonyms, Boolean, Abbreviations; Healthcare vs Legal]
Slide 16. Looking at click logs
72,000 employees; queries ranked from #150 to #500 average around 200 queries per month

Query variant    Queries   Rank   Combined total
box               83,338     #3
BOX               15,526    #26
Box                7,276    #76         106,140
concur            50,577     #5
Concur             8,100    #67
concur            11,914    #38          70,591
hr                15,541    #25
HR                 9,929    #46          25,470
lti                9,197    #53
LTI                9,101    #56          18,298
my hr             22,011    #15
myhr              11,233    #43          33,244
ocm               19,446    #18
OCM                7,407    #74          26,853
saba              22,667    #14
saba cloud        13,700    #32          36,367
software          12,715    #35
Software           9,455    #49
software store    14,453    #29          36,623
workday           84,380     #2
Workday            8,981    #57          93,361
it                 8,287    #65
IT                 7,489    #70          15,776
kiosk             14,120    #30
Kiosk              6,858    #81          20,978
angel             36,133     #9
ANGEL              8,557    #61          44,690
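The combined totals in the table come from summing case variants of what is, from the user's point of view, the same query. A minimal sketch of that aggregation, using counts taken from the click log on this slide:

```python
from collections import defaultdict

def combine_case_variants(query_counts):
    """Sum click-log query counts over case variants of the same term,
    e.g. 'box', 'BOX' and 'Box' are one query to the user even though
    a case-sensitive log reports them separately."""
    totals = defaultdict(int)
    for query, count in query_counts:
        totals[query.lower()] += count
    return dict(totals)

# Counts from the click log above.
log = [("box", 83338), ("BOX", 15526), ("Box", 7276),
       ("workday", 84380), ("Workday", 8981)]
print(combine_case_variants(log))
# {'box': 106140, 'workday': 93361}
```

The same idea extends to spacing variants ("my hr" vs "myhr"), which case-folding alone will not catch.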
Slide 17. Three use cases
• Use Case A: Quick access route to applications, people, office locations, breaking news etc. Often ‘clever’ single query terms.
• Use Case B: Checking on corporate policies to find the most recent version; corporate/divisional news archive; products, initiatives and projects. Query terms include date ranges/alphanumeric terms.
• Use Case C: Topic-related searches using well-considered query terms. Often periodic (i.e. monthly) or cyclic (‘end-of-year’) searches.
Sum of top 100 searches: 1,563,384; sum of next 400: 1,389,782 (labelled ‘Precision’ and ‘Recall’ respectively on the slide)
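One way to put these use cases to work is to segment a query log with simple heuristics. The sketch below is illustrative only: the regular expression and the single-token rule are my own assumptions about what 'clever' lookup terms and date-range queries might look like, not rules from the deck:

```python
import re

def classify_query(query):
    """Heuristically assign a query to one of the three use cases:
    A - short 'clever' lookup terms (apps, people, locations),
    B - policy/version/archive queries containing dates or codes,
    C - multi-word, considered topic searches."""
    # Illustrative pattern: a 4-digit year, a version code, or a d/m date.
    if re.search(r"\d{4}|\bv\d+\b|\b\d{1,2}/\d{1,2}\b", query):
        return "B"
    if len(query.split()) == 1:
        return "A"
    return "C"

print(classify_query("workday"))                  # A
print(classify_query("travel policy 2019"))       # B
print(classify_query("supplier risk assessment")) # C
```

Even a rough split like this lets each use case be measured against its own success criteria rather than a single relevance metric.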
Slide 18. How relevant is relevance?
• Enterprise search users will not click through each result in sequence, as they are using a range of clues to work out what is relevant
• Ascertaining ‘intent’ is very difficult, as people have multiple roles, multiple use cases, and unequal and unknown prior knowledge
• A user could be looking for an author who is in the same building so that they can go and talk to them about a problem they have; the document search was actually to find a local expert, and the document may otherwise have been ‘irrelevant’
• The user may spot a very relevant document at, say, #3 but will not click on it because they already have it; the presence of the document at #3 is reassurance that they are on the right track
• These behaviours make it very challenging to work out what set of heuristics the user is employing to assess each result, and so what is ‘relevant’
Slide 19. Evaluating enterprise search
1. Click logs have a value, but for every query term ask “Why?”
2. “Time well spent” http://intranetfocus.com/time-well-spent-a-potential-holistic-view-of-productivity/
3. Computational ethnography using data logging
4. Constant usability testing around tasks and task completion
5. Repetition (also constant) of high-hit and low/zero-hit searches
6. Embedded search mentors, especially in project teams
7. Clever surveys
8. Storytelling – “how was it for you?”
9. Spot checks
10. Heuristic benchmarking (SearchCheck, http://intranetfocus.com/enterprise-search-consulting-services/)
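Point 5, the constant repetition of high-hit and low/zero-hit searches, is easy to automate. A minimal sketch follows; the `search_fn` callable and the in-memory index are hypothetical stand-ins for whatever API the search platform actually exposes:

```python
def rerun_benchmark(search_fn, benchmark):
    """Re-run a fixed set of benchmark queries and flag those whose
    hit counts have flipped since the last run.

    search_fn: callable query -> current hit count (stand-in for the
               real search API).
    benchmark: dict mapping query -> hit count from the previous run."""
    alerts = []
    for query, previous_hits in benchmark.items():
        hits = search_fn(query)
        if previous_hits > 0 and hits == 0:
            alerts.append((query, "now returns zero hits"))
        elif previous_hits == 0 and hits > 0:
            alerts.append((query, "zero-hit query now has results"))
    return alerts

# Stand-in index; a real run would query the search platform instead.
index = {"expense policy": 120, "onboarding": 45, "sabbatical": 0}
benchmark = {"expense policy": 118, "onboarding": 0, "sabbatical": 4}

print(rerun_benchmark(lambda q: index.get(q, 0), benchmark))
```

Run on a schedule, a check like this catches content, indexing or security-trimming regressions that would otherwise surface only as user complaints.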
Slide 20. Summary
• The primary search mode in the enterprise is information foraging, where users apply a very personal array of heuristics to assessing relevance
• ‘Relevance’ is still relevant but has to be associated with specific use cases and with a range of other search and query measurements
• Content quality, snippet quality and the extent of search expertise play very important roles in search success and search satisfaction
• Security trimming can have a significant impact on ranking which is impossible to allow for
• The complexity of enterprise search implementations is why having a skilled multi-disciplinary search team is essential
Slide 21. Time for reflection
Martin.white@intranetfocus.com
Slide 22. Further reading
Enterprise Search (O’Reilly Media)
Achieving Enterprise Search Satisfaction, http://intranetfocus.com/resources/reports/
Recent columns:
• Seven stress tests for your enterprise search and intranet search applications
• Distributed information management – the oxygen of your organisation
• Unpacking the complexities of enterprise search behaviour
• Search won’t improve until we understand why people search, not just how
• Diving into enterprise search query logs
• When improving search performance don’t follow the clicks
• Time spent searching – a chronology of the myth and some recent research
• A list of twenty enterprise search myths
http://theses.gla.ac.uk/41132/
