AN IN-DEPTH ANALYSIS OF TAGS AND CONTROLLED
METADATA FOR BOOK SEARCH
TOINE BOGERS
VIVIEN PETRAS
MARCH 23, 2017
iCONFERENCE 2017
OUTLINE
▸ Introduction
▸ Methodology & Experimental Setup
▸ Analysis
– Tags vs. Controlled Vocabularies
– Book Search Requests
– Failure Analysis
▸ Conclusions & Future Work
INTRODUCTION
MOTIVATION
▸ Readers often struggle to discover new books with existing
systems (e.g., library catalogs, Amazon, eBook sellers)
– Information needs are contextual, personal & complex
– Book metadata does not contain the necessary information
EARLIER WORK
▸ iConference 2015
– Tags outperform controlled vocabularies for search, but
sometimes controlled vocabularies are better.
– Controlled vocabularies contain more unique terms; tags contain
more repetition of terms.
▸ Why?
– Terminology
– Popularity / frequency
– Type of request
STUDY OBJECTIVES
▸ Why are tags better than controlled vocabularies for book
search?
– Which types of book search requests are better addressed
using tags and which using CV?
– Which book search requests fail completely and what
characterizes such requests?
METHODOLOGY &
EXPERIMENTAL SETUP
EXPERIMENTAL SETUP
▸ Controlled Vocabulary content (CV)
– DDC class labels
– Subjects
– Geographic names
– Category labels
– LCSH terms
▸ Tags
– Each tag occurs as many times as it has been assigned by
the users
▸ Unique tags
– Each tag occurs only once
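The difference between the Tags and Unique tags representations can be sketched in a few lines. This is only an illustration, not the authors' indexing code: the function names, the example record, and the assumption that tags are single whitespace-free words are all hypothetical.

```python
from collections import Counter


def tags_field(tag_counts):
    """'Tags' representation: repeat each tag once per user assignment."""
    tokens = []
    for tag, count in sorted(tag_counts.items()):
        tokens.extend([tag] * count)
    return " ".join(tokens)


def unique_tags_field(tag_counts):
    """'Unique tags' representation: keep each tag exactly once."""
    return " ".join(sorted(tag_counts))


# Hypothetical record: tag -> number of users who assigned it.
example_counts = Counter({"fantasy": 3, "dragons": 1, "ya": 2})

print(tags_field(example_counts))
print(unique_tags_field(example_counts))
```

In the first field a popular tag contributes more term occurrences to the record; in the second that frequency information is discarded.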
AMAZON/LIBRARYTHING COLLECTION
[Figure: example book record annotated with its Tags, Controlled Vocabulary content (CV: DDC class labels, subjects, geographic names, category labels, LCSH terms), and Unique tags per record]
ANNOTATED LT TOPIC
[Figure: example LibraryThing forum topic annotated with its topic title, narrative, and recommended books]
EXPERIMENTAL SETUP
▸ Amazon / LibraryThing collection of book records
– 2 million records
▸ LibraryThing forum topics for search requests
– 334 search requests for testing
▸ Relevance judgements
– Recommendations from LT members with graded relevance scoring
(highest relevance if book is added by searcher)
▸ Evaluation metric
– Normalized Discounted Cumulated Gain (NDCG@10)
▸ IR system
– Indri 5.4 toolkit
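For readers unfamiliar with the metric, NDCG@10 can be computed roughly as follows. This is a minimal sketch of one common formulation (linear gain, log2 rank discount); the slides do not say which variant or evaluation tool was used, and the function names, book identifiers, and relevance grades below are hypothetical.

```python
import math


def dcg_at_k(gains, k=10):
    """Discounted cumulated gain over the first k positions."""
    return sum(g / math.log2(rank + 1)
               for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(ranked_ids, judgements, k=10):
    """NDCG@k for one request.

    ranked_ids: book ids in the order the system returned them.
    judgements: book id -> graded relevance (unjudged books count as 0).
    """
    gains = [judgements.get(book_id, 0) for book_id in ranked_ids]
    ideal_gains = sorted(judgements.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal_gains, k)
    if ideal_dcg == 0:
        return 0.0
    return dcg_at_k(gains, k) / ideal_dcg


# Hypothetical request: one highly relevant and one marginally relevant book.
example_judgements = {"book_a": 4, "book_b": 1}
example_ranking = ["book_b", "book_x", "book_a"]

print(round(ndcg_at_k(example_ranking, example_judgements), 4))
```

A request scores 0 when none of the recommended books appears in the top 10, which is the sense in which a request "fails completely" later in the deck.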
ANALYSIS
TAGS vs. CONTROLLED VOCABULARIES
▸ Question 1: Is there a difference in performance between
CV and Tags in retrieval?
▸ Answer
– Tags perform significantly better than CV
– The combination of both results in even better performance
than tags alone, but not significantly so
– Losing tag frequency information helps rather than hurts
performance (also not significantly)
TAGS vs. CONTROLLED VOCABULARIES
▸ Question 2: Do tags outperform CV because of the so-
called popularity effect?
▸ Answer
– No, there does not seem to be a popularity effect
– Types = unique words in a record
– Tokens = all instances of words in a record
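The type/token distinction can be illustrated with a small count. This sketch assumes simple lowercasing and whitespace tokenization, which may differ from the preprocessing used in the study; the function name and example text are hypothetical.

```python
def types_and_tokens(record_text):
    """Return (number of types, number of tokens) for one record."""
    tokens = record_text.lower().split()
    return len(set(tokens)), len(tokens)


# Hypothetical tag field in which one tag was assigned three times.
example_text = "fantasy fantasy dragons fantasy"
n_types, n_tokens = types_and_tokens(example_text)
print(n_types, n_tokens)
```

A record with many tokens but few types is one where a handful of terms are repeated often, which is the pattern a popularity effect would depend on.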
TAGS vs. CONTROLLED VOCABULARIES
▸ Question 3: Do Tags and CV complement or cancel each other out?
▸ Answer
– Tags and CV complement each other: they are successful on
different sets of requests
– But most zero-difference requests (74.0%) actually fail
completely! When and why?
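A per-request comparison of this kind can be sketched as below: each request is placed in one of four groups depending on which collection scores higher, and zero-difference requests are split into ties with a non-zero score and complete failures. The function name, group labels, and scores are hypothetical and do not come from the study's data.

```python
def compare_runs(scores_a, scores_b):
    """Group request ids by which run has the higher per-request score."""
    groups = {"a_better": [], "b_better": [], "tie_nonzero": [], "both_fail": []}
    for request_id in sorted(scores_a.keys() & scores_b.keys()):
        a, b = scores_a[request_id], scores_b[request_id]
        if a > b:
            groups["a_better"].append(request_id)
        elif b > a:
            groups["b_better"].append(request_id)
        elif a == 0:
            groups["both_fail"].append(request_id)  # zero difference, both score 0
        else:
            groups["tie_nonzero"].append(request_id)
    return groups


# Hypothetical NDCG@10 scores per request for two runs.
example_tags = {"r1": 0.5, "r2": 0.0, "r3": 0.25, "r4": 0.0}
example_cv = {"r1": 0.1, "r2": 0.0, "r3": 0.25, "r4": 0.3}

example_groups = compare_runs(example_tags, example_cv)
zero_diff = len(example_groups["both_fail"]) + len(example_groups["tie_nonzero"])
failed_share = len(example_groups["both_fail"]) / zero_diff
print(example_groups, failed_share)
```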
REQUESTS – RELEVANCE ASPECTS
▸ What makes a suggested book relevant to the user?
– Distinguish between eight relevance aspects (Reuter, 2007;
Koolen et al., 2015)
REQUESTS – RELEVANCE ASPECTS
| Aspect | Description | % of requests (N = 87) |
|---|---|---|
| Accessibility | Language, length, or level of difficulty of a book | 9.2% |
| Content | Topic, plot, genre, style, or comprehensiveness | 79.3% |
| Engagement | Fit a certain mood or interest, are considered high quality, or provide a certain reading experience | 25.3% |
| Familiarity | Similar to known books or related to a previous experience | 47.1% |
| Known-item | The user is trying to identify a known book, but cannot remember the metadata that would locate it | 12.6% |
| Metadata | With a certain title or by a certain author or publisher, in a particular format, or certain year | 23.0% |
| Novelty | Unusual or quirky, or containing novel content | 3.4% |
| Socio-cultural | Related to the user's socio-cultural background or values; popular or obscure | 13.8% |
REQUESTS – RELEVANCE ASPECTS
▸ Question 4: What types of book requests are best served
by the Unique tags and CV collections?
▸ Answer
– CV terms show a tendency to work best for requests that
touch upon aspects of engagement
– Other requests are best served by Unique tags
REQUESTS – RELEVANCE ASPECTS
Performance grouped by relevance aspect (NDCG@10)
| Aspect | N | Unique tags | CV |
|---|---|---|---|
| Socio-cultural | 10 | 0.1127 | 0.0428 |
| Novelty | 2 | 0.5304 | 0.0000 |
| Metadata | 17 | 0.2454 | 0.1259 |
| Known-item | 11 | 0.3593 | 0.1818 |
| Familiarity | 36 | 0.1833 | 0.0701 |
| Engagement | 21 | 0.1121 | 0.1425 |
| Content | 63 | 0.1965 | 0.0821 |
| Accessibility | 7 | 0.1235 | 0.0749 |
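Because a single request can touch several relevance aspects (the aspect percentages sum to more than 100%), a per-aspect average counts a request once in every group it belongs to. The sketch below shows one way such a grouping could be computed; the function name, request ids, aspect assignments, and scores are hypothetical, and the slides do not describe the exact aggregation used.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_aspect(request_aspects, scores):
    """Average per-request scores within each relevance aspect.

    request_aspects: request id -> set of aspect labels (multi-label).
    scores: request id -> NDCG@10 for that request.
    """
    grouped = defaultdict(list)
    for request_id, aspects in request_aspects.items():
        for aspect in aspects:
            grouped[aspect].append(scores[request_id])
    return {aspect: mean(values) for aspect, values in grouped.items()}


# Hypothetical annotations and scores for three requests.
example_aspects = {
    "r1": {"Content", "Familiarity"},
    "r2": {"Content"},
    "r3": {"Engagement"},
}
example_scores = {"r1": 0.5, "r2": 0.0, "r3": 0.25}

by_aspect = mean_score_by_aspect(example_aspects, example_scores)
print(sorted(by_aspect.items()))
```

Running the same function once per collection (for instance once with Unique tags scores and once with CV scores) would give the two series compared per aspect.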
REQUESTS – TYPE OF BOOK
▸ Question 5: What types of book requests (fiction or non-
fiction) are best served by Unique tags or CV?
▸ Answer
– Unique tags work significantly better for fiction
– CV work better for non-fiction (but not significantly so)
FAILURE ANALYSIS
▸ Question 6: Do failed book search requests fail because of
data sparsity, a lower recall base, or a lack of examples?
▸ Answer
– Neither sparsity nor the size of the recall base is the
reason for retrieval failure
– The number of examples provided by the requester has a
significant positive influence on performance
[Table: comparison of request groups with N = 247, N = 87, and N = 334 requests]
FAILURE ANALYSIS
▸ Question 7: Do book search requests fail because of their
relevance aspects?
▸ Answer
– No, relevance aspects are distributed equally for successful &
failed requests
– Only Accessibility- and Metadata-related search requests seem
to fail more often
FAILURE ANALYSIS
▸ Question 8: Does the type of book that is being requested
(fiction vs. non-fiction) have an influence on whether
requests succeed or fail?
▸ Answer
– Requests for works of fiction fail significantly more often
CONCLUSIONS &
FUTURE WORK
FINDINGS
▸ Tags outperform CV...
– ...probably because their terminology is closer to the user's
language (not because of the popularity effect)
▸ Sometimes CV are better, for example, for non-fiction books...
– ...whereas tags are better for fiction and for content-related,
familiarity or known-item searches
▸ We believe that tags are simply better able to match the user's
language when looking for books
– Although they are still not that great at it!
– Book search is still hard, especially for fiction books
OPEN QUESTIONS
▸ How can book metadata be adapted to be closer to the
vocabulary used in real-world book search requests?
▸ What other aspects (besides type of requested book or
relevance aspect of search request) contribute to request
difficulty?
▸ Our question to you:
– What other questions can we ask of this data?
QUESTIONS?
Paper URL: http://bit.ly/iconf2017

An In-depth Analysis of Tags and Controlled Metadata for Book Search

  • 1. AN IN-DEPTH ANALYSIS OF TAGS AND CONTROLLED METADATA FOR BOOK SEARCH TOINE BOGERS VIVIEN PETRAS MARCH 23, 2017iCONFERENCE 2017
  • 2. OUTLINE ▸ Introduction ▸ Methodology & Experimental Setup ▸ Analysis – Tags vs. Controlled Vocabularies – Book Search Requests – Failure Analysis ▸ Conclusions & Future Work 2
  • 4. MOTIVATION ▸ Readers often struggle with existing systems (e.g., library catalogs, Amazon, eBook sellers) to discover new books – Information needs are contextual, personal & complex – Book metadata does not contain the necessary information
  • 5. EARLIER WORK ▸ iConference 2015 – Tags outperform controlled vocabularies for search, but sometimes controlled vocabularies are better. – Controlled vocabularies contain more unique terms; tags show more repetition of terms. ▸ Why? – Terminology – Popularity / frequency – Type of request
  • 6. STUDY OBJECTIVES ▸ Why are tags better than controlled vocabularies for book search? – Which types of book search requests are better addressed using tags and which using CV? – Which book search requests fail completely and what characterizes such requests?
  • 8. EXPERIMENTAL SETUP ▸ Controlled Vocabulary content (CV) – DDC class labels – Subjects – Geographic names – Category labels – LCSH terms ▸ Tags – Each tag occurs as many times as it has been assigned by the users ▸ Unique tags – Each tag occurs only once
  • 9. AMAZON/LIBRARYTHING COLLECTION [Figure: annotated example record showing the three content types: Tags; Controlled Vocabulary content (CV: DDC class labels, subjects, geographic names, category labels, LCSH terms); Unique Tags (unique tags per record)]
  • 11. EXPERIMENTAL SETUP ▸ Amazon / LibraryThing collection of book records – 2 million records ▸ LibraryThing forum topics for search requests – 334 search requests for testing ▸ Relevance judgements – Recommendations from LT members with graded relevance scoring (highest relevance if book is added by searcher) ▸ Evaluation metric – Normalized Discounted Cumulated Gain (NDCG@10) ▸ IR system – Indri 5.4 toolkit
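
The evaluation metric named on this slide can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes the common log2(rank + 1) discount and linear gains, since the slides do not say which NDCG variant was used, and the function names and relevance grades are hypothetical.

```python
import math


def dcg_at_k(gains, k=10):
    """Discounted cumulated gain over the first k results.

    Assumption: log2(rank + 1) discount with linear gains; the slides
    do not specify the exact variant.
    """
    return sum(
        gain / math.log2(rank + 1)
        for rank, gain in enumerate(gains[:k], start=1)
    )


def ndcg_at_k(ranked_gains, all_judged_gains, k=10):
    """NDCG@k for a single search request.

    ranked_gains: relevance grades of the retrieved books, in ranked order.
    all_judged_gains: grades of every book judged relevant for the request,
        used to build the ideal ranking.
    """
    ideal_dcg = dcg_at_k(sorted(all_judged_gains, reverse=True), k)
    if ideal_dcg == 0:
        # No relevant books were judged for this request.
        return 0.0
    return dcg_at_k(ranked_gains, k) / ideal_dcg
```

A request whose top 10 contains no recommended book scores 0, which is what "fail completely" amounts to under this metric.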
  • 13. TAGS vs. CONTROLLED VOCABULARIES ▸ Question 1: Is there a difference in retrieval performance between CV and Tags? ▸ Answer – Tags perform significantly better than CV – Combining both performs even better than Tags alone, but not significantly so – Losing tag frequency information helps rather than hurts performance (also not significantly)
  • 14. TAGS vs. CONTROLLED VOCABULARIES ▸ Question 2: Do tags outperform CV because of the so-called popularity effect? ▸ Answer – No, there does not seem to be a popularity effect – Types = unique words in a record – Tokens = all instances of words in a record
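
The type/token distinction defined on this slide can be illustrated with a short sketch. It assumes simple lowercasing and whitespace splitting, which may differ from the tokenization used in the study; the function name and the sample tag field are hypothetical.

```python
def count_types_and_tokens(field_text):
    """Return (types, tokens) for one metadata field of a record.

    Types are the unique words; tokens are all word instances.
    Assumption: lowercased, whitespace-separated words.
    """
    words = field_text.lower().split()
    return len(set(words)), len(words)


# Hypothetical tag field in which "fantasy" was assigned three times.
types, tokens = count_types_and_tokens("fantasy Fantasy dragons magic fantasy")
```

Here the field has 3 types and 5 tokens. A Unique tags representation keeps only the types, so repeated assignments no longer add weight to a term.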
  • 15. TAGS vs. CONTROLLED VOCABULARIES ▸ Question 3: Do Tags and CV complement each other or cancel each other out? ▸ Answer – Tags and CV complement each other: they are successful on different sets of requests – But most zero-difference requests (74.0%) actually fail completely! When and why?
  • 16. REQUESTS – RELEVANCE ASPECTS ▸ What makes a suggested book relevant to the user? – Distinguish between eight relevance aspects (Reuter, 2007; Koolen et al., 2015)
  • 17. REQUESTS – RELEVANCE ASPECTS ▸ Aspect: description (% of requests, N = 87)
      – Accessibility: Language, length, or level of difficulty of a book (9.2%)
      – Content: Topic, plot, genre, style, or comprehensiveness (79.3%)
      – Engagement: Fit a certain mood or interest, are considered high quality, or provide a certain reading experience (25.3%)
      – Familiarity: Similar to known books or related to a previous experience (47.1%)
      – Known-item: The user is trying to identify a known book, but cannot remember the metadata that would locate it (12.6%)
      – Metadata: With a certain title or by a certain author or publisher, in a particular format, or certain year (23.0%)
      – Novelty: Unusual or quirky, or containing novel content (3.4%)
      – Socio-cultural: Related to the user's socio-cultural background or values; popular or obscure (13.8%)
  • 18. REQUESTS – RELEVANCE ASPECTS ▸ Question 4: What types of book requests are best served by the Unique tags and CV collections? ▸ Answer – CV terms show a tendency to work best for requests that touch upon aspects of engagement – Other requests are best served by Unique tags
  • 19. REQUESTS – RELEVANCE ASPECTS [Figure: performance grouped by relevance aspect, NDCG@10] ▸ Aspect: Unique tags / CV
      – Socio-cultural (N = 10): 0.1127 / 0.0428
      – Novelty (N = 2): 0.5304 / 0.0000
      – Metadata (N = 17): 0.2454 / 0.1259
      – Known-item (N = 11): 0.3593 / 0.1818
      – Familiarity (N = 36): 0.1833 / 0.0701
      – Engagement (N = 21): 0.1121 / 0.1425
      – Content (N = 63): 0.1965 / 0.0821
      – Accessibility (N = 7): 0.1235 / 0.0749
  • 20. REQUESTS – TYPE OF BOOK ▸ Question 5: What types of book requests (fiction or non-fiction) are best served by Unique tags or CV? ▸ Answer – Unique tags work significantly better for fiction – CV work better for non-fiction (but not significantly so)
  • 21. FAILURE ANALYSIS ▸ Question 6: Do failed book search requests fail because of data sparsity, a lower recall base, or a lack of examples? ▸ Answer – Neither sparsity nor the size of the recall base is the reason for retrieval failure – The number of examples provided by the requester has a significant positive influence on performance [Figure labels: N = 247, N = 87, N = 334]
  • 22. FAILURE ANALYSIS ▸ Question 7: Do book search requests fail because of their relevance aspects? ▸ Answer – No, relevance aspects are distributed equally for successful & failed requests – Only Accessibility- and Metadata-related search requests seem to fail more often
  • 23. FAILURE ANALYSIS ▸ Question 8: Does the type of book that is being requested (fiction vs. non-fiction) have an influence on whether requests succeed or fail? ▸ Answer – Requests for works of fiction fail significantly more often
  • 25. FINDINGS ▸ Tags outperform CV... – ...probably because their terminology is closer to the user's language (not because of the popularity effect) ▸ Sometimes CV are better, for example, for non-fiction books... – ...whereas tags are better for fiction and for content-related, familiarity, or known-item searches ▸ We believe that tags are simply better able to match the user's language when looking for books – Although they are still not that great at it! – Book search is still hard, especially for fiction books
  • 26. OPEN QUESTIONS ▸ How can book metadata be adapted to be closer to the vocabulary used in real-world book search requests? ▸ What other aspects (besides type of requested book or relevance aspect of search request) contribute to request difficulty? ▸ Our question to you: – What other questions can we ask of this data?