Search engines and recommender systems have come to play prominent roles in our online lives, providing us with access to the information we need and with recommendations for new interesting content to consume.
However, for more complex information needs, 'pure' search engines and recommender systems are not always able to satisfy a user's needs. There are tens of thousands of examples of such complex needs on the Web, where users express what they are looking for in rich narrative descriptions.
This talk presents a high-level overview of the work done in the Social Book Search Lab workshops, which ranges from detecting and mining such complex book requests to exploring how to generate the best recommendations and how users interact with book search engines.
Presented on October 7, 2016 at the AoIR 2016 conference in Berlin, Germany.
What to Read Next? Three Stages of Data-driven Book Discovery
1. WHAT TO READ NEXT?
THREE STAGES OF DATA-DRIVEN BOOK DISCOVERY
METTE SKOV
TOINE BOGERS
AALBORG UNIVERSITY
AOIR PANEL ON ‘NETWORKED READING’ OCTOBER 7, 2016
3. BOOKS ARE NOT DEAD (THEY AREN’T EVEN SICK!)
Books remain very popular!
– Slow but steady increase in book sales, to 2.7 billion books in the US in 2015
  E-books make up 18.9% of that in the US
– Total sales revenue: $29.2 billion in the US in 2015
  E-book sales revenue was $5.3 billion
So there is definitely a market & need for discovering (new) interesting books!
4. BOOK DISCOVERY IS NOT THAT EASY...
Readers often struggle with existing systems (search engines & recommender systems) to discover new books
– Information needs are highly complex
  Topical match, complex relevance aspects, personal interests & preferences, context of use
– Search engines and recommenders are ill-equipped to address such needs!
6. SOCIAL BOOK SEARCH LAB
Series of workshops (2011-2016) with a shared data challenge using data from LibraryThing and Amazon
Focus is on the design, development & evaluation of systems that can address complex book requests
1. Detecting complex book requests
2. Analyzing book requests for relevance aspects
3. Developing better algorithms for suggesting relevant books
4. Exploring interactions with book search engines
7. OVERVIEW OF DATA SOURCES
[Diagram: data sources — Books, Users, and Book requests; 2.8 million books, 944 annotated requests]
9. BOOK REQUESTS
Forum posts describing realistic book search requests
– Book request narratives can touch upon many different aspects
  Users search for topics, genres, authors, plots, etc.
  Users want books that are engaging, funny, well-written, educational, etc.
  Users have different preferences, knowledge, reading level, etc.
– Book discussion fora contain many such focused requests!
  LibraryThing, Goodreads, …
10. OVERVIEW OF DATA SOURCES
[Diagram: adds Suggestions — 2.8 million books, 944 annotated requests, 5,658 suggestions]
13. OVERVIEW OF DATA SOURCES
[Diagram: adds User profiles — 94,000 user profiles, 2.8 million books, 944 annotated requests, 5,658 suggestions]
14. OVERVIEW OF DATA SOURCES
[Diagram: book records combine bibliographic metadata, curated metadata, and user-generated content (user profiles, tags, reviews)]
16. DETECTING BOOK REQUESTS
How common are requests for book recommendations in the LibraryThing forums?
– Currently 233,000+ threads in the LibraryThing forums
– Annotated a random sample of 4,000 threads, of which 15.1% were book requests
– Means there are potentially over 35,000 book requests on LibraryThing!
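The "over 35,000" figure follows directly from extrapolating the sample's request rate to all forum threads:

```python
# Extrapolating the annotated sample to the whole forum.
total_threads = 233_000   # threads in the LibraryThing forums (2016)
request_rate = 0.151      # share of the 4,000-thread sample that were book requests

estimated_requests = total_threads * request_rate
print(round(estimated_requests))  # → 35183, i.e. "over 35,000" book requests
```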
17. DETECTING BOOK REQUESTS
Can we detect such book requests automatically?
– Initial experiments achieved an accuracy of 94.17% on a test set of 2,000 annotated book requests
– Most predictive characteristics
  Words such as any, suggestions, looking, recommendations, thanks, anyone, read, books, and recommend
  No. of sentences ending in a question mark
  Degree of expertise of LibraryThing users replying to the thread
  Ratio of suggested books cataloged afterwards by the requester
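The first two signals above (cue words and question marks) are easy to extract from raw post text. A minimal sketch in Python, using a simple rule as a stand-in for the trained classifier from the experiments; the function names, cue-word set usage, and threshold here are illustrative assumptions, not the actual SBS implementation:

```python
import re

# Cue words reported as most predictive on the slide above
CUE_WORDS = {"any", "suggestions", "looking", "recommendations",
             "thanks", "anyone", "read", "books", "recommend"}

def extract_features(post: str) -> dict:
    """Extract the two text-derived signals mentioned on the slide:
    cue-word occurrences and the number of question sentences."""
    tokens = re.findall(r"[a-z']+", post.lower())
    return {
        "cue_word_hits": sum(1 for t in tokens if t in CUE_WORDS),
        "question_sentences": post.count("?"),
        "n_tokens": len(tokens),
    }

def looks_like_request(post: str, min_hits: int = 2) -> bool:
    """Toy rule-based stand-in for the trained classifier: flag a post
    as a book request if it contains enough cue words and a question."""
    f = extract_features(post)
    return f["cue_word_hits"] >= min_hits and f["question_sentences"] >= 1

print(looks_like_request(
    "Looking for recommendations: any books like The Martian? Thanks!"))
# → True
print(looks_like_request("I finished this novel yesterday."))
# → False
```

The thread-level signals (replier expertise, books cataloged afterwards) require forum and catalog data and are omitted from this sketch.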
19. ANALYZING BOOK REQUESTS
Book requests contain many elements that could be mined to benefit search engines & recommendation systems
– Example: Relevance aspects
  What makes a suggested book relevant to the user?
  Identified eight relevance aspects in book search requests (Reuter, 2007; Koolen et al., 2015)
20. ANALYZING BOOK REQUESTS
Accessibility
– Accessibility in terms of the language, length, or level of difficulty of a book.
Content
– Aspects such as topic, plot, genre, style, or comprehensiveness of a book.
Engagement
– Books that fit a particular mood or interest, are considered high quality, or provide a particular reading experience.
Familiarity
– Books that are similar to known books or related to a previous experience.
21. ANALYZING BOOK REQUESTS
Known-item
– Descriptions of known books with the sole purpose of identifying their title and/or author.
Metadata
– Books with a certain title or by a certain author, editor, illustrator, or publisher, in a particular format, or written or published in a certain year or period.
Novelty
– Books with content that is novel to the reader, books that are unusual or quirky.
Socio-Cultural
– Books related to the user's socio-cultural background or values, books that are popular or obscure, or books that have had a particular cultural or social impact.
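The eight aspects form a closed annotation vocabulary, so annotated requests can be validated against it. A hypothetical sketch of such an annotation record (the dict format and `annotate` function are illustrative, not the actual SBS annotation scheme):

```python
# The eight relevance aspects from Reuter (2007) / Koolen et al. (2015)
RELEVANCE_ASPECTS = {
    "accessibility", "content", "engagement", "familiarity",
    "known-item", "metadata", "novelty", "socio-cultural",
}

def annotate(request: str, aspects: set) -> dict:
    """Attach a set of relevance aspects to a book request,
    validating them against the eight-aspect scheme."""
    unknown = aspects - RELEVANCE_ASPECTS
    if unknown:
        raise ValueError(f"unknown aspects: {unknown}")
    return {"request": request, "aspects": sorted(aspects)}

record = annotate(
    "Looking for a funny, well-written novel like Good Omens",
    {"engagement", "familiarity", "content"},
)
print(record["aspects"])  # → ['content', 'engagement', 'familiarity']
```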
25. SUGGESTING RELEVANT BOOKS
Different types of book metadata fields:
– Bibliographic metadata: title, publisher, editorial, creator, series, award, character, place
– Content: blurb, epigraph, first words, last words, quotation
– Reviews: user reviews
– Curated metadata: Dewey, thesaurus, index terms
– Tags: tags
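One common way such field groups feed a retrieval system is to concatenate selected groups into a searchable document per book. A minimal sketch, assuming hypothetical field names rather than the actual SBS/Amazon record schema:

```python
# Illustrative grouping of book metadata fields (names are assumptions,
# not the actual SBS index schema).
FIELD_GROUPS = {
    "bibliographic": ["title", "publisher", "editorial", "creator",
                      "series", "award", "character", "place"],
    "content": ["blurb", "epigraph", "first_words", "last_words", "quotation"],
    "reviews": ["user_reviews"],
    "curated": ["dewey", "thesaurus", "index_terms"],
    "tags": ["tags"],
}

def index_document(book: dict, groups=("bibliographic", "content")) -> str:
    """Concatenate the selected field groups into one searchable text blob,
    skipping fields the record does not have."""
    parts = []
    for group in groups:
        for field in FIELD_GROUPS[group]:
            value = book.get(field)
            if value:
                parts.append(str(value))
    return " ".join(parts)

doc = index_document({"title": "Dune", "blurb": "A desert planet epic",
                      "tags": "science-fiction"})
print(doc)  # → 'Dune A desert planet epic'
```

Varying which groups are indexed is one way to test how much user-generated content (tags, reviews) contributes over curated metadata alone.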
29. AIM & APPROACH
We aim to contribute to building dedicated book search and discovery services
Our long-term goal is to investigate book search behaviour through a range of user tasks and interfaces:
– How should the user interface combine professional, curated metadata and user-generated metadata?
– Should the user interface adapt itself as the user progresses through their search task, and if so, how?
– When do users prefer to browse or search?
– How can we best support different types of search tasks?
30. USER STUDY OF INTERACTIVE BOOK SEARCH BEHAVIOUR
Comparative user studies with 192 + 111 participants (2015 & 2016)
Study flow: Welcome → Informed Consent → Background → Pre-Task Information → Task → Post-Task Questions → Experience → Thank you
31. EXPERIMENTAL TASKS
Goal-oriented task: Imagine you participate in an experiment on a desert island for one month. There will be no people, no TV, radio or other distraction. The only things you are allowed to take with you are 5 books:
– On surviving on a desert island
– That will teach you something new
– Highly recommended by other users
– For fun
– About one of your personal hobbies or interests
Non-goal task: Imagine you are waiting to meet a friend in a coffee shop or pub or the airport or your office. While waiting, you come across this website and explore it looking for any book that you find interesting, engaging, or relevant. Explore anything you wish until you are completely and utterly bored…
36. WHAT HAVE WE LEARNED SO FAR?
Need for heterogeneous record information (user-generated and curated, professional data)
Multi-stage interface:
– Longer search sessions
– Fewer queries issued (more browsing)
– No differences in number of books added to book bag
Clear differences in search behaviour between the different types of tasks
(Gäde et al. 2015, 2016)
38. CONCLUSIONS
Tens of thousands of information needs are going unmet
– Just the tip of the iceberg?
– Search engines and recommender systems are ill-equipped to deal with this!
39. OPEN QUESTIONS
How (dis)similar are relevance aspects for books to those for other domains?
How do relevance aspects influence the choice of algorithm(s) & data representation(s)?
How does the combination of data from different sources (Amazon, LibraryThing, Library of Congress, British Library) affect the quality of the results and UX?
Decontextualized metadata: What happens when we mix metadata from different sources?
– Example: reuse of recommendations or tags 'out of context'
41. REFERENCES
Slide 3
– Book sales statistics taken from https://www.statista.com/topics/1177/book-market/ and https://www.statista.com/topics/1474/e-books/; last visited October 1, 2016
Slide 6
– Official website of the Social Book Search Lab: http://social-book-search.humanities.uva.nl/
Slide 20
– Reuter, K. (2007). Assessing Aesthetic Relevance: Children's Book Selection in a Digital Library. JASIST, 58(12), 1745–1763.
– Koolen, M., Bogers, T., Van den Bosch, A., and Kamps, J. (2015). Looking for Books in Social Media: An Analysis of Complex Search Requests. Proceedings of ECIR 2015, Volume 9022 of Lecture Notes in Computer Science, pp. 184–196.
42. REFERENCES
Slide 40
– Gäde, M., Hall, M., Huurdeman, H., Kamps, J., Koolen, M., Skov, M., Toms, E. & Walsh, D. (2015). Overview of the SBS 2015 Interactive Track. Working Notes of CLEF 2015 – Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings, vol. 1391.
– Gäde, M., Hall, M., Huurdeman, H., Kamps, J., Koolen, M., Skov, M., Bogers, T. & Walsh, D. (2016). Overview of the SBS 2016 Interactive Track. Working Notes of CLEF 2016 – Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings, vol. 1609.