The Art Discovery Group Catalogue was presented at the meeting of Art Libraries.net in Copenhagen, October 2014. The presentation outlines the content, interface developments and new horizons, including data mining and language tagging for improved clustering and presentation, clustering of journal articles, data analysis and improved data quality.
Art discovery group catalogue: Usage, content and new horizons
1. Art Libraries.net Danmarks Kunstbibliotek , Copenhagen 10 October 2014
Art Discovery Group Catalogue: Usage, Content and New Horizons
Janifer Gatenby & Boaz Nadav-Manes
2. What can you find in the catalogue?
3. “The Art Discovery Group Catalogue is the best place to start for art-related content discovery” Geert-Jan Koot, Head of the Rijksmuseum Research Library in Amsterdam.
• A foundation of shared data
• A ‘collective collection’ of the world’s libraries
• 72,000 libraries worldwide
• 320+ million records
• 41 m digital items
• 31 m institutional repository records
Search
OCLC central index
Licensed
• 2,000+ databases
• 1.4 billion articles in total
• Articles from 78,000 journal titles
• Articles from 21,000 journal titles
4. Art Discovery Group Catalogue
http://www.artlibraries.worldcat.org
• ADGC Scope
– 60+ of the World’s Major Art Libraries (& growing)
– Including 3 large networks
– And including OCLC central article index
• World-wide scope
• Narrower scope within Art libraries
– Currently configured by region
– Customisable
5. Coverage of OCLC Central Article Index
• ALJC Linguistics and Arts 2008, 2009, 2010 (75% of ISSNs)
• Art Full Text (H.W. Wilson) (75% of ISSNs)
• Arts & Humanities Full Text (75% of ISSNs)
• Intellect Arts and Creative Media Collection (75% of ISSNs)
• ISI Arts and Humanities Citation Index (75% of ISSNs)
• ProQuest Arts Module (75% of ISSNs)
• ProQuest International Arts Module (75% of ISSNs)
• SpringerLink Architecture and Design E-Books (75% of ISSNs)
• Taylor & Francis Arts & Humanities Archive (75% of ISSNs)
• Taylor & Francis Arts & Humanities Collection (75% of ISSNs)
6. Examples: Search available with a group subscription
• Arts & Sciences I – XI (JSTOR)
• CAMIO (art museum images) (OCLC)
• Oxford Art Online (OUP)
• SCIPIO (OCLC)
No authentication but OCLC reports usage
7. Examples: Available with a Group Subscription & Authentication
• Art Abstracts (H.W. Wilson) (EBSCO)
• Art Full Text (H.W. Wilson) (EBSCO)
• Art Index (H.W. Wilson) (EBSCO)
• Art Index Retrospective (H.W. Wilson) (EBSCO)
• ARTbibliographies Modern (ProQuest)
• Avery Index to Architecture Periodicals (ProQuest or EBSCO or H.W. Wilson / EBSCO)
20. Art Auction Catalogues
– How can the retrieval be improved
– Improve Work and sub work (GLIMIR) clustering
– Ensure Scipio records are well clustered with non Scipio records
Author field | Author | Title | Art scope
110 | Christie, Manson & Woods International Inc | Impressionist and modern art evening sale $b including property from the estate of Edgar M. Bronfman | y art institute Chicago +
710 | Christie, Manson & Woods International (New York) | Impressionist and modern art evening sale $b Tuesday, 6 May 2014 | y SIK-ISEA
710 | Christie, Manson & Woods International Inc | Impressionist and modern art : evening sale : Christie's Tuesday, 6 May 2014, 7pm (lots 1-54) | n [Yale]
110 | Christie's | Impressionist and modern art evening sale $b including property from the estate of Edgar M. Bronfman | y RIJKS
710 | Christie's | Impressionist and modern art$b evening sale; including property from the estate of Edgar M. Bronfman; Tuesday 6 May 2014; properties from the American Hospital of Paris … | y u Heidelberg
21. Also being studied
• Inclusion of licensed content
• ILL options
• Subject relevance
– sending noise to the bottom of the results
– from package subscriptions
– from libraries without a specific symbol for the art library
22. New White Paper identifies problems with data quality
http://www.oclc.org/content/dam/oclc/reports/data-quality/215233usf-SuccessStrategies-Summary.pdf
24. Clustering Articles
• Article metadata is received from multiple
sources – the same article in the same serial
• Articles are re-published
• Articles are really like monographs
• OCLC Research is working on two initiatives
– Algorithmic
– Similarity vectors; graphic
25. New Discovery API
• Access to an ever growing collection of central index metadata for which OCLC has been granted rights.
• Linked Data response formats, so that library collections can speak the language preferred by the Web.
• Facet functionality, so that libraries can deliver a modern search experience with the ability to quickly drill down into search results.
• Access to the latest data models, including entities
• http://www.oclc.org/news/releases/2014/201432dublin.en.html
http://www.oclc.org/worldcat.en.html
WorldCat represents a “collective collection” of the world’s libraries, built through the contributions of librarians, expanded and enhanced through individual, regional and national programs.
http://www.oclc.org/worldcat/community.en.html
WorldCat is the international, online catalog that helps OCLC members share resources, reduce costs and increase visibility and impact in the communities they serve and on the Web. WorldCat is cooperatively built and managed by a dedicated, expert community of librarians, OCLC staff and content providers who collaborate to improve the integrity of, and access to, the “collective collection” of the world’s libraries.
Libraries contribute to WorldCat; publishers and other content providers contribute to the central index.
Searches via ADGC default to Art Libraries Scope. WorldCat and the OCLC central article index are searched but only those bibliographic records or serial titles with holdings from Art Libraries are included in the result set. The scope can be broadened to World-Wide (all holdings) or narrowed. The current narrowing is by region but this is tailorable and other narrower scopes are possible.
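The scoping behaviour described above can be illustrated with a minimal sketch. It is not how WorldCat implements scoping; the `Record` class, the library symbols and the region names are all hypothetical and exist only to show the three scopes (Art Libraries by default, World-Wide, and a narrower regional scope).

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    title: str
    # Hypothetical: symbols of the libraries that hold this title
    holdings: set = field(default_factory=set)

# Hypothetical symbols for art libraries in the group, mapped to a region
ART_LIBRARIES = {"ART-A": "Europe", "ART-B": "North America"}

def in_scope(record, scope="art", region=None):
    """Return True if the record belongs in the result set for this scope."""
    if scope == "world":
        # World-Wide scope: all holdings count
        return True
    art_holders = [s for s in record.holdings if s in ART_LIBRARIES]
    if region is None:
        # Default scope: at least one art library holds the title
        return bool(art_holders)
    # Narrower scope: at least one art library in the chosen region
    return any(ART_LIBRARIES[s] == region for s in art_holders)

def search(records, scope="art", region=None):
    return [r for r in records if in_scope(r, scope, region)]

records = [
    Record("A", {"ART-A"}),
    Record("B", {"UNI-X"}),
    Record("C", {"ART-B", "UNI-X"}),
]
default_titles = [r.title for r in search(records)]
world_titles = [r.title for r in search(records, scope="world")]
europe_titles = [r.title for r in search(records, region="Europe")]
```

Because the narrowing is just another predicate over holdings, a different narrower scope (for example by type of library) would only need a different mapping in place of the region.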
This includes titles where the metadata has been made available to OCLC for inclusion in the Central Article Index. This slide does not say that any of these subscriptions are included in the Central Article Index; it just indicates the percentage of the serials within a subscription that are included.
Some subscriptions do not require authentication for access to the metadata but require OCLC to report usage. The metadata is included in the search results but access to full text may be limited by IP address range testing, for example.
A group subscription is required for some titles where the metadata is the subscription. As ADGC is a publicly available resource, it is a commercial challenge to negotiate a group subscription. We are working on this with a prioritised set of titles.
The next set of slides indicates the enhancements that will be made to the user interface. The new interface is ready for installation once configuration details such as logo are finalised.
Several logos are being evaluated; this is a prototype.
The current interface requires use of the back arrow to access the result set. With the new interface, the result set will always be available.
http://oclc.org/worldcat-discovery/features.en.html . This URL gives a summary of the interface improvements in development. We are working towards a knowledge card similar to the Google Knowledge card.
This is a screen produced for the specification of the new work level display. Existing work clusters are maintained by a common work ID within each bibliographic record. There are also sub work clusters that identify identical content (regardless of format) and identical records catalogued in different languages of cataloguing. Work level records have been generated that include authors, titles, subjects, content and summary notes, all language tagged. The example is An Iceland Fisherman, original in French, displayed in the English UI with a Translations table.
We have started to examine auction catalogues. From just 2 titles (admittedly not a statistical sample) we have found each has 5 bibliographic records that failed to be identified as the same work by the work algorithms due to different cataloguing styles. The example above indicates 4 different records created by members of the ADGC and one other. The creation of name authority records for the major auction houses will help.
Scipio is an initiative of a group of art libraries (mostly North American) which cooperate to generate bibliographic records. The first line of this table is a Scipio record.
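To illustrate why the five records in the table fail to cluster, and how name authority records could help, here is a minimal sketch. It is not the OCLC work algorithm: the `AUTHORITY` mapping, the choice of the first six title words as a key and the function names are all assumptions made for this example.

```python
import re

# Hypothetical authority map: variant heading -> preferred form
AUTHORITY = {
    "christie, manson & woods international inc": "christie's",
    "christie, manson & woods international (new york)": "christie's",
    "christie's": "christie's",
}

TITLE_WORDS = 6  # assumed, deliberately crude: compare only the leading words

def normalise_author(heading, authority=AUTHORITY):
    key = heading.strip().lower()
    # Fall back to the heading itself when no authority record exists
    return authority.get(key, key)

def normalise_title(title):
    # Drop the MARC $b subfield marker, then ignore case and punctuation
    words = re.findall(r"[a-z0-9]+", title.replace("$b", " ").lower())
    return " ".join(words[:TITLE_WORDS])

def match_key(author, title, authority=AUTHORITY):
    return (normalise_author(author, authority), normalise_title(title))

# Author and title values as they appear in the table on slide 20
rows = [
    ("Christie, Manson & Woods International Inc",
     "Impressionist and modern art evening sale $b including property from the estate of Edgar M. Bronfman"),
    ("Christie, Manson & Woods International (New York)",
     "Impressionist and modern art evening sale $b Tuesday, 6 May 2014"),
    ("Christie, Manson & Woods International Inc",
     "Impressionist and modern art : evening sale : Christie's Tuesday, 6 May 2014, 7pm (lots 1-54)"),
    ("Christie's",
     "Impressionist and modern art evening sale $b including property from the estate of Edgar M. Bronfman"),
    ("Christie's",
     "Impressionist and modern art$b evening sale; including property from the estate of Edgar M. Bronfman; Tuesday 6 May 2014; properties from the American Hospital of Paris"),
]

keys_with_authority = {match_key(a, t) for a, t in rows}
keys_without_authority = {match_key(a, t, authority={}) for a, t in rows}
```

With the authority map the five rows share one key; without it the three forms of the auction house name produce three separate keys, which is the kind of split described above.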
We originally thought that the main source of noise was university art libraries that did not have a separate symbol from their main institutional library. However, on examining the noise, it seems that serial package subscriptions produce the most. We are investigating ways to relegate noise to the end of the result set, for example by classification or serial title.
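One simple way to picture relegating noise to the end of the result set is a stable sort on a noise flag, sketched below. The `PACKAGE_SERIALS` set, the result fields and the titles are hypothetical; how noise would actually be identified (classification, serial title, etc.) is still under investigation.

```python
# Hypothetical: serial titles known to arrive via package subscriptions
PACKAGE_SERIALS = {"Journal of General Science"}

def is_noise(result):
    return result.get("serial_title") in PACKAGE_SERIALS

def relegate_noise(results):
    # sorted() is stable and False sorts before True, so the original
    # relevance order is kept within the non-noise and noise groups
    return sorted(results, key=is_noise)

results = [
    {"title": "Cell biology update", "serial_title": "Journal of General Science"},
    {"title": "Vermeer's studio", "serial_title": "Art Bulletin"},
    {"title": "Danish painting", "serial_title": None},
]
ranked_titles = [r["title"] for r in relegate_noise(results)]
```

Nothing is removed from the result set; suspected noise is only moved below the other results.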
A new study released this week examines data quality in the records found in the central index and makes some recommendations for publishers and content providers.
The main recommended solutions.
Until now journal articles have always been in the too hard basket. There are two separate projects within OCLC Research to cluster articles. The data provided by the content providers cannot be de-duplicated, and this causes obvious noise in the result sets of searches.
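As a rough illustration of what clustering articles by similarity vectors could look like, the sketch below builds a word-count vector per article and groups articles whose cosine similarity passes a threshold. This is not the OCLC Research method, which the slides do not describe; the field names, the sample articles and the 0.8 threshold are assumptions.

```python
import math
import re
from collections import Counter

def vector(article):
    # Bag-of-words vector over title, journal and year
    text = f"{article['title']} {article['journal']} {article['year']}"
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def cluster(articles, threshold=0.8):
    clusters = []  # each entry: (vector of first member, list of members)
    for art in articles:
        v = vector(art)
        for rep, members in clusters:
            if cosine(v, rep) >= threshold:
                members.append(art)
                break
        else:
            # No existing cluster is similar enough: start a new one
            clusters.append((v, [art]))
    return [members for _, members in clusters]

# Hypothetical metadata for the same article received from two sources
articles = [
    {"title": "Rembrandt and the art market", "journal": "Art History", "year": 2012},
    {"title": "Rembrandt and the Art Market.", "journal": "Art history", "year": 2012},
    {"title": "Danish design after 1950", "journal": "Design Studies", "year": 2010},
]
cluster_sizes = [len(members) for members in cluster(articles)]
```

The first two records differ only in case and punctuation and fall into one cluster, while the unrelated third article starts its own.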
The announcement this month of a new discovery API offers new possibilities, though I personally am not suggesting that we should rush into it. The new API searches the content index at the same time as WorldCat and thus it is now possible to produce a fully customised interface as opposed to a configurable / tailored interface.