This document discusses BIBFRAME, a new bibliographic framework being developed as a replacement for the MARC cataloging standard. It provides an overview of BIBFRAME, including its goals of utilizing linked data and resolving issues with MARC. The document also examines the BIBFRAME model and vocabulary, experiments being conducted with it, and questions around its future adoption.
BIBFRAME and the Future of Cataloguing
1. BIBFRAME :
the future of cataloguing? /
Thomas Meehan.
Cambridge Cataloguing Advisory Group
Divinity Faculty
7 December 2016
tom@aurochs.org @orangeaurochs
4. Linked Data: The Web of Data
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using
the standards (RDF, SPARQL).
4. Include links to other URIs so that they can discover more things.
Tim Berners-Lee (2006)
9. British Library model (used for the BNB)
<http://bnb.data.bl.uk/doc/resource/015771460> dct:creator <http://bnb.data.bl.uk/id/person/WaughEvelyn1903-1966> .
10. Some library Linked Data releases
2008 Swedish National Library
2010 German National Library (authority data)
2011 BNB
Cambridge University Library
Europeana
French National Library
2012 OCLC Worldcat (using schema.org)
Spanish National Library
2014 RLUK (as part of the European Library)
13. British Library Model (used for the BNB)
<http://bnb.data.bl.uk/doc/resource/015771460> dct:creator <http://bnb.data.bl.uk/id/person/WaughEvelyn1903-1966> .
14. British Library Model (used for the BNB)
<http://bnb.data.bl.uk/id/person/WaughEvelyn1903-1966> rdfs:label "Waugh, Evelyn, 1903-1966" ;
owl:sameAs <http://viaf.org/viaf/68937142> .
15. BIBFRAME: Why?
“Most felt any benefits of RDA would be
largely unrealized in a MARC environment.
MARC may hinder the separation of
elements and ability to use URIs in a linked
data environment.”
Report and Recommendations of the U.S. RDA Test Coordinating
Committee. Executive Summary. 2011.
16. BIBFRAME: Scope
"Demonstrate credible progress towards a
replacement for MARC."
Report and Recommendations of the U.S. RDA Test Coordinating Committee.
Executive Summary. 2011.
“a foundation for the future of bibliographic
description, both on the web, and in the
broader networked world.”
BIBFRAME website.
23. BIBFRAME 1.0: Vocabulary
ex:work1 bf:creator [
    a bf:Person, bf:Authority ;
    bf:authorizedAccessPoint "Holland, Tom" ;
    bf:hasAuthority <http://id.loc.gov/authorities/names/nb2004033017>
] .
ex:instance1 bf:instanceTitle [
    a bf:Title ;
    bf:titleValue "Rubicon" ;
    bf:subtitle "the last years of the Roman Republic"
] .
24. BIBFRAME: Vocabulary 2.0
ex:work1 rdau:P60434 <http://id.loc.gov/authorities/names/nb2004033017> .
<http://id.loc.gov/authorities/names/nb2004033017>
    a bf:Person ;
    rdfs:label "Holland, Tom" .
ex:instance1 bf:title [
    a bf:Title ;
    bf:mainTitle "Rubicon" ;
    bf:subtitle "the last years of the Roman Republic"
] .
25. BIBFRAME: Bibfra.me
ex:work1 bflite+relation:author [
    a bfLite:Person ;
    bfLite:authorityLink <http://id.loc.gov/authorities/names/nb2004033017> ;
    bfLite:name "Holland, Tom"
] .
ex:instance1 bfLite:title "Rubicon" ;
    bfLite+library "the last years of the Roman Republic" .
26. BIBFRAME: Experiments and Projects
Library of Congress trial, including work on the Bibframe Editor (BFE)
and transformation service.
National Library of Medicine
Libhub and Bibflow
OCLC and schema.org
LD4L
29. What is it all for?
• Googling (see schema.org etc)
• Storage
• Editing
• Openness
• Ending the library silo
• APIs!
• A lifeboat to allow us to kill off MARC
30. When is it all going to happen?
https://www.flickr.com/photos/andreboeni/7882212230/
31. Standards and Vocabularies
British Library. Free Data Services. 1, Linked Open BNB.
http://www.bl.uk/bibliographic/datafree.html#lod
Dublin Core Metadata Initiative. DCMI Metadata Terms.
http://dublincore.org/documents/2012/06/14/dcmi-terms/
LD4L. LD4L : Linked Data for Libraries. https://www.ld4l.org/
Library of Congress. BIBFRAME Model and Vocabulary.
http://www.loc.gov/bibframe/docs/index.html
Library of Congress. LC Linked Data Service : Authorities and Vocabularies. http://id.loc.gov/
Schema.org. Schema.org. http://schema.org/
W3C. RDF. https://www.w3.org/RDF/
Zepheira. Bibframe Vocabulary Navigator. http://www.bibfra.me/
32. Reports and Articles
Berners-Lee. Linked Data : Design Issues. 2006. https://www.w3.org/DesignIssues/LinkedData.html
Godby, Carol Jean. The Relationship Between BIBFRAME and OCLC’s Linked-Data Model of Bibliographic Description: A Working Paper. 2013.
http://www.oclc.org/content/dam/research/publications/library/2013/2013-05.pdf
Greenall, Rurik. Is A Common Framework for Library Data A Dead End? 2013. https://brinxmat.wordpress.com/2013/08/08/is-a-common-framework-for-library-data-a-dead-end/
Greenall, Rurik. Oslo Public Library Cataloguing Linked Data. 2016. https://vimeo.com/192831354
Library of Congress. Bibliographic Framework as a Web of Data: Linked Data Model and Supporting Services. 2012.
http://www.loc.gov/bibframe/pdf/marcld-report-11-21-2012.pdf
Library of Congress. A Bibliographic Framework for the Digital Age. 2011. http://www.loc.gov/bibframe/news/framework-103111.html
Library of Congress. Bibliographic Framework Initiative. http://www.loc.gov/bibframe/
Malssen, Kara Van. BIBFRAME AV Modeling Study: Defining a Flexible Model for Description of Audiovisual Resources. 2014.
http://www.loc.gov/bibframe/pdf/bibframe-avmodelingstudy-may15-2014.pdf
Fallgren, Nancy. NLM BIBFRAME Update. 2015. https://www.nlm.nih.gov/pubs/techbull/mj15/mj15_bibframe.html
U.S. RDA Test Coordinating Committee. Report and Recommendations of the U.S. RDA Test Coordinating Committee. Executive Summary. 2011.
http://www.nlm.nih.gov/tsd/cataloging/RDA_report_executive_summary.pdf
Zepheira. Bibframe Vocabulary Navigator [BIBFRAME lite]. http://www.bibfra.me/
Welsh, Anne, Bikakis, Antonis, Garea Garcia, Natalia, Mahony, Simon, Inskip, Charles, Vogel, Mira. Work in Progress: the Linked Open Bibliographic Data Project. Catalogue and Index (178) p. 15-19. 2014. http://discovery.ucl.ac.uk/1466469/
33. Photos
Meehan, Thomas. Cat and books.
https://www.flickr.com/photos/orangeaurochs/30568200516/ (CC-BY 2.0)
Bone, Andrew. GWR No.22 AEC Diesel Railcar with Firefly Broad Gauge Steam Loco Replica at Didcot Great Western Railway Centre.
https://www.flickr.com/photos/andreboeni/7882212230/ (CC-BY 2.0)
Meehan, Thomas. Sleepy cat, books, and tinsel.
https://www.flickr.com/photos/orangeaurochs/8332159067/ (CC-BY 2.0)
Editor's Notes
Hello.
I am Thomas Meehan, Head of Cataloguing and Metadata at UCL.
Today I am going to talk about Bibframe, the Library of Congress’s proposal to replace MARC with linked data. I became interested in Linked Data after seeing it in action at a British Library event and thought it interesting in its own right. I have been fascinated to see linked data become more accepted, to the point that the Library of Congress is planning to adopt it via its Bibframe initiative.
This is not a how-to guide!
After briefly covering what Bibframe stands for, I will provide a brief introduction to Linked Data. I’ll then talk a bit more about Bibframe, its history and what it looks like. I’ll finish with some remarks about what might happen next with Bibframe and linked data.
Initiated by the Library of Congress, BIBFRAME provides a foundation for the future of bibliographic description, both on the web, and in the broader networked world.
Short for Bibliographic Framework Initiative, which initially had a somewhat broader focus. The term Bibframe is more commonly now used to refer to the vocabulary.
It’s an abbreviation, not an acronym. Officially written in SHOUTY CAPITALS.
To understand Bibframe, you have to understand linked data. The best way to think of Linked Data is as a web of data.
Linked data is not a standard as such. When people use the phrase Linked Data they are actually referring to a Web of Data, as opposed to a web of documents (e.g. Wikipedia, even library catalogue pages), built on specific principles set out by Tim Berners-Lee:
URI: Not just an address like a URL. URIs can be URLs or URNs. URLs can be http, ftp, etc. URNs are not web actionable.
HTTP: I.e. over the web. If you don't have http, you cannot easily go and look up more information.
Useful info: Basically a description, something about the thing. Just as on a web page you'd provide information in HTML, in linked data you provide information in RDF (of which more in a second). You can search it using SPARQL (of which more from Owen after lunch).
Links: Crucial. You can find out more from other URIs, much as links on a web page allow references and explanations, and further information to be explored.
Note: All this is independent of libraries and proceeds rather from the W3C. Linked data is not a formal W3C standard but RDF is, like HTML. The Web of Data is the basis of a semantic web, where data carries meaning as well as text, so that computers can make sense of it and act on it.
Understanding RDF is important to understanding linked data and that's what I'm going to spend some time on now.
If you open an HTML webpage like Wikipedia you’ll be presented with natural language sentences in English or another human language, such as this simple one:
Two entities and a relationship
To identify these entities unambiguously, we could assign them:
Name authority headings
Unique database ids
URIs!...
This is the URI for Brideshead Revisited that LC used on their linked data set.
We can now add URIs for the author, Evelyn Waugh, and even the creator relationship itself!
This is a triple, of which all RDF is made! This is a simple statement but once you put lots of these together you can express much richer concepts.
Note the example in Turtle (.ttl) above. This is just one way of writing out RDF. Others include RDF/XML (RDF as XML, as you’d expect), N-Triples (every triple written out in full, line by line), and RDF/JSON (RDF in JSON, not to be confused with JSON-LD). Each has its own benefits and fans, and they can normally be converted one into the other without losing data. Turtle is the most human-readable one once you understand it.
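To make the idea of a triple concrete, here is a minimal sketch in plain Python (standard library only, not a real RDF toolkit). It treats the Brideshead Revisited statement from the slide as a (subject, predicate, object) tuple and writes it out in the line-by-line N-Triples style. The function name to_ntriples is made up for illustration, dct:creator is expanded to the Dublin Core terms namespace, and the sketch only handles URI objects, not literals.

```python
# A triple is just (subject, predicate, object); a graph is a set of triples.
DCT = "http://purl.org/dc/terms/"  # namespace behind the dct: prefix

book = "http://bnb.data.bl.uk/doc/resource/015771460"
author = "http://bnb.data.bl.uk/id/person/WaughEvelyn1903-1966"

graph = {(book, DCT + "creator", author)}


def to_ntriples(triples):
    """Write every triple out in full, one per line (URI objects only)."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(triples))


ntriples = to_ntriples(graph)
print(ntriples)
```

The Turtle on the slide says the same thing more compactly because the dct: prefix stands in for the long namespace.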
For comparison this is the British Library data model.
Reused existing vocabularies and mixed vocabularies up.
+ VIAF in 2012, which is a brilliant bridge between library authorities and e.g. Wikipedia (look at the bottom of a Wikipedia page for a well-known author!)
Linked Data by no means originated in libraries. Long-term users include:
BBC (e.g. sport and wildlife websites, but internally)
Wikipedia/DBpedia
Ordnance Survey (see above)
If you scroll to the bottom of the OS page, you’ll see a link to the Turtle. This is an excerpt.
The British Library data model again….
…and in more detail.
Again, re-used existing vocabularies, except where there was nothing to fit.
Again, links to external resources as well as giving the text of the person's name.
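As a rough illustration of why those links matter, the sketch below holds the three British Library statements from the slides in a plain Python list and follows them from the book to the person's label and VIAF link. It is an assumption-laden toy, not how the BNB service works: the helper names objects and describe_creators are invented, and literals are marked by wrapping them in a one-element tuple.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DCT_CREATOR = "http://purl.org/dc/terms/creator"

book = "http://bnb.data.bl.uk/doc/resource/015771460"
waugh = "http://bnb.data.bl.uk/id/person/WaughEvelyn1903-1966"
viaf = "http://viaf.org/viaf/68937142"

# Objects are URIs (plain strings) or literals (wrapped in a 1-tuple here).
triples = [
    (book, DCT_CREATOR, waugh),
    (waugh, RDFS_LABEL, ("Waugh, Evelyn, 1903-1966",)),
    (waugh, OWL_SAME_AS, viaf),
]


def objects(graph, subject, predicate):
    """Return every object stated for a given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def describe_creators(graph, resource):
    """Follow creator links, then collect each creator's labels and sameAs links."""
    described = []
    for person in objects(graph, resource, DCT_CREATOR):
        labels = [lit[0] for lit in objects(graph, person, RDFS_LABEL)]
        links = objects(graph, person, OWL_SAME_AS)
        described.append({"uri": person, "labels": labels, "same_as": links})
    return described


creators = describe_creators(triples, book)
print(creators)
```

The label gives a human something to read; the sameAs link gives a machine somewhere else to go.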
Arguably, it’s all RDA’s fault. The Bibframe initiative was initially undertaken in partial response to RDA testing in 2011 which determined that MARC21 wasn’t up to handling RDA properly:
“Most felt any benefits of RDA would be largely unrealized in a MARC environment.
MARC may hinder the separation of elements and ability to use URIs in a linked data environment.”
This all follows on from a growing widespread feeling that MARC is not really up to the job anymore and needs to be replaced. In particular it is very text based and is not used outside of libraries.
The work was originally undertaken with the consultants Zepheira, whose President, Eric Miller, was closely involved in developing the W3C RDF standard.
1. For RDA to be adopted, the Committee suggested that the US national libraries "Demonstrate credible progress towards a replacement for MARC." The Library of Congress decided in 2011 to use linked data as the basis of a replacement:
2. There was also a broader, more ambitious, intention: “BIBFRAME provides a foundation for the future of bibliographic description, both on the web, and in the broader networked world” as stated on their website. It is therefore:
Designed to cope with RDA, including FRBR
Designed to cope with legacy data
Designed to cope with both converted MARC and newly created data
Designed to cope with non-catalogue data, although this is currently not so evident:
Zepheira’s bibfra.me has specific Archive and Rare Materials extensions. Archives hasn’t been developed yet and Rare Books includes only Title Proper, Date Publication, and Signature from dcrmb.
Over all this hangs the shadow of MARC.
3. Curiously, unlike say the BL, Oslo Public Library, or other linked data initiatives, Bibframe is not focussed on one library’s or consortium’s needs. Not, for instance, the specific business needs of LC and others? Even MARC was created for a very specific purpose and this probably allowed its quick development and success.
Here is the common image showing the original Bibframe model which demonstrates one of its most unsettling aspects for cataloguers: that it doesn’t follow FRBR!
The Work looks like a FRBR Work; the Instance looks very like a FRBR Manifestation; the Authority looks like any of the group 2 or 3 entities: authors, subjects, etc.; but where is the Expression? In reality, a Bibframe work can be both a FRBR work and a FRBR Expression, depending on how it describes itself:
To put it facetiously:
“BIBFRAME has worked on modelling works as Works within the BIBFRAME model, similar to the RDA modelling work, itself modelled on the work on the FRBR model of Works and Expressions. A BIBFRAME Work is a creative work, perhaps a FRBR Work, or an RDA FRBR Work but it also expresses a FRBR Expression, and of course an RDA FRBR Expression. A Work may express another Work based on others’ work, not just a FRBR Work or an RDA Work. That also works. FRBR Works or RDA Works expressed as BIBFRAME Works can relate to FRBR Expressions (BIBFRAME Works or RDA Expressions). So, Works are works that can be Works but also Expressions linked to Works that really are Works.” http://www.aurochs.org/aurlog/2013/05/25/the-bibframe-work/
This does, though, make the point that Bibframe is designed to handle RDA but also lots of things that are not RDA. Most AACR2 records don’t have Expression records either and there are other FRBR-like models (e.g. CIDOC CRM) to take account of if Bibframe is to move beyond purely accommodating library catalogue data. This reiterates the point that we are unlikely to have a uniform way of looking at the world.
This is a theoretical example, imagining that we have a linked data server set up.
Notice how all the properties are BIBFRAME-specific. BIBFRAME is unusually self-contained in this respect, arguably for reasons of security and control. Schema.org is similarly self-contained but is much less ambitious or complicated.
None of this is supported by library systems, and that is part of the point! MARC locks us into library-specific specialised software. Using linked data frees us, at least in theory. There is the danger with BIBFRAME being such an 'official' standard that this is what everyone will follow. Not necessarily a good thing.
The LC’s Bibframe AV Modelling study in 2014 proposed modifying the Bibframe model to add the Event alongside Work, together with a subclass of Work called Content, which could be a recording of an event: it depicts or captures the event.
The new Core elements are Work, Instance, and Item, not Annotation which is subsumed under Item or replaced with the W3C Annotation model.
(
bf:Content bf:captures bf:Event ;
bf:Event bf:depicts bf:Work
)
Bibframe 2.0 also adds Items, which are not annotations, as well as Events, which can be subjects of works or which works can depict.
International Council of Museums. The CIDOC Conceptual Reference Model. 2013. http://www.cidoc-crm.org/
Profiles. The BIBFRAME metamodel is designed to be lightweight, flexible and able to accommodate the declarative needs of both existing (RDA, DACS, VRA, etc.) and yet-to-be-developed community vocabularies. To best accommodate these communities the BIBFRAME RDF Schema is intentionally underspecified in terms of constraints such as domain and range. This same flexibility comes at a cost; without a way of constraining these vocabularies, authoring tools, for example, are unable to provide guidance to content authors for specific vocabularies and derived models. BIBFRAME Profiles provide such supplementary descriptions.
A BIBFRAME Profile is a document, or set of documents, that puts a Profile (e.g. local cataloging practices) into a broader context of functional requirements, domain models, guidelines on syntax and usage, and possibly data formats.
Good for local practices, display, and cataloguing interfaces.
+ 4. Variation within Bibframe itself: there are in most cases several ways to make assertions about titles and authors.
1.0. Very, very self-contained: linked data practice generally welcomes re-using vocabularies but Bibframe has attempted not to. Bibframe has, however, brought forward several reasons for its approach, which can be summarised as authority and stability. A 2012 report from the Library of Congress said that:
“While the recommendation of a singular namespace is counter to several current Linked Data bibliographic efforts, it is crucial to clarify responsibility and authority behind the schematic framework of BIBFRAME in order to minimize confusion and reduce the complexity of the resulting data formats.”
Library of Congress. Bibliographic Framework as a Web of Data: Linked Data Model and Supporting Services. Washington, D.C.: Library of Congress, 2012. P. 15. http://www.loc.gov/bibframe/pdf/marcld-report-11-21-2012.pdf
bf:creator
bf:contributor
bf:relator role construct
Bibframe 2.0 is a lot more open to other vocabularies and different ways of expressing the same concept. This fits wider linked data practice better, although arguably makes it less standard.
Five ways to do the same thing:
_ bf:contribution _
_ rdau:P60434 _ w/label
_ rdau:P60434 _ wo/label
_ bf:contribution [
a bf:Contribution ;
bf:role "creator" ;
bf:agent _
]
Based on the NLM example
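One practical consequence of having several ways to say the same thing is that anything reading the data has to check for each pattern. The sketch below is a toy illustration only, using prefixed names as plain strings rather than full URIs. The works ex:work1 and ex:work2, the blank node label _:c1, and the function agents_for are all hypothetical. It looks for an agent attached either directly (rdau:P60434, or bf:creator from the 1.0 vocabulary) or through a bf:contribution node.

```python
AGENT = "http://id.loc.gov/authorities/names/nb2004033017"

# Two hypothetical works that record the same author in different ways.
graph = [
    # Direct RDA property, as on the Vocabulary 2.0 slide.
    ("ex:work1", "rdau:P60434", AGENT),
    # Contribution node (blank node written as "_:c1"); the role is kept
    # as a plain string here for simplicity.
    ("ex:work2", "bf:contribution", "_:c1"),
    ("_:c1", "rdf:type", "bf:Contribution"),
    ("_:c1", "bf:role", "creator"),
    ("_:c1", "bf:agent", AGENT),
]

# Properties that point straight from a work to an agent.
DIRECT_PROPERTIES = {"rdau:P60434", "bf:creator"}


def agents_for(triples, work):
    """Collect agents linked to a work directly or via a contribution node."""
    found = set()
    for s, p, o in triples:
        if s != work:
            continue
        if p in DIRECT_PROPERTIES:
            found.add(o)
        elif p == "bf:contribution":
            # Hop through the intermediate node to reach its bf:agent.
            found.update(o2 for s2, p2, o2 in triples if s2 == o and p2 == "bf:agent")
    return found


print(agents_for(graph, "ex:work1"))
print(agents_for(graph, "ex:work2"))
```

Both works resolve to the same id.loc.gov URI, which is the point: the link survives even when the modelling differs.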
Bibframe Lite is developed by Zepheira separately and differs in several ways. It is unclear how this will develop: will it rejoin or remain separate or will they feed off each other?
Involves a Core vocab and a number of others including e.g. relation, library, archive
NLM is one of those keen on it, and the above is based on one of their published examples. They’re also looking at incorporating e.g. RDA or other terminology.
Here are some example initiatives with Bibframe involvement:
LC Trial. Phase 1 September 2015-March 2016. Report issued June 2016. Used Bibframe 1.0 for several formats. Considered a success. Cataloguers generally used RDA rules but some found it difficult to think outside of MARC.
NLM. 2014. Development of a core Bibframe vocabulary with Zepheira, GWU, and UCD.
Libhub and Bibflow are Zepheira and partner initiatives looking at adoption and workflows of linked data/Bibframe
Linked Data for Libraries (LD4L). A Cornell University, Harvard University and Stanford University project to create a general linked data model for library and cultural use, including Bibframe for bibliographic data. Ongoing.
http://bibframe.org/tools/compare/
http://www.oclc.org/content/dam/research/publications/2015/oclcresearch-loc-linked-data-2015-a4.pdf
Go to any WorldCat book page, scroll to the bottom, and expand the Linked Data link. WorldCat embeds schema.org linked data, a very general vocabulary mainly developed by search engines to help with presenting and understanding web pages.
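For a feel of what schema.org description looks like, here is a small sketch that builds a description of the Rubicon example from the earlier slides and wraps it in a script element as JSON-LD, one of the syntaxes search engines read. This is an illustration under assumptions, not WorldCat's actual markup: the choice of JSON-LD, the combined title string, and the variable names are mine, while Book, Person, name, author and sameAs are standard schema.org terms.

```python
import json

# A schema.org description of the book used in the vocabulary slides.
book = {
    "@context": "http://schema.org",
    "@type": "Book",
    "name": "Rubicon: the last years of the Roman Republic",
    "author": {
        "@type": "Person",
        "name": "Holland, Tom",
        # Link out to the LC name authority, as in the BIBFRAME examples.
        "sameAs": "http://id.loc.gov/authorities/names/nb2004033017",
    },
}

jsonld = json.dumps(book, indent=2)

# Embedding the JSON-LD in a web page makes it visible to crawlers.
snippet = f'<script type="application/ld+json">\n{jsonld}\n</script>'
print(snippet)
```

Compared with the BIBFRAME examples, the structure is flatter and the vocabulary far more general, which is exactly the trade-off discussed here.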
The Bibframe editor as developed by LC. Uses RDA rules (rather than MARC or Bibframe elements). Zepheira have developed their own, and other linked data initiatives have also developed their own, e.g. Oslo Public Library: see https://vimeo.com/192831354
Googling (see schema.org etc). OCLC and special collections may benefit most
Storing (opaque in any case, incl. MARC)
Editing (see editor; nobody edits MARC)
Openness
Ending the library silo
APIs!
A lifeboat to allow us to kill off MARC
Much in draft or unfinished.
Needs ILS support or at least software support to be used.
There is already some fracturing and rivalry
There is momentum
Issued by LC, so carries weight
Scope uncertain in the long run
Here are two Great Western Railway trains.
On the left is a (1940) train sitting on standard gauge tracks (4 ft 8 ½ in). This is based on the gauge used on colliery locomotives by George Stephenson in the north east of England from about 1814. This is apparently based on the space between the shafts of a cart needed to fit a horse to pull it. It is now used on the Japanese Bullet train, the Eurostar, or any train from Cambridge station. All because of the horse (index card).
On the right is a (Firefly, 1840-) train on broad gauge tracks (7 ft). This was obviously better, as I. K. Brunel knew: it was faster, smoother, and could carry more. However, moving from broad to standard gauge in 1846 took the authority of a government act and only required the rails to be moved closer together. Moving to the better, broader gauge would have come with massive costs in terms of infrastructure, as the bridges would be the wrong size, more land would need to be bought up, and many more trains built. In some ways a move to linked data, however beneficial, also needs many assumptions, working practices, and infrastructure to be demolished or replaced.
https://www.flickr.com/photos/andreboeni/7882212230/
Do we simply want to replace MARC? Although at least having tolerance for some of the data’s idiosyncrasies is not a bad idea. This means we can move quickly (?!) then tidy up later.
Can a single standard provide the answer or realistically claim to be the “foundation for the future of bibliographic description” on the web? What about DBpedia/Wikipedia entries on books, for example? What about different uses, such as SEO, that schema.org might be more appropriate for, or FOAF, which is optimised for describing people?
Bibframe is still based around cataloguing. Even then, it will inevitably strain and warp, as it has done with AV materials. What about other library materials, including archives, repositories, article databases, vendor knowledge bases, and local authority files? I worked in an archive that had a ceremonial sword: will that be covered? Bibframe does seem to be moving away from the single format idea and embracing outside vocabularies and methods where necessary, e.g. annotations and direct creation.
This can only be answered at local business level and on merit. If you want to have an exclusively RDA catalogue, why not just use RDA elements? If DC works better for your needs, why not use that, or both? The important part of linked data is the links. Bibframe has a clear advantage and role to play given its trusted source but whether it is the foundation of bibliographic data I think is too hard to say.