As we begin modeling and migrating our data to work in a Linked Data environment, we need to avoid simply building new silos with a trendy new facade. It is important to think carefully about how our data models fit into the larger cloud of data. We must consider what is necessary for us to link to and reuse other data sources and for others to reuse ours. How do we balance the control we want over our own vocabularies and models while also not alienating ourselves from the larger web? What compromises do we need to make? What effect will schema.org have? After a short introduction to RDF and the concepts of Linked Data, we will explore some potential snags and solutions as well as datasets and technologies that might influence some of our decisions.
5. Rules of Linked Data
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
4. Include links to other URIs, so that they can discover more things.
http://www.w3.org/DesignIssues/LinkedData.html
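Rules 2 and 3 can be illustrated with a minimal sketch of an HTTP lookup that asks for RDF instead of HTML. The URI below is hypothetical, and the request is only built, not sent; whether a given server honors the Accept header is an assumption.

```python
from urllib.request import Request

# Hypothetical URI naming a "thing"; substitute a real Linked Data URI.
THING_URI = "http://example.org/id/book/1"


def build_lookup(uri: str) -> Request:
    """Build (but do not send) an HTTP GET that prefers RDF over HTML."""
    # Prefer Turtle and accept RDF/XML as a lower-priority fallback.
    accept = "text/turtle, application/rdf+xml;q=0.8"
    return Request(uri, headers={"Accept": accept})


request = build_lookup(THING_URI)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would perform the actual lookup.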
12.    title                 author                   isbn        language
    1  Weaving the Web       Berners-Lee, Tim         0062515861  eng
    2  Pour Vaclav Havel     Durrenmatt, Friedrich    2882822444  fre
    3  Cien años de soledad  García Márquez, Gabriel  9500700298  spa
    4  The concise AACR2     Gorman, Michael          0838903258  eng
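One way to read the table above as RDF is to treat each row as a subject, each column as a predicate, and each cell as an object. The sketch below assumes hypothetical example.org URIs for the rows and columns; they are placeholders, not part of any real vocabulary.

```python
# Rows from the table: (row id, title, author, isbn, language).
ROWS = [
    (1, "Weaving the Web", "Berners-Lee, Tim", "0062515861", "eng"),
    (2, "Pour Vaclav Havel", "Durrenmatt, Friedrich", "2882822444", "fre"),
    (3, "Cien años de soledad", "García Márquez, Gabriel", "9500700298", "spa"),
    (4, "The concise AACR2", "Gorman, Michael", "0838903258", "eng"),
]
COLUMNS = ("title", "author", "isbn", "language")

BASE = "http://example.org/book/"     # hypothetical subject namespace
VOCAB = "http://example.org/vocab#"   # hypothetical predicate namespace


def row_to_triples(row):
    """Turn one table row into (subject, predicate, object) triples."""
    row_id, *cells = row
    subject = f"{BASE}{row_id}"
    return [(subject, f"{VOCAB}{col}", value) for col, value in zip(COLUMNS, cells)]


triples = [t for row in ROWS for t in row_to_triples(row)]
for triple in triples[:4]:
    print(triple)
```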
24. Objects
• Can be literals: text, numeric, date, language
• Can be URIs, which relate to other resources
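The difference between literal and URI objects shows up clearly in the N-Triples syntax. The following is a minimal sketch, not a complete serializer: it escapes only backslashes and double quotes, and the example.org URIs are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str
    lang: Optional[str] = None       # language tag, e.g. "en"
    datatype: Optional[str] = None   # datatype URI, e.g. XSD + "integer"


def term(t: Union[URI, Literal]) -> str:
    """Render one term in N-Triples syntax."""
    if isinstance(t, URI):
        return f"<{t.value}>"
    # Simplification: only backslash and double quote are escaped here.
    escaped = t.value.replace("\\", "\\\\").replace('"', '\\"')
    if t.lang:
        return f'"{escaped}"@{t.lang}'
    if t.datatype:
        return f'"{escaped}"^^<{t.datatype}>'
    return f'"{escaped}"'


def ntriple(s: URI, p: URI, o: Union[URI, Literal]) -> str:
    return f"{term(s)} {term(p)} {term(o)} ."


book = URI("http://example.org/book/1")                    # hypothetical
print(ntriple(book, URI("http://example.org/vocab#title"),
              Literal("Weaving the Web", lang="en")))      # text literal
print(ntriple(book, URI("http://example.org/vocab#row"),
              Literal("1", datatype=XSD + "integer")))     # numeric literal
print(ntriple(book, URI("http://example.org/vocab#language"),
              URI("http://example.org/lang/eng")))         # URI object
```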
30. Vocabularies
Dublin Core                general bibliographic description
Friend-of-a-Friend (FOAF)  describe people and organizations
Bibliontology (BIBO)       citations and bibliographies
SKOS                       subjects and thesauri
WGS84                      geographic coordinates
Creative Commons (CC)      licenses and attribution
Music Ontology (MO)        recordings, performances, performers, etc.
OWL                        used to build schemas
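As a rough sketch of how these vocabularies get used, the columns of the book table can be mapped to existing properties and the prefixed names expanded to full URIs. The prefix labels and the column-to-property mapping are one plausible choice, assumed here for illustration.

```python
# Namespace URIs of the vocabularies listed above (prefix labels are conventional).
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "bibo": "http://purl.org/ontology/bibo/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "cc": "http://creativecommons.org/ns#",
    "mo": "http://purl.org/ontology/mo/",
    "owl": "http://www.w3.org/2002/07/owl#",
}

# Assumed mapping from table columns to vocabulary properties.
COLUMN_TO_PROPERTY = {
    "title": "dcterms:title",
    "author": "dcterms:creator",
    "isbn": "bibo:isbn10",
    "language": "dcterms:language",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'dcterms:title' to a full URI."""
    prefix, _, local = curie.partition(":")
    return PREFIXES[prefix] + local


for column, curie in COLUMN_TO_PROPERTY.items():
    print(f"{column:<9} {curie:<17} {expand(curie)}")
```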
37. Versatile
• “Schemaless”
• Properties can be assigned from any number of vocabularies
• Descriptions can be both generalized and domain- or audience-specific
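A small sketch of what "schemaless" can mean in practice: one description carries properties from several vocabularies, and a consumer keeps only the vocabularies it understands. The property choices are assumptions made for illustration.

```python
# One description mixing Dublin Core and BIBO properties (assumed mapping).
description = {
    "dcterms:title": "Weaving the Web",
    "dcterms:creator": "Berners-Lee, Tim",
    "dcterms:language": "eng",
    "bibo:isbn10": "0062515861",
}


def view(desc: dict, prefixes: set) -> dict:
    """Keep only the properties whose prefix the consumer understands."""
    return {p: v for p, v in desc.items() if p.split(":", 1)[0] in prefixes}


general = view(description, {"dcterms"})            # generalized description
domain = view(description, {"dcterms", "bibo"})     # bibliographic audience
print(sorted(general))
print(sorted(domain))
```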
54. Alignment
• If you can’t link to other things, what’s the
point?
• What are you describing?
• A “Book” or a “Manifestation”?
• Who is your audience?
• Whose data do you wish to consume?
56. Work, Expression, Manifestation, Item
All WEMI entities are disjoint
57. Work → Expression → Manifestation → Item
No shortcuts between non-adjacent entities
58. No shortcuts between non-adjacent entities
• Manifestations must have an Expression to relate to a Work
• Lots of (possibly sketchy) scaffolding required
• Who outside of libraries will do this?
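The "no shortcuts" constraint can be sketched as a lookup that must pass through an Expression to get from a Manifestation to a Work. The identifiers and dictionary names below are hypothetical; only the Work/Expression/Manifestation chain comes from FRBR.

```python
# Hypothetical identifiers; each link connects adjacent WEMI entities only.
REALIZES = {"expr:1": "work:1"}                  # Expression -> Work
EMBODIES = {"man:1": "expr:1", "man:2": None}    # Manifestation -> Expression


def work_of(manifestation: str):
    """Find the Work for a Manifestation; there is no direct link to follow."""
    expression = EMBODIES.get(manifestation)
    if expression is None:
        # Without the Expression scaffolding the Work is unreachable.
        return None
    return REALIZES.get(expression)


print(work_of("man:1"))
print(work_of("man:2"))
```

A Manifestation described without an Expression, as in the second call, cannot be related to any Work under this model.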
59. FRBR
Work     Expression          Manifestation
Title    Language            ISBN
Author   “type” of resource  copyright date
Subject                      place of publication
60. bibo:Book
Title
Author
Subject
“type”
Language
ISBN
copyright date
place of publication
61. Work     Expression          Manifestation
    Title    Language            ISBN
    Author   “type” of resource  copyright date
    Subject                      place of publication
bibo:Book
Title
Author
Subject
“type”
Language
ISBN
copyright date
place of publication
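To make the mismatch concrete, here is a minimal sketch that splits a flat, bibo:Book-style record into the three FRBR levels, following the attribute assignments shown on the FRBR slide. The field names are hypothetical keys chosen for this example.

```python
# Which FRBR entity each flat attribute belongs to (per the FRBR slide).
FRBR_LEVEL = {
    "title": "Work",
    "author": "Work",
    "subject": "Work",
    "language": "Expression",
    "type": "Expression",
    "isbn": "Manifestation",
    "copyright_date": "Manifestation",
    "place_of_publication": "Manifestation",
}


def split_by_level(record: dict) -> dict:
    """Distribute the attributes of one flat record across W/E/M."""
    levels = {"Work": {}, "Expression": {}, "Manifestation": {}}
    for field, value in record.items():
        levels[FRBR_LEVEL[field]][field] = value
    return levels


flat_book = {
    "title": "Weaving the Web",
    "author": "Berners-Lee, Tim",
    "language": "eng",
    "isbn": "0062515861",
}
for level, attrs in split_by_level(flat_book).items():
    print(level, attrs)
```

Going the other way, from three linked entities back to one flat record, requires the Expression link that many non-library datasets never supply.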
62. How do we relate?
• Bibliontology
• Dublin Core’s “BibliographicResource”
• http://schema.org/Book
• etc.
70. • MARC 6XX = SKOS Concept (or MADS Authority)
• MARC 1XX = DC Agent, FOAF Agent, RDA Agent, etc.
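The tag-to-class correspondence can be sketched as a simple lookup. FOAF Agent is picked here as one of the agent options named above; that choice, and the function name, are assumptions for illustration.

```python
from typing import Optional

SKOS_CONCEPT = "http://www.w3.org/2004/02/skos/core#Concept"
FOAF_AGENT = "http://xmlns.com/foaf/0.1/Agent"   # one of several agent classes


def class_for_tag(tag: str) -> Optional[str]:
    """Map a three-digit MARC tag to an RDF class, or None if unmapped."""
    if len(tag) == 3 and tag.isdigit():
        if tag.startswith("6"):   # 6XX subject fields
            return SKOS_CONCEPT
        if tag.startswith("1"):   # 1XX main entry fields
            return FOAF_AGENT
    return None


for tag in ("100", "650", "245"):
    print(tag, class_for_tag(tag))
```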
71. id.loc.gov
• Everything is a SKOS Concept (or MADS Authority, which entails the same meaning)
• Languages
• Countries
• etc.
72. purl.org/NET/marccodes
• Unofficial modeling of:
• Languages
• Countries
• GACs
• Instruments/Voices
• Audiences
• Form of Items
• Form of Musical Composition
Full disclosure: I maintain this
73. purl.org/NET/marccodes
• Models the “things”
• Languages (http://www.lingvoj.org/ontology#Lingvo)
• Countries (http://purl.org/dc/terms/Location)
• etc.
• Links to dbpedia, geonames, Lexvo/Lingvoj, id.loc.gov
78. DBpedia
• Data very messy
• http://purl.org/NET/marccodes/muscomp/sn
• Data not as important as the identifiers
79. Geonames.org
• Geographic and administrative data
• 8 million+ resources described
• Places of interest
• “near” data
80. Musicbrainz
• One of the more comprehensive open music databases
• Many copies, which to use?
• BBC Music
• DBTune
• zitgist
• dataincubator
• Modeled in Music Ontology
81. New York Times
• People
• Organizations
• Places
• All SKOS Concepts!
• Conflated with the “thing”
82. Open Library
• Works
• Editions (sort of like Manifestations)
• not entirely compatible: creator and language properties
• Authors
• Subjects
83. Bibliontology
• Interested in modeling the citation, not the relationships within the Endeavor
• Extremely easy to model an article, book or journal
• Currently incompatible with FRBR
84. schema.org
• 900 lb. SEO gorilla
• Google, Bing, Yahoo!
• HTML5 microdata
• http://schema.org/Book
• http://schema.org/Article
• etc.
• Dublin Core working on alignment
85. Breaking free from our silos
• Linked data gives us the potential to integrate into the larger web
• reuse of our data = relevance!
• reuse of others’ data
86. Important we don’t exclude ourselves by insisting on incompatible models!
87. Thank you!
Ross Singer
ross.singer@talis.com
http://twitter.com/rsinger
http://dilettantes.code4lib.org/blog