Invited seminar for UIUC's IS 575 class on metadata in theory and practice, about structural metadata practice in RDF/LOD. Touches on OAI-ORE, PCDM, Annotation, IIIF and Linked Art. The challenges explored are graph boundaries, APIs and context-specific metadata.
8. StructuralMetadata
@azaroth42
robert.sanderson@yale.edu
OAI-ORE Background
• Mellon Foundation grant 2006-2008
• Digital Library and Scholarly Communication focus
• Context:
  • Aligning existing work: PMH, METS etc.
  • With the web: URIs, REST, Linked Open Data
  • For interoperability of digital libraries:
    • Scholarly communication
    • Digital objects
    • Research outputs
17. StructuralMetadata
Proxies Considered Harmful
Proxies are a Usability nightmare!
• Now two places to look for all metadata
• Range/Domain inferences are out the window
• Can’t validate an application profile, as proxies are the union of all other classes
• Can’t create a database structure other than triples
But at least they’re optional … no one will use them…
Right???
23. StructuralMetadata
Challenges: Model / Semantics
• Order via Proxies
• Unclear semantics of Collection / Object
  • Both related via hasMember
• Specialization / classification by subclassing
  • Leads to proliferation of classes
  • And greatly reduced interoperability
• Not opinionated enough where it was needed
25. StructuralMetadata
API – Linked Data Platform
• PCDM was developed in the context of the then-new Linked Data Platform Specification: https://www.w3.org/TR/ldp/
• Attempted to provide C/R/U/D specification for LOD
• REST-based (HTTP POST, GET, PUT, DELETE methods)
• Implemented in Fedora4 (and lots of others)
• Suffered from lack of clear vision in W3C WG – ended up trying to meet competing goals
26. StructuralMetadata
API – Linked Data Platform
• Did not solve core challenges, leaving implementations either not interoperable, or not functional
  • Authentication (needed for write operations)
  • Paging of large resources (c.f. downward relationships)
  • Graph boundary conditions
• Did introduce useful notion of “containers”:
  • Writing to a container could create additional triples
  • Containers were resources, configured in triples
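The container idea can be sketched in a few lines. This is a toy in-memory model, not an LDP server: the class name `ToyDirectContainer`, the `post` method and the example.org URIs are hypothetical, and only the `ldp:` and `pcdm:` term names are taken from the published vocabularies.

```python
LDP = "http://www.w3.org/ns/ldp#"
PCDM = "http://pcdm.org/models#"


class ToyDirectContainer:
    """Hypothetical stand-in for an LDP direct container, configured in triples."""

    def __init__(self, uri, membership_resource, has_member_relation):
        self.uri = uri
        self.membership_resource = membership_resource
        self.has_member_relation = has_member_relation
        # The container's own configuration is expressed as triples.
        self.triples = [
            (uri, LDP + "membershipResource", membership_resource),
            (uri, LDP + "hasMemberRelation", has_member_relation),
        ]
        self._count = 0

    def post(self):
        """Simulate an HTTP POST: mint a child URI and add the extra triples."""
        self._count += 1
        child = f"{self.uri}{self._count}"
        # Containment triple: the container holds the new resource.
        self.triples.append((self.uri, LDP + "contains", child))
        # Membership triple: created as a side effect of writing to the container.
        self.triples.append((self.membership_resource, self.has_member_relation, child))
        return child


container = ToyDirectContainer(
    "http://example.org/objects/1/members/",
    "http://example.org/objects/1",
    PCDM + "hasMember",
)
child = container.post()
```

One write produces two new triples, one of which (`hasMember`) has the object rather than the container as its subject. That side effect is what made containers useful for PCDM-style structure.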
40. StructuralMetadata
Annotation API – Opinionated LDP
• Submit / Return all of annotation data, not per subject URI
• Use ActivityStreams paging mechanism
• Allow just URI reference
• Or full representation of Anno
• Use JSON-LD!
Still didn’t solve authentication!
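A rough sketch of the paging choices above, assuming a container that splits its annotations into pages and returns either bare URIs or full descriptions. The function name `paginate`, the `iris_only` flag and the `?page=N` URL pattern are hypothetical; the keys (`AnnotationCollection`, `AnnotationPage`, `first`, `last`, `next`, `items`, `partOf`, `startIndex`, `total`) follow the Web Annotation JSON-LD context.

```python
def paginate(container_uri, annotations, page_size, iris_only=False):
    """Split annotations into pages; return (collection, pages)."""
    pages = []
    for start in range(0, len(annotations), page_size):
        chunk = annotations[start:start + page_size]
        pages.append({
            "id": f"{container_uri}?page={len(pages)}",  # hypothetical URL pattern
            "type": "AnnotationPage",
            "partOf": container_uri,
            "startIndex": start,
            # Either just the URI references, or the full annotation objects.
            "items": [a["id"] for a in chunk] if iris_only else chunk,
        })
    # Link each page to the one that follows it.
    for i, page in enumerate(pages[:-1]):
        page["next"] = pages[i + 1]["id"]
    collection = {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": container_uri,
        "type": "AnnotationCollection",
        "total": len(annotations),
    }
    if pages:
        collection["first"] = pages[0]["id"]
        collection["last"] = pages[-1]["id"]
    return collection, pages


annotations = [
    {"id": f"http://example.org/annos/{i}", "type": "Annotation"} for i in range(5)
]
collection, pages = paginate("http://example.org/annos/", annotations, 2)
_, iri_pages = paginate("http://example.org/annos/", annotations, 2, iris_only=True)
```

The whole annotation travels as one JSON-LD object, so a client never has to reassemble it from per-subject requests.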
47. StructuralMetadata
IIIF Design Principles
1. Scope design through shared use cases
2. Design for international use
3. As simple as possible, but no simpler
4. Make easy things easy, complex things possible
5. Avoid dependency on specific technologies
6. Use REST / Don’t break the web
7. Separate concerns, keep APIs loosely coupled
8. Design for JSON-LD, using LOD principles
9. Follow existing standards, best practices
10. Define success, not failure (for extensibility)
51. StructuralMetadata
Linked Art
Community developed Cultural Heritage descriptive metadata profile, focused on (art) museum use cases and applications.
Progressive Enhancement:
1. Legacy Data – No things, just description
2. Data for Humans – Things, but only with descriptions
3. Data for Machines – Linked, Structured Data
4. Data for Research – Accurate data in sufficient quantity to answer research questions when aggregated
52. StructuralMetadata
Linked Art – Structural Data?
Partitioning & Membership are patterns used throughout:
• Parts of objects (frame is part of painting)
• Parts of places (New Haven is part of CT)
• Parts of events (actor’s particular role in larger event)
• Parts of texts (chapter is part of book)
• Parts of concepts (Watercolor is part of Painting concept)
• Membership in groups (Rob is a member of Yale staff, painting is a member of auction lot set)
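The partitioning and membership patterns above can be walked with one small function. This is a sketch under assumptions: the keys `part_of` and `member_of` are used here as I understand Linked Art's JSON-LD serialization, and the records, URIs and the function name `ancestors` are hypothetical.

```python
# Hypothetical records keyed by URI, each carrying only its structural links.
records = {
    "http://example.org/place/new-haven": {
        "type": "Place",
        "part_of": [{"id": "http://example.org/place/ct", "type": "Place"}],
    },
    "http://example.org/place/ct": {
        "type": "Place",
        "part_of": [{"id": "http://example.org/place/usa", "type": "Place"}],
    },
    "http://example.org/object/painting": {
        "type": "HumanMadeObject",
        "member_of": [{"id": "http://example.org/set/lot-12", "type": "Set"}],
    },
}


def ancestors(records, uri, prop="part_of"):
    """Follow one structural property upwards and return every URI reached."""
    seen, stack = [], [uri]
    while stack:
        current = stack.pop()
        for parent in records.get(current, {}).get(prop, []):
            if parent["id"] not in seen:
                seen.append(parent["id"])
                stack.append(parent["id"])
    return seen


places = ancestors(records, "http://example.org/place/new-haven")
sets = ancestors(records, "http://example.org/object/painting", prop="member_of")
```

Because the same pattern repeats across objects, places, events and texts, a consumer needs only one traversal routine, parameterized by the property.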
53. StructuralMetadata
Conclusions
• Open World vs Local Structure:
  • Order of entities: Just use rdf:List, JSON-LD arrays
  • Context-specific data: Needs a Ph.D. or two please!
• Usability:
  • Graph Boundaries: Don’t Repeat Yourself, API as guide
  • Representation vs Resource: Problem in theory only
  • API Interactions: Get retrieval right, focus on usability
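To make the "just use rdf:List, JSON-LD arrays" point concrete: in JSON-LD, a property declared with `"@container": "@list"` lets the developer write a plain array while the RDF model gets an ordered list. The sketch below expands a Python list into `rdf:first` / `rdf:rest` triples by hand. The function name `list_to_triples`, the blank node labels and the example.org URIs are hypothetical.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# What the developer writes: an ordinary array under a @list-typed property.
document = {
    "@context": {
        "items": {"@id": "http://example.org/items", "@container": "@list", "@type": "@id"}
    },
    "@id": "http://example.org/book/1",
    "items": [
        "http://example.org/page/2",
        "http://example.org/page/1",
        "http://example.org/page/3",
    ],
}


def list_to_triples(items):
    """Expand a list into rdf:first / rdf:rest triples; return (head, triples)."""
    if not items:
        return RDF + "nil", []
    nodes = [f"_:b{i}" for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        # The last cell points at rdf:nil to close the list.
        rest = nodes[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", item))
        triples.append((nodes[i], RDF + "rest", rest))
    return nodes[0], triples


head, triples = list_to_triples(document["items"])
```

Three entries in the array become six triples in the graph. The order is local to this one list, so nothing is asserted about where a page falls in any other context.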
Why? Because structure is about localized relationships, whereas RDF and LOD make an open world assumption. Together with more general issues, that tension makes usability and adoption a challenge.
ReM (Resource Map) – the file that describes the aggregation. Each ReM can describe exactly one aggregation.
Agg (Aggregation) – Set of resources, either digital or conceptual
Agg’d Resource (Aggregated Resource) – Any resource with a URI
Aggregations can be aggregated, and as the only way to get to them is via their resource map, we can add a reference to it.
Now we have a recursive structure, not just a flat list, but one mandated to be in separate representations.
Resources (including aggregations) can be aggregated by many aggregations, which can be added, along with their resource map.
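A minimal sketch of that structure, using plain (subject, predicate, object) tuples rather than an RDF library. The `ore:` term names are from the ORE vocabulary; the helper `resource_map` and every example.org URI are hypothetical.

```python
ORE = "http://www.openarchives.org/ore/terms/"


def resource_map(rem_uri, agg_uri, aggregated):
    """Return the triples one resource map carries for its single aggregation."""
    triples = [(rem_uri, ORE + "describes", agg_uri)]
    for uri in aggregated:
        triples.append((agg_uri, ORE + "aggregates", uri))
    return triples


# Aggregation 1 aggregates a page and a second, nested aggregation.
rem1 = resource_map(
    "http://example.org/rem/1",
    "http://example.org/agg/1",
    ["http://example.org/page/1", "http://example.org/agg/2"],
)

# The nested aggregation is only reachable through its own resource map,
# so the first map adds a pointer to that map.
rem1.append(
    ("http://example.org/agg/2", ORE + "isDescribedBy", "http://example.org/rem/2")
)
```

The contents of the nested aggregation do not appear here. A client has to fetch `rem/2` separately, which is where the recursion across separate representations comes from.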
History has shown that while the ontology is concise … it’s not all that basic!
ORE takes a firm position on the boundary of the graph and how you can retrieve the set of relationships that make up the graph. The aggregated aggregation in the first resource map cannot include its aggregated resources; they can only be in that aggregation’s resource map (resource map 2).
Aggregated resources could point to other aggregations, but aggregations could not. Aggregated aggregations could not point to their aggregated resources but could point to their resource map. Retrieval was also forced by this decision – you requested the ResourceMap by its URI and got the triples that fit within the boundary.
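The boundary rule can be read as a filter over a larger graph. The sketch below is a simplified reading of the rule as described in these notes, not the full ORE specification; the function name `within_boundary` and the example.org URIs are hypothetical.

```python
ORE = "http://www.openarchives.org/ore/terms/"
AGGREGATES = ORE + "aggregates"
AGG1 = "http://example.org/agg/1"
AGG2 = "http://example.org/agg/2"

graph = [
    (AGG1, AGGREGATES, "http://example.org/page/1"),
    (AGG1, AGGREGATES, AGG2),
    (AGG2, AGGREGATES, "http://example.org/page/2"),
    (AGG2, ORE + "isDescribedBy", "http://example.org/rem/2"),
    ("http://example.org/page/1", "http://purl.org/dc/terms/title", "Page 1"),
]


def within_boundary(graph, agg_uri):
    """Keep only the triples allowed in the resource map for agg_uri (simplified)."""
    members = {o for s, p, o in graph if s == agg_uri and p == AGGREGATES}
    kept = []
    for s, p, o in graph:
        if s == agg_uri:
            kept.append((s, p, o))
        elif s in members and p != AGGREGATES:
            # A nested aggregation may point to its own resource map,
            # but may not list its aggregated resources here.
            kept.append((s, p, o))
    return kept


kept = within_boundary(graph, AGG1)
```

The triple saying that `agg/2` aggregates `page/2` is dropped from the first map. The pointer to `rem/2` survives, so the client knows where to look next.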
No official position about Create, Update and Delete.
Introduced the notion of a Proxy – a resource that stood for an aggregated resource in the context of the aggregation. Assertions about the proxy are about the resource, but are only valid in the context of the aggregation.
This gives us a way to specify order, without globally asserting that in all aggregations (or any context) the resource comes before or after another.
And for non-structural metadata as well, such as a title for the resource in the context of the aggregation.
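A sketch of the proxy pattern, again with plain tuples. `ore:proxyFor` and `ore:proxyIn` are ORE terms and `iana:next` is the link relation used for ordering; the helper names, the title and the example.org URIs are hypothetical.

```python
ORE = "http://www.openarchives.org/ore/terms/"
IANA = "http://www.iana.org/assignments/relation/"
DCTERMS_TITLE = "http://purl.org/dc/terms/title"


def proxy_triples(proxy, resource, aggregation, next_proxy=None, title=None):
    """Describe one resource in the context of one aggregation."""
    triples = [
        (proxy, ORE + "proxyFor", resource),
        (proxy, ORE + "proxyIn", aggregation),
    ]
    if next_proxy:
        triples.append((proxy, IANA + "next", next_proxy))
    if title:
        # Context-specific title: attached to the proxy, not the resource.
        triples.append((proxy, DCTERMS_TITLE, title))
    return triples


def ordered_resources(graph, first_proxy):
    """Follow iana:next from first_proxy; return the proxied resources in order."""
    proxy_for = {s: o for s, p, o in graph if p == ORE + "proxyFor"}
    following = {s: o for s, p, o in graph if p == IANA + "next"}
    order, seen, current = [], set(), first_proxy
    while current is not None and current not in seen:
        seen.add(current)
        order.append(proxy_for[current])
        current = following.get(current)
    return order


AGG = "http://example.org/agg/1"
graph = proxy_triples(
    "http://example.org/proxy/1", "http://example.org/page/b", AGG,
    next_proxy="http://example.org/proxy/2", title="Opening image",
) + proxy_triples("http://example.org/proxy/2", "http://example.org/page/a", AGG)

order = ordered_resources(graph, "http://example.org/proxy/1")
```

Page b comes before page a only inside `agg/1`. The cost is also visible: the title sits on the proxy, so a client now has two places to look for metadata about the page.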
In a pure RDF worldview, there’s nothing theoretically wrong with Proxies. They’re a resource, and they can have relationships associated with them. However…
As simple as possible … and then a bit more simple, but we’ll get to that.
(Explain)
Collection and Object are subclasses of ORE Aggregation; hasMember, hasFile and relatedObject are subProperties of aggregates. So this is exactly the same as ORE … just ignoring the resource map requirement.
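The subproperty relationship means every PCDM structural triple implies an ORE one. A small sketch of that inference, assuming the PCDM namespace `http://pcdm.org/models#` and the property names `hasMember`, `hasFile` and `hasRelatedObject` as I recall them from the ontology; the function name and example.org URIs are hypothetical.

```python
ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"
PCDM = "http://pcdm.org/models#"

# Assumed subproperty table: each PCDM property specializes ore:aggregates.
SUBPROPERTY_OF = {
    PCDM + "hasMember": ORE_AGGREGATES,
    PCDM + "hasFile": ORE_AGGREGATES,
    PCDM + "hasRelatedObject": ORE_AGGREGATES,
}


def infer_aggregates(triples):
    """Return the input plus the ore:aggregates triples entailed by subproperties."""
    inferred = [
        (s, SUBPROPERTY_OF[p], o) for s, p, o in triples if p in SUBPROPERTY_OF
    ]
    return triples + inferred


triples = [
    ("http://example.org/obj/1", PCDM + "hasMember", "http://example.org/obj/2"),
    ("http://example.org/obj/2", PCDM + "hasFile", "http://example.org/file/1"),
    ("http://example.org/obj/1", "http://purl.org/dc/terms/title", "Book"),
]
entailed = infer_aggregates(triples)
```

An ORE-aware client sees the same aggregation graph either way. PCDM only adds the distinction between members, files and related objects.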
Opinionated: No ReMs. No descMD on Files. Distinction between a collection and an “object” (never very clear boundaries)
Not everything is ordered, so Proxies (often called Poxies during PCDM implementation) are a sensible choice … theoretically. Mea culpa.
Some further local constraints: Files cannot be ordered, nor related objects, only actual members of collections or objects.
During PCDM, we discussed the schema.org structure as an alternative, where ListItems can be the object of itemListElement (as well as resources generally) in order to assert order.
Didn’t want to /require/ ListItems and didn’t want to have both resources and ListItems as objects of the property (schema.org is very loose!)
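The looseness shows up on the consuming side. In this sketch `itemListElement`, `ListItem`, `position` and `item` are schema.org terms; the reader function `ordered_items` and the example.org URIs are hypothetical, and the tie-breaking choice (positioned items first, then plain ones in array order) is my own assumption.

```python
def ordered_items(item_list):
    """Return item URIs in order, accepting both ListItems and plain resources."""
    positioned, plain = [], []
    for element in item_list["itemListElement"]:
        if isinstance(element, dict) and element.get("@type") == "ListItem":
            # Explicit order: the position lives on the intermediate ListItem.
            positioned.append((element["position"], element["item"]["@id"]))
        elif isinstance(element, dict):
            plain.append(element["@id"])
        else:
            plain.append(element)
    return [uri for _, uri in sorted(positioned)] + plain


with_list_items = {
    "@type": "ItemList",
    "itemListElement": [
        {"@type": "ListItem", "position": 2, "item": {"@id": "http://example.org/page/b"}},
        {"@type": "ListItem", "position": 1, "item": {"@id": "http://example.org/page/a"}},
    ],
}
plain_resources = {
    "@type": "ItemList",
    "itemListElement": [
        {"@id": "http://example.org/page/x"},
        {"@id": "http://example.org/page/y"},
    ],
}

first = ordered_items(with_list_items)
second = ordered_items(plain_resources)
```

Every consumer has to carry both branches, which is the cost of allowing two kinds of object on one property.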
But there was a long history before that.
Motivation, TextualBody, three core components.
Note Selector + Target = Specific Resource. Similar to ListItem / Proxy…
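A sketch of an annotation whose target is a Specific Resource, with a helper that recovers the underlying source either way. The keys follow the Web Annotation JSON-LD context; the body text, the selector value, the example.org URIs and the function name `target_source` are hypothetical.

```python
annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "http://example.org/anno/1",
    "type": "Annotation",
    "motivation": "commenting",
    "body": {
        "type": "TextualBody",
        "value": "The frame is a later addition.",
        "format": "text/plain",
    },
    "target": {
        # Selector + source together make up the Specific Resource.
        "type": "SpecificResource",
        "source": "http://example.org/page/1",
        "selector": {
            "type": "FragmentSelector",
            "conformsTo": "http://www.w3.org/TR/media-frags/",
            "value": "xywh=10,10,100,50",
        },
    },
}


def target_source(anno):
    """Return the URI of the annotated resource, with or without a selector."""
    target = anno["target"]
    if isinstance(target, str):
        return target
    if target.get("type") == "SpecificResource":
        return target["source"]
    return target["id"]


source = target_source(annotation)
whole = target_source({"target": "http://example.org/page/2"})
```

As with a Proxy or a ListItem, the Specific Resource is an intermediate node that stands for the real resource in one context, here the region being commented on.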
Protocol – Also LDP, but opinionated for usability, not theoretical correctness.
LOUD is the application of those design principles to LOD. We can summarize the five stars of LOUD as…
Another way to think about it is … who is the audience for linked data?
Try to learn from the success of usable data, and apply it in a more challenging environment than IIIF. Need to deal with all aspects of metadata, especially structural.