British Library Seminar: Shared Canvas (September 2011)
1. Introduction to SharedCanvas:
Linked Data for Facsimile Display and Annotation
Robert Sanderson
rsanderson@lanl.gov
Los Alamos National Laboratory
Benjamin Albritton
blalbrit@stanford.edu
Stanford University
http://www.shared-canvas.org/
This research is funded, in part, by the
Andrew W. Mellon Foundation
Introduction to SharedCanvas 1
British Library, 7th of September 2011, London, England
2. Overview
• Quick Motivation
• Technology Background:
• RDF and Linked Data
• Object Reuse and Exchange (OAI-ORE)
• Open Annotation (OAC)
• SharedCanvas:
• Requirements
• Model by Example
• Making it Real:
• DMS Tech Group
• Implementations and Demos
3. Motivation
Digital surrogates enable remote research
• Improve preservation of original,
and digital preservation of surrogate
• Promotes collaboration via shared
annotations and descriptions
A collaborative future:
• Rich landscape of interconnected
repositories, with seamless user
interfaces
• Improve efficiency and usability through
open, shared development
BNF f.fr 113, folio 1 recto
4. Requirements
To Realize this Future:
• Need a standardized input format to digital facsimile
presentation systems, to allow interoperability between and
across repositories
Architectural Requirements:
• Ability to model primarily textual items, where the individual
physical instance is an important cultural object
• Alignment of multiple Images, Texts, Commentary and other
Content resources per folio
• The Content, and Services that act upon it, are distributed
between institutions, and around the web
5. Naïve Approach: Transcribe Images Directly
But how to align multiple images, pages without images, fragments… ?!
6. Canvas Paradigm
A Canvas is an empty space in which to build up a display
• HTML5, SVG, PDF, … even PowerPoint!
• Can "paint" many different resources, including text, images and
audio, on to a Canvas
We can use a Canvas to represent a folio of a manuscript.
Distributed nature is fundamental in the requirements
• Painting resources, commentary and collaboration
• Idea: Use Annotations to do all of those
• Annotations can target the Canvas instead of individual Images
7. Annotations to Paint Text/Image to Canvas
8. Technology: RDF and Linked Data
Current technology of choice: XML
• XML files can't be built in a distributed, collaborative way.
• XML's tree structure insufficient
RDF (Resource Description Framework) is a Graph model
• W3C Standard: http://www.w3.org/TR/rdf-primer/
• A single, global graph of interconnected resources
• More Powerful … like the web
• More Complex … like the web
Linked Data is RDF with some constraints
• More web friendly
• Much support from Industry, Academia and Government sectors
• "Semantic Web" done right!
9. Technology: RDF and Linked Data
Primitives:
• Resource: Something of Interest
• Predicate: Typed, directed Relationship
• Literal: Data (string, integer, etc.)
• Triple: ( Resource, Predicate, Literal/Resource )
Resource:
• Can be digital, physical or conceptual
• eg: An image file, an elephant, or "redness"
Predicate:
• Can be Resource to Resource (relationship)
• X isPartOf Y
• Or Resource to Literal (property)
• X title "Froissart's Chronicles"
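The primitives above can be sketched in a few lines of Python. This is only an illustration, not part of the SharedCanvas tooling: the class names, the example.org URIs and the `properties_of` helper are assumptions made for the example, and the Dublin Core terms namespace is used for isPartOf and title.

```python
from typing import NamedTuple, Union


class Literal(NamedTuple):
    """A data value such as a string or an integer."""
    value: Union[str, int, float]


class Triple(NamedTuple):
    """(Resource, Predicate, Literal/Resource); resources and predicates are URI strings."""
    subject: str
    predicate: str
    obj: Union[str, Literal]


# Hypothetical URIs, used only for illustration.
X = "http://example.org/manuscript/folio-1r"
Y = "http://example.org/manuscript"

graph = {
    # Resource to Resource (relationship): X isPartOf Y
    Triple(X, "http://purl.org/dc/terms/isPartOf", Y),
    # Resource to Literal (property): Y title "Froissart's Chronicles"
    Triple(Y, "http://purl.org/dc/terms/title", Literal("Froissart's Chronicles")),
}


def properties_of(resource: str, triples) -> dict:
    """Collect the literal-valued properties attached to one resource."""
    return {
        t.predicate: t.obj.value
        for t in triples
        if t.subject == resource and isinstance(t.obj, Literal)
    }
```

Keeping Literal as its own type makes the relationship/property distinction from the slide explicit in the data.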
10. Technology: RDF Skittles
Circle = Resource, Arrow = Predicate, Oval = Literal, Rectangle = Class
11. Technology: RDF and Linked Data
Namespaces:
• Interoperability comes from reusing Ontologies (namespaces) of
predicates and resources
• eg Dublin Core, Open Annotation, SharedCanvas…
Can define (multiple) Classes for resources
• Person, Image, Annotation, Canvas, …
• Class is just another resource referenced with rdf:type predicate
• X rdf:type Class
All Resources and Predicates are identified by URIs
• Linked Data recommends resolvable HTTP URIs
All statements are globally true, not just within the current document
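Since every resource and predicate is ultimately a URI, a prefixed name such as rdf:type is just shorthand. A minimal sketch of that expansion, assuming a small hand-written prefix table (the `ex` prefix and its URI are hypothetical):

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcterms": "http://purl.org/dc/terms/",
    "ex": "http://example.org/ns/",  # hypothetical namespace for illustration
}


def expand(name: str) -> str:
    """Expand a prefixed name like 'rdf:type' into its full URI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {name!r}")
    return PREFIXES[prefix] + local


# "X rdf:type Class" written out with full URIs
statement = (expand("ex:folio1r"), expand("rdf:type"), expand("ex:Canvas"))
```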
12. Technology: RDF and Linked Data
Serializations:
• XML: ugly (though recommended as default)
• Turtle: much easier to read, but needs special parser
• JSON: many competing formats, no standard yet
XML:
<dms:TranscriptionAnnotation rdf:about="urn:uuid:e7db526a…">
  <oac:hasBody rdf:resource="http://anno.lanl.gov/m804/Line-f1r-37"/>
  <oac:hasTarget
    rdf:resource="http://anno.lanl.gov/m804/View-f1r#xywh=696,1319,565,44"/>
</dms:TranscriptionAnnotation>
Turtle:
<urn:uuid:e7db526a…> a dms:TranscriptionAnnotation ;
    oac:hasBody ex:Line-f1r-37 ;
    oac:hasTarget <http://anno.lanl.gov/m804/View-f1r#xywh=696,1319,565,44> .
13. ORE: Aggregations of Web Resources
http://www.openarchives.org/ore/
Aggregation: An abstract collection of resources, with an identity
Resource Map: A document that describes the Aggregation in RDF
AR-1 and AR-2 can be any web resource
14. ORE: Aggregations
Aggregations may aggregate other Aggregations,
but each must have its own Resource Map
15. ORE: Aggregations
Aggregations do not have a default order for the Aggregated Resources
Order can be imposed by RDF Lists
16. List/Aggregations
• How do those 'next' links
actually work using an
rdf:List?
• Verbose in full, but
serializations have shortcuts
to make this less ugly!
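One way to picture the 'next' links: an rdf:List is a chain of nodes, each with an rdf:first (the item) and an rdf:rest (the next node), ending in rdf:nil. The sketch below builds and walks such a chain using plain tuples; the blank node labels and the canvas URIs in the usage note are made up for illustration.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NIL = RDF + "nil"


def to_rdf_list(items):
    """Encode items as an rdf:List; return (head node, list of triples)."""
    triples = []
    head = NIL
    # Build from the last item backwards so each node can point at its successor.
    for i in range(len(items) - 1, -1, -1):
        node = f"_:list{i}"  # illustrative blank node label
        triples.append((node, RDF + "first", items[i]))
        triples.append((node, RDF + "rest", head))
        head = node
    return head, triples


def from_rdf_list(head, triples):
    """Follow rdf:rest links from head until rdf:nil, collecting rdf:first values."""
    first = {s: o for s, p, o in triples if p == RDF + "first"}
    rest = {s: o for s, p, o in triples if p == RDF + "rest"}
    items = []
    node = head
    while node != NIL:
        items.append(first[node])
        node = rest[node]
    return items
```

For example, `to_rdf_list(["http://example.org/canvas/f1r", "http://example.org/canvas/f1v"])` yields four triples, and `from_rdf_list` on the result returns the two URIs in their original order.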
17. Technology: Open Annotation
• http://www.openannotation.org/
• Focus on interoperable sharing of annotations
• Web-centric and open, not locked down silos
• Create, consume and interact in different environments
• Build from a simple model for simple cases,
to more detailed for complex scholarly annotation requirements
• Status: Beta, with 9 ongoing funded experiments to inform 1.0
• Hardest part: Define what an Annotation is!
• "Aboutness" is key to distinguish from general metadata
A document that describes how one resource is about
one or more other resources, or part thereof.
18. Basic Model
The basic model has three resources:
• Annotation (an RDF document)
• Default: RDF/XML but others via Content Negotiation
• Body (the ‘comment’ of the annotation)
• Target (the resource the Body is ‘about’)
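As a rough sketch, the three resources of the basic model reduce to three triples. The namespace URI below is an assumption about the OAC beta vocabulary, and the example.org URIs are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OAC = "http://www.openannotation.org/ns/"  # assumed namespace URI for the OAC beta terms


@dataclass
class Annotation:
    uri: str     # the Annotation document
    body: str    # the 'comment' of the annotation
    target: str  # the resource the Body is 'about'

    def triples(self) -> List[Tuple[str, str, str]]:
        """Express the annotation as (subject, predicate, object) triples."""
        return [
            (self.uri, RDF_TYPE, OAC + "Annotation"),
            (self.uri, OAC + "hasBody", self.body),
            (self.uri, OAC + "hasTarget", self.target),
        ]


# Hypothetical URIs for illustration only.
anno = Annotation(
    uri="http://example.org/anno/1",
    body="http://example.org/comments/1.txt",
    target="http://example.org/images/f1r.jpg",
)
```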
19. Basic Model Example
20. Additional Relationships and Properties
Any of the resources can have additional information attached,
such as creator, date of creation, title, etc.
21. Additional Properties Example
22. Annotation Types
There can be further types of Annotation, such as a Reply.
Example: Replies are Annotations on Annotations.
23. Annotation Types Example
24. Inline Information
It is important to be able to have content contained within the
Annotation document for Client Autonomy:
• Clients may be unable to mint new URIs for every resource
• Clients may wish to transmit only a single document
• Third parties can generate new URIs if the client does not
The W3C has a Content in RDF specification:
• http://www.w3.org/TR/Content-in-RDF10/
25. Inline Information: Body
• We introduce a resource identified by a non resolvable URI, such
as a UUID URN, as the Body.
• We then embed the data within the Annotation document using
the 'chars' property from the Content in RDF ontology.
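A minimal sketch of the two steps above: mint a UUID URN for the Body, then attach the text with the 'chars' property. The Content in RDF namespace and its ContentAsText class come from the W3C draft; the `inline_body` helper and the tuple representation (which does not distinguish literals from URIs) are simplifications for illustration.

```python
import uuid

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
CNT = "http://www.w3.org/2011/content#"  # Content in RDF namespace


def inline_body(text: str):
    """Mint a non-resolvable UUID URN for the Body and embed its text via cnt:chars."""
    body = uuid.uuid4().urn  # a string of the form 'urn:uuid:...'
    triples = [
        (body, RDF_TYPE, CNT + "ContentAsText"),
        (body, CNT + "chars", text),  # the object here is a literal, not a URI
    ]
    return body, triples
```

The client needs no server to mint the identifier, which is the Client Autonomy point: everything travels in the single Annotation document.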
26. Inline Body Example
27. Multiple Targets
There are many use cases for multiple targets for an Annotation:
• Comparison of two or more resources
• Making a statement that applies to all of the resources
• Making a statement about multiple parts of a resource
The OAC Data Model allows for multiple targets by simply having
more than one hasTarget relationship.
28. Multiple Targets Example
29. Segments of Resources
Most annotations are about part of a resource
Different segments for different media types:
• Text: paragraph, arbitrary span of words
• Image: rectangular or arbitrary shaped area
• Audio: start and end time points, track name/number
• Video: area and time points
• Other: slice of a data set, volume in a 3d object, …
30. Segments of Resources
Web Architecture Segmentation:
• A URI with a Fragment identifies part of the resource
• Media-specific fragment identifiers; eg XPointer for XML
• W3C Media Fragments URI specification for simple
segments of media: http://www.w3.org/TR/media-frags/
We introduce a method of constraining resources:
• Introduce an approach for arbitrarily complex segments that
cannot be expressed using Fragments
• Can be applied to Body or Target resource
31. Segments of Resources: Fragment URIs
URI Fragments are a syntax for creating subsidiary URIs that
identify part of the main resource
The syntax is defined per media type
• X/HTML: The named anchor or identified element
• http://www.example.net/foo.html#namedSection
• XML: An XPointer to the element(s)
• http://www.example.net/foo.xml#xpointer(/a/b/c)
• PDF: Many options, most relevant two operations:
• http://www.example.net/foo.pdf#page=2&viewrect=20,80,50,60
• Plain Text: Either by character position or line position:
• http://www.example.net/foo.txt#char=0,10
• http://www.example.net/foo.txt#line=1,5
32. Segments of Resources: Media Fragments
Media Fragments allow anyone to create URIs that identify part of
an image, audio or video resource.
The most common case is for rectangular areas of images:
• http://www.example.org/image.jpg#xywh=50,100,640,480
Link to the full resource as well, for all Fragment URIs
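A client that receives such a URI has to split it back into the full resource and the rectangle. The sketch below handles only the simple pixel form of xywh with a single fragment dimension; percent values and combined fragments (such as a time range plus a region) are deliberately left out, and the function name is our own.

```python
from typing import Optional, Tuple
from urllib.parse import urldefrag


def parse_xywh(uri: str) -> Tuple[str, Optional[Tuple[int, int, int, int]]]:
    """Split a Media Fragment URI into (full resource URI, (x, y, w, h) or None)."""
    base, fragment = urldefrag(uri)
    if not fragment.startswith("xywh="):
        return base, None
    value = fragment[len("xywh="):]
    # The specification allows an explicit 'pixel:' prefix; pixels are also the default.
    if value.startswith("pixel:"):
        value = value[len("pixel:"):]
    x, y, w, h = (int(part) for part in value.split(","))
    return base, (x, y, w, h)
```

For the example above, `parse_xywh("http://www.example.org/image.jpg#xywh=50,100,640,480")` gives the image URI and the tuple (50, 100, 640, 480).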
33. Media Fragments Example
34. Complex Constraints
Fragment URIs are not always possible
• Introduce a Constraint that describes the segment of interest
• And a ConstrainedTarget that identifies the segment of interest
• Constraints are entire resources, so can be more expressive
• Constraints may also describe 'contextual' information
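A sketch of how the pieces might link together: the ConstrainedTarget node points at the full resource and at the Constraint that describes the segment. The property names `constrains` and `constrainedBy` and the namespace URI are assumptions about the OAC beta vocabulary, and the URIs in the usage note are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OAC = "http://www.openannotation.org/ns/"  # assumed namespace URI for the OAC beta terms


def constrained_target(node: str, full_resource: str, constraint: str):
    """Triples saying: node is the part of full_resource described by constraint."""
    return [
        (node, RDF_TYPE, OAC + "ConstrainedTarget"),
        (node, OAC + "constrains", full_resource),     # assumed property name
        (node, OAC + "constrainedBy", constraint),     # assumed property name
    ]


def resolve(node: str, triples):
    """Return (full resource, constraint) for a ConstrainedTarget node, or Nones."""
    found = {p: o for s, p, o in triples if s == node}
    return found.get(OAC + "constrains"), found.get(OAC + "constrainedBy")
```

For instance, a non-rectangular region of an image could use an SVG document as the Constraint: `constrained_target("urn:uuid:ct-1", "http://example.org/images/f1r.jpg", "http://example.org/regions/heart.svg")`.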
35. Constraint Example
36. RDF Constraints
Instead of having the information in an external document, it could be
within the RDF of the Annotation document.
• We can attach information
to the Constraint node
• Or use the Content in RDF
specification to include what
would have been in the
external document
37. RDF Constraint Example
38. Constrained Body
The Body may also be constrained in the same way as Targets
39. Annotation Protocols
Unlike previous systems, Open
Annotation does not mandate a
protocol.
No reliance on a client/server
combination gives the client
autonomy.
Instead we promote a publish/
subscribe methodology, where
annotations may be stored and
consumed from anywhere.
Protocol: publish, subscribe, consume linked
40. Publish/Subscribe Method
publish subscribe consume
43. Other Open Annotation Topics
Some other aspects of Open Annotation:
• Dealing with resources that change over time
• http://arxiv.org/abs/1003.2643
• http://www.slideshare.net/azaroth42/
making-web-annotations-persistent-over-time
• Precedence when using multiple Constraints:
• http://www.openannotation.org/spec/beta/precedence.html
• Machine Annotations, when the body is structured data intended
for machine consumption
• In the beta spec directly:
http://www.openannotation.org/spec/beta/#DM_Structured
44. BREAK
(Funny?) (Medieval) Picture of a Cat from the Web!
http://romantoes.blogspot.com/2009/05/medievalist-cat-came-back.html
45. Motivating Questions
Many implicit assumptions:
• What is a Manuscript?
• What is its relation to a facsimile?
• What is the relation of a transcription
of a facsimile to the original object?
What does this mean for digital tools?
• How do we rethink digital facsimiles in a
shared, distributed, global space?
• How do we enable collaboration and
encourage engagement?
Ms MurF: 10.5076/e-codices-kba-0003
46. Motivation
Digital surrogates enable remote research
• Improve preservation of original,
and digital preservation of surrogate
• Promotes collaboration via shared
annotations and descriptions
A collaborative future:
• Rich landscape of interconnected
repositories, with seamless user
interfaces
• Improve efficiency and usability through
open, shared development
BNF f.fr 113, folio 1 recto
47. Baseline Requirements
To Realize this Future:
• Need a standardized input format to digital facsimile
presentation systems, to allow interoperability between and
across repositories
Architectural Requirements:
• Ability to model primarily textual items, where the individual
physical instance is an important cultural object
• Alignment of multiple Images, Texts, Commentary and other
Content resources per folio
• The Content, and Services that act upon it, are distributed
between institutions, and around the web
48. Domain Requirements
Working at physical item level
provides unique challenges!
1. Only parts of pages may be
digitized
• Only illuminations digitized
• Fragments of pages
• Multiple fragments per
image
Cod. Sang. 1394: 10.5076/e-codices-csg-1394
49. Domain Requirements
2. Page may not be digitized at all
• Not "interesting" enough
• Digitization destructive
• Page no longer exists
• Page only hypothetical
"This page intentionally, but unfortunately, left blank"
50. Domain Requirements
3. Non-rectangular pages
• Fashionable heart shaped
manuscripts
• Fragments
• Pages with foldouts
Facsimile of BNF Rothschild 2973
http://www.omifacsimiles.com/brochures/montchen.html
51. Domain Requirements
4. Alignment of multiple
images of same object
• Multi-spectral imaging
• Multiple resolutions
• Image tiling
• Microfilm vs photograph
• Multiple digitizations
Archimedes Palimpsest Multi-Spectral Images
http://www.archimedespalimpsest.org/
52. Domain Requirements
5. Multiple page orders over time
• Rebinding
• Scholarly disagreement on
reconstruction
6. Different pages of the manuscript
held by different institutions
Cod Sang 730: 10.5076/e-codices-csg-0730a
53. Domain Requirements
7. Transcription of:
• Text
• Music
• Musical Notation
• Performance
• Diagrams
Reusing existing resources, such
as TEI, where possible
8. Transcriptions both created and
stored in a distributed way, with
competing versions
Parker CCC 008, f1r
54. Naïve Approach: Transcribe Images Directly
But how to align multiple images, pages without images, fragments… ?!
55. Canvas Paradigm
A Canvas is an empty space in which to build up a display
• HTML5, SVG, PDF, … even PowerPoint!
• Can "paint" many different resources, including text, images and
audio, on to a Canvas
We can use a Canvas to represent a folio of a manuscript.
Distributed nature is fundamental in the requirements
• Painting resources, commentary and collaboration
• Idea: Use Annotations to do all of those
• Annotations can target the Canvas instead of individual Images
56. Canvas to Page Relationship
The Canvas's top left and bottom right corners correspond to the
corners of a rectangular box around the folio
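Because the Canvas has its own coordinate space, a region given in Canvas units has to be rescaled for each image painted onto it. A small sketch, assuming the image covers the whole Canvas (an image covering only part of the Canvas would also need an offset); the function name and sizes are illustrative.

```python
from typing import Tuple

Rect = Tuple[int, int, int, int]  # x, y, width, height


def canvas_to_image(region: Rect, canvas_size: Tuple[int, int],
                    image_size: Tuple[int, int]) -> Rect:
    """Map a rectangle in Canvas coordinates to pixel coordinates of a full-canvas image."""
    x, y, w, h = region
    sx = image_size[0] / canvas_size[0]  # horizontal scale factor
    sy = image_size[1] / canvas_size[1]  # vertical scale factor
    return (round(x * sx), round(y * sy), round(w * sx), round(h * sy))
```

The same Canvas region can then be located in a thumbnail, a high-resolution scan, or a multi-spectral image simply by passing a different `image_size`.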
57. OAC Annotations to Paint Images
We can paint the canvas by annotating it with resources.
58. OAC Annotations to Paint Text
59. Transcription: Morgan 804
60. Transcription: Morgan 804
61. Fragments: Cod Sang 1394
62. Musical Manuscripts: Parker CCC 008
63. Missing Pages: Parker CCC 286
64. Repeated Zones: Frauenfeld Y 112
66. Rebinding: BNF f.fr. 113-116
67. Discovery: Aggregations
Those Annotations could be anywhere on the web!
• Need to be able to discover them!
Introduce a discovery layer of sets of Annotations.
• Currently by type of Annotation, and then by Folio
eg: All ImageAnnotations, All text annotations for f1r
• Other divisions possible, just for discovery!
Need a meta discovery layer to find the lists!
• Introduce a "Manifest" resource:
• List of all of the resources known for the facsimile
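The two discovery layers can be pictured as plain data: Annotations grouped into lists by type and folio, and a Manifest that points at every list. The sketch below uses ordinary dictionaries rather than ORE Aggregations, and all field names, type labels and URIs are hypothetical.

```python
from collections import defaultdict


def group_annotations(annotations):
    """Group (uri, annotation type, folio) records into lists keyed by (type, folio)."""
    lists = defaultdict(list)
    for uri, anno_type, folio in annotations:
        lists[(anno_type, folio)].append(uri)
    return dict(lists)


def build_manifest(facsimile: str, lists: dict) -> dict:
    """The meta discovery layer: one entry per annotation list known for the facsimile."""
    return {
        "facsimile": facsimile,
        "annotation_lists": [
            {"type": anno_type, "folio": folio, "annotations": uris}
            for (anno_type, folio), uris in sorted(lists.items())
        ],
    }


# Hypothetical records for illustration.
records = [
    ("http://example.org/anno/img-1", "ImageAnnotation", "f1r"),
    ("http://example.org/anno/txt-1", "TextAnnotation", "f1r"),
    ("http://example.org/anno/txt-2", "TextAnnotation", "f1r"),
]
manifest = build_manifest("http://example.org/facsimile/m804", group_annotations(records))
```

A client would fetch the Manifest first, then only the lists it needs, for example all text annotations for f1r.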
68. SharedCanvas: Data Model
69. Digital Manuscript Interoperability for
Tools and Repositories
Overview:
Andrew W. Mellon Foundation funded numerous manuscript
digitization projects over several decades
All had in common:
Inability to share data across silos to satisfy scholarly use
Inability to leverage existing infrastructure
No sustainability model for data or access
Goal:
Interoperability between repositories and tools
70. Defining Interoperability
• Break down silos
• Separate data from
applications
• Share data models and
programming interfaces
• Enable interactions at the
tool and repository level
71. Designing Modular Repositories and Tools
3rd-Party Tools: Transcription, Annotation, Image Analysis, Image Viewer, Discovery, Tool X?
Repository User Interface: Image Viewer, Discovery
Repository: Metadata (Canonical), Image Data (Canonical)
72. Designing Modular Repositories and Tools
3rd-Party Tools: Annotation, Transcription, Image Analysis, Image Viewer, Discovery, Tool X?
Repository User Interface: Image Viewer, Discovery
Repository: Metadata (Canonical), Image Data (Canonical)
73. Designing Modular Repositories and Tools
Transcription, Annotation, Image Analysis, Image Viewer, Discovery, Tool X?
Image Viewer, Discovery
Metadata (Canonical), Image Data (Canonical)
74. Service-based Discovery and Delivery Interactions
• Four primitives currently supported:
o Discovery
- New Name?
- http://dms-dev.stanford.edu/
o Image Viewing
- Independent zpr viewer
o Annotation
- Digital Mappaemundi
o Transcription
- T-PEN
75. Rendering Implementation
Rendering:
• Design considerations:
• Easy to reuse and extend, no* server side code
• Consume model directly from RDF
• Use existing, well-understood, documented libraries
• Pure JavaScript (Rob)
• jQuery
• RDF extension for jQuery
• Audio Player extension
• iOS Touch support extension
• RaphaelJS for SVG (jQuery SVG not as easy, common)
* Except one minimal reflection script to avoid XSS/CORS issues
76. Rendering Implementation
Process:
• Fetch Manifest, Sequence, plus Lists of Annotations, via AJAX
• Populate menus from Manifest and Sequence
• Fetch any further resources needed, (TEI and SVG)
• Generate one or more canvases based on browser size
• Turn Annotation RDF/XML or n3 into JSON object for ease
• Process XPointer, Media Fragments into local structures
• Render annotations using HTML, or SVG if required, once all
needed resources have been obtained
• Retrieve commentary annotations, both public (pastebin) and
personal (blogger), and render
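The step of turning Annotation RDF into a JSON object can be sketched as grouping triples by subject and then by predicate. The demos do this in JavaScript in the browser; the Python below only illustrates the shape of the resulting object, and the annotation URI and the OAC namespace URI are assumptions.

```python
import json
from collections import defaultdict

OAC = "http://www.openannotation.org/ns/"  # assumed namespace URI for the OAC beta terms


def triples_to_json(triples) -> str:
    """Group (subject, predicate, object) triples into {subject: {predicate: [objects]}}."""
    subjects = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        subjects[s][p].append(o)
    return json.dumps(subjects, indent=2, sort_keys=True)


# Hypothetical annotation URI; body and target follow the Morgan 804 example.
triples = [
    ("urn:uuid:example-anno", OAC + "hasBody", "http://anno.lanl.gov/m804/Line-f1r-37"),
    ("urn:uuid:example-anno", OAC + "hasTarget",
     "http://anno.lanl.gov/m804/View-f1r#xywh=696,1319,565,44"),
]
document = triples_to_json(triples)
```

Once in this form, a renderer can look up an annotation's targets with ordinary object access instead of graph queries.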
77. Rendering Implementation
Demos!
• Morgan 804 (transcription as string, detail images)
• http://www.shared-canvas.org/impl/demo1/
• Worlde's Blisce (audio, TEI transcription)
• http://www.shared-canvas.org/impl/demo2/
• Selected Walters Museum Manuscripts (ranges, pan/zoom)
• http://www.shared-canvas.org/impl/demo4/
• Archimedes Palimpsest (multi images, rotation, TEI transcription)
• http://www.shared-canvas.org/impl/demo5/
78. Future Work
• Refine model based on community feedback, please!
• Improve implementations:
• Ease of creation for new canvases and sequences
• Improve User Interfaces (integrate zoom/pan, persistence)
• High end technical aspects (zones)
• Annotation filtering (spam will be an issue)
• Increase the community and adoption!
• Non Manuscript Use Cases:
• Scientific Papers, Theses/Dissertations
• http://www.shared-canvas.org/impl/demo3/ & …/demo3b/
• Digitized Newspapers
• …
79. Summary
Distributed Canvas paradigm provides a coherent solution to modeling
the layout of medieval manuscripts
• Annotation, and Collaboration, at the heart of the model
• Distribution across repositories for images, text, commentary
• Granular accuracy, from full resource to non-rectangular segment
• Multiple page orders and Discovery via Aggregations
SharedCanvas brings the humanist's primary research objects
to their desktop in a powerful, extensible and interoperable fashion
80. Thank You
Robert Sanderson
rsanderson@lanl.gov
azaroth42@gmail.com
@azaroth42
Ben Albritton
blalbrit@stanford.edu
Web: http://www.shared-canvas.org/
Paper: http://arxiv.org/abs/1104.2925
Slides: http://slidesha.re/XXXXX
Acknowledgements
DMSTech Group: http://dmstech.group.stanford.edu/
Open Annotation Collaboration: http://www.openannotation.org/