An overview of work currently being done by the Digital Manuscript Technology group. This presentation was given to the 2013 CLIR fellows in medieval data curation, and is a synthesis of earlier presentations, some of which were co-authored with Robert Sanderson.
Digital Medieval Data Curation
1. Digital Medieval Data Curation
CLIR Postdoctoral Fellowship Seminar
Bryn Mawr, 2013
Benjamin Albritton, Stanford University Libraries
blalbrit@stanford.edu
@bla222
2. Current State: A World of Silos
• Roman de la Rose
• Parker on the Web
• e-codices
• And so on…
3. Data Interoperability
• Break down silos
• Separate data from applications
• Share data models and programming interfaces
• Enable interactions at the tool and repository level
4. Designing Modular Repositories and Tools
[Diagram: repositories holding canonical image data and canonical non-image data, with separate tools (image viewer, discovery, annotation, transcription, image analysis, “Tool X?”), user interfaces, and 3rd-party tools]
8. Multiple Data Sources
• Existing structured data (catalogs)
• User-added
– Comments
– Transcriptions
– Etc.
• Digital images
• Machine processing
9. Motivating Questions
What does this mean for medieval data?
• How do we rethink medieval object data in a shared, distributed, global space?
• How do we enable collaboration and encourage engagement?
• How do we deal with tools that are producing new data on digital surrogates that are implicitly about a real-world object?
11. Naïve Approach: Attach Transcription to Image
One problem example: Multiple Representations
CCC 26 f. iiiR
12. Naïve Approach: Attach Transcription to Image
One problem example: Multiple Representations
CCC 26 f. iiiR; Fold A Open
13. Naïve Approach: Attach Transcription to Image
One problem example: Multiple Representations
CCC 26 f. iiiR; Fold A Open; Fold A and B Open
14. Naïve Approach: Attach Transcription to Image
One problem example: Multiple Representations
CCC 26 f. iiiR; Fold A Open; Fold A and B Open; f. iiiV
15. The Shared Canvas
• Represents a real-world thing we want to “talk” about
• Has a unique name
– http://dms-data.stanford.edu/Parker/CCC026/canvas-12
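A minimal sketch of how a canvas can address the multiple-representations problem, assuming a simple in-memory model. The class and field names (`Canvas`, `Annotation`, `body`, `target`, `motivation`), the motivation labels, and the image identifiers are hypothetical illustrations, not the Shared Canvas data model's actual serialization. Only the canvas URI comes from the slide, and the sketch assumes for illustration that several photographs depict the page that this canvas names.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """Links a body (an image, a transcription, a comment) to a target URI."""
    body: str
    target: str
    motivation: str  # hypothetical labels: "painting" for images, "transcribing" for text


@dataclass
class Canvas:
    """Stands in for the real-world page; identified only by its URI."""
    uri: str
    annotations: list = field(default_factory=list)

    def annotate(self, body, motivation):
        # Every annotation targets the canvas itself, never a particular image.
        self.annotations.append(Annotation(body, self.uri, motivation))

    def bodies(self, motivation):
        return [a.body for a in self.annotations if a.motivation == motivation]


canvas = Canvas("http://dms-data.stanford.edu/Parker/CCC026/canvas-12")

# Several photographs of the same page (placeholder identifiers, not real URLs).
for image in ["image-folds-closed", "image-fold-a-open", "image-folds-a-b-open"]:
    canvas.annotate(image, "painting")

# The transcription targets the canvas, so it is stored once,
# no matter how many images represent the page.
canvas.annotate("(transcribed text goes here)", "transcribing")

images = canvas.bodies("painting")
transcriptions = canvas.bodies("transcribing")
```

Because both the images and the transcription point at the canvas URI, adding or replacing a photograph does not require moving or duplicating the transcription.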
36. User-created Structured Data
Beinecke MS 310, f. 1r
• Each row = 1 day (January 1, here)
• Lists the feast of the Circumcision
• Optionally provides additional information
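One way to picture such a row as structured data, sketched under assumptions: the `CalendarEntry` type, its field names, and the `on_date` helper are hypothetical, since the slide does not show the schema actually used. The only values taken from the slide are the shelfmark, the folio, the date, and the feast.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CalendarEntry:
    manuscript: str              # shelfmark, e.g. "Beinecke MS 310"
    folio: str                   # e.g. "1r"
    month: int
    day: int
    feast: str
    note: Optional[str] = None   # the optional additional information


entries = [
    CalendarEntry("Beinecke MS 310", "1r", 1, 1, "Circumcision"),
]


def on_date(rows, month, day):
    """Return the entries recorded for a given calendar day."""
    return [r for r in rows if (r.month, r.day) == (month, day)]


jan_first = on_date(entries, 1, 1)
```

Keeping the date, feast, and manuscript as separate fields is what would let a tool compare calendars across manuscripts instead of treating each row as free text.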
46. A Sea of Manuscript Data
• Thousands of manuscripts currently available interoperably, with more coming rapidly
• Discovery data is a mixed bag
• Tools provide data back into the system that can be re-used
• New data drives new discovery, new interfaces, and new visualization challenges
• Management and manipulation of that “wild” data is a serious challenge
Editor's Notes
Allows filtering by date, item, and manuscript, as well as search across the items