Why do we digitise? 20 reasons in 20 pictures
1. Why do we digitise?
20 reasons in 20 pictures
Dr. Mia Ridge, @mia_out
Digital Curator, British Library
digitalresearch@bl.uk @BL_DigiSchol
Europeana Network Association AGM 2016
Riga, November 2016
2. Why do we digitise?
TL;DR: access to our shared
heritages matters
Digitisation supports education,
engagement, research at huge scale
and with computational power.
3. A splendid assortment of Gceloag
and West of England. Tweed ; also
Black Doeakin Woollen Cloths
alwaya on hand. Snit made to
order in six hoars' notice, on most
reaainable terms. Mr. M'Mohon,
Cutter.
Mysteries of Melbourne life
by Cameron, Donald, 1848?-1888.
Published 1873
Usage Public Domain Mark 1.0
Topics Australia -- Fiction
18. Digitised sources + computational
methods = digital scholarship
Dr. Katrina Navickas and @BL_Labs, Political Meetings Mapper
19. Efficiency is under-rated
"I was able to do in minutes with Python
code what I'd spent the last ten years
trying to do by hand!"
-Dr. Katrina Navickas, BL Labs Winner
2015
21. It's easier to see patterns
'Distant reading has utterly transformed my view of literary
history. ...as we slice libraries in new ways we keep
stumbling over long, century-spanning trends that have
little relationship to the stories of movements and periods
we used to tell. We can see genres differentiating from
each other gradually. We can see assumptions about
gender gradually shifting. We've learned that the literary
standards defining a prestigious style change very slowly. It
doesn't happen in a generation; it takes centuries. ...it is
clear now that these methods can turn up important
patterns that we couldn't see before, and that's what I'm
loving about this.'
- The Digital in the Humanities: An Interview with Ted Underwood
From pragmatic, to direct access, to indirect and more abstract effects of digitisation.
The 'too long, didn't read' version is that we digitise and publish collections online for re-use because access to our shared cultural heritage matters. Digitisation supports education, entertainment, scholarship at scale.
What kinds of data are we talking about? At the very least, we provide photographs of pages, which can then be transcribed as text. We can then offer collections of metadata, text and images, for reading individually or for mining as a dataset. A shift from reading pages to reading a dataset enables entirely new research questions.
Image, data. https://archive.org/details/MysteriesOfMelbourneLife
Digitised catalogue data is great; direct access to catalogue contents is even better. Digitisation is a key part of the everyday business of GLAMs.
Why should access to our collective cultural and scientific heritage be limited to those who happen to be nearby, or who can afford to travel to see it? And why should it be limited to the opening hours of an organisation? (The BL reading rooms aren't open on a Sunday)
Following Dan Cohen's mention of hedgehogs yesterday... If you're a medieval scholar you might know the story that hedgehogs shake grape vines then 'trundle off back to their burrows, carrying the grapes on their spines, as a meal for their young', or the deeper moral about the devil, but you can delight in the image without that knowledge.
This is a screenshot from a video clip made by a group from Malaysia using images from 19thC books the BL put on Flickr. They didn't need to ask us, but they were kind enough to email us afterwards.
Lots of scholarship with digital collections ends up in traditional outputs, like monographs or articles. It can be incredibly difficult to track these uses, particularly if people cite the original and not the digital surrogate they actually used.
I could literally talk for hours about the opportunities for deeper engagement and mutual rewards through crowdsourcing tasks like transcription and tagging.
Projects like Pelagios allow placenames in documents and maps to be annotated with linked open data identifiers. These identifiers mean the items are more easily linked to other collection items that mention the same places.
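As a rough illustration of why shared identifiers help, here is a minimal Python sketch. It is not Pelagios code: the item ids, place names and example.org URIs are all made-up placeholders, and the only idea it shows is that items annotated with the same identifier can be found together even when the place name is spelled differently.

```python
from collections import defaultdict

# Hypothetical annotations: (item id, place name as written, identifier URI).
# The URIs are placeholders, not real gazetteer identifiers.
annotations = [
    ("map-001", "Londinium", "https://example.org/place/1"),
    ("letter-042", "London", "https://example.org/place/1"),
    ("map-001", "Eboracum", "https://example.org/place/2"),
    ("book-007", "York", "https://example.org/place/2"),
    ("book-007", "Lundenwic", "https://example.org/place/1"),
]


def items_by_place(annotations):
    """Group item ids under the place identifier they were annotated with."""
    index = defaultdict(set)
    for item_id, _name, uri in annotations:
        index[uri].add(item_id)
    return index


def related_items(item_id, annotations):
    """Return the other items that share at least one place identifier."""
    related = set()
    for items in items_by_place(annotations).values():
        if item_id in items:
            related |= items
    related.discard(item_id)
    return sorted(related)


print(related_items("letter-042", annotations))
```

The letter only ever says 'London', but because its annotation carries the same identifier as 'Londinium' and 'Lundenwic', it is linked to the map and the book without any string matching on names.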
When people can access and re-use digitised content, they can do amazing things with it without having to spend three years and lots of money on lawyers.
New ways of processing images as data and texture: the library could never have applied the technologies that code artist Mario Klingemann brought to 19th-century images. His exploration of the images resulted in new ways of seeing collections at scale.
His web site is: http://mario-klingemann.tumblr.com/
And http://incubator.quasimondo.com/
'Her Hat Was In The Ring, which shares information about the thousands of women who ran for public office in the U.S. before women had universal suffrage.'
'Only a minuscule portion of the primary sources in this country have been digitized and made available in an easy way for the public to explore. Researchers for this project compiled information about thousands of women mainly from digital primary sources. Had these sources not been available and discoverable, we would likely never know the stories of these trailblazing women and we certainly would not understand the broader narrative'
Search! Any word a keyword!
At some point, scale (over time, space, breadth, topic, number of sources queried) becomes transformative. When you have an entire corpus of digitised texts, you can ask new questions. Tools like Bookworm, when combined with a large corpus of text, allow scholars and the curious to explore change over time.
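The kind of 'change over time' question this enables can be sketched in a few lines of Python. This is not how Bookworm itself is implemented; the four-sentence corpus, the tokeniser and the function names are invented for illustration. It simply counts how often a word appears per year, relative to all the words published that year.

```python
import re
from collections import Counter

# Hypothetical mini-corpus of (year, text) pairs standing in for digitised books.
corpus = [
    (1850, "The telegraph office was busy. The telegraph is new."),
    (1850, "A letter arrived by coach."),
    (1900, "The telephone rang and the telegraph was quiet."),
    (1900, "She answered the telephone at once."),
]


def tokenize(text):
    """Very rough tokeniser: lower-case runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def relative_frequency_by_year(corpus, word):
    """Share of all tokens in each year that are the given word."""
    hits = Counter()
    totals = Counter()
    for year, text in corpus:
        tokens = tokenize(text)
        totals[year] += len(tokens)
        hits[year] += tokens.count(word.lower())
    return {year: hits[year] / totals[year] for year in sorted(totals)}


for word in ("telegraph", "telephone"):
    print(word, relative_frequency_by_year(corpus, word))
```

Dividing by the yearly total matters: without it, years with more digitised text would look like years in which every word became more popular.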
Moving on to more complex, abstract forms of access
Effort shifting from people to computers means time that would be spent on manual or logistical tasks is freed up. This free time can change careers.
The Biodiversity Heritage Library improves research methodology by collaboratively making biodiversity literature openly available to the world as part of a global biodiversity community.
'These collections are of exceptional value because the domain of systematic biology depends, more than any other science, upon historic literature. Yet, this wealth of knowledge is available only to those few who can gain direct access to significant library collections. Literature about the biota existing in developing countries is often not available within their own borders.'
Students in other disciplines can use cultural heritage datasets to really test their methods. This is part of wider work on named entity recognition, optical character recognition and handwritten text recognition. This is a screenshot from SherlockNet, which uses machine learning to generate tags for images: 'we decided to use a "voting" system where we find the 20 images most similar to our image of interest, and have all images vote on the nouns that appear most commonly in their surrounding text.' Searches can also add synonyms, e.g. a search for 'lady' also checks 'gentlewoman', 'madam' and 'noblewoman'.
Computational techniques can help us learn more about collections. This project was able to identify woodcuts re-used in different publications and order the publications by finding tiny differences in the condition of the woodcuts.
Computer Vision and the History of Printing, Joon Son Chung