The Mechanical Curator, maps
and the online community
Ben O’Steen, British Library Labs
@benosteen ben.osteen@bl.uk
Andrew W. Mellon funded project seeking to bring
researchers and our digital data closer together.
There is a significant gap between them for many reasons.
I’m there to work out what bridges to build.
Modern research forces us to re-evaluate what is
meant by ‘access’
Enabling compute for example:
Distant reading, machine learning, statistical methods -
an ever-growing list.
Infancy of understanding
Large-scale analysis of
text is evolving but
young.
Exasperating situation
where ‘black boxes’ of
algorithms are used to
draw conclusions.
http://www.scottbot.net/HIAL/?p=41271
“Black Boxes”:
a misnomer
It is legitimate and
useful to use code that
you could not write.
It is not legitimate to
simply believe the
‘label’ on the side of
the box.
E.g. “Sentiment
Analysis” is often
nothing of the sort.
Quoting Scott Weingart: (emphasis mine)
● Do sentiment analysis algorithms agree with one another enough to be considered
valid?
● Do sentiment analysis results agree with humans performing the same task
enough to be considered valid?
● Is Jockers’ instantiation of aggregate sentiment analysis validly measuring
anything besides random fluctuations?
● Is aggregate sentiment analysis, by human or machine, a valid method for revealing
plot arcs?
● If aggregate sentiment analysis finds common but distinct patterns and they don’t seem to
map onto plot arcs, can they still be valid measurements of anything at all?
● Can a subjective concept, whether measured by people or machines, actually be
considered invalid or valid?
(again from http://www.scottbot.net/HIAL/?p=41271)
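The first two questions above ask about agreement, and agreement can be measured before trusting the label on the box. The following is a minimal sketch, assuming two tools have each labelled the same sentences; the label lists are hypothetical and Cohen's kappa is only one of several possible agreement measures, not one named in the quoted post.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Share of items on which both raters gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical outputs of two sentiment tools on the same six sentences.
tool_one = ["pos", "pos", "neg", "neg", "pos", "neg"]
tool_two = ["pos", "neg", "neg", "neg", "pos", "pos"]
print(round(cohen_kappa(tool_one, tool_two), 3))
```

A value near 1 means the tools agree far more than chance would predict; a value near 0 means their agreement is about what chance alone would give. The same function works for comparing a tool against human annotators.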
Do researchers need to “level up” and become
machine learning experts to use it?
In short, no.
We do not require scientists to have a master's degree in
Statistics to publish numerical results, nor to be
prize-winning novelists to write research papers.*
There is a middle ground between treating something as
magic and being an expert in the field.
Without evidence or trials, I cannot say who specifically
(librarian, data scientist, PI, consultant, etc.) is best
placed to gain and use this knowledge.
* although, it likely couldn’t hurt given the papers I’ve read.
Let’s consider a real
example.
Pieter Francois,
2013 British Library Labs Competition winner
“I am interested in
travel accounts in
Europe during the
19th Century”
“The Great Unread”, Graphs, Maps, Trees
and Franco Moretti
From a review of “Graphs, Maps, Trees”:
“Professor Franco Moretti argues heretically that literature scholars should
stop reading books and start counting, graphing, and mapping them
instead [...]”
“For any given period scholars focus on a select group of a mere few
hundred texts: the canon. As a result, they have allowed a narrow
distorting slice of history to pass for the total picture.”
“Moretti offers bar charts, maps, and time lines instead, developing the
idea of "distant reading," set forth in his path-breaking essay
"Conjectures on World Literature," into a full-blown experiment in literary
historiography, where the canon disappears into the larger literary
system.”
2013 Competition winners
http://labs.bl.uk/Ideas+for+Labs
Pieter Francois
Bias in digitisation
The tool was made to give a statistically valid sample.
Due to the paltry amount digitised, it showed how skewed
the digital corpus is, compared to the overall holdings.
Allen B. Riddell in “Where are the novels?”* estimates
that using HathiTrust’s corpus:
“... about 58%—somewhere between 47% and 68%—of
the 2,903 novels [all publications in English between 1800
and 1836] have publicly accessible scans.”
* (2012) https://ariddell.org/where-are-the-novels.html
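An estimate quoted with a range, like the one above, is what a sample gives you: a proportion plus an interval around it. The sketch below shows one common way to compute such a range, the Wilson score interval. It is not claimed to be Riddell's method, and the counts used (120 of 200 sampled titles having a scan) are hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a sample proportion (z=1.96 is roughly 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical sample: 200 titles drawn at random from a catalogue,
# of which 120 turn out to have a publicly accessible scan.
low, high = wilson_interval(120, 200)
print(f"estimated share digitised: 60% (between {low:.0%} and {high:.0%})")
```

The interval narrows as the sample grows, which is why a random sample of a few hundred titles can say something useful about holdings of many thousands.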
Written versus What is Read
Presentation shapes research questions
“On The Road”, Jack Kerouac
(via http://www.openculture.com/2007/08/on_the_road_the_original_scroll.html)
Impact?
Hard to measure but:
- 17-20 million hits on average every month,
over 250 million in 14 months.
- Over 200,000 tags added.
- > 5,500 clicks on ‘purchase a high
resolution version’
- Hundreds of contributors.
- Iterative crowdsourcing is ongoing.
- https://commons.wikimedia.org/wiki/Commons:British_Library/Mechanical_Curator_collection/map_tag_status
Rethinking access
What if everything had (at least) one URL?
Every book?
Every article?
Every page?
Every paragraph?
What if that URL worked in predictable ways?
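One reading of "predictable" is that a URL for a paragraph can be derived from the URL of its page, and a page from its book, without consulting a lookup service. The sketch below illustrates that idea only; the host, the path layout and the function names are all hypothetical and do not describe any existing British Library service.

```python
from urllib.parse import quote

BASE = "https://example.org"  # placeholder host, not a real endpoint


def item_url(book_id, page=None, paragraph=None):
    """Build a hierarchical URL: book -> page -> paragraph."""
    if paragraph is not None and page is None:
        raise ValueError("a paragraph URL needs a page")
    parts = ["book", quote(str(book_id), safe="")]
    if page is not None:
        parts += ["page", str(int(page))]
    if paragraph is not None:
        parts += ["paragraph", str(int(paragraph))]
    return BASE + "/" + "/".join(parts)


def parent_url(url):
    """Step one level up by dropping the last 'kind/id' pair, or None at the top."""
    path = url[len(BASE) + 1:].split("/")
    if len(path) <= 2:
        return None  # a book URL has no parent in this scheme
    return BASE + "/" + "/".join(path[:-2])


paragraph = item_url("000123", page=12, paragraph=3)
print(paragraph)
print(parent_url(paragraph))
```

Because the pattern is regular, a script can walk from any paragraph to its page and book, or enumerate the pages of a book, by string manipulation alone.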
David Normal
http://www.davidnormal.com/
Burning Man Festival
David Normal created light boxes around the
Burning Man, using the British Library’s Flickr images
Codename:
“Burning Man Meets the British
Library” - 20th June 2015
Can code identify subjective qualities?
http://www.robertelliottsmith.com/?p=530
There is a lot more to explore…
And too much for a single project to
tackle alone.
Tagathon found nearly 30,000 maps!
Georeferencing - http://bl.uk/maps
Iterative crowdsourcing* and
curation
Release data with the attitude that people will
tell you why it is wrong and give them tools to
fix it.
Georeferencing maps found in books gives
data that can be used to generate more
specific metadata about what those books
concern.
* A term I have borrowed from Mia Ridge
Light-hearted but underlines a
crucial pattern of access
Interfaces to content need to expect and to
cater to machine access.
A human may not be present to say, ‘log in’.
Keyword search is useless as a filtering
mechanism
Text- and data-mining is like throwing a
magnet into a haystack, without knowing if
there are any steel needles in there.
Gaming re-use
Off the Map
2014 Winners
2014 winning team:
Gothulus Rift
University of South Wales
Created a Fonthill Abbey
inspired game called Nix using
Oculus Rift
Blog: http://nixgamedevblog.blogspot.co.uk
YouTube flythrough: http://youtu.be/8ESieZO4VHw
Off the Map 2015
Alice’s Adventures Off the Map
Part of the British Library's celebrations for the 150th
anniversary of Alice in Wonderland
http://gamecity.org/alices-adventures-off-the-map/
British Library Labs Competitions
http://labs.bl.uk/British+Library+Labs+Competition+2015
Unofficial descriptions of the two main aspects
of this:
“Tell us your ideas”
and
“Show us what you have done”
My contact details:
ben.osteen@bl.uk
@benosteen
Links:
http://labs.bl.uk
http://mechanicalcurator.tumblr.com
https://flickr.com/photos/britishlibrary
https://github.com/bl-labs
http://britishlibrary.typepad.co.uk/digital-scholarship/2013/12/a-million-first-steps.html

Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 

UKSG 2015 Mechanical curator and British Library labs

  • 1. The Mechanical Curator, maps and the online community Ben O’Steen, British Library Labs @benosteen ben.osteen@bl.uk
  • 2. Andrew W. Mellon funded project seeking to bring researchers and our digital data closer together. There is a significant gap between them for many reasons.
  • 3. Andrew W. Mellon funded project seeking to bring researchers and our digital data closer together. There is a significant gap between them for many reasons. I’m there to work out what bridges to build.
  • 6. Modern research forces us to re-evaluate what is meant by ‘access’ Enabling compute for example: Distant reading, machine learning, statistical methods - an ever-growing list.
  • 7. Infancy of understanding Large-scale analysis of text is evolving but young. Exasperating situation where ‘black boxes’ of algorithms are used to draw conclusions. http://www.scottbot.net/HIAL/?p=41271
  • 8. “Black Boxes”: a misnomer It is legitimate and useful to use code that you could not write. It is not legitimate to simply believe the ‘label’ on the side of the box. E.g. “Sentiment Analysis” is often nothing of the sort.
  • 9. Quoting Scott Weingart: (emphasis mine) ● Do sentiment analysis algorithms agree with one another enough to be considered valid? ● Do sentiment analysis results agree with humans performing the same task enough to be considered valid? ● Is Jockers’ instantiation of aggregate sentiment analysis validly measuring anything besides random fluctuations? ● Is aggregate sentiment analysis, by human or machine, a valid method for revealing plot arcs? ● If aggregate sentiment analysis finds common but distinct patterns and they don’t seem to map onto plot arcs, can they still be valid measurements of anything at all? ● Can a subjective concept, whether measured by people or machines, actually be considered invalid or valid? (again from http://www.scottbot.net/HIAL/?p=41271)
  • 10. Do researchers need to “level up” and become machine learning experts to use it?
  • 11. In short, no. We do not require scientists to have a master’s degree in Statistics to publish on numerical results, nor to be prize-winning novelists to write research papers.* There is a middle ground between treating something as magic and being an expert in the field. Without evidence or trials, I cannot say who specifically (librarian, data scientist, PI, consultant, etc.) is best placed to gain and use this knowledge. * Although it likely couldn’t hurt, given the papers I’ve read.
  • 12. Let’s consider a real example. Peter Francois, 2013 British Library Labs Competition winner
  • 13. “I am interested in travel accounts in Europe during the 19th Century”
  • 14. “The Great Unread”, “Graphs, Maps, Trees” and Franco Moretti From a review of “Graphs, Maps, Trees”: “Professor Franco Moretti argues heretically that literature scholars should stop reading books and start counting, graphing, and mapping them instead [...]” “For any given period scholars focus on a select group of a mere few hundred texts: the canon. As a result, they have allowed a narrow distorting slice of history to pass for the total picture.” “Moretti offers bar charts, maps, and time lines instead, developing the idea of "distant reading," set forth in his path-breaking essay "Conjectures on World Literature," into a full-blown experiment in literary historiography, where the canon disappears into the larger literary system.”
  • 16. Bias in digitisation The tool was made to give a statistically valid sample. Due to the paltry amount digitised, it showed how skewed the digital corpus is, compared to the overall holdings. Allen B. Riddell in “Where are the novels?”* estimates that using HathiTrust’s corpus: “... about 58%—somewhere between 47% and 68%—of the 2,903 novels [all publications in English between 1800 and 1836] have publicly accessible scans.” * (2012) https://ariddell.org/where-are-the-novels.html
  • 18. Presentation shapes research questions “On The Road”, Jack Kerouac (via http://www.openculture.com/2007/08/on_the_road_the_original_scroll.html)
  • 23. Impact? Hard to measure but: - 17-20 million hits on average every month, over 250 million in 14 months. - Over 200,000 tags added. - > 5,500 clicks on ‘purchase a high resolution version’ - Hundreds of contributors. - Iterative crowdsourcing is ongoing. - https://commons.wikimedia.org/wiki/Commons:British_Library/Mechanical_Curator_collection/map_tag_status
  • 26. Rethinking access What if everything had (at least) one URL? Every book? Every article? Every page? Every paragraph? What if that URL worked in predictable ways?
  • 31. Burning Man Festival David Normal created light boxes around the Burning Man, using the British Library’s Flickr images.
  • 32. Codename: “Burning Man Meets the British Library” - 20th June 2015
  • 34. Can code identify subjective qualities?
  • 36. There is a lot more to explore… And too much for a single project to tackle alone.
  • 37. Tagathon found nearly 30,000 maps!
  • 39. Iterative crowdsourcing* and curation Release data with the attitude that people will tell you why it is wrong, and give them tools to fix it. Georeferencing maps found in books gives data that can be used to generate more specific metadata about what those books concern. * A term I have borrowed from Mia Ridge
  • 42. Light-hearted but underlines a crucial pattern of access Interfaces to content need to expect and to cater to machine access. A human may not be present to say, ‘log in’. Keyword search is useless as a filtering mechanism. Text- and data-mining is like throwing a magnet into a haystack, without knowing if there are any steel needles in there.
  • 44. Off the Map 2014 Winners 2014 winning team: Gothulus Rift, University of South Wales Created a Fonthill Abbey-inspired game called Nix using Oculus Rift Blog: http://nixgamedevblog.blogspot.co.uk YouTube flythrough: http://youtu.be/8ESieZO4VHw
  • 45. Off the Map 2015 Alice’s Adventures Off the Map Part of the British Library's celebrations for the 150th anniversary of Alice in Wonderland http://gamecity.org/alices-adventures-off-the-map/
  • 46. British Library Labs Competitions http://labs.bl.uk/British+Library+Labs+Competition+2015 Unofficial descriptions of the two main aspects of this: “Tell us your ideas” and “Show us what you have done”
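
Slide 9 asks whether sentiment analysis algorithms agree with one another, and with human readers, enough to be considered valid. One common way to put a number on that kind of agreement is Cohen's kappa, which compares the observed agreement between two labellers with the agreement expected by chance. The sketch below is illustrative only and is not taken from the talk: the function name `cohens_kappa` and the example labels are hypothetical, and the result says nothing about any real tool.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two labellers over the same items.

    Hypothetical helper for illustration; labels can be any hashable values,
    for example "pos" / "neg" from two sentiment tools run on the same passages.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both label lists must be non-empty and the same length")

    n = len(labels_a)

    # Fraction of items on which the two labellers gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected if each labeller assigned labels independently,
    # at the rates seen in their own output.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both labellers used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up labels from two hypothetical tools on four passages.
tool_a = ["pos", "pos", "neg", "neg"]
tool_b = ["pos", "neg", "neg", "neg"]
print(cohens_kappa(tool_a, tool_b))
```

A kappa near 1 indicates agreement well beyond chance, and a value near 0 indicates agreement no better than chance. Where any threshold for "valid" should sit is a judgement call that the number itself does not settle, which is the point the slide is making.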