Open Science
Peter Murray-Rust,
ContentMine.org, and University of Cambridge
Opencon2015, Bologna, IT 2015-11-18
What is “Open”?
Why is it essential?
Open Data
Content Mining – a battle we must win
Young researchers are the present (Mike Eisen)
The Right to Read is the Right to Mine (Peter Murray-Rust, 2011)
http://contentmine.org
My European Heroes
Young People (ContentMine)
NEELIE KROES
Messages
• The system is completely broken
• We are at war with major publishers
• Students have the power to change the world
• Universities need help from students
• Open is a state of mind
• The opposite of Open is broken [1]
• Friction destroys Open
• Don’t buy it, build it …
• … TOGETHER
[1] (John Wilbanks)
@Senficon (Julia Reda): Text & Data mining in times of #copyright maximalism:
"Elsevier stopped me doing my research"
http://onsnetwork.org/chartgerink/2015/11/16/elsevier-stopped-me-doing-my-research/ … #opencon #TDM
Breaking news:
Elsevier stopped me doing my research
Chris Hartgerink
I am a statistician interested in detecting potentially problematic research such as data fabrication,
which results in unreliable findings and can harm policy-making, confound funding decisions, and
hampers research progress.
To this end, I am content mining results reported in the psychology literature. Content mining the
literature is a valuable avenue of investigating research questions with innovative methods. For
example, our research group has written an automated program to mine research papers for errors in
the reported results and found that 1/8 papers (of 30,000) contains at least one result that could
directly influence the substantive conclusion [1].
In new research, I am trying to extract test results, figures, tables, and other information reported in
papers throughout the majority of the psychology literature. As such, I need the research papers
published in psychology that I can mine for these data. To this end, I started ‘bulk’ downloading research
papers from, for instance, Sciencedirect. I was doing this for scholarly purposes and took into account
potential server load by limiting the amount of papers I downloaded per minute to 9. I had no intention
to redistribute the downloaded materials, had legal access to them because my university pays a
subscription, and I only wanted to extract facts from these papers.
Full disclosure, I downloaded approximately 30GB of data from Sciencedirect in approximately 10 days.
This boils down to a server load of 0.0021GB/[min], 0.125GB/h, 3GB/day.
Approximately two weeks after I started downloading psychology research papers, Elsevier notified
my university that this was a violation of the access contract, that this could be considered stealing of
content, and that they wanted it to stop. My librarian explicitly instructed me to stop downloading
(which I did immediately), otherwise Elsevier would cut all access to Sciencedirect for my university.
I am now not able to mine a substantial part of the literature, and because of this Elsevier is directly
hampering me in my research.
[1] Nuijten, M. B., Hartgerink, C. H. J., van Assen, M. A. L. M., Epskamp, S., & Wicherts, J. M. (2015). The
prevalence of statistical reporting errors in psychology (1985–2013). Behavior Research Methods, 1–22.
doi: 10.3758/s13428-015-0664-2
Chris Hartgerink’s blog post
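The server-load figures quoted in the blog post above can be checked with a few lines of arithmetic. This is only a sketch: the inputs (about 30 GB in about 10 days, at most 9 papers per minute) come from the post, while the function names are illustrative and not part of any tool mentioned in the talk.

```python
# Inputs quoted in the blog post (both are approximate).
TOTAL_GB = 30.0
DAYS = 10.0
PAPERS_PER_MINUTE = 9


def download_rates(total_gb: float, days: float) -> dict[str, float]:
    """Average transfer rate per day, per hour and per minute."""
    per_day = total_gb / days
    per_hour = per_day / 24
    per_minute = per_hour / 60
    return {"GB/day": per_day, "GB/h": per_hour, "GB/min": per_minute}


def seconds_between_downloads(papers_per_minute: int) -> float:
    """Minimum pause between requests that respects the per-minute cap."""
    return 60.0 / papers_per_minute


rates = download_rates(TOTAL_GB, DAYS)
for unit, value in rates.items():
    print(f"{value:.4f} {unit}")
print(f"{seconds_between_downloads(PAPERS_PER_MINUTE):.2f} s between downloads")
```

The results agree with the post: 3 GB/day, 0.125 GB/h and roughly 0.0021 GB/min, with one request about every 6.7 seconds.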
http://chemicaltagger.ch.cam.ac.uk/
Typical chemical synthesis
Open Content Mining of FACTs
Machines can interpret chemical reactions
We have done 500,000 patents. There are >
3,000,000 reactions/year. Added value > 1B Eur.
C) What’s the problem with this spectrum?
Org. Lett., 2011, 13 (15), pp 4084–4087
Original thanks to ChemBark
After AMI2 processing…..
… AMI2 has detected a square
[Figure: ContentMine pipeline diagram. Recoverable labels:
Sources: catalogue; daily crawl of EuPMC, arXiv, CORE, HAL, (university repositories); ToC services; publisher sites (PLoS ONE, BMC, PeerJ… Nature, IEEE, Elsevier…).
Tools: getpapers (query), quickscrape (crawl; URLs, DOIs), norma (Normalizer, Structurer, Semantic Tagger), ami (search, lookup).
Input formats: PDF, HTML, DOC, ePUB, TeX, XML, PNG, EPS, CSV, XLS.
Document parts: text, data, figures; abstract, methods, references, captioned figures (Fig. 1), HTML tables.
Community plugins (scrapers, queries, taggers): Chem, Phylo, Trials, Crystal, Plants.
Outputs: Semantic ScholarlyHTML, facts, visualization and analysis. Throughput: 30,000 pages/day.]
CONTENTMINE Complete OPEN Platform for Mining Scientific Literature
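The platform slide describes a chain of stages: fetch documents, normalize them, then extract facts. A toy version of that chain might look like the sketch below. Everything in it is an assumption made for illustration: the names `normalize`, `extract_facts`, `mine` and `Fact` are hypothetical and are not the APIs of getpapers, quickscrape, norma or ami, and the only "fact" extracted here is a DOI found by a simple regular expression.

```python
import re
from dataclasses import dataclass


@dataclass
class Fact:
    source: str  # identifier of the document the fact came from
    kind: str    # type of fact, here always "doi"
    value: str


def normalize(raw_html: str) -> str:
    """Hypothetical normalization stage: drop tags, collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", raw_html)
    return re.sub(r"\s+", " ", text).strip()


# Simplified DOI pattern, good enough for a demonstration only.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def extract_facts(doc_id: str, text: str) -> list[Fact]:
    """Hypothetical tagging stage: report every DOI-like string."""
    return [
        Fact(doc_id, "doi", match.group(0).rstrip(".,;)"))
        for match in DOI_PATTERN.finditer(text)
    ]


def mine(documents: dict[str, str]) -> list[Fact]:
    """Run normalize -> extract over a mapping of document id to raw HTML."""
    facts: list[Fact] = []
    for doc_id, raw in documents.items():
        facts.extend(extract_facts(doc_id, normalize(raw)))
    return facts


docs = {"paper-1": "<p>See doi: <b>10.3758/s13428-015-0664-2</b>.</p>"}
facts = mine(docs)
print(facts)
```

Keeping each stage a plain function over text mirrors the slide's idea that community plugins can be swapped in at the tagging step.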
Stand back! I am about to do
ContentMining!
• Erriquez Daniela, Esame finale: Bologna, Aprile 2014
• Dott.ssa Elena Fiorentini, n. 0000274966, TESI DI DOTTORATO, Bologna
• Qian Gou, Esame finale: Bologna, finale 2014
• Maurizio BARONTINI, UNIVERSITÀ DEGLI STUDI DELLA TUSCIA DI VITERBO
• Terracciano Mario, Esame finale anno 2014
Refs: Erriquez_Daniela_tesi, Fiorentina_Elena_tesi, Gou_Qian_Tesi, mbarontini_tesid, terracciano_maria_tesi
BagOfWords for Italian Theses
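A bag of words is simply a count of how often each word occurs in a text, ignoring order. The sketch below applies that idea to three of the thesis title-page lines listed above. It is an illustration only, not the ContentMine implementation; the stopword list and the function name `bag_of_words` are assumptions.

```python
import re
from collections import Counter

# Tiny, hand-picked stopword list (an assumption, not a standard resource).
ITALIAN_STOPWORDS = frozenset({"di", "della", "degli", "n"})


def bag_of_words(text: str, stopwords: frozenset = frozenset()) -> Counter:
    """Count lower-cased alphabetic tokens, skipping stopwords and numbers."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(token for token in tokens if token not in stopwords)


title_pages = [
    "Erriquez Daniela, Esame finale: Bologna, Aprile 2014",
    "Dott.ssa Elena Fiorentini, n. 0000274966, TESI DI DOTTORATO, Bologna",
    "Maurizio BARONTINI, UNIVERSITÀ DEGLI STUDI DELLA TUSCIA DI VITERBO",
]

bag = bag_of_words(" ".join(title_pages), ITALIAN_STOPWORDS)
for word, count in bag.most_common(3):
    print(word, count)
```

In this sample "bologna" is the only word that occurs twice, which is the kind of signal a bag of words surfaces across a larger set of theses.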
Copyright and Mining
• UK (“Hargreaves”) 2014 legislation:
– “personal” “non-commercial*” “research” “data
analytics”
– legitimizes copying (?to disk), but not publishing
• PMR-premise: You cannot do reproducible
scientific mining and avoid violating copyright.
Massive political activity in Europe
REDA Publisher-influenced
Elsevier wants to control Open Data
[asked by Michelle Brook]
Scholarly infrastructure becomes closed
No accountability for monitoring and control
http://www.nytimes.com/2015/04/08/opinion/yes-we-were-warned-about-ebola.html
We were stunned recently when we stumbled across an article by European
researchers in Annals of Virology [1982]: “The results seem to indicate that
Liberia has to be included in the Ebola virus endemic zone.” In the future,
the authors asserted, “medical personnel in Liberian health centers should be
aware of the possibility that they may come across active cases and thus be
prepared to avoid nosocomial epidemics,” referring to hospital-acquired
infection.
Adage in public health: “The road to inaction is paved with research
papers.”
Bernice Dahn (chief medical officer of Liberia’s Ministry of Health)
Vera Mussah (director of county health services)
Cameron Nutt (Ebola response adviser to Partners in Health)
A System Failure of Scholarly Publishing
[1] The Military-Industrial-Academic complex (1961)
(Dwight D Eisenhower, US President)
[Diagram labels: Publishers, Academia; Glory+?; $$, MS, review; Taxpayer, Student, Researcher; $$, $$, in-kind]
The Publisher-Academic complex[1]
[Wikipedia:] On the steps of Sproul Hall [Student] Mario Savio gave a
famous speech
... But we're a bunch of raw materials that don't mean … to end up being
bought by some clients of the University, be they the government, be they
industry, be they organized labor, be they anyone! We're human beings!
... There's a time when the operation of the machine becomes so odious
— makes you so sick at heart — that you can't take part. You can't even
passively take part. And you've got to put your bodies upon the gears and
upon the wheels, upon the levers, upon all the apparatus, and you've got
to make it stop. And you've got to indicate to the people who run it, to the
people who own it, that unless you're free, the machine will be prevented
from working at all. [1]
Univ California,
Berkeley 1964
The Free Speech Movement
1970’s UK,
student occupations and sit-ins
University of Stirling
Used without permission but with thanks and Love
Liverpool, Warwick, Emmanuel Coll Camb., UCL, Glasgow, Middlesex, …
Flower Power
1967
Berkeley 2010
“Flowerpoint”
“How We Stopped SOPA”:
This bill ... shut down whole websites. Essentially, it stopped Americans from
communicating entirely with certain groups....
I called all my friends, and we stayed up all night setting up a website for this new group,
Demand Progress, with an online petition opposing this noxious bill.... We [got] ... 300,000
signers.... We met with the staff of members of Congress and pleaded with them.... And then
it passed unanimously....
And then, suddenly, the process stopped. Senator Ron Wyden ... put a hold on the
bill.
He added, “We won this fight because everyone made themselves the hero of their own
story. Everyone took it as their job to save this crucial freedom.”
Robert Swartz: “Aaron was killed by the government, and MIT betrayed all of its basic
principles.”
Aaron Swartz
Rules for Revolutionaries
• Be publicly clear about your public aims.
• Gather whole-hearted allies.
• Choose your moment/s carefully.
• Be prominent – blogs, talks, papers.
• Be bold – and probably brave.
• Write Liberation Software.
• Create slogans, warcries, mantras.
Take the fight to publishers. Hold them accountable for the near-
criminal business models they operate on, and the stranglehold they
have had on academia for too long.
Extending this, I need your help. I want to know if we initiate a formal
investigation into the practices of publishers, in terms of the fact that
they operate within an unregulated market and enjoy enormous
profits to commit immoral acts (creating knowledge inequality). …. I
want to know what we can do, and if such an investigation is even
feasible, and whether or not we have a legal case supporting us.
Don’t sacrifice your career.. [PMR] said it best, that for any revolution
blood will be spilled. If you’re making someone angry, you’re probably
doing it right. But when you’re ‘advocating’ for open access, maintain
one simple rule: don’t be a dick…. (and lots more)
Jon Tennant 2014-11-25
http://blogs.egu.eu/palaeoblog/2014/11/25/open-access-wins-all-of-the-arguments-all-of-the-time/
The Right to Read
is
The Right to Roam
The Right to Mine
Kinder Mass Trespass
used without permission but with love and thanks
How can we achieve Freedom?
• Change the law to allow ContentMining
– Hard, tedious, but necessary
– Requires evidence, campaigning, making yourselves a
pain in the arse…
• Make all outputs Open
– Requires culture change in researchers
– Tools: Open Notebook Science, Github, Open source,
Social media.
– Needs support from funders, learned societies,
universities
Four Freedoms (Richard Stallman)
The freedom to:
0 run the program as you wish, for any purpose
1 study how the program works, and change it
2 redistribute copies
3 distribute copies of your modified program
Most other “Opens” follow these principles, including CC-BY material.
However, “Green Open Access” is incompatible with Freedoms 2 and 3.
The Open Definition
“Open means anyone can freely access, use, modify, and share for
any purpose (subject, at most, to requirements that preserve
provenance and openness).”
http://www.budapestopenaccessinitiative.org/read
… an unprecedented public good. …
… completely free and unrestricted access to [peer-
reviewed literature] by all scientists, scholars, teachers,
students, and other curious minds. …
…Removing access barriers to this literature will
accelerate research, enrich education, share the
learning of the rich with the poor and the poor with
the rich, make this literature as useful as it can be, and
lay the foundation for uniting humanity in a common
intellectual conversation and quest for knowledge.
(Budapest Open Access Initiative, 2002)
Panton Principles for Open Data in Science (2010)
• PUBLISH YOUR DATA OPENLY
• …make an explicit and robust statement of your wishes.
• Use a recognized waiver or license that is appropriate for
data.
• open as defined by the Open Knowledge/Data Definition
(… NOT non-commercial)
• Explicit dedication of data … into the public domain via
PDDL or CCZero
Peter Murray-Rust, Cameron Neylon, Rufus Pollock, John
Wilbanks
Panton Authors and Fellows
Bjorn Brembs enhanced by OpenData
http://bjoern.brembs.net/2015/11/dont-be-afraid-of-open-data/
This is a response to Dorothy Bishop’s post “Who’s afraid of open data?“.
After we had published a paper on how Drosophila strains that are referred to by the same name in the literature
(Canton S), but came from different laboratories behaved completely different in a particular behavioral experiment,
Casey Bergman from Manchester contacted me, asking if we shouldn’t sequence the genomes of these five fly strains
to find out how they differ. So I went and behaviorally tested each of the strains again, extracted the DNA from the 100
individuals I had just tested and sent the material to him. I also published the behavioral data immediately on our
GitHub project page.
Casey then sequenced the strains and made the sequences available, as well. A few weeks later, both Casey and I
were contacted by Nelson Lau at Brandeis, showing us his bioinformatics analyses of our genome data. Importantly,
his analyses wasn’t even close to what we had planned. On the contrary, he had looked at something I (not being a
bioinformatician) would have considered orthogonal (Casey may disagree). So there we had a large chunk of work we
would have never done on the data we hadn’t even started analyzing, yet. I was so thrilled! I learned so much from
Nelson’s work, this was fantastic! Nelson even asked us to be co-author, to which I quickly protested and suggested, if
anything, I might be mentioned in the acknowledgments for “technical assistance” – after all, I had only extracted the
DNA.
However, after some back-and-forth, he persuaded me with the argument that he
wanted to have us as co-authors to set an example. He wanted to show everyone that
sharing data is something that can bring you direct rewards in publications. He
wanted us to be co-authors as a reward for posting our data and as incentive for
others to let go of their fears and also post their data online.
Arguments for Open
• Open Science:
– is Better Science
– can reach and involve everyone
– moves more quickly
– challenges injustice
– helps the world
It also happens to:
– promote the careers of scientists
– save money
Jean-Claude Bradley
Jean-Claude Bradley was one of the
most influential open scientists of our
time. He was an innovator in all that
he did, from Open Education to
bleeding edge Open Science; in 2006,
he coined the phrase Open Notebook
Science. His loss is felt deeply by
friends and colleagues around the
world.
On Monday July 14, 2014 we shall
gather at Cambridge University to
honour his memory and the legacy he
leaves behind with a highly
distinguished set of invited speakers to
revisit and build upon the ideas which
inspired and defined his life’s work.
Wikipedia CC BY-SA
Traditional Research and Publication
[Diagram labels: “Lab” work; write; rewrite; re-experiment; paper/thesis; publish; ???; Validation??; DATA; walls of academia; output “belongs” to publisher; process “belongs” to publisher]
Free/Open Software Development
[Diagram labels: CODE REPOSITORY; world community; CODE (rewrite, validate, fork, re-use); TOOLS; Github, BitBucket, StackOverflow, Apache; inspires; OSI]
Example: ContentMine at
http://github.com/ContentMine/quickscrape
BORN-OPEN-SOURCE
NO WALLS
Open Notebook Science
[Diagram labels: Open engineered repository; world community; INSTRUMENT; MODEL; CODE; DATA; knowledge; validate; merge; calibrate; machines and humans working together; CC-BY]
Problems are solved communally;
nothing is needlessly duplicated; “publication” is continuous
Mat Todd (Sydney) and MANY collaborators
http://opensourcemalaria.org/ (Chrome)
University of Southampton, BSD-like Open
Open Source and Open Data
www.crystallography.net
OPEN CLOSED
Zenodo Figshare
Git
Dat
OpenOffice Word, PPT
LabTrove, cheminfo.org Chemdraw
CrystallographyOpenDB Cambridge Cryst data Centre
WriteLatex / Overleaf
ReadCube, Symplectic,
From Wikipedia CC BY-SA
Crowdsourcing
Young people
Jenny Molloy
Ross Mounce
Sam Moore, Peter Kraker, Rosie Gray, Sophie Kay
Sophie: 3rd yr Grad students train 1st year students
PANTON ARMS
Panton Fellows
Sophie Kershaw, Panton Fellow, Training PhD Students
Rotation-Based Learning (RBL)
Phase 1: Initiator
• No communication
permitted between groups
• Attempt to reproduce
existing literature
• Deliver a coherent research
story by the end of Phase 1
Phase 2: Successor
• Communication between
groups still prohibited
• Validate and develop the
inherited research story
• Critique your predecessors
• Role of research producer vs. research user
• Can this approach help to foster awareness of reproducibility issues?
Throughout Phases 1 & 2:
• Daily lectures on open
science culture & techniques
• First-hand application to own
research work
• Version control using GitHub
• Daily group supervision
“Do you think you would be
more confident in the future
about trying to apply Open
techniques to your work..?”
• 50% Yes, by myself
• 41% Yes, with help/guidance
• 9% No opinion/neutral
• 0% No
Some Children
of the Digital Enlightenment
• David Carroll & Joe McArthur: OAButton
• Rayna Stamboliyska & Pierre-Carl Langlais
• Jon Tennant
• Ross Mounce
• Jenny Molloy
• Erin McKiernan
• Jack Andraka
• Michelle Brook
• Heather Piwowar
• TheContentMine Team
• Rufus Pollock
• Jonathan Gray
• Sophie Kay
Jean-Claude Bradley [1], a chemist,
developed Open Notebook Science:
making the entire primary record of a
research project publicly available
online as it is recorded. (WP)
J-C promoted these ideas with
UNDERGRADUATE scientists.
[1] Unfortunately J-C died in 2014;
we held a memorial meeting in
Cambridge
Sophie
Kay
More Thoughts
• Don’t negotiate with walled gardens, make
them change or make them obsolete
• Building on top of non-Open is very fragile,
unpredictable and usually bad engineering
Protecting innovation
• Many start-ups get acquired and lose their
mission
• “Embrace, extend, exterminate” (Microsoft)
• Consider adding “Open Lock” clauses to
articles of incorporation

More Related Content

What's hot

Improving the troubled relationship between Scientists and Wikipedia
Improving the troubled relationship between Scientists and Wikipedia Improving the troubled relationship between Scientists and Wikipedia
Improving the troubled relationship between Scientists and Wikipedia Duncan Hull
 
Authenticating Scientists with OpenID
Authenticating Scientists with OpenIDAuthenticating Scientists with OpenID
Authenticating Scientists with OpenIDDuncan Hull
 
Embrace the Open Revolution
Embrace the Open RevolutionEmbrace the Open Revolution
Embrace the Open Revolutionpetermurrayrust
 
OpenNotebookScience NOW!
OpenNotebookScience NOW!OpenNotebookScience NOW!
OpenNotebookScience NOW!petermurrayrust
 
From "Open the Social Sciences" to Open Social Science
From "Open the Social Sciences" to Open Social ScienceFrom "Open the Social Sciences" to Open Social Science
From "Open the Social Sciences" to Open Social ScienceAnkara University
 
The Content Mine (presented at UKSG)
The Content Mine (presented at UKSG)The Content Mine (presented at UKSG)
The Content Mine (presented at UKSG)petermurrayrust
 
Disruptive Communities and Technology
Disruptive Communities and TechnologyDisruptive Communities and Technology
Disruptive Communities and Technologypetermurrayrust
 
Open access for researchers, research managers and libraries
Open access for researchers, research managers and librariesOpen access for researchers, research managers and libraries
Open access for researchers, research managers and librariesIryna Kuchma
 
Open Access: Which Side Are You On
Open Access: Which Side Are You OnOpen Access: Which Side Are You On
Open Access: Which Side Are You OnJill Cirasella
 
What works and doesn't work in research dissemination
What works and doesn't work in research disseminationWhat works and doesn't work in research dissemination
What works and doesn't work in research disseminationtbirdcymru
 
Open Access: What it is and why it is required for scholarly community?
Open Access: What it is and why it is required for scholarly community?Open Access: What it is and why it is required for scholarly community?
Open Access: What it is and why it is required for scholarly community?Sukhdev Singh
 
Talking about Open Access: SMASH and Subtler Tactics
Talking about Open Access: SMASH and Subtler TacticsTalking about Open Access: SMASH and Subtler Tactics
Talking about Open Access: SMASH and Subtler TacticsJill Cirasella
 
Open Access Theses & Dissertations: Airing the Anxieties & Finding the Facts
Open Access Theses & Dissertations: Airing the Anxieties & Finding the FactsOpen Access Theses & Dissertations: Airing the Anxieties & Finding the Facts
Open Access Theses & Dissertations: Airing the Anxieties & Finding the FactsJill Cirasella
 
What the open access movement doesn't want you to know
What the open access movement doesn't want you to knowWhat the open access movement doesn't want you to know
What the open access movement doesn't want you to knowPattie Pattie
 
Open Access Publishing Crash Course
Open Access Publishing Crash CourseOpen Access Publishing Crash Course
Open Access Publishing Crash CourseJill Cirasella
 
How to get the pdf? UPDATED with LEANLIBRARY
How to get the pdf? UPDATED with LEANLIBRARYHow to get the pdf? UPDATED with LEANLIBRARY
How to get the pdf? UPDATED with LEANLIBRARYGuus van den Brekel
 
ContentMine and WikiData
ContentMine and WikiDataContentMine and WikiData
ContentMine and WikiDataTheContentMine
 
The past, present, and future of publishing
The past, present, and future of publishingThe past, present, and future of publishing
The past, present, and future of publishingJonathan Tennant
 
VOGIN IP 2021 Workshop “Hoe kom ik nu aan de full-text? – Actueler dan ooit, ...
VOGIN IP 2021 Workshop “Hoe kom ik nu aan de full-text? – Actueler dan ooit, ...VOGIN IP 2021 Workshop “Hoe kom ik nu aan de full-text? – Actueler dan ooit, ...
VOGIN IP 2021 Workshop “Hoe kom ik nu aan de full-text? – Actueler dan ooit, ...Guus van den Brekel
 
Your work, your rights? Open access in academia in the Netherlands (2012).
Your work, your rights? Open access in academia in the Netherlands (2012). Your work, your rights? Open access in academia in the Netherlands (2012).
Your work, your rights? Open access in academia in the Netherlands (2012). Sabine K. Lengger
 

What's hot (20)

Improving the troubled relationship between Scientists and Wikipedia
Improving the troubled relationship between Scientists and Wikipedia Improving the troubled relationship between Scientists and Wikipedia
Improving the troubled relationship between Scientists and Wikipedia
 
Authenticating Scientists with OpenID
Authenticating Scientists with OpenIDAuthenticating Scientists with OpenID
Authenticating Scientists with OpenID
 
Embrace the Open Revolution
Embrace the Open RevolutionEmbrace the Open Revolution
Embrace the Open Revolution
 
OpenNotebookScience NOW!
OpenNotebookScience NOW!OpenNotebookScience NOW!
OpenNotebookScience NOW!
 
From "Open the Social Sciences" to Open Social Science
From "Open the Social Sciences" to Open Social ScienceFrom "Open the Social Sciences" to Open Social Science
From "Open the Social Sciences" to Open Social Science
 
The Content Mine (presented at UKSG)
The Content Mine (presented at UKSG)The Content Mine (presented at UKSG)
The Content Mine (presented at UKSG)
 
Disruptive Communities and Technology
Disruptive Communities and TechnologyDisruptive Communities and Technology
Disruptive Communities and Technology
 
Open access for researchers, research managers and libraries
Open access for researchers, research managers and librariesOpen access for researchers, research managers and libraries
Open access for researchers, research managers and libraries
 
Open Access: Which Side Are You On
Open Access: Which Side Are You OnOpen Access: Which Side Are You On
Open Access: Which Side Are You On
 
What works and doesn't work in research dissemination
What works and doesn't work in research disseminationWhat works and doesn't work in research dissemination
What works and doesn't work in research dissemination
 
Open Access: What it is and why it is required for scholarly community?
Open Access: What it is and why it is required for scholarly community?Open Access: What it is and why it is required for scholarly community?
Open Access: What it is and why it is required for scholarly community?
 
Talking about Open Access: SMASH and Subtler Tactics
Talking about Open Access: SMASH and Subtler TacticsTalking about Open Access: SMASH and Subtler Tactics
Talking about Open Access: SMASH and Subtler Tactics
 
Open Access Theses & Dissertations: Airing the Anxieties & Finding the Facts
Open Access Theses & Dissertations: Airing the Anxieties & Finding the FactsOpen Access Theses & Dissertations: Airing the Anxieties & Finding the Facts
Open Access Theses & Dissertations: Airing the Anxieties & Finding the Facts
 
What the open access movement doesn't want you to know
What the open access movement doesn't want you to knowWhat the open access movement doesn't want you to know
What the open access movement doesn't want you to know
 
Open Access Publishing Crash Course
Open Access Publishing Crash CourseOpen Access Publishing Crash Course
Open Access Publishing Crash Course
 
How to get the pdf? UPDATED with LEANLIBRARY
How to get the pdf? UPDATED with LEANLIBRARYHow to get the pdf? UPDATED with LEANLIBRARY
How to get the pdf? UPDATED with LEANLIBRARY
 
ContentMine and WikiData
ContentMine and WikiDataContentMine and WikiData
ContentMine and WikiData
 
The past, present, and future of publishing
The past, present, and future of publishingThe past, present, and future of publishing
The past, present, and future of publishing
 
VOGIN IP 2021 Workshop “Hoe kom ik nu aan de full-text? – Actueler dan ooit, ...
VOGIN IP 2021 Workshop “Hoe kom ik nu aan de full-text? – Actueler dan ooit, ...VOGIN IP 2021 Workshop “Hoe kom ik nu aan de full-text? – Actueler dan ooit, ...
VOGIN IP 2021 Workshop “Hoe kom ik nu aan de full-text? – Actueler dan ooit, ...
 
Your work, your rights? Open access in academia in the Netherlands (2012).
Your work, your rights? Open access in academia in the Netherlands (2012). Your work, your rights? Open access in academia in the Netherlands (2012).
Your work, your rights? Open access in academia in the Netherlands (2012).
 

Viewers also liked

Mining Scientific Diagrams for facts
Mining Scientific Diagrams for facts Mining Scientific Diagrams for facts
Mining Scientific Diagrams for facts TheContentMine
 
Why ContentMining is useful
Why ContentMining is usefulWhy ContentMining is useful
Why ContentMining is usefulTheContentMine
 
Can Computers understand the scientific literature (includes compscie material)
Can Computers understand the scientific literature (includes compscie material)Can Computers understand the scientific literature (includes compscie material)
Can Computers understand the scientific literature (includes compscie material)TheContentMine
 
OpenNotebookScience NOW!
OpenNotebookScience NOW!OpenNotebookScience NOW!
OpenNotebookScience NOW!TheContentMine
 
ContentMine: Liberating scholarship from Open publications and theses
ContentMine: Liberating scholarship from Open publications and thesesContentMine: Liberating scholarship from Open publications and theses
ContentMine: Liberating scholarship from Open publications and thesesTheContentMine
 
ContentMining and Clinical Trials
ContentMining and Clinical TrialsContentMining and Clinical Trials
ContentMining and Clinical TrialsTheContentMine
 
Amanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literature Amanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literature TheContentMine
 
Disruptive Communities and Technology
Disruptive Communities and TechnologyDisruptive Communities and Technology
Disruptive Communities and TechnologyTheContentMine
 
Automatic Extraction of Knowledge from Biomedical literature
Automatic Extraction of Knowledge from Biomedical literature Automatic Extraction of Knowledge from Biomedical literature
Automatic Extraction of Knowledge from Biomedical literature TheContentMine
 
تصنيف البحوث
تصنيف البحوثتصنيف البحوث
تصنيف البحوثRawanAlturki
 
الانفوجرافيك
الانفوجرافيكالانفوجرافيك
الانفوجرافيكRawanAlturki
 
Canva شرح تطبيق
Canva شرح تطبيقCanva شرح تطبيق
Canva شرح تطبيقRawanAlturki
 
الأسس النفسية
الأسس النفسيةالأسس النفسية
الأسس النفسيةRawanAlturki
 
الانفوجرافيك
الانفوجرافيك الانفوجرافيك
الانفوجرافيك RawanAlturki
 

Viewers also liked (17)

Mining Scientific Diagrams for facts
Mining Scientific Diagrams for facts Mining Scientific Diagrams for facts
Mining Scientific Diagrams for facts
 
العرض
العرضالعرض
العرض
 
Why ContentMining is useful
Why ContentMining is usefulWhy ContentMining is useful
Why ContentMining is useful
 
Can Computers understand the scientific literature (includes compscie material)
Can Computers understand the scientific literature (includes compscie material)Can Computers understand the scientific literature (includes compscie material)
Can Computers understand the scientific literature (includes compscie material)
 
Asmita Kulshrestha
Asmita KulshresthaAsmita Kulshrestha
Asmita Kulshrestha
 
BTS_Thailand
BTS_ThailandBTS_Thailand
BTS_Thailand
 
OpenNotebookScience NOW!
OpenNotebookScience NOW!OpenNotebookScience NOW!
OpenNotebookScience NOW!
 
ContentMine: Liberating scholarship from Open publications and theses
ContentMine: Liberating scholarship from Open publications and thesesContentMine: Liberating scholarship from Open publications and theses
ContentMine: Liberating scholarship from Open publications and theses
 
ContentMining and Clinical Trials
ContentMining and Clinical TrialsContentMining and Clinical Trials
ContentMining and Clinical Trials
 
Amanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literature Amanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literature
 
Disruptive Communities and Technology
Disruptive Communities and TechnologyDisruptive Communities and Technology
Disruptive Communities and Technology
 
Automatic Extraction of Knowledge from Biomedical literature
Automatic Extraction of Knowledge from Biomedical literature Automatic Extraction of Knowledge from Biomedical literature
Automatic Extraction of Knowledge from Biomedical literature
 
تصنيف البحوث
تصنيف البحوثتصنيف البحوث
تصنيف البحوث
 
الانفوجرافيك
الانفوجرافيكالانفوجرافيك
الانفوجرافيك
 
Canva شرح تطبيق
Canva شرح تطبيقCanva شرح تطبيق
Canva شرح تطبيق
 
الأسس النفسية
الأسس النفسيةالأسس النفسية
الأسس النفسية
 
الانفوجرافيك
الانفوجرافيك الانفوجرافيك
الانفوجرافيك
 


Principles and practice of Open Science

  • 1. Open Science Peter Murray-Rust, ContentMine.org, and University of Cambridge Opencon2015, Bologna, IT 2015-11-18 What is “Open”? Why is it essential? Open Data Content Mining – a battle we must win Young researchers are the present (Mike Eisen)
  • 2. The Right to Read is the Right to Mine* (*Peter Murray-Rust, 2011) http://contentmine.org
  • 3. My European Heroes Young People (ContentMine) NEELIE KROES
  • 4. Messages • The system is completely broken • We are at war with major publishers • Students have the power to change the world • Universities need help from students • Open is a state of mind • The opposite of Open is broken [1] • Friction destroys Open • Don’t buy it, build it … • … TOGETHER [1] (John Wilbanks)
  • 5. @Senficon (Julia Reda): Text & Data mining in times of #copyright maximalism: "Elsevier stopped me doing my research" http://onsnetwork.org/chartgerink/2015/11/16/elsevier-stopped-me-doing-my-research/ … #opencon #TDM Breaking news: Elsevier stopped me doing my research Chris Hartgerink
  • 6. I am a statistician interested in detecting potentially problematic research such as data fabrication, which results in unreliable findings and can harm policy-making, confound funding decisions, and hampers research progress. To this end, I am content mining results reported in the psychology literature. Content mining the literature is a valuable avenue of investigating research questions with innovative methods. For example, our research group has written an automated program to mine research papers for errors in the reported results and found that 1/8 papers (of 30,000) contains at least one result that could directly influence the substantive conclusion [1]. In new research, I am trying to extract test results, figures, tables, and other information reported in papers throughout the majority of the psychology literature. As such, I need the research papers published in psychology that I can mine for these data. To this end, I started ‘bulk’ downloading research papers from, for instance, Sciencedirect. I was doing this for scholarly purposes and took into account potential server load by limiting the amount of papers I downloaded per minute to 9. I had no intention to redistribute the downloaded materials, had legal access to them because my university pays a subscription, and I only wanted to extract facts from these papers. Full disclosure, I downloaded approximately 30GB of data from Sciencedirect in approximately 10 days. This boils down to a server load of 0.0021GB/[min], 0.125GB/h, 3GB/day. Approximately two weeks after I started downloading psychology research papers, Elsevier notified my university that this was a violation of the access contract, that this could be considered stealing of content, and that they wanted it to stop. My librarian explicitly instructed me to stop downloading (which I did immediately), otherwise Elsevier would cut all access to Sciencedirect for my university. 
I am now not able to mine a substantial part of the literature, and because of this Elsevier is directly hampering me in my research. [1] Nuijten, M. B., Hartgerink, C. H. J., van Assen, M. A. L. M., Epskamp, S., & Wicherts, J. M. (2015). The prevalence of statistical reporting errors in psychology (1985–2013). Behavior Research Methods, 1–22. doi: 10.3758/s13428-015-0664-2 Chris Hartgerink’s blog post
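
The download figures quoted in the post can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers stated above (about 30 GB over about 10 days, at most 9 papers per minute); the variable names are illustrative and not taken from the post.

```python
# Figures quoted in the post: roughly 30 GB downloaded over roughly 10 days.
total_gb = 30.0
days = 10.0

gb_per_day = total_gb / days        # average volume per day
gb_per_hour = gb_per_day / 24       # average volume per hour
gb_per_minute = gb_per_hour / 60    # average volume per minute

# The post also states a self-imposed cap of 9 papers per minute,
# which corresponds to a minimum pause between requests.
papers_per_minute = 9
min_delay_seconds = 60 / papers_per_minute

print(f"{gb_per_day:.1f} GB/day, {gb_per_hour:.3f} GB/h, {gb_per_minute:.4f} GB/min")
print(f"at least {min_delay_seconds:.1f} s between downloads")
```

The results agree with the 3 GB/day, 0.125 GB/h and 0.0021 GB/min given in the post.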
  • 8. Open Content Mining of FACTs Machines can interpret chemical reactions We have done 500,000 patents. There are > 3,000,000 reactions/year. Added value > 1B Eur.
  • 9. C) What’s the problem with this spectrum? Org. Lett., 2011, 13 (15), pp 4084–4087 Original thanks to ChemBark
  • 10. After AMI2 processing….. … AMI2 has detected a square
  • 11.
  • 12. catalogue getpapers query Daily Crawl EuPMC, arXiv CORE, HAL, (UNIV repos) ToC services PDF HTML DOC ePUB TeX XML PNG EPS CSV XLS URLs DOIs crawl quickscrape norma Normalizer Structurer Semantic Tagger Text Data Figures ami UNIV Repos search Lookup CONTENT MINING Chem Phylo Trials Crystal Plants COMMUNITY plugins Visualization and Analysis PloSONE, BMC, peerJ… Nature, IEEE, Elsevier… Publisher Sites scrapers queries taggers abstract methods references Captioned Figures Fig. 1 HTML tables 30,000 pages/day Semantic ScholarlyHTML Facts CONTENTMINE Complete OPEN Platform for Mining Scientific Literature
  • 13. Stand back! I am about to do ContentMining! • Erriquez Daniela, Esame finale: Bologna, Aprile 2014 • Dott.ssa Elena Fiorentini, n. 0000274966, TESI DI DOTTORATO, Bologna • Qian Gou, Esame finale: Bologna, finale 2014 • Maurizio BARONTINI, UNIVERSITÀ DEGLI STUDI DELLA TUSCIA DI VITERBO • Terracciano Mario, Esame finale anno 2014
  • 14. Refs: Erriquez_Daniela_tesi, Fiorentina_Elena_tesi, Gou_Qian_Tesi, mbarontini_tesid, terracciano_maria_tesi BagOfWords for Italian Theses
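
For readers unfamiliar with the term, a bag of words is simply a count of how often each word occurs in a text, ignoring word order. The sketch below illustrates the general idea only; it is not the ContentMine implementation, and the sample sentence and the short stopword list are made up for illustration.

```python
import re
from collections import Counter


def bag_of_words(text, stopwords=frozenset()):
    """Count lower-cased word tokens, skipping any stopwords supplied."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)


# Hypothetical sentence standing in for the full text of a thesis.
sample = "La tesi descrive la sintesi e la struttura della molecola."
# Illustrative stopwords only, not a complete Italian list.
italian_stopwords = frozenset({"la", "e", "della"})

bag = bag_of_words(sample, italian_stopwords)
print(bag.most_common(3))
```

Comparing such counts across documents gives a rough picture of what each thesis is about without reading it in full.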
  • 15. Copyright and Mining • UK (“Hargreaves”) 2014 legislation: – “personal” “non-commercial*” “research” “data analytics” – legitimizes copying (?to disk), but not publishing • PMR-premise: You cannot do reproducible scientific mining and avoid violating copyright.
  • 16. Massive political activity in Europe REDA Publisher-influenced
  • 17. Elsevier wants to control Open Data [asked by Michelle Brook]
  • 18. Scholarly infrastructure becomes closed No accountability for monitoring and control
  • 19. http://www.nytimes.com/2015/04/08/opinion/yes-we-were-warned-about-ebola.html We were stunned recently when we stumbled across an article by European researchers in Annals of Virology [1982]: “The results seem to indicate that Liberia has to be included in the Ebola virus endemic zone.” In the future, the authors asserted, “medical personnel in Liberian health centers should be aware of the possibility that they may come across active cases and thus be prepared to avoid nosocomial epidemics,” referring to hospital-acquired infection. Adage in public health: “The road to inaction is paved with research papers.” Bernice Dahn (chief medical officer of Liberia’s Ministry of Health) Vera Mussah (director of county health services) Cameron Nutt (Ebola response adviser to Partners in Health) A System Failure of Scholarly Publishing
  • 20. [1] The Military-Industrial-Academic complex (1961) (Dwight D Eisenhower, US President) Publishers Academia Glory+? $$, MS review Taxpayer Student Researcher $$ $$ in-kind The Publisher-Academic complex[1]
  • 21. [Wikipedia:] On the steps of Sproul Hall [Student] Mario Savio gave a famous speech ... But we're a bunch of raw materials that don't mean … to end up being bought by some clients of the University, be they the government, be they industry, be they organized labor, be they anyone! We're human beings! ... There's a time when the operation of the machine becomes so odious — makes you so sick at heart — that you can't take part. You can't even passively take part. And you've got to put your bodies upon the gears and upon the wheels, upon the levers, upon all the apparatus, and you've got to make it stop. And you've got to indicate to the people who run it, to the people who own it, that unless you're free, the machine will be prevented from working at all. [1] Univ California, Berkeley 1964 The Free Speech Movement
  • 22. 1970s UK, student occupations and sit-ins University of Stirling Used without permission but with thanks and Love Liverpool, Warwick, Emmanuel Coll Camb., UCL, Glasgow, Middlesex, …
  • 24. “How We Stopped SOPA”: This bill ... shut down whole websites. Essentially, it stopped Americans from communicating entirely with certain groups.... I called all my friends, and we stayed up all night setting up a website for this new group, Demand Progress, with an online petition opposing this noxious bill.... We [got] ... 300,000 signers.... We met with the staff of members of Congress and pleaded with them.... And then it passed unanimously.... And then, suddenly, the process stopped. Senator Ron Wyden ... put a hold on the bill. He added, “We won this fight because everyone made themselves the hero of their own story. Everyone took it as their job to save this crucial freedom.” Robert Swartz: “Aaron was killed by the government, and MIT betrayed all of its basic principles.” Aaron Swartz
  • 25. Rules for Revolutionaries • Be publicly clear about your public aims. • Gather whole-hearted allies. • Choose your moment/s carefully. • Be prominent – blogs, talks, papers. • Be bold – and probably brave. • Write Liberation Software. • Create slogans, warcries, mantras.
  • 26. Take the fight to publishers. Hold them accountable for the near-criminal business models they operate on, and the stranglehold they have had on academia for too long. Extending this, I need your help. I want to know if we initiate a formal investigation into the practices of publishers, in terms of the fact that they operate within an unregulated market and enjoy enormous profits to commit immoral acts (creating knowledge inequality). …. I want to know what we can do, and if such an investigation is even feasible, and whether or not we have a legal case supporting us. Don’t sacrifice your career.. [PMR] said it best, that for any revolution blood will be spilled. If you’re making someone angry, you’re probably doing it right. But when you’re ‘advocating’ for open access, maintain one simple rule: don’t be a dick…. (and lots more) Jon Tennant 2014-11-25 http://blogs.egu.eu/palaeoblog/2014/11/25/open-access-wins-all-of-the-arguments-all-of-the-time/
  • 27. The Right to Read is The Right to Roam The Right to Mine Kinder Mass Trespass used without permission but with love and thanks
  • 28. How can we achieve Freedom? • Change the law to allow ContentMining – Hard, tedious, but necessary – Requires evidence, campaigning, making yourselves a pain in the arse… • Make all outputs Open – Requires culture change in researchers – Tools: Open Notebook Science, Github, Open source, Social media. – Needs support from funders, learned societies, universities
  • 29. Four Freedoms (Richard Stallman) The freedom to: 0 run the program as you wish, for any purpose 1 study how the program works, and change it 2 redistribute copies 3 distribute copies of your modified program Most other “Opens” follow these principles, including CC-BY material. However, “Green Open Access” is incompatible with Freedoms 2 and 3
  • 30. The Open Definition “Open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness).”
  • 31. http://www.budapestopenaccessinitiative.org/read … an unprecedented public good. … … completely free and unrestricted access to [peer- reviewed literature] by all scientists, scholars, teachers, students, and other curious minds. … …Removing access barriers to this literature will accelerate research, enrich education, share the learning of the rich with the poor and the poor with the rich, make this literature as useful as it can be, and lay the foundation for uniting humanity in a common intellectual conversation and quest for knowledge. (Budapest Open Access Initiative, 2003)
  • 32. Panton Principles for Open Data in Science (2010) • PUBLISH YOUR DATA OPENLY • …make an explicit and robust statement of your wishes. • Use a recognized waiver or license that is appropriate for data. • Open as defined by the Open Knowledge/Data Definition (… NOT non-commercial) • Explicit dedication of data … into the public domain via PDDL or CCZero. Peter Murray-Rust, Cameron Neylon, Rufus Pollock, John Wilbanks
  • 34. Bjorn Brembs enhanced by OpenData http://bjoern.brembs.net/2015/11/dont-be-afraid-of-open-data/ This is a response to Dorothy Bishop’s post “Who’s afraid of open data?“. After we had published a paper on how Drosophila strains that are referred to by the same name in the literature (Canton S), but came from different laboratories behaved completely different in a particular behavioral experiment, Casey Bergman from Manchester contacted me, asking if we shouldn’t sequence the genomes of these five fly strains to find out how they differ. So I went and behaviorally tested each of the strains again, extracted the DNA from the 100 individuals I had just tested and sent the material to him. I also published the behavioral data immediately on our GitHub project page. Casey then sequenced the strains and made the sequences available, as well. A few weeks later, both Casey and I were contacted by Nelson Lau at Brandeis, showing us his bioinformatics analyses of our genome data. Importantly, his analyses wasn’t even close to what we had planned. On the contrary, he had looked at something I (not being a bioinformatician) would have considered orthogonal (Casey may disagree). So there we had a large chunk of work we would have never done on the data we hadn’t even started analyzing, yet. I was so thrilled! I learned so much from Nelson’s work, this was fantastic! Nelson even asked us to be co-author, to which I quickly protested and suggested, if anything, I might be mentioned in the acknowledgments for “technical assistance” – after all, I had only extracted the DNA. However, after some back-and-forth, he persuaded me with the argument that he wanted to have us as co-authors to set an example. He wanted to show everyone that sharing data is something that can bring you direct rewards in publications. He wanted us to be co-authors as a reward for posting our data and as incentive for others to let go of their fears and also post their data online.
  • 35. Arguments for Open • Open Science: – is Better Science – can reach and involve everyone – moves more quickly – challenges injustice – helps the world It also happens to: – Promote the careers of scientists – Save money
  • 36. Jean-Claude Bradley Jean-Claude Bradley was one of the most influential open scientists of our time. He was an innovator in all that he did, from Open Education to bleeding edge Open Science; in 2006, he coined the phrase Open Notebook Science. His loss is felt deeply by friends and colleagues around the world. On Monday July 14, 2014 we shall gather at Cambridge University to honour his memory and the legacy he leaves behind with a highly distinguished set of invited speakers to revisit and build upon the ideas which inspired and defined his life’s work. Wikipedia CC BY-SA
  • 37. Traditional Research and Publication “Lab” work paper/thesis Write rewrite Re-experiment publish ??? Validation?? DATA output “belongs” to publisher process “belongs” to publisher Walls of academia
  • 38. Free/Open Software Development CODE REPOSITORY World community CODE rewrite validate CODE fork CODE Re-use CODE Re-use Github, BitBucket StackOverflow, Apache inspires OSI Example: ContentMine at http://github.com/ContentMine/quickscrape BORN-OPEN-SOURCE NO WALLS
  • 39. TOOLS Open Notebook Science Open engineered repository World community INSTRUMENT validate merge MODEL CODE DATA DATA knowledge calibrate Problems are solved communally; Nothing is needlessly duplicated; “publication“ is continuous Machines and humans Working together CC-BY
  • 40. Mat Todd (Sydney) and MANY collaborators http://opensourcemalaria.org/ (Chrome)
  • 41.
  • 42.
  • 43.
  • 44.
  • 45.
  • 46. University of Southampton, BSD-like Open
  • 47. Open Source and Open Data www.crystallography.net
  • 48.
  • 49.
  • 50. OPEN CLOSED Zenodo Figshare Git Dat OpenOffice Word, PPT LabTrove, cheminfo.org Chemdraw CrystallographyOpenDB Cambridge Cryst data Centre WriteLatex / Overleaf ReadCube, Symplectic,
  • 51. From Wikipedia CC BY-SA Crowdsourcing
  • 52. Young people Jenny Molloy Ross Mounce Sam Moore Peter Kraker Rosie Gray Sophie Kay Sophie: 3rd-year grad students train 1st-year students PANTON ARMS Panton Fellows
  • 53. Sophie Kershaw, Panton Fellow, Training PhD Students
  • 54. Rotation-Based Learning (RBL) Phase 1: Initiator • No communication permitted between groups • Attempt to reproduce existing literature • Deliver a coherent research story by the end of Phase 1 Phase 2: Successor • Communication between groups still prohibited • Validate and develop the inherited research story • Critique your predecessors • Role of research producer vs. research user • Can this approach help to foster awareness of reproducibility issues? Throughout Phases 1 & 2: • Daily lectures on open science culture & techniques • First-hand application to own research work • Version control using GitHub • Daily group supervision
  • 55. “Do you think you would be more confident in the future about trying to apply Open techniques to your work..?” • 50% Yes, by myself • 41% Yes, with help/guidance • 9% No opinion/neutral • 0% No
  • 56. Some Children of the Digital Enlightenment • David Carroll & Joe McArthur: OAButton • Rayna Stamboliyska & Pierre-Carl Langlais • Jon Tennant • Ross Mounce • Jenny Molloy • Erin McKiernan • Jack Andraka • Michelle Brook • Heather Piwowar • TheContentMine Team • Rufus Pollock • Jonathan Gray • Sophie Kay Jean-Claude Bradley [1], a chemist, developed Open Notebook Science: making the entire primary record of a research project publicly available online as it is recorded (WP). J-C promoted these ideas with UNDERGRADUATE scientists. [1] Unfortunately J-C died in 2014; we held a memorial meeting in Cambridge. Sophie Kay
  • 57. More Thoughts • Don’t negotiate with walled gardens, make them change or make them obsolete • Building on top of non-Open is very fragile, unpredictable and usually bad engineering
  • 58. Protecting innovation • Many start-ups get acquired and lose their mission • “Embrace, extend, exterminate” (Microsoft) • Consider adding “Open Lock” clauses to articles of incorporation

Editor's Notes

  1. Hi, I’m here to talk about AMI, a data extraction framework and tool. First, I just want to highlight some of the key contributors to the project: Andy for his work on the ChemistryVisitor and Peter for the overall architecture. In this talk, I’m going to stress the importance of having data in a specific format and how useful that is for automated machine processing. Then I’m going to demonstrate AMI’s architecture and the transformation of data as it flows through the process. I’m going to dwell a little on a core format used, Scalable Vector Graphics (SVG), before introducing the concept of visitors, which are pluggable, context-specific data extractors. Next, I’m going to introduce Andy’s ChemVisitor, for extracting semantic chemistry data, along with a few other visitors that can process data that is not specific to chemistry. Finally, I will demonstrate some uses of the ChemVisitor, within the realm of validation and metabolism.
  2. ChemBark
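
The "visitor" idea in note 1 (pluggable, context-specific extractors run over an SVG document) can be sketched roughly as below. This is an illustrative Python sketch only, not AMI's actual code or API: the class names `Visitor`, `TextVisitor` and `LineVisitor`, the function `run_visitors`, and the sample SVG are all hypothetical and were made up for this example.

```python
import xml.etree.ElementTree as ET

# SVG elements are namespaced; ElementTree exposes tags as "{namespace}name".
SVG_NS = "{http://www.w3.org/2000/svg}"

# Hypothetical sample input: two text labels and two line segments.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg">
  <text x="10" y="20">CH3</text>
  <line x1="0" y1="0" x2="10" y2="0"/>
  <line x1="10" y1="0" x2="20" y2="5"/>
  <text x="30" y="20">OH</text>
</svg>"""


class Visitor:
    """Hypothetical base class: one pluggable extractor per kind of content."""

    def visit(self, element):
        raise NotImplementedError

    def results(self):
        raise NotImplementedError


class TextVisitor(Visitor):
    """Collects the text labels found in the SVG."""

    def __init__(self):
        self.labels = []

    def visit(self, element):
        if element.tag == SVG_NS + "text" and element.text:
            self.labels.append(element.text.strip())

    def results(self):
        return self.labels


class LineVisitor(Visitor):
    """Collects straight line segments as (x1, y1, x2, y2) tuples."""

    def __init__(self):
        self.segments = []

    def visit(self, element):
        if element.tag == SVG_NS + "line":
            coords = tuple(float(element.get(k)) for k in ("x1", "y1", "x2", "y2"))
            self.segments.append(coords)

    def results(self):
        return self.segments


def run_visitors(svg_text, visitors):
    """Parse the SVG once and offer every element to every visitor."""
    root = ET.fromstring(svg_text)
    for element in root.iter():
        for visitor in visitors:
            visitor.visit(element)
    return {type(v).__name__: v.results() for v in visitors}


extracted = run_visitors(SAMPLE_SVG, [TextVisitor(), LineVisitor()])
```

The point of the design is that the document is parsed once into a common format, and new kinds of extraction are added by writing another visitor rather than by changing the pipeline. A chemistry-specific visitor would presumably combine labels and line segments into molecules, but that logic is not described in the notes and is not attempted here.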