Citations needed for the sum of all human knowledge: Wikidata as the missing link between scholarly publishing and linked open data
1. [citations needed]
for the sum of all human knowledge
Dario Taraborelli
@readermeter
COASP 2016 • September 21, 2016
3. 1. a major entry point into the scholarly literature
4. top sources of DOI lookups
http://crosstech.crossref.org/2014/02/many-metrics-such-data-wow.html
http://blog.crossref.org/2016/05/https-and-wikipedia.html
wikipedia.org
5. world’s most accessed online medical resource
Heilman and West (2015) doi.org/10.2196/jmir.4069
6. most visited resource on Ebola in West Africa
Heilman (2016) http://tinyurl.com/jfuyduv
Most used internet site for Ebola information in Liberia, Sierra Leone and Guinea during the 2014 outbreak
Used more than CNN, CDC and WHO
12. Free knowledge base that anyone can edit
Launched in 2012
Integrated with Wikipedia and other sister projects
Statistics (September 2016)
Over 20M items
Over 100M statements
Fastest growing active editor population among the largest Wikimedia projects
17. Expert curation of scientific open data
Benjamin Good (2016) Opportunities and challenges presented by Wikidata in the context of biocuration
http://tinyurl.com/hk9qrmz
18. Expert curation of scientific open data
Gene Wiki: Wikidata SPARQL examples
https://bitbucket.org/sulab/wikidatasparqlexamples/overview
Get all known drug-drug interactions for Methadone via its CHEMBL id
Get a list of all diseases known to be treated by Metformin
Get a list of all diseases that might be treated by Metformin
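The first two example queries above can be sketched in Python as query strings for the public Wikidata Query Service. This is a minimal sketch, not the Gene Wiki team's own queries: the function names are hypothetical, and it assumes the Wikidata properties P2175 (medical condition treated), P592 (ChEMBL ID) and P769 (significant drug interaction). The "might be treated" query needs a longer path through protein and gene items and is not attempted here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def diseases_treated_by(drug_qid: str) -> str:
    """Build a query listing conditions a drug item is stated to treat (P2175)."""
    return f"""
SELECT ?disease ?diseaseLabel WHERE {{
  wd:{drug_qid} wdt:P2175 ?disease .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
"""


def interactions_for_chembl(chembl_id: str) -> str:
    """Build a query listing drug interactions (P769) for the item with a given ChEMBL ID (P592)."""
    return f"""
SELECT ?other ?otherLabel WHERE {{
  ?drug wdt:P592 "{chembl_id}" .
  ?drug wdt:P769 ?other .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
"""


def run_query(query: str) -> list:
    """Send a query to the endpoint and return the JSON result bindings (needs network access)."""
    url = WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})
    # The User-Agent string is a placeholder; replace it with your own contact details.
    request = Request(url, headers={"User-Agent": "wikidata-sparql-sketch/0.1 (example)"})
    with urlopen(request, timeout=60) as response:
        return json.load(response)["results"]["bindings"]
```

For example, `run_query(diseases_treated_by("Q19484"))` would list conditions treated by Metformin, assuming Q19484 is its Wikidata item, and `interactions_for_chembl("CHEMBL651")` targets Methadone, assuming that is its ChEMBL ID. Check both identifiers before relying on them.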
20. Benjamin Good (2016) Opportunities and challenges presented by Wikidata in the context of biocuration
http://tinyurl.com/hk9qrmz
22. WikiCite: goals
Build a repository of all Wikimedia citations and bibliographic metadata
Design data models and technology to improve the coverage, quality, standards-compliance and machine-readability of citations and bibliographic metadata in Wikimedia projects
@wikicite • meta.wikimedia.org/wiki/WikiCite
28. The Zika corpus
Open citation graph layer
Bibliographic metadata layer
Expert annotation layer
Encyclopedic layer
29. Most cited authors in the Zika research corpus (filtered by journal, OA status or type of statement)
SPARQL: http://tinyurl.com/jb8da68
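A ranking like the one on this slide could be computed with a query of roughly the following shape. This is a sketch only, not the query behind the shortened link above: it assumes works are tagged with P921 (main subject), authors are linked with P50 (author), and citations are recorded with P2860 (cites work). The function name is hypothetical, and the filters by journal, OA status or statement type are left out.

```python
def most_cited_authors(topic_qid: str, limit: int = 20) -> str:
    """Build a SPARQL query ranking authors of works on a topic by incoming citations."""
    return f"""
SELECT ?author ?authorLabel (COUNT(?citing) AS ?citations) WHERE {{
  ?work wdt:P921 wd:{topic_qid} ;   # work whose main subject is the topic
        wdt:P50 ?author .           # author of that work
  ?citing wdt:P2860 ?work .         # another work that cites it
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
GROUP BY ?author ?authorLabel
ORDER BY DESC(?citations)
LIMIT {int(limit)}
"""


if __name__ == "__main__":
    # Q202864 is assumed here to be the Wikidata item for Zika virus.
    print(most_cited_authors("Q202864", limit=10))
```

The printed query can be pasted into the Wikidata Query Service. Authors that only exist as name strings rather than items would be missed by this pattern.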
30. Semi-automated recommendation of entities, missing statements, references for unsourced statements
https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool
https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References
https://meta.wikimedia.org/wiki/Grants:Project/WikiFactMine
31. all statements citing a New York Times article
most popular journals cited by statements of any item that is a subclass of economics
all statements citing the works of Joseph Stiglitz
all statements citing journal articles by physicists at Oxford University in the 1970s
all statements citing a journal article that was retracted
all statements citing a source that cites a journal article that was retracted
https://meta.wikimedia.org/wiki/WikiCite_2016/Report/Group_5
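Two of the queries listed above can be sketched as follows, assuming sources are modelled as Wikidata items and attached to statements through references using P248 (stated in). The function names are hypothetical, the property choices (P1433 published in, P50 author) are assumptions about how the sources are described, and this is not the formulation from the WikiCite 2016 report.

```python
def statements_citing_publication(publication_qid: str, limit: int = 100) -> str:
    """Build a query for statements whose reference cites a work published in a given venue."""
    return f"""
SELECT ?item ?statement ?source WHERE {{
  ?item ?claim ?statement .
  ?statement prov:wasDerivedFrom ?reference .      # statement -> reference node
  ?reference pr:P248 ?source .                     # reference "stated in" a source item
  ?source wdt:P1433 wd:{publication_qid} .         # source published in the venue
}}
LIMIT {int(limit)}
"""


def statements_citing_author(author_qid: str, limit: int = 100) -> str:
    """Build a query for statements whose reference cites a work by a given author."""
    return f"""
SELECT ?item ?statement ?source WHERE {{
  ?item ?claim ?statement .
  ?statement prov:wasDerivedFrom ?reference .
  ?reference pr:P248 ?source .
  ?source wdt:P50 wd:{author_qid} .                # source authored by the person
}}
LIMIT {int(limit)}
"""


if __name__ == "__main__":
    # Q9684 is assumed here to be the Wikidata item for The New York Times.
    print(statements_citing_publication("Q9684"))
```

The retraction queries would follow the same pattern, with the last triple replaced by a test on the source (or on a work the source cites via P2860) being marked as retracted. How retraction is recorded is not specified in the slides, so that part is left open.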
33. 1. release open citation data
Distributing references via Crossref: blog.crossref.org/2016/06/distributing-references-via-crossref.html
34. 2. use licenses supporting content mining
1. release open citation data
The Right to Read Is the Right to Mine: blog.okfn.org/2012/06/01/the-right-to-read-is-the-right-to-mine
Crossref Text and Data Mining Services: tdmsupport.crossref.org/
36. Thank you
Acknowledgments
Daniel Mietchen, Jonathan Dugan, Lydia Pintscher, Cameron Neylon, James Hare, James Heilman,
Magnus Manske, Egon Willighagen, the Gene Wiki team (especially Andra Waagmeester, Tim
Putman, Benjamin Good), the ContentMine team, the University of Chicago Knowledge Lab, all
WikiCite 2016 participants and Wikidata Source Metadata project contributors.
Additional image credits
Library, National Park Service Collection thenounproject.com/term/library/191/ [CC0]
Robot, Creative Stall thenounproject.com/term/robot/132360/ [CC BY]
Open Access logo commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_transparent.svg [CC0]
dario@wikimedia.org • @readermeter • @Wikidata • @WikiCite • @WikiResearch