A hands-on introduction to querying Wikidata content with SPARQL, the query language for data represented in RDF, SKOS, OWL, and other Semantic Web standards.
Presented by myself and Peter Neish, Research Data Specialist @ University of Melbourne.
1. Intro to Wikidata & SPARQL
VALA Tech Camp W4a
Jane Frazier
Ontology Operations Lead
SEEK
@mignon1915
Peter Neish
Research Data Curator
University of Melbourne
@peterneish
2. Agenda
What is RDF?
What is Linked Open Data?
What is SPARQL?
Examples and hands-on
This presentation >>> http://bit.ly/vala_sparql
3. What is RDF?
Standards
● RDF: Resource Description Framework
● SKOS: Simple Knowledge Organization System
● OWL: Web Ontology Language
● SPARQL: SPARQL Protocol and RDF Query Language
Triple store
● Database for RDF data
18. Why is this stuff important?
● W3C-approved standards for representing knowledge graphs
○ Lots of tools are built upon the same formats
● Each object is identified by a globally unique, machine-readable ID instead of a human-readable word or phrase
○ The human-readable word or phrase can change when necessary without disrupting use
○ Can capture data in any language
● Any object can relate to any other object, and relationship types are either standard (e.g. SKOS) or customisable (e.g. OWL)
● Data (and the relationships between them) can be reused by people (human consumption) and products (machine consumption) all over the web
● Can easily leverage other open data based on the same standards
○ schema.org
○ Geonames
○ Getty vocabularies (AAT, etc.)
○ Wikidata
● Lays the groundwork for international (multilingual & multicultural) interoperability
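The point about machine-readable IDs can be illustrated with a small Python sketch. It is only an illustration, not how Wikidata stores data: Q408 (Australia) and P17 (country) are real Wikidata identifiers, while `EX1` and the helper names `label` and `describe` are hypothetical. Relationships are stored between IDs, so a label can be translated or renamed without touching any link.

```python
# Labels are kept separately from the statements that link items together.
labels = {
    "Q408": {"en": "Australia", "fr": "Australie", "de": "Australien"},
    "EX1": {"en": "Example Library"},  # hypothetical item, not a real Wikidata ID
}

# One statement: EX1 --country (P17)--> Q408. Only IDs appear here.
statements = [("EX1", "P17", "Q408")]


def label(entity_id, lang, fallback="en"):
    """Return the label in the requested language, then the fallback, then the raw ID."""
    names = labels.get(entity_id, {})
    return names.get(lang) or names.get(fallback) or entity_id


def describe(lang):
    """Render every statement with labels in the chosen language."""
    return [(label(s, lang), p, label(o, lang)) for s, p, o in statements]


print(describe("en"))
print(describe("fr"))

# Renaming the item changes what people read, but the statement is untouched.
labels["EX1"]["en"] = "Renamed Example Library"
print(describe("en"))
print(statements)
```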
19. What is SPARQL?
● SPARQL Protocol and RDF Query Language
● Allows us to translate interlinked graph data into normalised tabular data that we can then work with (e.g. chart or visualise)
● Is *kind of* like SQL
● Is used through a SPARQL endpoint
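To make the "graph in, table out" idea concrete, here is a minimal Python sketch of matching triple patterns against a handful of triples. It is a toy and not how a SPARQL engine is implemented; `EX1` and `EX2` are hypothetical items, and `match` and `select` are names made up for this example. Q7075 (library), Q408 (Australia) and Q30 (United States) are real Wikidata IDs.

```python
# A tiny graph: each triple is (subject, predicate, object).
triples = [
    ("EX1", "P31", "Q7075"),  # EX1 is an instance of library
    ("EX1", "P17", "Q408"),   # EX1 is in Australia
    ("EX2", "P31", "Q7075"),  # EX2 is an instance of library
    ("EX2", "P17", "Q30"),    # EX2 is in the United States
]


def match(pattern, binding):
    """Yield bindings that extend `binding` so that `pattern` matches a triple.

    Terms starting with '?' are variables; anything else must match exactly.
    """
    for triple in triples:
        new = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it agrees with an earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield new


def select(variables, patterns):
    """Join all patterns, then project the chosen variables into table rows."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match(pattern, b)]
    return [tuple(b[v] for v in variables) for b in bindings]


rows = select(
    ["?item", "?country"],
    [("?item", "P31", "Q7075"), ("?item", "P17", "?country")],
)
print(rows)
```

Each row in `rows` corresponds to one line of the results table that a SPARQL endpoint would return for the equivalent two-pattern query.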
21. Wikidata
Wikidata is a free, collaborative, multilingual, secondary database, collecting structured
data to provide support for Wikipedia, Wikimedia Commons, the other wikis of the
Wikimedia movement, and to anyone in the world.
https://www.wikidata.org
SPARQL endpoint: https://query.wikidata.org/
24. Build things on Wikidata
http://histropedia.com/timeline/vz1ndyv56m/Apple-Computers
25. Hands on
Start with this cats example
1. Go to query.wikidata.org
2. Choose examples
3. Click on cats
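SELECT ?item ?itemLabel WHERE {
  # Match every item that is an instance of (P31) cat (Q146).
  ?item wdt:P31 wd:Q146.
  # Fetch a label for each item in the user's language, falling back to English.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}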
Play along at home at http://bit.ly/vala_sparql
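The short names in the cats query, such as `wdt:P31` and `wd:Q146`, are prefixed names that stand for full IRIs. The Wikidata query service predefines these prefixes, which is why the query has no PREFIX lines. A small Python sketch of the expansion follows; the helper name `expand` is made up for this example.

```python
# Namespace IRIs behind the two prefixes used in the cats query.
PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",        # items such as Q146
    "wdt": "http://www.wikidata.org/prop/direct/",  # direct ("truthy") properties
}


def expand(prefixed_name):
    """Turn a prefixed name like 'wd:Q146' into its full IRI."""
    prefix, _, local = prefixed_name.partition(":")
    return PREFIXES[prefix] + local


# The single triple pattern from the cats query.
pattern = ("?item", "wdt:P31", "wd:Q146")

# Variables stay as they are; prefixed names become full IRIs.
expanded = tuple(t if t.startswith("?") else expand(t) for t in pattern)
print(expanded)
```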
27. Modifying Wikidata query
1. Search for instance of library instead of cat
a. Change cat to library
b. Click run
2. Search for libraries in Australia
a. Click on +Filter and search for Australia
b. Click on instance of and change to country
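c. Click run. The result count seems low: why don’t we get lots of libraries? (Hint: many libraries are recorded as instances of a subclass of library rather than of library itself.)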
28. Improving our query
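Our query would work better if it also matched items that are instances of any subclass of library
SELECT ?item ?itemLabel WHERE {
  # Instance of (P31) library (Q7075), or of any subclass (P279*) of library.
  ?item wdt:P31/wdt:P279* wd:Q7075.
  # Country (P17) is Australia (Q408).
  ?item wdt:P17 wd:Q408.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}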
http://tinyurl.com/y7ysrngp
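The path `wdt:P31/wdt:P279*` reads as "instance of, followed by zero or more subclass-of steps". The Python sketch below imitates that on a toy graph to show why the count goes up. It is not how the query service evaluates paths; every `EX…` identifier is hypothetical, and the function names are made up for this example.

```python
from collections import defaultdict

# Toy graph. Q7075 (library), Q408 (Australia) and Q30 (United States) are real IDs.
triples = [
    ("EX_PUBLIC", "P279", "Q7075"),      # hypothetical subclass of library
    ("EX_BRANCH", "P279", "EX_PUBLIC"),  # hypothetical subclass of that subclass
    ("EX1", "P31", "Q7075"),
    ("EX2", "P31", "EX_PUBLIC"),
    ("EX3", "P31", "EX_BRANCH"),
    ("EX4", "P31", "EX_BRANCH"),
    ("EX1", "P17", "Q408"),
    ("EX2", "P17", "Q408"),
    ("EX3", "P17", "Q408"),
    ("EX4", "P17", "Q30"),
]


def subclasses_of(root):
    """Return root plus every class that reaches it via P279 (the P279* part)."""
    children = defaultdict(set)
    for s, p, o in triples:
        if p == "P279":
            children[o].add(s)
    seen, stack = {root}, [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def instances(classes):
    """Items with a P31 statement pointing at any of the given classes."""
    return sorted(s for s, p, o in triples if p == "P31" and o in classes)


def in_country(items, country):
    """Keep only items whose P17 (country) is the given country."""
    located = {s for s, p, o in triples if p == "P17" and o == country}
    return [item for item in items if item in located]


direct = in_country(instances({"Q7075"}), "Q408")
with_subclasses = in_country(instances(subclasses_of("Q7075")), "Q408")
print(direct)
print(with_subclasses)
```

In this toy graph only one item is a direct instance of library, which mirrors the low count from the first attempt; following the subclass links picks up the rest.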
29. Adding columns
Start with http://tinyurl.com/y7ysrngp from the last step
1. Click on +Show
2. Search for inception
3. Click on +Show
4. Search for image
Should now look like http://tinyurl.com/y8r3rp8t
Note that the default view is a table; also check out the image grid and timeline views
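Each column added with +Show becomes an extra variable in the SELECT line plus a pattern that fills it. The Python sketch below builds such a query as a string. It assumes the extra columns are wrapped in OPTIONAL so that libraries without an inception date or image still appear; the query behind the short link above may differ in detail. P571 (inception) and P18 (image) are the real property IDs, and `build_query` is a name made up for this example.

```python
def build_query(extra_columns):
    """Build the Australian libraries query with optional extra columns.

    `extra_columns` maps a variable name to a Wikidata property ID.
    """
    variables = ["?item", "?itemLabel"] + [f"?{name}" for name in extra_columns]
    lines = [
        f"SELECT {' '.join(variables)} WHERE {{",
        "  ?item wdt:P31/wdt:P279* wd:Q7075.",
        "  ?item wdt:P17 wd:Q408.",
    ]
    for name, prop in extra_columns.items():
        # OPTIONAL keeps rows even when the item lacks this property (an assumption).
        lines.append(f"  OPTIONAL {{ ?item wdt:{prop} ?{name}. }}")
    lines.append(
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }'
    )
    lines.append("}")
    return "\n".join(lines)


query = build_query({"inception": "P571", "image": "P18"})
print(query)
```

The printed text can be pasted into the query page to compare with what the query helper produces.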
30. Accessing data
● Through the query page, using the query helper or SPARQL
● Through the download option on the query page
● Using Linked Data content negotiation: http://www.wikidata.org/entity/Q42
● Using the data URL: http://www.wikidata.org/wiki/Special:EntityData/Q42.json
● Database dumps in RDF, JSON and XML
● SPARQL endpoint GET and POST requests, e.g. cats as JSON
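The last option, a GET request to the SPARQL endpoint, can be sketched with the Python standard library alone. The service answers at https://query.wikidata.org/sparql and can return SPARQL JSON results. The function names and the User-Agent string below are placeholders for this example, and `sample` is made-up data in the standard results shape so that the flattening step can be tried offline. Nothing is sent over the network unless `run` is called.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

CATS_QUERY = """SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}"""


def build_request(query):
    """Prepare (but do not send) a GET request asking for JSON results."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    headers = {
        "Accept": "application/sparql-results+json",
        # Placeholder value; use something that identifies your own script.
        "User-Agent": "vala-sparql-example/0.1 (placeholder contact)",
    }
    return urllib.request.Request(url, headers=headers)


def rows_from_results(payload):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in payload["results"]["bindings"]
    ]


def run(query):
    """Send the request and return flattened rows (makes a network call)."""
    with urllib.request.urlopen(build_request(query)) as response:
        return rows_from_results(json.load(response))


request = build_request(CATS_QUERY)
print(request.full_url)

# Made-up response in the SPARQL JSON results shape, for trying the parser offline.
sample = {
    "head": {"vars": ["item", "itemLabel"]},
    "results": {"bindings": [
        {
            "item": {"type": "uri", "value": "http://www.wikidata.org/entity/EX1"},
            "itemLabel": {"type": "literal", "value": "Example cat"},
        }
    ]},
}
print(rows_from_results(sample))
```

Calling `run(CATS_QUERY)` would perform the live request; a POST works the same way with the query sent in the request body instead of the URL.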
31. Further Resources
● This presentation: http://bit.ly/vala_sparql
● 5-Star Open Data: http://5stardata.info/en/
● Wikidata query manual: https://www.mediawiki.org/wiki/Wikidata_query_service/User_Manual
● Wikidata properties: https://www.wikidata.org/wiki/Wikidata:List_of_properties
● Reasonator: https://tools.wmflabs.org/reasonator/
● Histropedia: http://histropedia.com
● Programming Historian - Linked Open Data lesson: http://programminghistorian.org/lessons/graph-databases-and-SPARQL
● Comparison of SQL and SPARQL: http://www.cambridgesemantics.com/semantic-university/sparql-vs-sql-intro