EDF2012: The Web of Data and its Five Stars
1. The Web of Data and its Five Stars
Richard Cyganiak, DERI, NUI Galway
@cygri
6 June 2012
Realising and Exploiting the EU data cloud
European Data Forum, Copenhagen, Denmark
2. Generating insight from data
• Today, data is abundant
• New middlemen find new ways of getting data to the end user
• Supply and demand for data higher than ever
• Analyst's problem is no longer a lack of relevant data, but:
• Understanding data
• Assessing applicability
• Getting it into the right form for use
• Similar problems inside and outside of the firewall
4. Tim Berners-Lee’s 5-star plan for an open web of data
★
Make data available on the Web under an open license
★★
Make it available as structured data
★★★
Use a non-proprietary format
★★★★
Use URIs to identify things
★★★★★
Link your data to other people’s data to provide context
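The five stars can be read as a cumulative checklist. The sketch below assumes that each star only counts once all earlier ones are met; the `Dataset` fields and the `star_rating` function are hypothetical names chosen for illustration and are not part of the plan itself.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical flags, one per star of the plan.
    on_web_open_license: bool     # 1 star
    structured: bool              # 2 stars
    non_proprietary_format: bool  # 3 stars
    uses_uris: bool               # 4 stars
    linked_to_other_data: bool    # 5 stars


def star_rating(ds: Dataset) -> int:
    """Count stars in order, stopping at the first criterion that is not met."""
    criteria = [
        ds.on_web_open_license,
        ds.structured,
        ds.non_proprietary_format,
        ds.uses_uris,
        ds.linked_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# An openly licensed Excel sheet: structured, but in a proprietary format.
excel_sheet = Dataset(True, True, False, False, False)
print(star_rating(excel_sheet))
```

Under this reading, a dataset that uses URIs but ships in a proprietary format still stops at two stars.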
5. The 0th star
• Data catalog with good metadata
• Make your data findable
15. Good reasons against opening data
• Privacy
• Competitive advantage
• Producing data and charging for it as business model
• Can't get license from upstream
16. Business models
Scott Brinker, http://www.chiefmartec.com/2010/01/7-business-models-for-linked-data.html
19. Enabling re-use
• Delivering data to end users in different forms
• Combining data with other data
• 3rd party analysis of data
20. Formats in government data
• Good for re-use: MS Excel, CSV, XML, JSON, Microdata
• Not so good for re-use: Pure websites, MS Word
• Bad for re-use: PDF
• Really bad for re-use: Only charts/maps without numbers
23. Specialist formats
• Specialist tools often have specialist formats
• Few people have the tools
• Expensive
• Difficult to re-use
• (Geospatial tools, statistics packages, etc.)
25. Non-proprietary formats, open standards
• CSV (dead simple)
• XML
• JSON
• RDF (good for 4+5 stars)
• OGC web services
• OAI-ORE web services
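One practical benefit of open formats is that standard tooling can move data between them. A minimal sketch, using only the Python standard library and a made-up two-row sample (the column names and values are illustrative, not real data), converts CSV to JSON:

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Parse CSV text with a header row and return the rows as a JSON array."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [dict(row) for row in reader]
    return json.dumps(rows, indent=2)


# Hypothetical sample data, for illustration only.
sample = "region,population\nNorth,1200\nSouth,3400\n"
print(csv_to_json(sample))
```

Note that CSV carries no type information, so every value arrives as a string; deciding which columns are numbers is left to the consumer.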
31. Turning local identifiers into URIs: Why?
• Make them globally unique
• Clarify authority
• Make them resolvable
• Make them linkable
http://data.ordnancesurvey.co.uk/id/7000000000017765
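As a rough sketch of what this turning involves, a local identifier can be appended to a base URI controlled by the publisher. The `mint_uri` and `authority` helpers below are hypothetical names; the base URI is taken from the Ordnance Survey example above.

```python
from urllib.parse import quote, urlsplit


def mint_uri(base: str, local_id: str) -> str:
    """Combine a publisher-controlled base URI with a local identifier."""
    if not base.endswith("/"):
        base += "/"
    # Percent-encode the identifier so it is safe as a single path segment.
    return base + quote(str(local_id), safe="")


def authority(uri: str) -> str:
    """Return the host part, which shows who is responsible for the identifier."""
    return urlsplit(uri).netloc


uri = mint_uri("http://data.ordnancesurvey.co.uk/id/", "7000000000017765")
print(uri)
print(authority(uri))
```

The identifier becomes globally unique because of the host name, and the same host name makes clear whose identifier it is.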
32. The schema level
By using URIs, connections that existed only in people's minds can be put explicitly into the data model.
35. Data links
Central Contractor Registration (CCR)
Geonames
36. Linked Data Principles
1. Use URIs to name things (not only documents, but also people, locations, concepts, etc.)
2. To enable agents (human users and machine agents alike) to look up those names, use HTTP URIs
3. When someone looks up a URI, provide useful information (structured data in RDF, SPARQL).
4. Include links to other URIs, allowing agents to discover more things
http://www.w3.org/DesignIssues/LinkedData.html
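To make principles 1, 3 and 4 concrete, the sketch below writes two statements in N-Triples, a simple line-based RDF syntax: one describing a thing, one linking it to an identifier in another dataset. The `example.org` and `example.com` URIs are placeholders, and the `ntriple` helper is a hypothetical name; only the `rdfs:label` and `owl:sameAs` property URIs are real vocabulary terms. The escaping covers backslashes and double quotes only, so it is not a complete N-Triples serializer.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def ntriple(subject: str, predicate: str, obj: str, literal: bool = False) -> str:
    """Format one statement as an N-Triples line."""
    if literal:
        # Escape backslashes and double quotes inside a plain string literal.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_term = f'"{escaped}"'
    else:
        obj_term = f"<{obj}>"
    return f"<{subject}> <{predicate}> {obj_term} ."


# Placeholder URIs: a thing in "our" dataset and its counterpart elsewhere.
ours = "http://example.org/id/city/copenhagen"
theirs = "http://example.com/places/42"

lines = [
    ntriple(ours, RDFS_LABEL, "Copenhagen", literal=True),  # useful information
    ntriple(ours, OWL_SAME_AS, theirs),                     # link to other data
]
print("\n".join(lines))
```

The second line is the kind of link that lets an agent move from one dataset into another and discover more.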
38. Summary
• In the future, data will be open by default, unless there is good reason not to
• Emergence of a web of data
• “Five-star plan” for getting there, dataset by dataset
• 2 stars: re-usable data!
• 3 stars: open standards!
• 4+5 stars: connect the silos!