This slide deck accompanies a recent webinar that was designed to answer a common request from our community – how to make data visualizations from RDF datasets. Are there tools to help with developing queries? How can people who are not conversant with SPARQL get insights into data and understand its structure? How can they run SPARQL queries developed by others?
Listen to the webinar and explore the rest of the resources: https://ontotext.com/knowledgehub/webinars/building-knowledge-data-visualization/
[Webinar] Building Knowledge through Data Visualization
1. Data Visualization with GraphDB and Workbench
vladimir.alexiev@ontotext.com
Co-lead, Innovation and Consulting Group, Ontotext Corp
2. Outline
↗ Intro: Ontotext, GraphDB, Webinar
↗ Writing SPARQL
↗ Built-in SPARQL Result Visualizations
↗ Using SPARQL Results in Spreadsheets
↗ Invoking SPARQL Queries, Parameterization
↗ Tools that Help With Writing SPARQL Queries
↗ Tools for Statistical Visualizations
↗ Graph Visualizations: Built-in, Developing
↗ Visualization Toolkits
↗ Declarative Visualization
↗ JDBC Data Access API
↗ Q&A
3. Ontotext History and Essential Facts
↗ Started in 2000 as a Semantic Web pioneer
↗ As an innovation lab within Sirma Group (listed as SKK), the largest Bulgarian software house
↗ Spun off and took VC investment in 2008
↗ 65 staff, HQ in Bulgaria, reps in Canada, UK, Germany and USA
↗ Over 400 person-years invested in R&D
↗ Multiple innovation & technology awards: Washington Post, BBC, FT, BAIT, etc.
↗ Member of multiple industry bodies:
↗ W3C, EDMC, ODI, LDBC, STI, DBPedia Foundation
5. GraphDB
↗ Scalable RDF 1.1 engine
↗ Platform independent
↗ W3C standards support
↗ Open source API
↗ Reasoning and consistency checking
↗ Main contributor to RDF4J project
↗ Excellent support
6. This webinar
• SPARQL editing and data visualization features available in GraphDB Workbench (GDB WB)
• Using queries written by others: query URL, parameterization
• Data visualizations that can be added with little programming
• 3rd party SPARQL writing aids and visualization tools that can be integrated with GraphDB (we'd be glad to do that for you)
• Full report: HTML, PDF
• Webinar: presentation, TODO recording
15. Google Sheet Formulas
● Top left cell: get data (see next for the long ugly URL)
=importdata("http://factforge.net/repositories/ff-news?query=%23+F4%3A+Top-level+industries+by+number+of+companies%0A%23+-+benefits+from+the+mapping+and+consolidation+of+industry+classifications%0A%23+++and+predicates+in+DBPedia+done+in+the+FactForge%0A%23+-+benefits+from+reasoning+-+transitive+and+symmetric+properties+across%0A%23+++the+industry+classification+taxonomy+of+FactForge%0A%0APREFIX+dbo%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2F%3E%0APREFIX+ff-map%3A+%3Chttp%3A%2F%2Ffactforge.net%2Fff2016-mapping%2F%3E%0A%0ASELECT+DISTINCT+%3Ftop_industry+(COUNT(*)+AS+%3Fcount)%0A%7B%0A+++%3Fcompany+dbo%3Aindustry+%3Findustry+.%0A+++%3Findustry+%5Eff-map%3AindustryVariant+%2F+ff-map%3AindustryCenter+%3Ftop_industry+.%0A%7D%0AGROUP+BY+%3Ftop_industry+ORDER+BY+DESC(%3Fcount)+")
● Third col: extract industry name from industry URL
=regexreplace(A2,"http://dbpedia.org/resource/","")
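The long URL in the importdata formula does not need to be typed by hand. A minimal Python sketch, assuming the ff-news endpoint and the query from this slide (without its comment lines), shows one way to percent-encode the query and wrap it in the formula. The function names query_url and importdata_formula are made up for this illustration. Python's urlencode escapes a few more characters than the URL on the slide (e.g. parentheses), which should decode to the same query.

```python
from urllib.parse import urlencode

# Programmatic endpoint from the slide.
ENDPOINT = "http://factforge.net/repositories/ff-news"

# The query from the slide, decoded and without the leading comment lines.
QUERY = """PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX ff-map: <http://factforge.net/ff2016-mapping/>

SELECT DISTINCT ?top_industry (COUNT(*) AS ?count)
{
   ?company dbo:industry ?industry .
   ?industry ^ff-map:industryVariant / ff-map:industryCenter ?top_industry .
}
GROUP BY ?top_industry ORDER BY DESC(?count)
"""


def query_url(endpoint: str, sparql: str) -> str:
    """Return endpoint?query=<percent-encoded SPARQL> (hypothetical helper)."""
    return endpoint + "?" + urlencode({"query": sparql})


def importdata_formula(endpoint: str, sparql: str) -> str:
    """Wrap the query URL in a Google Sheets importdata() formula."""
    # The encoded URL contains no double quotes, so it is safe inside "...".
    return '=importdata("' + query_url(endpoint, sparql) + '")'


if __name__ == "__main__":
    print(importdata_formula(ENDPOINT, QUERY))
```

Pasting the printed formula into the top left cell should behave like the hand-built one above, provided the endpoint returns CSV by default as the slides indicate.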
17. Query URL
• Interactive endpoint: http://factforge.net/sparql
− versus programmatic endpoint: http://factforge.net/repositories/ff-news
• List of repos as JSON: http://factforge.net/rest/repositories
• Get query URL, then replace the endpoint
• If you dislike CSV, add Accept header, e.g.
curl -H "Accept: text/tab-separated-values" "<query URL>"
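The "replace the endpoint" step and the Accept header can also be scripted. The sketch below is only an illustration: it assumes the interactive URL has the path /sparql and carries the query in its query string, as on this slide, and the helper names to_programmatic and tsv_request are hypothetical. It builds the request without sending it.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request

# Paths taken from the slide: interactive vs programmatic endpoint.
INTERACTIVE_PATH = "/sparql"
PROGRAMMATIC_PATH = "/repositories/ff-news"


def to_programmatic(url: str) -> str:
    """Swap the interactive endpoint path for the programmatic one.

    The query string (including the encoded SPARQL) is kept unchanged.
    """
    parts = urlsplit(url)
    if parts.path != INTERACTIVE_PATH:
        raise ValueError("not an interactive endpoint URL: " + url)
    return urlunsplit(parts._replace(path=PROGRAMMATIC_PATH))


def tsv_request(url: str) -> Request:
    """Build (but do not send) a request asking for tab-separated results."""
    return Request(
        to_programmatic(url),
        headers={"Accept": "text/tab-separated-values"},
    )


if __name__ == "__main__":
    req = tsv_request(
        "http://factforge.net/sparql?query=SELECT+*+%7B%3Fs+%3Fp+%3Fo%7D+LIMIT+1"
    )
    print(req.full_url)
    print(req.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would execute the query; that call is left out here so the sketch runs without network access.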
18. Query Parameters
• E.g. find the industries of a given $company
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?industry {$company dbo:industry ?industry}
• Add parameter to query URL (value in NTriples format):
&$company=<http://dbpedia.org/resource/Google>
− URL: <http://dbpedia.org/resource/Google>
− plain string: "Google"
− string with language: "Google"@en
− date with XSD type: "2017-05-25"^^<http://www.w3.org/2001/XMLSchema#date>
• Try it, returns
?industry
<http://dbpedia.org/resource/Software>
<http://dbpedia.org/resource/Internet>
<http://dbpedia.org/resource/Mobile_device>
<http://dbpedia.org/resource/Cloud_computing>
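The parameter values above must be written as NTriples terms and then percent-encoded before they go into the URL. A small sketch of that formatting follows; the helpers nt_iri, nt_literal and bind are hypothetical names for this example, and the escaping covers only the common cases (backslash, double quote, newline, carriage return).

```python
from typing import Optional
from urllib.parse import quote

XSD = "http://www.w3.org/2001/XMLSchema#"


def nt_iri(iri: str) -> str:
    """Format an IRI as an NTriples term: <...>."""
    return "<" + iri + ">"


def nt_literal(text: str, lang: Optional[str] = None,
               datatype: Optional[str] = None) -> str:
    """Format a literal: "text", "text"@lang or "text"^^<datatype>."""
    if lang and datatype:
        raise ValueError("a literal has a language tag or a datatype, not both")
    escaped = (text.replace("\\", "\\\\").replace('"', '\\"')
                   .replace("\n", "\\n").replace("\r", "\\r"))
    term = '"' + escaped + '"'
    if lang:
        term += "@" + lang
    elif datatype:
        term += "^^" + nt_iri(datatype)
    return term


def bind(url: str, name: str, term: str) -> str:
    """Append a $name=<term> binding to a query URL, percent-encoded."""
    sep = "&" if "?" in url else "?"
    return url + sep + quote("$" + name, safe="") + "=" + quote(term, safe="")


if __name__ == "__main__":
    base = "http://factforge.net/repositories/ff-news?query=..."  # query elided
    print(bind(base, "company", nt_iri("http://dbpedia.org/resource/Google")))
    print(nt_literal("Google", lang="en"))
    print(nt_literal("2017-05-25", datatype=XSD + "date"))
```

Here the "$" is encoded as %24, whereas the slide writes it literally; both forms should reach the server as the same parameter name.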
20. IRISA SQUALL (CNL)
• SQUALL (Semantic Query and Update High-Level Language).
2011-2013. Paper 1, 2, 3, examples.
• Example question
Which person is an author of at least 10 publication-s?
• Translates to
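SELECT DISTINCT ?x1 WHERE {
  ?x1 a :person .
  {SELECT DISTINCT ?x1 (COUNT(DISTINCT ?x3) AS ?x2) WHERE {
    ?x3 a :publication .
    ?x3 :author ?x1 . }
  # the slide is cut off here; the closing lines are reconstructed from the question
  GROUP BY ?x1 }
  FILTER (?x2 >= 10)
}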
22. GrammaticalFramework and MOLTO
• GrammaticalFramework: multilingual CNL
• MOLTO: EC FP project. Ontotext publications: 1, 2, 3, 4
• Define abstract grammar about a domain, with surface
grammars for several natural languages
• When one of the surface languages is SPARQL, this enables
CNL to/from SPARQL translation
23. MOLTO: CNL query to SPARQL
Question in English/Swedish is translated to SPARQL
24. MOLTO: RDF to NL Generation (Lexicalization)
painting description in a dozen languages
26. W3C Data Cube
W3C Data Cube ontology:
• OLAP data model
• Statistical classifications following SDMX
Many statistical datasets available as RDF, e.g.:
• Linked SDMX Data developed by Sarven Capadisli: International Monetary Fund IMF,
OECD, UN Food and Agriculture Organization FAO, Swiss Federal Statistical Office
BFS, European Central Bank ECB, World Bank, Transparency International.
• Eurostat developed by the LOD Around the Clock (LATC) project (static)
• Eurostat wrapper developed by Benedikt Kämpgen (updateable)
• US Securities and Exchange Commission SEC Edgar Wrapper developed by Benedikt
Kämpgen
• UN ComTrade developed by the Multisensor project
27. AKSW CubeViz
CubeViz: faceted statistical browser, visualization charts.
● Original project: OntoWiki addon (dependency), PHP: demo, source, wiki, used at the EU Open Data Portal.
● Currently being rewritten in JavaScript: demo (doesn't quite work), source
29. OpenCube Toolkit
OpenCube Toolkit developed by OpenCube project. Tools for:
Data Creation (conversion)
• TARQL extension: CSV/TSV files
• D2RQ extension for data cubes: relational databases
• JSON-stat2qb extension: JSON-stat
• R2RML extension: relational databases, following W3C standard
Data Expanding
• OpenCube Compatibility Explorer: (a) search LOD and find cubes compatible to expand initial cube, (b) establish
typed links
• OpenCube Aggregator: (a) creates 2^n − 1 new cubes: all combinations of n dimensions. (b) new observations for all
attributes of a hierarchical dimension.
• OpenCube Expander: merge two compatible cubes.
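The count of 2^n − 1 cubes quoted for the Aggregator can be checked by enumerating the non-empty combinations of n dimensions. The sketch below only illustrates the counting; it is not the OpenCube Aggregator's implementation, and the dimension names are invented. The slide does not say whether the Aggregator counts the empty combination in place of the full one; either reading gives the same total.

```python
from itertools import combinations
from typing import Iterable, List, Tuple


def dimension_subsets(dimensions: Iterable[str]) -> List[Tuple[str, ...]]:
    """Return every non-empty combination of the given dimensions."""
    dims = list(dimensions)
    return [
        combo
        for size in range(1, len(dims) + 1)
        for combo in combinations(dims, size)
    ]


if __name__ == "__main__":
    dims = ["time", "region", "product"]  # hypothetical cube dimensions
    subsets = dimension_subsets(dims)
    for combo in subsets:
        print(combo)
    # The number of combinations equals 2^n - 1 for n dimensions.
    print(len(subsets), 2 ** len(dims) - 1)
```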
Data Exploring
• Data catalogue management: user interface (UI) templates for managing metadata on RDF data cubes and
supporting search and discovery
• OpenCube Browser: table-based visualizations
• OpenCube OLAP Browser: OLAP operations: pivot, drill-down, and roll-up
• R statistical analysis: run R data analysis scripts
• Interactive chart visualization widgets: cube slices with charts
• OpenCube MapView: visualize geo-spatial dimension: choropleth, markers, bubbles
30. CubesViewer
• CubesViewer: excellent OLAP visualization tool: demo,
CubesViewer Studio demo, source, documentation.
• Based on DataBrewery Cubes framework: source, documentation.
• Unfortunately does not yet support W3C Data Cube
− We'd love to develop such a feature for you (tracking issue)
41. Visualization Toolkits
Numerous powerful and popular visualization tools, creating an amazing
variety of graphs and charts, e.g.:
● d3.js, with addons (e.g. interactive selection of chart type)
● Tableau Public edition
● Microsoft PowerBI
● GoJS
● Google Charts
● Linkurious
Specialized tools, e.g.
● CrossFilter for "faceting" of multidimensional data,
● Cubism for viewing time series
● CubeViz and OpenCube Toolkit for statistical data
● Histropedia for making advanced timelines
42. Example with GDB and Tableau
Public procurement spending through last 5 Bulgarian cabinets
(2011-2016). Sofia Datathon, March 2017. Slides, Visualization
43. Example with GDB and PowerBI
Procurements by one contracting authority in time. Filtering by government
cabinet, focusing by time interval. Sofia Hackathon, Apr 2017
45. RDF by Example
• ONTO tool for RDF instance visualization (rdfpuml) and R2RML generation (rdf2rml).
• E.g. mapping Dun & Bradstreet company data to Financial Industry Business Ontology (FIBO)
46. RDF by Example
• Dun & Bradstreet details (top-right): 3 "measures" (NetWorth, AnnualSales, ProfitLoss)
• Total of 152 fields grouped in 32 nodes: impossible to comprehend without such a diagram
47. R2RML Generation
• Model of Museum Exhibitions (for J. Paul Getty Museum)
• Includes RDB joins and field names (Gallery TMS)
48. R2RML Generated From Model
• R2RML is verbose: 3 nodes, 15 statements for every model statement
• 1 model node (representing an Exhibition at a Venue) is expanded to
15 R2RML nodes: huge savings in complexity and maintainability
• R2RML requires semantic experts, whereas model diagrams can be
understood by subject-matter experts (museum curators, commodity
trade analysts, etc)
• Details in SWIB'16 presentation
51. Why JDBC/ODBC?
• Many viz tools (e.g. Pentaho, Centrifuge, QlikView, Tableau) have ODBC/JDBC interfaces
• To save the effort of constructing query URLs and saving results, we can provide a JDBC API
to GraphDB
• The user feeds SPARQL (not SQL) queries through JDBC, and tabular SPARQL results are
returned to the tool
• We can reuse Jena JDBC or another open source library
• If the tool supports ODBC not JDBC, we can use the JDBC-ODBC bridge
(sun.jdbc.odbc.JdbcOdbcDriver; removed from the JDK in Java 8).
• E.g. connecting from Java to Excel using ODBC and the JDBC-ODBC bridge
53. • Contact: vladimir.alexiev@ontotext.com
Lead, Innovation and Consulting Group, Ontotext Corp
• We'd be glad to deploy any 3rd party tools and integrate them with GraphDB for you!
Thanks for your attention.
Question time!