Tabular Data on the Web
Intro to W3C CSV on the Web Specifications
Gregg Kellogg
gregg@greggkellogg.net
@gkellogg
Impact of Tabular Data
• Tabular data makes up a large share of all
data published on the Web
• According to the Open Data Institute, the vast
majority of published open data is tabular
• “Over 90% of the data on data.gov.uk is
tabular data.”
• data.gov lists 158,631 datasets; largely in CSV
Sources of Tabular Data
• Easiest way to publish data
• Spreadsheet Dumps
• Database Dumps
• SPARQL results
CSV data is dumb
• It’s a simple text format; the data has no inherent
meaning.
• Cells may be data-typed or have a regular
format: what does “08/09/2015” mean?
• Cells may be related to data in other tables/columns:
Foreign Keys
• Cells may be associated with different entities:
Join results
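The date question above can be made concrete with a small sketch. It assumes only two candidate readings (month-first and day-first); nothing in the CSV itself says which one the publisher intended.

```python
from datetime import datetime

raw = "08/09/2015"

# Two plausible readings of the same cell; the CSV alone cannot tell them apart.
month_first = datetime.strptime(raw, "%m/%d/%Y").date()
day_first = datetime.strptime(raw, "%d/%m/%Y").date()

print(month_first.isoformat())  # 2015-08-09
print(day_first.isoformat())    # 2015-09-08
```

A datatype and format declared in metadata is what removes this ambiguity.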
Web CSV
• 5-star Linked Data
• CSV URLs
• CSVs link to other CSVs
• CSVs link to other Resources
• RDF and JSON conversion
W3C CSV on the Web
• Working Group chartered to help applications achieve greater
interoperability when working with CSV or similar formats.
• Use Cases: http://www.w3.org/TR/csvw-ucr/
• Model for Tabular Data and Metadata on the Web: http://www.w3.org/TR/tabular-data-model/
• Metadata Vocabulary for Tabular Data: http://www.w3.org/TR/tabular-metadata/
• Generating JSON from Tabular Data on the Web: http://www.w3.org/TR/csv2json/
• Generating RDF from Tabular Data on the Web: http://www.w3.org/TR/csv2rdf/
Examples
countryCode  latitude  longitude  name
AD           42.5      1.6        Andorra
AE           23.4      53.8       United Arab Emirates
AF           33.9      67.7       Afghanistan
countries.csv

countryRef  year  population
AF          1960  9,616,353
AF          1961  9,799,379
AF          1962  9,989,846
country_slice.csv
Model for Tabular Data
(Diagram: annotations of the tabular data model, grouped by class)
• Table Group: id, notes, tables, other annotations
• Table: id, url, columns, rows, foreign keys, notes, table direction, transformations, other annotations
• Column: about URL, cells, datatype, default, lang, name, number, ordered, property URL, required, separator, table, text direction, titles, value URL, virtual
• Row: cells, number, primary key, referenced rows, source number, table, titles
• Cell: about URL, column, errors, ordered, property URL, row, string value, table, text direction, value, value URL
• Links between classes: a table group has tables; a table has columns and rows; columns, rows, and cells refer back to their table
Mapping CSV to Model
• Parse CSV: RFC4180 + dialect metadata.
• delimiter, doubleQuote, headerRowCount,
lineTerminators, quoteChar, …
• Dialect Description comes from Metadata Document.
• Match Headers to Columns.
• Parse Cells using Column metadata/datatype.
• Abstract data model used for viewing, validation, and
conversions.
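The mapping steps above can be sketched roughly as follows. This is a simplified illustration, not the algorithm from the specification: the `DIALECT` values, the `parse_table` helper, and the dictionary layout of the model are assumptions made for this example, and only four dialect properties are honoured.

```python
import csv
import io

CSV_TEXT = (
    "countryCode,latitude,longitude,name\n"
    "AD,42.5,1.6,Andorra\n"
    "AE,23.4,53.8,United Arab Emirates\n"
    "AF,33.9,67.7,Afghanistan\n"
)

# Hypothetical dialect description; property names follow the slide.
DIALECT = {"delimiter": ",", "quoteChar": '"', "doubleQuote": True, "headerRowCount": 1}


def parse_table(text, dialect):
    """Parse CSV text into a small columns/rows/cells structure."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=dialect["delimiter"],
        quotechar=dialect["quoteChar"],
        doublequote=dialect["doubleQuote"],
    )
    source_rows = list(reader)
    header_count = dialect["headerRowCount"]
    # Only the first header row is used for titles in this sketch.
    titles = source_rows[0] if header_count else []
    columns = [{"number": i, "titles": [title]} for i, title in enumerate(titles, start=1)]
    rows = []
    for number, values in enumerate(source_rows[header_count:], start=1):
        cells = [
            {"column": column["number"], "string value": value}
            for column, value in zip(columns, values)
        ]
        # "source number" counts lines in the original file, header included.
        rows.append({"number": number, "source number": number + header_count, "cells": cells})
    return {"columns": columns, "rows": rows}


table = parse_table(CSV_TEXT, DIALECT)
print(table["columns"][0])
print(table["rows"][0]["source number"])
```

Cells stay as strings here; applying column datatypes is a separate step.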
Metadata
• Finding Metadata from a CSV
• User-specified, Link Header, well-known
locations
• Matching Metadata to a CSV
• CSV must be compatible with metadata (titles/names)
• Metadata must reference CSV URL
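A rough sketch of the lookup order and the URL check described above. The two default locations (`{+url}-metadata.json` and `csv-metadata.json`) are taken from the published specification rather than from this slide; the helper names and the example URLs are hypothetical, and no HTTP requests are made.

```python
from urllib.parse import urljoin


def candidate_metadata_urls(csv_url, link_header_target=None, user_supplied=None):
    """List places to look for metadata, most specific source first."""
    candidates = []
    if user_supplied:
        candidates.append(user_supplied)
    if link_header_target:
        # A Link header target may be relative to the CSV URL.
        candidates.append(urljoin(csv_url, link_header_target))
    # Assumed default locations: file-specific first, then directory-wide.
    candidates.append(csv_url + "-metadata.json")
    candidates.append(urljoin(csv_url, "csv-metadata.json"))
    return candidates


def metadata_describes(metadata, metadata_url, csv_url):
    """Check that some table description in the metadata references the CSV URL."""
    tables = metadata.get("tables", [metadata])
    return any(urljoin(metadata_url, t.get("url", "")) == csv_url for t in tables)


csv_url = "http://example.org/data/countries.csv"
print(candidate_metadata_urls(csv_url))
```

Metadata found at a candidate location would still need the compatibility check on titles/names before being used.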
Metadata vocabulary (diagram: properties grouped by description type)
• Top-Level Properties: @language, @base
• Table Group: @context, @id, @type, tables, tableSchema, tableDirection, dialect, transformations, notes
• Table: @context, @id, @type, url, tableSchema, tableDirection, dialect, transformations, notes, suppressOutput
• Schema: @id, @type, columns, primaryKey, foreignKeys, rowTitles
• Column Description: @id, @type, name, titles, required, suppressOutput, virtual
• Inherited Properties: aboutUrl, propertyUrl, valueUrl, datatype, default, lang, null, ordered, required, separator, textDirection
• Datatype Description: @id, @type, base, format, length, minLength, maxLength, minimum, maximum, minInclusive, maxInclusive, minExclusive, maxExclusive
• Number Format: decimalChar, groupChar, pattern
• Dialect Description: @id, commentPrefix, delimiter, doubleQuote, encoding, header, headerRowCount, lineTerminators, quoteChar, skipBlankRows, skipColumns, skipInitialSpace, skipRows, trim
• Foreign Key Definition: columnReference, reference
• Foreign Key Reference: resource, schemaReference, columnReference
• Transformation Definition: @id, @type, url, targetFormat, scriptFormat, titles, source
• Legend (property kinds): array property, link property, URI template property, column reference property, object property, natural language property, atomic property; arrows mark a reference to a value, or to an array of values, of a specific category
Schema
• Column Descriptions
• Names/Titles
• Datatype
• Primary Keys
• Foreign Key Relationships
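As an illustration of how these schema pieces fit together, here is a sketch of a metadata document for the two example tables, built as a Python dictionary and serialised to JSON. The property names come from the metadata vocabulary; the specific datatypes and the `groupChar` format for the population column are assumptions chosen for this example.

```python
import json

metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "tables": [
        {
            "url": "countries.csv",
            "tableSchema": {
                "columns": [
                    {"name": "countryCode", "titles": "countryCode", "datatype": "string"},
                    {"name": "latitude", "titles": "latitude", "datatype": "number"},
                    {"name": "longitude", "titles": "longitude", "datatype": "number"},
                    {"name": "name", "titles": "name", "datatype": "string"},
                ],
                "primaryKey": "countryCode",
            },
        },
        {
            "url": "country_slice.csv",
            "tableSchema": {
                "columns": [
                    {"name": "countryRef", "titles": "countryRef", "datatype": "string"},
                    {"name": "year", "titles": "year", "datatype": "gYear"},
                    {
                        "name": "population",
                        "titles": "population",
                        # Assumed format: values such as 9,616,353 use "," as group separator.
                        "datatype": {"base": "integer", "format": {"groupChar": ","}},
                    },
                ],
                # countryRef in this table must match a countryCode in countries.csv.
                "foreignKeys": [
                    {
                        "columnReference": "countryRef",
                        "reference": {"resource": "countries.csv", "columnReference": "countryCode"},
                    }
                ],
            },
        },
    ],
}

print(json.dumps(metadata, indent=2))
```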
Embedded Metadata
• Generally Column Titles.
• Formats may define CSV conventions for
embedded metadata.
• Principally used to determine metadata
compatibility.
• Also serves as default metadata if no metadata file is
located.
Datatypes
• Basic XSD datatypes
• maximum/minimum facets
• minLength/maxLength facets
• format/pattern
• RegExp, Boolean, UAX35 date/time picture
string, UAX35 number picture string
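A minimal sketch of how facets and formats might be checked for a single cell. It assumes a much-reduced datatype description: only `integer` and `string` bases, a `groupChar` number format, a regular-expression `format` for strings, and the minimum/maximum and minLength/maxLength facets. The `check_cell` helper is hypothetical and does not handle UAX35 picture strings.

```python
import re


def check_cell(string_value, datatype):
    """Return (value, errors) for one cell under a simplified datatype description."""
    errors = []
    value = string_value
    base = datatype.get("base", "string")
    fmt = datatype.get("format")
    if base == "integer":
        group_char = fmt.get("groupChar") if isinstance(fmt, dict) else None
        digits = string_value.replace(group_char, "") if group_char else string_value
        try:
            value = int(digits)
        except ValueError:
            return string_value, [f"not an integer: {string_value!r}"]
        if "minimum" in datatype and value < datatype["minimum"]:
            errors.append(f"{value} is below minimum {datatype['minimum']}")
        if "maximum" in datatype and value > datatype["maximum"]:
            errors.append(f"{value} is above maximum {datatype['maximum']}")
    else:
        # For strings, format is treated as a regular expression over the whole value.
        if isinstance(fmt, str) and re.fullmatch(fmt, string_value) is None:
            errors.append(f"{string_value!r} does not match {fmt!r}")
        if "minLength" in datatype and len(string_value) < datatype["minLength"]:
            errors.append("value is shorter than minLength")
        if "maxLength" in datatype and len(string_value) > datatype["maxLength"]:
            errors.append("value is longer than maxLength")
    return value, errors


population = {"base": "integer", "format": {"groupChar": ","}, "minimum": 0}
country_code = {"base": "string", "format": "[A-Z]{2}"}

print(check_cell("9,616,353", population))
print(check_cell("ad", country_code))
```

Errors are collected rather than raised, so a validator could report every failing cell in one pass.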
Other Features
• Split cells into multiple items
• Validate Primary Keys and Foreign Key
references (single and multiple columns)
• Define URL properties for columns
• Multiple subjects per column (may be URLs)
• Values as URLs
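Two of these features can be sketched in a few lines: splitting a cell on a separator, and building a subject URL from a template. The `split_cell` and `expand_template` helpers, the example values, and the URL pattern are hypothetical. The template expansion handles only plain `{name}` variables; the operators and percent-encoding of full URI templates (RFC 6570) are deliberately left out.

```python
import re


def split_cell(string_value, separator):
    """Split one cell's string value into a list of items."""
    return string_value.split(separator)


def expand_template(template, row):
    """Replace each {name} in the template with that column's value in the row."""
    return re.sub(r"\{(\w+)\}", lambda match: row[match.group(1)], template)


row = {"countryCode": "AD", "name": "Andorra"}

print(split_cell("red;green;blue", ";"))
print(expand_template("http://example.org/countries/{countryCode}", row))
```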
Conversions: JSON
countryCode latitude longitude name
AD 42.5 1.6 Andorra
AE 23.4 53.8 United Arab Emirates
AF 33.9 67.7 Afghanistan
countries.csv
{
  "tables": [{
    "url": "http://example.org/countries.csv",
    "row": [{
      "url": "http://example.org/countries.csv#row=2",
      "rownum": 1,
      "describes": [{
        "countryCode": "AD",
        "latitude": "42.5",
        "longitude": "1.6",
        "name": "Andorra"
      }]
    }, {
      "url": "http://example.org/countries.csv#row=3",
      "rownum": 2,
      "describes": [{
        "countryCode": "AE",
        "latitude": "23.4",
        "longitude": "53.8",
        "name": "United Arab Emirates"
      }]
    }, {
      "url": "http://example.org/countries.csv#row=4",
      "rownum": 3,
      "describes": [{
        "countryCode": "AF",
        "latitude": "33.9",
        "longitude": "67.7",
        "name": "Afghanistan"
      }]
    }]
  }]
}
countries.json
countries-standard.json
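The shape of this standard-mode output can be reproduced with a short sketch. It assumes no metadata beyond the header row (so every value stays a string), a single header line, and one subject per row; the `to_standard_json` helper is hypothetical and is not the conversion algorithm from the specification.

```python
import csv
import io
import json

CSV_TEXT = (
    "countryCode,latitude,longitude,name\n"
    "AD,42.5,1.6,Andorra\n"
    "AE,23.4,53.8,United Arab Emirates\n"
    "AF,33.9,67.7,Afghanistan\n"
)


def to_standard_json(csv_text, table_url):
    """Build a table/row/describes structure from headed CSV text."""
    source_rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = source_rows[0], source_rows[1:]
    rows = []
    for number, cells in enumerate(data, start=1):
        source_number = number + 1  # the header occupies source row 1
        rows.append({
            "url": f"{table_url}#row={source_number}",
            "rownum": number,
            "describes": [dict(zip(header, cells))],
        })
    return {"tables": [{"url": table_url, "row": rows}]}


result = to_standard_json(CSV_TEXT, "http://example.org/countries.csv")
print(json.dumps(result, indent=2))
```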
Conversions: JSON (min)
countryCode latitude longitude name
AD 42.5 1.6 Andorra
AE 23.4 53.8 United Arab Emirates
AF 33.9 67.7 Afghanistan
[{
  "countryCode": "AD",
  "latitude": "42.5",
  "longitude": "1.6",
  "name": "Andorra"
}, {
  "countryCode": "AE",
  "latitude": "23.4",
  "longitude": "53.8",
  "name": "United Arab Emirates"
}, {
  "countryCode": "AF",
  "latitude": "33.9",
  "longitude": "67.7",
  "name": "Afghanistan"
}]
countries.csv
countries.json
countries-minimal.json
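Minimal mode drops the table and row wrappers and keeps only the objects the rows describe. A sketch under the same assumptions as before (header row only, no datatypes, one subject per row), with a hypothetical `to_minimal_json` helper:

```python
import csv
import io
import json

CSV_TEXT = (
    "countryCode,latitude,longitude,name\n"
    "AD,42.5,1.6,Andorra\n"
    "AE,23.4,53.8,United Arab Emirates\n"
    "AF,33.9,67.7,Afghanistan\n"
)


def to_minimal_json(csv_text):
    """Return one plain object per data row, keyed by column title."""
    return [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]


objects = to_minimal_json(CSV_TEXT)
print(json.dumps(objects, indent=2))
```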
Conversions: RDF
countryCode latitude longitude name
AD 42.5 1.6 Andorra
AE 23.4 53.8 United Arab Emirates
AF 33.9 67.7 Afghanistan
@base <http://example.org/countries.csv> .
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

_:tg a csvw:TableGroup ;
  csvw:table [ a csvw:Table ;
    csvw:url <http://example.org/countries.csv> ;
    csvw:row [ a csvw:Row ;
      csvw:rownum "1"^^xsd:integer ;
      csvw:url <#row=2> ;
      csvw:describes _:t1r1
    ], [ a csvw:Row ;
      csvw:rownum "2"^^xsd:integer ;
      csvw:url <#row=3> ;
      csvw:describes _:t1r2
    ], [ a csvw:Row ;
      csvw:rownum "3"^^xsd:integer ;
      csvw:url <#row=4> ;
      csvw:describes _:t1r3
    ]
  ] .

_:t1r1
  <#countryCode> "AD" ;
  <#latitude> "42.5" ;
  <#longitude> "1.6" ;
  <#name> "Andorra" .

_:t1r2
  <#countryCode> "AE" ;
  <#latitude> "23.4" ;
  <#longitude> "53.8" ;
  <#name> "United Arab Emirates" .

_:t1r3
  <#countryCode> "AF" ;
  <#latitude> "33.9" ;
  <#longitude> "67.7" ;
  <#name> "Afghanistan" .
countries.csv
countries.json
countries-standard.ttl
Conversions: RDF (min)
countryCode latitude longitude name
AD 42.5 1.6 Andorra
AE 23.4 53.8 United Arab Emirates
AF 33.9 67.7 Afghanistan
@base <http://example.org/countries.csv> .

_:t1r1
  <#countryCode> "AD" ;
  <#latitude> "42.5" ;
  <#longitude> "1.6" ;
  <#name> "Andorra" .

_:t1r2
  <#countryCode> "AE" ;
  <#latitude> "23.4" ;
  <#longitude> "53.8" ;
  <#name> "United Arab Emirates" .

_:t1r3
  <#countryCode> "AF" ;
  <#latitude> "33.9" ;
  <#longitude> "67.7" ;
  <#name> "Afghanistan" .
countries.csv
countries.json
countries-minimal.ttl
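A sketch that prints Turtle in the same shape as the minimal output. It is string templating, not a real RDF serialiser: it assumes column titles are safe to use as URL fragments without encoding, treats every value as a plain string literal, and uses a hypothetical `to_minimal_turtle` helper with the blank node naming taken from the listing.

```python
import csv
import io

CSV_TEXT = (
    "countryCode,latitude,longitude,name\n"
    "AD,42.5,1.6,Andorra\n"
    "AE,23.4,53.8,United Arab Emirates\n"
    "AF,33.9,67.7,Afghanistan\n"
)


def escape_literal(value):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_minimal_turtle(csv_text, table_url):
    """Emit one blank-node subject per row, one predicate per column."""
    lines = [f"@base <{table_url}> ."]
    reader = csv.DictReader(io.StringIO(csv_text))
    for number, row in enumerate(reader, start=1):
        predicates = [
            f'  <#{name}> "{escape_literal(value)}"' for name, value in row.items()
        ]
        lines.append(f"_:t1r{number}")
        lines.append(" ;\n".join(predicates) + " .")
    return "\n".join(lines)


turtle = to_minimal_turtle(CSV_TEXT, "http://example.org/countries.csv")
print(turtle)
```

For real data, an RDF library would be a safer way to produce valid output.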
Other examples
• Rich Annotations: JSON RDF
• Virtual Columns/Multiple Subjects: JSON RDF
• For more see Specifications and Test Suite
Tools
• CSVLint
• CKAN – open source data portal platform
• Socrata – cloud-based open data
• Google Fusion Tables – data visualization
• Ruby rdf-tabular – CSVW reference implementation
• RDF Distiller
• Structured Data Linter
Next Steps
• At-Risk – /.well-known/csvm
• More datatype formats
• Metadata in HTML (embedded JSON-LD)
• Tabular Data in HTML
• More implementations!
• Timeline
• Candidate Recommendation – July 2015
• Proposed Recommendation – Oct 2015
• W3C Recommendation – Dec 2015
More Information
GitHub
w3c
Gregg Kellogg
@gkellogg
gregg@greggkellogg.net
http://greggkellogg.net/
http://www.slideshare.net/gkellogg1/tabular-data-on-the-web
distiller
linter
Slideshare

More Related Content

What's hot

SemanticWeb Nuts 'n Bolts
SemanticWeb Nuts 'n BoltsSemanticWeb Nuts 'n Bolts
SemanticWeb Nuts 'n BoltsRinke Hoekstra
 
Performance comparison: Multi-Model vs. MongoDB and Neo4j
Performance comparison: Multi-Model vs. MongoDB and Neo4jPerformance comparison: Multi-Model vs. MongoDB and Neo4j
Performance comparison: Multi-Model vs. MongoDB and Neo4jArangoDB Database
 
MongoDB 2.4 and spring data
MongoDB 2.4 and spring dataMongoDB 2.4 and spring data
MongoDB 2.4 and spring dataJimmy Ray
 
Building Spring Data with MongoDB
Building Spring Data with MongoDBBuilding Spring Data with MongoDB
Building Spring Data with MongoDBMongoDB
 
Data Integration And Visualization
Data Integration And VisualizationData Integration And Visualization
Data Integration And VisualizationIvan Ermilov
 
Searching Relational Data with Elasticsearch
Searching Relational Data with ElasticsearchSearching Relational Data with Elasticsearch
Searching Relational Data with Elasticsearchsirensolutions
 
Linked Data in Use: Schema.org, JSON-LD and hypermedia APIs - Front in Bahia...
Linked Data in Use: Schema.org, JSON-LD and hypermedia APIs  - Front in Bahia...Linked Data in Use: Schema.org, JSON-LD and hypermedia APIs  - Front in Bahia...
Linked Data in Use: Schema.org, JSON-LD and hypermedia APIs - Front in Bahia...Ícaro Medeiros
 
Is multi-model the future of NoSQL?
Is multi-model the future of NoSQL?Is multi-model the future of NoSQL?
Is multi-model the future of NoSQL?Max Neunhöffer
 
OrientDB: Unlock the Value of Document Data Relationships
OrientDB: Unlock the Value of Document Data RelationshipsOrientDB: Unlock the Value of Document Data Relationships
OrientDB: Unlock the Value of Document Data RelationshipsFabrizio Fortino
 
Describing LDP Applications with the Hydra Core Vocabulary
Describing LDP Applications with the Hydra Core VocabularyDescribing LDP Applications with the Hydra Core Vocabulary
Describing LDP Applications with the Hydra Core VocabularyNandana Mihindukulasooriya
 
Overview of GraphQL & Clients
Overview of GraphQL & ClientsOverview of GraphQL & Clients
Overview of GraphQL & ClientsPokai Chang
 
Building your first app with MongoDB
Building your first app with MongoDBBuilding your first app with MongoDB
Building your first app with MongoDBNorberto Leite
 
Comparison with storing data using NoSQL(CouchDB) and a relational database.
Comparison with storing data using NoSQL(CouchDB) and a relational database.Comparison with storing data using NoSQL(CouchDB) and a relational database.
Comparison with storing data using NoSQL(CouchDB) and a relational database.eross77
 
Introduction to Linked Data Platform (LDP)
Introduction to Linked Data Platform (LDP)Introduction to Linked Data Platform (LDP)
Introduction to Linked Data Platform (LDP)Hector Correa
 
Gerry McNicol Graph Databases
Gerry McNicol Graph DatabasesGerry McNicol Graph Databases
Gerry McNicol Graph DatabasesGerry McNicol
 

What's hot (20)

SemanticWeb Nuts 'n Bolts
SemanticWeb Nuts 'n BoltsSemanticWeb Nuts 'n Bolts
SemanticWeb Nuts 'n Bolts
 
Performance comparison: Multi-Model vs. MongoDB and Neo4j
Performance comparison: Multi-Model vs. MongoDB and Neo4jPerformance comparison: Multi-Model vs. MongoDB and Neo4j
Performance comparison: Multi-Model vs. MongoDB and Neo4j
 
MongoDB 2.4 and spring data
MongoDB 2.4 and spring dataMongoDB 2.4 and spring data
MongoDB 2.4 and spring data
 
Building Spring Data with MongoDB
Building Spring Data with MongoDBBuilding Spring Data with MongoDB
Building Spring Data with MongoDB
 
SHACL Overview
SHACL OverviewSHACL Overview
SHACL Overview
 
Data Integration And Visualization
Data Integration And VisualizationData Integration And Visualization
Data Integration And Visualization
 
JSON-LD
JSON-LDJSON-LD
JSON-LD
 
Searching Relational Data with Elasticsearch
Searching Relational Data with ElasticsearchSearching Relational Data with Elasticsearch
Searching Relational Data with Elasticsearch
 
Linked Data in Use: Schema.org, JSON-LD and hypermedia APIs - Front in Bahia...
Linked Data in Use: Schema.org, JSON-LD and hypermedia APIs  - Front in Bahia...Linked Data in Use: Schema.org, JSON-LD and hypermedia APIs  - Front in Bahia...
Linked Data in Use: Schema.org, JSON-LD and hypermedia APIs - Front in Bahia...
 
Is multi-model the future of NoSQL?
Is multi-model the future of NoSQL?Is multi-model the future of NoSQL?
Is multi-model the future of NoSQL?
 
OrientDB: Unlock the Value of Document Data Relationships
OrientDB: Unlock the Value of Document Data RelationshipsOrientDB: Unlock the Value of Document Data Relationships
OrientDB: Unlock the Value of Document Data Relationships
 
Describing LDP Applications with the Hydra Core Vocabulary
Describing LDP Applications with the Hydra Core VocabularyDescribing LDP Applications with the Hydra Core Vocabulary
Describing LDP Applications with the Hydra Core Vocabulary
 
Introduction to W3C Linked Data Platform
Introduction to W3C Linked Data PlatformIntroduction to W3C Linked Data Platform
Introduction to W3C Linked Data Platform
 
Linked Data Fragments
Linked Data FragmentsLinked Data Fragments
Linked Data Fragments
 
Overview of GraphQL & Clients
Overview of GraphQL & ClientsOverview of GraphQL & Clients
Overview of GraphQL & Clients
 
Building your first app with MongoDB
Building your first app with MongoDBBuilding your first app with MongoDB
Building your first app with MongoDB
 
ArangoDB
ArangoDBArangoDB
ArangoDB
 
Comparison with storing data using NoSQL(CouchDB) and a relational database.
Comparison with storing data using NoSQL(CouchDB) and a relational database.Comparison with storing data using NoSQL(CouchDB) and a relational database.
Comparison with storing data using NoSQL(CouchDB) and a relational database.
 
Introduction to Linked Data Platform (LDP)
Introduction to Linked Data Platform (LDP)Introduction to Linked Data Platform (LDP)
Introduction to Linked Data Platform (LDP)
 
Gerry McNicol Graph Databases
Gerry McNicol Graph DatabasesGerry McNicol Graph Databases
Gerry McNicol Graph Databases
 

Viewers also liked

RDFS In A Nutshell V1
RDFS In A Nutshell V1RDFS In A Nutshell V1
RDFS In A Nutshell V1Fabien Gandon
 
Tutorials--Logarithmic Functions in Tabular and Graph Form
Tutorials--Logarithmic Functions in Tabular and Graph Form	Tutorials--Logarithmic Functions in Tabular and Graph Form
Tutorials--Logarithmic Functions in Tabular and Graph Form Media4math
 
Approaches to Develop Curriculum for Children Visual Impairment
Approaches to Develop Curriculum for Children Visual ImpairmentApproaches to Develop Curriculum for Children Visual Impairment
Approaches to Develop Curriculum for Children Visual ImpairmentRajnish Kumar Arya
 
CRL: A Rule Language for Table Analysis and Interpretation
CRL: A Rule Language for Table Analysis and InterpretationCRL: A Rule Language for Table Analysis and Interpretation
CRL: A Rule Language for Table Analysis and InterpretationAlexey Shigarov
 
Visual impairment
Visual impairmentVisual impairment
Visual impairmentCachelle
 
Visual Impairment Information and Teaching Strategies
Visual Impairment Information and Teaching StrategiesVisual Impairment Information and Teaching Strategies
Visual Impairment Information and Teaching StrategiesMauro Garcia
 
Construction ontologies
Construction ontologiesConstruction ontologies
Construction ontologiesAggoumazax Moh
 
Ontology In A Nutshell (version 2)
Ontology In A Nutshell (version 2)Ontology In A Nutshell (version 2)
Ontology In A Nutshell (version 2)Fabien Gandon
 
Visual Impairment
Visual ImpairmentVisual Impairment
Visual Impairmentaniwilfi
 
visual impairment
visual impairmentvisual impairment
visual impairmentwajiha b
 
Visual Impairments
Visual ImpairmentsVisual Impairments
Visual ImpairmentsPetri Myllys
 
Frequency Distributions and Graphs
Frequency Distributions and GraphsFrequency Distributions and Graphs
Frequency Distributions and Graphsmonritche
 
Policies and Guidelines of Special Education in the Philippines
Policies and Guidelines of Special Education in the PhilippinesPolicies and Guidelines of Special Education in the Philippines
Policies and Guidelines of Special Education in the Philippinesmaria martha manette madrid
 

Viewers also liked (20)

RDFS In A Nutshell V1
RDFS In A Nutshell V1RDFS In A Nutshell V1
RDFS In A Nutshell V1
 
Kxu stat-anderson-ch02
Kxu stat-anderson-ch02Kxu stat-anderson-ch02
Kxu stat-anderson-ch02
 
V.i.new
V.i.newV.i.new
V.i.new
 
Tutorials--Logarithmic Functions in Tabular and Graph Form
Tutorials--Logarithmic Functions in Tabular and Graph Form	Tutorials--Logarithmic Functions in Tabular and Graph Form
Tutorials--Logarithmic Functions in Tabular and Graph Form
 
Approaches to Develop Curriculum for Children Visual Impairment
Approaches to Develop Curriculum for Children Visual ImpairmentApproaches to Develop Curriculum for Children Visual Impairment
Approaches to Develop Curriculum for Children Visual Impairment
 
V.i. ppt copy
V.i. ppt   copyV.i. ppt   copy
V.i. ppt copy
 
CRL: A Rule Language for Table Analysis and Interpretation
CRL: A Rule Language for Table Analysis and InterpretationCRL: A Rule Language for Table Analysis and Interpretation
CRL: A Rule Language for Table Analysis and Interpretation
 
Visual impairment
Visual impairmentVisual impairment
Visual impairment
 
Visual Impairment Information and Teaching Strategies
Visual Impairment Information and Teaching StrategiesVisual Impairment Information and Teaching Strategies
Visual Impairment Information and Teaching Strategies
 
Ontologies pour le Web 2.0
Ontologies pour le Web 2.0Ontologies pour le Web 2.0
Ontologies pour le Web 2.0
 
Ses 4 tabulation
Ses 4 tabulationSes 4 tabulation
Ses 4 tabulation
 
Construction ontologies
Construction ontologiesConstruction ontologies
Construction ontologies
 
Ontology In A Nutshell (version 2)
Ontology In A Nutshell (version 2)Ontology In A Nutshell (version 2)
Ontology In A Nutshell (version 2)
 
Visual Impairment
Visual ImpairmentVisual Impairment
Visual Impairment
 
visual impairment
visual impairmentvisual impairment
visual impairment
 
visual impairment
visual impairment visual impairment
visual impairment
 
Visual Impairments
Visual ImpairmentsVisual Impairments
Visual Impairments
 
Ncf 2005
Ncf 2005Ncf 2005
Ncf 2005
 
Frequency Distributions and Graphs
Frequency Distributions and GraphsFrequency Distributions and Graphs
Frequency Distributions and Graphs
 
Policies and Guidelines of Special Education in the Philippines
Policies and Guidelines of Special Education in the PhilippinesPolicies and Guidelines of Special Education in the Philippines
Policies and Guidelines of Special Education in the Philippines
 

Similar to Tabular Data on the Web

aip-workshop1-dev-tutorial
aip-workshop1-dev-tutorialaip-workshop1-dev-tutorial
aip-workshop1-dev-tutorialMatthew Vaughn
 
Expose your data as an api is with oracle rest data services -spoug Madrid
Expose your data as an api is with oracle rest data services -spoug MadridExpose your data as an api is with oracle rest data services -spoug Madrid
Expose your data as an api is with oracle rest data services -spoug MadridVinay Kumar
 
The never-ending REST API design debate
The never-ending REST API design debateThe never-ending REST API design debate
The never-ending REST API design debateRestlet
 
Building RESTfull Data Services with WebAPI
Building RESTfull Data Services with WebAPIBuilding RESTfull Data Services with WebAPI
Building RESTfull Data Services with WebAPIGert Drapers
 
MuleSoft London Community February 2020 - MuleSoft and OData
MuleSoft London Community February 2020 - MuleSoft and ODataMuleSoft London Community February 2020 - MuleSoft and OData
MuleSoft London Community February 2020 - MuleSoft and ODataPace Integration
 
Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21vty
 
Flexible metadata schemes for research data repositories - Clarin Conference...
Flexible metadata schemes for research data repositories  - Clarin Conference...Flexible metadata schemes for research data repositories  - Clarin Conference...
Flexible metadata schemes for research data repositories - Clarin Conference...Vyacheslav Tykhonov
 
Data science at the command line
Data science at the command lineData science at the command line
Data science at the command lineSharat Chikkerur
 
2. Content Registration
2. Content Registration2. Content Registration
2. Content RegistrationCrossref
 
Semantic framework for web scraping.
Semantic framework for web scraping.Semantic framework for web scraping.
Semantic framework for web scraping.Shyjal Raazi
 
5\9 SSIS 2008R2_Training - DataFlow Basics
5\9 SSIS 2008R2_Training - DataFlow Basics5\9 SSIS 2008R2_Training - DataFlow Basics
5\9 SSIS 2008R2_Training - DataFlow BasicsPramod Singla
 
(ATS6-PLAT04) Query service
(ATS6-PLAT04) Query service (ATS6-PLAT04) Query service
(ATS6-PLAT04) Query service BIOVIA
 
balloon Fusion: SPARQL Rewriting Based on Unified Co-Reference Information
balloon Fusion: SPARQL Rewriting Based on  Unified Co-Reference Informationballoon Fusion: SPARQL Rewriting Based on  Unified Co-Reference Information
balloon Fusion: SPARQL Rewriting Based on Unified Co-Reference InformationKai Schlegel
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA
 
SAS Online Training Institute in Hyderabad - C-Point
SAS Online Training Institute in Hyderabad - C-PointSAS Online Training Institute in Hyderabad - C-Point
SAS Online Training Institute in Hyderabad - C-Pointcpointss
 
The never-ending REST API design debate -- Devoxx France 2016
The never-ending REST API design debate -- Devoxx France 2016The never-ending REST API design debate -- Devoxx France 2016
The never-ending REST API design debate -- Devoxx France 2016Restlet
 
Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & KafkaSelf-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & KafkaGuido Schmutz
 
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and SparkVital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and SparkVital.AI
 
EAD Revision, EAC-CPF introduction
EAD Revision, EAC-CPF introductionEAD Revision, EAC-CPF introduction
EAD Revision, EAC-CPF introductiontimothyryan50
 
Information Intermediaries
Information IntermediariesInformation Intermediaries
Information IntermediariesDave Reynolds
 

Similar to Tabular Data on the Web (20)

aip-workshop1-dev-tutorial
aip-workshop1-dev-tutorialaip-workshop1-dev-tutorial
aip-workshop1-dev-tutorial
 
Expose your data as an api is with oracle rest data services -spoug Madrid
Expose your data as an api is with oracle rest data services -spoug MadridExpose your data as an api is with oracle rest data services -spoug Madrid
Expose your data as an api is with oracle rest data services -spoug Madrid
 
The never-ending REST API design debate
The never-ending REST API design debateThe never-ending REST API design debate
The never-ending REST API design debate
 
Building RESTfull Data Services with WebAPI
Building RESTfull Data Services with WebAPIBuilding RESTfull Data Services with WebAPI
Building RESTfull Data Services with WebAPI
 
MuleSoft London Community February 2020 - MuleSoft and OData
MuleSoft London Community February 2020 - MuleSoft and ODataMuleSoft London Community February 2020 - MuleSoft and OData
MuleSoft London Community February 2020 - MuleSoft and OData
 
Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21
 
Flexible metadata schemes for research data repositories - Clarin Conference...
Flexible metadata schemes for research data repositories  - Clarin Conference...Flexible metadata schemes for research data repositories  - Clarin Conference...
Flexible metadata schemes for research data repositories - Clarin Conference...
 
Data science at the command line
Data science at the command lineData science at the command line
Data science at the command line
 
2. Content Registration
2. Content Registration2. Content Registration
2. Content Registration
 
Semantic framework for web scraping.
Semantic framework for web scraping.Semantic framework for web scraping.
Semantic framework for web scraping.
 
5\9 SSIS 2008R2_Training - DataFlow Basics
5\9 SSIS 2008R2_Training - DataFlow Basics5\9 SSIS 2008R2_Training - DataFlow Basics
5\9 SSIS 2008R2_Training - DataFlow Basics
 
(ATS6-PLAT04) Query service
(ATS6-PLAT04) Query service (ATS6-PLAT04) Query service
(ATS6-PLAT04) Query service
 
balloon Fusion: SPARQL Rewriting Based on Unified Co-Reference Information
balloon Fusion: SPARQL Rewriting Based on  Unified Co-Reference Informationballoon Fusion: SPARQL Rewriting Based on  Unified Co-Reference Information
balloon Fusion: SPARQL Rewriting Based on Unified Co-Reference Information
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
 
SAS Online Training Institute in Hyderabad - C-Point
SAS Online Training Institute in Hyderabad - C-PointSAS Online Training Institute in Hyderabad - C-Point
SAS Online Training Institute in Hyderabad - C-Point
 
The never-ending REST API design debate -- Devoxx France 2016
The never-ending REST API design debate -- Devoxx France 2016The never-ending REST API design debate -- Devoxx France 2016
The never-ending REST API design debate -- Devoxx France 2016
 
Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & KafkaSelf-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
 
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and SparkVital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
 
EAD Revision, EAC-CPF introduction
EAD Revision, EAC-CPF introductionEAD Revision, EAC-CPF introduction
EAD Revision, EAC-CPF introduction
 
Information Intermediaries
Information IntermediariesInformation Intermediaries
Information Intermediaries
 

Recently uploaded

NSX-T and Service Interfaces presentation
NSX-T and Service Interfaces presentationNSX-T and Service Interfaces presentation
NSX-T and Service Interfaces presentationMarko4394
 
Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Paul Calvano
 
Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Sonam Pathan
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书zdzoqco
 
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书rnrncn29
 
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一z xss
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhimiss dipika
 
Q4-1-Illustrating-Hypothesis-Testing.pptx
Q4-1-Illustrating-Hypothesis-Testing.pptxQ4-1-Illustrating-Hypothesis-Testing.pptx
Q4-1-Illustrating-Hypothesis-Testing.pptxeditsforyah
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Sonam Pathan
 
Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa494f574xmv
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作ys8omjxb
 
SCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is prediSCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is predieusebiomeyer
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxDyna Gilbert
 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书

Tabular Data on the Web

  • 1. Tabular Data on the Web: Intro to W3C CSV on the Web Specifications. Gregg Kellogg, gregg@greggkellogg.net, @gkellogg
  • 2. Impact of Tabular Data
      • Tabular data represents a large amount of all data published on the Web.
      • According to the Open Data Institute, the vast majority of published open data is tabular.
      • “Over 90% of the data on data.gov.uk is tabular data.”
      • data.gov lists 158,631 datasets, largely in CSV.
  • 3. Sources of Tabular Data
      • Easiest way to publish data
      • Spreadsheet dumps
      • Database dumps
      • SPARQL results
  • 4. CSV data is dumb
      • It’s a simple text format; the data has no inherent meaning.
      • Cells may be data-typed or have a regular format: what does “08/09/2015” mean?
      • Cells may be related to data in other tables/columns: foreign keys.
      • Cells may be associated with different entities: join results.
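The date question on this slide can be made concrete. The following is a minimal sketch, not part of the original deck, showing that the same cell text yields two different dates depending on which format a consumer assumes; the two format strings are illustrative assumptions.

```python
from datetime import datetime

raw = "08/09/2015"  # the cell text from the slide

# Two equally plausible readings of the same text. Without metadata,
# a consumer has no way to know which one the publisher intended.
month_first = datetime.strptime(raw, "%m/%d/%Y").date()
day_first = datetime.strptime(raw, "%d/%m/%Y").date()

print(month_first.isoformat())  # 2015-08-09
print(day_first.isoformat())    # 2015-09-08
```

A column-level datatype with a date format is what removes this ambiguity.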
  • 5. Web CSV
      • 5-star Linked Data
      • CSV URLs
      • CSVs link to other CSVs
      • CSVs link to other resources
      • RDF and JSON conversion
  • 6. W3C CSV on the Web
      • Working Group chartered to help applications achieve higher interoperability when working with CSV or similar formats.
      • Use Cases: http://www.w3.org/TR/csvw-ucr/
      • Model for Tabular Data and Metadata on the Web: http://www.w3.org/TR/tabular-data-model/
      • Metadata Vocabulary for Tabular Data: http://www.w3.org/TR/tabular-metadata/
      • Generating JSON from Tabular Data on the Web: http://www.w3.org/TR/csv2json/
      • Generating RDF from Tabular Data on the Web: http://www.w3.org/TR/csv2rdf/
  • 7. Examples
      countries.csv:
          countryCode  latitude  longitude  name
          AD           42.5      1.6        Andorra
          AE           23.4      53.8       United Arab Emirates
          AF           33.9      67.7       Afghanistan
      country_slice.csv:
          countryRef  year  population
          AF          1960  9,616,353
          AF          1961  9,799,379
          AF          1962  9,989,846
  • 8. Model for Tabular Data (diagram of the annotated data model; annotations listed per class)
      • Table Group: id, notes, tables, other annotations
      • Table: id, url, notes, columns, rows, table direction, foreign keys, transformations, other annotations
      • Column: about URL, cells, datatype, default, lang, name, number, ordered, property URL, required, separator, table, text direction, titles, value URL, virtual, other annotations
      • Row: cells, number, primary key, titles, referenced rows, source number, table
      • Cell: about URL, column, errors, ordered, property URL, row, string value, table, text direction, value, value URL
  • 9. Mapping CSV to Model
      • Parse CSV: RFC 4180 + dialect metadata.
          • delimiter, doubleQuote, headerRowCount, lineTerminators, quoteChar, …
          • Dialect Description comes from Metadata Document.
      • Match headers to columns.
      • Parse cells using column metadata/datatype.
      • Abstract data model used for viewing, validation, and conversions.
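The first two mapping steps can be sketched with Python's standard `csv` module. This is an illustrative assumption about how a dialect description might drive a parser, not the reference implementation; the `DIALECT` dictionary and the sample text are hypothetical, and only a few of the dialect properties named on the slide are handled.

```python
import csv
import io

# Hypothetical dialect description; keys follow the names on the slide.
DIALECT = {"delimiter": ";", "quoteChar": "'", "doubleQuote": True, "headerRowCount": 1}

# Hypothetical input that needs the dialect to be read correctly.
TEXT = "countryCode;name\nAD;Andorra\nAE;'Emirates; United Arab'\n"

def parse(text, dialect):
    """Split text into header titles and data rows according to the dialect."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=dialect["delimiter"],
        quotechar=dialect["quoteChar"],
        doublequote=dialect["doubleQuote"],
    )
    rows = list(reader)
    header_count = dialect["headerRowCount"]
    headers = rows[0] if header_count else []
    return headers, rows[header_count:]

headers, data_rows = parse(TEXT, DIALECT)
# Match headers to columns: each cell is keyed by its column title.
records = [dict(zip(headers, row)) for row in data_rows]
print(records)
```

Typed parsing of each cell would then use the column's datatype, which this sketch leaves out.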
  • 10. Metadata
      • Finding metadata from a CSV
          • User-specified, Link header, well-known locations
      • Matching metadata to a CSV
          • CSV must be compatible with metadata (titles/names)
          • Metadata must reference CSV URL
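As a rough illustration of the "well-known locations" step, the sketch below builds candidate metadata URLs for a CSV file. It assumes the two default patterns as I understand them from the CSV on the Web specifications (`{+url}-metadata.json` and a directory-level `csv-metadata.json`); the function name is hypothetical, and user-specified metadata and Link headers would be consulted before these.

```python
from urllib.parse import urljoin

def candidate_metadata_urls(csv_url):
    """Return default metadata locations to try, in order (assumed defaults)."""
    return [
        csv_url + "-metadata.json",            # file-specific metadata
        urljoin(csv_url, "csv-metadata.json"),  # metadata shared by the directory
    ]

for url in candidate_metadata_urls("http://example.org/data/countries.csv"):
    print(url)
```

A metadata document found this way is only used if it references the CSV's URL and its column titles or names are compatible with the file's header.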
  • 11. Metadata vocabulary (diagram; properties listed per description type)
      • Top-Level Properties: @context, @language, @base
      • Table Group: @context, @id, @type, tables, transformations, tableDirection, tableSchema, dialect, notes
      • Table: @context, @id, @type, url, transformations, tableDirection, tableSchema, dialect, notes, suppressOutput
      • Schema: @id, @type, columns, foreignKeys, primaryKey, rowTitles
      • Column Description: @id, @type, name, titles, required, suppressOutput, virtual
      • Inherited Properties: aboutUrl, propertyUrl, valueUrl, datatype, default, lang, null, ordered, required, separator, textDirection
      • Datatype Description: @id, @type, base, format, length, minLength, maxLength, minimum, maximum, minInclusive, maxInclusive, minExclusive, maxExclusive
      • Number Format: decimalChar, groupChar, pattern
      • Dialect Description: @id, @type, commentPrefix, delimiter, doubleQuote, encoding, header, headerRowCount, lineTerminators, quoteChar, skipBlankRows, skipColumns, skipInitialSpace, skipRows, trim
      • Transformation Definition: @id, @type, url, targetFormat, scriptFormat, titles, source
      • Foreign Key Definition: columnReference, reference
      • Foreign Key Reference: resource, schemaReference, columnReference
      • Legend (property types): array property, link property, URI template property, column reference property, object property, natural language property, atomic property
      • Legend (arrows): reference to an array of values of a specific category; reference to a value of a specific category
  • 12. Schema
      • Column descriptions
          • Names/titles
          • Datatype
      • Primary keys
      • Foreign key relationships
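To show how these schema pieces fit together, here is a sketch of a metadata document for the two example tables from slide 7, built as a Python dictionary and serialized to JSON. The property names come from the vocabulary on slide 11; the specific datatypes, the choice of primary key, and the `unresolved_foreign_keys` helper are illustrative assumptions rather than content from the deck.

```python
import json

metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "tables": [
        {
            "url": "countries.csv",
            "tableSchema": {
                "columns": [
                    {"name": "countryCode", "titles": "countryCode", "datatype": "string"},
                    {"name": "latitude", "titles": "latitude", "datatype": "number"},
                    {"name": "longitude", "titles": "longitude", "datatype": "number"},
                    {"name": "name", "titles": "name", "datatype": "string"},
                ],
                "primaryKey": "countryCode",
            },
        },
        {
            "url": "country_slice.csv",
            "tableSchema": {
                "columns": [
                    {"name": "countryRef", "titles": "countryRef", "datatype": "string"},
                    {"name": "year", "titles": "year", "datatype": "integer"},
                    {"name": "population", "titles": "population", "datatype": "integer"},
                ],
                "foreignKeys": [{
                    "columnReference": "countryRef",
                    "reference": {"resource": "countries.csv", "columnReference": "countryCode"},
                }],
            },
        },
    ],
}

def unresolved_foreign_keys(doc):
    """Hypothetical helper: list foreign keys whose target column is not declared."""
    declared = {t["url"]: {c["name"] for c in t["tableSchema"]["columns"]} for t in doc["tables"]}
    problems = []
    for table in doc["tables"]:
        for fk in table["tableSchema"].get("foreignKeys", []):
            ref = fk["reference"]
            if ref["columnReference"] not in declared.get(ref["resource"], set()):
                problems.append((table["url"], fk["columnReference"]))
    return problems

print(json.dumps(metadata, indent=2))
print(unresolved_foreign_keys(metadata))
```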
  • 13. Embedded Metadata
      • Generally column titles.
      • Formats may define CSV conventions for embedded metadata.
      • Principally used to determine metadata compatibility.
      • Also serves as default metadata if no metadata file is located.
  • 14. Datatypes
      • Basic XSD datatypes
      • maximum/minimum facets
      • minLength/maxLength facets
      • format/pattern
          • RegExp, Boolean, UAX35 date/time picture string, UAX35 number picture string
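A simplified sketch of how these facets could be checked against a cell's string value follows. The `validate` function is hypothetical and covers only integer range facets and string length/regular-expression facets; it does not implement UAX35 picture strings or the full set of XSD datatypes.

```python
import re

def validate(value, datatype):
    """Return a list of error messages for one cell value (empty if valid)."""
    errors = []
    base = datatype.get("base", "string")
    if base == "integer":
        try:
            number = int(value)
        except ValueError:
            return [f"{value!r} is not a valid integer"]
        if "minimum" in datatype and number < datatype["minimum"]:
            errors.append(f"{number} is below minimum {datatype['minimum']}")
        if "maximum" in datatype and number > datatype["maximum"]:
            errors.append(f"{number} is above maximum {datatype['maximum']}")
    else:
        if "minLength" in datatype and len(value) < datatype["minLength"]:
            errors.append(f"{value!r} is shorter than {datatype['minLength']}")
        if "maxLength" in datatype and len(value) > datatype["maxLength"]:
            errors.append(f"{value!r} is longer than {datatype['maxLength']}")
        # For strings, the format is treated here as a regular expression.
        if "format" in datatype and not re.fullmatch(datatype["format"], value):
            errors.append(f"{value!r} does not match {datatype['format']!r}")
    return errors

year = {"base": "integer", "minimum": 1900, "maximum": 2100}
code = {"base": "string", "maxLength": 2, "format": "[A-Z]{2}"}
print(validate("1960", year))
print(validate("1800", year))
print(validate("AFG", code))
```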
  • 15. Other Features
      • Split cells into multiple items
      • Validate primary keys and foreign key references (single and multiple columns)
      • Define URL properties for columns
      • Multiple subjects per column (may be URLs)
      • Values as URLs
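Three of these features can be illustrated in a few lines. The helpers below are hypothetical sketches: `split_cell` mimics a column `separator`, `expand` handles only simple `{name}` substitution (not full URI Template syntax), and `duplicate_keys` checks a possibly multi-column primary key.

```python
from urllib.parse import quote

def split_cell(value, separator):
    """Split one cell into multiple items, as a column separator would."""
    return [item.strip() for item in value.split(separator)] if value else []

def expand(template, row):
    """Fill {column} placeholders with percent-encoded cell values."""
    result = template
    for name, value in row.items():
        result = result.replace("{" + name + "}", quote(value, safe=""))
    return result

def duplicate_keys(rows, key_columns):
    """Return primary key values that occur more than once."""
    seen, duplicates = set(), []
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates

rows = [
    {"countryRef": "AF", "year": "1960"},
    {"countryRef": "AF", "year": "1961"},
    {"countryRef": "AF", "year": "1961"},
]
print(split_cell("en; fr; de", ";"))
print(expand("http://example.org/countries.csv#{countryRef}", rows[0]))
print(duplicate_keys(rows, ["countryRef", "year"]))
```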
  • 16. Conversions: JSON
      Input (countries.csv, with metadata in countries.json):
          countryCode  latitude  longitude  name
          AD           42.5      1.6        Andorra
          AE           23.4      53.8       United Arab Emirates
          AF           33.9      67.7       Afghanistan
      Output (countries-standard.json):
          {
            "tables": [{
              "url": "http://example.org/countries.csv",
              "row": [{
                "url": "http://example.org/countries.csv#row=2",
                "rownum": 1,
                "describes": [{
                  "countryCode": "AD",
                  "latitude": "42.5",
                  "longitude": "1.6",
                  "name": "Andorra"
                }]
              }, {
                "url": "http://example.org/countries.csv#row=3",
                "rownum": 2,
                "describes": [{
                  "countryCode": "AE",
                  "latitude": "23.4",
                  "longitude": "53.8",
                  "name": "United Arab Emirates"
                }]
              }, {
                "url": "http://example.org/countries.csv#row=4",
                "rownum": 3,
                "describes": [{
                  "countryCode": "AF",
                  "latitude": "33.9",
                  "longitude": "67.7",
                  "name": "Afghanistan"
                }]
              }]
            }]
          }
  • 17. Conversions: JSON (min)
      Input (countries.csv, with metadata in countries.json):
          countryCode  latitude  longitude  name
          AD           42.5      1.6        Andorra
          AE           23.4      53.8       United Arab Emirates
          AF           33.9      67.7       Afghanistan
      Output (countries-minimal.json):
          [{
            "countryCode": "AD",
            "latitude": "42.5",
            "longitude": "1.6",
            "name": "Andorra"
          }, {
            "countryCode": "AE",
            "latitude": "23.4",
            "longitude": "53.8",
            "name": "United Arab Emirates"
          }, {
            "countryCode": "AF",
            "latitude": "33.9",
            "longitude": "67.7",
            "name": "Afghanistan"
          }]
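The difference between the two JSON outputs can be sketched with the standard library alone. This is an illustrative approximation for a file with a single header row and no metadata, not the conversion algorithm from the specification; the function names are hypothetical.

```python
import csv
import io
import json

CSV_TEXT = (
    "countryCode,latitude,longitude,name\n"
    "AD,42.5,1.6,Andorra\n"
    "AE,23.4,53.8,United Arab Emirates\n"
    "AF,33.9,67.7,Afghanistan\n"
)

def to_minimal_json(text):
    """Minimal mode: one object per row, keyed by column title."""
    return [dict(record) for record in csv.DictReader(io.StringIO(text))]

def to_standard_json(text, url, header_rows=1):
    """Standard mode: rows wrapped with their URL, row number, and table."""
    rows = []
    for number, record in enumerate(csv.DictReader(io.StringIO(text)), start=1):
        rows.append({
            # The fragment counts source lines, so the header row is included.
            "url": f"{url}#row={number + header_rows}",
            "rownum": number,
            "describes": [dict(record)],
        })
    return {"tables": [{"url": url, "row": rows}]}

minimal = to_minimal_json(CSV_TEXT)
standard = to_standard_json(CSV_TEXT, "http://example.org/countries.csv")
print(json.dumps(minimal, indent=2))
print(json.dumps(standard, indent=2))
```

Without datatypes in the metadata, every value stays a string, which matches the outputs shown on the slides.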
  • 18. Conversions: RDF
      Input (countries.csv, with metadata in countries.json):
          countryCode  latitude  longitude  name
          AD           42.5      1.6        Andorra
          AE           23.4      53.8       United Arab Emirates
          AF           33.9      67.7       Afghanistan
      Output (countries-standard.ttl):
          @base <http://example.org/countries.csv> .
          @prefix csvw: <http://www.w3.org/ns/csvw#> .
          @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

          _:tg a csvw:TableGroup ;
            csvw:table [ a csvw:Table ;
              csvw:url <http://example.org/countries.csv> ;
              csvw:row [ a csvw:Row ;
                csvw:rownum "1"^^xsd:integer ;
                csvw:url <#row=2> ;
                csvw:describes _:t1r1
              ], [ a csvw:Row ;
                csvw:rownum "2"^^xsd:integer ;
                csvw:url <#row=3> ;
                csvw:describes _:t1r2
              ], [ a csvw:Row ;
                csvw:rownum "3"^^xsd:integer ;
                csvw:url <#row=4> ;
                csvw:describes _:t1r3
              ]
            ] .

          _:t1r1 <#countryCode> "AD" ;
            <#latitude> "42.5" ;
            <#longitude> "1.6" ;
            <#name> "Andorra" .

          _:t1r2 <#countryCode> "AE" ;
            <#latitude> "23.4" ;
            <#longitude> "53.8" ;
            <#name> "United Arab Emirates" .

          _:t1r3 <#countryCode> "AF" ;
            <#latitude> "33.9" ;
            <#longitude> "67.7" ;
            <#name> "Afghanistan" .
  • 19. Conversions: RDF (min)
      Input (countries.csv, with metadata in countries.json):
          countryCode  latitude  longitude  name
          AD           42.5      1.6        Andorra
          AE           23.4      53.8       United Arab Emirates
          AF           33.9      67.7       Afghanistan
      Output (countries-minimal.ttl):
          @base <http://example.org/countries.csv> .

          _:t1r1 <#countryCode> "AD" ;
            <#latitude> "42.5" ;
            <#longitude> "1.6" ;
            <#name> "Andorra" .

          _:t1r2 <#countryCode> "AE" ;
            <#latitude> "23.4" ;
            <#longitude> "53.8" ;
            <#name> "United Arab Emirates" .

          _:t1r3 <#countryCode> "AF" ;
            <#latitude> "33.9" ;
            <#longitude> "67.7" ;
            <#name> "Afghanistan" .
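The minimal RDF output follows a simple pattern: one blank node per row, one triple per cell, with predicates formed from the column name as a fragment of the CSV URL. The sketch below reproduces that pattern by string formatting. It is an assumption-laden illustration (hypothetical function names, no percent-encoding of column names, no datatypes), and a real conversion would use an RDF library.

```python
import csv
import io

CSV_TEXT = (
    "countryCode,latitude,longitude,name\n"
    "AD,42.5,1.6,Andorra\n"
    "AE,23.4,53.8,United Arab Emirates\n"
    "AF,33.9,67.7,Afghanistan\n"
)

def escape(value):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')

def to_minimal_turtle(text, base):
    """Emit one blank-node subject per row with one predicate per column."""
    blocks = [f"@base <{base}> ."]
    for number, record in enumerate(csv.DictReader(io.StringIO(text)), start=1):
        pairs = [f'<#{name}> "{escape(value)}"' for name, value in record.items()]
        blocks.append(f"_:t1r{number} " + " ;\n  ".join(pairs) + " .")
    return "\n\n".join(blocks)

turtle = to_minimal_turtle(CSV_TEXT, "http://example.org/countries.csv")
print(turtle)
```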
  • 20. Other examples
      • Rich annotations: JSON, RDF
      • Virtual columns/multiple subjects: JSON, RDF
      • For more, see the Specifications and Test Suite
  • 21. Tools
      • CSVLint
      • CKAN – open source data portal platform
      • Socrata – cloud-based open data
      • Google Fusion Tables – data visualization
      • Ruby rdf-tabular – CSVW reference implementation
      • RDF Distiller
      • Structured Data Linter
  • 22. Next Steps
      • At-Risk – /.well-known/csvm
      • More datatype formats
      • Metadata in HTML (embedded JSON-LD)
      • Tabular Data in HTML
      • More implementations!
      • Timeline
          • Candidate Recommendation – July 2015
          • Proposed Recommendation – Oct 2015
          • W3C Recommendation – Dec 2015