A BASILar Approach for Building Web APIs on top of SPARQL Endpoints

Enrico Daga, Luca Panziera and Carlos Pedrinaci
June 1st, 2015

Services and Applications over Linked APIs and Data - ESWC2015 workshop
http://salad2015.linked.services/

Background
Distributed systems based on open data are a challenge for the
development of the WWW.
Two (major) approaches:
REST principles
» Web APIs
» Loose Coupling
» For Web Developers
Linked Data
» SPARQL Endpoints
» Data Integration
» For Semantic Web Geeks

Questions
Is it possible to benefit from the advantages of both approaches?
Can we find a solution that minimises the barrier to adoption?

Methodology (two-ways)
• Bottom-Up: use the experience (Facts)
– http://data.open.ac.uk
– Take into account consumers and providers
• Top-Down: use the principles of data services (Dimensions)
– Seamless data integration between distributed sources;
– Loose coupling between client & server
– Interoperability between services and applications;
– Description to allow search and proper data consumption.
• Organise feedback: S.W.O.T. (Method)
– Used for decision making in project management
– Web APIs and SPARQL as two possible solutions
• Extract requirements

{Consumers and Providers working together}
http://data.open.ac.uk
{Feedback in E-Mails, Informal conversations}
{From 2010}
{Appreciated by developers}
{SPARQL Endpoint}

SWOT: SPARQL endpoint
Issue | Feedback | Affects | Aspect
S1S | SPARQL is a rich query language, capable of selecting any portion of the data and to exploit relations and paths between resources. | P,C | di
S2S | The output can be an RDF graph, a semantic meta model that generalizes from specific syntaxes. | C | di, io
S3S | The interaction is a standard protocol. | P,C | io, de
W1S | RDF and SPARQL are not widespread technologies, and the related data model may not be optimised for the needs of a dedicated application. | C | di, ad
W2S | Some requests might require too many resources (CPU, RAM). | P,C | ad
W3S | Embedded queries in the consumer’s code add a dependency between the application and the provider’s data schema. | C | lc
O1S | The SPARQL query language allows for a deep data exploration and design of task tailored views. | C | di
O2S | The RDF output can be integrated with other RDF data with little effort. | C | di, io
O3S | The data provider maintains a standard infrastructure. | P | ad
T1S | Data consumers may decide not to use the service because of a too steep learning curve (both for querying or post-processing of the output). | C | ad
T2S | The data provider cannot optimize the infrastructure in advance (or contribute to optimize the query) and the system could crash. | P | lc
T3S | Changes in the data schema will break existing embedded queries. | C | lc

SWOT: Web APIs
Issue | Feedback | Affects | Aspect
S1A | Web APIs can be made simple and intuitive, the data models are made ad-hoc for specific tasks and reused by a wide community of developers. | C | di, io, de
S2A | Resources of the underlying infrastructure are controlled. | P | lc
S3A | Each resource (request) is fully decoupled from the underlying database schema. | P,C | lc
W1A | The set of possible requests and data objects is preordered. | C | di
W2A | The output data model cannot be customised to better fit the use case. | C | io
W3A | Interfaces and documentation need to be set up and maintained. | P | de, ad
O1A | Web developers can use the service straightforwardly to integrate the output in the applications. Data models are reused by different applications. | C | di, ad
O2A | The infrastructure can be easily optimised. | P | lc
O3A | Evolutions in the stored data model in many cases can be reflected in the way the API interacts with the database without disrupting existing applications. | P,C | lc
T1A | The supported APIs may not cover relevant use cases, and may be hard to extend. | P,C | di, io, ad
T2A | Data consumers cannot easily implement data integration strategies. | C | di
T3A | The cost for maintaining infrastructure and documentation increases with the amount of functionalities/data provided. | P | ad

SWOT: Result
SPARQL endpoint and Web APIs are
complementary approaches!

Requirements
• Data Integration. Explorable data, to be extracted and reused
as RDF as well as non-standard formats.
• Loose Coupling. Do not introduce dependencies between the
systems, both syntactically and semantically
• Description. Described for both humans and agents with little
effort.
• Interoperability. Customisable and reusable data models,
relevant for both consumers and providers, and formally
specified.
• +Adoption. Limit additional technologies, specifications or
formalisms, and provide new opportunities for publishers and
consumers.

Building Apis SImpLy
https://github.com/the-open-university/basil
[Figure: BASIL architecture. Actors: Data Provider, Web API Tailor, Data Consumer. Components: BASIL API, Web APIs, SPARQL endpoint, Linked Data Cloud. Edge labels: "maintains", "defines view (template)", "tailors Web API (API specification)", "consumes data or views".]
BASIL is designed as a middleware system that mediates between SPARQL
endpoints and applications.
BASIL stores SPARQL queries and builds APIs with standard and custom
outputs.

Convention instead of configuration
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> […] 
SELECT DISTINCT
(?related as ?identifier)
?type
(STR(?label) AS ?title)
(STR(?location) AS ?link)
FROM <http://data.open.ac.uk/context/youtube> 
FROM <http://data.open.ac.uk/context/audioboo> 
FROM <http://data.open.ac.uk/context/podcast>
FROM <http://data.open.ac.uk/context/openlearn>
FROM <http://data.open.ac.uk/context/course> 
FROM <http://data.open.ac.uk/context/qualification>
FROM <http://data.open.ac.uk/context/xcri>
WHERE { 
BIND(IRI(CONCAT("http://data.open.ac.uk/qualification/",?_qid)) AS ?qualification)
{
# related video podcasts
?related podcast:relatesToQualification ?qualification .
?related a podcast:VideoPodcast . 
?related rdfs:label ?label . 
optional { ?related bazaar:download ?location }
BIND( "VideoPodcast" as ?type ) .
} UNION {
# related audio podcasts … } UNION {
# related audioboo posts … } UNION {
# related openlearn units … } UNION {
# related youtube videos … } }
https://gist.github.com/enridaga/3b71423df42328ea2110
{PUT}
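The convention in the query above can be sketched as follows. This is a minimal illustration, assuming that any SPARQL variable whose name starts with an underscore (such as ?_qid) becomes an API parameter named after the rest of the variable (qid). The function name api_parameters and the regular expression are hypothetical and are not taken from the BASIL code base; the real convention may accept more variable forms.

```python
import re
from typing import List

# Assumption: "?_name" (or "$_name") marks an API parameter called "name".
PARAM_PATTERN = re.compile(r"[?$]_([A-Za-z0-9]+)")


def api_parameters(query: str) -> List[str]:
    """Return the distinct parameter names in a query, in order of appearance."""
    seen: List[str] = []
    for name in PARAM_PATTERN.findall(query):
        if name not in seen:
            seen.append(name)
    return seen


# Shortened version of the stored query shown on the slide.
QUERY = """
SELECT ?qualification WHERE {
  BIND(IRI(CONCAT("http://data.open.ac.uk/qualification/", ?_qid)) AS ?qualification)
}
"""

if __name__ == "__main__":
    print(api_parameters(QUERY))
```

Ordinary variables such as ?qualification are left alone because they do not start with an underscore, so the query text itself carries the whole API signature and no separate configuration file is needed.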

A CRUD interface for tailored APIs
• /basil/1jyd93olk8c3b → base resource, redirects to /spec
• /api → to retrieve the data
• /spec → to get and update the stored query
• /explain → to inspect the query after variables substitution
• /view → to manage views
• /api-docs → to access the Swagger description
Example:
/api?qid=q18 → with content negotiation
/api.json?qid=q18 → or .xml, .rdf, .jsonld, .csv, .nt, .ttl, . . .
/api.html-list?qid=q18 → preprocess the output using a user-defined view script
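What /explain returns can be pictured with a small substitution sketch. It assumes that each ?_name variable is replaced by the request value written as a quoted SPARQL string literal; that rule, the function name explain, and the escaping shown are illustrative assumptions, not the documented BASIL behaviour.

```python
import re


def explain(query: str, params: dict) -> str:
    """Replace every ?_name variable with its value as a quoted string literal."""

    def replace(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing parameter: {name}")
        # Escape backslashes first, then double quotes, so the literal stays valid.
        value = str(params[name]).replace("\\", "\\\\").replace('"', '\\"')
        return f'"{value}"'

    return re.sub(r"[?$]_([A-Za-z0-9]+)", replace, query)


# One line of the stored query shown earlier in the deck.
TEMPLATE = 'BIND(IRI(CONCAT("http://data.open.ac.uk/qualification/", ?_qid)) AS ?qualification)'

if __name__ == "__main__":
    print(explain(TEMPLATE, {"qid": "q18"}))
```

Under these assumptions, a request with qid=q18 turns the template line into one that concatenates the base IRI with "q18". A missing parameter raises an error instead of sending a half-filled query to the endpoint.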

A journey with cURL…
curl -v http://basil.kmi.open.ac.uk/basil/1jyd93olk8c3b
curl -v http://basil.kmi.open.ac.uk/basil/1jyd93olk8c3b/spec
curl -v http://basil.kmi.open.ac.uk/basil/1jyd93olk8c3b/api
curl -v "http://basil.kmi.open.ac.uk/basil/1jyd93olk8c3b/api?qid=q18"
curl -v "http://basil.kmi.open.ac.uk/basil/1jyd93olk8c3b/api.json?qid=q18"
curl -v http://basil.kmi.open.ac.uk/basil/1jyd93olk8c3b/view
curl -v http://basil.kmi.open.ac.uk/basil/1jyd93olk8c3b/view/html-list
… and with a browser:
http://basil.kmi.open.ac.uk/basil/1jyd93olk8c3b/api.html-list?qid=q18
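The same calls can be prepared from a script. The sketch below only builds the URLs and request objects with the Python standard library and sends nothing. The helper names api_url and api_request are hypothetical, and "application/json" as the Accept value is an assumption about which media types the content negotiation recognises.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Host and API id are the ones used in the cURL examples above.
BASE = "http://basil.kmi.open.ac.uk/basil"


def api_url(api_id, resource="api", extension=None, **params):
    """Build e.g. <BASE>/<id>/api.json?qid=q18 from its parts."""
    url = f"{BASE}/{api_id}/{resource}"
    if extension:
        url += f".{extension}"
    if params:
        url += "?" + urlencode(params)
    return url


def api_request(api_id, accept="application/json", **params):
    """Prepare (but do not send) a content-negotiated request to /api."""
    return Request(api_url(api_id, **params), headers={"Accept": accept})


if __name__ == "__main__":
    print(api_url("1jyd93olk8c3b", extension="json", qid="q18"))
    request = api_request("1jyd93olk8c3b", qid="q18")
    print(request.full_url, request.get_header("Accept"))
```

Passing the prepared request to urllib.request.urlopen would perform the call, provided the host is still reachable.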

Benefits
• Data integration.
• Guaranteed by design, relying on SPARQL and RDF
• Vocabulary rewriting, inference materialisation, data refactoring,
cleaning, and patching
• Views allow the data to be shaped to fit specific needs
• Interoperability.
• Specialised Web APIs for consumers (towards clients)
• SPARQL as a data requirement language (towards servers)

Benefits
• Loose Coupling.
• No dependencies on (evolving!) remote schemas
• Support independent evolution of client and server
• … just change the query!
• Description.
• Swagger offers good support for discovery, consumption and
testing
• SPARQL queries are also conceptual descriptions of data
requirements!

(More) benefits
• Adoption.
• No introduction of new specifications or formalisms for both data
consumers and providers.
• Moreover:
• Queries can be shared, discovered and reused.
• APIs are sharable and reusable.
• Specifications (Queries!) can be exploited to perform usage
analysis and optimisations…
• A middleware could integrate sophisticated solutions for caching,
query execution or federation …

Conclusions & Related Work
• Tailoring (linked) open data for a task is a problem
• Query design is rarely easy
• Queries are important, we need to reuse them
• We propose a basilar approach to query reuse by exploiting Web
APIs
• Like SQL Stored Queries (SQL Views) and Procedures
– But in an open data setting, use cases are unpredictable!
• Also, BASIL views are from MVC
• Linked Data Platform (to R/W RDF, Entity-based)
• Linked Data API (and others…)
– Too much configuration… you need to study/depend on it!
– We don’t need a new specification and a sophisticated API!

Future work
• User-based evaluation
• we will apply it on data.open.ac.uk (and let you know…)
• Extend description and explanation
• Provenance
• Data cataloguing
• Support discovery and query reuse
• BASIL as the basis for:
– Improve availability (caching, LDFragments)
– Endpoint federations?
– APIs composition?
• Improve the BASIL reference implementation
• Better query storage
• Documentation!
• Nice web interface

Thank you
Enrico Daga
enrico.daga@open.ac.uk
