Behind the Scenes of KnetMiner:
Towards Standardised and Interoperable
Knowledge Graphs
Harpenden, 3/6/2018

Marco Brandizi <marco.brandizi@rothamsted.ac.uk>
Find these slides on SlideShare
KnetMiner-inspired Artwork

by Hugo Dalton (hugodalton.com)
Behind the scenes of KnetMiner
Putting it on a Bigger Picture
<concept>
  <id>1</id>
  <pid>Q75WV3</pid>
  <description/>
  <elementOf>
    <idRef>UNIPROTKB-SwissProt</idRef>
  </elementOf>
  <ofType>
    <idRef>Protein</idRef>
  </ofType>
  <evidences>
    <evidence>
      <idRef>IMPD</idRef>
    </evidence>
  </evidences>
  <conames>
    <concept_name>
      <name>Probable trehalose-phosphate phosphatase 1</name>
      <isPreferred>true</isPreferred>
    </concept_name>
    …

<cc>
  <id>Protein</id>
  <fullname>Protein</fullname>
  <description>
    A protein is comprised of one or more Polypeptides
    and potentially other molecules.
  </description>
  <specialisationOf>
    <idRef>MolCmplx</idRef>
  </specialisationOf>
</cc>

<relation>
  <fromConcept>1</fromConcept>
  <toConcept>3</toConcept>
  <ofType>
    <idRef>participates_in</idRef>
  </ofType>
  <evidences>
    <evidence>
      <idRef>ECO:0000316</idRef>
    </evidence>
  </evidences>
  <relgds/>
</relation>

<concept>
  <id>3</id>
  <pid>GO:0009651</pid>
  <description>response to salt stress</description>
  <ofType><idRef>BioProc</idRef></ofType>
  <coaccessions>
    <concept_accession>
      <accession>GO:0009651</accession>
      <elementOf><idRef>GO</idRef></elementOf>
      <ambiguous>false</ambiguous>
    </concept_accession>
  </coaccessions>
</concept>
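To make the structure of the OXL fragments above more concrete, here is a minimal Java sketch that reads the relation element with the JDK's DOM parser. It is only an illustration: the wrapping <oxl> element, the class name and the method names are assumptions, since the slides show isolated fragments rather than a complete file. Note how the relation only carries file-local numeric ids (1 and 3), which a consumer has to resolve against the concept elements of the same file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class OxlRelationReader
{
  // Hypothetical wrapper: the <oxl> root element is an assumption, the slides show fragments only.
  static final String SAMPLE =
    "<oxl>" +
    "<relation>" +
    "<fromConcept>1</fromConcept>" +
    "<toConcept>3</toConcept>" +
    "<ofType><idRef>participates_in</idRef></ofType>" +
    "<evidences><evidence><idRef>ECO:0000316</idRef></evidence></evidences>" +
    "</relation>" +
    "</oxl>";

  /** Returns one "from -[type]-> to" string per relation element found in the XML. */
  static List<String> readRelations ( String xml )
  {
    try
    {
      Document doc = DocumentBuilderFactory.newInstance ()
        .newDocumentBuilder ()
        .parse ( new ByteArrayInputStream ( xml.getBytes ( StandardCharsets.UTF_8 ) ) );

      List<String> result = new ArrayList<> ();
      NodeList relations = doc.getElementsByTagName ( "relation" );
      for ( int i = 0; i < relations.getLength (); i++ )
      {
        Element rel = (Element) relations.item ( i );
        String from = text ( rel, "fromConcept" );
        String to = text ( rel, "toConcept" );
        // The relation type is the idRef nested inside ofType
        Element ofType = (Element) rel.getElementsByTagName ( "ofType" ).item ( 0 );
        String type = text ( ofType, "idRef" );
        result.add ( from + " -[" + type + "]-> " + to );
      }
      return result;
    }
    catch ( Exception ex )
    {
      throw new IllegalStateException ( "Cannot parse the OXL fragment", ex );
    }
  }

  /** Text content of the first descendant with the given tag name. */
  private static String text ( Element parent, String tag )
  {
    return parent.getElementsByTagName ( tag ).item ( 0 ).getTextContent ().trim ();
  }

  public static void main ( String[] args )
  {
    readRelations ( SAMPLE ).forEach ( System.out::println );
  }
}
```

The ids printed by this sketch mean nothing outside the file they come from, which is one of the interoperability limits that globally resolvable identifiers are meant to address.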
Is XML/OXL Enough?
A Brief History of Data Models/Formats
The Semantic Web Approach: RDF
URI Resolution
@prefix bkr: <http://www.ondex.org/bioknet/resources/> .
@prefix bk: <http://www.ondex.org/bioknet/terms/> .
@prefix bka: <http://www.ondex.org/bioknet/terms/attributes/> .

bkr:TOB1 a bk:Protein ;
  bk:participates_in <http://www.wikipathways.org/id1> ;
  bk:prefName "TOB1" ;
  bk:published_in bkr:23236473 .

The Turtle Syntax: https://www.w3.org/TR/turtle/
Schema/Ontologies
Data store
Schema store
Sharing Identifiers via URIs
Data store
Schema store
Wikipathways
Mapping Data for Interoperability
Our Data Model: The BioKNO Ontology
wp:id1
  a bk:Path ; # a subclass of bk:Concept
  bk:evidence bkev:IMPD ; # Imported from database, a predefined resource type.
  bk:prefName "Bone Morphogenic Protein (BMP) Signalling and Regulation" .

bkr:TOB1 a bk:Protein ;
  dc:identifier bkr:TOB1_acc ;
  bk:prefName "TOB1 HUMAN" ;
  # A simplified link, hiding the BioPax chain:
  # pathwayComponent -> BioChemicalReaction|Complex -> Protein
  bk:participates_in wp:id1 ;
  bk:is_annotated_by obo:GO_0030014 . # Same URI as the OBO Gene Ontology Term.

# Structured accession, which allows the identifier to be linked to its context.
bkr:TOB1_acc a bk:Accession ;
  dcterms:identifier "TOB1" ;
  # Instance of bk:DataSource, another predefined entity.
  bk:dataSource bkds:UNIPROTKB .
BioKNO: Biological Entities
# For practical reasons, we expect the straight triple to be
# always asserted, with the reified version optionally added to it.
bkr:TOB1 bk:published_in bkr:15489334 .

bkr:citation_TOB1_15489334 a bk:Relation ;
  # The same properties that are used for regular relations
  bk:relTypeRef bk:published_in ;
  bk:relFrom bkr:TOB1 ;
  bk:relTo bkr:15489334 ;
  # An attribute
  bka:score 0.95 ;
  # Both attributes and object properties can be linked to a
  # reified relation.
  bk:evidence bkev:TextMining .
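As a rough illustration of the pattern above (straight triple first, reified bk:Relation with its attributes second), the following Java sketch generates both forms from the parts of a relation. The class and method names are hypothetical, and the string-based output is only meant to show the shape of the data, not how an actual converter works.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ReifiedRelationSketch
{
  /**
   * Returns Turtle-like text with the straight triple followed by its reified version.
   * All arguments are prefixed names; attrs maps a property to its already-formatted value.
   */
  static String toTurtle ( String relUri, String from, String type, String to, Map<String, String> attrs )
  {
    // The straight triple is always asserted
    String straight = from + " " + type + " " + to + " .";

    // The reified version repeats type, source and target as properties of a bk:Relation
    List<String> po = new ArrayList<> ();
    po.add ( "a bk:Relation" );
    po.add ( "bk:relTypeRef " + type );
    po.add ( "bk:relFrom " + from );
    po.add ( "bk:relTo " + to );
    // Attributes and further object properties hang off the reified relation
    attrs.forEach ( ( prop, value ) -> po.add ( prop + " " + value ) );

    String reified = relUri + " " + String.join ( " ;\n  ", po ) + " .";
    return straight + "\n" + reified;
  }

  /** The TOB1 citation used in the slides. */
  static String example ()
  {
    Map<String, String> attrs = new LinkedHashMap<> ();
    attrs.put ( "bka:score", "0.95" );
    attrs.put ( "bk:evidence", "bkev:TextMining" );
    return toTurtle (
      "bkr:citation_TOB1_15489334", "bkr:TOB1", "bk:published_in", "bkr:15489334", attrs
    );
  }

  public static void main ( String[] args )
  {
    System.out.println ( example () );
  }
}
```

Keeping the straight triple means that simple queries do not need to know about reification, while queries that need the score or the evidence can go through the bk:Relation node.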
Attributes in Reified Relations
Talking to the Rest of The World
BioKNO | External Ontologies | Mapping Type
bk:Concept | skos:Concept | Subclass
bk:Relation; bk:relFrom, bk:relTypeRef, bk:relTo | rdf:Statement; rdf:subject, rdf:predicate, rdf:object (i.e., mapping to RDF reified statements) | Subclass; Subproperties
bk:Path, bk:Participant, bk:Interaction, bk:Transport, bk:Protein, bk:Gene | Classes with same names in BioPAX and SIO | Equivalent Class
bk:participates_in, bk:has_participant | Relation Ontology (RO) properties with same names; biopax:participant (as sub-property) | Equivalent property
bk:produces, bk:produced_by, bk:consumes, bk:consumed_by | biopax:product (as sub-property); RO properties with same names | Equivalent property
bk:regulates, bk:positively_regulates, bk:negatively_regulates | RO properties with same names | Equivalent property
bk:is_a, bk:part_of, bk:has_part, bk:occurs_in, bk:co_occurs_with | skos:broader; Basic Formal Ontology (BFO)/RO properties with same names | Equivalent property
bk:Publication | schema:CreativeWork | Subclass
bka:abstract, bka:title (also known as AbstractHeader), bka:authors | dcterms:description, dcterms:title, dc:creator | Sub-property
How to Serve and Query RDF?
Typical RDF (and Data) Architecture
How to Use it, Concretely?
Playground: SPARQL Browsers
How to Use it, Concretely?
Programmatically: RDF Frameworks (Jena in this case)
import org.apache.jena.query.Query;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSet;
import org.apache.jena.rdf.model.Literal;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.sparql.engine.http.QueryEngineHTTP;

// The SPARQL endpoint to query
String service = "http://localhost:3030/ds/query";

// Publications linked to TOB1 by a reified relation with score > 0.90
String sparql =
  "PREFIX bk: <http://www.ondex.org/bioknet/terms/>\n" +
  // … (further prefix declarations are elided in the slides;
  // bka: and xsd: below are the ones this query needs)
  "PREFIX bka: <http://www.ondex.org/bioknet/terms/attributes/>\n" +
  "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n" +
  "\n" +
  "SELECT DISTINCT ?pmid ?title ?year ?pub\n" +
  "{\n" +
  "  ?prot a bk:Protein;\n" +
  "    bk:prefName 'TOB1'.\n" +
  "\n" +
  "  ?pubRel a bk:Relation;\n" +
  "    bk:relFrom ?prot;\n" +
  "    bk:relTo ?pub;\n" +
  "    bka:Score ?score.\n" +
  "\n" +
  "  FILTER ( ?score > 0.90 )\n" +
  "\n" +
  "  ?pub\n" +
  "    bka:PMID ?pmid;\n" +
  "    bka:YEAR ?dyear;\n" +
  "    bka:abstractHeader ?title.\n" +
  "\n" +
  "  BIND ( xsd:int ( ?dyear ) AS ?year )\n" +
  "}\n" +
  "LIMIT 1000";

// Parse the query and send it to the remote endpoint
Query query = QueryFactory.create ( sparql );
QueryEngineHTTP qexec = QueryExecutionFactory.createServiceRequest (
  service, query
);
ResultSet results = qexec.execSelect ();

// Each solution binds the variables listed in the SELECT clause
results.forEachRemaining ( ( QuerySolution soln ) ->
{
  Resource pubNode = soln.getResource ( "pub" );
  String uri = pubNode.getURI ();
  Literal titleNode = soln.getLiteral ( "title" );
  String title = titleNode.getString ();
  String titleLang = titleNode.getLanguage ();
  Literal yearNode = soln.getLiteral ( "year" );
  int year = yearNode.getInt ();
  System.out.format (
    "Publication ID: <%s>, title: %s (in %s), year: %d\n",
    uri, title, titleLang, year
  );
});

// Release the HTTP connection
qexec.close ();
SPARQL for Extraction, Loading, Transformation
(The Simpler-than-Ondex Way)
CONSTRUCT {
  ?path a bk:Path;
    bk:prefName ?pathName;
    bk:evidence bkev:IMPD.

  ?bkProt a bk:Protein;
    dc:identifier ?bkProtAccUri;
    bk:prefName ?protName;
    bk:participates_in ?path.

  ?bkProtAccUri a bk:Accession;
    dcterms:identifier ?protName;
    bk:dataSource bkds:UNIPROTKB.
}
WHERE
{
  ?path a bp:Pathway;
    bp:displayName ?pathName;
    bp:pathwayComponent ?comp.

  {
    ?comp a bp:BiochemicalReaction;
      bp:left|bp:right ?protein.
  }
  UNION {
    ?comp a bp:Complex;
      bp:component ?protein.
  }

  ?protein a bp:Protein;
    bp:displayName ?protName.

  BIND ( IRI ( CONCAT ( STR ( bkr: ), STR ( ?protName ) ) ) AS ?bkProt )
  BIND ( IRI ( CONCAT ( STR ( ?bkProt ), "_acc" ) ) AS ?bkProtAccUri )
}
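The two BIND expressions in the CONSTRUCT query above mint new BioKNO URIs from the BioPAX display name. A minimal Java sketch of the same string logic is shown here, assuming the bkr: namespace declared earlier in the slides; the class and method names are hypothetical and serve only to illustrate the convention.

```java
class IriMintingSketch
{
  // Namespace bound to the bkr: prefix in the slides
  static final String BKR = "http://www.ondex.org/bioknet/resources/";

  /** Mirrors BIND ( IRI ( CONCAT ( STR ( bkr: ), STR ( ?protName ) ) ) AS ?bkProt ) */
  static String proteinIri ( String protName )
  {
    return BKR + protName;
  }

  /** Mirrors BIND ( IRI ( CONCAT ( STR ( ?bkProt ), "_acc" ) ) AS ?bkProtAccUri ) */
  static String accessionIri ( String proteinIri )
  {
    return proteinIri + "_acc";
  }

  public static void main ( String[] args )
  {
    String prot = proteinIri ( "TOB1" );
    System.out.println ( prot );
    System.out.println ( accessionIri ( prot ) );
  }
}
```

Plain concatenation like this only yields a valid IRI when the name contains no spaces or reserved characters. For names that do, SPARQL 1.1 offers ENCODE_FOR_URI, which could be applied to ?protName before concatenating.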
SPARQL/RDF for ELT
• TARQL: Using SPARQL to RDF-Convert Tabular CSV Files
• RDF/XML can be transformed via XSL
  • We have done it for bio-specific ontology definitions in Ondex
• Programmatic conversions
  • Using RDF frameworks, e.g., Jena, RDF4J (formerly Sesame), rdflib for Python
  • See also java2rdf (https://github.com/EBIBioSamples/java2rdf)
    • We have used it for the Ondex->RDF converter
The Bigger Picture
https://www.economist.com/node/21521548
https://goo.gl/n4m5xL
The Bigger Picture: Linked Open Data
https://lod-cloud.net/
In the Life Sciences
Another Graph Database World
The Cypher Query/DML Language
Proteins->Reactions->Pathways:
// chain of paths, node selection via property (exploits indices)
MATCH (prot:Protein) - [csby:consumed_by] -> (:Reaction) -
  [:part_of] -> (pway:Path{ title: 'apoptosis' })
// further conditions, not always so performant
WHERE prot.name =~ '(?i)^DNA.+'
// Usual projection and post-selection operators
RETURN prot.name, pway
// Relations can have properties
ORDER BY csby.pvalue
LIMIT 1000
Proteins->Reactions->Pathways:
// Single-path (or same-direction branching) easy to write
MATCH (prot:Protein) - [:produced_by|consumed_by] -> (:Reaction)
  - [:part_of*1..3] -> (pway:Path)
RETURN ID(prot), ID(pway) LIMIT 1000

// Very compact forms available, depending on the data
MATCH (prot:Protein) -- (pway:Path) RETURN pway
Cypher as Semantic Motif Language
The rdf2neo Tool
SELECT ?iri
{
  ?label rdfs:subClassOf* bk:Concept.
  ?iri a ?label.
}

SELECT ?label
{
  {
    ?iri a ?label.
    ?label rdfs:subClassOf* bk:Concept.
  }
  UNION {
    # it's always instance of concept
    BIND ( bk:Concept AS ?label )
    BIND ( ?iri AS ?iri )
  }
}

SELECT ?name ?value
{
  {
    ?iri ?name ?value.
    VALUES ( ?name ) {
      (dcterms:identifier)
      (dcterms:description)
      (rdfs:comment)
      (bk:prefName)
      (bk:altName)
    }
  }
  UNION {
    ?iri ?name ?value.
    ?name rdfs:subPropertyOf* bk:attribute.
  }
}
https://github.com/Rothamsted/rdf2neo
How to Use it, Concretely?
Playground: The Neo4j Browser
How to Use it, Concretely?
Programmatically: The Neo4j Drivers (for Java in this case)
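import org.neo4j.driver.v1.AuthToken;
import org.neo4j.driver.v1.AuthTokens;
import org.neo4j.driver.v1.Driver;
import org.neo4j.driver.v1.GraphDatabase;
import org.neo4j.driver.v1.Session;
import org.neo4j.driver.v1.Statement;
import org.neo4j.driver.v1.StatementResult;

AuthToken auth = AuthTokens.basic ( "neo4j", "test" );
// Driver and session are closed automatically at the end of the block
try (
  Driver neodb = GraphDatabase.driver ( "bolt://127.0.0.1:7687", auth );
  Session session = neodb.session ();
)
{
  // Publications linked to TOB1 with score > 0.9, most recent first
  String cypher =
    "MATCH (prot:Protein{ prefName:'TOB1' }) - [r:published_in] -> (pub)\n" +
    "WHERE toFloat ( r.Score ) > 0.9\n" +
    "RETURN pub.PMID, pub.AbstractHeader, pub.YEAR\n" +
    "ORDER BY pub.YEAR DESC\n" +
    "LIMIT 30";
  Statement stmt = new Statement ( cypher );
  StatementResult rs = session.run ( stmt );
  // Each record exposes the returned columns by name
  rs.forEachRemaining ( rec -> {
    String pmid = rec.get ( "pub.PMID" ).asString ();
    String title = rec.get ( "pub.AbstractHeader" ).asString ();
    String year = rec.get ( "pub.YEAR" ).asString ();
    System.out.format (
      "PMID: %s, Title: \"%s\", year: %s\n",
      pmid, title, year
    );
  });
}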
Triple Stores vs Prop Graphs
Data exchange format
  Neo4j, Cypher DBs, Graph DBs:
    - No official one, just Cypher
    Support for GraphML, RDF
    +/- Focus on backing applications
  Semantic Web/Triple Stores:
    + Focus on data sharing standards

Data model
  Neo4j, Cypher DBs, Graph DBs:
    + Relations with properties
    - Metadata/schemas/ontologies management
  Semantic Web/Triple Stores:
    - Relations cannot have properties (reification required)
    + Metadata/schemas/ontologies as first-class citizens and standardised OWL

Performance
  Neo4j, Cypher DBs, Graph DBs:
    + Complex graph traversals
  Semantic Web/Triple Stores:
    + Comparable in most cases

Query language
  Neo4j, Cypher DBs, Graph DBs:
    + Cypher is easier (e.g., compact, implicit elements)?
    - Expressivity issues (unions)
    - No standard QL (but efforts in progress, e.g., OpenCypher)
  Semantic Web/Triple Stores:
    - SPARQL is harder? (URIs, namespaces, verbosity)
    + SPARQL is more expressive

Standardisation, openness
  Neo4j, Cypher DBs, Graph DBs:
    +/- (TinkerPop is open, Neo4j isn't)
    + Commercial support
    + More alive and up-to-date (e.g., support for Hadoop, nice Neo4j browser, easy installation)
  Semantic Web/Triple Stores:
    + Natively open, many open implementations
    - Instability and many short-lived prototypes
    - Advancements seem to be slowing down
    + Some nice open and commercial browsers (LODEStar, …)

Scalability, big data
  Neo4j, Cypher DBs, Graph DBs:
    +/- Commercial support for clustering/clouds for Neo4j
    + Open support in TinkerPop
  Semantic Web/Triple Stores:
    + Load balancing/cluster solutions, commercial cloud support (e.g., GraphDB)
    + SPARQL over TinkerPop (via SAIL interface)
Supporting Web APIs via JSON
{
  "type": "Protein",
  "id": "TOB1",
  "prefName": "TOB1 Human",
  "participates_in":
  {
    "type": "Pathway",
    "id": "id1",
    "evidence": "IMPD",
    "prefName": "Bone Morphogenic Protein (BMP) Signalling and Regulation"
  },
  "is_annotated_by": "GO_0030014"
}
• Designed to be compatible with browsers, i.e., JavaScript
• Language of choice for web APIs, consumption by web browsers, dynamic web interfaces (i.e., AJAX)
• Conceptually similar to XML (trees, nested structures)
• Often used in a lightweight way, without many schema constraints
Bridging to RDF: JSON-LD
{
  "@context": {
    "bk": "http://www.ondex.org/bioknet/terms/",
    "bka": "http://www.ondex.org/bioknet/terms/attributes/",
    "bkds": "http://www.ondex.org/bioknet/terms/dataSources/",
    "bkev": "http://www.ondex.org/bioknet/terms/evidences/",
    "bkr": "http://www.ondex.org/bioknet/resources/",
    "dcterms": "http://purl.org/dc/terms/",
    "obo": "http://purl.obolibrary.org/obo/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "@vocab": "http://www.ondex.org/bioknet/terms/",
    "dcterms:identifier": { "@type": "xsd:string" },
    "evidence": { "@type": "@id" }
  },
  …
  "@id": "bkr:TOB1",
  "@type": "bk:Protein",
  "prefName": "TOB1 Human",
  "dcterms:identifier": "TOB1",
  "is_annotated_by": "obo:GO_0030014",
  "participates_in": {
    "@id": "http://www.wikipathways.org/id1",
    "@type": "bk:Pathway",
    "evidence": "bkev:IMPD",
    "prefName": "Bone Morphogenic Protein (BMP) Signalling and Regulation"
  }
}
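To show what the @context buys over plain JSON, here is a deliberately simplified Java sketch of how a compact name such as bkr:TOB1 or prefName can be turned into a full URI using the prefixes and the @vocab entry from the example. This is not the JSON-LD expansion algorithm, which a real processor implements with many more rules; the class and method names are hypothetical, and only a subset of the prefixes is copied over.

```java
import java.util.Map;

class CompactIriSketch
{
  // A subset of the @context shown in the slides
  static final Map<String, String> CONTEXT = Map.of (
    "bk", "http://www.ondex.org/bioknet/terms/",
    "bkr", "http://www.ondex.org/bioknet/resources/",
    "obo", "http://purl.obolibrary.org/obo/",
    "dcterms", "http://purl.org/dc/terms/",
    "@vocab", "http://www.ondex.org/bioknet/terms/"
  );

  /** Expands a prefixed name or a bare property name into a full URI. */
  static String expand ( String term )
  {
    // Absolute URIs pass through untouched
    if ( term.contains ( "://" ) ) return term;

    int colon = term.indexOf ( ':' );
    if ( colon > 0 )
    {
      // prefix:local, resolved through the prefix declared in the context
      String ns = CONTEXT.get ( term.substring ( 0, colon ) );
      return ns == null ? term : ns + term.substring ( colon + 1 );
    }

    // Bare names (e.g. property keys) fall back to the default vocabulary
    return CONTEXT.get ( "@vocab" ) + term;
  }

  public static void main ( String[] args )
  {
    System.out.println ( expand ( "bkr:TOB1" ) );
    System.out.println ( expand ( "prefName" ) );
    System.out.println ( expand ( "http://www.wikipathways.org/id1" ) );
  }
}
```

With this mapping in place, the JSON document stays almost as lightweight as the plain version, yet every key and identifier resolves to the same URIs used in the RDF representation.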
JSON Schemas Babylon (and Our Focus)
Take-Home Messages
• From small data integration farm to sharing with the rest of the world => FAIR Principles
• Semantic Web has pros and cons
  • Still useful for data model and schema governance, identifiers, complex models (namely, ontologies)
• Alternative data sharing approaches, property graphs (PG) in particular
  • More alive area, can be simpler (blends into existing industrial software better)
  • LOD/FAIR principles not addressed much
  • Integrating the two is useful
• APIs are a useful alternative/complementary approach
  • LOD/FAIR principles to be addressed as well
• On our radar:
  • Complete the work, publishing SPARQL, Neo4j access, APIs
  • Integrate similar projects in the agrifood field (e.g., BrAPI, DFW)
  • Contribute to standardisation efforts like Bioschemas
Dev 2014 LOD tutorial
 
BioSamples Database Linked Data, SWAT4LS Tutorial
BioSamples Database Linked Data, SWAT4LS TutorialBioSamples Database Linked Data, SWAT4LS Tutorial
BioSamples Database Linked Data, SWAT4LS Tutorial
 
Semic 2013
Semic 2013Semic 2013
Semic 2013
 

Recently uploaded

Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectBoston Institute of Analytics
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelBoston Institute of Analytics
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxHimangsuNath
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaManalVerma4
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxSimranPal17
 
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfWorld Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfsimulationsindia
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxTasha Penwell
 

Recently uploaded (20)

Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis Project
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in India
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptx
 
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfWorld Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
 

Behind the Scenes of KnetMiner: Towards Standardised and Interoperable Knowledge Graphs

  • 1. Behind the Scenes of KnetMiner: Towards Standardised and Interoperable Knowledge Graphs
    Harpenden, 3/6/2018
    Marco Brandizi <marco.brandizi@rothamsted.ac.uk>
    Find these slides on SlideShare
    KnetMiner-inspired Artwork by Hugo Dalton (hugodalton.com)
  • 2. Behind the scenes of KnetMiner
  • 3. Putting it on a Bigger Picture
  • 5. Is XML/OXL Enough?
    <concept>
      <id>1</id>
      <pid>Q75WV3</pid>
      <description/>
      <elementOf>
        <idRef>UNIPROTKB-SwissProt</idRef>
      </elementOf>
      <ofType>
        <idRef>Protein</idRef>
      </ofType>
      <evidences>
        <evidence>
          <idRef>IMPD</idRef>
        </evidence>
      </evidences>
      <conames>
        <concept_name>
          <name>Probable trehalose-phosphate phosphatase 1</name>
          <isPreferred>true</isPreferred>
        </concept_name>
        …

    <cc>
      <id>Protein</id>
      <fullname>Protein</fullname>
      <description>
        A protein is comprised of one or more Polypeptides
        and potentially other molecules.
      </description>
      <specialisationOf>
        <idRef>MolCmplx</idRef>
      </specialisationOf>
    </cc>

    <relation>
      <fromConcept>1</fromConcept>
      <toConcept>3</toConcept>
      <ofType>
        <idRef>participates_in</idRef>
      </ofType>
      <evidences>
        <evidence>
          <idRef>ECO:0000316</idRef>
        </evidence>
      </evidences>
      <relgds/>
    </relation>

    <concept>
      <id>3</id>
      <pid>GO:0009651</pid>
      <description>response to salt stress</description>
      <ofType><idRef>BioProc</idRef></ofType>
      <coaccessions>
        <concept_accession>
          <accession>GO:0009651</accession>
          <elementOf><idRef>GO</idRef></elementOf>
          <ambiguous>false</ambiguous>
        </concept_accession>
      </coaccessions>
    </concept>
  • 6. A Brief History of Data Models/Formats
  • 7. The Semantic Web Approach: RDF
  • 9. URI Resolution
    @prefix bkr: <http://www.ondex.org/bioknet/resources/> .
    @prefix bk: <http://www.ondex.org/bioknet/terms/> .
    @prefix bka: <http://www.ondex.org/bioknet/terms/attributes/> .

    bkr:TOB1 a bk:Protein ;
      bk:participates_in <http://www.wikipathways.org/id1> ;
      bk:prefName "TOB1";
      bk:published_in bkr:23236473.

    The Turtle Syntax: https://www.w3.org/TR/turtle/
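In the Turtle example above, every prefixed name such as bkr:TOB1 is shorthand for a full URI, obtained by concatenating the namespace declared with @prefix and the local name. The following is a minimal Java sketch of that expansion. The class name PrefixExpander and its expand method are hypothetical helpers written for illustration, not part of Jena or of the KnetMiner code base; only the three namespaces come from the slide.

```java
import java.util.Map;

/** Illustrative helper (hypothetical, not a library class): expands prefixed names into full URIs. */
final class PrefixExpander {
    // Namespaces copied from the @prefix declarations in the Turtle example.
    static final Map<String, String> PREFIXES = Map.of(
        "bkr", "http://www.ondex.org/bioknet/resources/",
        "bk", "http://www.ondex.org/bioknet/terms/",
        "bka", "http://www.ondex.org/bioknet/terms/attributes/"
    );

    /** Replaces the prefix before the first colon with its namespace URI. */
    static String expand(String prefixedName) {
        int colon = prefixedName.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Not a prefixed name: " + prefixedName);
        }
        String prefix = prefixedName.substring(0, colon);
        String namespace = PREFIXES.get(prefix);
        if (namespace == null) {
            throw new IllegalArgumentException("Unknown prefix: " + prefix);
        }
        return namespace + prefixedName.substring(colon + 1);
    }
}
```

For instance, PrefixExpander.expand("bkr:TOB1") yields http://www.ondex.org/bioknet/resources/TOB1, which is the URI a client would try to resolve. A real application would normally leave this step to an RDF framework.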
  • 13. Sharing Identifiers via URIs (diagram labels: Data store, Schema store, Wikipathways)
  • 14. Mapping Data for Interoperability
  • 17. Our Data Model: The BioKNO Ontology
  • 18. BioKNO: Biological Entities
    wp:id1 a bk:Path ; # a subclass of bk:Concept
      bk:evidence bkev:IMPD ; # Imported from database, a predefined resource type.
      bk:prefName "Bone Morphogenic Protein (BMP) Signalling and Regulation".

    bkr:TOB1 a bk:Protein ;
      dc:identifier bkr:TOB1_acc ;
      bk:prefName "TOB1 HUMAN";
      # A simplified link, hiding the BioPax chain:
      # pathwayComponent -> BioChemicalReaction|Complex -> Protein
      bk:participates_in wp:id1;
      bk:is_annotated_by obo:GO_0030014. # Same URI as the OBO Gene Ontology Term.

    # Structured accession, allow for linking of identifier and context.
    bkr:TOB1_acc a bk:Accession ;
      dcterms:identifier "TOB1";
      # instance of bk:DataSource. Another predefined entity.
      bk:dataSource bkds:UNIPROTKB.
  • 19. Attributes in Reified Relations
    # For practical reasons, we always expect that the straight
    # triple is always asserted, with the
    # reified version optionally added to it.
    bkr:TOB1 bk:published_in bkr:20068231.

    bkr:citation_TOB1_15489334 a bk:Relation ;
      # the same properties that are used for regular relations
      bk:relTypeRef bk:published_in;
      bk:relFrom bkr:TOB1 ;
      bk:relTo bkr:15489334 ;
      # An attribute
      bka:score 0.95 ;
      # Both attributes and object properties can be linked to a
      # reified relation.
      bk:evidence bkev:TextMining.
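The reification pattern on this slide can be sketched programmatically: given a straight triple, emit it unchanged and add a bk:Relation resource that points back at its subject, predicate and object and carries the attribute. The Java below is a minimal sketch under stated assumptions. The Reifier class, the Triple record and the reify signature are hypothetical names introduced for illustration, terms are kept as prefixed-name strings instead of real RDF nodes, and rdf:type stands in for the Turtle keyword "a". It needs Java 16 or later because it uses records.

```java
import java.util.List;

/** Illustrative sketch (hypothetical names): builds the straight triple plus its reified form. */
final class Reifier {
    /** A subject-predicate-object statement, with terms kept as prefixed-name strings. */
    record Triple(String subject, String predicate, String object) {}

    /**
     * Returns the straight triple first, followed by the reified relation that uses the
     * bk:Relation / bk:relTypeRef / bk:relFrom / bk:relTo pattern and a bka:score attribute.
     * The relation identifier and the score are chosen by the caller.
     */
    static List<Triple> reify(Triple straight, String relationId, double score) {
        return List.of(
            straight, // the straight triple is always asserted
            new Triple(relationId, "rdf:type", "bk:Relation"),
            new Triple(relationId, "bk:relTypeRef", straight.predicate()),
            new Triple(relationId, "bk:relFrom", straight.subject()),
            new Triple(relationId, "bk:relTo", straight.object()),
            new Triple(relationId, "bka:score", Double.toString(score))
        );
    }
}
```

Keeping the straight triple in the output mirrors the slide's convention, so simple queries can still match bk:published_in directly and only score-aware queries need to touch the reified form.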
  • 20. Talking to the Rest of The World
    BioKNO | External Ontologies | Mapping Type
    bk:Concept | skos:Concept | Subclass
    bk:Relation, bk:relFrom, bk:relTypeRef, bk:relTo | rdf:Statement, rdf:subject, rdf:predicate, rdf:object | Subclass, Subproperties (ie, mapping to RDF reified statements)
    bk:Path, bk:Participant, bk:Interaction, bk:Transport, bk:Protein, bk:Gene | Classes with same names in BioPAX and SIO | Equivalent Class
    bk:participates_in, bk:has_participant | Relation Ontology (RO) properties with same names; biopax:participant (as sub-property) | Equivalent property
    bk:produces, bk:produced_by, bk:consumes, bk:consumed_by | biopax:product (as sub-property); RO properties with same names | Equivalent property
    bk:regulates, bk:positively_regulates, bk:negatively_regulates | RO properties with same names | Equivalent property
    bk:is_a, bk:part_of, bk:has_part, bk:occurs_in, bk:co_occurs_with | skos:broader; Basic Formal Ontology (BFO)/RO properties with same names | Equivalent property
    bk:Publication | schema:CreativeWork | Subclass
    bka:abstract, bka:title (also known as AbstractHeader), bka:authors | dcterms:description, dcterms:title, dc:creator | Sub-property
  • 22. How to Serve and Query RDF?
  • 23. Typical RDF (and Data) Architecture
  • 24. How to Use it, Concretely? Playground: SPARQL Browsers
  • 27. How to Use it, Concretely? Programmatically: RDF Frameworks (Jena in this case)
  • 30. How to Use it, Concretely? Programmatically: RDF Frameworks (Jena in this case)
    String service = "http://localhost:3030/ds/query";
    String sparql =
      "PREFIX bk: <http://www.ondex.org/bioknet/terms/>\n" +
      …
      "\n" +
      "\n" +
      "SELECT DISTINCT ?pmid ?title ?year ?pub \n" +
      "{\n" +
      "  ?prot a bk:Protein;\n" +
      "    bk:prefName 'TOB1'.\n" +
      "  \n" +
      "  ?pubRel a bk:Relation;\n" +
      "    bk:relFrom ?prot;\n" +
      "    bk:relTo ?pub;\n" +
      "    bka:Score ?score.\n" +
      "  \n" +
      "  FILTER ( ?score > 0.90 )\n" +
      "  \n" +
      "  ?pub \n" +
      "    bka:PMID ?pmid ;\n" +
      "    bka:YEAR ?dyear;\n" +
      "    bka:abstractHeader ?title\n" +
      "\n" +
      "  BIND ( xsd:int ( ?dyear ) AS ?year )\n" +
      "}\n" +
      "LIMIT 1000";

    Query query = QueryFactory.create ( sparql );
    QueryEngineHTTP qexec = QueryExecutionFactory.createServiceRequest ( service, query );
    ResultSet results = qexec.execSelect ();

    results.forEachRemaining ( (QuerySolution soln ) -> {
      Resource pubNode = soln.getResource ( "pub" );
      String uri = pubNode.getURI ();

      Literal titleNode = soln.getLiteral ( "title" );
      String title = titleNode.getString ();
      String titleLang = titleNode.getLanguage ();

      Literal yearNode = soln.getLiteral ( "year" );
      int year = yearNode.getInt ();

      System.out.format (
        "Publication ID: <%s>, title: %s (in %s), year: %d\n",
        uri, title, titleLang, year
      );
    });
  • 31. SPARQL for Extraction, Loading, Transformation (The Simpler-than-Ondex Way)
    CONSTRUCT {
      ?path a bk:Path;
        bk:prefName ?pathName;
        bk:evidence bkev:IMPD.

      ?bkProt a bk:Protein;
        dc:identifier ?bkProtAccUri;
        bk:prefName ?protName;
        bk:participates_in ?path.

      ?bkProtAccUri a bk:Accession;
        dcterms:identifier ?protName;
        bk:dataSource bkds:UNIPROTKB.
    }
    WHERE {
      ?path a bp:Pathway;
        bp:displayName ?pathName;
        bp:pathwayComponent ?comp.

      { ?comp a bp:BiochemicalReaction;
          bp:left|bp:right ?protein. }
      UNION
      { ?comp a bp:Complex;
          bp:component ?protein. }

      ?protein a bp:Protein;
        bp:displayName ?protName.

      BIND ( IRI ( CONCAT ( STR ( bkr: ), STR ( ?protName ) ) ) AS ?bkProt )
      BIND ( IRI ( CONCAT ( STR ( ?bkProt ), "_acc" ) ) AS ?bkProtAccUri )
    }
  • 33. SPARQL/RDF for ELT
    • TARQL: Using SPARQL to RDF-Convert Tabular CSV Files
    • RDF/XML can be transformed via XSL
      • We have done it for bio-specific ontology definitions in Ondex
    • Programmatic conversions
      • Using RDF frameworks, eg, Jena, RDF4J (formerly Sesame), rdflib for Python
      • See also java2rdf (https://github.com/EBIBioSamples/java2rdf)
      • We have used it for the Ondex->RDF converter
  • 39. The Bigger Picture: Linked Open Data (https://lod-cloud.net/)
  • 40. In the Life Sciences
  • 43. The Cypher Query/DML Language

    Proteins->Reactions->Pathways:
    // chain of paths, node selection via property (exploits indices)
    MATCH (prot:Protein) - [csby:consumed_by] -> (:Reaction) - [:part_of] -> (pway:Path{ title: 'apoptosis' })
    // further conditions, not always so performant
    WHERE prot.name =~ '(?i)^DNA.+'
    // Usual projection and post-selection operators
    RETURN prot.name, pway
    // Relations can have properties
    ORDER BY csby.pvalue
    LIMIT 1000

    Proteins->Reactions->Pathways:
    // Single-path (or same-direction branching) easy to write
    MATCH (prot:Protein) - [:produced_by|consumed_by] -> (:Reaction)
      - [:part_of*1..3] -> (pway:Path)
    RETURN ID(prot), ID(pway) LIMIT 1000

    // Very compact forms available, depending on the data
    MATCH (prot:Protein) - (pway:Path) RETURN pway
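The variable-length pattern [:part_of*1..3] in the second query asks for pathways reachable through one to three part_of hops. As a rough illustration of what that bound means, here is a minimal Java sketch over an in-memory adjacency map. The BoundedTraversal class and its reachable method are hypothetical names, and the sketch assumes the part_of hierarchy is acyclic. It does not reproduce how Neo4j evaluates the pattern; in particular, Cypher returns one row per matching path, while this collects distinct end nodes.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch (hypothetical names): hop-bounded reachability over directed edges. */
final class BoundedTraversal {
    /**
     * Returns every node reachable from start by following at least minHops
     * and at most maxHops outgoing edges. Assumes the edge map is acyclic.
     */
    static Set<String> reachable(Map<String, List<String>> edges, String start, int minHops, int maxHops) {
        Set<String> result = new TreeSet<>();
        Set<String> frontier = Set.of(start);
        for (int hop = 1; hop <= maxHops && !frontier.isEmpty(); hop++) {
            // Advance the frontier by one edge.
            Set<String> next = new HashSet<>();
            for (String node : frontier) {
                next.addAll(edges.getOrDefault(node, List.of()));
            }
            // Only hops inside the requested range contribute to the result.
            if (hop >= minHops) {
                result.addAll(next);
            }
            frontier = next;
        }
        return result;
    }
}
```

With part_of edges r1 -> p1 -> p2 -> p3 -> p4, a call with bounds 1 and 3 starting at r1 collects p1, p2 and p3 but not p4, which lies four hops away.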
  • 44. Cypher as Semantic Motif Language
  • 49. The rdf2neo Tool
    SELECT ?iri
    {
      ?label rdfs:subClassOf* bk:Concept.
      ?iri a ?label.
    }

    SELECT ?label
    {
      { ?iri a ?label.
        ?label rdfs:subClassOf* bk:Concept. }
      UNION {
        # it's always instance of concept
        BIND ( bk:Concept AS ?label )
        BIND ( ?iri AS ?iri )
      }
    }

    SELECT ?name ?value
    {
      { ?iri ?name ?value.
        VALUES ( ?name ) {
          (dcterms:identifier)
          (dcterms:description)
          (rdfs:comment)
          (bk:prefName)
          (bk:altName)
        }
      }
      UNION {
        ?iri ?name ?value.
        ?name rdfs:subPropertyOf* bk:attribute.
      }
    }
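The three queries above select, in turn, the node IRIs, the labels for each node, and the properties for each node. The Java below is a rough in-memory sketch of that mapping idea, not the actual rdf2neo implementation. RdfToNodes, Triple and Node are hypothetical names, the set of concept classes is passed in explicitly in place of the rdfs:subClassOf* inference, and the rdfs:subPropertyOf* bk:attribute branch of the third query is left out. It needs Java 16 or later because it uses records.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Illustrative sketch (hypothetical names): maps RDF triples onto labelled property-graph nodes. */
final class RdfToNodes {
    record Triple(String s, String p, String o) {}
    record Node(String iri, Set<String> labels, Map<String, String> properties) {}

    // Properties copied onto nodes, mirroring the VALUES list of the third query.
    static final Set<String> NODE_PROPERTIES = Set.of(
        "dcterms:identifier", "dcterms:description", "rdfs:comment", "bk:prefName", "bk:altName");

    static Map<String, Node> toNodes(List<Triple> triples, Set<String> conceptClasses) {
        Map<String, Node> nodes = new TreeMap<>();
        // Queries 1 and 2: each instance of a concept class becomes a node,
        // labelled with its class and always with bk:Concept.
        for (Triple t : triples) {
            if (t.p().equals("rdf:type") && conceptClasses.contains(t.o())) {
                Node node = nodes.computeIfAbsent(
                    t.s(), iri -> new Node(iri, new TreeSet<>(), new TreeMap<>()));
                node.labels().add(t.o());
                node.labels().add("bk:Concept");
            }
        }
        // Query 3: copy the selected properties onto the nodes found above.
        for (Triple t : triples) {
            Node node = nodes.get(t.s());
            if (node != null && NODE_PROPERTIES.contains(t.p())) {
                node.properties().put(t.p(), t.o());
            }
        }
        return nodes;
    }
}
```

Triples whose predicate is not in the property list, such as bk:participates_in, are ignored here. In a property graph they would become relationships between nodes, which is outside the scope of this sketch.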
  • 51. How to Use it, Concretely? Playground: The Neo4j Browser
  • 52. How to Use it, Concretely? Programmatically: The Neo4j Drivers (for Java in this case)
  • 53. How to Use it, Concretely? Programmatically: The Neo4j Drivers (for Java in this case)
    AuthToken auth = AuthTokens.basic ( "neo4j", "test" );
    try (
      Driver neodb = GraphDatabase.driver ( "bolt://127.0.0.1:7687", auth );
      Session session = neodb.session ();
    )
    {
      String cypher =
        "MATCH (prot:Protein{ prefName:'TOB1' }) - [r:published_in] -> (pub)\n" +
        "WHERE toFloat ( r.Score ) > 0.9\n" +
        "RETURN pub.PMID, pub.AbstractHeader, pub.YEAR\n" +
        "ORDER BY pub.YEAR DESC\n" +
        "LIMIT 30";

      Statement stmt = new Statement ( cypher );
      StatementResult rs = session.run ( stmt );

      rs.forEachRemaining ( rec -> {
        String pmid = rec.get ( "pub.PMID" ).asString ();
        String title = rec.get ( "pub.AbstractHeader" ).asString ();
        String year = rec.get ( "pub.YEAR" ).asString ();
        System.out.format ( "PMID: %s, Title: \"%s\", year: %s\n", pmid, title, year );
      });
    }
  • 54. Triple Stores vs Prop Graphs
    Data xchg format
      Neo4j, Cypher DBs, Graph DBs:
        - No official one, just Cypher, support for GraphML, RDF
        +/- Focus on backing applications
      Semantic Web/Triple Stores:
        + Focus on data sharing standards
    Data model
      Neo4j, Cypher DBs, Graph DBs:
        + Relations with properties
        - Metadata/schemas/ontologies management
      Semantic Web/Triple Stores:
        - Relations cannot have properties (reification required)
        + Metadata/schemas/ontologies as first citizen and standardised OWL
    Performance
      Neo4j, Cypher DBs, Graph DBs:
        + Complex graph traversals
      Semantic Web/Triple Stores:
        + Comparable in most cases
    Query Language
      Neo4j, Cypher DBs, Graph DBs:
        + Cypher is easier (eg, compact, implicit elems)?
        - Expressivity issues (unions)
        - No standard QL (but efforts in progress, eg, OpenCypher)
      Semantic Web/Triple Stores:
        - SPARQL is harder? (URIs, namespaces, verbosity)
        + SPARQL more expressive
    Standardisation, openness
      Neo4j, Cypher DBs, Graph DBs:
        +/- (TinkerPop is open, Neo4j isn't)
        + Commercial support
        + More alive and up-to-date (e.g., support for Hadoop, nice Neo4j browser, easy installation)
      Semantic Web/Triple Stores:
        + Natively open, many open implementations
        - Instability and many short-lived prototypes
        - Advancements seem to be slowing down
        + Some nice open and commercial browsers (LODEStar, …)
    Scalability, big data
      Neo4j, Cypher DBs, Graph DBs:
        +/- Commercial support to clustering/clouds for Neo4j
        + Open support in TinkerPop
      Semantic Web/Triple Stores:
        + Load Balancing/Cluster solutions, Commercial Cloud support (eg GraphDB)
        + SPARQL over TinkerPop (via SAIL interface)
  • 55. Supporting Web APIs via JSON
    {
      "type": "Protein",
      "id": "TOB1",
      "prefName": "TOB1 Human",
      "participates_in": {
        "type": "Pathway",
        "id": "id1",
        "evidence": "IMPD",
        "prefName": "Bone Morphogenic Protein (BMP) Signalling and Regulation"
      },
      "is_annotated_by": "GO_0030014"
    }
    • Designed to be compatible with browsers, i.e., JavaScript
    • Language of choice for web APIs, consumption by web browsers, dynamic web interfaces (i.e., AJAX)
    • Conceptually similar to XML (trees, nested structures)
    • Often used in a lightweight way, without many schema constraints
  • 57. Bridging to RDF: JSON-LD
    {
      "@context": {
        "bk": "http://www.ondex.org/bioknet/terms/",
        "bka": "http://www.ondex.org/bioknet/terms/attributes/",
        "bkds": "http://www.ondex.org/bioknet/terms/dataSources/",
        "bkev": "http://www.ondex.org/bioknet/terms/evidences/",
        "bkr": "http://www.ondex.org/bioknet/resources/",
        "dcterms": "http://purl.org/dc/terms/",
        "obo": "http://purl.obolibrary.org/obo/",
        "xsd": "http://www.w3.org/2001/XMLSchema#",
        "@vocab": "http://www.ondex.org/bioknet/terms/",
        "dcterms:identifier": { "@type": "xsd:string" },
        "evidence": { "@type": "@id" }
      },
      …

      …
      "@id": "bkr:TOB1",
      "@type": "bk:Protein",
      "prefName": "TOB1 Human",
      "dcterms:identifier": "TOB1",
      "is_annotated_by": "obo:GO_0030014",
      "participates_in": {
        "@id": "http://www.wikipathways.org/id1",
        "@type": "bk:Pathway",
        "evidence": "bkev:IMPD",
        "prefName": "Bone Morphogenic Protein (BMP) Signalling and Regulation"
      }
    }
  • 58. JSON Schemas Babylon (and Our Focus)
  • 63. Take-Home Messages
    • From small data integration farm to sharing with the rest of the world => FAIR Principles
    • Semantic Web has pros and cons
      • Still useful for data model and schema governance, identifiers, complex models (namely, ontologies)
    • Alternative data sharing approaches, property graphs (PG) in particular
      • More alive area, can be simpler (blends into existing industrial software better)
      • LOD/FAIR principles not addressed much
      • Integrating the two is useful
    • APIs are a useful alternative/complementary approach
      • LOD/FAIR principles to be addressed as well
    • On our radar:
      • Complete the work, publishing SPARQL, Neo4j access, APIs
      • Integrating similar projects in the agrifood field (e.g. BrAPI, DFW)
      • Contribute to standardisation efforts like Bioschemas