SlideShare a Scribd company logo
1 of 32
A Conversation with PHUSE,
Stardog, and FDA DRSI
Knowledge Graphs: Changing
How We Think About Data.
Presenters
Tim Williams
• Knowledge Graph Project Lead, PHUSE
• Lead Statistical Solutions Analyst, UCB Biosciences
Laura Firey
• Product Manager, Stardog
2
• Knowledge Graphs [ ~20 min]
• Cross Industry Perspective [~10min]
• Open Discussion [~30min]
Outline
Most important: “A Conversation...”
3
PHUSE Linked Data Projects
• CDISC Foundational Standards in RDF
• CDISC Conformance Checks
• Reusing Medical Summaries for Enabling Clinical Research
• Regulatory Guidance in RDF
• Clinical Program Design in RDF
• CDISC Protocol Representation Model in RDF
• Analysis Results & Metadata
• RDF Data Cubes for clinical trial results
• Clinical Trials Data as RDF
• Study Data Tabulation Model as Linked Data
• Going Translational with Linked Data
• Study Data Validation and Submission Conformance
• Pre-clinical data + submission metadata
4
PHUSE Linked Data Workshop
5
Thinking About Data
6
Knowledge Graph Data-Centric Model
• Use-case neutral
• Real-world processes,
entities, relationships
Person 1
Hookah
Person 2
• Knowledge Graph as Resource
Description Framework (RDF)
e-Cigarette
smokes
e-Cigarette
smokes smokes smokes
The whiteboard model you draw is the data.
7
• Shared Definitions and Understanding
– What is a Tobacco Product?
– Who is a Tobacco User?
• Coding of
– Medical Conditions
– Adverse Events
– Products, Manufacturers
• Data Classification, Rules, Validation..
• Ontology Driven/Supported
Knowledge Graphs Facilitate Standards
8
• Dictionary
– terms and their definitions
• Taxonomy
– class hierarchy
• Thesaurus
– relationships between terms
Ontology-based Knowledge Graph Standards
• Rules and Restrictions
– Group membership, exclusions, types
– Employ reasoner, infer values, relations
A Data Engineer’s Guide to Semantic Modelling
Ilaria Maresi (June 2020)
http://blog.thehyve.nl/news/ebook-semantic-model
What is an Ontology?
9
Standards: Identifier Examples
1,2,3-TrihydroxypropaneGlycerolGlycerin
http://dbpedia.org/page/GlycerolGlycerin
• Common terminology • Link to other Knowledge Graphs
• Manufacturers• Devices
• Products • Ingredients
RDF
• Uniform Resource Identifier (URI)
• Internationalized Resource Identifier (IRI)
10
• What ingredient may have
contributed to Pam having
COPD while Ray does not?
Knowledge Graph: Easy Answers to Complex Questions
COPD
11
Ray_GillettePam_Poovey
COPD
Hookah smokes
smokes
smokes
smokes
e-Cigarette e-Cigarette
:man_v241kz
Hookah_Me_Up!
Tobacco
:ing_a6766s
Dried_Mint
:ing_2we3q4
Honey Glycerin
:ing_xr7234:ing_3j6g2i
manufacturer:pro_81fq21
contains
ingredient
Mint_Shisha
Mint_Flavor
:ing_hjk66g
Nicotine
:ing_q77e3u:man_5ez82i
e-CignatureMango_Flavor
:ing_4fg421
Nicotine
:ing_q77e3u
contains contains
: pro_0wq01h:pro_a60e2b
ingredientingredient manufacturer
Mango_Flavor_Pod Mint_Flavor_Pod
:dev_zr10q2
:dev_u18lsd :dev_aq174h
:per_452h78:per_xp52kl
sex
age
wt
sex
age
wt
12
Exposure Data View the interactive
visualization at:
https://bit.ly/PamAndRay
Data error
Person
Demographics
Entity Type
Device
Ingredients
Product
Manufacturer
13
This seems complicated!
“People think RDF is a pain because it
is complicated. The truth is even
worse. RDF is painfully simplistic, but
it allows you to work with real-world
data and problems that are horribly
complicated.”
- attributed to Dan Brickley and Libby Miller
14
Data
https://bit.ly/PamAndRayTTL
View the data file at:
15
Pam’s Unique Exposure Ingredient?
PamUniqueIngred.rq
16
• Who is a TobaccoUser ?
Ontology Revisited
• uses
• smokes
• consumes
• chews ...
User Tobacco
• What is a TobaccoProduct?
Tobacco
Hookah
Cigar
e-Cigarette
Cigarette
contains
contains
contains
? ?
Mint_Shisha
Corojo
Virginia
Mango_Flavor_Pod
contains
ingredient
ingredient
ingredient
ingredient
contains ingredient
:Person
:Device
:Device :Product :Ingredient
:Product
:Ingredient
17
Ray_Gillette
:per_452h78
Pam_Poovey
:per_xp52kl
:dev_zr10q2
:man_v241kz
:dev_u18lsd
Hookah_Me_Up!
:dev_aq174h
Hookah
Tobacco
:ing_a6766s
Dried_Mint
:ing_2we3q4
Honey Glycerin
:ing_xr7234:ing_3j6g2i
smokes
contains contains
manufacturer
: pro_0wq01h:pro_a60e2b
smokes
smokes
smokes
:pro_81fq21
contains
e-Cigarette e-Cigarette
Mint_Shisha
Mango_Flavor_Pod Mint_Flavor_Pod
sex
age
wt
sex
age
wt
ingredient
Tobacco Smoker: The path between Person and Ingredient
:Ingredient
:Product
:Device:Person :Person
18
Query: Tobacco Users
TobaccoUsers.rq
19
Tobacco Smoker: Ontology Definition & Query
TobaccoSmoker-Infer.rq
20
Data Validation
Validating Ray’s Demographics data
21
SHApes Constraint Language (SHACL)
“Person Shape”
(Validation Constraints) Person Data
Data Has Shape. Validation has Shape.
22
SEX AGE
M
SHACL
Person Shape
WEIGHT
Demographics
sex: M
age: 43
wt: 165
Validating Ray’s Demographics
Ray Gillette
Data Entry
M/F/U
165
403
> 0
< 800
> 0
< 120
Violation!
23
Validating Age with SHACL
https://bit.ly/RayDemogAndSHACL
View the interactive
visualization at:
Instance Data SHACL Shape
24
Demographics, Constraints, Report
https://bit.ly/RaySHACLResult
View the interactive
visualization at:
Instance Data
SHACL Shape
Validation Result
25
Metadata is Part of the Graph
Ray_Gillette
M
165
sex
age
wt
2020-08-01
Duke U. Clinic
Dr. A. Krieger
enteredBy
site
entryDate
:per_g367k1
403:per_452h78
26
FAIR Data
https://www.go-fair.org/fair-principles/
• Ontologies Mapping
https://www.pistoiaalliance.org/projects/current-projects/ontologies-mapping/
• FAIR Implementation Project & Toolkit
https://www.pistoiaalliance.org/projects/current-projects/fair-implementation/
Findable Accessible Interoperable Reusable
FAIR Data is Linked Data is a Knowledge Graph
27
Knowledge Graphs :
Now for a cross-industry perspective
28
Additional Reading
General / Introductory
• A Data Engineer’s Guide to Semantic Modeling - Maresi . Free e-book download.
• The Data Centric Revolution - McComb
Technical
• Semantic Web for the Working Ontologist (3rd Ed, 2020) – Hendler, Gandon, Allemang
• Demystifying OWL for the Enterprise – Uschold, Ding, Groth
• Validating RDF Data – Gayo, Prud’hommeaux, Boneva . Comparison of SHEX and SHACL.
• Learning SPARQL - DuCharme . Learn RDF by querying the data.
• 3D-force-graph Network graph visualizations in this presentation.
https://bit.ly/PamAndRay
https://bit.ly/RayDemogAndSHACL
https://bit.ly/RaySHACLResult https://bit.ly/PamAndRayTTL
Pam and Ray Exposure Data
29
Reference Slides
30
The Roofshot / Moonshot Manifesto
Concept & Image Attribution: https://rework.withgoogle.com/blog/the-roofshot-manifesto/
Examples
1. Unique Identifiers
2. Validation Rules in SHACL
3. Open Ontology Development for your domain
Invent and apply state-
of-the-art
Roofshot
Incremental impacts
in production
Moonshot
• Enterprise and Industry
Knowledge Graphs
• Across the Data Life Cycle
31
Industry Knowledge Graphs
Demonstrations
Results Data as RDF
Enterprise Knowledge Graphs
Prototype
Industry Standards & Models
Terminology and Coding
The Stairway
to the Stars
Manifesto
32
Unique Identifiers for Pharma
Validation Rules in SHACL
Study Design
Study Protocol

More Related Content

What's hot

PA webinar on benefits & costs of FAIR implementation in life sciences
PA webinar on benefits & costs of FAIR implementation in life sciences PA webinar on benefits & costs of FAIR implementation in life sciences
PA webinar on benefits & costs of FAIR implementation in life sciences Pistoia Alliance
 
Linked Data for Biopharma
Linked Data for BiopharmaLinked Data for Biopharma
Linked Data for BiopharmaTom Plasterer
 
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...Tom Plasterer
 
Pistoia Alliance USA Conference 2016
Pistoia Alliance USA Conference 2016Pistoia Alliance USA Conference 2016
Pistoia Alliance USA Conference 2016Pistoia Alliance
 
Application of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data ResourcesApplication of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data ResourcesPistoia Alliance
 
AI-SDV 2021: Angela Bauch - AILANI for clinical competitive landscaping
AI-SDV 2021: Angela Bauch - AILANI for clinical competitive landscapingAI-SDV 2021: Angela Bauch - AILANI for clinical competitive landscaping
AI-SDV 2021: Angela Bauch - AILANI for clinical competitive landscapingDr. Haxel Consult
 
THOR Workshop - Persistent Identifier Linking
THOR Workshop - Persistent Identifier LinkingTHOR Workshop - Persistent Identifier Linking
THOR Workshop - Persistent Identifier LinkingMaaike Duine
 
Pistoia Alliance USA Conference 2016
Pistoia Alliance USA Conference 2016Pistoia Alliance USA Conference 2016
Pistoia Alliance USA Conference 2016Pistoia Alliance
 
FocalCxm presentation on improving productivity in life sciences research
FocalCxm presentation on improving productivity in life sciences researchFocalCxm presentation on improving productivity in life sciences research
FocalCxm presentation on improving productivity in life sciences researchFOCALCXM
 
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...David Peyruc
 
BioIT 2017 - Ontoforce and Amgen Gene Knowledge Discovery
BioIT 2017 - Ontoforce and Amgen Gene Knowledge DiscoveryBioIT 2017 - Ontoforce and Amgen Gene Knowledge Discovery
BioIT 2017 - Ontoforce and Amgen Gene Knowledge DiscoveryWolfgang G. Hoeck
 
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...Tom Plasterer
 
Open interoperability standards, tools and services at EMBL-EBI
Open interoperability standards, tools and services at EMBL-EBIOpen interoperability standards, tools and services at EMBL-EBI
Open interoperability standards, tools and services at EMBL-EBIPistoia Alliance
 
OTN Gambia 2008
OTN Gambia 2008OTN Gambia 2008
OTN Gambia 2008Greg Fegan
 
Knowledge graphs ilaria maresi the hyve 23apr2020
Knowledge graphs   ilaria maresi the hyve 23apr2020Knowledge graphs   ilaria maresi the hyve 23apr2020
Knowledge graphs ilaria maresi the hyve 23apr2020Pistoia Alliance
 
Fairification experience clarifying the semantics of data matrices
Fairification experience clarifying the semantics of data matricesFairification experience clarifying the semantics of data matrices
Fairification experience clarifying the semantics of data matricesPistoia Alliance
 
Haystack 2019 - Making the case for human judgement relevance testing - Tara ...
Haystack 2019 - Making the case for human judgement relevance testing - Tara ...Haystack 2019 - Making the case for human judgement relevance testing - Tara ...
Haystack 2019 - Making the case for human judgement relevance testing - Tara ...OpenSource Connections
 

What's hot (20)

PA webinar on benefits & costs of FAIR implementation in life sciences
PA webinar on benefits & costs of FAIR implementation in life sciences PA webinar on benefits & costs of FAIR implementation in life sciences
PA webinar on benefits & costs of FAIR implementation in life sciences
 
Linked Data for Biopharma
Linked Data for BiopharmaLinked Data for Biopharma
Linked Data for Biopharma
 
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
 
Pistoia Alliance USA Conference 2016
Pistoia Alliance USA Conference 2016Pistoia Alliance USA Conference 2016
Pistoia Alliance USA Conference 2016
 
Application of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data ResourcesApplication of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data Resources
 
AI-SDV 2021: Angela Bauch - AILANI for clinical competitive landscaping
AI-SDV 2021: Angela Bauch - AILANI for clinical competitive landscapingAI-SDV 2021: Angela Bauch - AILANI for clinical competitive landscaping
AI-SDV 2021: Angela Bauch - AILANI for clinical competitive landscaping
 
THOR Workshop - Persistent Identifier Linking
THOR Workshop - Persistent Identifier LinkingTHOR Workshop - Persistent Identifier Linking
THOR Workshop - Persistent Identifier Linking
 
SciBite
SciBiteSciBite
SciBite
 
Pistoia Alliance USA Conference 2016
Pistoia Alliance USA Conference 2016Pistoia Alliance USA Conference 2016
Pistoia Alliance USA Conference 2016
 
FocalCxm presentation on improving productivity in life sciences research
FocalCxm presentation on improving productivity in life sciences researchFocalCxm presentation on improving productivity in life sciences research
FocalCxm presentation on improving productivity in life sciences research
 
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
tranSMART Community Meeting 5-7 Nov 13 - Session 5: Recent tranSMART Lessons ...
 
AI-SDV 2020: Biomax
AI-SDV 2020: BiomaxAI-SDV 2020: Biomax
AI-SDV 2020: Biomax
 
BioIT 2017 - Ontoforce and Amgen Gene Knowledge Discovery
BioIT 2017 - Ontoforce and Amgen Gene Knowledge DiscoveryBioIT 2017 - Ontoforce and Amgen Gene Knowledge Discovery
BioIT 2017 - Ontoforce and Amgen Gene Knowledge Discovery
 
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
 
Open interoperability standards, tools and services at EMBL-EBI
Open interoperability standards, tools and services at EMBL-EBIOpen interoperability standards, tools and services at EMBL-EBI
Open interoperability standards, tools and services at EMBL-EBI
 
OTN Gambia 2008
OTN Gambia 2008OTN Gambia 2008
OTN Gambia 2008
 
Knowledge graphs ilaria maresi the hyve 23apr2020
Knowledge graphs   ilaria maresi the hyve 23apr2020Knowledge graphs   ilaria maresi the hyve 23apr2020
Knowledge graphs ilaria maresi the hyve 23apr2020
 
Fairification experience clarifying the semantics of data matrices
Fairification experience clarifying the semantics of data matricesFairification experience clarifying the semantics of data matrices
Fairification experience clarifying the semantics of data matrices
 
Haystack 2019 - Making the case for human judgement relevance testing - Tara ...
Haystack 2019 - Making the case for human judgement relevance testing - Tara ...Haystack 2019 - Making the case for human judgement relevance testing - Tara ...
Haystack 2019 - Making the case for human judgement relevance testing - Tara ...
 
TAIR ICAR 2010 Presentation
TAIR ICAR 2010 PresentationTAIR ICAR 2010 Presentation
TAIR ICAR 2010 Presentation
 

Similar to Knowledge Graphs: Changing How We Think About Data

Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...William Gunn
 
Microsoft: A Waking Giant in Healthcare Analytics and Big Data
Microsoft: A Waking Giant in Healthcare Analytics and Big DataMicrosoft: A Waking Giant in Healthcare Analytics and Big Data
Microsoft: A Waking Giant in Healthcare Analytics and Big DataDale Sanders
 
Research Data Alliance Member Statistics December 2015
Research Data Alliance Member Statistics December 2015Research Data Alliance Member Statistics December 2015
Research Data Alliance Member Statistics December 2015Research Data Alliance
 
Research Data Alliance Member Statistics January 2016
Research Data Alliance Member Statistics January 2016Research Data Alliance Member Statistics January 2016
Research Data Alliance Member Statistics January 2016Research Data Alliance
 
Monthly statistics of the RDA community - March 2016
Monthly statistics of the RDA community - March 2016Monthly statistics of the RDA community - March 2016
Monthly statistics of the RDA community - March 2016Research Data Alliance
 
Curlew Research Brussels 2014 Electronic Data & Knowledge Management
Curlew Research Brussels 2014 Electronic Data & Knowledge ManagementCurlew Research Brussels 2014 Electronic Data & Knowledge Management
Curlew Research Brussels 2014 Electronic Data & Knowledge ManagementNick Lynch
 
RDA, EOSC and FAIR
RDA, EOSC and FAIRRDA, EOSC and FAIR
RDA, EOSC and FAIREUDAT
 
Microsoft: A Waking Giant In Healthcare Analytics and Big Data
Microsoft: A Waking Giant In Healthcare Analytics and Big DataMicrosoft: A Waking Giant In Healthcare Analytics and Big Data
Microsoft: A Waking Giant In Healthcare Analytics and Big DataHealth Catalyst
 
20131117 charleston bryant
20131117 charleston bryant20131117 charleston bryant
20131117 charleston bryantORCID, Inc
 
Workshop intro090314
Workshop intro090314Workshop intro090314
Workshop intro090314Philip Bourne
 
Rachel Bruce UK research and data management where are we now
Rachel Bruce UK research and data management where are we nowRachel Bruce UK research and data management where are we now
Rachel Bruce UK research and data management where are we nowJisc
 
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...Neo4j
 
RDA in a nutshell with Plenary 6 details
RDA in a nutshell with Plenary 6 detailsRDA in a nutshell with Plenary 6 details
RDA in a nutshell with Plenary 6 detailsResearch Data Alliance
 

Similar to Knowledge Graphs: Changing How We Think About Data (20)

Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
 
Microsoft: A Waking Giant in Healthcare Analytics and Big Data
Microsoft: A Waking Giant in Healthcare Analytics and Big DataMicrosoft: A Waking Giant in Healthcare Analytics and Big Data
Microsoft: A Waking Giant in Healthcare Analytics and Big Data
 
Research Data Alliance Member Statistics December 2015
Research Data Alliance Member Statistics December 2015Research Data Alliance Member Statistics December 2015
Research Data Alliance Member Statistics December 2015
 
Research Data Alliance Member Statistics January 2016
Research Data Alliance Member Statistics January 2016Research Data Alliance Member Statistics January 2016
Research Data Alliance Member Statistics January 2016
 
Monthly statistics of the RDA community - March 2016
Monthly statistics of the RDA community - March 2016Monthly statistics of the RDA community - March 2016
Monthly statistics of the RDA community - March 2016
 
Curlew Research Brussels 2014 Electronic Data & Knowledge Management
Curlew Research Brussels 2014 Electronic Data & Knowledge ManagementCurlew Research Brussels 2014 Electronic Data & Knowledge Management
Curlew Research Brussels 2014 Electronic Data & Knowledge Management
 
RDA, EOSC and FAIR
RDA, EOSC and FAIRRDA, EOSC and FAIR
RDA, EOSC and FAIR
 
FAIR, FAIRplus and the FAIR Cookbook
FAIR, FAIRplus and the FAIR Cookbook FAIR, FAIRplus and the FAIR Cookbook
FAIR, FAIRplus and the FAIR Cookbook
 
Metadata Standards
Metadata StandardsMetadata Standards
Metadata Standards
 
Microsoft: A Waking Giant In Healthcare Analytics and Big Data
Microsoft: A Waking Giant In Healthcare Analytics and Big DataMicrosoft: A Waking Giant In Healthcare Analytics and Big Data
Microsoft: A Waking Giant In Healthcare Analytics and Big Data
 
20131117 charleston bryant
20131117 charleston bryant20131117 charleston bryant
20131117 charleston bryant
 
Workshop intro090314
Workshop intro090314Workshop intro090314
Workshop intro090314
 
Rda members 05 april2016
Rda members 05 april2016Rda members 05 april2016
Rda members 05 april2016
 
Rachel Bruce UK research and data management where are we now
Rachel Bruce UK research and data management where are we nowRachel Bruce UK research and data management where are we now
Rachel Bruce UK research and data management where are we now
 
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
 
Rda in a Nutshell - February 2019
Rda in a Nutshell - February 2019Rda in a Nutshell - February 2019
Rda in a Nutshell - February 2019
 
RDA in a nutshell with Plenary 6 details
RDA in a nutshell with Plenary 6 detailsRDA in a nutshell with Plenary 6 details
RDA in a nutshell with Plenary 6 details
 
Rda in-a-nutshell-march-2019
Rda in-a-nutshell-march-2019Rda in-a-nutshell-march-2019
Rda in-a-nutshell-march-2019
 
Rda in a_nutshell_november_2018
Rda in a_nutshell_november_2018Rda in a_nutshell_november_2018
Rda in a_nutshell_november_2018
 
Rda in a_nutshell_october2016
Rda in a_nutshell_october2016Rda in a_nutshell_october2016
Rda in a_nutshell_october2016
 

Recently uploaded

SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 

Recently uploaded (20)

SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 

Knowledge Graphs: Changing How We Think About Data

  • 1. A Conversation with PHUSE, Stardog, and FDA DRSI Knowledge Graphs: Changing How We Think About Data.
  • 2. Presenters Tim Williams • Knowledge Graph Project Lead, PHUSE • Lead Statistical Solutions Analyst, UCB Biosciences Laura Firey • Product Manager, Stardog 2
  • 3. • Knowledge Graphs [ ~20 min] • Cross Industry Perspective [~10min] • Open Discussion [~30min] Outline Most important: “A Conversation...” 3
  • 4. PHUSE Linked Data Projects • CDISC Foundational Standards in RDF • CDISC Conformance Checks • Reusing Medical Summaries for Enabling Clinical Research • Regulatory Guidance in RDF • Clinical Program Design in RDF • CDISC Protocol Representation Model in RDF • Analysis Results & Metadata • RDF Data Cubes for clinical trial results • Clinical Trials Data as RDF • Study Data Tabulation Model as Linked Data • Going Translational with Linked Data • Study Data Validation and Submission Conformance • Pre-clinical data + submission metadata 4
  • 5. PHUSE Linked Data Workshop 5
  • 7. Knowledge Graph Data-Centric Model • Use-case neutral • Real-world processes, entities, relationships Person 1 Hookah Person 2 • Knowledge Graph as Resource Description Framework (RDF) e-Cigarette smokes e-Cigarette smokes smokes smokes The whiteboard model you draw is the data. 7
  • 8. • Shared Definitions and Understanding – What is a Tobacco Product? – Who is a Tobacco User? • Coding of – Medical Conditions – Adverse Events – Products, Manufacturers • Data Classification, Rules, Validation.. • Ontology Driven/Supported Knowledge Graphs Facilitate Standards 8
  • 9. • Dictionary – terms and their definitions • Taxonomy – class hierarchy • Thesaurus – relationships between terms Ontology-based Knowledge Graph Standards • Rules and Restrictions – Group membership, exclusions, types – Employ reasoner, infer values, relations A Data Engineer’s Guide to Semantic Modelling Ilaria Maresi (June 2020) http://blog.thehyve.nl/news/ebook-semantic-model What is an Ontology? 9
  • 10. Standards: Identifier Examples 1,2,3-TrihydroxypropaneGlycerolGlycerin http://dbpedia.org/page/GlycerolGlycerin • Common terminology • Link to other Knowledge Graphs • Manufacturers• Devices • Products • Ingredients RDF • Uniform Resource Identifier (URI) • Internationalized Resource Identifier (IRI) 10
  • 11. • What ingredient may have contributed to Pam having COPD while Ray does not? Knowledge Graph: Easy Answers to Complex Questions COPD 11
  • 12. Ray_GillettePam_Poovey COPD Hookah smokes smokes smokes smokes e-Cigarette e-Cigarette :man_v241kz Hookah_Me_Up! Tobacco :ing_a6766s Dried_Mint :ing_2we3q4 Honey Glycerin :ing_xr7234:ing_3j6g2i manufacturer:pro_81fq21 contains ingredient Mint_Shisha Mint_Flavor :ing_hjk66g Nicotine :ing_q77e3u:man_5ez82i e-CignatureMango_Flavor :ing_4fg421 Nicotine :ing_q77e3u contains contains : pro_0wq01h:pro_a60e2b ingredientingredient manufacturer Mango_Flavor_Pod Mint_Flavor_Pod :dev_zr10q2 :dev_u18lsd :dev_aq174h :per_452h78:per_xp52kl sex age wt sex age wt 12
  • 13. Exposure Data View the interactive visualization at: https://bit.ly/PamAndRay Data error Person Demographics Entity Type Device Ingredients Product Manufacturer 13
  • 14. This seems complicated! “People think RDF is a pain because it is complicated. The truth is even worse. RDF is painfully simplistic, but it allows you to work with real-world data and problems that are horribly complicated.” - attributed to Dan Brickley and Libby Miller 14
  • 16. Pam’s Unique Exposure Ingredient? PamUniqueIngred.rq 16
  • 17. • Who is a TobaccoUser ? Ontology Revisited • uses • smokes • consumes • chews ... User Tobacco • What is a TobaccoProduct? Tobacco Hookah Cigar e-Cigarette Cigarette contains contains contains ? ? Mint_Shisha Corojo Virginia Mango_Flavor_Pod contains ingredient ingredient ingredient ingredient contains ingredient :Person :Device :Device :Product :Ingredient :Product :Ingredient 17
  • 18. Ray_Gillette :per_452h78 Pam_Poovey :per_xp52kl :dev_zr10q2 :man_v241kz :dev_u18lsd Hookah_Me_Up! :dev_aq174h Hookah Tobacco :ing_a6766s Dried_Mint :ing_2we3q4 Honey Glycerin :ing_xr7234:ing_3j6g2i smokes contains contains manufacturer : pro_0wq01h:pro_a60e2b smokes smokes smokes :pro_81fq21 contains e-Cigarette e-Cigarette Mint_Shisha Mango_Flavor_Pod Mint_Flavor_Pod sex age wt sex age wt ingredient Tobacco Smoker: The path between Person and Ingredient :Ingredient :Product :Device:Person :Person 18
  • 20. Tobacco Smoker: Ontology Definition & Query TobaccoSmoker-Infer.rq 20
  • 21. Data Validation Validating Ray’s Demographics data 21
  • 22. SHApes Constraint Language (SHACL) “Person Shape” (Validation Constraints) Person Data Data Has Shape. Validation has Shape. 22
  • 23. SEX AGE M SHACL Person Shape WEIGHT Demographics sex: M age: 43 wt: 165 Validating Ray’s Demographics Ray Gillette Data Entry M/F/U 165 403 > 0 < 800 > 0 < 120 Violation! 23
  • 24. Validating Age with SHACL https://bit.ly/RayDemogAndSHACL View the interactive visualization at: Instance Data SHACL Shape 24
  • 25. Demographics, Constraints, Report https://bit.ly/RaySHACLResult View the interactive visualization at: Instance Data SHACL Shape Validation Result 25
  • 26. Metadata is Part of the Graph Ray_Gillette M 165 sex age wt 2020-08-01 Duke U. Clinic Dr. A. Krieger enteredBy site entryDate :per_g367k1 403:per_452h78 26
  • 27. FAIR Data https://www.go-fair.org/fair-principles/ • Ontologies Mapping https://www.pistoiaalliance.org/projects/current-projects/ontologies-mapping/ • FAIR Implementation Project & Toolkit https://www.pistoiaalliance.org/projects/current-projects/fair-implementation/ Findable Accessible Interoperable Reusable FAIR Data is Linked Data is a Knowledge Graph 27
  • 28. Knowledge Graphs : Now for a cross-industry perspective 28
  • 29. Additional Reading General / Introductory • A Data Engineer’s Guide to Semantic Modeling - Maresi . Free e-book download. • The Data Centric Revolution - McComb Technical • Semantic Web for the Working Ontologist (3rd Ed, 2020) – Hendler, Gandon, Allemang • Demystifying OWL for the Enterprise – Uschold, Ding, Groth • Validating RDF Data – Gayo, Prud’hommeaux, Boneva . Comparison of SHEX and SHACL. • Learning SPARQL - DuCharme . Learn RDF by querying the data. • 3D-force-graph Network graph visualizations in this presentation. https://bit.ly/PamAndRay https://bit.ly/RayDemogAndSHACL https://bit.ly/RaySHACLResult https://bit.ly/PamAndRayTTL Pam and Ray Exposure Data 29
  • 31. The Roofshot / Moonshot Manifesto Concept & Image Attribution: https://rework.withgoogle.com/blog/the-roofshot-manifesto/ Examples 1. Unique Identifiers 2. Validation Rules in SHACL 3. Open Ontology Development for your domain Invent and apply state- of-the-art Roofshot Incremental impacts in production Moonshot • Enterprise and Industry Knowledge Graphs • Across the Data Life Cycle 31
  • 32. Industry Knowledge Graphs Demonstrations Results Data as RDF Enterprise Knowledge Graphs Prototype Industry Standards & Models Terminology and Coding The Stairway to the Stars Manifesto 32 Unique Identifiers for Pharma Validation Rules in SHACL Study Design Study Protocol