Semantics of and for the diversity of life:
Opportunities and perils of trying to
reason on the frontier
Hilmar Lapp
National Evolutionary Synthesis Center
(NESCent)
CSHALS, Boston, February 28, 2014
Regier et al (2010

Parfrey et al (2010, Parfrey & Katz

The
diversity of
life is
stunning

Images: Web Tree of Life (http://tolweb.org)
Extinct biodiversity
is even greater than
extant
Dunn et al (2013) Home life: factors structuring the bacterial diversity found within and between homes. PLoS One 8: e64133
Huttenhower et al. (2012) Structure, function and diversity of the healthy human microbiome. Nature 486: 207–214.
B. Sidlauskas, Oregon State University
Smithsonian Institution, Natural History Museum

• About 1.2-2.1 billion natural history specimens
• > 7,000 natural history collections registered

Losos JB et al. (2013) Evolutionary Biology for the
21st Century. PLoS Biol 11: e1001466.
Currently digitized:
73,270 titles
42,794,095 pages
Comparative features documented
in meticulous detail, in free text
Finding similar information
in free-text is difficult
“lacrymal bone...flat”
Mayden 1989
“lacrimal...small, flat”
Grande and Poyato-Ariza 1999
“lacrimal...triangular”
“first infraorbital (lachrimal) shape...flattened”
“fourth infraorbital...anterior and posterior margins...in parallel”
Royero 1999
Kailola 2004
Zanata and Vari 2005

OMIM query                   #records
“large bone”                 1083
“enlarged bone”              224
“big bones”                  21
“huge bones”                 4
“massive bones”              41
“hyperplastic bones”         12
“hyperplastic bone”          45
“bone hyperplasia”           181
“increased bone growth”      879
Comparative features underlie
knowledge about evolutionary history

Sereno (1999)
O’Leary et al. (2013) The placental mammal ancestor and the post-K-Pg radiation of placentals. Science 339: 662–667

vs.
Dos Reis et al (2014) Neither
phylogenomic nor palaeontological
data support a Palaeogene origin
of placental mammals. Biol Lett 10
Even so, much of
biodiversity is not well described
Content-rich = richness score > 0.4
Only families with > 10 species
Dark areas in the Tree of Life

EOL/BHL Research Sprint
Jessica Oswald, Karen Cranston, Gordon Burleigh, and Cyndy Parr
Our knowledge is conflicting

Open Tree of Life - Phylografter
(Richard Ree, FMNH)
Making biodiversity knowledge part of
Big & Linked Data faces major challenges
• The Linnean system for taxonomy was not
designed for Linked Data.

• Names as identifiers, and track provenance.
• No canonical comprehensive taxonomy.
• Specimens referenced by a combination of
metadata (with unreliable provision &
uniqueness).

• No common resolver to identifiers or URLs.
• Most knowledge and data is in free text using the
expressivity of natural language.
Also huge opportunities for
advancing science

• Organizing and linking data to biodiversity
• Data mining morphology, traits, habitats, ecological interactions, and other descriptive data
Using reasoning to
make linking data by
taxon less perilous
The perils of linking data by taxon
How to render the definition
of taxon names computable?


Tetrapoda (crown group)
Tetrapoda (with digits)
Tetrapoda (stem group)
OBO:VTO_0001465

Definition: “The first
organisms derived from
Sarcopterygians that
possessed digits homologous
with those in Homo sapiens,
and all its descendants.”
(Ahlberg and Clack 1998,
Anderson 2002)

obo:NCBITaxon_32523
Naming everything in the Tree of Life is impractical
Phyloreferencing: A universal
computable coordinate system for life

Tetrapoda

What if this were not just a
signpost but a coordinate
computable against any tree?
Phyloreferencing: A universal
computable coordinate system for life

Requirements:

• Any node, branch, subtree is referenceable
• References are unambiguous
• References are portable
• References are computable
• Adapts easily to new and changing knowledge
Phylogenetic clade
definitions
Form the basis of
Phylogenetic
Nomenclature

http://en.wikipedia.org/wiki/File:Clade_types.svg
We already know how to render the
semantics of names computable
part_of some
forebrain

part_of some
telencephalon

part_of some
telencephalic
ventricle

stroma

choroid plexus
stroma

lateral ventricle
choroid plexus
stroma

• Conjunctive OWL
class expressions

• Necessary and

sufficient conditions
for class membership

Class: lateral_ventricle_choroid_plexus_stroma
EquivalentTo:
uberon:choroid_plexus_stroma
and bfo:part_of some uberon:telencephalic_ventricle
Ontologies for evolutionary
relationships exist
Comparative Data
Analysis Ontology:

• OWL ontology
• Scope: phylogenetic
data and trees

• Rich set of axioms
• Prosdocimi et al (2009)
Evol Bioinform Online:
47–66

Prosdocimi et al (2009)
The Tree of Life as an
ontology

Class: Tetrapoda_Total
EquivalentTo:
cdao:has_Descendant value taxon:Amniota
and phyloref:excludes_lineage value taxon:Dipnoi
The Tree of Life as an
ontology

Class: Tetrapoda_Crown
EquivalentTo:
cdao:has_Descendant value taxon:Amniota
and phyloref:excludes_lineage value taxon:Crassigyrinus
The Tree of Life as an
ontology

Class: Tetrapoda_Digit_hand
EquivalentTo:
cdao:has_Descendant value taxon:Amniota
and phyloref:has_Progenitor some (
bfo:has_part some uberon:'manual digit')
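As a rough illustration of what "computable against any tree" could mean for the Tetrapoda_Total and Tetrapoda_Crown definitions above, the following Python sketch resolves an "includes X, excludes Y" reference against a small tree. The tree topology, the internal node names (n1, n2, n3), and the function names are assumptions made for this example only; an OWL reasoner over CDAO-encoded trees would do this work in the approach described in the slides.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy tree, stored as child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "Sarcopterygii": None,
    "Dipnoi": "Sarcopterygii",
    "n1": "Sarcopterygii",
    "Tiktaalik": "n1",
    "n2": "n1",
    "Crassigyrinus": "n2",
    "n3": "n2",
    "Lissamphibia": "n3",
    "Amniota": "n3",
}


def ancestors(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Proper ancestors of a node, nearest first."""
    out: List[str] = []
    cur = parent[node]
    while cur is not None:
        out.append(cur)
        cur = parent[cur]
    return out


def descendants(node: str, parent: Dict[str, Optional[str]]) -> Set[str]:
    """All nodes that have `node` among their ancestors."""
    return {n for n in parent if node in ancestors(n, parent)}


def resolve(includes: str, excludes: str,
            parent: Dict[str, Optional[str]]) -> List[str]:
    """Nodes that have `includes` as a descendant but not `excludes`.

    Mirrors 'has_Descendant value X and excludes_lineage value Y';
    results are ordered from least to most inclusive.
    """
    return [a for a in ancestors(includes, parent)
            if excludes not in descendants(a, parent)]


TOTAL = resolve("Amniota", "Dipnoi", PARENT)
CROWN = resolve("Amniota", "Crassigyrinus", PARENT)
```

In this toy tree the total-group reference matches three nodes and the crown-group reference matches one; taking the last element of each list gives the most inclusive matching node. Swapping in a different tree changes the answer without changing the reference, which is the portability the requirements ask for.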
Phyloreferences can be
named or anonymous
• Phyloreference expressions can be anonymous, or named and registered
• Semantics is exactly the same
• Naming has benefits: promotes reuse, consistency, usability and accessibility by non-experts
• Ontology of computable clade names

Class: AGF4-SHRU-3560
EquivalentTo:
cdao:has_Descendant value taxon:Amniota
and phyloref:has_Progenitor some (
bfo:has_part some uberon:'manual digit')

vs.
Class: Tetrapoda_Digit_hand
Annotations:
rdfs:label "Tetrapoda with limbs with digits"
dc:description "the first sarcopterygian to have possessed digits homologous with Amniotes"
EquivalentTo:
cdao:has_Descendant value taxon:Amniota
and phyloref:has_Progenitor some (
bfo:has_part some uberon:'manual digit')
Common coordinate systems
also require standards
• Common formally defined language (OWL)
• Common ontologies to encode knowledge and
queries (CDAO, PhyloRef)

• Common identifiers for phyloreference
specifiers

• Phenotype and anatomy ontologies
• Canonical taxonomy for OTU names
• Canonical GUIDs for specimens
Using ontologies &
reasoning to mine our
knowledge of trait diversity
Challenge: What do we
know about the evolution
of morphological
biodiversity?
http://www-news.uchicago.edu/releases/06/060405.tiktaalik.shtml

Clack, J. A. (2009). The Fin to
Limb Transition: New Data,
Interpretations, and Hypotheses
from Paleontology and
Developmental Biology. Annual
Review of Earth and Planetary
Sciences, 37(1), 163-179
Vast stores of data, but all free text
Manual effort of experts is not scalable

Fig. 7, Sereno (2009)

Fig. 6, Sereno (2009)
Schneider et al (2011) Proc Natl Acad
Sci U S A 108: 12782–12786

Challenge: What can we
hypothesize about genes involved in
the evolution of biodiversity?
Model organism & health literature
contains descriptions of gene and
disease phenotypes

Kimmel et al, 2003
Translational
bioinformatics
Fig. 3, Washington et al (2009)

Model organism
->
Human
Fig. 1, Washington et al (2009)
Translational biodiversity
informatics

Fig. 1, Washington et al (2009)

Model organism genes -> Evolutionary diversity
by semantic similarity of phenotypes
Focus: vertebrate
fin/limb transition
Schneider & Shubin NH (2013) Trends Genet 29: 419–426
Computable via shared ontologies,
rich semantics, OWL reasoning
Ontology-annotated descriptions
integrate across studies & fields

+
Comparative studies

Model organism datasets

= Phenoscape Knowledgebase
Phenoscape KB content
• 16,000 character states from >120 comparative morphological datasets, linked to 4,000 vertebrate taxa.
• Imported genetic phenotype and expression data from ZFIN, Xenbase, MGI, and Human Phenotype project.
• Shared semantics: Uberon (anatomy), PATO (phenotypic qualities), Entity–Quality (EQ) OWL axioms (phenotype observations)
• Plus a dozen other ontologies ...
KB enables candidate gene
hypothesis generation
Mutation of eda gene
in Danio:

Ictalurus punctatus:

Harris et al., 2007
Copyright © Jean Ricardo Simões Vitule, All Rights Reserved
Wet lab test

(Work by Richard Edmunds)
Ictalurus punctatus

eda expression is lacking in the epidermis
KB enables candidate gene
hypothesis generation
Clack, J. A. (2009) Annual Review of Earth and Planetary Sciences, 37:163-179

Work on vertebrate fin/limb
transition candidate genes is ongoing
Using reasoning to
synthesize data implied
by what we know
Elbow joint

What do we know about
the evolution of
morphological
biodiversity?
Humeral head in human
Shubin et al (2006) Nature 440: 764–771

Human shoulder joint

Shubin et al (2006) Nature 440: 764–771
Do we really not know more?

Presence /
absence of
digits

Author-asserted:
74 present
25 absent
11% of taxa
Character synthesis by inference
• Most phenotypic descriptions of some feature of a structure imply its presence or absence:
  • “Humerus slender and elongate: with length more than three times the diameter of its distal end” → humerus must be present
• Partonomy axioms in the ontology allow inferring presence or absence:
  • ‘all humerus part_of some forelimb’ → forelimb must be present if humerus is; humerus must be absent if forelimb is
• Developmental origin axioms allow inferring presence or absence:
  • ‘all limb develops_from some limb bud’ → limb bud must be present if limb is; limb must be absent if limb bud is
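The inference rules above can be sketched in a few lines of Python. This is only an illustration of the logic, not the Phenoscape implementation, which relies on an OWL reasoner. The `REQUIRES` table encodes the two axioms quoted above plus one extra link (forelimb requires limb) that is an assumption added here so the chain connects; the function name is likewise hypothetical.

```python
from typing import Dict, List, Set, Tuple

# structure -> structures it cannot exist without.
# 'all X part_of some Y' and 'all X develops_from some Y' both mean:
# X present implies Y present, and Y absent implies X absent.
REQUIRES: Dict[str, List[str]] = {
    "humerus": ["forelimb"],   # all humerus part_of some forelimb
    "forelimb": ["limb"],      # assumption for this sketch: a forelimb is a limb
    "limb": ["limb bud"],      # all limb develops_from some limb bud
}


def infer_presence_absence(
    asserted_present: Set[str], asserted_absent: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Propagate presence upward and absence downward until nothing changes."""
    present = set(asserted_present)
    absent = set(asserted_absent)
    changed = True
    while changed:
        changed = False
        for dependent, required in REQUIRES.items():
            for needed in required:
                if dependent in present and needed not in present:
                    present.add(needed)
                    changed = True
                if needed in absent and dependent not in absent:
                    absent.add(dependent)
                    changed = True
    return present, absent


# A description of humerus shape implies the humerus is present.
PRESENT, _ = infer_presence_absence({"humerus"}, set())
# An asserted absence of the limb bud rules out everything that needs it.
_, ABSENT = infer_presence_absence(set(), {"limb bud"})
```

Any structure that ends up in both returned sets for the same taxon would indicate conflicting source statements.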
A reasoner can fill in lots of gaps

Presence /
absence of
digits

Asserted & inferred:
645 inferred present
74 asserted present
25 asserted absent
82% of taxa
A reasoner can fill in lots of gaps
asserted presence/absence

with inference

Mesquite “birds-eye view”
Even knowing only presence/
absence of traits can be powerful
Synthesis highlights conflict and
gaps
• Conflicting interpretations in studies
  • supinator process of humerus: both absent & present in Strepsodus (Zhu et al. 1999 vs. Ruta 2011)
• Gaps in knowledge
  • acetabulum present or absent?
• Same term, different meaning?
  • Acanthostega: “radials, jointed” (Swartz 2012)
  • but doesn’t have radials...
• Uneven taxon sampling
figure from Parker et al., 2005
Acetabulum of pelvic girdle: present/absent
http://characterdesignnotes.blogspot.com/2011/04/proper-use-of-reference-andanatomy-in.html
How to reason and
query at the scale of
biodiversity?
Rich axioms allow rich inferences,
but make scaling a challenge
• Anatomy ontologies and EQ annotation employ rich OWL semantics → requires DL reasoner
• Classifying and querying over large dataset (~25 million RDF triples) does not scale well
• Presently, the only feasible OWL reasoner is ELK
  • constrained to OWL EL profile → limits kinds of expressions that can be used
  • best performance over class axioms only → data must be modeled so as to avoid need for classifying instances
Phenoscape KB build pipeline (diagram)
Inputs
• MOD curators → MODs: phenotype & gene expression annotations, gene identifiers (tab-delimited)
• Phenoscape curators → NeXML matrices
• Ontology curators → ontologies
kb-owl-tools
• OWL conversion: includes translation of EQ to OWL expressions
• Identifier cleanup: URIs for standard properties across ontologies are a mess
• Axiom generation: "Absence" classes for OWL EL negation classification workaround; SPARQL facilitation (e.g. materialized existential hierarchies such as part_of)
ELK (OWL tbox axioms)
• ELK reasoner using extracted tbox axioms only (not feasible with individuals included)
• Materialize inferred subclass axioms
• Assertion of absence hierarchy based on inverse of hierarchy of negated classes computed by ELK
Outputs
• RDF (all) → Bigdata triplestore → Owlet SPARQL endpoint → SPARQL queries → applications
Querying a triple store with
complex OWL expressions
• Want to allow arbitrary selection of structures of interest, using rich semantics:
  (part_of some (limb/fin or girdle skeleton)) or (connected_to some girdle skeleton)
• RDF triplestores provide very limited reasoning expressivity, and scale poorly with large ontologies.
• However, ELK can answer class expression queries within seconds.
What if instead of this (*):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?gene
WHERE
{
  ?gene ao:expressed_in ?structure .
  ?structure rdf:type ?structure_class .
  # Triple pattern selecting structure:
  ?structure_class rdfs:subClassOf ao:muscle .
  ?structure_class rdfs:subClassOf ?restriction .
  ?restriction owl:onProperty ao:part_of .
  ?restriction owl:someValuesFrom ao:head .
}

we could do this:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
PREFIX ow: <http://purl.org/phenoscape/owlet/syntax#>
SELECT DISTINCT ?gene
WHERE
{
  ?gene ao:expressed_in ?structure .
  ?structure rdf:type ?structure_class .
  # Triple pattern containing an OWL expression:
  ?structure_class rdfs:subClassOf "ao:muscle and (ao:part_of some ao:head)"^^ow:omn .
}
owlet: SPARQL query expansion
with in-memory OWL reasoner
• owlet interprets OWL class expressions
embedded within SPARQL queries

• Uses any OWL API-based reasoner to preprocess
query.

• We use ELK that holds terminology in memory.
• Replaces OWL expression with FILTER statement
listing matching terms

• http://github.com/phenoscape/owlet
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
PREFIX ow: <http://purl.org/phenoscape/owlet/syntax#>
SELECT DISTINCT ?gene
WHERE
{
  ?gene ao:expressed_in ?structure .
  ?structure rdf:type ?structure_class .
  # Triple pattern containing an OWL expression:
  ?structure_class rdfs:subClassOf "ao:muscle and (ao:part_of some ao:head)"^^ow:omn .
}

➡ owlet ➡

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
PREFIX ow: <http://purl.org/phenoscape/owlet/syntax#>
SELECT DISTINCT ?gene
WHERE
{
  ?gene ao:expressed_in ?structure .
  ?structure rdf:type ?structure_class .
  # Filter constraining ?structure_class to the terms returned by the OWL query:
  FILTER(?structure_class IN (ao:adductor_mandibulae, ao:constrictor_dorsalis, ...))
}
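The rewriting step shown above can be approximated in Python as a plain text substitution. This is a minimal sketch under stated assumptions, not owlet's code: the regular expression, the `expand` function, and the `toy_subclasses` lookup (a canned answer standing in for an ELK query) are all hypothetical names introduced here.

```python
import re
from typing import Callable, List

# Matches a triple pattern whose object is a Manchester-syntax literal typed ow:omn,
# e.g.  ?c rdfs:subClassOf "ao:muscle and (ao:part_of some ao:head)"^^ow:omn .
OMN_PATTERN = re.compile(r'(\?\w+)\s+rdfs:subClassOf\s+"([^"]+)"\^\^ow:omn\s*\.')


def toy_subclasses(expression: str) -> List[str]:
    """Hypothetical stand-in for a reasoner query; a real setup would ask ELK."""
    canned = {
        "ao:muscle and (ao:part_of some ao:head)": [
            "ao:adductor_mandibulae",
            "ao:constrictor_dorsalis",
        ],
    }
    return canned[expression]


def expand(query: str, subclasses_of: Callable[[str], List[str]]) -> str:
    """Replace each embedded OWL expression with a FILTER over the matching terms."""

    def to_filter(match: "re.Match[str]") -> str:
        variable = match.group(1)
        expression = " ".join(match.group(2).split())  # normalize whitespace
        terms = ", ".join(subclasses_of(expression))
        return f"FILTER({variable} IN ({terms}))"

    return OMN_PATTERN.sub(to_filter, query)


PATTERN_LINE = (
    '?structure_class rdfs:subClassOf '
    '"ao:muscle and (ao:part_of some ao:head)"^^ow:omn .'
)
EXPANDED = expand(PATTERN_LINE, toy_subclasses)
```

Passing the reasoner lookup in as a function keeps the rewriting independent of which reasoner answers the class expression, in the spirit of the "any OWL API-based reasoner" point above. The triplestore then only has to evaluate a list membership test.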
How to annotate
descriptive biodiversity
data at scale?
Curation
Dahdul et al., 2010 PLoS ONE
1. Students: gather publications (scan hard copies, produce OCR PDFs)
2. Students: Manual entry of free text character descriptions, matrix, taxon list, specimens and museum numbers using Phenex
3. Character annotation by experts: Entry of phenotypes and homology assertions using Phenex
4. Build pipeline for the Phenoscape KB

139 publications
8,073 characters
18,811 states
for 4,397 taxa
~6.7 FTE years
CharaParser: Using NLP to
scale annotation

Cui H (2012) CharaParser
for fine-grained
semantic annotation of
organism morphological
descriptions.
JASIST 63: 738–754
Most of the effort may actually be data digitization, ontology building, etc.
• Actual data annotation: 6%
• Preparatory overhead: 94%
Building semantically rich
community ontologies is hard
222 terms
790 terms
Uberon: cross-species
anatomy ontology
Taxonomy ontology needs
to be synthesized
Vertebrate Taxonomy
Ontology (VTO)

Midford et al., in press JBMS
How to evaluate
semantic similarity
metrics meaningfully?
Many
different
metrics

Pesquita et al (2009) Semantic
similarity in biomedical ontologies.
PLoS Comput Biol 5: e1000443
What if there is no
“gold standard”?
Evaluate by tendency to
maximize inter-curator similarity
Curator 1

Curator 2

Curator 1

Curator 2

• Can also be used to assess payoff from
annotation granularity
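One way to make this evaluation concrete is to score the same curator-pair annotations under several metrics and compare the averages. The Python sketch below does that for two simple metrics. The toy is_a hierarchy, the annotation pairs, and all function names are invented for illustration and are not taken from the Phenoscape data or from the metrics surveyed by Pesquita et al.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical toy is_a hierarchy: term -> direct parents.
PARENTS: Dict[str, List[str]] = {
    "appendage": [],
    "fin": ["appendage"],
    "limb": ["appendage"],
    "pectoral fin": ["fin"],
    "pelvic fin": ["fin"],
}


def ancestors_or_self(term: str) -> Set[str]:
    """The term together with everything above it in the hierarchy."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def exact_match(a: str, b: str) -> float:
    """1.0 only when both curators chose the identical term."""
    return 1.0 if a == b else 0.0


def jaccard(a: str, b: str) -> float:
    """Overlap of the two terms' ancestor sets."""
    sa, sb = ancestors_or_self(a), ancestors_or_self(b)
    return len(sa & sb) / len(sa | sb)


def mean_similarity(metric: Callable[[str, str], float],
                    pairs: List[Tuple[str, str]]) -> float:
    """Average similarity between curator 1 and curator 2 on the same characters."""
    return sum(metric(a, b) for a, b in pairs) / len(pairs)


# (curator 1 term, curator 2 term) for three characters; invented data.
PAIRS = [
    ("pectoral fin", "pectoral fin"),
    ("pectoral fin", "fin"),
    ("pelvic fin", "limb"),
]

EXACT_SCORE = mean_similarity(exact_match, PAIRS)
JACCARD_SCORE = mean_similarity(jaccard, PAIRS)
```

A higher average alone is not enough, since a metric that always returns 1.0 would win trivially. A fuller evaluation would presumably also score mismatched character pairs and prefer the metric that best separates the two groups. Rerunning the same comparison with coarser or finer annotations is one way to probe the granularity payoff mentioned above.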
Summary: Big obstacles, but also big
opportunities for semantics-driven discovery

• Vast stores of published knowledge and
data waiting to be exploited

• Countless possibilities for knowledge
discovery by connecting data

• Scaling out expressive reasoning is hard
• Workforce training is a serious issue
Phenoscape project team
National Evolutionary Synthesis Center (NESCent): Todd Vision (also University of North Carolina at Chapel Hill), Hilmar Lapp, Jim Balhoff, Prashanti Manda
University of South Dakota: Paula Mabee, Wasila Dahdul, Alex Dececchi
California Academy of Sciences: David Blackburn
University of Chicago: Paul Sereno, Nizar Ibrahim
University of Oregon (Zebrafish Information Network): Monte Westerfield, Ceri Van Slyke, Yvonne Bradford
Cincinnati Children's Hospital (Xenbase): Aaron Zorn, Christina James-Zorn, Virgilio Ponferrada
Mouse Genome Informatics: Judith Blake, Terry Hayamizu
University of Arizona: Hong Cui
Oregon Health & Science University: Melissa Haendel
Lawrence Berkeley National Labs: Chris Mungall
Acknowledgements
• Phenoscape personnel, PIs, curators & workshop participants
• National Evolutionary Synthesis Center (NESCent)
• NSF (DBI-1062404, DBI-1062542)
• Nico Cellinese (U. Florida): Phyloreferencing
• Karen Cranston (Open Tree of Life)
• Cynthia Parr (EOL)
• William Ulate (BHL)

From baleen to cleft palate: an ontological exploration of evolution and dis...
 
Molecular Systematics and Biodiversity
Molecular Systematics and BiodiversityMolecular Systematics and Biodiversity
Molecular Systematics and Biodiversity
 
Abstract A Critique Of Pure Folly
Abstract  A Critique Of Pure FollyAbstract  A Critique Of Pure Folly
Abstract A Critique Of Pure Folly
 
CRISPR PROJECT.pptx
CRISPR PROJECT.pptxCRISPR PROJECT.pptx
CRISPR PROJECT.pptx
 
Computing on the shoulders of giants
Computing on the shoulders of giantsComputing on the shoulders of giants
Computing on the shoulders of giants
 
Science Seminar Series 4 Norman Johnson
Science Seminar Series 4 Norman JohnsonScience Seminar Series 4 Norman Johnson
Science Seminar Series 4 Norman Johnson
 
Comparing the Amount and Quality of Information from Different Sequencing Str...
Comparing the Amount and Quality of Information from Different Sequencing Str...Comparing the Amount and Quality of Information from Different Sequencing Str...
Comparing the Amount and Quality of Information from Different Sequencing Str...
 
Shorthouse
ShorthouseShorthouse
Shorthouse
 
Bioinformatica 24-11-2011-t6-phylogenetics
Bioinformatica 24-11-2011-t6-phylogeneticsBioinformatica 24-11-2011-t6-phylogenetics
Bioinformatica 24-11-2011-t6-phylogenetics
 
Chibucos annot go_final
Chibucos annot go_finalChibucos annot go_final
Chibucos annot go_final
 
Rapid Impact Assessment of Climatic and Physio-graphic Changes on Flagship G...
Rapid Impact Assessment of Climatic and Physio-graphic Changes  on Flagship G...Rapid Impact Assessment of Climatic and Physio-graphic Changes  on Flagship G...
Rapid Impact Assessment of Climatic and Physio-graphic Changes on Flagship G...
 
Encyclopedia of Life: Use cases for phenotypes
Encyclopedia of Life: Use cases for phenotypesEncyclopedia of Life: Use cases for phenotypes
Encyclopedia of Life: Use cases for phenotypes
 
Meghyn slides-hse-2014
Meghyn slides-hse-2014Meghyn slides-hse-2014
Meghyn slides-hse-2014
 
PENSOFT ARTICLE COLLECTION ABOUT MYANMAR
PENSOFT ARTICLE COLLECTION ABOUT MYANMARPENSOFT ARTICLE COLLECTION ABOUT MYANMAR
PENSOFT ARTICLE COLLECTION ABOUT MYANMAR
 
Global Names Architecture - Remsen
Global Names Architecture - RemsenGlobal Names Architecture - Remsen
Global Names Architecture - Remsen
 
Why Life is Difficult, and What We MIght Do About It
Why Life is Difficult, and What We MIght Do About ItWhy Life is Difficult, and What We MIght Do About It
Why Life is Difficult, and What We MIght Do About It
 

More from Hilmar Lapp

Of Trees and Owl: 
The challenges of reasoning over the semantics of shared d...
Of Trees and Owl: 
The challenges of reasoning over the semantics of shared d...Of Trees and Owl: 
The challenges of reasoning over the semantics of shared d...
Of Trees and Owl: 
The challenges of reasoning over the semantics of shared d...Hilmar Lapp
 
Integrating data with phylogenies, at scale
Integrating data with phylogenies, at scaleIntegrating data with phylogenies, at scale
Integrating data with phylogenies, at scaleHilmar Lapp
 
Rphenoscape: 
Connecting the semantics of evolutionary morphology to comparat...
Rphenoscape: 
Connecting the semantics of evolutionary morphology to comparat...Rphenoscape: 
Connecting the semantics of evolutionary morphology to comparat...
Rphenoscape: 
Connecting the semantics of evolutionary morphology to comparat...Hilmar Lapp
 
Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...
Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...
Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...Hilmar Lapp
 
Open Bioinformatics Foundation: 2014 Update & Some Introspection
Open Bioinformatics Foundation: 2014 Update & Some IntrospectionOpen Bioinformatics Foundation: 2014 Update & Some Introspection
Open Bioinformatics Foundation: 2014 Update & Some IntrospectionHilmar Lapp
 
Reproducible Science - Panel at iEvoBio 2014
Reproducible Science - Panel at iEvoBio 2014 Reproducible Science - Panel at iEvoBio 2014
Reproducible Science - Panel at iEvoBio 2014 Hilmar Lapp
 
The Dryad Digital Repository: Published data as part of the greater data ecos...
The Dryad Digital Repository: Published data as part of the greater data ecos...The Dryad Digital Repository: Published data as part of the greater data ecos...
The Dryad Digital Repository: Published data as part of the greater data ecos...Hilmar Lapp
 
PhyloCommons: Sharing, annotating, and reusing Phylogenies
PhyloCommons: Sharing, annotating, and reusing PhylogeniesPhyloCommons: Sharing, annotating, and reusing Phylogenies
PhyloCommons: Sharing, annotating, and reusing PhylogeniesHilmar Lapp
 
OBF Address at BOSC 2013
OBF Address at BOSC 2013OBF Address at BOSC 2013
OBF Address at BOSC 2013Hilmar Lapp
 
The blessing and the curse: handshaking between general and specialist data r...
The blessing and the curse: handshaking between general and specialist data r...The blessing and the curse: handshaking between general and specialist data r...
The blessing and the curse: handshaking between general and specialist data r...Hilmar Lapp
 
Liberating Our Beautiful Trees: A Call to Arms.
Liberating Our Beautiful Trees: A Call to Arms.Liberating Our Beautiful Trees: A Call to Arms.
Liberating Our Beautiful Trees: A Call to Arms.Hilmar Lapp
 
Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database
Towards a Simple, Standards-Compliant, and Generic Phylogenetic DatabaseTowards a Simple, Standards-Compliant, and Generic Phylogenetic Database
Towards a Simple, Standards-Compliant, and Generic Phylogenetic DatabaseHilmar Lapp
 
Lapp, ISCB Software Sharing Symposium
Lapp, ISCB Software Sharing SymposiumLapp, ISCB Software Sharing Symposium
Lapp, ISCB Software Sharing SymposiumHilmar Lapp
 
BioSQL Reloaded: v1.0 Release, PhyloDB Module, and Future Features
BioSQL Reloaded: v1.0 Release, PhyloDB Module, and Future FeaturesBioSQL Reloaded: v1.0 Release, PhyloDB Module, and Future Features
BioSQL Reloaded: v1.0 Release, PhyloDB Module, and Future FeaturesHilmar Lapp
 

More from Hilmar Lapp (14)

Of Trees and Owl: 
The challenges of reasoning over the semantics of shared d...
Of Trees and Owl: 
The challenges of reasoning over the semantics of shared d...Of Trees and Owl: 
The challenges of reasoning over the semantics of shared d...
Of Trees and Owl: 
The challenges of reasoning over the semantics of shared d...
 
Integrating data with phylogenies, at scale
Integrating data with phylogenies, at scaleIntegrating data with phylogenies, at scale
Integrating data with phylogenies, at scale
 
Rphenoscape: 
Connecting the semantics of evolutionary morphology to comparat...
Rphenoscape: 
Connecting the semantics of evolutionary morphology to comparat...Rphenoscape: 
Connecting the semantics of evolutionary morphology to comparat...
Rphenoscape: 
Connecting the semantics of evolutionary morphology to comparat...
 
Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...
Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...
Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...
 
Open Bioinformatics Foundation: 2014 Update & Some Introspection
Open Bioinformatics Foundation: 2014 Update & Some IntrospectionOpen Bioinformatics Foundation: 2014 Update & Some Introspection
Open Bioinformatics Foundation: 2014 Update & Some Introspection
 
Reproducible Science - Panel at iEvoBio 2014
Reproducible Science - Panel at iEvoBio 2014 Reproducible Science - Panel at iEvoBio 2014
Reproducible Science - Panel at iEvoBio 2014
 
The Dryad Digital Repository: Published data as part of the greater data ecos...
The Dryad Digital Repository: Published data as part of the greater data ecos...The Dryad Digital Repository: Published data as part of the greater data ecos...
The Dryad Digital Repository: Published data as part of the greater data ecos...
 
PhyloCommons: Sharing, annotating, and reusing Phylogenies
PhyloCommons: Sharing, annotating, and reusing PhylogeniesPhyloCommons: Sharing, annotating, and reusing Phylogenies
PhyloCommons: Sharing, annotating, and reusing Phylogenies
 
OBF Address at BOSC 2013
OBF Address at BOSC 2013OBF Address at BOSC 2013
OBF Address at BOSC 2013
 
The blessing and the curse: handshaking between general and specialist data r...
The blessing and the curse: handshaking between general and specialist data r...The blessing and the curse: handshaking between general and specialist data r...
The blessing and the curse: handshaking between general and specialist data r...
 
Liberating Our Beautiful Trees: A Call to Arms.
Liberating Our Beautiful Trees: A Call to Arms.Liberating Our Beautiful Trees: A Call to Arms.
Liberating Our Beautiful Trees: A Call to Arms.
 
Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database
Towards a Simple, Standards-Compliant, and Generic Phylogenetic DatabaseTowards a Simple, Standards-Compliant, and Generic Phylogenetic Database
Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database
 
Lapp, ISCB Software Sharing Symposium
Lapp, ISCB Software Sharing SymposiumLapp, ISCB Software Sharing Symposium
Lapp, ISCB Software Sharing Symposium
 
BioSQL Reloaded: v1.0 Release, PhyloDB Module, and Future Features
BioSQL Reloaded: v1.0 Release, PhyloDB Module, and Future FeaturesBioSQL Reloaded: v1.0 Release, PhyloDB Module, and Future Features
BioSQL Reloaded: v1.0 Release, PhyloDB Module, and Future Features
 

Recently uploaded

Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 

Recently uploaded (20)

Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 

The diversity of life and opportunities for reasoning about evolution

  • 1. Semantics of and for the diversity of life: Opportunities and perils of trying to reason on the frontier Hilmar Lapp National Evolutionary Synthesis Center (NESCent) CSHALS, Boston, February 28, 2014
  • 2. Regier et al (2010 Parfrey et al (2010, Parfrey & Katz The diversity of life is stunning Images: Web Tree of Life (http://tolweb.org)
  • 3. Extinct biodiversity is even greater than extant
  • 4. Dunn et al (2013) Home life: factors structuring the bacterial diversity found within and between homes. PLoS One 8: e64133. Huttenhower et al. (2012) Structure, function and diversity of the healthy human microbiome. Nature 486: 207–214.
  • 5.
  • 6. B. Sidlauskas, Oregon State University
  • 7. Smithsonian Institution, Natural History Museum • About 1.2-2.1 billion natural history specimens • > 7,000 natural history collections registered Losos JB et al. (2013) Evolutionary Biology for the 21st Century. PLoS Biol 11: e1001466.
  • 9.
  • 10. Comparative features documented in meticulous detail, in free text
  • 11. Finding similar information in free-text is difficult
      “lacrymal bone...flat” (Mayden 1989)
      “lacrimal...small, flat” (Grande and Poyato-Ariza 1999)
      “lacrimal...triangular” (Royero 1999)
      “first infraorbital (lachrimal) shape...flattened” (Kailola 2004)
      “fourth infraorbital...anterior and posterior margins...in parallel” (Zanata and Vari 2005)
      OMIM query: #records
      “large bone”: 1083
      “enlarged bone”: 224
      “big bones”: 21
      “huge bones”: 4
      “massive bones”: 41
      “hyperplastic bones”: 12
      “hyperplastic bone”: 45
      “bone hyperplasia”: 181
      “increased bone growth”: 879
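The spelling variants on this slide show why string search misses equivalent descriptions. A minimal sketch of the alternative, mapping variants to one shared term, might look like the following. The synonym table and the identifier `EX:lacrimal_bone` are placeholders invented for illustration, not real ontology IDs, and this is not the annotation method used by the author.

```python
import re

# Hypothetical synonym table: every spelling variant from the slide maps to
# one placeholder identifier ("EX:lacrimal_bone" is NOT a real ontology ID).
SYNONYMS = {
    "lacrimal": "EX:lacrimal_bone",
    "lacrymal": "EX:lacrimal_bone",
    "lachrimal": "EX:lacrimal_bone",
    "first infraorbital": "EX:lacrimal_bone",
}


def annotate(text):
    """Return the set of term identifiers whose synonyms occur in the text."""
    lowered = text.lower()
    found = set()
    for synonym, term_id in SYNONYMS.items():
        # Whole-word match so that unrelated words are not picked up.
        if re.search(r"\b" + re.escape(synonym) + r"\b", lowered):
            found.add(term_id)
    return found


if __name__ == "__main__":
    for quote in ["lacrymal bone...flat",
                  "first infraorbital (lachrimal) shape...flattened",
                  "fourth infraorbital...anterior and posterior margins"]:
        print(quote, "->", annotate(quote))
```

Once descriptions carry the same identifier, a query for that identifier finds all of them regardless of which spelling the original author chose.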
  • 12. Comparative features underlie knowledge about evolutionary history Sereno (1999)
  • 13. O’Leary et al. (2013) The placental mammal ancestor and the post-K-Pg radiation of placentals. Science 339: 662–667 vs. Dos Reis et al (2014) Neither phylogenomic nor palaeontological data support a Palaeogene origin of placental mammals. Biol Lett 10
  • 14. Even so, much of biodiversity is not well described. Content-rich = richness score > 0.4. Only families with > 10 species.
  • 15. Dark areas in the Tree of Life EOL/BHL Research Sprint Jessica Oswald, Karen Cranston, Gordon Burleigh, and Cyndy Parr
  • 16. Our knowledge is conflicting Open Tree of Life - Phylografter (Richard Ree, FMNH)
  • 17. Making biodiversity knowledge part of Big & Linked Data faces major challenges • The Linnean system for taxonomy was not designed for Linked Data. • Names serve as identifiers and also track provenance. • No canonical comprehensive taxonomy. • Specimens referenced by a combination of metadata (with unreliable provision & uniqueness). • No common resolver for identifiers or URLs. • Most knowledge and data are in free text using the expressivity of natural language.
  • 18. Also huge opportunities for advancing science • Organizing and linking data to biodiversity • Data mining morphology, traits, habitats, ecological interactions, and other descriptive data
  • 19. Using reasoning to make linking data by taxon less perilous
  • 20. The perils of linking data by taxon
  • 21. How to render the definition of taxon names computable? [Figure: tree with nodes numbered 1, 2, 3; candidate definitions shown: Tetrapoda (crown group), Tetrapoda (with digits), Tetrapoda (stem group)]
  • 22. OBO:VTO_0001465 Definition: “The first organisms derived from Sarcopterygians that possessed digits homologous with those in Homo sapiens, and all its descendants.” (Ahlberg and Clack 1998, Anderson 2002) obo:NCBITaxon_32523
  • 23. Naming everything in the Tree of Life is impractical
  • 24. Phyloreferencing: A universal computable coordinate system for life Tetrapoda What if this were not just a signpost but a coordinate computable against any tree?
  • 25. Phyloreferencing: A universal computable coordinate system for life Requirements: • Any node, branch, subtree is referenceable • References are unambiguous • References are portable • References are computable • Adapts easily to new and changing knowledge
  • 26. Phylogenetic clade definitions Form the basis of Phylogenetic Nomenclature http://en.wikipedia.org/wiki/File:Clade_types.svg
  • 27. We already know how to render the semantics of names computable • Conjunctive OWL class expressions • Necessary and sufficient conditions for class membership [Figure labels: stroma; choroid plexus stroma; lateral ventricle choroid plexus stroma; part_of some forebrain; part_of some telencephalon; part_of some telencephalic ventricle] Class: lateral_ventricle_choroid_plexus_stroma EquivalentTo: uberon:choroid_plexus_stroma and bfo:part_of some uberon:telencephalic_ventricle
  • 28. Ontologies for evolutionary relationships exist Comparative Data Analysis Ontology: • OWL ontology • Scope: phylogenetic data and trees • Rich set of axioms • Prosdocimi et al (2009) Evol Bioinform Online: 47–66
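To make "necessary and sufficient conditions" concrete, the equivalence axiom on this slide can be read as a membership test: an individual belongs to the class exactly when both conjuncts hold. The sketch below is a hand-rolled illustration under simplifying assumptions (no subclass reasoning, types stored as plain strings, a hypothetical `Individual` record); a real OWL reasoner does far more.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    """Hypothetical record: asserted types plus the wholes it is part_of."""
    types: set
    part_of: list = field(default_factory=list)


def is_lateral_ventricle_choroid_plexus_stroma(ind):
    """Both conjuncts of the EquivalentTo expression must hold:
    choroid_plexus_stroma AND part_of some telencephalic_ventricle."""
    is_stroma = "choroid_plexus_stroma" in ind.types
    in_ventricle = any("telencephalic_ventricle" in whole.types
                       for whole in ind.part_of)
    return is_stroma and in_ventricle


if __name__ == "__main__":
    ventricle = Individual(types={"telencephalic_ventricle"})
    stroma = Individual(types={"choroid_plexus_stroma"}, part_of=[ventricle])
    print(is_lateral_ventricle_choroid_plexus_stroma(stroma))  # membership is computed, not asserted
```

Because the definition is an equivalence, the test works in both directions: anything satisfying the conditions is classified into the class, and anything asserted to be in the class must satisfy them.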
  • 29. The Tree of Life as an ontology 1 2 3 Class: Tetrapoda_Total EquivalentTo: cdao:has_Descendant value taxon:Amniota and phyloref:excludes_lineage value taxon:Dipnoi
  • 30. The Tree of Life as an ontology 1 2 3 Class: Tetrapoda_Crown EquivalentTo: cdao:has_Descendant value taxon:Amniota and phyloref:excludes_lineage value taxon:Crassigyrinus
  • 31. The Tree of Life as an ontology 1 2 3 Class: Tetrapoda_Digit_hand EquivalentTo: cdao:has_Descendant value taxon:Amniota and phyloref:has_Progenitor some ( bfo:has_part some uberon:'manual digit')
  • 32. Phyloreferences can be named or anonymous • Phyloreference expressions can be anonymous, or named and registered • Semantics is exactly the same • Naming has benefits: promotes reuse, consistency, usability and accessibility by non-experts • Ontology of computable clade names Class: AGF4-SHRU-3560 EquivalentTo: cdao:has_Descendant value taxon:Amniota and phyloref:has_Progenitor some ( bfo:has_part some uberon:'manual digit') vs. Class: Tetrapoda_Digit_hand Annotations: rdfs:label "Tetrapoda with limbs with digits" dc:description "the first sarcopterygian to have possessed digits homologous with Amniotes" EquivalentTo: cdao:has_Descendant value taxon:Amniota and phyloref:has_Progenitor some ( bfo:has_part some uberon:'manual digit')
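The class expressions on slides 29 and 30 can be read as queries against any tree: find the nodes that have Amniota as a descendant but do not lie on the lineage leading to the excluded taxon. A minimal sketch of that reading follows. The toy topology, the internal node names `n1` and `n2`, the extra tip `Amphibia`, and the interpretation of `excludes_lineage` as "is not an ancestor of" are all assumptions made for illustration, not taken from the slides or from the PhyloRef ontology.

```python
# Toy tree as a child -> parent map. Topology and the node names "root",
# "n1", "n2" and "Amphibia" are illustrative assumptions only.
PARENT = {
    "Dipnoi": "root",
    "n1": "root",
    "Crassigyrinus": "n1",
    "n2": "n1",
    "Amphibia": "n2",
    "Amniota": "n2",
}


def ancestors(node):
    """Return all ancestors of a node, nearest first."""
    result = []
    while node in PARENT:
        node = PARENT[node]
        result.append(node)
    return result


def resolve(has_descendant, excludes_lineage):
    """Nodes that are ancestors of `has_descendant` but not of `excludes_lineage`."""
    excluded = set(ancestors(excludes_lineage))
    return [n for n in ancestors(has_descendant) if n not in excluded]


if __name__ == "__main__":
    # Analogue of Tetrapoda_Total: has Amniota, excludes Dipnoi.
    print(resolve("Amniota", "Dipnoi"))
    # Analogue of Tetrapoda_Crown: has Amniota, excludes Crassigyrinus.
    print(resolve("Amniota", "Crassigyrinus"))
```

The same two specifiers resolve to different nodes if the tree changes, which is the sense in which such a reference is portable across trees rather than tied to one of them.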
  • 33. Common coordinate systems also require standards • Common formally defined language (OWL) • Common ontologies to encode knowledge and queries (CDAO, PhyloRef) • Common identifiers for phyloreference specifiers • Phenotype and anatomy ontologies • Canonical taxonomy for OTU names • Canonical GUIDs for specimens
  • 34. Using ontologies & reasoning to mine our knowledge of trait diversity
  • 35. Challenge: What do we know about the evolution of morphological biodiversity? http://www-news.uchicago.edu/releases/06/060405.tiktaalik.shtml Clack, J. A. (2009). The Fin to Limb Transition: New Data, Interpretations, and Hypotheses from Paleontology and Developmental Biology. Annual Review of Earth and Planetary Sciences, 37(1), 163-179
  • 36. Vast stores of data, but all free text
  • 37. Manual effort of experts is not scalable Fig. 7, Sereno (2009) Fig. 6, Sereno (2009)
  • 38. Schneider et al (2011) Proc Natl Acad Sci U S A 108: 12782–12786 Challenge: What can we hypothesize about genes involved in the evolution of biodiversity?
  • 39. Model organism & health literature contains descriptions of gene and disease phenotypes Kimmel et al, 2003
  • 40. Translational bioinformatics Fig. 3, Washington et al (2009) Model organism -> Human Fig. 1, Washington et al (2009)
  • 41. Translational biodiversity informatics Fig. 1, Washington et al (2009) Model organism genes -> Evolutionary diversity by semantic similarity of phenotypes
  • 42. Focus: vertebrate fin/limb transition Schneider & Shubin (2013) Trends Genet 29: 419–426
  • 43.
  • 44. Computable via shared ontologies, rich semantics, OWL reasoning
  • 45. Ontology-annotated descriptions integrate across studies & fields: Comparative studies + Model organism datasets = Phenoscape Knowledgebase
  • 46. Phenoscape KB content • 16,000 character states from >120 comparative morphological datasets, linked to 4,000 vertebrate taxa. • Imported genetic phenotype and expression data from ZFIN, Xenbase, MGI, and Human Phenotype project. • Shared semantics: Uberon (anatomy), PATO (phenotypic qualities), Entity–Quality (EQ) OWL axioms (phenotype observations) • Plus a dozen other ontologies ...
  • 47. KB enables candidate gene hypothesis generation Mutation of eda gene in Danio: Ictalurus punctatus: Harris et al., 2007 Copyright  ©  Jean  Ricardo  Simões  Vitule,  All  Rights  Reserved
  • 48. Wet lab test (Work by Richard Edmunds) Ictalurus punctatus eda expression is lacking in the epidermis
  • 49. KB enables candidate gene hypothesis generation
  • 50. KB enables candidate gene hypothesis generation
  • 51. Clack, J. A. (2009) Annual Review of Earth and Planetary Sciences, 37:163-179 Work on vertebrate fin/limb transition candidate genes is ongoing
  • 52. Using reasoning to synthesize data implied by what we know
  • 53. What do we know about the evolution of morphological biodiversity? [Figure labels: Elbow joint; Humeral head in human; Human shoulder joint] Shubin et al (2006) Nature 440: 764–771
  • 54. Do we really not know more? Presence / absence of digits. Author-asserted: 74 present, 25 absent; 11% of taxa
  • 55. Character synthesis by inference • Most phenotypic descriptions of some feature of a structure imply its presence or absence: • “Humerus slender and elongate: with length more than three times the diameter of its distal end” → humerus must be present • Partonomy axioms in the ontology allow inferring presence or absence: • ‘all humerus part_of some forelimb’ → forelimb must be present if humerus is; humerus must be absent if forelimb is • Developmental origin axioms allow inferring presence or absence: • ‘all limb develops_from some limb bud’ → limb bud must be present if limb is; limb must be absent if limb bud is
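The two inference rules on this slide can be sketched as a small fixed-point computation: presence propagates from a structure to whatever it requires, and absence propagates the other way. The code below is an illustration only; the `REQUIRES` table encodes just the two axioms quoted on the slide, and the function name and data layout are assumptions, not the Phenoscape implementation (which relies on an OWL reasoner).

```python
# structure -> structures it requires, from the slide's two axioms:
#   'all humerus part_of some forelimb'
#   'all limb develops_from some limb bud'
REQUIRES = {
    "humerus": {"forelimb"},
    "limb": {"limb bud"},
}


def infer(asserted):
    """asserted maps structure -> True (present) or False (absent).
    Returns asserted plus inferred states; asserted values are never overwritten."""
    known = dict(asserted)
    changed = True
    while changed:  # repeat until no new state can be derived
        changed = False
        for dependent, required_set in REQUIRES.items():
            for required in required_set:
                # Presence flows upward: a humerus implies a forelimb.
                if known.get(dependent) is True and required not in known:
                    known[required] = True
                    changed = True
                # Absence flows downward: no forelimb implies no humerus.
                if known.get(required) is False and dependent not in known:
                    known[dependent] = False
                    changed = True
    return known


if __name__ == "__main__":
    print(infer({"humerus": True}))
    print(infer({"forelimb": False}))
```

A description such as "humerus slender and elongate" would enter this sketch as `{"humerus": True}`, after which forelimb presence follows without any author having stated it.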
  • 56. A reasoner can fill in lots of gaps Presence / absence of digits Asserted & inferred: 645 inferred present 74 asserted present 25 asserted absent 82% of taxa
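The counts on slides 54 and 56 can be cross-checked with simple arithmetic. The sketch below uses only the numbers given on the slides; the implied taxon totals are rough estimates, since the percentages are rounded.

```python
# Counts as given on slides 54 and 56.
asserted_present, asserted_absent = 74, 25
inferred_present = 645

asserted = asserted_present + asserted_absent   # taxa with author-asserted state
with_inference = asserted + inferred_present    # taxa with asserted or inferred state

# Rough totals implied by the rounded percentages (11% and 82%).
total_from_asserted = asserted / 0.11
total_from_inferred = with_inference / 0.82

if __name__ == "__main__":
    print(asserted, with_inference)
    print(round(total_from_asserted), round(total_from_inferred))
    print(round(with_inference / asserted, 1))  # fold increase in coverage
```

Both percentages point to a total of roughly 900 taxa, so the two slides are consistent with each other, and inference raises coverage by a factor of about 7.5.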
  • 57. A reasoner can fill in lots of gaps asserted presence/absence with inference Mesquite “birds-eye view”
  • 58. Even knowing only presence/absence of traits can be powerful
  • 59. Synthesis highlights conflict and gaps • Conflicting interpretations in studies • supinator process of humerus: both absent & present in Strepsodus (Zhu et al. 1999 vs. Ruta 2011) • Gaps in knowledge • acetabulum of pelvic girdle: present or absent? • Same term, different meaning? • Acanthostega: “radials, jointed” (Swartz 2012) • but doesn’t have radials... • Uneven taxon sampling • Figures: Parker et al., 2005; http://characterdesignnotes.blogspot.com/2011/04/proper-use-of-reference-andanatomy-in.html
  • 60. How to reason and query at the scale of biodiversity?
  • 61. Rich axioms allow rich inferences, but make scaling a challenge • Anatomy ontologies and EQ annotation employ rich OWL semantics → requires a DL reasoner • Classifying and querying over a large dataset (~25 million RDF triples) does not scale well • Presently, the only feasible OWL reasoner is ELK • constrained to the OWL EL profile → limits the kinds of expressions that can be used • best performance over class axioms only → data must be modeled so as to avoid the need for classifying instances
  • 62. KB build pipeline (diagram) • Contributors: MOD curators, Phenoscape curators, ontology curators • Inputs: ontologies; MOD phenotype & gene expression annotations and gene identifiers (tab-delimited); NeXML matrices • kb-owl-tools: OWL conversion (includes translation of EQ to OWL expressions); identifier cleanup (URIs for standard properties across ontologies are a mess); axiom generation ("Absence" classes as a workaround for classifying negation in OWL EL; SPARQL facilitation, e.g. materialized existential hierarchies such as part_of) • ELK reasoner: runs on extracted OWL TBox axioms only (not feasible with individuals included); inferred subclass axioms are materialized; absence hierarchy is asserted based on the inverse of the hierarchy of negated classes computed by ELK • Outputs: RDF (all) in a Bigdata triplestore; SPARQL endpoint; Owlet; SPARQL queries from applications
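The "absence hierarchy" step can be illustrated with a short sketch. The underlying idea is the contrapositive: if presence of X implies presence of Y, then absence of Y implies absence of X, so the absence classes form the inverse of the presence hierarchy. The function name `absence_hierarchy`, the class labels, and the hand-written input pairs below are assumptions for illustration; in the actual pipeline the positive hierarchy is computed by ELK.

```python
def absence_hierarchy(presence_pairs):
    """Invert a presence hierarchy into an absence hierarchy.

    presence_pairs: iterable of (x, y), read as
        'presence of x' SubClassOf 'presence of y'.
    Returns a set of (sub, super) label pairs among absence classes, using
    the contrapositive: 'absence of y' SubClassOf 'absence of x'.
    """
    return {(f"absence of {y}", f"absence of {x}") for x, y in presence_pairs}


if __name__ == "__main__":
    # Hypothetical stand-in for the subclass axioms a reasoner would infer.
    inferred = {("humerus", "forelimb"), ("forelimb", "limb")}
    for sub, sup in sorted(absence_hierarchy(inferred)):
        print(f"{sub} SubClassOf {sup}")
```

Asserting the inverted pairs directly avoids asking an OWL EL reasoner to classify negated class expressions, which the EL profile does not support.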
  • 63. Querying a triple store with complex OWL expressions • Want to allow arbitrary selection of structures of interest, using rich semantics: (part_of some (limb/fin or girdle skeleton)) or (connected_to some girdle skeleton) • RDF triplestores provide very limited reasoning expressivity, and scale poorly with large ontologies. • However, ELK can answer class expression queries within seconds.
  • 64. What if instead of this (*):
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
      PREFIX owl: <http://www.w3.org/2002/07/owl#>
      SELECT DISTINCT ?gene
      WHERE {
        ?gene ao:expressed_in ?structure .
        ?structure rdf:type ?structure_class .
        # Triple pattern selecting structure:
        ?structure_class rdfs:subClassOf ao:muscle .
        ?structure_class rdfs:subClassOf ?restriction .
        ?restriction owl:onProperty ao:part_of .
        ?restriction owl:someValuesFrom ao:head .
      }
    we could do this:
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
      PREFIX ow: <http://purl.org/phenoscape/owlet/syntax#>
      SELECT DISTINCT ?gene
      WHERE {
        ?gene ao:expressed_in ?structure .
        ?structure rdf:type ?structure_class .
        # Triple pattern containing an OWL expression:
        ?structure_class rdfs:subClassOf "ao:muscle and (ao:part_of some ao:head)"^^ow:omn .
      }
  • 65. owlet: SPARQL query expansion with in-memory OWL reasoner • owlet interprets OWL class expressions embedded within SPARQL queries • Uses any OWL API-based reasoner to preprocess the query • We use ELK, which holds the terminology in memory • Replaces the OWL expression with a FILTER statement listing the matching terms • http://github.com/phenoscape/owlet
  • 66.
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
      PREFIX ow: <http://purl.org/phenoscape/owlet/syntax#>
      SELECT DISTINCT ?gene
      WHERE {
        ?gene ao:expressed_in ?structure .
        ?structure rdf:type ?structure_class .
        # Triple pattern containing an OWL expression:
        ?structure_class rdfs:subClassOf "ao:muscle and (ao:part_of some ao:head)"^^ow:omn .
      }
    → owlet →
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
      PREFIX ow: <http://purl.org/phenoscape/owlet/syntax#>
      SELECT DISTINCT ?gene
      WHERE {
        ?gene ao:expressed_in ?structure .
        ?structure rdf:type ?structure_class .
        # Filter constraining ?structure_class to the terms returned by the OWL query:
        FILTER(?structure_class IN (ao:adductor_mandibulae, ao:constrictor_dorsalis, ...))
      }
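The rewriting step can be mimicked with a toy example. The sketch below is not owlet's code and does not call its API: `expand`, `OWL_PATTERN`, and `toy_reasoner` are hypothetical names, and the "reasoner" is a lookup table standing in for a real OWL reasoner such as ELK. It only shows the shape of the transformation, in which a triple pattern carrying an embedded class expression is replaced by a FILTER over the matching named classes.

```python
import re

# Matches a triple pattern of the form:
#   ?var rdfs:subClassOf "class expression"^^ow:omn .
OWL_PATTERN = re.compile(r'(\?\w+)\s+rdfs:subClassOf\s+"([^"]+)"\^\^ow:omn\s*\.')


def toy_reasoner(expression):
    """Hypothetical stand-in for a reasoner: expression -> named subclasses."""
    answers = {
        "ao:muscle and (ao:part_of some ao:head)": [
            "ao:adductor_mandibulae",
            "ao:constrictor_dorsalis",
        ],
    }
    return answers[expression]


def expand(query, reasoner=toy_reasoner):
    """Replace each embedded class expression with a FILTER listing its subclasses."""
    def to_filter(match):
        var, expression = match.group(1), match.group(2)
        terms = ", ".join(reasoner(expression))
        return f"FILTER({var} IN ({terms}))"

    return OWL_PATTERN.sub(to_filter, query)


if __name__ == "__main__":
    pattern = ('?structure_class rdfs:subClassOf '
               '"ao:muscle and (ao:part_of some ao:head)"^^ow:omn .')
    print(expand(pattern))
```

Because the expansion happens before the query reaches the triplestore, the store only has to evaluate an ordinary FILTER and needs no OWL reasoning support of its own.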
  • 67. How to annotate descriptive biodiversity data at scale?
  • 68. Curation (Dahdul et al., 2010 PLoS ONE) • 1. Students: gather publications (scan hard copies, produce OCR PDFs) • 2. Students: manual entry of free-text character descriptions, matrix, taxon list, specimens and museum numbers using Phenex • 3. Character annotation by experts: entry of phenotypes and homology assertions using Phenex • 4. Build pipeline for the Phenoscape KB (same pipeline diagram as on slide 62) • Totals: 139 publications; 8,073 characters; 18,811 states for 4,397 taxa; ~6.7 FTE years
  • 69. CharaParser: Using NLP to scale annotation Cui H (2012) CharaParser for fine-grained semantic annotation of organism morphological descriptions. JASIST 63: 738–754
  • 70. Most of the effort may actually be data digitization, ontology building, etc. • Actual data annotation: 6% • Preparatory overhead: 94%
  • 71. Building semantically rich community ontologies is hard 222 terms 790 terms Uberon: cross-species anatomy ontology
  • 72. Taxonomy ontology needs to be synthesized Vertebrate Taxonomy Ontology (VTO) Midford et al., in press JBMS
  • 73. How to evaluate semantic similarity metrics meaningfully?
  • 74. Many different metrics Pesquita et al (2009) Semantic similarity in biomedical ontologies. PLoS Comput Biol 5: e1000443
  • 75. What if there is no “gold standard”?
  • 76. Evaluate by tendency to maximize inter-curator similarity (figure: Curator 1 vs. Curator 2 annotations) • Can also be used to assess payoff from annotation granularity
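One way to picture this evaluation: score the same pair of curators' annotations under several candidate metrics and compare how similar each metric judges them to be. The sketch below is a hedged illustration only. The toy hierarchy `PARENT`, the two metrics, and the sample annotations are all invented for the example and are not the metrics or data used by the project.

```python
from statistics import mean

# Hypothetical toy is-a hierarchy: term -> parent (None at the root).
PARENT = {"bone": None, "long bone": "bone", "humerus": "long bone", "femur": "long bone"}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    out = set()
    while term is not None:
        out.add(term)
        term = PARENT[term]
    return out


def exact_match(t1, t2):
    """Metric 1: terms are similar only if identical."""
    return 1.0 if t1 == t2 else 0.0


def ancestor_jaccard(t1, t2):
    """Metric 2: Jaccard index over the two terms' ancestor sets."""
    a, b = ancestors(t1), ancestors(t2)
    return len(a & b) / len(a | b)


def inter_curator_similarity(metric, curator1, curator2):
    """Mean similarity over the characters annotated by both curators."""
    shared = curator1.keys() & curator2.keys()
    return mean(metric(curator1[c], curator2[c]) for c in shared)


if __name__ == "__main__":
    # Invented annotations: character id -> chosen ontology term.
    c1 = {"char1": "humerus", "char2": "femur"}
    c2 = {"char1": "humerus", "char2": "long bone"}
    for metric in (exact_match, ancestor_jaccard):
        print(metric.__name__, round(inter_curator_similarity(metric, c1, c2), 3))
```

A higher score alone is not decisive, since a metric that rates everything as similar would trivially win; the comparison is informative only among metrics that still separate genuinely different annotations.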
  • 77. Summary: Big obstacles, but also big opportunities for semantics-driven discovery • Vast stores of published knowledge and data waiting to be exploited • Countless possibilities for knowledge discovery by connecting data • Scaling out expressive reasoning is hard • Workforce training is a serious issue
  • 78. Phenoscape project team National Evolutionary Synthesis Center (NESCent) University of Oregon (Zebrafish Information Network) Todd Vision (also University of North Carolina at Chapel Hill) Monte Westerfield Hilmar Lapp Ceri Van Slyke Jim Balhoff Cincinnati Children's Hospital (Xenbase) Prashanti Manda University of South Dakota Paula Mabee David Blackburn Paul Sereno Nizar Ibrahim Mouse Genome Informatics Terry Hayamizu Christina James-Zorn California Academy of Sciences Alex Dececchi Judith Blake Aaron Zorn Virgilio Ponferrada Wasila Dahdul University of Chicago Yvonne Bradford University of Arizona Hong Cui Oregon Health & Science University Melissa Haendel Lawrence Berkeley National Labs Chris Mungall
  • 79. Acknowledgements • Phenoscape personnel, PIs, curators & workshop participants • National Evolutionary Synthesis Center (NESCent) • Nico Cellinese (U. Florida): Phyloreferencing • NSF (DBI-1062404, DBI-1062542) • Karen Cranston (Open Tree of Life) • Cynthia Parr (EOL) • William Ulate (BHL)