ELSEVIER | The Research Object Authoring Tool --- CNI 2018
FAIR4CURES
A Research Object Authoring Tool for the Data Commons
December 11, 2018
Anita de Waard (she, her)
VP Research Collaborations
Overview:
1. The NIH Data Commons: a very short introduction
2. The FAIR4CURES Project
3. A Global Unique Identifier Broker
4. Research Objects: a very very short introduction
5. Building a Research Object Authoring Tool on Mendeley Data
Overview:
The NIH Data Commons Pilot Phase aims to provide a marketplace for tools, data and workflows based on existing technologies of commercial and academic platforms that strive to embody the FAIR Data principles.
Data Commons Overview:
Goals of the project:
1. Advance the policies and protocols for accessing human subjects data
2. Support global identification, indexing and searching of available data sets
3. Provide a collection of computational pipelines that can be applied to data sets
4. Utilize standards to globally identify and access data sets, tools and workflows
5. Create policies for data citation, reuse and reproducibility
6. Enable researchers to port their own data and workflows into the cloud
Project structure:
• DCPPC research groups are addressing important Key Capabilities (KCs)
• The Commons will be composed of four stacks, incorporating products from the KCs
Final output:
• Data from three large NIH Databases will be available through all of these systems
• Users can securely access data within all stacks, on multiple cloud providers
• Users have access to a basic set of applications that run the same way on all stacks
https://public.nihdatacommons.us/ExecutiveSummary_4YP/
Key Capabilities:
1: FAIR Guidelines & Metrics
2: Global Unique IDs for FAIR Digital Objects
3: Open Standard APIs
4: Cloud Agnostic Architecture Framework
5: Workspaces for Computation
6: Research Ethics, Privacy, and Security
7: Indexing and Search
Data Commons Guiding Principles:
• 1. Identifiers for data: Develop and implement an interoperable global unique identifier system for digital
objects.
• 2. Data access: Develop and implement authentication and authorization policies and protocols for controlled
access to digital objects and derivatives.
• 3. Findability: Enable search and indexing of digital objects and data sets.
• 4. Software stacks: The Commons will encompass multiple robust and sustainable software stacks
implementing Commons standards and systems.
• 5. Data use, standards: All tools will be built using standard application interfaces.
• 6. Use cases: The Commons will develop and utilize an extensive use case library.
• 7. Community: The Commons is developed through intense Community engagement and support across
multiple levels of expertise.
• 8. Community: Governance, membership, and coordination will be established with and through the
community.
• 9. Evaluation methods and metrics: We plan a culture of frequent release of products, with small iterations,
routine evaluation and redesign.
• 10. FAIR guidelines and metrics: Once FAIR metrics and rubrics are defined, these will be used to measure the
level of “FAIRness” of repositories, datasets, and other digital objects.
https://public.nihdatacommons.us/executive-summary/
The FAIR4CURES Collaboration:
Team Xenon – Four partner organisations
FAIR: Findable, Accessible, Interoperable, Reusable
CURES: Collaborative, Usable, Reproducible, Extendable, Scalable
Index 3 datasets:
• Trans-omics for Precision Medicine (TOPMed)
• Genotype Tissue Expression (GTEx)
• Model Organisms Database (MODs)
The FAIR4CURES Platform
The FAIR4CURES System:
Global Unique Identifier Broker:
• Identifiers for hosted data files within TOPMed studies, GTEx dataset, and MODs
• Feature for researchers to register identifiers for their derived data files on the platform, making the content public and searchable
• Selecting types of identifiers to support in the Data Commons ecosystem and the required identifier metadata
• Open Source tool, connected to the SevenBridges Platform
• Also accessible via Github/SmartAPI
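As a rough illustration of what such a broker does, the sketch below mints an identifier for a derived data file and stores metadata alongside it so the record can be resolved and searched. It is an in-memory toy under stated assumptions: the `GuidBroker` class, its method names, the `guid:` prefix and the metadata fields are all hypothetical and are not the actual broker API.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class GuidRecord:
    guid: str
    object_type: str          # e.g. "Dataset", "Software", "Workflow"
    location: str             # where the file can be retrieved
    metadata: dict = field(default_factory=dict)


class GuidBroker:
    """Hypothetical in-memory stand-in for an identifier broker."""

    def __init__(self):
        self._records = {}

    def register(self, object_type, location, **metadata):
        # Mint an identifier locally; a real broker would delegate to an
        # identifier provider (for example DOIs, ARKs or minids).
        guid = f"guid:{uuid.uuid4()}"
        self._records[guid] = GuidRecord(guid, object_type, location, metadata)
        return guid

    def resolve(self, guid):
        return self._records[guid]

    def search(self, text):
        # Naive case-insensitive substring match over metadata values.
        text = text.lower()
        return [r for r in self._records.values()
                if any(text in str(v).lower() for v in r.metadata.values())]


broker = GuidBroker()
guid = broker.register("Dataset", "https://example.org/files/variants.vcf",
                       title="Derived variant calls")
print(broker.resolve(guid).location)
print([r.guid for r in broker.search("variant")] == [guid])
```

Keeping registration, resolution and search behind one small interface mirrors the bullet points above; which identifier schemes and metadata fields are actually required is exactly what the project was still selecting.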
Digital Object Types Identified following the KC2 Metadata Spec:
Seven Bridges Object Type | DataCite Resource Type | Proposed Schema.Org CreativeWork Type | Supported Relationships | Notes
File | Dataset | Dataset | Source Of a Task (input file); Derived From a Task (output file); Part Of a Collection | One (or more) files packaged with metadata as a dataset
App (Tool) | Software | SoftwareSourceCode | Part of Task or Collection or Workflow | Same as dataset, but file is source code
App (Workflow) | Workflow | SoftwareSourceCode (?) | Has Part of Software | An aggregation of Tools (Software). File is CWL definition describing how the tools are chained.
Task | Collection | Collection | Composition of Files and Apps (Tools or Workflows) | An aggregation of Apps (either tools or workflows), plus files (input & output) plus a record of all the settings used for each App.
Collection (Study) | Collection | Collection | Composition of any object | An aggregation of heterogeneous objects for purpose of publishing.
https://docs.google.com/document/d/1FD3aXr_uHnPy-YrFhQhuXET73tBVxu7F_Q5uS9TPUZs/edit
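The type table above can be read as a simple lookup. The sketch below transcribes its first three columns into a Python mapping and derives a small metadata stub from it. The names `TypeMapping`, `TYPE_MAP` and `describe`, and the shape of the returned dictionary, are assumptions made for illustration; they are not part of the KC2 specification.

```python
from typing import NamedTuple


class TypeMapping(NamedTuple):
    datacite_resource_type: str
    schema_org_type: str


# Rows transcribed from the table above. The slide marks the schema.org type
# of the Workflow row with "(?)", so that mapping was still undecided.
TYPE_MAP = {
    "File": TypeMapping("Dataset", "Dataset"),
    "App (Tool)": TypeMapping("Software", "SoftwareSourceCode"),
    "App (Workflow)": TypeMapping("Workflow", "SoftwareSourceCode"),
    "Task": TypeMapping("Collection", "Collection"),
    "Collection (Study)": TypeMapping("Collection", "Collection"),
}


def describe(sb_type: str) -> dict:
    """Return a minimal, illustrative metadata stub for a Seven Bridges object type."""
    mapping = TYPE_MAP[sb_type]
    return {
        "@type": mapping.schema_org_type,
        "resourceTypeGeneral": mapping.datacite_resource_type,
    }


print(describe("App (Tool)"))
```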
Seven Bridges Data Publication Concept
Requirements Analysis:
1. Landing page URL including GUID
2. URL for page where file can be accessed (downloaded)
3. Metadata for object
4. Reference to the Task (zero or one) that this dataset was Derived From
5. Reference to the Task(s) (zero, one or more) that this dataset is the Source Of
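The five requirements map naturally onto a record type. The following is a minimal sketch, assuming hypothetical field names and a hypothetical rule that the GUID appears literally in the landing page URL; it is not the Seven Bridges data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PublishedDataset:
    guid: str
    landing_page_url: str                 # requirement 1: includes the GUID
    download_url: str                     # requirement 2
    metadata: dict                        # requirement 3
    derived_from: Optional[str] = None    # requirement 4: zero or one Task
    source_of: List[str] = field(default_factory=list)  # requirement 5: zero or more Tasks

    def __post_init__(self):
        # Assumed check: the GUID must appear verbatim in the landing page URL.
        if self.guid not in self.landing_page_url:
            raise ValueError("landing page URL must include the GUID")


record = PublishedDataset(
    guid="abc123",
    landing_page_url="https://example.org/objects/abc123",
    download_url="https://example.org/objects/abc123/download",
    metadata={"title": "Example dataset"},
    source_of=["task-42"],
)
print(record.derived_from, record.source_of)
```

Modelling "Derived From" as an optional single value and "Source Of" as a list encodes the cardinalities stated in requirements 4 and 5 directly in the types.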
Seven Bridges Workflow Configuration (CWL)
What are Research Objects?
Standards-based metadata framework for logically and physically bundling resources with context
http://researchobject.org
• Aggregates: link things together
• Annotations: about things & their relationships
• Container: packaging content & links (Zip files, BagIt, Docker images)
• Identification: locate things regardless of where they are
Research Objects and BDBags:
Research Objects can be used to capture outputs in a wide range of scopes
• Profiles help define the shape and form of a research object.
• A profile defines the general purpose of that type of Research Object:
  • A format (e.g. Research Object Bundle),
  • The kinds of resources it is expected to contain,
  • A link to any specific vocabularies that should be used in its annotations.
Applications of Research Objects include BDBags (Big Data Bags):
• In digital libraries, preservation of source artifacts commonly uses the BagIt format for archive serialization, capturing digital resources like audio recordings, document scans and their transcriptions, provenance and annotations.
• The Research Object BagIt archive is a profile for describing a BagIt archive and its content as a Research Object to structure the metadata and relate the captured resources.
• The NIH-funded Big Data for Discovery Science (BDDS) project captures Big Data bags (BDBag) of large complex datasets from genomics workflows (https://doi.org/10.1109/BigData.2016.7840618).
• A key aspect of BDBag is the ability to use Minimal Viable Identifiers (minid) for referencing potentially large data sources held in multiple remote repositories, effectively making a “Big Data” Research Object for large-scale workflows (https://doi.org/10.1101/268755).
• A bag of bags (minid:b9vx04) is a metadata skeleton which may be completed with tools like bdbag to download the big data.
• The bags’ Research Object manifests can be consumed independently, linking to the remote resources.
http://www.researchobject.org/scopes/
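To make the "metadata skeleton" idea concrete, here is a minimal sketch that writes a BagIt-style directory by hand using only the Python standard library. Small files are stored under `data/`, while large remote files are only listed in `fetch.txt` so that a tool can download them later. The function name `make_bag`, the example URL, the file length and the all-zero checksum are placeholders chosen for illustration; this is not the bdbag tool itself.

```python
import hashlib
import tempfile
from pathlib import Path


def make_bag(bag_dir, local_files, remote_files):
    """Write a minimal BagIt layout.

    local_files:  {relative_name: bytes} stored under data/
    remote_files: [(url, length, sha256_hex, relative_name)] listed in fetch.txt
    """
    bag = Path(bag_dir)
    (bag / "data").mkdir(parents=True, exist_ok=True)
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n")

    manifest, fetch = [], []
    for name, content in local_files.items():
        (bag / "data" / name).write_bytes(content)
        manifest.append(f"{hashlib.sha256(content).hexdigest()}  data/{name}")
    for url, length, sha256_hex, name in remote_files:
        # Remote payload is referenced, not copied, so the bag stays small.
        manifest.append(f"{sha256_hex}  data/{name}")
        fetch.append(f"{url} {length} data/{name}")

    (bag / "manifest-sha256.txt").write_text("\n".join(manifest) + "\n")
    if fetch:
        (bag / "fetch.txt").write_text("\n".join(fetch) + "\n")
    return bag


with tempfile.TemporaryDirectory() as tmp:
    bag = make_bag(
        Path(tmp) / "example-bag",
        {"readme.txt": b"small local file\n"},
        # Placeholder URL, length and checksum for a large remote file.
        [("https://example.org/big/genome.bam", 123456789, "0" * 64, "genome.bam")],
    )
    entries = sorted(p.name for p in bag.iterdir())
    fetch_text = (bag / "fetch.txt").read_text()

print(entries)
print(fetch_text, end="")
```

The manifest still carries a checksum for every payload file, including the remote ones, so a completed bag can be validated after download.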
Moving from Datasets to Research Objects in Mendeley Data:
In the Mendeley Data Repository, datasets are lists of files (stored in our S3 bucket) with metadata packaging (e.g. Title, Description, Categories, License) and a persistent identifier (DOI).
We will introduce:
• Collections as an aggregation of Datasets. Similar to a Dataset, BUT, the contents are other datasets, not files.
• Software and Workflow as different types of Digital Objects. Similar to a Dataset, BUT files are source code or
workflow specifications (e.g. CWL) and metadata properties could be a bit different.
This forms the foundation for Research Objects, which are:
• Collections or aggregations of different types of Digital Objects (not just datasets)
• References to digital objects on other platforms, based on standard identifiers (e.g. DOIs or ARKs)
• A manifest which lists and describes the contents of the Research Object
• Exposed in JSON-LD:
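The JSON-LD example from the slide is not preserved in this transcript. As a stand-in, the sketch below assembles a manifest with the ingredients listed above: aggregated objects referenced by identifier, plus annotations about them. The key names and context URL follow the Research Object Bundle manifest format as far as I am aware, but treat them, the `build_manifest` helper and every example URI (including the placeholder DOI) as assumptions rather than the tool's actual output.

```python
import json


def build_manifest(ro_id, aggregated, annotations=()):
    """Assemble a Research Object manifest as a JSON-LD-style dict."""
    return {
        # Context URL assumed from the Research Object Bundle format.
        "@context": ["https://w3id.org/bundle/context"],
        "id": ro_id,
        "aggregates": [{"uri": uri, "mediatype": mediatype}
                       for uri, mediatype in aggregated],
        "annotations": [{"about": about, "content": content}
                        for about, content in annotations],
    }


manifest = build_manifest(
    "https://example.org/ro/1",
    aggregated=[
        # Placeholder identifiers: a dataset DOI and a CWL workflow file.
        ("https://doi.org/10.0000/example-dataset", "text/csv"),
        ("https://example.org/workflows/align.cwl", "text/x+yaml"),
    ],
    annotations=[
        ("https://example.org/workflows/align.cwl", "annotations/provenance.jsonld"),
    ],
)
print(json.dumps(manifest, indent=2))
```

Because aggregated items are plain URIs, the same manifest can point at files in the repository and at digital objects hosted on other platforms.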
In Summary:
Objective 1 – support “Task” type Research Objects on Seven Bridges platform.
Objective 2 – support configurable Research Objects on Mendeley Data platform.
Phase 1: Pilot Project (Apr – Sep 2018)
GUID Broker (API Only), used by the Seven Bridges FAIR4CURES Platform:
• Register Datasets (Data Files)
• Register Software Objects
• Register Workflow Objects
• Register a Collection as a list of digital objects (data, sw, wf)
Phase 2: Project (Oct 2018 - 2019)
Research Object Composer, used by the Mendeley Data Platform, re-uses the GUID Broker:
• Add annotation and relationships to collection to describe a research object
• Serialise Research Object in standard format based on BDBags and RO standards
http://smart-api.info/ui/bf9abe9c17c9c78c432832382ef9e16a#/
Acknowledgements:
• This work is supported by the NIH Data Commons Pilot Phase under the Research Opportunity Announcement (ROA) RM-17-026 (https://commonfund.nih.gov/commons/):
  • NIH Data Commons - 1 OT3 OD025463-01
  • NHLBI STAGE Project - 1 OT3 HL142478-01
• The FAIR4CURES Project is led by SevenBridges (Alison Leaf, Brandi Davis-Dusenbury and Sarper Avcil)
• We partner in the Project with Repositive UK and the US Dept of Veterans Affairs
• The metadata standards development was done by KC2, led by Team Sodium (esp. Merce Crosas, Tim Clark, Trisha Cruse and Martin Fenner)
• The Research Objects Authoring Tool work is led by the University of Manchester, who pioneered work on Research Objects (Stian Soiland-Reyes and Carole Goble)
• The Mendeley Data team has built the GUID Broker Prototype (Gabriel Oscares, Gareth Harvey

Data Repositories: Recommendation, Certification and Models for Cost Recovery
 
The Economics of Data Sharing
The Economics of Data SharingThe Economics of Data Sharing
The Economics of Data Sharing
 
Public Identifiers in Scholarly Publishing
Public Identifiers in Scholarly PublishingPublic Identifiers in Scholarly Publishing
Public Identifiers in Scholarly Publishing
 
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumElsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
 
Elsevier‘s RDM Program: Ten Habits of Highly Effective Data
Elsevier‘s RDM Program: Ten Habits of Highly Effective DataElsevier‘s RDM Program: Ten Habits of Highly Effective Data
Elsevier‘s RDM Program: Ten Habits of Highly Effective Data
 
Charleston Conference 2016
Charleston Conference 2016Charleston Conference 2016
Charleston Conference 2016
 
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
 
RDA-WDS Publishing Data Interest Group
RDA-WDS Publishing Data Interest GroupRDA-WDS Publishing Data Interest Group
RDA-WDS Publishing Data Interest Group
 
The Rocky Road to Reuse
The Rocky Road to ReuseThe Rocky Road to Reuse
The Rocky Road to Reuse
 
Collaboratively creating a network of ideas, data and software
Collaboratively creating a network of ideas, data and softwareCollaboratively creating a network of ideas, data and software
Collaboratively creating a network of ideas, data and software
 

Recently uploaded

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 

Recently uploaded (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 

CNI 2018: A Research Object Authoring Tool for the Data Commons

  • 1. ELSEVIER | The Research Object Authoring Tool --- CNI 2018
    FAIR4CURES: A Research Object Authoring Tool for the Data Commons
    December 11, 2018
    Anita de Waard (she, her), VP Research Collaborations
  • 2. Overview:
    1. The NIH Data Commons: a very short introduction
    2. The FAIR4CURES Project
    3. A Global Unique Identifier Broker
    4. Research Objects: a very very short introduction
    5. Building a Research Object Authoring Tool on Mendeley Data
  • 3. Overview: The NIH Data Commons Pilot Phase aims to provide a marketplace for tools, data and workflows based on existing technologies of commercial and academic platforms that strive to embody the FAIR Data principles.
  • 4. Data Commons Overview:
    Goal of the project:
    1. Advance the policies and protocols for accessing human subjects data
    2. Support global identification, indexing and searching of available data sets
    3. Provide a collection of computational pipelines that can be applied to data sets
    4. Utilize standards to globally identify and access data sets, tools and workflows
    5. Create policies for data citation, reuse and reproducibility
    6. Enable researchers to port their own data and workflows into the cloud
    Project structure:
    • DCPPC research groups are addressing important Key Capabilities (KCs)
    • The Commons will be composed of four stacks, incorporating products from the KCs
    Final output:
    • Data from three large NIH Databases will be available through all of these systems
    • Users can securely access data within all stacks, on multiple cloud providers
    • Users have access to a basic set of applications that run the same way on all stacks
    https://public.nihdatacommons.us/ExecutiveSummary_4YP/
    Key Capabilities:
    1: FAIR Guidelines & Metrics
    2: Global Unique IDs for FAIR Digital Objects
    3: Open Standard APIs
    4: Cloud Agnostic Architecture Framework
    5: Workspaces for Computation
    6: Research Ethics, Privacy, and Security
    7: Indexing and Search
  • 5. Data Commons Guiding Principles:
    1. Identifiers for data: Develop and implement an interoperable global unique identifier system for digital objects.
    2. Data access: Develop and implement authentication and authorization policies and protocols for controlled access to digital objects and derivatives.
    3. Findability: Enable search and indexing of digital objects and data sets.
    4. Software stacks: The Commons will encompass multiple robust and sustainable software stacks implementing Commons standards and systems.
    5. Data use, standards: All tools will be built using standard application interfaces.
    6. Use cases: The Commons will develop and utilize an extensive use case library.
    7. Community: The Commons is developed through intense Community engagement and support across multiple levels of expertise.
    8. Community: Governance, membership, and coordination will be established with and through the community.
    9. Evaluation methods and metrics: We plan a culture of frequent release of products, with small iterations, routine evaluation and redesign.
    10. FAIR guidelines and metrics: Once FAIR metrics and rubrics are defined, these will be used to measure the level of “FAIRness” of repositories, datasets, and other digital objects.
    https://public.nihdatacommons.us/executive-summary/
  • 6. The FAIR4CURES Collaboration:
    Team Xenon – Four partner organisations
    FAIR: Findable, Accessible, Interoperable, Reusable
    CURES: Collaborative, Usable, Reproducible, Extendable, Scalable
    Index 3 datasets:
    • Trans-omics for Precision Medicine (TOPMed)
    • Genotype Tissue Expression (GTEx)
    • Model Organisms Database (MODs)
  • 7. The FAIR4CURES System: [Figure: The FAIR4CURES Platform]
  • 8. Global Unique Identifier Broker:
    • Identifiers for hosted data files within TOPMed studies, GTEx dataset, and MODs
    • Feature for researchers to register identifiers for their derived data files on the platform, making the content public and searchable
    • Selecting types of identifiers to support in the Data Commons ecosystem and the required identifier metadata
    • Open Source tool, connected to the SevenBridges Platform
    • Also accessible via Github/SmartAPI
  • 9. Digital Object Types Identified following the KC2 Metadata Spec:
    Columns: Seven Bridges Object Type | DataCite Resource Type | Proposed Schema.Org CreativeWork Type | Supported Relationships | Notes
    • File | Dataset | Dataset | Source Of a Task (input file); Derived From a Task (output file); Part Of a Collection | One (or more) files packaged with metadata as a dataset
    • App (Tool) | Software | SoftwareSourceCode | Part of Task or Collection or Workflow | Same as dataset, but file is source code
    • App (Workflow) | Workflow | SoftwareSourceCode (?) | Has Part of Software | An aggregation of Tools (Software). File is CWL definition describing how the tools are chained.
    • Task | Collection | Collection | Composition of Files and Apps (Tools or Workflows) | An aggregation of Apps (either tools or workflows), plus files (input & output) plus a record of all the settings used for each App.
    • Collection (Study) | Collection | Collection | Composition of any object | An aggregation of heterogeneous objects for purpose of publishing.
    https://docs.google.com/document/d/1FD3aXr_uHnPy-YrFhQhuXET73tBVxu7F_Q5uS9TPUZs/edit
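The type mapping on this slide can be sketched as a small lookup table. This is a minimal illustration only: the values are transcribed from the slide, while the `TypeMapping` class, the `OBJECT_TYPE_MAP` name, and the `describe` helper are hypothetical and not part of the KC2 specification or any Seven Bridges API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeMapping:
    """DataCite and schema.org types proposed for one Seven Bridges object type."""
    datacite_type: str
    schema_org_type: str


# Values transcribed from the slide; the dictionary layout is an assumption.
OBJECT_TYPE_MAP = {
    "File": TypeMapping("Dataset", "Dataset"),
    "App (Tool)": TypeMapping("Software", "SoftwareSourceCode"),
    # The slide marks this schema.org type with "(?)", so treat it as tentative.
    "App (Workflow)": TypeMapping("Workflow", "SoftwareSourceCode"),
    "Task": TypeMapping("Collection", "Collection"),
    "Collection (Study)": TypeMapping("Collection", "Collection"),
}


def describe(sb_type: str) -> str:
    """Return a one-line summary of the mapping for a Seven Bridges object type."""
    mapping = OBJECT_TYPE_MAP[sb_type]
    return (f"{sb_type}: DataCite={mapping.datacite_type}, "
            f"schema.org={mapping.schema_org_type}")


if __name__ == "__main__":
    for name in OBJECT_TYPE_MAP:
        print(describe(name))
```

Keeping the mapping in one table makes the tentative entry easy to revise once the schema.org type for workflows is settled.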
  • 10. Seven Bridges Data Publication Concept Requirements Analysis:
    1. Landing page URL including GUID
    2. URL for page where file can be accessed (downloaded)
    3. Metadata for object
    4. Reference to the Task (zero or one) that this dataset was Derived From
    5. Reference to the Task(s) (zero, one or more) that this dataset is the Source Of
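The five requirements on this slide could be captured in a record like the one below. This is a sketch under stated assumptions: the class name, field names, and all example URLs and identifiers are hypothetical placeholders, not the Seven Bridges data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DatasetPublication:
    """Hypothetical record holding the five items listed on the slide."""
    landing_page_url: str                     # 1. landing page URL, including the GUID
    access_url: str                           # 2. page where the file can be downloaded
    metadata: Dict[str, str]                  # 3. metadata for the object
    derived_from_task: Optional[str] = None   # 4. zero or one producing Task
    source_of_tasks: List[str] = field(default_factory=list)  # 5. zero or more consuming Tasks

    def landing_page_has_guid(self, guid: str) -> bool:
        """Check requirement 1: the GUID must appear in the landing page URL."""
        return guid in self.landing_page_url


if __name__ == "__main__":
    # All identifiers and URLs below are made-up placeholders.
    record = DatasetPublication(
        landing_page_url="https://example.org/objects/guid-123",
        access_url="https://example.org/objects/guid-123/download",
        metadata={"title": "Example output file"},
        derived_from_task="task-42",
    )
    print(record.landing_page_has_guid("guid-123"))
```

Modelling item 4 as optional and item 5 as a list mirrors the "zero or one" and "zero, one or more" cardinalities stated on the slide.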
  • 11. Seven Bridges Workflow Configuration (CWL)
  • 12. What are Research Objects?
    Standards-based metadata framework for logically and physically bundling resources with context (http://researchobject.org)
    • Aggregates: link things together
    • Annotations: about things & their relationships
    • Container: packaging content & links (Zip files, BagIt, Docker images)
    • Identification: locate things regardless of where they are
  • 13. Research Objects and BDBags: http://www.researchobject.org/scopes/
    Research Objects can be used to capture outputs in a wide range of scopes.
    • Profiles help define the shape and form of a research object.
    • A profile defines the general purpose of that type of Research Object:
      • A format (e.g. Research Object Bundle),
      • An expectation of what kind of resources should be included,
      • A link to any specific vocabularies that should be used in its annotations.
    Applications of Research Objects include BDBags (Big Data Bags):
    • In digital libraries, preservation of source artifacts commonly uses the BagIt format for archive serialization, capturing digital resources like audio recordings, document scans and their transcriptions, provenance and annotations.
    • The Research Object BagIt archive is a profile for describing a BagIt archive and its content as a Research Object, to structure the metadata and relate the captured resources.
    • The NIH-funded Big Data for Discovery Science (BDDS) project captures Big Data bags (BDBag) of large complex datasets from genomics workflows (https://doi.org/10.1109/BigData.2016.7840618).
    • A key aspect of BDBag is the ability to use Minimal Viable Identifiers (minid) for referencing potentially large data sources held in multiple remote repositories, effectively making a “Big Data” Research Object for large-scale workflows (https://doi.org/10.1101/268755).
    • A bag of bags (minid:b9vx04) is a metadata skeleton which may be completed with tools like bdbag to download the big data.
    • The bags’ Research Object manifests can be consumed independently, linking to the remote resources.
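To make the BagIt serialization mentioned on this slide more concrete, here is a minimal sketch of the files a plain BagIt bag contains: a `bagit.txt` declaration, a `data/` payload directory, and a checksum manifest. It covers local payload files only and does not model BDBag's remote references or minids; the function name `make_bag_files` is an assumption made for illustration, and real projects would normally use an existing BagIt or bdbag tool instead.

```python
import hashlib
from typing import Dict


def make_bag_files(payload: Dict[str, bytes]) -> Dict[str, bytes]:
    """Return a mapping of bag-relative paths to file contents for a minimal bag."""
    files = {
        # Bag declaration required at the top level of every bag.
        "bagit.txt": b"BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
    }
    manifest_lines = []
    for name in sorted(payload):
        path = f"data/{name}"              # payload files live under data/
        files[path] = payload[name]
        digest = hashlib.sha256(payload[name]).hexdigest()
        manifest_lines.append(f"{digest}  {path}")
    # Payload manifest: one "<checksum>  <path>" line per payload file.
    files["manifest-sha256.txt"] = ("\n".join(manifest_lines) + "\n").encode("utf-8")
    return files


if __name__ == "__main__":
    bag = make_bag_files({"results.csv": b"sample,value\nA,1\n"})
    for path in sorted(bag):
        print(path)
```

Returning the contents in memory keeps the sketch free of file system side effects; writing each entry to disk under a bag directory would produce the on-disk layout.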
  • 14. Moving from Datasets to Research Objects in Mendeley Data:
    In the Mendeley Data Repository, datasets are lists of files (stored in our S3 bucket) with metadata packaging (e.g. Titles, Description, Categories, License) and a persistent identifier (DOI).
    We will introduce:
    • Collections as an aggregation of Datasets. Similar to a Dataset, but the contents are other datasets, not files.
    • Software and Workflow as different types of Digital Objects. Similar to a Dataset, but the files are source code or workflow specifications (e.g. CWL), and the metadata properties could be a bit different.
    This forms the foundation for Research Objects, which are:
    • Collections or aggregations of different types of Digital Objects (not just datasets)
    • References to digital objects on other platforms, based on standard identifiers (e.g. DOIs or ARKs)
    • A manifest which lists and describes the contents of the Research Object
    • Exposed in JSON-LD
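A manifest of the kind described on this slide might look like the sketch below, which aggregates typed digital objects by identifier and serialises the result as JSON-LD. It is an illustration under assumptions, not the Mendeley Data format: it uses the schema.org vocabulary (`Collection`, `Dataset`, `SoftwareSourceCode`, `hasPart`), and the `build_manifest` function and every identifier in the example are hypothetical placeholders.

```python
import json
from typing import Dict, List, Tuple


def build_manifest(ro_id: str, name: str,
                   parts: List[Tuple[str, str]]) -> Dict[str, object]:
    """Build a JSON-LD manifest listing the aggregated digital objects.

    Each entry in `parts` is (identifier, schema.org type), so objects hosted
    on other platforms can be referenced by identifier alone.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Collection",
        "@id": ro_id,
        "name": name,
        "hasPart": [{"@id": identifier, "@type": object_type}
                    for identifier, object_type in parts],
    }


if __name__ == "__main__":
    # Placeholder identifiers; none of these resolve to real objects.
    manifest = build_manifest(
        "https://example.org/ro/1",
        "Example research object",
        [
            ("https://example.org/dataset/1", "Dataset"),
            ("https://example.org/workflow/1", "SoftwareSourceCode"),
        ],
    )
    print(json.dumps(manifest, indent=2))
```

Referencing parts by `@id` rather than embedding them is what lets a single manifest mix locally hosted datasets with objects held on other platforms.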
  • 15. In Summary:
    Objective 1 – support “Task” type Research Objects on Seven Bridges platform.
    Objective 2 – support configurable Research Objects on Mendeley Data platform.
    Components (linked in the diagram by “Uses” and “Re-uses” arrows):
    • GUID Broker (API Only)
    • Seven Bridges Fair4CURES Platform
    • Mendeley Data Platform
    Phase 1 Pilot Project (Apr – Sep 2018):
    • Register Datasets (Data Files)
    • Register Software Objects
    • Register Workflow Objects
    • Register a Collection as a list of digital objects (data, sw, wf)
    Phase 2 Project (Oct 2018 – 2019):
    • Add annotation and relationships to collection to describe a research object
    • Research Object Composer
    • Serialise Research Object in standard format based on BDBags and RO standards
    http://smart-api.info/ui/bf9abe9c17c9c78c432832382ef9e16a#/
  • 16. Acknowledgements:
    • This work is supported by the NIH Data Commons Pilot Phase under the Research Opportunity Announcement (ROA) RM-17-026 (https://commonfund.nih.gov/commons/):
      • NIH Data Commons - 1 OT3 OD025463-01
      • NHLBI STAGE Project - 1 OT3 HL142478-01
    • The FAIR4CURES Project is led by SevenBridges (Alison Leaf, Brandi Davis-Dusenbury and Sarper Avcil)
    • We partner in the Project with Repositive UK and the US Dept of Veteran’s Affairs
    • The metadata standards development was done by KC2, led by Team Sodium (esp. Merce Crosas, Tim Clark, Trisha Cruse and Martin Fenner)
    • The Research Objects Authoring Tool work is led by the University of Manchester, who pioneered work on Research Objects (Stian Soiland-Reyes and Carole Goble)
    • The Mendeley Data team has built the GUID Broker Prototype (Gabriel Oscares, Gareth Harvey