Presenter: Jeffrey Grethe, PhD, Principal Investigator of NIDDK Information Network (dkNET), Center for Research in Biological Systems, University of California San Diego
For all proposals submitted on or after January 25, 2023, NIH requires the sharing of data from all NIH-funded studies. Do you have appropriate data management practices and sharing plans in place to meet these requirements? Have questions or need some help? Join the dkNET office hours to learn about NIH’s policy (NOT-OD-21-013) and resources that could help.
Previous Office Hours Slides and Recording: https://dknet.org/rin/research-data-management
Upcoming Webinars Schedule: https://dknet.org/about/webinar
dkNET Office Hours: NIH Data Management and Sharing Mandate 05/03/2024
1. An NIDDK Resource dknet.org
The NIDDK Information Network
Jeffrey S. Grethe, Ph.D.
2. An NIDDK Resource dknet.org
dkNET Office Hours:
NIH Data Management and Sharing Mandate
Jeffrey S. Grethe, PhD
PI, NIDDK Information Network (dkNET)
Co-Director, FAIR Data Informatics Laboratory, UCSD
3. An NIDDK Resource dknet.org
What is dkNET?
• dkNET provides a single point of access to information about diverse research resources,
including data, information, materials, tools, funding opportunities, literature, services,
events, news, and projects that advance the mission of the NIDDK.
• dkNET provides tools and services in support of rigor and reproducibility, built around the
Research Resource Identifier (RRID) and the FAIR data principles (Findable, Accessible,
Interoperable, Re-usable).
https://dknet.org
4. An NIDDK Resource dknet.org
What is dkNET?
Next Phase of dkNET
Resource Core: Current dkNET Services
Computational Core: Provide computational services and
AI/ML methods to the DK community
Outreach Core: Community board and use case
development
NEW
5. An NIDDK Resource dknet.org
How Is dkNET Helping Researchers?
[Diagram: the research cycle and the dkNET tools that support it]
Researchers can use dkNET to:
Research cycle: Ask a question → Do background research → Construct a hypothesis → Plan experiments → Collect and analyze data → Publish results
dkNET tools: Discovery Portal, Hypothesis Center, Resource Reports, Authentication Reports
Items shown in the diagram:
• Information, Material, Data, Tools, Funded grant and funding opportunities, Literature, Tutorials
• Resource Reports, Hypothesis Center
• FAIR Data Resources, Data Management, Data Repositories
• Resource Reports, Cite RRID, Track Resources
• Resource Identification, Authentication plans, NIH Mandates on Rigor and Reproducibility for grant submission
• SPP, MMPC
6. An NIDDK Resource dknet.org
New NIH Policy Effective January 25, 2023
• US National Institutes of Health new
data sharing policy goes into effect
• All data must be managed; most
data should be shared
• “As open as possible; as closed as
necessary”
• Mandates the inclusion, approval
and execution of a Data
Management and Sharing Plan
• (DMP + S = DMS)
https://grants.nih.gov/grants/guide/notice-files/NOT-OD-21-013.html
7. An NIDDK Resource dknet.org
The Details...
What
• Defines Scientific Data as: “The recorded factual material commonly accepted in the scientific
community as of sufficient quality to validate and replicate research findings, regardless of whether
the data are used to support scholarly publications. Scientific data do not include laboratory
notebooks, preliminary analyses, completed case report forms, drafts of scientific papers, plans for
future research, peer reviews, communications with colleagues, or physical objects, such as
laboratory specimens.”
• Even those scientific data not used to support a publication are considered scientific data and within
the final DMS Policy’s scope
When
• “[s]hared scientific data should be made accessible as soon as possible, and no later than the time of
an associated publication, or the end of the award/support period, whichever comes first.”
• Researchers should share data underlying a publication during the award period, no later than the time of publication;
other data that have not yet led to a publication should be shared by the end of the award period.
Where
• Encourages the use of established repositories to the extent possible.
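The "When" rule on this slide comes down to taking the earlier of two dates. A minimal Python sketch of that reading; the helper name `sharing_deadline` and the dates are hypothetical and not part of the NIH policy text:

```python
from datetime import date
from typing import Optional


def sharing_deadline(award_end: date, publication: Optional[date] = None) -> date:
    """Latest date by which scientific data should be shared.

    Follows the quoted rule: no later than the associated publication or
    the end of the award period, whichever comes first.
    """
    if publication is None:
        # Data that have not led to a publication: due at the end of the award.
        return award_end
    return min(publication, award_end)


# Hypothetical dates, for illustration only.
award_end = date(2027, 6, 30)
print(sharing_deadline(award_end, date(2026, 3, 15)))  # publication comes first
print(sharing_deadline(award_end))                     # no publication yet
```

The policy also says "as soon as possible", so the returned date is an upper bound rather than a target.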
8. An NIDDK Resource dknet.org
More Details...
How
• NIH encourages data management and data sharing practices consistent with the FAIR data
principles
Funding
• Fees for long-term data preservation and sharing are allowable, but funds for these activities must be
spent during the performance period, even for scientific data and metadata preserved and shared
beyond the award period.
Repercussions
• After the end of the funding period, non-compliance with the NIH ICO-approved Plan may be taken
into account by NIH for future funding decisions for the recipient institution
The DMS Policy applies to all research, funded or conducted in whole or in part by NIH, that results in the
generation of scientific data. This includes research funded or conducted by extramural grants, contracts,
Intramural Research Projects, or other funding agreements regardless of NIH funding level or funding
mechanism.
9. An NIDDK Resource dknet.org
Data as a Research Product
“Sound, reproducible scholarship rests upon a foundation of robust, accessible data. For this to be
so in practice as well as theory, data must be accorded due importance in the practice of
scholarship and in the enduring scholarly record…”
https://www.force11.org/group/joint-declaration-data-citation-principles-final
Joint Declaration of Data Citation Principles
1. Data should be considered legitimate, citable products of research. Data citations should be
accorded the same importance in the scholarly record as citations of other research objects,
such as publications.
2. Data citations should facilitate giving scholarly credit and normative and legal attribution to all
contributors to the data, recognizing that a single style or mechanism of attribution may not be
applicable to all data.
3. In scholarly literature, whenever and wherever a claim relies upon data, the corresponding data
should be cited.
10. A data citation looks like a regular citation
DOI
Full citation
DOI:10.34945/F5XW2P
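A data citation can be assembled from the same kinds of fields as an article citation, plus the dataset's DOI. The Python sketch below is one possible layout, not a required style. Only the DOI comes from the slide; the authors, year, title, and repository are placeholders that do not describe the real dataset behind that DOI.

```python
def format_data_citation(authors, year, title, repository, doi):
    """Build a citation string for a dataset, ending in a resolvable DOI link."""
    author_text = ", ".join(authors)
    return (
        f"{author_text} ({year}). {title} [Data set]. "
        f"{repository}. https://doi.org/{doi}"
    )


citation = format_data_citation(
    authors=["Author A", "Author B"],  # placeholder names
    year=2024,                         # placeholder year
    title="Example dataset title",     # placeholder title
    repository="Example Repository",   # placeholder repository
    doi="10.34945/F5XW2P",             # DOI shown on the slide
)
print(citation)
```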
11. Proper data citation = data citation metrics
https://datasetsearch.research.google.com/
12. An NIDDK Resource dknet.org
Good data management is the gateway to data sharing
Borghi J, Abrams S, Lowenberg D, Simms S, Chodacki J (2018) Support Your Data: A Research Data
Management Guide for Researchers. Research Ideas and Outcomes 4: e26439.
https://doi.org/10.3897/rio.4.e26439
13. An NIDDK Resource dknet.org
Changing the culture around data management and sharing
• Me
  ○ Answer to the underpowered study
  ○ Data sharing and good data management are closely aligned
  ○ Compliance with mandates
  ○ Credit for the totality of my work
• Science and Society
  ○ Transparency
  ○ Reproducibility
  ○ Reduced waste
  ○ Driving discovery
• Future me
  ○ One most likely to benefit from good data management and sharing through stable archives
  ○ No one ever regretted annotating too much
• My colleagues (and PI)
  ○ Easy to engage with colleagues over well annotated data and associated code
  ○ What happens when the post doc leaves?
April 2021; National Academies of Science Workshop
14. An NIDDK Resource dknet.org
But how do I do that?
● “NIH encourages data management and data sharing practices
consistent with the FAIR data principles.”
● “NIH strongly encourages the use of established repositories to
the extent possible for preserving and sharing scientific data”
15. An NIDDK Resource dknet.org
The FAIR Guiding Principles for scientific data management and
stewardship
High level principles to make data:
• Findable
• Accessible
• Interoperable
• Re-usable
Mark D. Wilkinson et al. The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data (2016). DOI: 10.1038/sdata.2016.18
…for humans and machines
16. An NIDDK Resource dknet.org
Findable
● F1. (meta)data are assigned a globally unique and
persistent identifier
● F2. data are described with rich metadata
● F3. metadata clearly and explicitly include the identifier of
the data it describes
● F4. (meta)data are registered or indexed in a searchable
resource
Accessible
● A1. (meta)data are retrievable by their identifier using a
standardized communications protocol
● A1.1 the protocol is open, free, and universally
implementable
● A1.2 the protocol allows for an authentication and
authorization procedure, where necessary
● A2. metadata are accessible, even when the data are no
longer available
Interoperable
● I1. (meta)data use a formal, accessible, shared, and
broadly applicable language for knowledge
representation.
● I2. (meta)data use vocabularies that follow FAIR
principles
● I3. (meta)data include qualified references to other
(meta)data
Re-usable
● R1. (meta)data are richly described with a plurality of
accurate and relevant attributes
● R1.1. (meta)data are released with a clear and accessible
data usage license
● R1.2. (meta)data are associated with detailed provenance
● R1.3. (meta)data meet domain-relevant community
standards
Good Metadata
DOI for your data
Use of Appropriate Repository
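Several of the principles on this slide correspond to fields that a metadata record either has or lacks. The Python sketch below checks a record for a few of them. The field names and their mapping to principles are assumptions made for illustration, not an official FAIR assessment, and every value in the record is made up.

```python
# Hypothetical mapping from metadata fields to the FAIR principle each supports.
FAIR_FIELDS = {
    "identifier": "F1 (globally unique, persistent identifier)",
    "description": "F2 (rich metadata)",
    "license": "R1.1 (clear data usage license)",
    "provenance": "R1.2 (detailed provenance)",
}


def missing_fair_fields(record: dict) -> list:
    """Return the principles whose supporting field is absent or empty."""
    return [
        principle
        for field, principle in FAIR_FIELDS.items()
        if not record.get(field)
    ]


# Illustrative record; all values are made up.
record = {
    "identifier": "doi:10.0000/example",
    "description": "Example dataset description",
    "license": "",
}
for principle in missing_fair_fields(record):
    print("Missing:", principle)
```

A check like this only confirms that a field is present. Whether the metadata are actually rich and accurate still needs human review.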
17. An NIDDK Resource dknet.org
Resource sharing plan →
Data Management and Sharing Plan
18. An NIDDK Resource dknet.org
Required Elements of the NIH Plan
Data Type
Related Tools, Software and/or Code
Standards
Data Preservation, Access, and Associated Timelines
Access, Distribution, or Reuse Considerations
Oversight of Data Management and Sharing
How compliance with the Plan will be monitored and managed,
frequency of oversight, and by whom (e.g., titles, roles).
Adapted from Ghosh et al. 2022
Not a part of review
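One way to start a plan is to generate an outline with one heading per required element, in the order listed on this slide. A minimal Python sketch; the function name and the placeholder text are hypothetical, and the output is not an official NIH template:

```python
# The six element titles listed on the slide, in slide order.
REQUIRED_ELEMENTS = [
    "Data Type",
    "Related Tools, Software and/or Code",
    "Standards",
    "Data Preservation, Access, and Associated Timelines",
    "Access, Distribution, or Reuse Considerations",
    "Oversight of Data Management and Sharing",
]


def plan_skeleton(elements=REQUIRED_ELEMENTS) -> str:
    """Return a plain-text outline with a numbered heading per element."""
    lines = []
    for number, title in enumerate(elements, start=1):
        lines.append(f"Element {number}: {title}")
        lines.append("  [to be written]")
    return "\n".join(lines)


skeleton = plan_skeleton()
print(skeleton)
```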
19. An NIDDK Resource dknet.org
Data Type
• Type and amount/size
• Modality
• Level of aggregation
• Degree of data processing
• Which data and why
• A brief listing of the metadata
• Other relevant data, and
• Associated documentation to
facilitate interpretation of the
scientific data.
Example of some metadata needed
• Acquisition: Instrument, Protocol
• Biosample: Tissue, Cell, Organoid
• Participant/Donor
• Assay
• Analytics
Example data types shown: Voltage traces, Spike times, Fluorescence traces, Optophysiology, MRI, Microscopy
Adapted from Ghosh et al. 2022
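The metadata categories on this slide can be captured in a small structured record that travels with the data file. The Python sketch below uses one field per category. The class name, field types, and all values are hypothetical; a real project would follow the schema of its target repository.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DatasetMetadata:
    # One field per metadata category named on the slide.
    acquisition: dict  # instrument, protocol
    biosample: dict    # tissue, cell, or organoid
    participant: str   # participant/donor label
    assay: str
    analytics: str


meta = DatasetMetadata(
    acquisition={"instrument": "example microscope", "protocol": "example protocol v1"},
    biosample={"type": "tissue", "name": "example tissue"},
    participant="donor-01",
    assay="example assay",
    analytics="example analysis pipeline v0.1",
)

# Serialize to JSON so the description can be stored next to the data.
sidecar = json.dumps(asdict(meta), indent=2, sort_keys=True)
print(sidecar)
```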
20. An NIDDK Resource dknet.org
Standards
• Data formats
  ○ CSV/TSV, PNG, TIFF
  ○ NWB, NIfTI, OME-Zarr, SWC
• Data dictionaries
• Data identifiers
  ○ Sub-01, Ses-02
• Definitions
  ○ Ontologies
• Unique identifiers
  ○ UUID, RRID
• Other data documentation
• Quality assurance, control
xkcd: 927
Adapted from Ghosh et al. 2022
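Three items from this slide are easy to put into practice early: consistent data identifiers, unique identifiers, and a data dictionary. The Python sketch below shows one way to do each with the standard library. The lowercase `sub-`/`ses-` casing, the column names, and the dictionary layout are assumptions, not requirements of any particular standard.

```python
import csv
import io
import uuid


def label(prefix: str, n: int) -> str:
    """Zero-padded label such as sub-01 or ses-02."""
    return f"{prefix}-{n:02d}"


# A random (version 4) UUID to tag one record.
record_uuid = str(uuid.uuid4())

# Hypothetical data dictionary: one entry per column in a data table.
DATA_DICTIONARY = [
    {"column": "subject", "type": "string", "description": "Subject label, e.g. sub-01"},
    {"column": "session", "type": "string", "description": "Session label, e.g. ses-02"},
    {"column": "record_uuid", "type": "string", "description": "Random UUID for the row"},
]


def dictionary_as_csv(entries) -> str:
    """Render the data dictionary as CSV text, ready to save next to the data."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["column", "type", "description"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


print(label("sub", 1), label("ses", 2), record_uuid)
print(dictionary_as_csv(DATA_DICTIONARY))
```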
21. An NIDDK Resource dknet.org
What standards should I use?
● Repositories often enforce specific
standards for metadata and data
● Thinking about where your data will
end up before you start your
experiments will help you determine
how to collect, annotate and
organize your (meta)data
● FAIRsharing.org maintains a database
of standards and policies across
biomedicine
22. An NIDDK Resource dknet.org
Related Tools, Software and/or Code
• Any specialized tools needed
to access or manipulate
shared scientific data
• How needed tools can be
accessed
• Whether such tools are
likely to remain available for
as long as the scientific data
remain available.
Adapted from Ghosh et al. 2022
This should include the following information, if applicable:
● Which statistical package or program was used to
manipulate the data, along with the version of the
software that was used and any packages, scripts, or
settings that were used or developed during the course of
the study, as well as how users can access the software
● Whether there were any custom workflows or pipelines
developed as part of the study necessary to analyze or
process the data, and how users can access them
● Whether there were any executable programs or macros
written as part of the study necessary to analyze or
process the data, as well as how users can access the code
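Software names and versions are easier to report if they are captured automatically when the analysis runs. The Python sketch below uses only the standard library; the function name and the example package list are hypothetical, and it covers Python packages only.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages):
    """Collect interpreter, OS, and package versions for a plan or README."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": {},
    }
    for name in packages:
        try:
            report["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the gap instead of failing, so the report is always complete.
            report["packages"][name] = "not installed"
    return report


# "pip" is only an example; list the packages your analysis actually used.
report = environment_report(["pip", "a-package-that-is-not-installed"])
for key, value in report.items():
    print(key, value)
```

Saving this report alongside the shared data gives later users the version information the plan asks for.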
23. An NIDDK Resource dknet.org
Data Preservation, Access, and Associated Timelines
• Name of the repository(ies)
• How data will be findable and
identifiable
• Timeline
Adapted from Ghosh et al. 2022
24. An NIDDK Resource dknet.org
Lesson: Think from the beginning about
where your data will end up
Best practice: Submit your data to a repository specialized for your type of data or your
domain
If there isn’t one, general purpose repositories are also available
25. dknet.org
An NIDDK Resource
Where Can I Deposit My Data?
• List of DK relevant
repositories,
recommended by NLM
and various journals
• Created in conjunction
with NIDDK
• Coming soon: FAIR data
wizard
● FAIR Standards
● Clinical Repositories
Information
● Data maintenance
● Data size limit and cost
● Dynamic database
https://dknet.org/rin/suggested-data-repositories
26. An NIDDK Resource dknet.org
Access, Distribution, or Reuse Considerations
• Informed consent
• Privacy and confidentiality
protections
• Whether access to scientific data
derived from humans will be
controlled
• Any restrictions imposed by
federal, Tribal, or state laws,
regulations, or policies, or existing
or anticipated agreements
• Any other considerations that may
limit the extent of data sharing.
Adapted from Ghosh et al. 2022
27. An NIDDK Resource dknet.org
Data Management and Sharing Plan
● Creating a good data management
and sharing plan allows you to:
○ Comply with NIH mandates
○ Ensure that you allocate
enough resources for
preparing and sharing your
data
○ Ensure that you collect your
data in a FAIR manner
○ Easily share data with yourself,
future you, your colleagues
and the scientific community
● dkNET provides links to resources
that can help
https://dknet.org/rin/research-data-management
28. An NIDDK Resource dknet.org
Data Management and Sharing Plan
It benefits your scientific work!
• Helps plan ahead
  ○ Start alongside proposal, well before data collection
  ○ Human and technological resources
  ○ Cost: $$, Effort
• Cost
  ○ Training/re-training
  ○ Time
  ○ Resources
• Benefit
  ○ Reduction in $$, effort, fewer surprises
  ○ Organizing data and metadata reduces transition effort
  ○ Open data opens new avenues and may reduce collection cost
  ○ Using standards lowers development cost
Adapted from Ghosh et al. 2022
34. An NIDDK Resource dknet.org
Repository Wizard
• Rate 80
repositories against
FAIR, Open,
Trustworthy,
Citable principles
and other criteria
• Wizard guides a
user through
criteria that may be
important for
selection
Coming Soon
35. dknet.org
An NIDDK Resource
Get involved in the dkNET Community
dkNET Homepage: dkNET.org
Check out dkNET Resour
Join Webinar
Follow us
@dknet_info
Check Out or Post
News and Funding
Opportunities
Blog, Calendar
Sign up email list