Hands-On Data Management Planning for Life Sciences
1. Hands-On Data Management Planning for Life Sciences
Andrew Sallans, Head of Strategic Data Initiatives, University of Virginia Library, als9q@virginia.edu
Andrea Denton, Research and Data Services Manager, Claude Moore Health Sciences Library, ash6b@virginia.edu
Creative Commons License: "Hands-On Data Management Planning for Life Sciences", 3/19/13, by Andrew L. Sallans is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
2. Goals for the workshop
• Learn about data management planning
• Learn about available resources
• Develop rough draft of a data management plan for a grant
• Gain peer and expert feedback
3. Why should you care about data management planning?
• It’s good science: reproducible results and continuity
• Transparency and accountability
• Gain a competitive edge in grant competition
• Get credit by making your data citable, more impact
• Be efficient and avoid data loss
• It’s complex and requires attention to many parts
• You may be required to by your government, funder, institution, publishers, etc.
4. Why not?
http://memegenerator.net/Fist-Pump-Baby
5. Recent news
• White House, Office of Science and Technology Policy from February 22, 2013
• Federal research agencies funding more than $100M/year must develop plan to make the results (papers and data) of federally funded research available to the public within one year of publication
• Also requires researchers to better account for and manage data
http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf
6. Example: National Science Foundation
– Data Sharing Policy: Award & Administration Guide Chapter VI.D.4
– Data Management Plan requirement: Grant Proposal Guide Chapter II.C.2.j
– Additional requirements from individual Directorates and Divisions (e.g., BIO, CISE, EHR, GEO, MPS, SBE): Dissemination and Sharing of Results
7. Caveat: it’s not just the NSF
• CDC, NEH, DOE, NIH, EPA, USDA, IMLS, NASA, NOAA
• Private and public foundations
• Many research funding agencies in the U.K., Australia, and other countries
• Etc…
Read calls for proposals carefully and ask program director about specific data management requirements. Build time into your proposal development to formulate a data management plan!
8. What is a Data Management Plan?
• A comprehensive plan of how you will manage your research data throughout the lifecycle of your research project
AND
• Brief description of how you will comply with funder’s data sharing policy
• Reviewed as part of a grant application
9. Dissemination & Sharing of Research Results
“Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing.”
National Science Foundation: Award & Administration Guide (AAG) Chapter VI.D.4
10. Plan for Data Management & Sharing of the Products of Research
As of January 18, 2011:
“Proposals must include a supplementary document of no more than two pages labeled “Data Management Plan”. This supplement should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results, and may include…”
NSF: Grant Proposal Guide (GPG) Chapter II.C.2.j
11. Which NSF requirement to use?
• Which Guideline Should I follow?
§ First: follow the requirements laid out in the specific solicitation, if any.
§ Second: follow the guidelines published by the appropriate NSF directorate and/or division. If there is a conflict, the latter takes precedence.
§ Third: follow the more general guidelines.
• Interdisciplinary Proposals
§ Use guidelines appropriate to the lead program (if there are specific guidelines)
12. Parts of a (Generic) NSF Data Management Plan
I. Products of the Research: The types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project.
II. Data Formats: The standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies).
III. Access to Data and Data Sharing Practices and Policies: Policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements.
IV. Policies for Re-Use, Re-Distribution, and Production of Derivatives.
V. Archiving of Data: Plans for archiving data, samples, and other research products, and for preservation of access to them.
Grant Proposal Guide (GPG) Chapter II.C.2.j
13. I. Types of Data
• Questions to answer:
§ What data will be generated in the research?
§ What data types will you be creating or capturing?
§ How/when/where will you capture or create the data?
§ How will the data be processed?
§ If you will be using existing data, state that fact and include where you got it. What is the relationship between the data you are collecting and the existing data?
14. II. Data and Metadata Standards
• Questions to answer:
§ Which file formats will you use for your data, and why?
§ What form will metadata describing/documenting your data take?
§ How will you create or capture these details?
§ Which metadata standards will you use and why have you chosen them?
§ What contextual details (metadata) are needed to make the data you capture or collect meaningful?
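One way to make the questions above concrete is to draft a small metadata record for a single data file. The sketch below is only an illustration: the field names, the helper functions `build_metadata` and `missing_fields`, and the sample values are all hypothetical and are not taken from any particular metadata standard or from the workshop materials.

```python
import json

# Hypothetical metadata record for one data file. Field names are
# illustrative only, not drawn from a specific metadata standard.
def build_metadata(title, creator, created, file_format, description, method):
    """Return a dictionary of contextual details for a single data file."""
    return {
        "title": title,
        "creator": creator,
        "date_created": created,
        "file_format": file_format,
        "description": description,
        "collection_method": method,
    }

def missing_fields(record):
    """List the fields that are empty, so gaps are caught before deposit."""
    return [key for key, value in record.items() if not value]

record = build_metadata(
    title="Example plate reader measurements",
    creator="Example Lab",
    created="2013-03-19",
    file_format="CSV",
    description="",  # left blank on purpose to show the gap check
    method="Absorbance readings exported from instrument software",
)

# Serialize the record so it can be stored alongside the data file.
metadata_json = json.dumps(record, indent=2)
gaps = missing_fields(record)
```

A record like this can be kept next to the data file and later mapped onto whichever metadata standard the chosen repository expects.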
15. III. Policies for Access and Sharing & Provisions for Appropriate Protection/Privacy
• Questions to answer:
§ How/when will you make the data available?
§ What is the process for gaining access to the data?
§ Does the original data collector/creator/principal investigator retain the right to use the data before opening it up to wider use?
§ Are there any embargo periods for political/commercial/patent reasons? If so, give details.
16. III. Policies for Access and Sharing & Provisions for Appropriate Protection/Privacy (Cont.)
• More Questions to answer:
§ Are there ethical and privacy issues? If so, how will these be resolved?
§ What have you done to comply with your obligations in your IRB Protocol?
§ Who will hold the intellectual property rights to the data and how might this affect data access?
17. IV. Policies and Provisions for Re-Use, Re-Distribution
• Questions to answer:
§ Will any permission restrictions need to be placed on the data?
§ Which bodies/groups are likely to be interested in the data?
§ What could be the intended uses of the data?
18. V. Plans for Archiving and Preservation of Access
• Questions to answer:
§ What data will be preserved for the long-term?
§ What is the long-term strategy for maintaining, curating and archiving the data?
§ Which archive/repository/database have you identified as a place to deposit data?
§ What procedures does your intended long-term data storage facility have in place for preservation and backup?
§ How long will/should data be kept beyond the life of the project?
19. V. Plans for Archiving and Preservation of Access (Cont.)
• More Questions to answer:
§ What transformations will be necessary to prepare data for preservation / data sharing?
§ What metadata/documentation will be submitted alongside the data or created on deposit/transformation in order to make the data reusable?
§ What related information will be deposited?
20. What needs to be in a data management plan for a grant?
Example: NSF
• Two pages long
• Reviewed for merit and/or impact with proposal
• Your plan should minimally address five points:
– Data being produced
– Format and description
– Access and sharing
– Reuse
– Archiving
Be sure to address additional Directorate or Division guidelines and specific program requirements!
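The five points above can double as a simple completeness check on a draft plan. The following is a minimal sketch under stated assumptions: the section keys, the `unaddressed_points` helper, and the sample draft text are hypothetical labels invented for this illustration and are not part of any NSF or DMPTool interface.

```python
# The five points named on the slide. The keys are hypothetical labels
# chosen for this sketch, not an official NSF or DMPTool schema.
REQUIRED_POINTS = [
    "data_produced",
    "format_and_description",
    "access_and_sharing",
    "reuse",
    "archiving",
]

def unaddressed_points(draft):
    """Return the required points that have no text in the draft plan."""
    return [point for point in REQUIRED_POINTS
            if not draft.get(point, "").strip()]

# Sample draft: one section left empty and one omitted entirely.
draft_plan = {
    "data_produced": "Microscopy images and derived measurement tables.",
    "format_and_description": "TIFF images; CSV tables with a README.",
    "access_and_sharing": "",
    "archiving": "Deposit in an institutional repository.",
}

todo = unaddressed_points(draft_plan)
```

A check like this only confirms that each point has some text; it does not verify the two-page limit or any Directorate, Division, or program-specific requirements.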
21. Three Data Management Planning Resources
• DMPTool, http://dmptool.org – Helps you create a data management plan to meet grant requirements and identify UVA support resources and policies
• Databib, http://databib.org – Helps you find an appropriate place to deposit your data
• Libra, http://libra.virginia.edu – Helps UVA faculty, graduate students, and staff by providing a place to deposit and share datasets
22. http://dmptool.org
• Step-by-step wizard for generating DMP
• Create | edit | re-use | share | save | generate
• Open to community
• Links to institutional resources
• Directorate information & updates
23. Goals of the DMPTool
I. To provide researchers a simple way to create a DMP for their funding agency
• Questions asked by the agency
• Additional explanation/context provided by the agency
• Links to the agency website for policies, help, guidance
24. Goals of the DMPTool
II. To provide researchers with DMP information from their home institution
• Resources and services to help them manage data
• Help text for specific questions
• Suggested answers to questions; easy to cut-N-paste
• News & events related to data management on campus
25. Last point: Grant requirements versus ideal
• Grant Driven
– Requirements
– Sharing and public access to research
• Operational
– Research continuity
– Avoiding data loss
– Efficiency
26. Team Exercise
30 minutes
1. Identify a grant that you have or might apply for.
2. Locate the requirements for that grant in the DMPTool.
3. Go through plan sections in DMPTool workflow to produce draft plan.
– Be sure to address metadata, access policies, repositories.
4. Identify solutions and available support through DMPTool sections or ask for guidance.
5. Record issues and questions for discussion.
27. Presentation of Draft DMPs
15 minutes
• Identify grant
• Describe project briefly
• Explain requirements
• Describe planned solutions
– Must address metadata, access policies, and repositories.
28. Questions and Discussion?
29. Follow-up
• Contact the Scientific Data Consulting Group for help with DMP preparation
– Grant driven: http://www2.lib.virginia.edu/brown/data/DMP_Support.html
– Operational
• Email: scidac@virginia.edu