This document discusses the data skills required of librarians and presents a matrix of factors that influence these skills, including the librarian's role, the data lifecycle services provided by the library, and the research intensity of the institution. It notes the wide range of possible data-related skills and acknowledges that no individual can master all of them, emphasizing the need for librarians to work as a team with complementary skills. The document also examines questions around how librarians can become more involved in data science and what their future roles may be in supporting data-intensive research.
Data Skills Matrix for Librarians
1. Research Data Lifecycle: Data Skills for
Librarians
Kathryn Unsworth (ANDS)
RSCD - BoF
13 February 2017
2. What data skills do librarians require?
There’s a matrix of possibilities
- it’s complex!
https://etherpad.openstack.org/p/RSCD_2017_Data_lifecycle_&_libs
3. Elements of the data skills for librarians matrix
• Current librarian role - where and at what level it connects with
research/researchers – data-related scope
• Future aspirations of the librarian – data-related scope
• Aspects of the Data Lifecycle the Library provides services for
• RDM maturity of the institution (dictates services provided)
• Research intensity of the institution (dictates services provided)
= data skills and knowledge required
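One way to read the "=" on this slide is as a lookup: the services a library offers, or intends to offer, determine the skills its librarians need. The following is a minimal sketch of that reading, not a model proposed in the presentation. The class `MatrixProfile`, the function `required_skills`, the maturity and intensity labels, and the service-to-skill groupings in `SKILLS_BY_SERVICE` are all assumptions made for illustration, loosely drawn from skills named on later slides.

```python
from dataclasses import dataclass, field

# Hypothetical service-to-skill groupings, for illustration only.
SKILLS_BY_SERVICE = {
    "Data cleaning": {"Scripting/coding", "Data screening and preparation"},
    "Data citation": {"Persistent identifiers", "Metadata standards"},
    "Data archiving and preservation": {"File format identification",
                                        "Metadata standards"},
    "Data visualisation": {"Scripting/coding", "Analysis documentation"},
}


@dataclass
class MatrixProfile:
    """One librarian's position in the matrix (all field names are assumptions)."""
    role: str
    rdm_maturity: str          # recorded only; the slide says it dictates services
    research_intensity: str    # recorded only; the slide says it dictates services
    services: set = field(default_factory=set)
    intended_services: set = field(default_factory=set)


def required_skills(profile):
    """Union of the skills behind every current or intended service."""
    skills = set()
    for service in profile.services | profile.intended_services:
        # Services missing from the mapping contribute no skills.
        skills |= SKILLS_BY_SERVICE.get(service, set())
    return skills


profile = MatrixProfile(
    role="Liaison Librarian",
    rdm_maturity="emerging",
    research_intensity="high",
    services={"Data citation"},
    intended_services={"Data cleaning"},
)
print(sorted(required_skills(profile)))
```

RDM maturity and research intensity are stored but not used in the calculation, because the slide describes them as dictating which services are provided rather than adding skills directly.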
4. Where and at what level you connect with your
research communities:
● Based mainly in the library?
● Embedded in research teams/labs?
● Hot desking it out of faculty spaces?
● Roving in a Research Commons (not only the Library)?
● Undertaking research support services (e.g. data
consultations, data collection/collating, data cleaning) out
in the (research) field?
● A combination?
5. Which parts of the Data Lifecycle your Library actively
supports with services:
● Data seeking for analysis
● Data documentation
(metadata)
● Data citation
● Data storage and
backups
● Data sharing
● Data archiving and
preservation
● Data and teaching
● Data cleaning
● Data visualisation
● DMP tools and advice
+ any intended new services
6. Your role - what proportion is research & data related?
Hybrid or specialist role?
Librarian
Liaison Librarian
Metadata Librarian
Subject Librarian
Research Librarian
Scholarly Communications Librarian
Repository Manager
Repository Officer
Data Librarian
Research Data Management – Librarian
Data Science Librarian
and more…
Same title, but
different
responsibilities
at University X
vs University Y
7. Discussion: Are there other
elements that we can add to
the matrix?
https://etherpad.openstack.org/p/RSCD_2017_Data_lifecycle_&_libs
8. How do librarian roles and
data skills and knowledge
intersect with the Data
Lifecycle?
14. Domain knowledge – best practice (current
research methods and data models)
Connecting researchers to research / research
data
“Map knowledge/data gaps”
“Identify emerging disciplinary cross-overs”
“Assist in the formulation and refinement of
innovative research questions”
“Digital tools to automate Literature reviews –
Meta, CHORUS system?”
“Applying network analysis to visualise trends
in emerging research”
“Tools to map key research terms in articles
– where are the terms appearing?”
“Text and data mining techniques for refining
research questions”
15. Project planning
• Project management (PRINCE2, Agile, Waterfall,
Critical Path, etc.)
• Business analysis – requirements gathering
• Problem solving, troubleshooting
Collaboration tools/platforms (OSF, Confluence,
Syncplicity, Google Apps)
RDMPs (DMPtool, DMPonline)
• Project governance – roles and responsibilities
• Data standards
• Data organisation – file formats, file naming
conventions, versioning, etc.
• Ethics and privacy – consent for sharing
• Copyright and other IP, Licensing
• Data storage
• Data security
Funder and publisher requirements for
data
Digital literacies training
16. Data search/discovery:
• Discovery tools and services
• Locate existing data
• Full text search
• Text and data mining
• Web APIs to discover, extract, enrich existing data
Data organisation
Data collection methods – generating new data,
transforming legacy data, sharing/exchanging data,
purchasing data
Metadata capture and creation tools and services
Metadata standards:
• Data description
• Controlled vocabularies
• Metadata modeling
• Interoperability
Pattern recognition
Collaboration tools and platforms
Databases, including relational
Version tracking
Reference/citation management
Storage options for working, master, raw,
sensitive and big data
Data appraisal and selection
Licensing – data access/sharing agreements
Data security
17. Storage options for active data, collaborative research,
data and metadata flows
Data security
Access rules
Data cleaning
Data aggregation
Machine learning/algorithms – graphical modeling
Scripting/coding
Data mapping across data sources
Data transforms, e.g. raster to shapefile
Lab notebooks (eLNs)
Data screening and preparation
Iterative data changes prompted by analysis
Preparing data for long-term preservation
and sharing
Process documentation – process
diagrams, workflows, tools and automation
18. Data visualisation
Storage options for active data
Data security, including access controls
Data manipulation
Text and data mining
Scripting/coding
Machine learning
Analysis: Statistical, Spatial, Image
Analysis documentation
Modeling
Interpretation
Database programming (querying DBs)
Problem solving/troubleshooting
Analytical thinking
19. Why share data?
Author/Creator rights
Data catalogs and portals
Sensitive data
Access rules
Metadata standards
• Descriptive metadata
• Controlled vocabularies
Persistent identifiers (DOIs, ORCIDs)
Data citation
Data licensing
Performance/Impact metrics
Programming – front-end – editing web page source
code, incorporate forms, multimedia
Contributor badges
Communication
Storytelling
Data visualisation
Client engagement
Advocacy
20. Persistent identifiers (DOIs, ORCIDs)
Using tools to identify file formats
Conversion to access and preservation
formats/mediums
Batch/automation
Data decoding
Data warehousing
Data archives and repositories
Long-term archival storage for final-state data
Metadata standards
• Descriptive metadata for discovery
• Provenance and other administrative
metadata
Disposition – disposing of obsolete or
redundant data, or archival retention
21. Licensing – legal framework around how data
can be (re)used
Reuse documentation (code, simulations,
models, protocols, workflows, etc.)
Impact and assessment metrics (Altmetrics,
PlumX, ImpactStory)
Data for teaching
Data citation – how and why to cite data
22. Whole of lifecycle activities
• Describing and contextualising data (metadata,
documentation, associated research outputs)
• Managing data quality
• Storage, backups and security
23. Are you kidding me?
Who has the capacity to
attain all these skills?
24. Teams, not unicorns
“Team-building is another important tactic
in tackling the skills gap. There is little point
looking for the great, single all-rounder
who can do everything – the mythical
unicorn. Even if such people existed (and
they may) they would be too expensive as
they can walk into any job. It is much more
profitable to look across the skill-set
required and build a team to fulfil it.”
Read more at: http://www.techweekeurope.co.uk/e-management/skills/bridging-data-science-skills-gap-requires-team-effort-160818#msRDJHrzR8QUhLHa.99
Copyrighted Image - Data Science Roles
https://libraryconnect.elsevier.com/articles/learning-about-research-data-lab-pitt-ischool
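The "look across the skill-set required and build a team to fulfil it" idea can be sketched as a simple coverage check: pool the skills of every team member and see which required skills remain uncovered. This is only an illustrative sketch. The function name `skill_gaps`, the team members, and the skill assignments below are hypothetical and do not come from the presentation or the quoted article.

```python
def skill_gaps(required, team):
    """Return the required skills that no team member currently covers.

    required: iterable of skill names the service needs
    team: dict mapping a member (or role) to the set of skills they hold
    """
    covered = set().union(*team.values())  # everything the team can do together
    return set(required) - covered


# Hypothetical team and requirements, for illustration only.
team = {
    "Liaison Librarian": {"Data citation", "Client engagement"},
    "Data Librarian": {"Scripting/coding", "Data cleaning"},
    "Repository Manager": {"Metadata standards", "Persistent identifiers"},
}
required = {"Data citation", "Scripting/coding",
            "Metadata standards", "Data visualisation"}

print(skill_gaps(required, team))
```

In this made-up example the only gap is data visualisation, which suggests either training an existing team member or recruiting for that skill rather than searching for a single person who covers everything.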
25. So what’s the minimum data skills requirement for librarians?
Is there an optimal level? Maybe even an aspirational level?
Are we talking about all librarians or only those with data-
related responsibilities?
As an academic librarian is it ok to just be “data aware” or do
we all need to be “data savvy” or maybe something in
between?
Discussion…
https://etherpad.openstack.org/p/RSCD_2017_Data_lifecycle_&_libs
26. What is a Data Savvy Librarian?
“...librarians need increasingly to become data-savvy themselves and to have a
deeper understanding of the research data lifecycle in order to enhance the
services they offer.”
“...the main requirement is a basic familiarity with how various software tools can
transform data.” And, “...to learn the basics of some of the latest tools for extracting,
analyzing, storing, and visualizing data.”
“...working directly with messy, unavailable or difficult to-access data it is possible to
have a more complete vision of the different issues the researchers have to face
when working with data.”
Barbaro, A. (2016). On the importance of being a data-savvy librarian. Journal of EAHIL, 12(1):24-27
28. The research librarian of the future:
data scientist and co-investigator
There remains something of a disconnect between how
research librarians themselves see their role and its responsibilities and
how these are viewed by their faculty colleagues. Jeannette Ekstrøm,
Mikael Elbaek, Chris Erdmann and Ivo Grigorov imagine how the
research librarian of the future might work, utilising new data science and
digital skills to drive more collaborative and open scholarship. Arguably
this future is already upon us but institutions must implement a
structured approach to developing librarians’ skills and services to fully
realise the benefits.
29. Core duties versus ‘stretch’ services
The research librarian community is not in consensus as to what exactly
are the emerging roles of future librarians in a rapidly evolving digital
scholarship environment (see #libraryfutures). Added to the polarised
views within that community, a recent survey shows there is also a clear
gap in perception and expectations between librarians and faculty staff.
While librarians surveyed agreed that “information literacy” and “aiding
students one-on-one in conducting research” are primary and essential
roles, they viewed “supporting faculty research” as less important than
their faculty colleagues did. So does this present an opportunity in the digital
age?
32. The Role of Librarians in Data Science: A Call to Action
“All of this hesitancy on the part of librarians to participate in the data
movement is happening at a time when we have seen an increase in the money
and involvement in data initiatives from a range of other professions and
academic disciplines (e.g. computer science, informatics, etc.). For me, this is an
especially critical moment for librarians to talk about data and actively plan and
implement our strategies collectively.
I want to share with you a proposed framework for the librarian’s role in
data science. I come to the discussion with the fear that data science is an
evolving academic discipline being defined solely by computer science and that
the field of library and information science is being left behind. I would argue
that the principles and values of the field of library and information science that
form the core of our profession need to be part of this new discipline and that we
can add unique perspectives and roles.”
(Opinion piece by Elaine R. Martin, 2015)
33. Data Science – is there a future where you see
librarians filling the DS skills gap?
What’s your next data skills challenge?
https://etherpad.openstack.org/p/RSCD_2017_Data_lifecycle_&_libs
Discussion points:
34. Acknowledgements
Barbaro, A. (2016). On the importance of being a data-savvy librarian. Journal of EAHIL, 12(1):24-27
https://www.researchgate.net/publication/299394172_On_the_importance_of_being_a_data-savvy_librarian
Ekstrøm, J., Elbaek, M., Erdmann, C., & Grigorov, I. (2016). The research librarian of the future: data scientist and co-investigator.
The Impact Blog LSE. http://blogs.lse.ac.uk/impactofsocialsciences/2016/12/14/the-research-librarian-of-the-future-data-scientist-and-co-investigator/
Faundeen, J.L., Burley, T.E., Carlino, J.A., Govoni, D.L., Henkel, H.S., Holl, S.L., Hutchison, V.B., Martín, Elizabeth, Montgomery,
E.T., Ladino, C.C., Tessler, Steven, and Zolly, L.S., 2013, The United States Geological Survey Science Data Lifecycle Model: U.S.
Geological Survey Open-File Report 2013–1265, 4 p., https://doi.org/10.3133/ofr20131265.
Macrae, D. (2015). Why Bridging The Data Science Skills Gap Requires A Team Effort. TechWeek Europe.
http://www.techweekeurope.co.uk/e-management/skills/bridging-data-science-skills-gap-requires-team-effort-160818#25vTmJ6UpzSfI20F.99
Martin, Elaine R. (2015). "The Role of Librarians in Data Science: A Call to Action." Journal of eScience Librarianship 4(2): e1092.
http://dx.doi.org/10.7191/jeslib.2015.1092
Library Journal Research. (2015). Bridging the Librarian-Faculty Gap in the Academic Library. Gale Cengage Learning.
https://s3.amazonaws.com/WebVault/surveys/LJ_AcademicLibrarySurvey2015_results.pdf
University of Central Florida Libraries Research Lifecycle Committee. (2012). The research lifecycle at UCF [Online Graphic].
Retrieved February 13, 2017, from library.ucf.edu/ScholarlyCommunication/ResearchLifecycleUCF.php
35. With the exception of logos, third party images or where otherwise indicated, this
work is licensed under the Creative Commons Australia Attribution 3.0 Licence.
ANDS is supported by the Australian
Government through the National Collaborative
Research Infrastructure Strategy Program.
Monash University leads the partnership with
the Australian National University and CSIRO.
Kathryn Unsworth - ANDS Outreach Officer and Data Librarian
kathryn.unsworth@ands.org.au