Presentation given at the Open Repositories 2018 conference in Bozeman, Montana, 6th June 2018. Starting with an assessment of the UK open access repository environment, this presentation asks broader questions about the state of the open repository landscape globally. In response to a report to the UK government on open access, Universities UK have set up a repositories working group to identify issues where common benefit can be delivered and actions that can be taken. In this talk I will combine my own assessment of the repository landscape with a summary of the work of the working group and its recommendations. The presentation will also introduce work underway at the British Library to address some of the issues the working group has identified, including an assessment of a national OA preservation solution and a shared-services repository infrastructure. I will make the case that to realise the benefits of open repositories we need to move away from the model of locally hosted repositories.
For repositories to succeed they have to end. Reflections on (not just) the UK repository scene
1. For repositories to succeed they
have to end. Reflections on (not
just) the UK repository scene
Dr Torsten Reimer
Head of Research Services
Torsten.Reimer@bl.uk / @torstenreimer
http://orcid.org/0000-0001-8357-9422
Open Repositories 2018, Bozeman MT, 6 June 2018
2. Authors do not care about
repository systems
https://www.flickr.com/photos/jasonbain/33169141452/ CC BY NC ND 2.0
3. Readers do not care about
repository systems
https://www.flickr.com/photos/p_marione/10353933614/ CC BY NC ND 2.0
4. What matters is the repository
function, its purpose
https://www.flickr.com/photos/missrogue/1064784666/ CC BY SA 2.0
5. The repository function is the
same as that of a library: help
people to find, access and use
information – persistently.
https://www.flickr.com/photos/andrewgustar/16793367681/ CC BY ND 2.0
6. You don’t need a local system to
deliver this function.
https://www.flickr.com/photos/han_shot_first/7771438844/ CC BY 2.0
7. We too often think about locally
developed systems.
https://www.flickr.com/photos/mukluk/207619079/ CC BY 2.0
8. As a result we build local
systems, and, from a global
perspective, maintain them badly.
https://www.flickr.com/photos/waynerd/6677201937/ CC BY NC ND 2.0
9. Conceptualising the repository
service mostly as a local repository
system has dangers:
• Inefficiency (duplication of effort)
• Systems-over-service approach
• Over-customisation hinders
interoperability and staying
up-to-date
10. We should focus on repository
services instead of systems.
We should ask ourselves whether
it is always best to develop and
host our own repository system.
https://www.flickr.com/photos/betsyweber/8734581153/ CC BY 2.0
11. www.bl.uk 11
UK context: why do we care?
• UK has very strong mandates:
– Government assessment of universities
requires open access for articles, with
green OA as default route
– Research Councils require open access to
articles; green default for most universities
– Research data has to be made available
for 10 years minimum
• Substantial increase in OA material:
– 54% of UK articles available openly within
12 months in 2016 (vs 32% globally)
– Imperial College example: 300 manuscript
deposits in 2012, 11,000 in 2016
https://www.universitiesuk.ac.uk/policy-and-analysis/reports/Pages/monitoring-transition-open-access-2017.aspx
12. What the UK could have done in 2012
• Procure preservation
solution
• Procure access
solution
• Mandate deposit of
all scholarly content
on this platform
• Provide portals for each higher education institution (HEI)
• Provide interfaces so HEIs and other platforms can push
and pull content
[Diagram labels: content deposited by machines and authors; national preservation solution; discovery & access for humans and machines.]
13. Repositories as a national concern
• In February 2016, the Department for Business, Innovation
& Skills published an independent report on open access,
written by Professor Adam Tickell.
• One of his recommendations was “that the British Library,
Research Libraries UK and the Society of College, National
and University Libraries (SCONUL) convene, with
appropriate support, to advise as to the best mechanisms to
ensure that there is at least one permanent copy of an
open access publication and that due regard is given to
long term curation of digital assets.”
14. UUK repositories working group
• Universities UK Open Access Coordination Group with different
stakeholders (government, funders, research organisations,
publishers, libraries)
• A repositories working group was set
up to look into Adam Tickell’s
recommendations.
• Meetings throughout 2017, a
workshop at the British Library and
a survey of repository managers
across UK universities.
• Report not yet published ->
take everything below as my personal opinion.
15. UK OA landscape - positives
• Strong community
• Track record for open solutions
• Jisc services
• UK PubMed Central
• Technical expertise
• EThOS
• Many repositories
• Policies
16. Challenges, (not just) for the UK
1. Concerns about sustainability of the
underlying repository software package
2. Difficulties with CRIS system integration
3. Difficulties with integration with university
systems (other than CRIS)
4. Difficulties with maintaining custom
functionality
5. Changes in publisher and/or
funder policies altering the compliance
status of articles
6. Lack of integration with identifiers
(such as ORCID or DOIs)
7. Limitation of reuse through deposit
licence (‘all rights reserved’), e.g.
for text and data mining
8. Limited/no facilities (such as API) to
support text and data mining
9. Linking publications to related datasets
(and vice versa)
10. Linking publications to relevant funders
11. Management effort for journal
embargoes
12. No or limited preservation functionality
13. Not enough resource to update from
older/out-of-date version of repository
software
14. Not enough staff resource for
operational management
15. Reporting facilities not sufficient for
funder reporting
16. Reporting facilities not sufficient for
internal reporting
17. Technical support: lack of skills /
capability
18. Technical support: not enough capacity
19. Tracking/integrating AAMs deposited in
subject/other institutional repositories
(REF OA policy)
20. Usability and user interface issues
https://doi.org/10.5281/zenodo.1136075
17. Selected WG recommendations
• [Many recommendations on metadata, persistent identifiers etc.]
• A study into the feasibility of a national preservation solution be
undertaken, recognising that the British Library and Jisc are key
stakeholders.
• HEIs, Jisc, subject repositories and other stakeholders take
forward as a high priority improvements in the user experience.
• A study be conducted to explore the need for national repository
solutions or ‘hubs’ for one or all of the big challenges –
discoverability, sustainability and preservation. This study will
consider costs and benefits, and ultimately seek to define the
guiding principles and services […].
18. A national preservation solution?
• The British Library is already preserving the nation’s
published output, working with UK legal deposit libraries.
• However, we are only allowed to give access on our
premises and it doesn’t cover UK content not published by
UK publishers (unless it is in our web archive). So the
current solution isn’t fit for (this) purpose.
• A solution wouldn’t require a single national repository. It
could be done by pushing/pulling content from repositories
to one or more preservation platforms. From a preservation
perspective this may be better than a single system.
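The "pulling" half of that idea can be sketched with OAI-PMH, the harvesting protocol that many repository platforms expose. This is a minimal illustration, not a description of any system the British Library has built: the base URL, the `fetch` callable and the function names are assumptions made for the example, and a real preservation workflow would also need to retrieve the files themselves, not just record identifiers.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL for one page of records."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it replaces metadataPrefix on follow-up requests.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None) for one page."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text.strip()
        for el in root.iterfind(".//oai:record/oai:header/oai:identifier", OAI_NS)
        if el.text
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text.strip() if token_el is not None and token_el.text else None
    return identifiers, token


def harvest(base_url, fetch):
    """Yield all record identifiers from one repository, page by page.

    `fetch` is a hypothetical callable taking a URL and returning the response
    body as text, so networking, retries and throttling stay outside the sketch.
    """
    token = None
    while True:
        url = list_records_url(base_url, resumption_token=token)
        identifiers, token = parse_page(fetch(url))
        yield from identifiers
        if not token:
            break
```

A preservation platform could run `harvest` against each participating repository on a schedule, which keeps the repositories independent while still producing at least one additional copy elsewhere.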
19. A national discovery solution?
• [Isn’t that Google? ;-)] May or may not be useful, but does
not require a single, national repository.
https://www.flickr.com/photos/62954923@N03/15625479052/ CC BY 2.0
20. What could a national repository fix?
1. Concerns about sustainability of the
underlying repository software package
2. Difficulties with CRIS system integration
3. Difficulties with integration with university
systems (other than CRIS)
4. Difficulties with maintaining custom
functionality
5. Changes in publisher and/or
funder policies altering the compliance
status of articles
6. Lack of integration with identifiers
(such as ORCID or DOIs)
7. Limitation of reuse through deposit
licence (‘all rights reserved’), e.g.
for text and data mining
8. Limited/no facilities (such as API) to
support text and data mining
9. Linking publications to related datasets
(and vice versa)
10. Linking publications to relevant funders
11. Management effort for journal
embargoes
12. No or limited preservation functionality
13. Not enough resource to update from
older/out-of-date version of repository
software
14. Not enough staff resource for
operational management
15. Reporting facilities not sufficient for
funder reporting
16. Reporting facilities not sufficient for
internal reporting
17. Technical support: lack of skills /
capability
18. Technical support: not enough capacity
19. Tracking/integrating AAMs deposited in
subject/other institutional repositories
(REF OA policy)
20. Usability and user interface issues
https://doi.org/10.5281/zenodo.1136075
21. One system to rule them all?
• A single repository platform sounds tempting.
• However:
– Can one monolith meet all needs?
– Competition has benefits!
– It is getting easier to exchange data
between systems.
– Multi-tenancy is hard.
– Resilience vs. single point of failure.
– Local resistance to giving up in-house system.
• With a government mandate unlikely and no obvious technology
solution, a gradual move to more shared services seems the more
likely path. If they are good enough, we might still get there.
22. The British Library in this space
• A new service strategy for
the Library’s role as the
national research library.
• Key element: enhance
the national collection with
services that open up and
help to sustain a global
knowledge environment.
http://doi.org/10.1629/uksg.409
23. Repository-related plans
• Re-develop national
preservation system (>5m
items, petabyte-scale) into
a multi-tenancy service
• Develop access layer with
multiple (logical) repositories
• Pilot for a multi-tenancy
access repository
(12 partner organisations)
• Consider national OA
preservation approach
• Discussions with Jisc and
others on partnerships
[Diagram labels: Preservation Layer; Services Layer; Access Layer containing EThOS, Data.bl.uk, BL Institutional Repository and Partner Repositories.]
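The layering on this slide, one shared preservation layer underneath several logical repositories, can be illustrated with a very small sketch. Everything here is hypothetical: the class names, the in-memory dictionary standing in for petabyte-scale storage, and the SHA-256 fixity check are assumptions chosen for the example, not details of the British Library's design, and the services layer is left out entirely.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class PreservationLayer:
    """One shared store for all tenants, keyed by the SHA-256 of each object."""
    _objects: dict = field(default_factory=dict)

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self._objects[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        data = self._objects[digest]
        # Re-hash on read so silent corruption is detected (a simple fixity check).
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("fixity check failed")
        return data


@dataclass
class LogicalRepository:
    """Access-layer view for one tenant; it sees only its own catalogue."""
    tenant: str
    store: PreservationLayer
    _catalogue: dict = field(default_factory=dict)  # title -> digest

    def deposit(self, title: str, data: bytes) -> str:
        digest = self.store.put(data)
        self._catalogue[title] = digest
        return digest

    def titles(self):
        return sorted(self._catalogue)

    def fetch(self, title: str) -> bytes:
        return self.store.get(self._catalogue[title])
```

Two partner repositories would be created as `LogicalRepository("partner-a", store)` and `LogicalRepository("partner-b", store)` over the same `PreservationLayer`: each lists only its own deposits, while preservation is handled once for everyone.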
24. Concluding thoughts
• Stop customising and forking.
• Think in terms of services,
not systems.
• Only develop your own systems if
you can do it better than others.
• Can we at least gradually move to
more shared services please?
• We need an internationally
coordinated approach to
preservation.