Slides accompanying a presentation given by Dan Gillean of Artefactual Systems at the PERICLES/DPC joint conference and meeting, "Acting on Change: New Approaches and Future Practices in LTDP," held in London at the Wellcome Collection Conference Center, Nov 30 - Dec 2, 2016.
The talk examines the question of the Capacity Gap - why is it that we have so many tools, services, standards, models, and metrics to support digital preservation, but so many organizations feel they do not have the capacity or capability to begin tackling digital preservation within their institution?
The presentation offers a different take based on Dan's experience working as an analyst and consultant for a software development company engaging with many different types of organizations and individuals in the cultural heritage sector. While acknowledging that the under-resourced nature of cultural heritage work plays a key role, this presentation examines some oft-encountered perceptual or cognitive barriers to getting started with digital preservation. It then provides some suggestions on how to overcome these barriers, acknowledging that anything is better than nothing when it comes to DP, and that sometimes perfect can be the enemy of good.
Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preservation
1. Do Something Now:
Why Perfect is the Enemy of Good
(Enough) in Digital Preservation
Starting Blocks at Vacant Starting Line Before Event, by tableatny: https://www.flickr.com/photos/53370644@N06/4976494944
Dan Gillean, MAS, MLIS
Acting on Change Conference
London – November 30, 2016
2. Restating the Problem:
We have many tools,
services, standards, models,
and metrics designed to
support digital preservation
and access!
Happy cat, by panli54 - https://www.flickr.com/photos/53911972@N03/4988877591
3. …And yet, many institutions
and organizations feel they
do not have the capacity or
capability to begin seriously
addressing digital
preservation.
Restating the Problem:
Cat-Sad-Annoyed, by Robert Tortorelli - https://www.flickr.com/photos/39969232@N08/16760847493
24. • Governance
• Organizational structure
• Staffing
• Procedural accountability
• Preservation policy framework
• Documentation
• Financial sustainability
• Security
ISO 16363
Reminds us that much of digital
preservation readiness is not technical
– it’s organizational
25. NDSA Levels of Preservation
Levels: Level 1 (Protect), Level 2 (Know), Level 3 (Monitor), Level 4 (Repair)
Storage and Geographic Location
Level 1 (Protect):
• 2 complete copies not collocated
• Get media off diverse storage media and into a system
Level 2 (Know):
• At least 3 complete copies
• At least 1 in different geographic location
• Document storage system, media, and what’s needed to use them
Level 3 (Monitor):
• At least 1 copy in location with different disaster threat
• Obsolescence monitoring process for storage system and media
Level 4 (Repair):
• At least 3 copies in locations with different disaster threats
• Comprehensive plan to keep files and metadata on currently accessible media or systems
File Fixity and Data Integrity
Level 1 (Protect):
• Fixity check on ingest if checksum provided with content
• Create fixity info if not provided on transfer
Level 2 (Know):
• Check fixity on all ingests
• Use write-blockers with original media
• Virus check high-risk content
Level 3 (Monitor):
• Fixity checks at regular intervals
• Maintain fixity logs and supply audit on demand
• Ability to detect corrupt data
• Virus check all content
Level 4 (Repair):
• Check fixity in response to specific events/activities
• Ability to replace/repair corrupted data
• Ensure no one has write access to all copies
Information Security
Level 1 (Protect):
• Identify who has read, write, move, and delete authorizations
• Restrict who has those authorizations to individual files
Level 2 (Know):
• Document access restrictions for content
Level 3 (Monitor):
• Maintain logs of who performed what actions on files, incl. deletions and preservation actions
Level 4 (Repair):
• Perform audit of logs
Metadata
Level 1 (Protect):
• Inventory of content and its storage locations
• Ensure backup and non-collocation of inventory
Level 2 (Know):
• Store admin metadata
• Store transformative metadata and log events
Level 3 (Monitor):
• Store standard technical and descriptive metadata
Level 4 (Repair):
• Store standard preservation metadata
File Formats
Level 1 (Protect):
• Encourage creators to use open formats and codecs when possible
Level 2 (Know):
• Inventory of file formats in use
Level 3 (Monitor):
• Monitor file format obsolescence issues
Level 4 (Repair):
• Perform format migrations, emulation, etc. as needed
Adapted from: http://ndsa.org/activities/levels-of-digital-preservation/
26. AVPreserve – 16363/NDSA mappings
NDSA Levels of Preservation – Categories (quantity of NDSA Levels of Preservation criteria / quantity of related ISO 16363 criteria)
• Storage and Geographic Location: 9 / 34
• File Fixity and Data Integrity: 12 / 29
• Information Security: 5 / 22
• Metadata: 6 / 50
• File Formats: 4 / 32
• (Unmappable from ISO 16363): - / 23
Blog post: https://www.avpreserve.com/papers-and-presentations/mapping-standards-for-richer-assessments-ndsa-levels-of-digital-preservation-and-iso-163632012/
Mappings: https://www.avpreserve.com/wp-content/uploads/2016/05/ISO-Requirements-by-NDSA-LoDP-Categories.xlsx
Slides: http://www.avpreserve.com/wp-content/uploads/2014/07/NDSA_ISO_Presentation_2014.pdf
36. Seek out
stakeholders
and build
your case
Unique Hotels, “Board Room - Vihula Manor Country Club & Spa.”
https://www.flickr.com/photos/62485988@N05/5692789910
40. Do Something Now
Starting Blocks at Vacant Starting Line Before Event, by tableatny: https://www.flickr.com/photos/53370644@N06/4976494944
info@artefactual.com
Editor's Notes
List is highly selective – does not include all tools, services, and standards
Standards: does not include content standards, only a couple metadata exchange standards like EAD
Line between service and a tool is blurred – e.g. Dataverse, Preservica, LOCKSS
Does not cover major version changes of tools, or formalization of standards (e.g. TRAC becoming ISO 16363)
Many tools listed are open source, but a few aren’t (Preservica, Rosetta, etc). Means barrier isn’t financial.
420 tools as of November 2016
Let me pause here a moment to acknowledge that one of the prime barriers remains resources – human and financial. The cultural heritage sector is vastly underfunded, and yet we’re being expected to do so much more now.
I don’t have a magic solution to the funding problem. Today I’m going to focus instead on cognitive barriers to undertaking digital preservation, drawn from my experience at Artefactual, where we have had the opportunity to work with dozens of clients, large and small.
Information overload. Too many options, not enough guidance. When the answer to so many questions in our field is, “it depends,” it can be difficult to know where to begin. In some ways, our successes in advancing the state of the resources available have also contributed to the cognitive barrier some people feel around knowing where to start.
I want to acknowledge that this feeling is related in ways to Imposter Syndrome, the sense that we’re frauds in a field of experts and the belief that our folly will be exposed at any moment. This is a wide-ranging issue written about eloquently by many others, and which has a disproportionate effect in our profession when looked at through the lens of gender and gender identity, race, and other factors affecting privilege. It’s a topic well worth examining further, but in another talk.
21 tools just for file format identification. If we take a closer look, we can see that some of these tools actually incorporate each other:
FITS: incorporates 2 other tools on this page: JHOVE, DROID
Other tools not listed here, like Archivematica, incorporate FITS
So how do we figure out which tool we should use? How to figure out when to use FITS and when to use Siegfried? (the answer is: it depends. Meaning you can only really learn by doing, and by asking questions.)
When it comes to relating to technology itself, there are 2 opposing but related mentalities we have encountered doing work around digital preservation. The first is the Black Box:
A suspicion of, and failure to understand what the tools do and don’t do, how they work, etc. This can lead to overly complex workflows, or a failure to get started altogether. There is the hope that the problem will be handed off to someone else, or that some new proof will emerge to either confirm or deny these suspicions that the technology is untrustworthy – and until then, no action should be taken.
The converse mindset to this is the Magic Wand – that is, magical thinking about the power of technology to “solve” digital preservation, with the process fully automated at the push of a button. Set and forget. If a tool is not fully automated, it must be inferior and not worth considering. And yet, so much of digital preservation is about more than just tools.
Big Picture Paralysis is, contrary to its name, a way of obsessing over details. It’s the thinking that, “We can't do anything until we've thought through every edge case, every possible future problem, every variable and use case.” It’s a desire for control over the unknown that provides relief by promising that inaction is better than failed or short term action. This is definitely a case where perfect becomes the enemy of the good.
If you have 10GB of video now, but you think you might have 10PB in 5 years, does this mean you should do nothing to properly store, monitor, and preserve the video you do have?
Of course there are always those to whom our standards and best practices just don’t apply. The Special Snowflake Effect is the sentiment that, “our records, our workflows, our needs are so unique and specialized, that the existing standards or tools cannot possibly meet our needs.”
This can lead to custom metadata profiles and bespoke systems. More knowledge required for preservation becomes siloed to specific individuals or systems; the burden of documentation is higher, and the efforts required to migrate environments or share access across institutional boundaries become increasingly challenging.
I will borrow another way of looking at this from a presentation that Cassie Findlay of Recordkeeping Innovation recently made at PASIG NYC – The Toothbrush Principle. “Standards are like toothbrushes. Everyone thinks they are a good idea, but no one wants to use someone else’s.” We have definitely encountered this thinking around tools as well.
Within our community of practice however, we can sometimes collectively make perfect the enemy of the good – or good enough. This is the 927 problem: seeking the magic bullet format, technology, or standard that will supersede all previous efforts and bring about a golden age of universal adoption. The 927 problem is the reinvention of the wheel, over and over again, sometimes at the expense of previous efforts. It can often mean just adding one more option to a crowded field, and further bisecting our efforts along parallel but separate paths.
Finally, we arrive at the Tools Fetishist. The techno-solutionist, who becomes overly focused on the technical aspects before working out the goals, strategies, and workflows involved in getting there. This thinking assumes that policy, procedure, documentation, staffing, training, and so forth are things that can be tacked on at the end of creating a digital preservation environment, instead of being established first and used in the selection of the tools. If you’ve chosen a hammer with this method, everything can start to look like a nail, and the workflows will soon be built around the tools, instead of the other way around.
There are surely more behaviors we can enumerate, but the point of doing so is not to shame anyone who has ever thought about digital preservation in these ways, but rather to say, okay, where do we go from here? How can we approach this differently? I’d like to offer a few brief suggestions.
Understanding what materials you’re working with is critical to preparing a digital preservation strategy. If we’re aiming to be pragmatic, to do something now, then we need to forgo worrying about the edge case formats, and focus on what we are currently tasked with preserving.
This means building an inventory. It means understanding the types of file formats in your care – are they open formats, or proprietary? Common or rare? Do you have a large diversity of formats to contend with, or a smaller set of recurring ones? Once we know what we are working with, we can begin to build a strategy around what needs to be done – both minimally, and optimally.
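As a rough illustration of what a first-pass inventory could look like, here is a minimal Python sketch. It is only an assumption about one way to start, not a method from the talk: the function names build_inventory and summarize_formats are hypothetical, and the file extension is used as a crude stand-in for the format. Signature-based identification tools such as DROID or Siegfried will give far more reliable answers.

```python
import os
from collections import Counter
from pathlib import Path


def build_inventory(root):
    """Walk `root` and return one record per file: relative path, extension, size."""
    root = Path(root)
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = Path(dirpath) / name
            records.append({
                "path": path.relative_to(root).as_posix(),
                # Assumption: the extension is only a rough proxy for the real format.
                "extension": path.suffix.lower() or "(none)",
                "bytes": path.stat().st_size,
            })
    return records


def summarize_formats(records):
    """Count files per extension, most common first."""
    return Counter(record["extension"] for record in records).most_common()
```

Calling summarize_formats(build_inventory("/path/to/collection")) on a folder of your own (the path here is a placeholder) gives a quick picture of whether you are dealing with a handful of recurring formats or a long tail of rare ones.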
If we acknowledge that digital preservation efforts remain vastly under-resourced, then it makes sense for each of us to be contributing to solutions that will benefit all. DP best practice already embraces many of these principles – open source and open documentation will allow us to collaborate and share, while open formats and open standards ensure that our efforts will remain accessible and interpretable in the long-term.
It can be scary, but the best thing you can do, no matter where you’re at in your preservation efforts, is to use some kind of metric to get a clear sense of how you’re doing before you determine what’s next. Self-assessment will help you figure out what you have the capacity to improve immediately, versus what you will need to plan to address over time.
Using a metric or model also helps make the requirements behind a trustworthy digital preservation environment concrete – they provide clear benchmarks which can be used to help identify concrete actions that will improve your preservation readiness.
Equally important to note is that digital preservation is not all tools and systems – much of it is organizational, covering internal policies and procedures, workflow documentation and accountability chains, mission statements, budgeting, staffing and succession planning. Regardless of your resources or the technical expertise you have in-house, considering and prioritizing these important aspects means that you can start working on digital preservation today.
Created in 2013 by the National Digital Stewardship Alliance
Provides you with 4 levels across 5 categories, with a total of 36 criteria
Can be less intimidating than ISO 16363 as a starting place
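To show how approachable the early levels can be, here is a minimal Python sketch of the fixity criteria: creating fixity information when none is provided, and checking it again later. It is an illustration under stated assumptions rather than a recommended tool: the function names are hypothetical, SHA-256 is one reasonable choice of checksum among several, and the manifest is kept as a plain dictionary of relative path to checksum.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute a SHA-256 checksum, reading in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def create_manifest(root):
    """Create fixity info: map each file's relative path to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root, manifest):
    """Re-check fixity; return (missing, changed) lists of relative paths."""
    root = Path(root)
    missing, changed = [], []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file():
            missing.append(rel_path)
        elif sha256_of(target) != expected:
            changed.append(rel_path)
    return missing, changed
```

Running verify_manifest on a schedule and keeping the results would be one simple way to work toward the "fixity checks at regular intervals" and "maintain fixity logs" criteria.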
In 2014, Bertram Lyons of AVPreserve presented his analysis of the NDSA levels against ISO 16363. He has shared his work via the AVPreserve website, so you can use the levels as an entry point into the more granular requirements outlined in 16363.
There are many other maturity models and self-assessment tools out there – one more of note is the Digital Preservation Capability Maturity Model, developed by Charles Dollar and Lori Ashley in 2013. It provides 5 stages or levels that can be used for assessment in 15 categories – 8 related to Digital Preservation Infrastructure, and 7 related to Digital Preservation Services.
Dollar and Ashley even created a free online assessment tool that can be used with the model, which can be found at www.digitalok.org
All this to say – pick a tool or metric that makes sense to you as a starting place. Even if you haven’t started formally thinking about digital preservation in your institution, run through the model and save your results. Now do it again in a year, so you can see the progress you’ve made.
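One lightweight way to "run through the model and save your results" is sketched below in Python. This is a hypothetical helper, not an official NDSA tool: it assumes the levels are cumulative (a category only reaches Level 2 once every Level 1 and Level 2 criterion is met), and the function names and the JSON layout are illustrative choices.

```python
import json
from datetime import date


def level_reached(criteria_met):
    """criteria_met maps a level number (1-4) to a list of booleans, one per criterion.

    Assumption: levels are cumulative, so counting stops at the first level
    that has an unmet criterion (or no answers at all).
    """
    reached = 0
    for level in (1, 2, 3, 4):
        checks = criteria_met.get(level, [])
        if not checks or not all(checks):
            break
        reached = level
    return reached


def assess(answers):
    """Return the level reached in each category."""
    return {category: level_reached(levels) for category, levels in answers.items()}


def save_results(results, path, when=None):
    """Write dated results to a JSON file so next year's run can be compared."""
    record = {"date": (when or date.today()).isoformat(), "levels": results}
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2)
    return record
```

For example, an institution that encourages open formats and keeps a format inventory, but does not yet monitor obsolescence, would pass {1: [True], 2: [True], 3: [False], 4: [False]} for File Formats and land at Level 2 in that category.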
Ask questions, and help answer them
Submit documentation (or request it if you can’t write it)
Attend or organize meetups, user groups, and skillshares
Watch or deliver webinars
Help translate resources into another language
Make conference presentations
Write a blog post
Fill out a wiki page or add a review
If we acknowledge that many of our challenges stem from a lack of resources, then we need to make clear to our stakeholders the value in investing in digital preservation, and the consequences of ignoring it.
There are resources out there that can help you do this. For example, the Digital Preservation Coalition has assembled a DP Business Case Toolkit, full of tips and resources on how to make your case.
They also have a page linking to dozens of other related resources. It’s important to consider how doing an internal self-audit with a recognized metric or model can help you build your case. It can clarify what the expectations are and where your organization is falling short, and later, it can also help you demonstrate the progress you’ve made and justify further support.
Finally, be public. We can learn from each other’s failures as much as successes.
Remember, what matters is that you do SOMETHING. Digital Preservation Readiness is not all or nothing – it should be anything or nothing. All our tools, standards, maturity models, guides, and services are only useful if they help us start doing the work we need to do. Once we start, it becomes easier with each step to move forward. The move from nothing to something, to good enough, to better, to optimal is a long iterative process, and the analysis that results from a self-assessment can take longer than most institutions expect - if you start with your gaze set too firmly on the optimal, you might never get to good enough (or even something).
And while I never thought I would find myself quoting a former President during a digital preservation conference presentation, I will surprise myself by acknowledging that Theodore Roosevelt’s words can hardly apply more perfectly to this subject:
“Do something Now. If not you, who? If not here, where? If not now, when?”
Thank you.