This is a guest post by Shira Peltzman.
Last month, Alice Prael and I gave a presentation at the annual Code4Lib conference in which I mentioned a project I’ve been working on to update the NDSA Levels of Digital Preservation so that it includes a metric for access. (You can watch the full presentation on YouTube, beginning at the 1:24:00 mark.)
For anyone who is unfamiliar with the NDSA Levels: it’s a tool developed in 2012 by the National Digital Stewardship Alliance as a concise, user-friendly rubric to help organizations manage and mitigate digital preservation risks. The original version of the Levels of Digital Preservation includes four columns (Levels 1-4) and five rows. The columns/levels range in complexity from the least you can do (Level 1) to the most you can do (Level 4). Each row represents a different conceptual area: Storage and Geographic Location; File Fixity and Data Integrity; Information Security; Metadata; and File Formats. The resulting matrix contains a tiered list of concrete technical steps that correspond to each of these preservation activities.
I have long wanted to expand the NDSA Levels so that the table includes a means of measuring an organization’s progress with regard to access. I’m a firm believer in the idea that access is one of the foundational tenets of digital preservation. It follows that if we are unable to provide access to the materials we’re preserving, then we aren’t really doing such a great job of preserving those materials in the first place.
When it comes to digital preservation, I think there’s been an unfortunate tendency to give short shrift to access, to treat it as something that can always be addressed in the future. In my view, the lack of any access-related fields within the current NDSA Levels reflects this.
Of course I understand that providing access can be tricky and resource-intensive in general, but particularly so when it comes to born-digital material. From my perspective, this is all the more reason for the NDSA Levels to include a row that helps institutions measure, build, and enhance their access initiatives.
While some organizations use NDSA Levels as a blueprint for preservation planning, other organizations — including the UCLA Library where I work — employ NDSA Levels as a means to assess compliance with preservation best practices and identify areas that need to be improved.
In fact, it was in this vein that the need for a row explicitly addressing access originally arose. After I suggested that we use the NDSA Levels as a framework for our digital preservation gap analysis, it quickly became apparent that its failure to address access would be a blind spot too great to ignore.
Providing access to the material in our care is so central to UCLA Library’s mission and values that failing to assess our progress/shortcomings in this area was not an option for us. To address this, I added an Access row to the NDSA Levels designed to help us measure and enhance our progress in this area.
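An assessment like this lends itself to very lightweight tooling. As a rough illustration, here is a minimal sketch in Python of recording the level an organization has reached in each conceptual area and flagging the rows that lag behind; the scores are invented example data, not UCLA’s actual results.

```python
# Minimal gap-analysis sketch: record the level (0-4) reached in each
# NDSA conceptual area, then flag the rows that fall below a target.
# The scores below are invented example data, not UCLA's actual results.

assessment = {
    "Storage and Geographic Location": 3,
    "File Fixity and Data Integrity": 2,
    "Information Security": 2,
    "Metadata": 1,
    "File Formats": 2,
    "Access": 0,  # the proposed new row
}

def gap_report(scores, target=2):
    """Return (row, level) pairs below the target level, worst first."""
    gaps = [(row, level) for row, level in scores.items() if level < target]
    return sorted(gaps, key=lambda pair: pair[1])

for row, level in gap_report(assessment):
    print(f"{row}: at Level {level}, below the target level")
```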
My aims in crafting the Access row were twofold. First, I wanted to acknowledge the OAIS reference model by explicitly addressing the creation of Dissemination Information Packages (which in turn necessitated mentioning other access-related terms like Designated Community, Representation Information, and Preservation Description Information). This resulted in the row feeling rather jargon-heavy, so eventually I’d like to adjust it so that it better matches the tone and language of the rest of the chart.
Second, I tried to remain consistent with the model already in place. That meant designing the steps for each column/level so that they are both content-agnostic and system-agnostic, and can be applied to various collections or systems. For the sake of consistency I also tried to maintain the sub-headings for each column/level (i.e., “protect your data,” “know your data,” “monitor your data,” and “repair your data”), even though some have questioned their usefulness in the past; for more on this, see the comments at the bottom of Trevor Owens’ blog post.
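Because several of these OAIS terms recur in the Access row below, a rough sketch may help ground them. The class and field names here are my own illustrative shorthand for the OAIS concepts, not the API of any particular repository system.

```python
# Rough sketch of the OAIS package types referenced in the Access row.
# Class and field names are illustrative shorthand, not a real repository API.
from dataclasses import dataclass

@dataclass
class SIP:
    """Submission Information Package: content and metadata as received
    from the information producer."""
    content_files: list
    producer_metadata: dict

@dataclass
class AIP:
    """Archival Information Package: content and metadata as managed by
    the repository for preservation."""
    content_files: list
    representation_information: dict  # what is needed to render the bits
    preservation_description: dict    # provenance, reference, fixity, etc.

@dataclass
class DIP:
    """Dissemination Information Package: what is delivered to a member
    of the designated community in response to a request."""
    access_copies: list
    descriptive_metadata: dict

def ingest(sip: SIP) -> AIP:
    """Sketch of ingest: wrap submitted content with the information the
    repository needs in order to preserve it (details elided)."""
    return AIP(
        content_files=sip.content_files,
        representation_information={},
        preservation_description={"provenance": sip.producer_metadata},
    )

def generate_dip(aip: AIP) -> DIP:
    """Sketch of access: derive a deliverable package from the stored AIP."""
    return DIP(access_copies=list(aip.content_files), descriptive_metadata={})
```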
While I’m happy with the end result overall, these categories map better in some instances than in others. I welcome feedback from the digital preservation community at large about how they could be improved. I have deliberately set the permissions to allow anyone to view and edit the document, since I’d like this to be something to which the whole community can contribute.
Fortunately, the NDSA Levels was designed to be iterative. In fact, in a paper titled “The NDSA Levels of Digital Preservation: An Explanation and Uses,” published shortly after the NDSA Levels’ debut, its authors solicited feedback from the community and acknowledged plans to revise the chart in the future. Tools like this ultimately succeed because practitioners push for them to be modified and refined so that they better serve the community’s needs. I hope that enough consensus builds around some of the updates I’ve proposed for them to eventually be incorporated into the next iteration of the NDSA Levels, if and when it is released.
My suggested updates appear in the last row of the Levels of Preservation table below, labeled Access. If you have any questions, please contact me: Shira Peltzman, Digital Archivist, UCLA Library, speltzman@library.ucla.edu | (310) 825-4784.
LEVELS OF PRESERVATION

Storage and Geographic Location
Level One (Protect Your Data): Two complete copies that are not collocated. For data on heterogeneous media (optical disks, hard drives, etc.), get the content off the medium and into your storage system.
Level Two (Know Your Data): At least three complete copies. At least one copy in a different geographic location. Document your storage system(s) and storage media and what you need to use them.
Level Three (Monitor Your Data): At least one copy in a geographic location with a different disaster threat. Obsolescence monitoring process for your storage system(s) and media.
Level Four (Repair Your Data): At least three copies in geographic locations with different disaster threats. Have a comprehensive plan in place that will keep files and metadata on currently accessible media or systems.

File Fixity and Data Integrity
Level One: Check file fixity on ingest if it has been provided with the content. Create fixity info if it wasn’t provided with the content.
Level Two: Check fixity on all ingests. Use write-blockers when working with original media. Virus-check high-risk content.
Level Three: Check fixity of content at fixed intervals. Maintain logs of fixity info; supply audit on demand. Ability to detect corrupt data. Virus-check all content.
Level Four: Check fixity of all content in response to specific events or activities. Ability to replace/repair corrupted data. Ensure no one person has write access to all copies.

Information Security
Level One: Identify who has read, write, move, and delete authorization to individual files. Restrict who has those authorizations to individual files.
Level Two: Document access restrictions for content.
Level Three: Maintain logs of who performed what actions on files, including deletions and preservation actions.
Level Four: Perform audit of logs.

Metadata
Level One: Inventory of content and its storage location. Ensure backup and non-collocation of inventory.
Level Two: Store administrative metadata. Store transformative metadata and log events.
Level Three: Store standard technical and descriptive metadata.
Level Four: Store standard preservation metadata.

File Formats
Level One: When you can give input into the creation of digital files, encourage use of a limited set of known open file formats and codecs.
Level Two: Inventory of file formats in use.
Level Three: Monitor file format obsolescence issues.
Level Four: Perform format migrations, emulation, and similar activities as needed.

Access
Level One: Determine designated community.¹ Ability to ensure the security of the material while it is being accessed; this may include physical security measures (e.g., someone staffing a reading room) and/or electronic measures (e.g., a locked-down viewing station, restrictions on downloading material, restricting access by IP address, etc.). Ability to identify and redact personally identifiable information (PII) and other sensitive material.
Level Two: Have publicly available catalogs, finding aids, inventories, or collection descriptions so that researchers can discover material. Create Submission Information Packages (SIPs) and Archival Information Packages (AIPs) upon ingest.²
Level Three: Ability to generate Dissemination Information Packages (DIPs) on ingest.³ Store Representation Information and Preservation Description Information.⁴ Have a publicly available access policy.
Level Four: Ability to provide access to obsolete media via its native environment and/or emulation.
¹ Designated Community essentially means “users”; the term comes from the Reference Model for an Open Archival Information System (OAIS).
² The Submission Information Package (SIP) is the content and metadata received from an information producer by a preservation repository. An Archival Information Package (AIP) is the set of content and metadata managed by a preservation repository, organized in a way that allows the repository to perform preservation services.
³ A Dissemination Information Package (DIP) is distributed to a consumer by the repository in response to a request, and may contain content spanning multiple AIPs.
⁴ Representation Information refers to any software, algorithms, standards, or other information necessary to properly access an archived digital file. Or, as Preservation Metadata and the OAIS Information Model puts it, “A digital object consists of a stream of bits; Representation Information imparts meaning to these bits.” Preservation Description Information refers to the information necessary for adequate preservation of a digital object: for example, Provenance, Reference, Fixity, Context, and Access Rights Information.
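Finally, since the File Fixity and Data Integrity row is often the easiest place to start, here is a minimal sketch of what “create fixity info” on ingest and “check fixity at fixed intervals” can look like using only Python’s standard library; the manifest format and file paths are invented for illustration.

```python
# Minimal fixity sketch using only the standard library: create checksum
# information at ingest, then re-verify it at a fixed interval or in
# response to an event. The manifest format here is invented.
import hashlib
import json
from pathlib import Path

def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Hash a file in chunks so large files do not exhaust memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def create_fixity_manifest(directory, manifest_path="fixity.json"):
    """Record a checksum for every file under `directory` (ingest step)."""
    manifest = {str(p): file_checksum(p)
                for p in Path(directory).rglob("*") if p.is_file()}
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))

def verify_fixity(manifest_path="fixity.json"):
    """Re-check every recorded file; report anything missing or altered."""
    manifest = json.loads(Path(manifest_path).read_text())
    for path, expected in manifest.items():
        if not Path(path).exists():
            print(f"MISSING: {path}")
        elif file_checksum(path) != expected:
            print(f"ALTERED: {path}")
```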