The following is a guest post by Jen LaBarbera, National Digital Stewardship Resident at Northeastern University Library.
It’s hard to believe that I only have two and a half months left in this residency. Despite Boston’s interminable winter (officially the snowiest on record), my time as a National Digital Stewardship Resident at Northeastern University has flown by!
As with many digital preservation projects, while my residency is technically under the auspices of Northeastern’s Archives and Special Collections, my work is closely tied to a few other areas of the library. My desk is in the Library Technology Services suite, but about half of the folks I work most closely with are in the Digital Scholarship Group, and I’m in fairly regular contact with our metadata librarians.
At present, my project involves four distinct but related goals. The first three involve developing workflows for three basic categories of digital material in the archives; the fourth is broader in scope:
- Born-digital materials (ingest recently born-digital content from Our Marathon to a new digital repository service).
- Digitized materials (ingest digitized images and documents for two Latino collections – Inquilinos Boricuas en Acción and La Alianza Hispana – to the new digital repository service).
- Legacy digital materials (make accessible the “box of disks” from the Hispanic Office of Planning and Evaluation records).
- Develop a digital preservation plan for Northeastern.
Of course, this last goal is enormous, and could easily be its own separate project for a resident. When I started at Northeastern in September, there was a lot of interest in developing a digital preservation plan for the institution. Most of this work will happen after my residency ends, but I am in the process of creating a light framework for Northeastern’s future in planning for digital preservation.
As you can see, the first two goals involve developing workflows as they relate to Northeastern’s new digital repository service (DRS). When I started here in the fall, Northeastern was in the middle of launching this beautiful new custom-built Fedora-based institutional repository. Along with the release of this upgrade from their old repository service, of course, comes the development of new workflows. As the resident at Northeastern, it is my job to develop these workflows for the three types of digital material that the Archives and Special Collections deal with: digitized, recently born-digital and legacy born-digital. The collections mentioned in the list above are to act as test cases for these workflows.
To date, I’ve spent the most time working on the ingest of recently born-digital material: specifically, moving a digital archive from one platform (in this case, Omeka) into the new DRS. The digital archive we’re working with is Our Marathon, a crowd-sourced and curated archive of pictures, videos, stories and social media related to the Boston Marathon bombing on April 15, 2013. It was created as a public, community archive, so people were encouraged to submit their own stories, primarily in the form of images and testimonials. It also includes some more curated material from traditional archival sources like WBUR, Boston City Archives and the Countway Medical Library, among others.
Our Marathon was created as a digital humanities project in our Digital Scholarship Group, but unlike a lot of digital humanities projects, it was created in close consultation with the archives and special collections staff from the start. This means that for the most part, we’re luckily working with some very solid and robust metadata. As a crowd-sourced archive and a digital humanities project, though, this collection poses a number of interesting metadata-related and structural challenges. For example, there are some images in the crowd-sourced collection that may not be particularly unique, but include a description that acts as a testimonial; in essence, there are two items that should be captured in one item’s container. We’re of course pulling the individual items and their descriptive and technical metadata over from the Omeka site, but we spent a good deal of time wrestling with the best way to preserve the structure and intellectual choices made by the creators of the digital archive on Omeka.
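To make the "two items in one container" problem a little more concrete, here is a minimal sketch of how such an item might be modeled. It is only an illustration: the class names, the field names and the shape of the exported item dictionary are all hypothetical, and do not reflect the actual Omeka export or the DRS data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Component:
    """One intellectual unit inside a container (hypothetical model)."""
    role: str                      # e.g. "image" or "testimonial"
    content_ref: str               # a file name, or the text itself
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class ItemContainer:
    """A single repository item that may hold more than one component."""
    source_id: str
    components: List[Component] = field(default_factory=list)


def container_from_item(item: dict) -> ItemContainer:
    """Build a container from one exported item.

    `item` is an assumed dictionary with "id", "filename" and optional
    "title" and "description" keys; a real export will look different.
    """
    container = ItemContainer(source_id=str(item["id"]))
    container.components.append(
        Component(
            role="image",
            content_ref=item["filename"],
            metadata={"title": item.get("title") or ""},
        )
    )
    # A non-empty description is kept as its own component, so the
    # testimonial is preserved alongside the image rather than lost
    # inside a metadata field.
    description = (item.get("description") or "").strip()
    if description:
        container.components.append(
            Component(role="testimonial", content_ref=description)
        )
    return container
```

In practice, deciding whether a description really is a testimonial is a curatorial call; the sketch simply treats any non-empty description as one.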
It’s pretty clear that each digital humanities project like this has its own unique intellectual and technical challenges for digital preservation, which makes developing a blanket workflow for this ingest a little difficult. I’m attempting to create a framework for a workflow that can be both sturdy enough to provide consistency and versatile enough to be applied to other digital humanities projects as they are created and then ingested into the repository for long-term preservation.
I’m also developing a workflow for digitized material in Northeastern’s Latino collections, consisting of images and documents that were digitized from records of two Latina/o community organizations. (The Archives’ collection scope includes not only institutional records but also the records of Boston-area social justice organizations.) This project will be a little simpler, as at least some metadata was assigned to the files as they were digitized. My work on this project has just begun, and ingesting these items into the DRS should be fairly straightforward. I will be providing recommendations to Northeastern for enhancing the metadata on these records to make them more discoverable within the DRS and to ensure that the records adhere to our newly adopted metadata standards for MODS records.
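As a rough illustration of what checking records against a metadata standard could look like, the sketch below reports which descriptive fields are missing or empty in a MODS record. The MODS namespace and element names are real, but the checklist of "required" fields is an assumption made for this example and is not Northeastern's actual standard.

```python
import xml.etree.ElementTree as ET

# Namespace for MODS version 3 records.
MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical checklist of fields that help discoverability.
# This is an assumption for the example, not an adopted standard.
REQUIRED_PATHS = {
    "title": "mods:titleInfo/mods:title",
    "abstract": "mods:abstract",
    "subject": "mods:subject/mods:topic",
    "date created": "mods:originInfo/mods:dateCreated",
}


def missing_fields(mods_xml: str) -> list:
    """Return the labels of checklist fields that are absent or empty."""
    root = ET.fromstring(mods_xml)
    missing = []
    for label, path in REQUIRED_PATHS.items():
        element = root.find(path, MODS_NS)
        if element is None or not (element.text or "").strip():
            missing.append(label)
    return missing
```

Running a check like this over a batch of records before ingest would give a quick list of which ones need enhancement, though a real workflow would more likely validate against the full MODS schema as well.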
Lastly, the legacy born-digital material I’m working with involves what so many archives and special collections are receiving from donors: boxes of disks. Specifically, in this collection, we have four record boxes that include 48 CDs, 18 Iomega Zip drives, and 177 3.5″ floppy disks. Though we dream about acquiring a FRED workstation, we don’t have high-end, high-powered digital forensics equipment at Northeastern. My work on this, then, has involved a lot of research on other, more economical workarounds for accessing the information that’s trapped in these boxes of disks. I’ve found some promising possibilities, and by the end of my residency, I will provide Northeastern with a report of my findings and recommendations for pulling this information off these disks and making it accessible to researchers.
These next few months will no doubt be a whirlwind of activity as I wrap up the various aspects of this project. I’ll also be finishing up some work with my cohort of residents, including an exciting project we took on to provide some digital preservation recommendations to the History Project, Boston’s LGBT community archive. I’ve learned so much about digital preservation and digital stewardship during this residency, both in my project and through the NDSR Boston cohort, and I’m excited to bring that knowledge and experience with me when I move on to the next step in my archival career.