Last year, LC Labs worked with partners across the Library and outside its walls to advance the Digital Strategy. Here’s a look back at some of our work on the strategy’s goals of opening the treasure chest, connecting, and investing in our future, and a preview of this year’s plans.
In the coming year, we hope you’ll join us as we experiment and explore! Stay tuned in this space and our social media channels for job and contracting opportunities, collaboration opportunities, and more news of the digital transformation at America’s Library.
Opening the Treasure Chest
Collaboration is key for nearly all of our projects, but especially those whose activities intersect with the Digital Strategy’s call to exponentially grow our collections, maximize the use of content, and support emerging styles of research.
For nearly three decades, the Library has designed collections and produced a massive treasure trove of digital content. How can we enable and empower researchers to use that data to its fullest potential? Thanks to a grant from the Andrew W. Mellon Foundation, we are now experimenting with sharing digital collections at scale. The Computing Cultural Heritage in the Cloud project enables us to test models for serving digital content to users in the cloud computing environment. We spent the bulk of 2020 developing a model for documenting our digital collections and other behind-the-scenes work to set the stage for incoming staff success. We entered 2021 poised to identify research interests that will help define possibilities for this project and to assess existing service models; more on this work in the coming months.
An increasing amount of cultural heritage material is born digital, but as formats and obsolescence multiply, so do the challenges to accessing this content. Our first LC Labs Staff Innovators, Kathleen O’Neill and Chad Conrady, explored tools and modes of access for born digital materials held by the Library’s Manuscript Division. After recommending tools and potential modes of access tailored to individual collections, they created a prototype digital workstation for processing, preserving, and providing access to these materials.
Connecting
What does it take to connect with all Americans? The Digital Strategy identifies starting points: inspire a relationship with every visit, bring the Library to our users, welcome other voices, and drive momentum in our communities.
Connecting with and engaging users are key to the Library’s mission. At the start of 2020, the Librarian’s signature crowdsourcing initiative, By the People, continued to flourish and transitioned to a permanent home at the Library. In just two years, By the People has offered hundreds of thousands of pages for transcription across twenty campaigns, including Alan Lomax field materials, Civil War diaries and letters, Branch Rickey’s baseball scouting records, and selections from Rosa Parks’ papers. Become a volunteer today and dig into the papers of spiritualist Frederick Hockley from the Houdini Collection, letters to Theodore Roosevelt, or new campaigns launching in 2021. Sign up for updates from the By the People team and mark your calendar for Douglass Day programming beginning on February 12. The collaboration with the Colored Conventions Project will feature the vision, life, and experience of Mary Church Terrell through her papers and other materials.
LC Labs established the Innovator in Residence program to support innovative uses of our collections to expand public interest and engagement with them. This year’s projects, Newspaper Navigator and Citizen DJ, followed through on that promise!
Innovator in Residence Ben Lee recognized that, while the Library’s millions of digitized newspaper pages are easily text-searchable thanks to Optical Character Recognition software, there is no similar tool to find images. Lee built on a corpus of segmented images from newspapers made possible through previous Innovator Tong Wang’s Beyond Words project. With these and hand-annotated classification, Lee developed a workflow and used machine learning to harvest 100 million photographs, illustrations, cartoons, and maps, and created a user interface for the public to explore using machine-learning-assisted searches. Newspaper Navigator is available for you to explore. Interested in digging deeper? You can read Lee’s own critique of the project’s limitations in his data archaeology, as well as spin up your own investigations with code and derivative datasets.
The Library holds major collections of sound and moving image recordings, and hip hop and other musicians work with audio samples to create music. Seeing that connection, Innovator in Residence Brian Foo created the Citizen DJ application, which provides free-to-use sounds culled from Library collections for creating hip hop and other sample-based music. The interface allows users to explore, remix and combine with beats, and download sounds for use with other software; find Foo’s code and documentation in this repo. Along the way, Foo and our colleagues in AFC connected with classrooms and groups including PATH to teach creative expression, music production, and information literacy through Library of Congress collections. Get your headphones on and listen to Brian’s free-to-use album Tracks from the Stacks, and read his guide on ethics and sampling.
Investing in our Future
The third part of the Digital Strategy is a call to invest in the future, building on past work and creativity while looking forward to future needs. Connecting a desired future with our capability and understanding current user needs and infrastructure presents opportunities to identify where we can invest near-term resources and attention.
The Ins and Outs of Machine Learning
The Digital Strategy pushes us to look to the future, considering the tools and technologies most likely to play a role in 21st century libraries. Last year’s “Season of Machine Learning” let us explore the opportunities and challenges of applying artificial intelligence to library collections. We collaborated with the University of Nebraska-Lincoln’s Project Aida for insights and recommendations, shared the outcomes of 2019’s Machine Learning + Libraries Summit, and released an expert state-of-the-field report by Professor Ryan Cordell. We also took next steps highlighted in the University of Nebraska-Lincoln’s Aida Intelligent Data Analytics report and Cordell’s Machine Learning + Libraries recommendations, beginning an experiment that integrates crowdsourcing and machine learning in September; we expect to share more about the Humans in the Loop project in spring 2021.
Sounding Out Audiovisual Preservation and Access
One of the next frontiers in digital transformation lies in the vast collections of audiovisual materials held by libraries and cultural heritage institutions, including our own. Last year, we worked with Library partners to test the possibilities for implementing speech-to-text transcription tools, using digital spoken-word collections from the American Folklife Center. We are collaborating on a generous Andrew W. Mellon Foundation grant to The University of Texas at Austin supporting preservation and promotion of audiovisual materials, as well as a partnership between Zooniverse and the Library’s American Folklife Center to improve audiovisual transcription workflows. The Library also hosted the I\V/A\V/ Informal Virtual Audiovisual Summit to share and learn about improving access to A/V content. The I\V/A\V Summit brought together hundreds of people for a full day of discussion of the state of the field, accessibility, and future directions, as well as the impact of operational changes during the COVID-19 crisis.
Each year, the Library welcomes a host of Junior Fellows to its Summer Intern Program. In the summer of 2020, LC Labs helped re-imagine the program as its first-ever remote version while hosting five fellows. Fellows Selena Qian, Emily Sienkiewicz, Hibba Khan, Tyler Youngman, and Nina Kostic explored Sanborn maps, Veterans History Project collections of WWI audio interviews, Political Islam Web Archive and the Puerto Rico at the Dawn of the Modern Age collections, LC for Robots, and Serbian-American history materials contained in Library collections. You can check out their projects, and what their creators had to say about them, in the 2020 Junior Fellows Display Day.
These efforts and more build on decades of work and collaboration at the Library to boost momentum and capacity for the agency’s Digital Strategy.
And there’s so much more ahead! Our fourth Innovator in Residence, Courtney McClellan, is at work designing a collaborative annotation tool for students of all ages. We’re continuing our work with cultural heritage collections at scale, and experimenting with machine learning, crowdsourcing, and alternative access models. Stay tuned for news and opportunities to be involved!
Learn more about LC Labs at http://labs.loc.gov, subscribe to the monthly LC Labs Letter, and follow us on Twitter @LC_Labs. You can reach us at LC-Labs@loc.gov.