Today’s guest post is from Abby Shelton, a Digital Collections Specialist and By the People Community Manager in the Digital Content Management Section at the Library of Congress.
How do people use crowdsourced transcriptions? Do they drive increased traffic and engagement to our digital collections? What kinds of activity do transcriptions of handwritten documents facilitate?
These are some of the big questions that the By the People team is asking this year. We know our volunteers are motivated by making collections accessible and useful for all. Our volunteers have completed an incredible number of transcriptions, over 580,000 of them. We have integrated over 146,000 of those back into their source collections and are continuing to add as our volunteers complete campaigns. To better understand our program’s reach and communicate to our volunteers the real-world impact of their work, we have started looking into the impact of transcriptions from a few different angles. This post will focus on search and discovery, which is one of the ways we know that transcriptions make the collections of the Library of Congress more accessible to all.
To evaluate search terms and usage, we identified terms used on loc.gov that resulted in a user landing on an item from one of three collections: the papers of Branch Rickey, Carrie Chapman Catt, and Rosa Parks. We created custom date ranges for each collection, based on when its transcriptions were added to loc.gov, so that we could compare pre- and post-transcription data.
We found that, as expected, adding transcription data to these collections increased the number of times users found items from the collections in their search results. In the year after the Branch Rickey transcriptions were integrated into loc.gov, the digital collection saw an 86% increase in the number of search terms and a 93% increase in user visits where a search led patrons to a Rickey item. Similarly, in the six months after the Catt transcriptions were published, there was a 47% increase in the number of terms and a 43% increase in user visits leading patrons to discover items from the collection. The Rosa Parks collection showed a more modest change after transcriptions were added: a 3% increase in search terms leading to Parks items and a 23% increase in user visits. One theory about why this might be the case is that the Rosa Parks collection is one of the most popular collections at the Library of Congress and receives more traffic per year than either of the other test collections. With traffic already so high, the transcriptions made only a slight difference in users finding their way to the digital collection via keyword search.
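For readers who want to run a similar pre- and post-transcription comparison on their own collections, the percentages above are ordinary percent changes between two date ranges. The short Python sketch below shows the arithmetic. The function name and the sample counts are hypothetical and chosen only for illustration; they are not the Library's underlying figures.

```python
def percent_increase(before: float, after: float) -> float:
    """Return the percent change from a baseline count to a later count."""
    if before <= 0:
        raise ValueError("the baseline count must be positive")
    return (after - before) * 100 / before


# Hypothetical counts for illustration only (not actual loc.gov data):
# distinct search terms that led to a collection item in each date range.
terms_before = 200
terms_after = 372

print(f"{percent_increase(terms_before, terms_after):.0f}% increase")
```

The two date ranges should be the same length for the comparison to be meaningful, for example one year before and one year after the transcriptions were added.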
Next, we were curious to know what kinds of terms led patrons to these collections due to the transcription data. This required checking search terms against transcription content in loc.gov. And we found all kinds of interesting terms that people all over the world have used to access our transcribed collections.
Place names, thematic terms, and historical events dominate the list, but a number of terms illuminate the networks that surrounded the collections. For instance, loc.gov users frequently searched for figures in Carrie Chapman Catt’s network of suffragists, activists, and reformers. Jessie Haver Butler (one of the first women professional lobbyists in Washington), Ella A. Boole (president of the Woman’s Christian Temperance Union), and María Abella de Ramírez (founder of the National Women’s League in Argentina) all appear in the transcriptions. Without the effort of By the People volunteers, a search of loc.gov for the names of these and many other Catt correspondents would skip over important materials in the collection. These are terms found only in the transcriptions, not in the titles or other metadata associated with the item.
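The check described above, finding search terms that match an item's transcription but not its title or other metadata, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method we actually used: the function name, the sample metadata, and the sample transcription text are all hypothetical, and it relies on simple case-insensitive substring matching.

```python
def transcription_only_terms(search_terms, metadata_text, transcription_text):
    """Return the terms that appear in the transcription but not in the metadata.

    Matching is a case-insensitive substring test, which is an assumption
    made for this sketch; a real search index tokenizes text differently.
    """
    metadata = metadata_text.casefold()
    transcription = transcription_text.casefold()
    return [
        term
        for term in search_terms
        if term.casefold() in transcription and term.casefold() not in metadata
    ]


# Hypothetical item record, invented for illustration only.
metadata = "Carrie Chapman Catt Papers: General Correspondence"
transcription = "A letter mentioning Ella A. Boole and the temperance campaign."
terms = ["Ella A. Boole", "Catt", "baseball"]

print(transcription_only_terms(terms, metadata, transcription))
```

In this invented example, only the first term is discoverable solely because of the transcription: "Catt" already appears in the metadata, and "baseball" appears in neither text.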
Similarly, a majority of the search terms used to find the Rosa Parks papers revolved around Civil Rights figures and events, including a large group of Black women’s names. Many of these terms came from the programs and newspaper clippings reporting on events where Rosa Parks was honored or gave a lecture. For example, the author of this article from the March 1991 edition of Jet Magazine listed the luminaries who gathered at the National Gallery of Art for the unveiling of a statue of Parks, including Coretta Scott King, C. Delores Tucker, John Lewis, and Cicely Tyson. The transcriptions of these programs and news clippings give us a better sense of the networks that Rosa Parks inhabited: who she spoke with and attended events alongside. And if not for our volunteers’ transcriptions, a patron searching for one of these names in the Rosa Parks papers wouldn’t find any of the textual materials related to these figures in the collection.
This is just the beginning of the impact transcriptions could have on search and discoverability. Have you used By the People transcriptions? Let us know. We would love to expand our understanding of how these resources are being used!