Today’s guest post is from Abbie Grotke (Assistant Head of the Digital Content Management section), Grace Bicho (Senior Digital Collections Specialist), Lauren Baker (Senior Digital Collections Specialist), and Abbey Potter (Senior Innovation Specialist), all from the Library of Congress.
In May, the Library of Congress sent four representatives to the 2023 Web Archiving Conference and General Assembly in the Netherlands to share our web archiving and digital innovation experience with our colleagues and to learn from other experts from all over the world. The Library is a founding member of the International Internet Preservation Consortium (IIPC), and our staff have stayed actively engaged with the community over the past three years through virtual events (including hosting the 2022 Web Archiving Conference). Our four representatives (Abbie Grotke, Grace Bicho, Lauren Baker, and Abbey Potter) were especially excited to attend events in person for the first time since 2019. When they returned, they were eager to share what they learned with us. Here, members of the Web Archiving Team and LC Labs recount highlights of the conference and what it means to be a part of the IIPC community. If their accounts inspire you to learn more, many of the panels and presentations are also now available online.
It was so great to be back in person at IIPC in 2023! I’ve been attending IIPC meetings since its founding in 2003, so on its 20th anniversary it was wonderful to see colleagues again after so many video calls over the past few years. As with other IIPC events (both virtual and in person), I took advantage of opportunities to participate actively and engage with the community. I was pleased to give a brief update on the Library’s web archiving program for members; chair a post-session; get interviewed for an “IIPC at 20” video project; and participate in a Steering Committee meeting in The Hague before the conference started, enjoying train rides with fellow SC members back and forth from Hilversum, where the conference took place at The Netherlands Institute for Sound & Vision (co-hosted by KB, National Library of the Netherlands).
There were many great sessions at the conference, ranging from research and access to collaborative approaches to the challenges of web archiving, the latest news about tools used to archive challenging content, and more. A panel about “program infrastructure” really resonated with me as a manager of the Library of Congress web archiving program. Laura Wrubel from Stanford University spoke about the development work they are doing to modernize their web archiving infrastructure (and the challenge of having developers just pop in to work on web archiving systems, which requires them to get up to speed in major ways). Paul Koerbin talked about infrastructure changes and reorganizations at the National Library of Australia that have led his team to now be located in IT rather than on the curatorial side (we web archivists compare notes about organizational structures more than you would think!). They are able to make incremental changes and improvements to tools and systems just by sitting in a part of the organization where they have more autonomy and dedicated developers. And Daniel Gomes from the Portuguese Web Archive discussed their many accomplishments and impressive development work. A key takeaway for me from his talk: “no part-time web archivists! Web archiving is complex, requires training and full dedication!” I completely agree.
This was my third in-person IIPC conference and, this year particularly, I was reminded of its tremendous utility in getting this relatively small, geographically dispersed community speaking face-to-face. I spent the days in Hilversum inundated with the generosity and overall good nature of my colleagues, discussing highly practical aspects of web archiving! We all seized organic opportunities to learn about each other’s programs and challenges, whether in the conference rooms or at the nearest train station, waiting for the ride back to our hotels in the evening. Upon leaving the Netherlands, my brain was positively buzzing with new ideas and perspectives.
I had the great privilege to present, along with brilliant folks from the UK National Archives. My talk focused on the Library of Congress Web Archiving Team’s QA workflows, which have now become routinized and stable after undergoing a revamp during the last few years. For being the community’s “necessary evil,” the QA session was well attended, and sparks of innovation arose in post-session discussions with attendees. Hearing the similarities between our program’s and the UK National Archives’ QA efforts, I was struck by the similar conclusions we’ve reached in tackling this difficult problem of QA’ing at scale, even as we are separated by an ocean. I had this overwhelming sense that we are on the right track.
A theme that seemed to emerge from this conference was a focus on how researchers use web archives. There was an innovative presentation from Karin de Wild of Leiden University in the Netherlands, who researches digital cultural heritage. What made her talk and experience unique was that she had been embedded as a resident with the KB (National Library of the Netherlands) Web Archiving Team, and she discussed how invaluable that direct contact between her and the web archivists was for her research. It gave me ideas about how to build these kinds of relationships with researchers for our archive, and about the types of vehicles available to us that may allow us to work more closely with researchers.
A final highlight for me was acting as a mentor to a new community member who has the awesome task of setting up a web archiving program at her organization. It was wonderful to learn about her planned steps from the ground up and introduce her to colleagues and friends. It was an absolute pleasure providing a warm welcome to such a creative and resourceful community.
This was my first time attending the IIPC conference in person, and I was grateful to have the chance to connect with colleagues whom I’d only met virtually over the past three years! As a relative newcomer to web archiving, I was overwhelmed by how friendly and welcoming the IIPC community is. Perhaps some of the loveliest interactions came in the interstitial moments between sessions and at dinner after long days of talks. I was also thankful to attend the conference with my colleagues, who helped guide me, made introductions, and split up attendance so we didn’t miss any of the fantastic sessions!
One of my favorite experiences was connecting with members of the Training Working Group (TWG) on the General Assembly day. We brainstormed ways to build on the group’s existing beginners’ training materials, including creating case studies and offering interactive, intermediate-level training. As co-chair of the group with Claire Newing from The National Archives UK, I found the opportunity to work together for a focused period during the conference invaluable. Attending the GA helped me better understand how IIPC is organized, and meeting with TWG members over the course of the three days sparked a lot of ideas about the group’s plans for the coming year.
Attending the IIPC Conference made me feel excited to connect with peers, contribute to the web archiving community, and learn about all the ways that the field continues to grow!
It was so wonderful to be back with the IIPC! I served as the Communications and Program Officer when IIPC marked its ten-year anniversary, so I felt grateful to join the group again as it celebrated 20 years as an organization. The diversity of membership and the scale of the conference have increased, as has the scope of the conversations. I saw a greater integration of web archives into research services, more advanced web archiving tools and platforms, and calls for improved connections between web archiving programs, their users and the communities documented in their collections, especially under-represented groups. Still, after a decade, the challenges of making web archives accessible and useful, and of keeping up with rapidly changing technologies on modest resources, remain central. The mission of the IIPC has never been more relevant, which was apparent from the keynote presentation by Eliot Higgins, who shared how Bellingcat, the online journalism group, is using the web to document and investigate the Russian invasion of Ukraine, capturing stories of everyday people caught up in the violence. Walking home from the conference that day, I went by a stumbling stone memorializing Salomon Citroen, who died in 1944 in Auschwitz. Evidence of people, places and events comes in all formats, and transforming that evidence into history and research that connects to future generations is the work of all web archivists. It was great to see the field thriving and improving.
I attended IIPC as a member of LC Labs, the Digital Innovation Division of the Library of Congress, located in the Digital Strategy Directorate of the Office of the Chief Information Officer. We collaborate, research and experiment to lower barriers to innovation across the Library. Since 2019 we’ve been exploring how machine learning may help the Library meet its goals of expanding access to resources, enhancing services and connecting with new communities. Based on several years of hands-on experimentation with machine learning, we are developing a framework to help library, archives and museum (LAM) professionals consider the risks and benefits of machine learning, map out use cases, and design experiments to develop guidelines and quality standards that support the public and long-term missions of cultural heritage and memory organizations. During a workshop at the conference I shared how the framework developed and, with assistance from Lauren Baker, led participants in developing use cases for using machine learning or other forms of artificial intelligence to support web archiving activities. The group then filled out a planning worksheet that asked participants to consider the potential risks and benefits to staff, users, and people depicted in the archives. The worksheet also asked for details about the models being used and the data involved, because any machine learning or AI process involves a lot of technical detail that affects how well a model performs. The worksheet is meant to start and document conversations and decision-making around early implementations of AI, and the web archivists in attendance gave great feedback on the exercise. Many of them found it valuable and grounding. I found the workshop energizing and came away with plenty of ideas for improving the worksheet and the AI planning framework. Stay tuned to this blog for more about LC Labs and frameworks and tools for planning AI projects.