This week saw the first RDA UK workshop hosted by Jisc in Birmingham. The Research Data Alliance is a community-driven organisation aiming to build the social and technical infrastructure to enable open sharing of data. Members come together through self-formed Interest and Working Groups to discuss shared issues and develop recommendations and outputs to address these. In addition to individual members, the RDA has over 44 organisational members, like the DCC, STFC, LIBER and Wiley, who are seen as critical to supporting the adoption of outputs. The DCC ran a workshop at IDCC13 to spread the word about the RDA and to encourage comment on the initial proposals for working groups. Our blog post from that event explains more about the function of the groups and how they are established. Experience has led to some aspects of the process being adjusted.
It was thus interesting to hear Mark Parsons, Secretary General of the RDA, describe both the original vision and how it has been put into practice. He spoke to the organisational philosophy of letting many flowers bloom. RDA will not define the architecture or operate a system; rather, it supports the community development of a set of social and technical tools for others to adopt and implement. Mark drew on Anna Tsing’s Friction: An Ethnography of Global Connection to highlight that friction is inevitable in collaboration and that we should embrace it. Often the best results come from working through contrasting ideas and approaches, and these will flourish in a ground-up, global organisation like RDA, with over 4500 members from various backgrounds and cultures.
Juan Bicarregui of STFC then took us through a personal reflection on how open data policy has evolved and strengthened over the years. This has been driven by high-level declarations from bodies like the G8 and OECD, and has been reflected in national and funder policy. Recent EC communications have called for an international forum and an open science environment, calls that have led to the RDA and the EOSC as ways to help implement the policy agenda. Juan mapped out inherent tensions in the policies, for example between the principle of data as a public good and the underlying responsibility that someone needs to manage the data if public access is to be achieved. The overarching principles are hard to disagree with, but implementation of open data policy isn’t easy!
The main section of the event focused on breakout groups in four areas to discuss RDA Working Group outputs and how these are being adopted by the community:
- Trust and certification
- Data citation
- Metadata standards
- Publishing data
The Trust and Certification group had talks from Ingrid Dillo of DANS and Lesley Rickards of the BODC. Lesley reported on the RDA working group efforts to standardise entry-level certification, based on the Data Seal of Approval and the World Data System membership criteria. Differences in language were overcome and there has been successful recent work to test the resulting Catalogue of Common Requirements. Ingrid talked about certification efforts by DANS, firstly to renew its own Data Seal, and then to take the next step up the levels of the European Trusted Data Repository Framework, the Nestor Seal. The efforts involved in certification can be quite substantial. Ingrid quoted 250 hours for DSA renewal and 1500 hours for the Nestor Seal. The payoff has several dimensions. Certification offers some assurance to researchers, funders and journals that data is properly managed. The process of self-assessment improves communication among staff, and the published results offer transparency. It can also highlight gaps where improvements are needed, driving efficiency savings. Nevertheless, the time investment brought gasps from some of the institutions represented. There is interest among institutions in certification and, as Ingrid pointed out, just reading the Data Seal or Catalogue of Common Requirements is a useful starting point. Others have gone further and acquired the Data Seal for their institution’s data repository. Institutions can also benefit from the work we have been doing in DCC on capability modelling. This work takes the form of a Checklist for RDM Services and a How-to Guide on Repository Evaluation, both informed by the RDA-WDS work on the Catalogue of Common Requirements. We will be releasing them on the website soon and discussing them further at IDCC in February. One notable achievement of this group is that, rather than adding to the world’s collection of standards, it has effectively reduced the total by one.
The citation group discussed guidelines to support dynamic data citation. These suggest that a query of a dynamic dataset is given a PID, time stamped and stored in a database. This allows a precise subset to be cited and retrieved. The guidelines are currently being tested by various repositories. A question was raised about creating an exemplar in the UK and whether Jisc could support an institution in implementing this.
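To make the idea concrete, here is a minimal sketch in Python of how a query might be given an identifier, time stamped and stored so the same subset can be retrieved later. It is an illustration under stated assumptions, not the working group's reference implementation: the `QueryStore` class, the `run_query` and `fingerprint` helpers, the `example-pid:` prefix and the station records are all hypothetical, and the result hash is an extra check added here for demonstration.

```python
import hashlib
import json
import sqlite3
import uuid


def run_query(records, station, as_of):
    """Return the records for `station` that already existed at `as_of`.

    Timestamps are ISO-8601 strings in the same format and time zone,
    so plain string comparison orders them correctly.
    """
    return [r for r in records
            if r["station"] == station and r["added_at"] <= as_of]


def fingerprint(rows):
    """Hash a result set so a later re-run can be checked against it."""
    return hashlib.sha256(json.dumps(rows, sort_keys=True).encode()).hexdigest()


class QueryStore:
    """Hypothetical query store: one row per cited query."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE cited_queries ("
            "pid TEXT PRIMARY KEY, query TEXT, executed_at TEXT, result_hash TEXT)"
        )

    def cite(self, query, rows, executed_at):
        """Store the query, its time stamp and a result hash; return a PID."""
        # A real service would mint a DOI or handle; a UUID stands in here.
        pid = f"example-pid:{uuid.uuid4()}"
        self.db.execute(
            "INSERT INTO cited_queries VALUES (?, ?, ?, ?)",
            (pid, json.dumps(query), executed_at, fingerprint(rows)),
        )
        return pid

    def resolve(self, pid):
        """Look up the stored query, time stamp and hash for a PID."""
        row = self.db.execute(
            "SELECT query, executed_at, result_hash FROM cited_queries WHERE pid = ?",
            (pid,),
        ).fetchone()
        if row is None:
            raise KeyError(pid)
        return {"query": json.loads(row[0]), "executed_at": row[1],
                "result_hash": row[2]}


# Usage: cite a subset, let the dataset grow, then retrieve the cited subset.
records = [
    {"station": "A", "value": 1.0, "added_at": "2016-01-01T00:00:00+00:00"},
    {"station": "B", "value": 5.0, "added_at": "2016-02-01T00:00:00+00:00"},
]
store = QueryStore()
cited_at = "2016-03-01T00:00:00+00:00"
cited_rows = run_query(records, "A", cited_at)
pid = store.cite({"station": "A"}, cited_rows, cited_at)

records.append({"station": "A", "value": 2.0, "added_at": "2016-06-01T00:00:00+00:00"})
entry = store.resolve(pid)
retrieved = run_query(records, entry["query"]["station"], entry["executed_at"])
```

Because the stored time stamp is passed back into the query, the record added later is excluded and `retrieved` matches the subset that was originally cited. The sketch relies on the dataset keeping a record of when each item was added; a dataset that overwrites values in place would need proper versioning for this to work.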
Alex Ball, Dom Fripp and I spoke in the Metadata standards group. Alex was co-chair of the Metadata Standards Directory WG, which built on the DCC’s Disciplinary Metadata Standards catalogue so it could be sustained as an international initiative. The new Metadata Standards Catalogue WG will take the Directory, which allows standards to be browsed by discipline, and develop a searchable catalogue. They also intend to allow standards to be compared more easily so people can develop cross-walks. A series of use cases has been developed and a requirements specification is out now for consultation. The DCC will be importing data from the metadata standards directory and providing it directly within DMPonline as a way for users to answer questions about metadata. The ability to search for relevant standards will raise awareness of available options, and details about the use of standards can be extracted and analysed more easily than from the typical free-text responses.
The Data Publishing session included a talk by Fiona Murphy on the Data Publishing Workflows Working Group, which DCC has also been involved in. Its main output has been a reference model titled Key Components of Data Publishing. This aims to pick up from the OAIS reference model by describing current practices that connect research data with related scholarly communication outputs, including code, articles and data papers. More recently the Working Group has been looking ‘upstream’ to review examples of support for data publishing early in the research lifecycle, another topic for discussion at IDCC.
The final session was a discussion on the potential role of an RDA-UK. Many of those who attended this first workshop were already members of RDA and had attended plenaries. Ian Bruno expressed the view that we should avoid this becoming a UK version of the plenaries. RDA-UK should reach a different audience and broker relationships with researchers and intermediaries whose inputs are critical to RDA but who may be unlikely to participate. Mary McDerby suggested that information be shared about RDA activities to enable the UK RDM community to feed into WGs. There’s a lot of expertise within UK universities, so RDA-UK could act as a broker to facilitate engagement and the adoption of outputs. Brian Matthews proposed the RDA could play an advocacy role, offering an external perspective to help shape the UK open data agenda. Mark Parsons meanwhile gave examples of other countries with national bodies, such as Finland and Germany. He suggested that it can sometimes be easier to have national coordination in a smaller context like Finland (the UK has one of the biggest memberships in RDA), but that there are lots of practical things that can be done, like the workshops run in Germany. Ian Bruno asked whether Jisc could provide the UK voice or whether it has too many of its own interests. Rachel Bruce confirmed that Jisc can’t speak alone and needs to work in collaboration with others, suggesting that existing vehicles like the UK Open Research Data Forum be used to establish a subgroup that represents multiple stakeholder interests.
There’s a lot of potential for an RDA-UK to help encourage wider UK engagement in this global community and support adoption of RDA outputs here. One message which we have pressed since that first workshop at IDCC13 is that engagement can take many different forms, ranging from a few minutes’ work commenting on the potential usefulness of a proposed group to months of work if you play a core role developing a group’s outputs. Existing data management networks like the DCC’s RDMF and the Jisc Research Data Network will play a critical role in engaging with the community and feeding UK developments into the RDA – and vice versa. The DCC has played an active role in the RDA since its inception and hopes that RDA-UK will provide one way we can help to further connect you with the global RDM community. You can join the RDA-UK group and mailing list on the RDA site.