Oh! Vienna…notes from the CESSDA experts Seminar on Research Data Management

What follows are summaries of presentations and discussions. They are my summaries, so any misrepresentations, mistakes, slanderous accusations, lies, written lies, twisted lies etc. that follow are mine.

All the presentations linked to in this blog post are available under a Creative Commons Attribution-NonCommercial 3.0 Unported License.

What's the collective noun for CESSDA RDM experts, a package? Photo: Rein Murakas

It did not feel like the end of an era, but it was. For the last time, experts gathered to hold a CESSDA expert seminar before CESSDA’s rebirth as an established legal data infrastructure called CESSDA AS. The reason it didn’t feel like the end of an era was that the topic and direction of the meeting looked forward, to Research Data Management (RDM) training and work on RDM costing. In this respect, as the seminar would show, significant work has taken place but there’s still much to be done.

Hosted by WISDOM in Vienna, and organized with GESIS, the day-long meeting was attended by some 20 people from a range of existing CESSDA member archives and interested observers.

The reason for choosing RDM as the topic is the movement towards data sharing policies from research funders, including Horizon 2020 – the next European Commission round of research and infrastructure projects. Funding developments like Horizon 2020 show RDM planning and implementation is increasingly important when securing or complying with funding agreements, so what can we do to best support researchers in promising and realising good intentions?

While we as a community have developed international standards on metadata, other topics remain difficult to address from a cross-national perspective. For example, data protection laws and intellectual property rights vary across, and sometimes within, nations. Therefore, the objective of the meeting was to promote further cooperation between archives, identify experts who could contribute to cooperation, and discuss the possibility for common European RDM support. To address these goals the meeting was structured into two parts. First, incentives and teaching; second, costing.

Incentives

The incentives session started with presentations from Elisabeth Strandhagen and Sara Svensson (SND) on “Working with data management” [PDF] and Sonja Bezjak (ADP) on “Data Management Planning in Slovenia” [PDF]. These presentations outlined situations in their respective countries where funders seem to be moving towards data management planning requirements and it has fallen to SND and ADP to define and provide the infrastructure support to underpin these requirements.

After the presentations, discussion commenced and questions were asked. Elaborating on their presentation, ADP reported that they had discovered a difference in attitudes to sharing between researchers who used international data and those who did not. Others pointed to discipline differences: while ADP found that natural science researchers like to keep data for themselves, the Finnish Social Science Data Archive (FSD) has natural science researchers approaching it to learn about running a data infrastructure, because social scientists have been running them for a long time. GESIS picked up on this point, stating that the RDM challenge in the natural sciences comes from storage; the natural sciences tend to be weak on metadata and data description because they traditionally haven’t needed to be strong in those areas. Social sciences, in contrast, have over 50 years of good work in establishing data description standards.

ADP returned to the policy and recommendations in their presentation by reiterating their strategic approach. Setting a national policy first allows institutions to develop their own policies within the national framework. Their recommendations suggest disciplinary data centres should develop in fields where they are needed; for example, “islands” of researchers are already informally sharing research data where there is a need for that data to be shared. This is because, in their view, the scope for archives or repositories that exist for things that don’t fit elsewhere is limited. Discipline-specific centres have specialisation, embracing a knowledge of both user and depositor communities.

Discussion then moved into a round table, with representatives from other countries talking briefly about RDM requirements, planning, and data sharing culture in their countries. A pattern emerged whereby most countries have funding bodies with some requirement or encouragement for researchers to produce a data management plan, but less emphasis is placed on requirements to share data. Even less effort is made by funders to implement these requirements.

Alexia Katsanidou (GESIS) [PDF] then talked about the importance of framing RDM incentives in ways that have emotional appeal to researchers. Much of the talk in RDM is about compliance and is delivered in a cerebral tone, while much of the resistance is emotional in nature. If we could tap into emotional motivations to practice good RDM techniques, the result could be a more positive and active reaction from the research community.

To conclude the day, Alexandra Stam (FORS) [PDF] and Laurence Horton (GESIS) [PDF] gave overviews of recent training events in their institutions. FORS adopted a brave and daring scenario approach for a five-day RDM course, weaving RDM themes and lessons into a data kidnapping role play requiring participants to recover a data set. Laurence Horton outlined the work at GESIS on RDM training courses with a cross-national perspective, mentioning the problems of addressing national-level issues in an international course and the need for good, stimulating approaches to delivering and reinforcing RDM training.

The discussion afterwards raised the topic of needs-driven training. Researchers are most receptive and open to training when they have a need for it, whether because of a policy requirement or because they work in a collaborative environment. This moved on to consideration of how we can build training activities in a way that integrates with existing research practice and reduces friction between archives and researchers.

Costing

Part two of the seminar began with presentations on costing from Laurence Horton (GESIS) [PDF], Veerle Van den Eynden (UK Data Archive) [PDF], and Heiko Tjalsma (DANS) [PDF]. Laurence argued that focusing our training on researchers generating good metadata and documentation can help us reduce the financial cost of archiving, a cost which is often tied to the need of the archive to add metadata and documentation during ingest. Veerle presented a tool developed at the UK Data Archive that helps researchers identify RDM costs, thereby allowing them to factor realistic, specific RDM costs into funding applications. Heiko spoke about the work at DANS, which applied business costing models to its activities in order to develop costing models fit for an archive. DANS are now implementing their cost model within the organisation, particularly in regard to classifying activities into principal and auxiliary costs. They are also contributing to two European projects on cost modelling, APARSEN and 4C.

The post-presentation discussion raised some worthy points. It was noted that although specific task cost data can be captured, the problem is there is no standard research collection. UK Data Archive and FSD both mentioned their experience with qualitative data and how it was expensive to ingest. DANS noted that their data collection, which contains social science, archaeology, and humanities (history) data, had “huge” differences between domains when it came to ingest cost. This is a result of incorporating three different disciplines, each with its own archiving processes, into one archive, something DANS is busy attempting to standardise. Jared Lyle from ICPSR asked a provocative question as to whether well-curated studies are actually getting usage regardless of the quality of metadata and documentation.

The costing session ended with an open presentation [PDF] and discussion led by Mari Kleemola from FSD. Mari argued that most of the work on costing is storage based and does not help us when it comes to costing the production of metadata, particularly as a proportion of total archiving costs. Furthermore, promoting self-archiving platforms is all very well, but it is done on the expectation that researchers will provide sufficient metadata and attend to other RDM considerations, which they probably will not. Mari also touched on an issue that seemed to resonate with others in the group, namely the reluctance of funders, institutions, or researchers to claim intellectual property ownership of research data due to a fear of responsibility for the resources required for long-term preservation.

Wrap Up Session

Regarding future directions, the question was asked as to how we cooperate on coordinating RDM training across the CESSDA archives. As a community we are now in a better position as CESSDA to work with universities and negotiate with archives on training. It was suggested we focus on finding commonalities rather than highlighting differences and look at holding training that reaches researchers who can’t travel. Options here include federated systems of training, where one person delivers the training from one country while others in different places listen in and provide support. Furthermore, we could possibly offer some advanced qualification or certification in RDM.

Further reading

Bezjak, S. “Ravnanje z raziskovalnimi podatki: spodbude in stroški (CESSDA Expert Seminar 2013)”, Prispevki: blog, 18 November, 2013.
http://www.adp.fdv.uni-lj.si/blog/2013/neuvrsceni/ravnanje-z-raziskovalnimi-podatki-spodbude-in-stroski-cessda-expert-seminar-2013/#ixzz2l0J3NlnQ