RDMF14: report from Breakout Group 2 (Systems Integration)

This breakout group discussed the challenges of integrating systems for research data management. It was chaired by Rory McNicholl.

The group was asked to give examples of systems that could be integrated with research data infrastructure.

Some common targets for integration were:

  • Current Research Information Systems (CRIS)
  • Publications (institutional repositories)
  • Data repositories (including those external to the institution)
  • Data catalogues
  • Active data management (Box, Dropbox or institutionally provided storage)
  • Archive storage and preservation (Archivematica)
  • Data management plans (DMPonline)
  • HR
  • Ethics
  • Finance
  • Analysis platforms (DMA Online)
  • Other external systems (Researchfish)

The consensus of the group was that, while a CRIS or data catalogue might act as a hub connecting some systems, there is unlikely to be a single system that can act as the central point for all systems integration.

There may not be a “single point of truth” for research data systems and a clearer picture may come from combining multiple sources of information. Use of reporting tools such as DMA Online may help to provide some visibility here.
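
As a rough illustration of combining multiple sources, the following Python sketch merges records exported from three systems into one view per dataset. Everything in it is an assumption made for the example: the `dataset_id` key, the field names and the sample records are hypothetical and do not reflect the actual export format of any CRIS, repository or DMP tool.

```python
from collections import defaultdict

# Hypothetical exports; real systems would each need their own export or API step.
cris_records = [{"dataset_id": "ds-001", "project": "P-42"}]
repository_records = [
    {"dataset_id": "ds-001", "size_gb": 120},
    {"dataset_id": "ds-002", "size_gb": 3},
]
dmp_records = [{"dataset_id": "ds-002", "has_dmp": True}]


def combine_sources(**sources):
    """Merge records from several systems, keyed by a shared dataset identifier.

    Each field is prefixed with the name of the system it came from, so no
    single system has to be treated as the authoritative source.
    """
    combined = defaultdict(dict)
    for system, records in sources.items():
        for record in records:
            key = record["dataset_id"]
            for field, value in record.items():
                if field != "dataset_id":
                    combined[key][f"{system}.{field}"] = value
    return dict(combined)


combined = combine_sources(
    cris=cris_records, repository=repository_records, dmp=dmp_records
)
```

Keeping the source system in each field name makes gaps visible: in this sample data, `ds-002` has repository and DMP information but no CRIS entry.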

Members of the group also gave some examples of the problems that need to be tackled to successfully integrate research data systems:

  • Handling large datasets.
  • Security and regulatory compliance for sensitive data such as medical information.
  • Lack of “hooks” into the research process because research may not be visible until publication.

Next the group discussed how research data systems could interoperate with the many and diverse types of software used by researchers. It was felt that:

  • Researchers make use of a patchwork of different tools - there is unlikely to be one single solution that works for all researchers.
  • Common metadata and file formats are beneficial. However, there is a danger of adopting “lowest common denominator” formats that may lose important metadata or context when systems are integrated.
  • Institutions which are using similar systems could work collaboratively. For example, integrations which have been developed could be re-used and practices could be shared.
  • Researchers could be classified by the types of data that they produce and the software applications that they use (such as MATLAB or SPSS). This may produce more meaningful groupings than trying to address researchers by discipline or faculty.
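
The last point, grouping researchers by data type and software rather than by faculty, could be sketched as follows. This is a minimal illustration only: the researcher records, field names and data-type labels are hypothetical, and a real exercise would depend on how such information is gathered from researchers.

```python
from collections import defaultdict

# Hypothetical survey results; names and categories are invented for illustration.
researchers = [
    {"name": "R1", "faculty": "Medicine", "data_type": "tabular", "software": "SPSS"},
    {"name": "R2", "faculty": "Social Sciences", "data_type": "tabular", "software": "SPSS"},
    {"name": "R3", "faculty": "Engineering", "data_type": "numerical arrays", "software": "MATLAB"},
]


def group_by_profile(people):
    """Group researcher names by (data type, software) instead of by faculty."""
    groups = defaultdict(list)
    for person in people:
        profile = (person["data_type"], person["software"])
        groups[profile].append(person["name"])
    return dict(groups)


profiles = group_by_profile(researchers)
```

In this sample, R1 and R2 sit in different faculties but share the same profile, so they might be served by the same integration.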

Summing up, the group agreed that dialogue with researchers is essential and any systems integration needs to provide real benefit or value to researchers.

An understanding of the shared concerns and challenges in handling research data is the first step towards any successful integration.