Implementing FAIR Workflows – Enabling Researchers to Connect Outputs to maDMPs

Implementing FAIR Workflows: A Proof of Concept Study in the Field of Consciousness is a 3-year project funded by the Templeton World Charity Foundation. In this project, DataCite works with a number of partners on providing an exemplar workflow that researchers can use to implement FAIR practices throughout their research lifecycle. In this blog series, the different project participants share perspectives on FAIR practices and recommendations.


California Digital Library (DMPTool)

In this latest post, Maria Praetzellis (DMPTool Product Manager) and Brian Riley (DMPTool Technical Lead) at California Digital Library (CDL), a project partner organization committed to implementing enhanced output reporting workflows on DMPTool, share their implementation experience. DMPTool is a free, open-source, online application that helps researchers create data management plans (DMPs).

Could you briefly introduce your platform/service, and how the FAIR principles fit into the vision?

The DMPTool was developed from a grassroots effort of eight institutions beginning in January 2011. These institutions came together in direct response to demands from funding agencies, such as the National Science Foundation (NSF), that researchers plan to manage their research data. The DMPTool was developed in an effort to consolidate expertise and reduce costs in addressing data management needs at their respective institutions. This original mission still serves as the guiding principle of the DMPTool as the application continues to be free and community-supported. 

Recent feature developments have focused on transforming the DMP from a static text file into an interoperable, networked hub of information in which details about a research project can be updated and queried over the project’s lifetime. This new machine-actionable DMP (maDMP) allows information within a DMP to flow between stakeholders, linking metadata, repositories, and institutions, and allowing for notifications, verification, and reporting in real time. A key goal of this new system is to reduce the burden on researchers by generating automated updates to a plan and facilitating seamless integration with the systems and groups that support research.

What are the integrations your system is implementing in the project?

To support the work of the FAIR Workflows project, the DMPTool team implemented DMP-related output linking. The new feature lets researchers follow up on their output-sharing plans by appending publicly shared outputs to the DMP, simply by providing the resource type and the DOI of each output. The integration then links the output to the DMP by creating a relatedIdentifier element in the DMP ID (DataCite DOI) metadata, setting the relation type, the related identifier string, and the identifier type through the metadata update API call.

This integration allows researchers to keep track of their data-sharing activities based on the DMP throughout their project lifecycle and connect outputs to the DMP even when the output-sharing platform does not support the functionality of linking related works.
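The linking step described above can be sketched roughly as follows. This is a minimal illustration, not DMPTool's actual code: the function names, the placeholder DOIs, and the choice of relation type are assumptions made for the example. It only builds the request (URL and JSON body) for an update to the DataCite REST API and sends nothing. It also assumes that the update replaces the whole relatedIdentifiers list, which is why existing entries are merged in first.

```python
import json

DATACITE_API = "https://api.datacite.org/dois"  # DataCite REST API base URL


def build_related_identifier(output_doi, resource_type, relation_type):
    """Return one relatedIdentifiers entry describing a shared output."""
    return {
        "relatedIdentifier": output_doi,
        "relatedIdentifierType": "DOI",
        "relationType": relation_type,
        "resourceTypeGeneral": resource_type,
    }


def build_update_request(dmp_doi, existing, new_entry):
    """Build the URL and JSON body for a metadata update of the DMP ID.

    `existing` is the list of relatedIdentifiers already on the DMP ID.
    Assumption: the update replaces the whole list, so we send it merged.
    """
    merged = list(existing)
    already_linked = any(
        e["relatedIdentifier"].lower() == new_entry["relatedIdentifier"].lower()
        and e["relationType"] == new_entry["relationType"]
        for e in merged
    )
    if not already_linked:
        merged.append(new_entry)
    body = {
        "data": {
            "type": "dois",
            "attributes": {"relatedIdentifiers": merged},
        }
    }
    return f"{DATACITE_API}/{dmp_doi}", json.dumps(body)


# Hypothetical usage: placeholder DOIs, and "IsCitedBy" chosen only for
# illustration; the relation type DMPTool actually uses is not stated here.
entry = build_related_identifier("10.1234/dataset-example", "Dataset", "IsCitedBy")
url, body = build_update_request("10.1234/dmp-example", [], entry)
```

In a real integration the body would be sent as an authenticated PUT request to the returned URL. Skipping entries that are already present keeps a repeated submission of the same output from adding duplicates.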

Figure 1. UI of DMPTool “Follow-up” tab where related outputs can be added to the DMP.

Beyond the output linking feature, DMPTool is also working on a feature that allows externally generated DMPs to be imported into the DMPTool, structured, and registered with a DMP ID, transforming static PDF files into machine-actionable living documents. This is needed because most DMPs are not created with the DMPTool, yet they still need to be structured in order to generate DMP IDs. For example, the research team in the FAIR Workflows project had a previously written DMP for their research, which we can build on with this new workflow. With this new feature, users will upload a PDF of an existing DMP and enter basic metadata about the project. Uploaded DMPs will get a DMP ID. With these DMP IDs, researchers and administrators can connect outputs related to their project, ultimately allowing for notifications and compliance reporting. Beyond this project, our next step in developing the system will be integrating external APIs, including those from funding agencies, repositories, and open catalogs like OpenAlex. This integration will enable searching for linkages between DMPs and associated research outputs.

Can you describe the process of developing your integration and any challenges you encountered?

Thus far, most of our development work has centered on building the backend infrastructure these integrations require, with the goal of easing the burden on the end user. The DMPTool is a very small team of one developer and one product manager, so our principal challenge is a lack of resources to meet our development goals. The TWCF-funded project has helped close this gap, and additional funding will allow for additional development resources to help us progress to the next phase of work faster.

How do you envision the integration will make an impact? 

By integrating the DMP into the ecosystem of persistent identifiers (PIDs) and associated metadata, and by using APIs to connect to repositories, libraries will be better equipped to track and manage their institutional research data products, and institutional data services will be better positioned to communicate and collaborate.

Any advice for other platforms or tools that are considering PID-related integrations?

From the DMPTool’s perspective, there are two ways a system can approach the implementation of PIDs and related workflows – use them as a connection with another system (e.g. DMPTool records ORCIDs) or as the maintainer of a PID (e.g. DMPTool assigns DOIs to a DMP).

When connecting to PIDs maintained by an external system:

  • Verify the system’s retention and access policies: will a PID always be resolvable and, if not, how will your application handle that?
  • Make sure that the PIDs are either URLs or at least have a namespace so that you can ensure uniqueness (e.g. an external system may use the id HEWG94HG49G4G which is unique for that system but not globally unique)
  • Make sure you understand how they manage versioning. Some systems have different PIDs for each version while others always return the latest version of the object
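The namespace point in the list above could be handled along these lines. This is only a sketch under stated assumptions: the `RESOLVERS` table, the `to_global_id` function, and the `repo-x` scheme name are hypothetical, and a real application would need per-scheme validation as well. The idea is to store every external identifier either as a URL or with a scheme prefix, so that a locally unique ID such as HEWG94HG49G4G cannot collide with an ID from another system.

```python
# Hypothetical mapping from identifier scheme to its public resolver URL.
RESOLVERS = {
    "doi": "https://doi.org/",
    "orcid": "https://orcid.org/",
    "ror": "https://ror.org/",
}


def to_global_id(scheme, value):
    """Return a globally unique form of an identifier from an external system."""
    value = value.strip()
    # Already a URL: keep it unchanged.
    if value.startswith(("http://", "https://")):
        return value
    base = RESOLVERS.get(scheme.lower())
    if base is not None:
        # Known scheme: store the resolvable URL form.
        return base + value
    # Unknown scheme: at least prefix a namespace so IDs cannot collide.
    return f"{scheme.lower()}:{value}"


# Example: an ID that is unique only within a hypothetical system "repo-x".
namespaced = to_global_id("repo-x", "HEWG94HG49G4G")
```

Storing the URL form for known schemes also makes the identifier directly resolvable, while the prefixed fallback still guarantees uniqueness within the application.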

When assigning and maintaining PIDs through your application:

  • Make sure you clearly document your retention and access policies (e.g. are they publicly available 24×7, can the objects they resolve to be embargoed or restricted, etc.)
  • Make sure your PIDs are resolvable in the browser
  • Make sure you support versioning and then clearly document how that works
  • Consider using a system like Crossref or DataCite to register your PIDs so that they become part of the larger DOI ecosystem

Note that in the second scenario, it is not enough for the integrator to just acquire a DOI and consider the work done. There are broader responsibilities to take on: ensuring that the PID is resolvable, that your metadata is well documented, that you convey how you handle versioning, etc.


This project was made possible through the support of a grant from Templeton World Charity Foundation, Inc. The opinions expressed in this publication are those of the author(s) and do not necessarily reflect the views of Templeton World Charity Foundation, Inc.

The post Implementing FAIR Workflows – Enabling Researchers to Connect Outputs to maDMPs appeared first on DataCite.