As a Research Data Librarian, one of my responsibilities is to raise awareness about the requirements for academic researchers to write data management plans for their projects. These awareness activities, often in the form of a presentation at staff meetings or departmental seminars, are frequently followed by a sceptical response from the academic staff along the lines that “This is just something else for us to try to fit into our already over-busy lives”. My answer is usually: “Yes, but it might prevent data management problems slowing up your project”. So, what are data management plans and how can they be of value to your project?

What is a data management plan?

A data management plan (DMP) outlines the way in which data being used or created in a research project will be generated, organized, documented, stored, backed up, preserved and shared, if possible, with other researchers after the publication of the findings of the project. It should also contain information on institutional and funder policies, relevant legislation and contractual obligations with which the project must comply. For example, it should include the terms of data licences for third-party data, contractual confidentiality clauses and funder obligations around data archiving and sharing. Finally, the DMP should identify, and designate, key responsibilities for data management within the project team. The data steward, usually the principal investigator, is ultimately responsible for data management for the project. Named team members should be responsible for data generation, quality assurance, recording the data and preparing the dataset for archiving at the time of publication of the findings. The DMP is a living document that should be shared with the project team, reviewed regularly and updated as necessary.
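For teams that like to keep their plan alongside their code, the contents described above can be treated as a simple checklist. The following is a minimal sketch only: the section and role identifiers are hypothetical names drawn from the paragraph above, not from any funder's template, and the `DataManagementPlan` class is an illustration rather than part of any existing tool.

```python
from dataclasses import dataclass, field

# Section names follow the paragraph above; the keys themselves are
# hypothetical and do not correspond to any funder's template.
REQUIRED_SECTIONS = (
    "generation",
    "organization",
    "documentation",
    "storage",
    "backup",
    "preservation",
    "sharing",
    "policies_and_legislation",
)

# Responsibilities named in the paragraph above (identifiers are hypothetical).
REQUIRED_ROLES = (
    "data_steward",
    "data_generation",
    "quality_assurance",
    "data_recording",
    "archive_preparation",
)


@dataclass
class DataManagementPlan:
    project: str
    sections: dict[str, str] = field(default_factory=dict)
    roles: dict[str, str] = field(default_factory=dict)

    def gaps(self) -> list[str]:
        """Return the sections and roles that are still missing or blank."""
        missing = [
            f"section: {name}"
            for name in REQUIRED_SECTIONS
            if not self.sections.get(name, "").strip()
        ]
        missing += [
            f"role: {name}"
            for name in REQUIRED_ROLES
            if not self.roles.get(name, "").strip()
        ]
        return missing


# Example: a partly completed plan, reviewed at a project meeting.
plan = DataManagementPlan(
    project="Example project",
    sections={"storage": "University managed storage", "backup": "Nightly"},
    roles={"data_steward": "Principal investigator"},
)
for gap in plan.gaps():
    print(gap)
```

Running a check like this at each review is one way of treating the DMP as a living document: anything still listed as a gap becomes an agenda item for the next team meeting.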

Requirements for DMPs

An increasing number of major funders require submission of a DMP at the same time as the case for support for funding. DMP templates vary slightly from funder to funder, but many are aligned with the UK Research and Innovation (UKRI) Common Principles on Data Policy, which are summarized in Table 1. Many UK universities have now followed suit and have adopted research data policies that are largely aligned with those of the funding bodies. The institutional research data policies generally require that researchers (a) write DMPs for all projects; (b) preserve data underpinning publications for a specified minimum period of time (generally 10 years); (c) share their research data, allowing for any legal, ethical and commercial restrictions; and (d) include data access (or data availability) statements in publications. As a result of these funder and institutional data policies, DMPs form the cornerstone of data governance for researchers.

Table 1
Summary of UKRI Common Principles on Data Policy
Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner 
Institutional and project-specific data management policies and plans should be in accordance with relevant standards and community best practice. Data with acknowledged long-term value should be preserved and remain accessible and usable for future research 
To enable research data to be discoverable and effectively reused by others, sufficient metadata should be recorded and made openly available to enable other researchers to understand the research and reuse potential of the data. Published results should always include information on how to access the supporting data 
The UKRI recognizes that there are legal, ethical and commercial constraints on release of research data. To ensure that the research process is not damaged by inappropriate release of data, research organization policies and practices should ensure that these are considered at all stages in the research process 
To ensure that the research teams get appropriate recognition for the effort involved in collecting and analysing data, those who undertake Research Council funded work may be entitled to a limited period of privileged use of the data they have collected to enable them to publish the results of their research. The length of this period varies by discipline and, where appropriate, is discussed further in the published policies of individual research councils
In order to recognize the intellectual contributions of researchers who generate, preserve and share key research datasets, all users of research data should acknowledge the sources of their data and abide by the terms and conditions under which they were accessed 
It is appropriate to use public funds to support the management and sharing of publicly funded research data. To maximize the research benefit which can be gained from limited budgets, the mechanisms for these activities should be both efficient and cost-effective in the use of public funds 

Identifying and mitigating data management issues

The primary purpose of a project DMP is to make sure that data management issues do not hold up the progress of the research. Issues that can be predicted ahead of time can be managed and mitigated at the start of the project, rather than at the point at which they arise. Examples of project-stalling data management issues include: insufficient data storage; difficulties in accessing or storing third-party data; logistical issues surrounding the secure transfer of files within and between institutions; loss of data because of back-up failure or poor file naming and organization; problems with data interoperability between internal systems or between systems at collaborating organizations; and a lack of agreement on data ownership, licensing and data sharing within the project team.

This last point is particularly important for collaborative projects. In my experience, views on data sharing vary widely within and between disciplines, project teams and institutions, ranging from exceptionally strong advocacy for open science to a complete refusal to even entertain the idea of data sharing, particularly the sharing of code-based data. These differences may stem from prior personal experiences, concerns about commercialization opportunities and intellectual property ownership and, perhaps (particularly with early-career researchers), a lack of confidence in the workflows and processes, or in the analysis methods. There is also currently a lack of academic recognition and reward associated with data publication and sharing, which means that researchers must be motivated by compliance, integrity and reproducibility. Having conversations about data ownership, commercialization and data sharing at the start of the project allows these differences to be identified, discussed and (hopefully) resolved. The result of these discussions should then be included in the DMP, which is circulated to the entire project team so that everyone is working towards a common goal from the start of the project.

Ultimately, writing a DMP is likely to save significantly more time than it takes to write.

DMP templates

The easiest way to write a DMP is to use a pre-existing template. Most of the funders that require a DMP to be submitted at the time of an application for funding provide a template, or a list of headings that should be included in the DMP. The Digital Curation Centre provides a free online DMP writing tool, DMPonline, which includes templates for all major funders, as well as a software management plan template developed by the Software Sustainability Institute. DMPonline can be found at https://dmponline.dcc.ac.uk. Many institutions have also developed their own DMP templates and provide support for writing DMPs through their Research Data Service, which is usually located within the University Library, Computing Services or Research Innovation and Support Service.

Data management plans – just another thing to do? Available for reuse under CC0 license from Pixabay.


Author information

Alison Nightingale spent 20 years as an epidemiologist with an interest in drug safety and rheumatic disease epidemiology before joining the University of Bath Research Data Service as a Research Data Librarian in 2018. Email: an313@bath.ac.uk

Published by Portland Press Limited under the Creative Commons Attribution License 4.0 (CC BY-NC-ND)