Because good research needs good data

Making the Case for Research Data Management

By Angus Whyte (DCC) and Jonathan Tedds (University of Leicester)

Published: 1 September 2011

This briefing paper aims to help managers in research institutions build support for developing new services for research data management. It also gives a brief snapshot of the JISC-led programmes on Managing Research Data and Shared Services and the Cloud.

Please cite as: Whyte, A., Tedds, J. (2011). ‘Making the Case for Research Data Management’. DCC Briefing Papers. Edinburgh: Digital Curation Centre. Available online: /resources/briefing-papers

Introduction - Doing More with Less

Higher Education research managers need to coordinate an ever-broader range of research outputs and outcomes. In this briefing we show how institutions have taken a lead in establishing research data policies and services that will support them. We show how these are giving measurable improvements in research capability, and in the institutions’ ability to respond to policy-makers and regulators. Institutions require coherent frameworks to establish the organisation, resources and technology capable of generating these benefits. This in itself presents challenges in achieving coherent change across the many disparate components within an institution. The pressure to do so with fewer resources means that JISC-led initiatives like the Managing Research Data programme and the Shared Services and the Cloud Programme come at an opportune time.

The prospects for sharing resources to gain efficiencies and more effective collaboration are extending beyond established areas such as IT Services, Library and Research Support. Just as academics are producing digital research assets in greater volume and variety, data management services are joining computation as resources that can be pooled more effectively. Benefits may also be found by considering other parts of the research cycle that can be served through repository services already established to manage research articles.

Tools, services and standards are emerging to help researchers manage their research assets, and to make more widely available the evidence, including raw and processed data, that underpins their research articles. Effective management is providing institutions with new ways to find synergies across research groups, producing new knowledge by engaging a broader range of stakeholders, and enabling wider reuse of data in teaching and learning, commercial exploitation and policy development.

Measuring the Benefits

37%: Projected saving in staff time from moving Oxford University Classics Dept database to centralised virtual service [38]

69%: Increase in citations for clinical trial publications associated with making their microarray datasets publicly available [14]

500%: Growth in datasets downloaded from Economic and Social Data Service 2003-2008 [36]

One-day delay cut to 5 minutes: Estimated time saving for crystallography researchers to access results from Diamond synchrotron, by deploying digital processing pipeline & metadata capture system [38]

(See sources of further information)

Researchers’ needs are likely to span the related areas of research data management, curation, and preservation. Research data management concerns the organisation of data, from its entry into the research cycle through to the dissemination and archiving of valuable results. It aims to ensure reliable verification of results, and permits new and innovative research built on existing information. Preservation is about ensuring that what is handed over to a repository or publisher remains fit for secondary use in the longer term (e.g. 10 years post-project). Curation connects first use to secondary use. It is about ensuring that project results are fit to archive, and that valued research assets remain fit for reuse. This briefing focuses on research data management, its drivers and the benefits found. We locate these in the JISC Managing Research Data programme, and take a snapshot of the experiences of one institution, the University of Leicester.

The Drivers

There has been a decisive shift towards greater oversight of the research process, motivated by the principle of data as a public good. This shift is seen in the concerns of policy-makers, and in changes in legislation and its implementation. The needs are being addressed through coordinated action by funders including the UK Research Councils, charities and JISC, with significant responsibilities falling to HEIs and individual researchers.

Research Integrity

Research integrity is a key issue for policy-makers. The House of Commons Select Committee on Science and Technology concluded in 2011 “…employers must take responsibility for the integrity of their employees’ research”. They also call for regulatory oversight to ensure funders and institutions fulfil their responsibilities [1]. Data management is a means to assure research integrity, and the UK Research Integrity Office (UKRIO) states in its Code of Practice:

Organisations should have in place procedures, resources (including physical space) and administrative support to assist researchers in the accurate and efficient collection of data and its storage in a secure and accessible form. Researchers should consider how data will be gathered, analysed and managed, and how and in what form relevant data will eventually be made available to others, at an early stage of the design of the project [2].

Legislative Change and Regulatory Compliance

A related point is that effective data management can mitigate risks to institutional reputation. These may surface as researchers balance requirements for disclosure and confidentiality. Measures to comply with Data Protection and Freedom of Information legislation need constant monitoring, given, for example, rulings by the Information Commissioner’s Office on the withholding of research data requested through FOI. Partly in response to the Independent Climate Change Emails Review in 2010, JISC developed new guidance for researchers in responding to FOI requests for research data [3]. Dr Malcolm Read, executive secretary of JISC, said at the time: “…We need to move away from a culture of secrecy and towards a world where researchers can benefit from sharing expertise throughout the research lifecycle” [4].

Funders' Data Policies

To foster good practice, Research Councils UK has coordinated a statement of Common Principles on Data Policy (see box below) asserting that “…making research data available to users is a core part of the Research Councils’ remit”.

The DCC tracks and summarises funder policies, including those of the Research Councils and some major charities [5]. The EPSRC, for example, now requires research organisations to preserve data securely for at least 10 years, and “… ensure that effective data curation is provided throughout the full data lifecycle, with ‘data curation’ and ‘data lifecycle’ being as defined by the Digital Curation Centre” [6].

The increasing UK activity in this area parallels significant international effort, especially across Europe, the US, and Australasia [7]. In the US, the National Science Foundation has mandated Data Management Plans as a condition for funding, and the European Commission is to require these plans for projects funded in its 8th Framework programme from 2014.

Summary of Research Councils UK Common Principles on Data Policy

Public good: Publicly funded research data are produced in the public interest and should be made openly available with few restrictions.

Planning for preservation: Institutional and project-specific data management policies and plans are needed to ensure valued data remains usable.

Discovery: Metadata should be available and discoverable; published results should indicate how to access supporting data.

Confidentiality: Research organisation policies and practices should ensure that legal, ethical and commercial constraints are assessed, and that the research process is not damaged by inappropriate release.

First use: Provision for a period of exclusive use, to enable research teams to publish results.

Recognition: Data users should acknowledge data sources and terms & conditions of access.

Public funding: Investment is appropriate and must be efficient and cost-effective.

(Full text at: http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx)

Research is Global and more 'Data Intensive'

Funders expect UK research to be international in scope. The Royal Society has reported that over a third of all articles published in international journals are internationally collaborative, up from a quarter 15 years ago [8]. Researchers need data management tools and services to work this way. Research data is itself often seen as a form of infrastructure, as it is the basis for ‘data intensive’ research, a trend spreading from fields such as genomics and astronomy across many domains. As the European Commission Riding the Wave report points out, this trend calls for ‘collaborative research data frameworks’ [9]. These should help develop the emerging pan-European collaborative research data infrastructure, and avoid isolating the islands of good practice.

Institutional Policy Responses

In response to these drivers, some UK Universities have started to develop policies on research data management [10]. Oxford University published its Commitment to Research Data Management in 2010 [11]. The University of Edinburgh’s adoption of the UKRIO Code of Practice for Research was an important stepping-stone to its Research Data Management Policy, announced in 2011 [12]. The policies do the following:

  • Identify areas of responsibility for the institution and for researchers
  • Commit the university to develop appropriate guidelines, training and support, including mechanisms and services for storage and backup
  • Support deployment of data repositories and/or mechanisms for registering metadata about research data
  • Recognise that management and curation of research data requires cooperation and coordination with research funders, and with existing national and international providers of data services and subject-based repositories

It is worth noting that these policies build on earlier work supported by the JISC Digital Repositories and Preservation programme in the projects EIDCSR and DataShare respectively. Other institutions are likely to develop similar policies to fit their specific needs and contexts. There remain open questions about exactly who is responsible, and when, at each point within the complex research ecosystem [13].

Further incentives for change are the Research Excellence Framework, and the Research Councils’ coordinated monitoring of research outputs and outcomes. Datasets have yet to make a mark in research assessment terms compared with the traditional article. This is likely to change with evidence that making data related to an article publicly available correlates with higher citation rates, at least in fields that have built the necessary repositories, standards and collaborative culture [14]. These include astronomy, where the number of research papers based on second use of data from the Hubble Space Telescope has now overtaken those based on the initially proposed use [15].

The development of standards and mechanisms for citing data, e.g. DataCite [16], and for identifying contributors, e.g. ORCID [17], will also help datasets gain more recognition as outputs in their own right. Standards-compliant research information systems will provide mechanisms to track dataset usage and enable this to be rewarded [18]. The requirements to do that will be grounded in evidence of the benefits. As the remainder of the briefing shows, the benefits of having data in a reusable form are the opportunities it creates – for services to lay the foundation for new research, create material for teaching and learning, improve engagement with the community and business, and inform policy or product development.
