Common Directions in Research Data Policy

By Angus Whyte and Martin Donnelly

Published: 4 August 2016

Please cite as: Whyte, A. and Donnelly, M. (2016). ‘Common Directions in Research Data Policy: a Briefing for Institutions’. DCC Briefing Papers. Edinburgh: Digital Curation Centre. Available online: /resources/briefing-papers

Introduction

This briefing takes stock of recent developments in research funders’ data policies and expectations. The focus is on developments most directly affecting UK Higher Education institutions and research institutes, including the Concordat on Open Research Data, and recent changes in Cancer Research UK policy. The briefing will be of particular interest to those within Higher Education responsible for the management or governance of research data.  It assumes the reader already has some familiarity with funder policies in this context.

A Global Policy Environment

Research data policies seeking to open up research data have been prominent for more than a decade now, driven by a consistent set of key concerns:
  • increasing the efficiency of research spending [1]
  • supporting the ‘self-correction’ mechanisms in science and scholarship that provide research transparency and integrity [2]
  • providing a catalyst for innovation and economic growth [3]
The promise of data science has renewed the vigour of policy efforts to catalyse innovation, as seen, for example, in the UK Government’s open data strategy.[4] In European policy it is reflected in the European Cloud Initiative, and the European Commission’s declaration that it would be “opening up by default all scientific data produced by future projects under the Horizon 2020 research and innovation programme” from 2017.[5]
 
The European Commission’s roadmap towards a supporting environment for Open Science, the European Open Science Cloud (EOSC), places a strong emphasis on “immediate, affirmative policy action…in close concert with Member States…to realise the first phase of a federated, globally accessible environment where researchers, innovators, companies and citizens can publish, find and re-use each other's data and tools for research, innovation and educational purposes”. [6]
 
Funding body policy is a response to Government policy, whether directly or through collective bodies such as the OECD, EU, or the G7/G8. Research data policy has also had a significant place in recent G8 Science Ministers’ statements, for example in the following principles at the 2013 G8 summit:
  • “To the greatest extent and with the fewest constraints possible, publicly funded scientific research data should be open, while at the same time respecting concerns in relation to privacy, safety, security and commercial interests, whilst acknowledging the legitimate concerns of private partners;
  • Open scientific research data should be easily discoverable, accessible, assessable, intelligible, useable, and wherever possible interoperable to specific quality standards; and
  • To ensure successful adoption by scientific communities, open scientific research data principles will need to be underpinned by an appropriate policy environment, including recognition of researchers fulfilling these principles, and appropriate digital infrastructure.” [7]
This statement signalled a strengthening of research funders’ commitment to an ‘open by default’ position on data, and to resourcing the infrastructure, both technological and organisational, needed to provide that. The emphasis on mechanisms to provide reward and recognition for data sharing is also likely to influence the UK Government’s successor to the Research Excellence Framework.
 
The policy emphasis on opening data up to encourage innovation is coupled with firmer regulation on data protection, which all organisations doing business with the EU will need to apply from May 2018.[8]  This will place a greater onus on research organisations to ensure the trade-offs between openness and security, as expressed in the G8 Science Ministers’ first principle, are properly managed.  
 
Recognising the potential for confusion around how research data producers should apply the concept of ‘openness’, there have been some moves by funders and others to simplify common principles. One example is the FAIR principles, [9] which emerged from the FORCE11 initiative and are strongly influenced by academic publishers. The FAIR principles simply state that data should be Findable, Accessible, Interoperable and Reusable.
 
FAIR principles illustration
Illustration by Scientific Data, Nature Publishing Group [9]
These principles are helping to shape development of the European Open Science Cloud, and are also being applied by UK funders. For example, in its recently revised data policy, the Economic and Social Research Council recommended FAIR as guidance on best practice in data stewardship, whether for data deposited in the UK Data Service or elsewhere. [10]

RCUK Guidelines and Concordat on Open Research Data

Since 2011, Research Councils UK (RCUK) has provided Common Principles on Data Policy, around which the seven individual Councils orient their discipline-specific policies. [11] In July 2015, RCUK released further Guidance on best practice in the management of research data, [12] offering funded research institutions and investigators explanatory text on each of the seven ‘common principles’. This guidance was intended to inform the RCUK consultation on a draft Concordat on Open Research Data.[13]
 
The Concordat on Open Research Data is the work of a broad coalition of UK funders under the aegis of the UK Open Research Data Forum. The coalition includes RCUK, HEFCE, Universities UK, Jisc, and Wellcome Trust, plus a broad range of interested bodies including DCC.[14] The Concordat sets out the working group’s shared understanding of the different responsibilities of researchers, their employers, and funders of research. The document complements existing frameworks, grant conditions and guidelines. Rather than providing further mandates, it aims to frame expectations of good practice, and establish open research data as the desired position for publicly-funded research over the long term.
 
The Concordat reiterates the seven RCUK principles, albeit in a different form and with a number of additional points, highlighting research organisations’ shared responsibility to:
  • Provide support with data management for publicly funded research, and review progress in making research data openly accessible
  • Ensure data is curated to make it discoverable and useable 
  • Meet service and infrastructure costs in proportion to realisable benefits
  • Clarify when restrictions are acceptable and the necessity to justify them
  • Observe legal, ethical and regulatory frameworks
  • Make the data underlying publications accessible and citeable
  • Support data skills development
The public consultation on the draft concordat led to 80 responses from universities, research groups, individuals and organisations across the sector, many informed by internal consultations with researchers, or experiences of providing support in research-intensive environments. These included, for example, a public joint response from the universities of Bristol, Cambridge, Manchester, Nottingham, and Oxford. [15]
 
The responses were broadly supportive, and highlighted some common points that are recurring issues for those implementing data policy. In particular, respondents saw a need for policy to:
  • Avoid bias in policy terminology towards STEM subject areas, and recognise the different methodologies and nomenclature used in the Arts and Humanities.  
  • Recognise the challenges in delineating boundaries between research articles, code, and data; and between digital and non-digital data.
  • Recognise that researchers need to make informed decisions, and justify them, on the trade-offs between the risk and benefits of sharing data, particularly in health and social science research.
  • Recognise that while public-private research collaborations need to make the data they produce openly available without undue restrictions, other mechanisms like IP sharing may apply to derivative data products that add value to that data.
  • Recognise and reward the effort to manage data during the research cycle, as well as end-of-cycle data sharing activity.
  • Help institutions deal with costs of providing services and infrastructure, by providing better-defined frameworks for recovering costs from funders, and ‘responsible metrics’ that incentivise sharing by measuring the returns from open data.[16]
The consultation responses suggest policy will be ahead of research organisations’ practice for some time. Respondents also called for further support for shared repositories, to avoid duplication. The question of ‘who pays’ remains a major barrier to transitioning between pilot projects and ‘business as usual’ services.

Who pays?

The costs of research data support and infrastructure are likely to remain a keenly felt ‘pain point’ for some time. In principle, all UK Research Councils endorse the ‘efficient and cost-effective’ use of public funds to support the management and sharing of publicly-funded research data. RCUK provided substantial guidance on this in a 2013 blog post, which responded to questions raised at a Research Data Management Forum ‘meet the funders’ event.[17] Nevertheless, a clear consensus has yet to emerge about exactly which public funds an institution should draw from in order to resource a research data management (RDM) service, i.e. whether to charge out costs to research grants as direct costs, recover them through indirect costs, and/or rely on other sources such as QR income from funding councils.
 
Among the more challenging aspects of policy implementation is the pressure by Government and Research Councils, following the Wakeham Review, to reduce indirect costs.[18] In short, institutions are expected to meet many of the costs of establishing institutional RDM services while at the same time economising on central services. Furthermore, Research Councils do not permit costs to be charged to grants for post-project spending, or for services that are provided free of charge to non-RCUK funded researchers, such as those financed by charities.
 
Adding to the financial challenge, charities have historically not seen it as their role to fund the overheads of research institutions. This may be changing, however. For example, in 2015 Cancer Research UK changed its approach so that “appropriately justified” costs can be included in grant proposals. Specifically, applicants can now include “proportionate, relevant data management sharing activities and resources as a running cost in applications.”[19] This may cover, for example, data managers’ salary costs, and/or archiving and repository costs, but not OA article processing charges, which are dealt with separately.

Where next for UK Research Data Policy?

UK funding body data policies have undergone some consolidation over the last five years, and there may be further moves towards a harmonised layer of RCUK data policy, and perhaps more joined-up investment in data infrastructure at a national level.

The ‘direction of travel’ towards greater openness is unlikely to change. And given the importance of European research networks to UK researchers, EU data policy is likely to continue to influence UK funder policies. A key development here is the European Open Science Cloud, which is described as the “EU contribution to an Internet of FAIR Data and Services underpinned with open protocols” and which “operates under well-defined and trusted conditions”.[20]
 
Funders recognise that change in data sharing culture happens neither painlessly nor overnight.  Steps to monitor policy implementation are perhaps more likely in the short term than major changes in emphasis. Those steps will likely include stronger compliance checks. It took some years from the introduction of Open Access policies to the routine monitoring of compliance, although it took some funders fewer years than others.[21]  
 
We can discern a number of further trends that are likely over the next 2-3 years.
 

Monitoring of data management plans

Research funders have begun, albeit sporadically, to review the quality of the data management plans (DMPs) they receive.[22] We can expect early steps towards peer review of DMPs to be followed through with monitoring of the extent to which plans have led to deposition of data in a public repository.

Post-award data management plans

All funded H2020 projects are now required to submit a Data Management Plan six months post-award. [23] The DMP is then updated later in the project, a policy process with some similarities to NERC’s post-award DMP, which is produced by principal investigators in collaboration with the relevant NERC data centre.[24] It is possible that other UK funders will take further steps in the direction of requiring grant holders to update DMPs for successful project bids, to firm up their intentions to make the resulting data openly accessible and reusable.

Institutional audits

Research Councils’ routine audits of the institutions they fund are likely to include RDM support provision among their criteria, as first mooted by the EPSRC in relation to its 2015 policy mandate.[25]
 

Output monitoring

The EPSRC began steps in this direction with ‘dipstick testing’ of articles to check the availability of underlying data. The infrastructure for automatically tracking and correlating identifiers for researchers, articles and datasets is now becoming well established, and will almost certainly be used to monitor data policy impacts.
 

Rewards for data management support

The recent Research Excellence Framework (REF) review by Stern highlights research data management as an example of the kinds of research facility that should count towards assessment of the ‘Institutional Environment’, and future funding settlements.[26]  While plans for the next REF have yet to be established, there are strong indications that it will “reward research environments that deliver open access to a wider set of outputs than just journal articles and conference papers”.[27] Further incentives to attract submissions of datasets for review by subject panels seem likely, given their almost complete absence in past REF exercises. Considering the likely enhanced role of metrics in assessment, data sharing metrics may also have a role, and results of the European Commission’s consultation on “responsible metrics and evaluation for open science” will be influential on UK policy.[28]

Taking data policy further in your institution

Your institution may already have formulated a policy on research data and its management. If not, guidance is available from the DCC,[29] and the EU project RECODE.[30] Higher Education institutions in the UK have been world-leading in their development of policy, but will need to ensure their data policies reflect the changing environment. Below we recommend points to consider. 
 

Keep data open for mining

Institutions will likely be keener to count their research data as a true financial asset when they find that innovative data products and research intelligence can be obtained from mining it. Publishers and other third parties will want to mine research data produced by your institution, finding patterns in the connections between data producers, their data, domains, and institutional affiliations. Institutions should resist any lure to hand over text and data mining (TDM) rights to any third party that would limit the ability of others to create derivative products, for example by limiting access to specific APIs or by using licences that close off further commercial reuse.

 
Keeping text and data mining open will mean your institution’s research data will have a better chance of stimulating innovation and benefiting society. It may also help the institution obtain competitive advantage. When your researchers make their data FAIR to others beyond the institution, they also make it FAIR within your institution. Researchers who are geographically close will be well placed to work together, drawing on local knowledge and expertise to exploit synergies from novel connections others find between their data, domains, instruments and methods. The more people who can find and openly publish those connections, the greater your institution’s opportunity to exploit them locally. It is therefore in the institution’s interests to have data policies that safeguard the right of anyone and everyone to reuse, and ensure that TDM rights are not handed over to third-party services that researchers use.

Develop the culture and skill-sets to value and exploit research data

Whatever charging models institutions and funders put in place, senior academics’ willingness to identify their data management effort in grant proposals will depend in part on their recognition of data as a valued output. Examples of funded proposals that cost the time researchers intend to spend on preparing data will help get the message across that it is worthwhile accounting for this time. Policy work at the department or research centre level can also encourage this by translating generic research data policies into their contexts, especially in terms of the kinds of data that are normally worth retaining, and what support or sources of good practice are available in their own research community. Open Access policies at the sub-institution level already make up around 10% of Open Access policies worldwide.[31] RDM services can help ensure that academic groups are informed of policies relating to open data in domains relevant to them, including the data policies of journals in these domains.

 
Funder data policies play some part in identifying disciplinary factors that give data value, but it is necessarily a small one. For example, the BBSRC identifies a number of areas, such as systems biology, where there is a particularly strong scientific case for data sharing.[32] However, Research Councils are unlikely to develop more detailed guidance on what data types are of value than they already provide. Policy-makers recognise that it is undesirable for them to direct researchers on what research data they should be collecting and retaining, as these decisions are bound up with research questions that researchers need to identify for themselves.
 
Institutions also have a clear role in data science skills development. They have roles in helping their researchers to exploit data effectively, in enabling their professional services to offer the data stewardship that data science requires, and in offering the computing and storage environment to make it happen.  Institutional data policy has an important role in identifying roles and responsibilities, and policy review can help signal to professional services that priorities and roles do not stand still. For example, coordination between services may be required to ensure data science competencies are included in research career pathways, that these are followed through in skills development, and that the competencies are subsequently recognised.[33] Similarly, further development of data support staff roles and responsibilities may be needed to help researchers find the datasets, and apply data analytics and visualisation tools to fulfil their research needs.[34]
 

Offer policy guidance that recognises shades of grey

Research Councils understand that their data policy principles are not absolutes. Different research communities will need to articulate for themselves the use cases for open data, reflecting the cost-benefits of making data FAIR. Decisions on the cost-benefit balance involve trade-offs between the policy principles. Your researchers should be capable of working out how the generic principles apply to their own situations. One measure of your institution’s effectiveness in implementing its own RDM policy is whether your researchers can offer an informed account of their decisions regarding publicly funded research data.
 
Figure 1 is a visual aid that academic groups may find helpful in formulating their own guidelines, or in applying the RCUK principles to data produced from funded projects. Given the overall ‘open by default’ policy context, the presumption should be towards the top right corner: towards a position on copyright/IPR that favours recognition rather than first use, towards making data discoverable rather than limiting access, and towards making it interoperable and reusable as a public good, rather than managed as the shared property of a closed club.
 
On the other hand, funders also acknowledge that exceptions can be expected, and justified. Figure 1 can also be used to highlight the circumstances in which funders would expect the grant holder to make a case for the position taken towards data produced from a grant, on the three dimensions shown:
  • The data itself: what reasons are accepted in your research community for not making publicly funded research data of the types produced by this project interoperable and reusable?
  • Access: in what contexts are constraints acceptable in your community that would over-ride the public interest in gaining access to this data?
  • Intellectual property: what is the commercial case for giving higher priority to exploiting any property rights in this data than to seeking recognition for making it findable, accessible, interoperable and reusable?
 

 Dimensions of data policy
Figure 1. ‘Balancing principles for open data’, Source: Juan Bicarregui, Science & Technology Facilities Council (STFC) [35]
 
 
Further advice on criteria for retaining data is available in the DCC guide ‘Five Steps to Decide What Research Data to Keep’. [36]
 

Share information to help the sector move forward

Policy implementation inevitably requires institutions to trade off the desirable against the feasible. This should become more straightforward as the sector establishes norms for RDM service provision, reflecting demand from both researchers and funders, and costing models that both can support. Institutions can help to establish normal practice by sharing information on the services they have in place and want to develop, in response to DCC and Jisc surveys and community events.

Conclusion: open is still the way forward

Research data policies are a product of a wider policy environment that faces major long-term uncertainties, for example as a result of the June 2016 ‘Brexit’ referendum. However, in the short term the UK’s legal eligibility to participate in the European Union’s Horizon 2020 programme will not be affected.[37] In any case, EU data policy and legislation will continue to exert strong influence on the UK research environment. And the broader international picture, articulated by bodies such as OECD and the G8 Science Ministers, is a continued strengthening of research data policy commitments towards open data and support for open innovation.

References

[1] Organisation for Economic Co-operation and Development, 2007. ‘OECD principles and guidelines for access to research data from public funding’, Paris, France: OECD.
[2] Boulton, G. et al., 2012. ‘Science as an open enterprise’. The Royal Society. Available at: https://royalsociety.org/topics-policy/projects/science-public-enterpris...
[3] See for example OECD, 2004. ‘Science, Technology and Innovation for the 21st Century. Meeting of the OECD Committee for Scientific and Technological Policy at Ministerial Level, 29-30 January 2004 - Final Communique’. Available at: http://www.oecd.org/science/sci-tech/sciencetechnologyandinnovationforth...
[4] Department for Business, Innovation & Skills, 2014. ‘Open data strategy’. Available at: https://www.gov.uk/government/uploads/system/uploads/attachment_data/fil...
[5] European Commission. 19 April 2016. ‘European Cloud Initiative to give Europe a global lead in the data-driven economy’. Press release. Available at: http://europa.eu/rapid/press-release_IP-16-1408_en.htm
[6] European Commission High Level Expert Group on the European Open Science Cloud, 2016. ‘A Cloud on the 2020 Horizon. Realising the European Open Science Cloud: first report and recommendations’. Available at: https://ec.europa.eu/research/openscience/pdf/hleg/hleg-eosc-first-report_(draft).pdf
[7] G8 science ministers statement: London, 12 June 2013. Available at: https://www.gov.uk/government/publications/g8-science-ministers-statemen...
[8] Europa. 29 June 2016. ‘Protection of Personal Data’. Available at: http://ec.europa.eu/justice/data-protection/
[9] FAIR Principles: see for example Wilkinson, M. D. et al. 2016. ‘The FAIR Guiding Principles for scientific data management and stewardship’. Sci. Data 3:160018 doi: 10.1038/sdata.2016.18
[10] Economic and Social Research Council, 2015. ‘Research Data Policy’. Available at: www.esrc.ac.uk/funding/guidance-for-grant-holders/research-data-policy/
[11] RCUK, 2011. Common Principles for Data Policy. Available at: www.rcuk.ac.uk/research/datapolicy/
[12] RCUK, 2015. Guidance on best practice in the management of research data. Available at: www.rcuk.ac.uk/documents/documents/rcukcommonprinciplesondatapolicy-pdf/
[13] Concordat on Open Research Data. 2015. Available at: www.rcuk.ac.uk/media/news/160728/
[14] Universities UK, 2012. ‘Concordat to Support Research Integrity’ also addressed data stewardship alongside other issues.
[15] University of Cambridge Office of Scholarly Communication ‘Unlocking Research’ Blog post, 1st Oct. 2015. Joint response on the draft UK Concordat on Open Research Data. Available at: https://unlockingresearch.blog.lib.cam.ac.uk/?p=285
[16] ‘Responsible metrics’ means well-designed evaluation criteria, e.g. as described in Wilsdon, J. 2015. ‘The Metric Tide: Report of the Independent Review of the Role of Metrics in Research Assessment and Management’ DOI: 10.13140/RG.2.1.4929.1363 Available at: https://responsiblemetrics.org/the-metric-tide/
[17] Ben Ryan ‘Supporting research data management costs through grant funding’ Research Councils UK Blog post. July 9, 2013 Available at: