Making research data more discoverable: gathering future requirements for the Jisc RDRDS

19 June, 2014

What can we do to increase the discoverability of research datasets, in order to maximise their re-use?  The DCC is piloting approaches to a UK research data registry and discovery service to answer this question.  As part of this work, we held a workshop in London on Monday 16 June 2014 to report on progress so far and to gather feedback and requirements for future activity. 

UK research funders are increasingly establishing expectations of sustainable management of research data.  These datasets currently find homes in a mixed landscape of unmanaged storage, institutional data repositories, and subject-specific databases and data centres.  Initiatives such as the DCC and the recent Jisc Managing Research Data programme have aimed to shift research data destinations as far as possible from unmanaged to managed storage.  The benefits of such a shift are promulgated by research funders, the Royal Society and a variety of international initiatives and publications.  But even when research data is deposited in managed, well-curated, appropriate storage, it can be difficult to discover whether it exists and exactly where it has gone.  Clarification in this area will benefit researchers looking for data for their work, as well as institutions and funders keen to track the impact of the research they support.

The RCUK-funded Gateway to Research (GtR) provides valuable information such as publications, people, organisations and outcomes relating to RCUK-funded research projects.  Links to some research datasets are available through the ‘research materials’ outcome type on the GtR portal, but these links are not routinely collected and exposed, as GtR does not specifically aim to aggregate and promote discovery of research data.

Other initiatives such as Re3data are doing important work identifying and describing available repositories to help researchers find the most appropriate home for their data, but support is particularly needed for researchers who would like to find research data from their own subject area, regardless of the related funder or institution.  A discovery service would need to reflect the holdings of subject-specific databases and data centres as well as the subject-agnostic holdings of university-based data repositories.

My own experience on the Jisc Incremental project was that researchers – like the general public – are likely to use generic search engines to locate and access data sources, in addition to subject-specific databases, and so a discovery service must also interoperate with and enhance researchers’ existing search strategies.

These points were confirmed by our lively and engaged participants at Monday’s workshop.  After presentations from the current project team on technical development, metadata issues and the liaison work with HEIs and data centres (and plenty of coffee!), we moved into structured discussion groups, which provided requirements and desiderata for future work on the pilot discovery service.  Each group focused on one of the following aspects of the pilot work: technical issues, metadata issues, workflows and use cases.

As a team, we are keen to deliver a service that is focused in scope and performs well to meet the needs of our user communities.  As such, we appreciate input and feedback on our activities, which help us to achieve those aims.  We’d like to thank all active participants from our first phase of activity and also all those who attended Monday’s event and provided such enthusiastic input to our discussions.

We’ll be continuing to report and reflect on emerging requirements and other developments here on the DCC project webpage; in the meantime, do get in touch with any questions: laura.molloy AT glasgow.ac.uk.