From RDMF5 to RDMF12: How the world has changed…

A guest blog post from Catherine Jones of STFC, reflecting on changes undergone between the 2010 and 2014 RDMF events...

Catherine Jones, STFC | 02 December 2014

I last went to an RDMF meeting in Manchester in the autumn of 2010, when the topic was “The economics of Applying and Sustaining Digital Curation”. I have just attended and spoken at RDMF12, “Linking Data and Repositories (and other systems)”, and this blog post muses on the changes in this area over the last four years, as demonstrated by the content of these two RDMF meetings. In my view they can be summarised as a move from planning to implementation. There has also been a shift in emphasis from data curation to data management. This widening of scope helps us to ensure that the data we want or need to keep for the long term is in a fit state to be kept, through effective management throughout the lifecycle, and does not turn into a “digital landfill”, as Mark Thorley of NERC described it at RDMF12.

One thing that struck me in particular was the difference in the organisations the speakers represented. In 2010 we gathered to hear a funder, an established data centre, a publisher and experts in curation, and specifically in its costs, tell us about the policies, plans and costs of data curation, together with Jonathan Tedds representing researchers. In 2014 we gathered to hear institutional service providers and developers, research administrators and librarians tell us about data management policies, implementations and the importance of measuring the impact of research data, together with Jonathan Tedds, with a beer in hand this time, representing researchers.

Researchers and the data that they create are the reason that RDMF members develop services, and it is nice to find a researcher who is prepared to spend time away from proper research to give us some insight into their world. :)

The 2014 breakout groups discussed: selection of systems to provide the infrastructure; shared services; fostering cooperation with organisational stakeholders and tracking usage of research data. All these groups addressed real-life and hopefully solvable issues. In 2010 the groups discussed: sustaining digital curation; institutional awareness of sustainable digital curation and national versus institutional solutions, which were much more theoretical in nature as real-life institutional solutions were thin on the ground.

It is exciting to see that, in a relatively short amount of time, the RDMF membership has moved from discussing theoretical issues to demonstrating prototypes and real-life services. These services sit within a wider infrastructure, and the event acknowledged that there need to be links both to institutional systems, such as CRISs, finance and HR, and to systems which capture other parts of the research lifecycle, such as publication repositories and external data repositories. It will be interesting to see whether these emerging data repositories have the same difficulties in getting content that publication repositories have wrestled with over the last decade.

There are still common themes, such as the tension between national subject-specific data centres and institutional responsibilities, with an acknowledgement that researchers tend to follow domain practices and that research data formats and descriptions are in fact very domain-specific. The thorny issue of what research data should be kept for the long term, who decides, and who is going to add enough meaningful context to ensure it can still be used remains an open question, and it generated a lively debate in 2014. I’m sure I am not alone in opening spreadsheets six months after I created them and wishing that I had used better column names. Research data has the same issues, but at a much larger scale.

Several of the talks in 2014 expressed plans and aims for long-term data preservation, following on from providing active data management services, thus bringing the community back towards the 2010 topic of the costs of curation. RDMF12 was as enjoyable as RDMF5. The four years between the two have created a bigger community focussed on solving research data management needs, which apply to institutions large and small.

I would like to thank the DCC for organising and running these events; they contribute greatly to the establishment and maintenance of the data management community. I know I find them very useful and interesting.

I am nervous of predicting what RDMF20 will be saying in 2019, but I’ll do it anyway :) I think we will be talking about migrating both data management catalogues/systems and data formats, and about the costs of sustaining data management and curation, whilst debating whether institutional, regional or national services are the most effective. I look forward to seeing you all in 2019! :)

Catherine Jones

Catherine Jones works for the Science and Technology Facilities Council in the Scientific Computing Department as an Information Systems Project Manager. She is responsible for a team of developers who support the STFC’s Institution Repository (www.epubs.stfc.ac.uk), which gives access to publications, and the MRC’s Research Data Gateway (www.datagateway.mrc.ac.uk), which enables research data to be located through metadata searches. She has just finished leading the STFC contribution to SCAPE (http://www.scape-project.eu/), an EU-funded project examining large-scale digital preservation with a focus on the curation of research data, where she led work exploring how to capture context so as to preserve the meaning as well as the bits. The views in this blog post are the author’s own and do not represent the view of STFC. Email: catherine.jones@stfc.ac.uk