
Elephant invades room at JISC programme launch

A short report on a session about disciplinary challenges to data management planning at the JISC MRD 02 launch meeting in Nottingham, 1-2 December 2011.

A Whyte | 05 December 2011

The room in question was a glass box in Nottingham’s tribute to Space 1999, the National College of School Leadership, launch location for the second JISC Managing Research Data programme.

The launch brought together more than 70 people from 27 new projects, huddled in intensive bouts of round-table discussion over two days, interspersed with parallel workshop sessions.

One of these sessions on the second day included some spirited debate on ‘data management planning and meeting funders’ requirements’. Three people gave short talks relating to new six-month projects in strand B of the new programme: Julie McLeod, Professor of Information Management at Northumbria University, spoke on the Datum in Action project; Mansur Darlington from the University of Bath on the REDM-MED project; and Brian Hole on the REWARD project involving Ubiquity Press and University College London.

The elephant in question? This was ‘context’, or rather the representation of enough information about the context to make research data reusable and re-useful to a specific community.

Some would say this is well known in preservation circles as ‘representation information’. Nobody in the room suggested ‘context’ is a new problem, but there were certainly some who felt it goes relatively unacknowledged, compared with more familiar constraints such as confidentiality and copyright.

Julie McLeod has helped train health researchers in data management. That training is now embedded in postgraduate courses and has led to, among other things, praise from research ethics committees for the quality of postgraduate students’ ethics applications.

However, some health researchers using qualitative methods saw context as a barrier to sharing. For them it is an issue of epistemology, since (it is alleged) such methods call for the researcher to acquire an understanding of the phenomena studied in their context, an understanding that can’t be passed on by sharing data.

Some in the room detected a whiff of solipsism in such objections: if the researchers’ notes can’t be used to reproduce their interpretation, then surely that does not rule out their being used for other research purposes. Others recognised similar issues in other domains, experimental physics for example, where capturing sufficient information about the context is very far from trivial.

All of which points to the need to be selective: to apply to data the methods archivists use for appraisal, in order to work out how much effort to expend on giving data some level of reusability.

This discussion segued neatly into Mansur Darlington’s talk about REDM-MED, since this project is trying to support the timely recording of context in metadata as the research unfolds. It will expand on workflow modelling conventions developed in the ERIN project. The project also asks: what characteristics increase or decrease the ‘re-usefulness’ of data? Mansur proposed these ought to include findability, readability, comprehensibility, interpretability, admissibility, and desirability. I especially liked Mansur’s conclusions that “we have to respond to the detail of data’s diversity” and “some data will be reuseless for ever!”

Thirdly, Brian Hole talked about REWARD, which is integrating archaeological researchers’ workflows with systems for publishing research articles and depositing these in the UCL institutional repository. This project has a different take on context, developing the data paper concept for this domain.

By publishing a data journal for archaeology, the project offers a motivation for researchers to document contextual information in a standard way. A data paper describes and may point to the data collected, and identifies the methods used and the potential for reuse. Peer reviewed, and with a Crossref DOI to make the paper count as an assessable output, data papers offer one form of pay-back for effort in documenting context and making data re-useful.