Review Excerpt: Managing Research Data

10 August, 2012 | in DCC News
By: Magdalena Getler

Sally Rumsey, Digital Collections Development Manager at the Bodleian Libraries, University of Oxford, reviews the book Managing Research Data, edited by Graham Pryor.

The following is an excerpt from the review published in Ariadne Issue 69. Full text is available here: http://www.ariadne.ac.uk/issue69/rumsey-rvw

A Practical Guide

Managing Research Data opens with information about the significant financial investment made in the UK to produce research data. It goes on to describe the data landscape that faces those involved in managing research data, particularly information professionals. The book serves as an excellent practical primer and guide to the world of research data management (RDM), especially for those coming fresh to this topic. In addition, existing information professionals who find themselves working in the area of RDM would benefit from consulting the book.

Chapter 1 would make a worthy item on library and information courses’ ‘required reading’ lists. It sets out the context clearly, complete with the dilemmas, difficulties and complexity that anyone involved with research data has to tackle. It is written with a sensitive understanding of the academic community and its viewpoints, for example a general distaste for high-level decrees, and how data curation can often work best as a collaborative effort between discipline experts and data curators. It is accepted (in chapter 7) that a combination of provision of RDM services within higher education institutions (HEIs) with data management mandates imposed by other bodies is not enough to change research practice overnight.

The structure of the book is that each chapter covers a broad topic relevant to research data management, such as policies, roles and responsibilities, and data management planning; the editor and authors should be commended for creating a volume that does this in a way that is clear, concise and of real practical value. The complexity of the landscape is acknowledged, including the impact of dealing with sensitive data.

It is inevitable that a book such as this provides a general, largely theoretical view. What those involved with RDM face daily are messier, real-life situations. For example, the expectation is that large amounts of data are influenced by policies and requirements for data management plans imposed by funders. However, the reality is that some institutions are faced with data produced by research that is unfunded and therefore not governed by external policies, or that cannot, for whatever reason, be deposited in a specialist national data centre. Institutions need to work out what to do in these instances: what is to be retained, how to manage it, and how to pay for its curation.

Sarah Higgins describes the data management lifecycle, as conceptualised by the DCC (Digital Curation Centre). The discussions of the separate stages of the lifecycle could each stand perfectly well as independent briefings for those wanting a short overview.

The chapter describing developments in Australia and the US provides a contrast to the overall UK perspective of the book. It gives a useful overview of different models for broaching the problem of data management. I always admire the Australians and their propensity to think big, roll up their sleeves and act. There is a lot we in the UK can learn from what has been achieved and is planned on the other side of the world. The section on the US was interesting too, although it wasn’t clear how, if at all, DataOne relates to Data Conservancy, nor was it clear (as it was with the Australian model) how the US developments are being funded and how they therefore might continue in the long term.

Who Should Read This Book?

In the preface, the editor, Graham Pryor, explains that ‘initially, the aim of this book was to introduce and familiarise the library and information professional with the principal elements of research data management.’ He goes on to say that he believes it will serve a wider audience. I definitely agree, although I doubt that one group, active researchers who produce data, will read the book themselves (although I’d be delighted to be proved wrong).

Each chapter of the book provides a succinct and clear overview of key areas involved in RDM. Most notable are the chapters on policies, sustainability, emerging infrastructure and data management planning.

One topic I would like to have seen expanded is that of legal matters, rights, licensing and data ownership. These are areas that, as Angus Whyte says, ‘represent the most significant barrier to sharing data.’ This topic would therefore merit longer discussion within the book. Legal matters are key to the management and reuse of data, yet they are an area where there is a certain lack of knowledge. Practitioners implementing RDM infrastructures need to ensure adequate provision of information and guidance.

As Brian Lavoie points out in his chapter, researchers have generally lacked incentives to store, manage, curate and share their data. This is a key point. The situation does appear to be changing, and reliable data citation is becoming more important. The DataCite service and the use of DOIs (Digital Object Identifiers) are becoming de facto standards for identifying and referencing data. I believe that this is going to become more accepted, and researchers will soon expect datasets to be published and cited using persistent identifiers and links.

I have a bit of a gripe about the tables and diagrams in the book. Table 3.1, giving details of research funders’ data policies, is not laid out in a way that makes it easy to compare policy details. The shading in figure 9.3, showing the OAIS and Data Conservancy mapping, is not clear, and neither is the explanation of the diagram in the text. The reader is referred to nodes depicted as triangles and dots in the diagram demonstrating the conceptual overview of DataOne (Fig. 9.4), but they are too small to be immediately noticeable or useful.

Conclusion

This is an excellent book for anyone, not just information professionals, looking to ‘introduce and familiarise’ (Pryor, in the preface) themselves with a complex and challenging, yet increasingly important, topic. The book benefits from a prestigious line-up of knowledgeable authors, including those who are actually ‘doing’ research and research data management. As an edited volume it fits together well as a single entity even though it was written by a number of individuals: chapters reference other chapters and the reader is not left with a sense of a ‘cobbled-together’ mix of disparate topics from different people. The content can be dipped into just as well as read from cover to cover.

There is always a danger with this type of book that the environment will have moved on since writing and publication, and indeed it has, with a fresh batch of JISC RDM projects underway and significant reports being published. There are also emerging social network services, including collaboration tools like Colwiz and Mendeley and sharing tools like Figshare, as well as blogs and wikis, that are increasingly being used by researchers and that will have an impact on the research data management environment. However, I expect this book will remain a valuable resource for those working or intending to work in the field for some while yet.

List of Chapters

  1. Why manage research data? - Graham Pryor
  2. The lifecycle of data management - Sarah Higgins
  3. Research data policies: principles, requirements and trends - Sarah Jones
  4. Sustainable research data - Brian F. Lavoie
  5. Data management plans and planning - Martin Donnelly
  6. Roles and responsibilities – libraries, librarians and data - Sheila Corrall
  7. Research data management: opportunities and challenges for HEIs - Rob Procter, Peter Halfpenny and Alex Voss
  8. The national data centres - Ellen Collins
  9. Contrasting national research data strategies: Australia and the USA - Andrew Treloar, William Michener and G Sayeed Choudhury
  10. Emerging infrastructure and services for research data management and curation in the UK and Europe - Angus Whyte

Managing Research Data is published by Facet Publishing, ISBN 978-1-85604-756-2, price £49.95.