Review Excerpt: Managing Research Data
Sally Rumsey, Digital Collections Development Manager at the Bodleian Libraries, University of Oxford, reviews the book Managing Research Data, edited by Graham Pryor.
The following is an excerpt from the review published in Ariadne Issue 69. Full text is available here: http://www.ariadne.ac.uk/issue69/rumsey-rvw
A Practical Guide
Managing Research Data opens with information about the significant financial investment made in the UK to produce research data. It goes on to describe the data landscape that faces those involved in managing research data, particularly information professionals. The book serves as an excellent practical primer and guide to the world of RDM, especially for those coming fresh to this topic. In addition, existing information professionals who find themselves working in the area of RDM would benefit from consulting the book.
Chapter 1 would make a worthy item on library and information courses’ ‘required reading’ lists. It sets out the context clearly, complete with the dilemmas, difficulties and complexity that anyone involved with research data has to tackle. It is written with a sensitive understanding of the academic community and its viewpoints, for example a general distaste for high-level decrees, and how data curation can often work best as a collaborative effort between discipline experts and data curators. It is accepted (in chapter 7) that a combination of provision of RDM services within HEIs with data management mandates imposed by other bodies is not enough to change research practice overnight.
The structure of the book is that each chapter covers a broad topic relevant to research data management, such as policies, roles and responsibilities, and data management planning; the editor and authors should be commended for creating a volume that does this in a way that is clear, concise and of real practical value. The complexity of the landscape is acknowledged, including the impact of dealing with sensitive data.
It is inevitable in a book such as this that it provides a general, largely theoretical view. What those involved with RDM face daily are messier, real-life situations. For example, the expectation is that large amounts of data are influenced by policies and requirements for data management plans imposed by funders. However, the reality is that some institutions are faced with data produced by research that is unfunded and therefore not governed by external policies, or that is not, for whatever reason, able to be deposited in a specialist national data centre. Institutions need to work out what to do in these instances: what is to be retained, how to manage it, and how to pay for its curation.
Sarah Higgins describes the data management lifecycle, as conceptualised by the DCC (Digital Curation Centre). The discussions of the separate stages of the lifecycle could each stand perfectly well as independent briefings for those wanting a short overview.
The chapter describing developments in Australia and the US provides a contrast to the overall UK perspective of the book. It gives a useful overview of different models for broaching the problem of data management. I always admire the Australians and their propensity to think big, roll up their sleeves and act. There is a lot we in the UK can learn from what has been achieved and is planned on the other side of the world. The section on the US was interesting too, although it wasn’t clear how, if at all, DataOne relates to Data Conservancy, nor was it clear (as it was with the Australian model) how the US developments are being funded and how they therefore might continue in the long term.
Who Should Read This Book?
In the preface, the editor, Graham Pryor, explains that ‘initially, the aim of this book was to introduce and familiarise the library and information professional with the principal elements of research data management.’ He goes on to say that he believes it will serve a wider audience. I definitely agree, although I doubt that one group, active researchers who produce data, will read the book themselves (although I’d be delighted to be proved wrong).
Each chapter of the book provides a succinct and clear overview of key areas involved in RDM, most notably the chapters on policies, sustainability, emerging infrastructure and data management planning.
One topic I would like to have seen expanded is that of legal matters, rights, licensing and data ownership. These are areas that, as Angus Whyte says, ‘represent the most significant barrier to sharing data.’ This topic would therefore merit longer discussion within the book. Legal matters are key to the management and reuse of data, and they are an area where there is a certain lack of knowledge. Practitioners implementing RDM infrastructures need to ensure adequate provision of information and guidance.
As Brian Lavoie points out in his chapter, researchers have generally lacked incentives to store, manage, curate and share their data. This is a key point. The situation does appear to be changing, and reliable data citation is becoming more important. The DataCite service and the use of DOIs (Digital Object Identifiers) are becoming de facto standards for identifying and referencing data. I believe that this is going to become more accepted, and researchers will soon expect datasets to be published and cited using persistent identifiers and links.
I have a bit of a gripe about the tables and diagrams in the book. Table 3.1 giving details of research funders’ data policies is not laid out in a way that makes it easy to compare policy details. The shading in figure 9.3 showing the OAIS and Data Conservancy mapping is not clear, nor is the explanation of the diagram in the text. The reader is referred to nodes depicted as triangles and dots in the diagram demonstrating the conceptual overview of DataOne (Fig. 9.4), but they are too small to be immediately noticeable or useful.
This is an excellent book for anyone, not just information professionals, looking to ‘introduce and familiarise’ (Pryor, in the preface) themselves with a complex and challenging, yet increasingly important topic. The book benefits from a prestigious line-up of knowledgeable authors, including those who are actually ‘doing’ research and research data management. As an edited volume it fits well together as a single entity even though written by a number of individuals: chapters reference other chapters and the reader is not left with a sense of a ‘cobbled-together’ mix of disparate topics from different people. The content can be dipped into just as well as read from cover to cover.
There is always a danger with this type of book that the environment will have moved on since writing and publication, and indeed it has, with a fresh batch of JISC RDM projects underway and significant reports being published. There are also emerging social network services such as collaboration tools like Colwiz and Mendeley, sharing tools like Figshare, as well as blogs and wikis that are being increasingly used by researchers and which will have an impact on the research data management environment. However, I expect this book will remain a valuable resource for those working or intending to work in the field for some while yet.
List of Chapters
- Why manage research data? - Graham Pryor
- The lifecycle of data management - Sarah Higgins
- Research data policies: principles, requirements and trends - Sarah Jones
- Sustainable research data - Brian F. Lavoie
- Data management plans and planning - Martin Donnelly
- Roles and responsibilities – libraries, librarians and data - Sheila Corrall
- Research data management: opportunities and challenges for HEIs - Rob Procter, Peter Halfpenny and Alex Voss
- The national data centres - Ellen Collins
- Contrasting national research data strategies: Australia and the USA - Andrew Treloar, William Michener and G Sayeed Choudhury
- Emerging infrastructure and services for research data management and curation in the UK and Europe - Angus Whyte
Managing Research Data is published by Facet Publishing, ISBN 978-1-85604-756-2, price £49.95.