A journey shared – on the road to research data management.
Angus Whyte has reported through this blog on some of the outcomes from the Workshop on Developing Institutional Research Data Policies held in Leeds in mid-March, focusing on how institutions are responding to EPSRC expectations and mandates for research data management. Referring to other posts in this series, he summarised the characteristics of a roadmap as combining “aspects of strategy and schedule”.
Laura Molloy developed some of the emerging themes from the same workshop, for example the discussion on the trade-offs of a policy-first approach, the need for buy-in from researchers and senior management, the different elements of infrastructure, how to describe benefits to researchers, and the various institutional policies which are relevant to research data management.
What became very clear quite quickly during the workshop was that sharing experiences between institutions is a very valuable exercise. The aim of this post, which is complementary to the two above, is to draw on some of the lessons imparted by the participants of the workshop so they can be shared more widely through this blog.
The first lesson is that each institution is unique. The differences in each institutional profile include its current location on the road to policy and strategy development, the resources it can draw on, and the history it is building on, amongst others. This uniqueness is reinforced by the EPSRC message that there is no one-size-fits-all for roadmaps and they can only be reviewed on a case-by-case basis.
At the workshop, Ben Ryan from the EPSRC re-iterated that the roadmap is all about showing that what needs to change [at a particular institution] is understood, and that there is high-level support within the institution to make it happen. The detail of the roadmap will then be particular to the institution, depending on the institution’s starting point and its strategy, and acts as an internal planning tool. The roadmap should show that the institution is putting infrastructure into place – technical, as well as policy and training, with responsibilities being allocated.
The uniqueness and variety of institutions were clearly on display at the workshop, where the universities represented came in all shapes and sizes. This was reflected in the approaches that have been taken. One large research-intensive institution has a well-funded programme to create a portal and integrate various services, with members of different support branches (library, IT) delivering information and fielding queries through various channels.
This suits a university with a huge complement of staff and the resources to invest in technical infrastructure. Some smaller institutions, on the other hand, have taken advantage of a culture of easier and smoother decision-making processes, making significant inroads on the policy-making front, with an agility that is perhaps not typical everywhere.
One piece of advice that was often repeated was the importance of describing the benefits of research data management as an advocacy measure. The benefits should be articulated to reinforce the value of data to the various stakeholders, including the institution and researchers.
One perceived difficulty is that data is hard to define: what counts as valuable data needs to be taken in context, and may require discussion at discipline level. Defining data and the institution also helps to scope the policy and the strategy. One challenge in formulating policy is to couch it positively while still making it feel consequential, and to make sure that it sounds relevant and resonates with those it will affect.
Storage issues and use of repositories are a subject of continuing work. Participants explained that whilst existing institutional storage provision may be fragmented, the available and currently-used infrastructure and services need to be factored into the strategy.
Some institutions are investigating partnerships with a commercial firm for file virtualisation. External repositories were also considered an important component of the storage solution, and the idea surfaced of developing a method of marking external repositories as having ‘approved’ status (with associated guidance to researchers). However, the definition of an ‘approved’ repository was still somewhat vague. The description of an approved repository, the compilation of repository lists, and the sharing of this information formed an area that participants felt would be worthy of further joint effort. In the here and now, the problem of DUDs (Data under desktops) was commonly acknowledged.
Several methods of communication within institutions are employed. Lunchtime seminar series, existing mechanisms (e.g. heads of department meetings), engagement with stakeholder groups, annual research staff conferences and the use of champions were all mentioned, with champions frequently being in quite senior positions such as PVC research.
Nevertheless, effective communication is sometimes found to be challenging. However, when rolling out policy and getting feedback, there was a general view that little negative reaction was encountered, with most questions relating to the specific details of implementation. Looking forward to ways of informing researchers about working policies once they had been developed, there was interest in using methods of communication that scale, e.g. webinars, podcasts and interactive methods.
Selection and appraisal of data for preservation is an area where practice is still developing. Problems with selection include making judgements on what is ‘interesting’ enough to be kept or shared – this often needs to be defined at discipline or researcher level. Some disciplines (creative arts, archaeology) have a culture of keeping everything. A process needs to be developed for keeping contextual data that enables re-use (metadata), and for regularly reviewing what is kept.
The discussion recognised that some of the metadata (e.g. grant numbers) are currently held in disparate administrative systems which are not joined up. Ben Ryan emphasised the importance of data that is critical to understanding research and particularly data that underpins research, with a focus on research that has been published.
Tackling unfunded research, on the other hand, proved to be a recurrent theme during discussions. Participants recognised that although management of this research was not driven by funder requirements, it still needed to feature in the institutional strategy and be considered in policies.
Considerations need to include the simply practical: for example, the lack of a grant number means that metadata systems driven by grant numbers may miss this research, and its existence will not be picked up by systems that rely on grant support offices to trigger processes related to data management and planning. Furthermore, costing models will need to take into account that this research is not being funded externally, and the cost of its longer-term management needs to be accommodated internally.
Participants also described some of the challenges they were anticipating. Once again these were as different as the institutions present, and encompassed: tackling the long tail of data; offering research data management tools that are usable and well-used; evolving from a project to a sustainable embedded service that can tackle data management at scale; overcoming a previous reputation for not being a helpful service; gaining understanding within the university that RDM includes many activities (not just storage); ownership of processes; recognition of costs and institutional investment; and winning hearts and minds.
It is planned that these and other lessons will be developed in a more substantive report from the meeting to impart the messages more widely. In the meantime, this quote, from the Research Data Toolkit Project, sums up what to me seems to have been the main outcome of the meeting so far - that those present left feeling enriched with ideas, and with lots to think about and actions to follow up within their own institution:
“The great value in this event was the perspective gained from a large group of people, acting on the same imperative, via different paths. It was also important to me that the head of our Research Grants Team attended the workshop and returned a fully clued up, engaged and co-opted member of the RDTK effort, buzzing with ideas to better embed data management planning in pre-award workflows.”
From Research Data Toolkit, Progress Report, Month 6, March 2011
Of those who have blogged about the workshop and their follow-up actions, Bill Worthington from the University of Hertfordshire and the Research Data Toolkit Project was inspired to think more about data re-use, workflows and metadata for data publication. Stephen Gray from data.bris set to work on the statement of underlying principles being proposed at the University of Bristol, embedding a statement about the value of research data. The team from the Orbital project at Lincoln also got busy drafting their policy. The blogs from the JISCMRD programme are aggregated in this Google bundle, which is well worth following to stay up to date with the projects’ experiences and lessons learnt. MISS, for example, previously shared a document in a post on their blog that contained a summary of their process alongside the draft policy. I hope all the above examples reinforce the idea of how useful it is to share experiences and learn from others.
Finally, to this end, the DCC would be very pleased to host guest blog posts about any aspects of institutional experiences of developing policies, implementing strategies and capturing roadmaps. The topics could address the processes underway at your institution, developing roles and responsibilities, discussion of guides and working policy documents, training needs, methods for gathering feedback, advocates and supporters, challenges and compromises. But please don’t be limited by these ideas and do get in touch – we look forward to hearing from you, so your messages can be passed on to others. Links to relevant blog posts about roadmaps can also be posted in the comments.