Because good research needs good data

From RDM strategy to action – a glass half full!

An overview of results from the DCC 2014 survey of institutions on their progress in developing support for Research Data Management.

A Whyte | 08 May 2014

Recently the DCC carried out an online survey of senior managers involved in decision-making on Research Data Management in UK Higher Education Institutions. If you took part you may have already seen some of the results. As a small payback for the time that 87 people contributed to our snapshot portrait of UK universities’ development of RDM services, we gave them early access to the results (something we also plan to do next time round). This article is a preview of some of the key points. The rest, including the data, will appear as soon as possible after our workshop today, Thursday 8th May.

Before commenting on the results, a quick note on the survey’s targeting and our response. We targeted Pro-Vice-Chancellors for Research and heads of service in Libraries, IT/Computing, and Research Support & Commercialisation. Some of this was through direct e-mails and phone call reminders. Partly to make the most of our limited resources, and partly to help maximise the response rate, we targeted our effort at the institutions we thought likely to be ‘RDM intensive’: managers in the 69 UK institutions whose research income is at least 10% of total income. We also had help posting invitations via selected email lists; a note of thanks for that support goes to colleagues in the professional associations SCONUL, UCISA and ARMA.

If this all sounds a little elitist, I should point out that we aim to support any UK institution that asks for it, and have often done so with universities that are not at all ‘research intensive’. In the event, our 87 respondents came from 61 institutions: 37% of the 163 listed by the Higher Education Statistics Agency. They included 100% of the Russell Group, and 55% of the 45 universities that cross our research income threshold but are not in the Russell Group (i.e. 71% of our target group in all). That is enough to draw from the responses some comparisons between these two subsets of our target group. It is not a high enough response rate to generalise from the 13 responses we had from ‘other’ universities, a category that actually includes the majority of UK institutions. We’ll take that into account next time around.

By adding in some data from HESA statistics before doing the analysis, and using our knowledge of which institutions had received support from us or from the Jisc Managing Research Data (MRD) programme, we were able to tell that our respondents were more likely to come from institutions that had had one or other form of support, and from wealthier institutions in research income terms. It’s worth bearing in mind here that average research income for a Russell Group member in 2011-12 was 5 times that of one in the ‘other 10%plus’ category, and about 48 times that of the majority whose research income is less than 10% of total income.

So what of the results? As someone once said, ‘Progress has little to do with speed, but much to do with direction’. On that principle, much progress has been made. Institutions have been developing RDM policies to communicate the direction they wish to take. Overall, nearly three-quarters of our respondents said their institution has an RDM policy at least at draft stage. In our target group more than 60% said this is either being considered for final approval or already approved. That figure rose to 70% among institutions that had received support from the DCC or funding through Jisc MRD. We like to think there is a causal link there rather than just a correlation, and were grateful to see comments giving unsolicited thanks for our support.

It was not surprising to find the most frequently identified factor driving the development of RDM support to be “UK Research Council data policies”. What was a little more surprising was how the two subsets of our target group compared on the other factors we listed. For example, two-thirds of Russell Group respondents said a “strategy to expand support for research” was a factor, while only 40% of the ‘other 10%plus’ institutions did. And there was a similar (approximately 60:40) split in the numbers citing as influences the EU Horizon 2020 policy on data management and the UK Government policy on open data.

Even if speed is not of the essence in RDM development, there is a clear timeline set out by the EPSRC (Engineering and Physical Sciences Research Council) for the institutions it funds to meet its policy expectations. One year from now they are expected to have nine capabilities established, from policy awareness among their research community through to the preservation of publicly funded data outputs.

Almost all of our respondents’ institutions receive at least some EPSRC funding. But when asked how far their institution had progressed towards implementing RDM policy, even among Russell Group respondents there was a clear majority reporting that this has not yet gone beyond investigating the requirements. And when asked “when do you expect your institution to be able to ensure that selected research data collections remain accessible for the long term (e.g. 10 years+)?”, 31% in the Russell Group said “after 12 months”. This rose to 43% in the ‘other 10%plus’ category.

Institutions have been expected to fund the development of new support capabilities, which may require recruitment or retraining and sizeable investment in storage infrastructure (we asked about these factors, and will make the responses available soon). So if the capability to ensure long-term access to datasets is ‘the bottom line’, the more capital-intensive elements of that are proving harder to get in place than some have envisaged. That was also a conclusion of Loughborough University’s survey last year.

The picture is more positive if we look at capabilities that don’t necessarily require much up-front investment by the institution: those that may draw on existing roles and involve finding out about current data assets, assessing current capabilities, developing skills, and raising awareness of the external environment. Of the nine capabilities we listed, the three where most institutions seem to be making progress are shown below: policy development, as already mentioned, and two areas of implementation, ‘RDM skills training & consultancy’ and ‘Data Management & Sharing Plans’, i.e. advice to researchers or faculty groups on preparing DMPs or the Data Access Statements required by the RCUK Open Access policy.

These support and consultancy activities address some of the key elements of (for example) EPSRC’s expectations, and RCUK’s Data Principles. All are arguably more essential than expensive storage if an institution is to be confident that its researchers will be able to produce, on demand, the evidence that warrants their published outputs. 

One question that arises here concerns the policy message that ‘data’ can include anything comprising the evidence base for research. If this is getting through, why is it that “low priority for researchers” was among the top 3 obstacles our respondents identified “…for the institution as a whole to comply with research funders’ expectations”? It may be that researchers remain unconvinced that generic support from their institution can help with their research. We do hear that argument occasionally, but it’s an odd one considering how much researchers rely on, say, Google or Dropbox. The expectation that researchers should ‘show their working’ is as old as Galileo, as pointed out in the excellent PLoS editorial ‘Ten Simple Rules for the Care and Feeding of Scientific Data’.

It’s true, though, that the links between generic and discipline-specific services are still being built. This may be what is in our respondents’ minds when they indicate “lack of appropriate staff resources and infrastructure” as one of the top obstacles to compliance. The third was ‘availability of funding’. That in itself was no surprise, but it will be useful to explore further which RDM support activities institutions find most difficult to make the case for, either internally or through research bids.

On a related note, our survey also received many comments about collaboration. We asked people to identify any areas where they expected part of the solution to lie in collaborative regional or national approaches. More than 70% of respondents commented on this, many pointing to initiatives being explored by regional consortia such as the N8 Group. The focus of these efforts, it seems, is not only on sharing data centres: some mentioned training, and others repositories or data catalogues. One remark identified “…a collective fear of collaboration preventing us from meeting our mandates”. I’m sure our workshop today will highlight plenty of examples to allay that fear. We look forward to sharing more from the survey after that, along with the data.