The sweet smell of sustainability - JISC MRD projects make the business case

As projects funded in the Jisc Managing Research Data programme to set up and pilot institutional services reach their end, many have been working up business cases for sustained funding from their institution.

A Whyte | 11 April 2013

The MRD Workshop at Aston University on 25-26 March heard about the business cases advanced by four projects: at the Universities of Bristol, Nottingham and Oxford and, with a slightly different take on things, from the Archaeology Data Service.

The common threads were, firstly, some reflection on the range of services that have evolved over the duration of the projects (mostly 17 months). Secondly, the presenters outlined which providers were envisaged as the owners of which services. Thirdly, there was discussion of the level of resourcing that projects believe they need to sustain a desirable level of service. Mostly, expectations were of the order of 2 to 4 additional FTEs, on top of changes to existing staff roles to help deal with data management issues. Lastly, there was discussion of the factors on the horizon affecting senior management decisions.

The data.bris project at Bristol has had notable success in gaining internal support for continuation. Stephen Gray traced a substantial shift over the duration of the project towards greater senior management support. At the beginning of the project various elements were in place: substantial investment in petabyte storage, a small institutional repository (ROSE) and a research information system (PURE). The project has progressively joined these up, provided well-received training and online guidance, and delivered support for data management planning in the form of tailored guidance on meeting the various research funder requirements. All this is now covered by a set of principles that the institution is committed to applying to itself and expects its researchers to follow.

A key element of the data.bris business case, according to Stephen, was that it pitched Bristol’s ‘research data service’ (interestingly dropping the ‘management’) as a facility for researchers to work collaboratively across institutions. That emphasis was partly down to the two-stage business case data.bris followed to comply with standard PRINCE procedures. Their initial focus was on managing risk and compliance, but the opportunities for collaborative research carried more weight than expected, translating directly into 50% more resource than they had at first hoped to get. The new service will be Library led, although the pilot was led by IT Services.

Three potential levels of service, i.e. ‘do little’, ‘preferred’ and ‘gold-plated’ offerings, were postulated in Bristol’s business case, with ‘do nothing’ as a fourth option. This tiered approach is being taken forward to help phase further service development.

A tiered approach has also been taken at Nottingham in the ADMIRe project. The project has successfully nurtured an RDM policy through to approval and will be making online guidance available, including online training via an adaptation of MANTRA. ADMIRe are also planning awareness training, with DCC support, and a CPD course aimed at researchers. Laurian Williamson outlined the three service models that were proposed: ‘minimal’, ‘mediated’ and ‘consultancy’. Each model defines a level of service for six core activities:

  • Data Management Plans
  • Active data management & storage
  • Data archive & preservation
  • Data sharing & publishing
  • Copyright & IPR
  • Compliance & reporting

Laurian outlined the degrees of institutional benefit and risk associated with each model. These also identify a minimum additional resource required: 2 FTE to continue the ‘mediated’ level that ADMIRe has been piloting, and a further 3.5 FTE for the ‘consultancy’ level. The latter assumes faculty-level embedding of services, focused on Research and Knowledge Transfer Priority Areas, and limited to research that’s funded by RCUK, the EU, or the NHS.

To date only a ‘minimal’ service has been agreed at Nottingham. This would deploy the project’s online outputs in the expectation that these will enable the university to meet some of the compliance requirements. IT Support would continue to offer support in some RDM service areas, though without dedicated RDM staff. Given that the project’s investigations indicated low awareness of RDM among researchers, the institution will no doubt be giving further attention in due course to the demand for help in meeting compliance needs and in realising opportunities from data.

James AJ Wilson from Oxford University was next, outlining measures to take forward work from DaMaRo. As the project title, Data Management Roll-out at Oxford, suggests, it started with an enviable four years’ worth of experience nurturing policy and infrastructure, including prototypes of the Dataflow project's twin platforms Datastage and Databank. Among the main successes, James singled out the University’s endorsement last year of a policy committing it to resourcing access to services and facilities for storage, backup, deposit and retention, plus training, support and advice.

In practice the resourcing is contingent on Oxford’s working group developing a business case and the subsequent committee work, the results of which will be known some time later this year. That business case has set out:

  • Division of responsibilities for supporting RDM throughout the research lifecycle
  • Explanation of RDM components designed to assist at each stage
  • An intended business model for each component
  • Initial estimates of cost of provision

A four-way division of responsibilities is currently being worked on, tentatively comprising:

  • Guidance, training, & planning services (Research Services); employing the RDM web pages, DMPonline, and the bespoke cost management platform X5
  • Data gathering, manipulation, & storage services for active data (IT Services); these will integrate ORDS (Oxford Research Database Service) with the university's supercomputing, storage-as-a-service, backup and VM hosting services
  • Deposit & discovery services (Library Services); joining up Datastage, Datafinder and Databank
  • Preservation, curation, & publication services (Library Services); managing output records across Databank, Symplectic and the Oxford Research Archive

Oxford are also considering the role of divisions and departments; should they have any long-term data ownership responsibilities? Should they be required to produce a ‘policy implementation plan’? These questions have implications for the business models applied to each of the services being planned. Oxford are adopting a modular approach to their infrastructure, with different business models for the different components. Options being considered include:

  • Full service costing, treating the service as a ‘Small Research Facility’ e.g. ORDS, where staff, hosting, and capital renewal costs are all recouped (in theory) from an up-front charge to use the service
  • Lightweight open source software model, where a platform such as DataStage is made available to projects that wish to use it, with no charge and minimal support
  • Embedding in existing funded channels – e.g. RDM Training may be delivered by modifying and adding content through established arrangements
  • Central University funding – e.g. DataFinder will be free at the point of use (so as to not act as a disincentive)
  • ‘Mixed model’ – e.g. DataBank may be offered at no charge up to a certain data volume
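
The ‘mixed model’ option can be pictured as a simple threshold charge: nothing is paid up to a free allowance, and only the volume above it is charged. The sketch below is purely illustrative. The function name, the free allowance and the per-terabyte rate are all hypothetical and are not figures from DaMaRo.

```python
def mixed_model_charge(volume_tb: float,
                       free_allowance_tb: float,
                       rate_per_tb: float) -> float:
    """Return the charge for a deposit under a 'mixed model'.

    All parameters are hypothetical placeholders: deposits up to
    free_allowance_tb cost nothing, and only the excess volume is
    charged at rate_per_tb.
    """
    if volume_tb < 0 or free_allowance_tb < 0 or rate_per_tb < 0:
        raise ValueError("volume, allowance and rate must be non-negative")
    # Only the volume above the free allowance is chargeable.
    chargeable_tb = max(0.0, volume_tb - free_allowance_tb)
    return chargeable_tb * rate_per_tb
```

With an assumed allowance of 1 TB, a 0.5 TB deposit would incur no charge, while a 3 TB deposit would be charged for 2 TB at whatever rate the institution sets.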

Currently demand is being estimated based on uncertain assumptions. These include that the 63% of researchers who receive grant funding will want to deposit potentially up to 3.4TB in the first year. This tentative figure is based on responses from the university’s excellent RDM survey, carried out at the end of last year. There looks to be substantial demand, even if these figures turn out to be unrepresentative of the real take-up in the first year. DaMaRo are optimistically predicting the need for controls on deposit, based on charging and/or a principle that only data associated with articles in the Oxford Research Archive should be included. A range of business models are also being considered for offering ORDS beyond Oxford, and DaMaRo are encouraging institutions interested in this to contact

The Archaeology Data Service archives about 5% of archaeological data (but c.65% of archaeological event reports) produced each year in the UK and has long experience of costing storage and curation. Catherine Hardman told us how the SWORD-ARM project has been encapsulating some of that knowledge in the ADS Easy data deposition tool, which includes a costing calculator.

Institutions may find ADS Easy a useful model for their own approaches to cost modelling, though archaeological researchers are its primary target, mainly because of the metadata requirements set within the tool. ADS Easy allows users to get an estimate for digital archiving costs at the outset of their projects. This should be really useful for bid writing (and for pre-award support), enabling accurate costs to be charged to projects.

The broader aim of making repository deposit easier also makes this a useful exemplar, both for encouraging take-up and handling ingest more efficiently. As well as aiming to boost deposit rates ADS are looking to reduce the marginal costs by up to 50% compared with manual methods. Catherine outlined the aim of giving researchers easy steps to deposit:

  1. Estimate costs

  2. Log in and create information about their project, including resource discovery metadata

  3. Upload their files and check the costs

  4. Create file level metadata and documentation

  5. Submit to the archive with the necessary permissions

The tool is not finished yet; substantial ‘time and motion’ studies are under way to ensure it is fit for purpose and context. Catherine identified challenges encountered so far, mainly in managing the expectations of those involved. For example, some users have expected an ‘add to cart’ function to do their thinking for them, and it has been difficult to strike a balance between “visionaries, nay-say-ers, and the users”.

A key issue is the potential for hidden costs to arise when the ‘human-face-of-digital-archiving’ is less present. For example, there may be impacts on data quality that are more expensive to correct post-ingest. Take-up is also uncertain, especially among commercial users, and ADS are hoping to manage this by staging roll-out across the country.

The vagaries of planning offered much for the discussion at the end of this session. Cross-institutional services were a cloudy but inviting prospect. How far can RDM support services realistically be shared among institutions? Bristol’s data.bris has given some consideration to sharing storage, including the possibility of trading it for access to skills in other areas.

Making the case for continued funding has been very challenging in some institutions. The battle for senior management buy-in has in some cases had to be fought again because key individuals have left. In other cases that battle has been overshadowed by the need to service the REF (Research Excellence Framework) or address issues around open access for publications.

Faced with other short-term concerns, it appears that some institutions are viewing non-compliance with data policies as a low risk, or at least one that may be put off for now. Their senior management perceive a lack of unequivocal signals from Research Councils that they will hold institutions to account. The dangers in prevarication were highlighted by some reflections on European Commissioner for the Digital Agenda Neelie Kroes’s recent comments on linking funding to open data sharing. So it may well be that, as well as missing opportunities to join the ‘data mining gold rush’, University Chancellors are taking very serious risks.