How to run a data service

26 November, 2010

Members of the UK Data Archive team ran a very useful event over two days earlier this month to share their experiences of how to run a data service. They explained the processes they've developed and shared practical tips based on lessons learned.  My event fee was generously funded through the DPC leadership programme - thanks!

Matthew Woollard explained that the UKDA is a family of services including ESDS, the History Data Service, the Survey Question Bank and the Secure Data Service.  They have an annual turnover of around £3 million and employ sixty-two staff, demonstrating how much can be achieved on comparatively little.  Running services together clearly brings them great economies of scale and seems a useful model for HEIs to reflect on as they start to address demands for data management infrastructure - collaboration will be key.

Matthew suggested there are six main functions data centres need to cover:

  1. Pre-ingest / acquisition
  2. Ingest
  3. Data management / archival storage
  4. Preservation
  5. Access
  6. Administration

Staff working in each of these areas gave presentations and ran surgeries.  Full event notes can be downloaded; some key lessons are shared below:

  • Building trust is key.  Sue Cadogan explained that some researchers find it emotional to hand over their data, so staff invite them into the archive to see that their data will be properly looked after.
  • In the past, data tended to be refused for poor documentation but now it’s more likely IPR or ethics will prevent deposit.  If licence and consent agreements don’t cover preservation and re-use, the archive can’t accept the data.  They’ve done lots of outreach and training on this issue and have reduced refusals of qualitative data on grounds of insufficient consent from around 40% to 25%.
  • Depositors often select the closed options on licence forms - even when data can be shared.  So now, there are only two options on the licence form as standard: for educational and commercial access; or for educational use only.
  • UKDA produces various resources to support teaching and learning.  There are thirty-five tailor-made sample survey datasets (subsets of large surveys that are easier for students to handle) and fifteen teaching datasets in the ESDS Nesstar Catalogue.  They also have SPSS workbooks that focus on the skills students need in order to work with a real-world survey dataset.
  • The Secure Data Service feels people are the weak link in security and that researcher ignorance is a far greater risk than technical threats.  They provide compulsory training to try to combat this and have been careful to replicate researchers’ preferred working environment, as breaches occur most when systems are inconvenient.

The Incremental project ran a training course shortly after this event, pulling UKDA’s processes out in a case study to explain the principles researchers could adopt to try to keep their data accessible in the long term.  See slides 14-16 from the ‘How to Manage Your Data’ presentation.