Engaging with UK data centres for the pilot research data registry

24 April, 2014
   

For the pilot UK research data registry project, I have been liaising with existing UK data centres about harvesting metadata from their catalogues into the central registry.

The UK has a long tradition of discipline-specific data centres, publicly funded by the research councils, to preserve and disseminate research data. Existing data centres in the UK include the UK Data Archive, the Archaeology Data Service, the UK Solar System Data Centre, the Cambridge Crystallographic Data Centre, the Visual Arts Data Service, the ISIS Data Catalogue (ICAT) and the network of NERC Data Centres: British Atmospheric Data Centre, British Oceanographic Data Centre, Environmental Information Data Centre, NERC Earth Observation Data Centre, National Geoscience Data Centre, Polar Data Centre and the Environmental Bioinformatics Centre. Each data centre uses a metadata standard and profile suited to its purpose and discipline, and has its own discovery catalogue.

In this pilot phase for the UK research data registry we are testing metadata mapping to RIF-CS and metadata harvest from the NERC Data Catalogue Service, the UK Data Archive and the Archaeology Data Service. These were selected as case studies for the natural and social sciences that represent a diverse range of data collections.

NERC recently developed its NERC Data Catalogue Service (DCS) as a central data discovery portal that brings together all metadata records from the data holdings of eight different data centres. This catalogue uses a NERC discovery metadata standard that is UK GEMINI 2.2 and INSPIRE compliant and based on an ISO 19115 profile. Alongside the DCS, NERC developed a GeoNetwork Catalogue Services for the Web (CSW) node to support the DCS portal, from which metadata can be harvested for the registry.
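To illustrate what harvesting from a CSW node can look like, here is a minimal Python sketch, not the registry's actual harvester. It builds a CSW 2.0.2 GetRecords request for ISO 19115 records and reads the paging attributes from a response. The endpoint URL and the sample response are made up for illustration, and the function names are our own.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"
GMD_NS = "http://www.isotc211.org/2005/gmd"  # ISO 19115/19139 metadata schema


def get_records_url(endpoint, start_position=1, max_records=10):
    """Build a key-value GetRecords request for full ISO metadata records."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "gmd:MD_Metadata",
        "outputSchema": GMD_NS,
        "elementSetName": "full",
        "resultType": "results",
        "startPosition": start_position,
        "maxRecords": max_records,
    }
    return endpoint + "?" + urlencode(params)


def read_paging(xml_text):
    """Return (records matched, position of next record) from a response.

    In CSW 2.0.2 a nextRecord value of 0 means there are no more records.
    """
    root = ET.fromstring(xml_text)
    results = root.find(f"{{{CSW_NS}}}SearchResults")
    matched = int(results.get("numberOfRecordsMatched"))
    next_record = int(results.get("nextRecord"))
    return matched, next_record


# Hypothetical response, trimmed to the paging information only.
SAMPLE_CSW_RESPONSE = (
    '<csw:GetRecordsResponse xmlns:csw="http://www.opengis.net/cat/csw/2.0.2">'
    '<csw:SearchStatus timestamp="2014-04-24T00:00:00"/>'
    '<csw:SearchResults numberOfRecordsMatched="25" '
    'numberOfRecordsReturned="10" nextRecord="11" elementSet="full"/>'
    "</csw:GetRecordsResponse>"
)

# Hypothetical endpoint; the real node address is not given in this post.
url = get_records_url("https://example.org/csw", start_position=1)
matched, next_record = read_paging(SAMPLE_CSW_RESPONSE)
```

A harvester would keep requesting pages, passing `next_record` as the next `start_position`, until `next_record` is 0.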

The UK Data Archive holds data collections from all disciplines of the social sciences and humanities. Its metadata profile is based on the Data Documentation Initiative (DDI), a metadata standard commonly used in the social sciences. Discussions about the pilot registry harvesting the UK Data Archive's metadata via OAI-PMH prompted work to upgrade its OAI stream, in line with recent developments for the new and improved Discover portal, which uses more controlled vocabularies, is DDI 2.5 compliant, and contains DOIs for each collection. The existing live OAI stream was still DDI 2.1 compliant and did not yet contain DOIs. The new OAI stream is live for testing by the registry, and will replace the old stream as soon as other metadata harvesting services have also been upgraded.

The Archaeology Data Service metadata profile follows a bespoke ADS schema called ads_archive and exposes metadata via OAI-PMH. This service also publishes NERC-funded data collections into the NERC Data Catalogue Service.

Besides those three pilot cases, initial contact has also been made with the Visual Arts Data Service, the Cambridge Crystallographic Data Centre and the ISIS Data Catalogue to explore their metadata profiles and the technicalities of metadata harvest, to inform future registry work.

Continue to keep up with events on this blog by using the tag 'research data registry', or visit http://www.dcc.ac.uk/projects/research-data-registry-pilot. We welcome your comments and feedback.