UK Repositories claiming to hold data
31 March, 2008
The OpenDOAR and ROAR services both present self-reported claims by repositories across the world about their contents, backed up by some harvested facts. I’m interested in those UK repositories that claim to hold data.
My first problem is that neither service allows me simply to select “data” as a content type. OpenDOAR allows me to search on “Datasets” (63 world-wide, 8 in the UK), while ROAR allows me to search for “Database/A&I Index” (24 world-wide, 6 in the UK). I thought the latter was a surprisingly “library science” classification, given the origins of ROAR. Not surprisingly, most repositories appear in only one of the two lists. Also not surprisingly, given the origins of these services in the Open Access and OAI-PMH movements, there are many first-class data repositories NOT listed here (UKDA and BADC, for example).
The UK repositories listed are:
OpenDOAR “Datasets”
- Bristol Repository of Scholarly Eprints (ROSE)
- DSpace @ Cambridge
- eCrystals - Southampton
- Edinburgh DataShare
- Edinburgh Research Archive (ERA)
- Leicester Research Archive (LRA)
- NDAD (National Digital Archive of Datasets)
- Nature Precedings
The three that do hold serious amounts of data are DSpace @ Cambridge, eCrystals and NDAD. DSpace @ Cambridge is dominated by a collection of more than 100,000 chemical structures encoded in CML, but there are plenty of other datasets there, including some from Archaeology. Sadly, there are also plenty of empty collections, and many collections where the last deposit was in 2006 (I guess around when the funded project died). eCrystals consists entirely of crystal structures, and has some very nice features: find a compound, and as you look, perhaps rather bemused, at the data page, a Java object loads and there you have a rotatable image of the molecular structure before your eyes! NDAD has many ex-Government datasets, some of them very large.
ROAR “Database/A&I Index”
- Higher Education Empirical Research Database (1642 records)
- NDAD - UK National Digital Archive of Datasets (66 records)
- ReOrient Knowledge Base
- Research Findings Register (1496 records)
- Southampton Crystal Structure Report Archive (165 records)
- The Linnean Collections (14275 records)
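The overlap between the two lists can be checked with a few lines of Python. This is only a rough sketch: the short keys are hypothetical labels I have invented for matching, and I have assumed that NDAD is the only repository appearing under both services. eCrystals and the Southampton Crystal Structure Report Archive may well be related, but they are treated as distinct here.

```python
# UK repositories as listed above; the short keys are hypothetical labels
# chosen by hand so that the same repository gets the same key in both lists.
opendoar_datasets = {
    "rose": "Bristol Repository of Scholarly Eprints (ROSE)",
    "dspace-cambridge": "DSpace @ Cambridge",
    "ecrystals": "eCrystals - Southampton",
    "edinburgh-datashare": "Edinburgh DataShare",
    "era": "Edinburgh Research Archive (ERA)",
    "lra": "Leicester Research Archive (LRA)",
    "ndad": "NDAD (National Digital Archive of Datasets)",
    "nature-precedings": "Nature Precedings",
}

roar_database_index = {
    "heerd": "Higher Education Empirical Research Database",
    "ndad": "NDAD - UK National Digital Archive of Datasets",
    "reorient": "ReOrient Knowledge Base",
    "rfr": "Research Findings Register",
    # Assumption: treated as distinct from eCrystals.
    "southampton-csra": "Southampton Crystal Structure Report Archive",
    "linnean": "The Linnean Collections",
}

# Repositories listed by both services.
in_both = set(opendoar_datasets) & set(roar_database_index)

# Repositories listed by exactly one service (symmetric difference).
in_only_one = set(opendoar_datasets) ^ set(roar_database_index)

print("In both lists:", sorted(in_both))
print("In only one list:", len(in_only_one), "of",
      len(set(opendoar_datasets) | set(roar_database_index)))
```

Under those assumptions, only NDAD turns up in both lists.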
It’s a rather sad study! I do hope that the Open Repositories 2008 conference in Southampton over the next couple of days leads to an improvement. I can’t get there, unfortunately, but I hope someone will report from it here. I particularly liked the idea of the developers’ challenge. Can we have some challenges oriented to data, please?
