Scientific Metadata
Scientific data are generated by experiments or observations. In order to be interpreted, or even accessed, they must be accompanied by auxiliary information, ranging from the identity of the experimenter and the time and place at which the experiment was conducted to arcane calibration details. This auxiliary information constitutes the metadata for the dataset. It is similar to the metadata needed for non-scientific data, but with some distinctive features; notably, it is likely to be more extensive and less standardised.
In order to be properly interpreted by either humans or software, metadata items need to be precisely defined. Similar quantities are often subtly different. Numeric values are meaningless unless their units are known. Scientific data often have a small and specialised initial user-community. If the data are to be re-used outside this community, additional explanation or exegesis may be required.
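As a minimal sketch of what "precisely defined" might look like in practice, the Python fragment below bundles a numeric value with its unit and a written definition, and refuses to create an item that lacks either. The class name `MetadataItem`, its field names, and the example quantity are all hypothetical and are not drawn from any particular metadata standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataItem:
    """One metadata item: a numeric value with its unit and definition.

    Hypothetical structure for illustration only.
    """
    name: str
    value: float
    unit: str
    definition: str

    def __post_init__(self):
        # A number without a unit cannot be interpreted.
        if not self.unit.strip():
            raise ValueError(f"{self.name}: a numeric value needs a unit")
        # Similar quantities can differ subtly, so insist on a definition.
        if not self.definition.strip():
            raise ValueError(f"{self.name}: a definition is required")


# Example item; the quantity and its definition are invented for illustration.
exposure = MetadataItem(
    name="exposure_time",
    value=30.0,
    unit="s",
    definition="Shutter-open duration of a single frame, excluding readout",
)
```

Making the unit and definition mandatory at construction time is one possible design choice; a real system might instead validate against a community vocabulary.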
Even for their initial user-community, scientific metadata are often notoriously incomplete. Additional quantities and assumptions necessary to interpret the data may initially be recorded only on scraps of paper, be hard-coded into analysis software, or exist only in the experimenter's head. Considerable effort must be made to capture all this information if the data are to be retained for posterity or made available to a wider community of users.
Finally, standards are important to promote interoperability. Because the user-base is small and specialised, standards can be informal, specialised and subject to rapid change. Thus it is necessary to monitor and track them.
Key Points
- Scientific data are generated by experiments and observations as part of the scientific process. That is, they are ultimately experimental, rather than routine. This point matters less for large, long-lived, collaborative experiments, but it is nonetheless inherent to the scientific process.
- Scientific metadata are likely to be more extensive and less standardised than non-scientific metadata.
- Scientific datasets are often generated with incomplete metadata. Considerable effort may be required to ensure that all the metadata necessary to make the data re-usable are gathered and ingested.
- Scientific user-communities are often small and specialised. If data are to be used outside their original communities, or preserved for an extended period of time, additional exegesis may be required.
- Standards, both syntactic and semantic, are needed to facilitate interoperability and re-usability. Physical quantities need precise, and documented, definitions and numeric values must have known units.
- Standards may be specific to the specialised community that generated the data. Because the communities are small, standards (and other practices) can evolve rapidly and so must be tracked.
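
The point about incomplete metadata can be illustrated with a simple completeness check run before a dataset is ingested. The sketch below is only an assumption about how such a check might look: the function `missing_metadata` and the list of required fields are hypothetical, with field names loosely based on the examples given earlier (experimenter, time, place, calibration details).

```python
from __future__ import annotations

# Hypothetical list of fields a repository might require before ingest.
REQUIRED_FIELDS = ("experimenter", "date", "location", "calibration")


def missing_metadata(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in the record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Invented example record: location is blank and calibration is absent.
record = {"experimenter": "A. Researcher", "date": "2009-04-01", "location": ""}
print(missing_metadata(record))  # prints ['location', 'calibration']
```

A check of this kind only detects gaps; recovering the missing information still depends on the effort of the people who generated the data.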
