
List of Metadata Standards

  • The Protein Data Bank archive (PDB) is a worldwide archival repository of information about the 3D structures of proteins, nucleic acids, and complex assemblies, managed by the Worldwide PDB (wwPDB). The PDB Exchange Dictionary (PDBx) is used by the wwPDB to define data content for deposition, annotation and archiving of PDB entries. PDBx incorporates the community standard metadata representation, the Macromolecular Crystallographic Information Framework (mmCIF), originally developed under the auspices of the International Union of Crystallography (IUCr). PDBx has been extended by the wwPDB to include descriptions of other experimental methods that produce 3D macromolecular structure models, such as Nuclear Magnetic Resonance Spectroscopy, 3D Electron Microscopy and Tomography.

  • The PREMIS (Preservation Metadata: Implementation Strategies) Data Dictionary defines a set of metadata that most repositories of digital objects would need to record and use in order to preserve those objects over the long term. It has its roots in the Open Archival Information System Reference Model but has been strongly influenced by the practical experience of such repositories. While the Data Dictionary can be used with other standards to influence the creation of local application profiles, an XML Schema is provided to allow the metadata to be serialized independently.

    PREMIS was initially developed by the Preservation Metadata: Implementation Strategies Working Group, convened by OCLC and RLG, and is currently maintained by the PREMIS Maintenance Activity, led by the Library of Congress.

  • The standard will be used to describe trials that conform to: 1) any applicable human subject or ethics review regulations (or equivalent) and 2) any applicable regulations of the national or regional health authority (or equivalent). Most of the records in ClinicalTrials.gov describe clinical trials (also called interventional studies). A clinical trial is a research study in which human volunteers are assigned to interventions (for example, a medical product, behavior, or procedure) based on a protocol and are then evaluated for effects on biomedical or health outcomes. ClinicalTrials.gov also includes records describing observational studies and programs providing access to investigational drugs outside of clinical trials (expanded access).

  • Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness. The PROV Family of Documents defines a model, corresponding serializations and other supporting definitions to enable the interoperable interchange of provenance information in heterogeneous environments such as the Web.

  • The QuDEx standard/schema is a software-neutral format for qualitative data that preserves annotations of, and relationships between, data and other related objects. It can be viewed as the optimal baseline data exchange model for the archiving and interchange of data and metadata.

  • The standard provides a means to publish multi-dimensional data, such as statistics, on the web in such a way that it can be linked to related data sets and concepts using the W3C RDF (Resource Description Framework) standard. The model underpinning the Data Cube vocabulary is compatible with the cube model that underlies SDMX (Statistical Data and Metadata eXchange), an ISO standard for exchanging and sharing statistical data and metadata among organizations.

  • Some repositories have decided that current standards do not fit their metadata needs, and so have created their own requirements.

  • The Standard for Documentation of Astronomical Catalogues is a set of conventions for archiving astronomical data. As well as path, filename and data format conventions, it also specifies how to construct a plain text description file for documenting the data files. It was developed as an alternative to FITS that would be more suited to archives, permit human inspection, and allow manipulation via standard Unix command-line tools.

    SDAC was developed by CDS (Centre de Données astronomiques de Strasbourg). Version 2.0 is the most recent; it was released in February 2000.

  • A set of common technical and statistical standards and guidelines to be used for the efficient exchange and sharing of statistical data and metadata.

    Sponsoring institutions include BIS, ECB, EUROSTAT, IMF, OECD, UN, and the World Bank. Technical Specification 2.1 was amended in May 2012.

  • An information model for describing the elements of the heliophysics data environment, and a set of resource types which can be used to describe data along with their scientific context, source, provenance, content and location. It is designed to support a federated data system where data may reside at different locations and may be separated from the metadata that describe them. The preferred expression form is XML.

    The Space Physics Archive Search and Extract (SPASE) effort is implemented by the SPASE Consortium, which is composed of representatives of the international heliophysics data community. The current release of the data model (2.2.2) was updated in October 2012.
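
To make the provenance entry above more concrete, here is a minimal Python sketch of how statements about entities, activities, and agents might be recorded and queried. The relation names (wasGeneratedBy, used, wasAssociatedWith, wasAttributedTo) follow the PROV data model, but the ProvenanceRecord class, its methods, and the ex: identifiers are hypothetical and invented for illustration; this is not one of the official PROV serializations.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceRecord:
    """Hypothetical in-memory store of provenance statements (illustration only)."""

    # Each statement is a (subject, relation, object) triple of identifiers.
    statements: list = field(default_factory=list)

    def add(self, subject, relation, obj):
        """Record one provenance statement."""
        self.statements.append((subject, relation, obj))

    def objects(self, subject, relation):
        """Return every object linked to `subject` by `relation`."""
        return [o for s, r, o in self.statements if s == subject and r == relation]

    def inputs_of(self, entity):
        """Return the entities used by the activities that generated `entity`."""
        inputs = []
        for activity in self.objects(entity, "wasGeneratedBy"):
            inputs.extend(self.objects(activity, "used"))
        return inputs


# Assumed example: an analyst plots a chart from a dataset.
record = ProvenanceRecord()
record.add("ex:chart", "wasGeneratedBy", "ex:plotting")      # entity -> activity
record.add("ex:plotting", "used", "ex:dataset")              # activity -> entity
record.add("ex:plotting", "wasAssociatedWith", "ex:analyst") # activity -> agent
record.add("ex:chart", "wasAttributedTo", "ex:analyst")      # entity -> agent

print(record.inputs_of("ex:chart"))  # ['ex:dataset']
```

Following the chain from the chart to its generating activity and on to the dataset it used is the kind of question that provenance information is meant to answer when assessing quality, reliability or trustworthiness.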
