List of Metadata Standards

  • The DataCite Metadata Schema defines a set of mandatory metadata that must be registered with the DataCite Metadata Store when minting a DOI persistent identifier for a dataset. The domain-agnostic properties were chosen for their ability to aid in accurate and consistent identification of data for citation and retrieval purposes.

    The standard is sponsored by the DataCite consortium; version 3.0 was released in 2013.

  • By using DCAT (the Data Catalog Vocabulary) to describe datasets in data catalogs, publishers increase discoverability and enable applications to easily consume metadata from multiple catalogs. It further enables decentralized publishing of catalogs and facilitates federated dataset search across sites. Aggregated DCAT metadata can serve as a manifest file to facilitate digital preservation.

  • The Data Documentation Initiative (DDI) is a widely used international standard for describing data from the social, behavioral, and economic sciences. Two versions of the standard are currently maintained in parallel:

    • DDI Codebook (or DDI version 2) is the simpler of the two and is intended for documenting simple survey data for exchange or archiving. Version 2.5 was released in January 2014.
    • DDI Lifecycle (or DDI version 3) is richer and may be used to document datasets at each stage of their lifecycle from conceptualisation through to publication and reuse. It is modular and extensible. Version 3.2 was published in March 2014.

    Both versions are XML-based and defined using XML Schemas. They were developed and are maintained by the DDI Alliance.

  • The Directory Interchange Format (DIF) is an early metadata initiative from the Earth sciences community, intended for the description of scientific data sets. It includes elements focusing on instruments that capture data, temporal and spatial characteristics of the data, and projects with which the dataset is associated. It is defined as a W3C XML Schema.

    DIF is sponsored by the Global Change Master Directory; Version 6 of the DIF Writer's Guide dates from November 2010.

  • Dublin Core is a basic, domain-agnostic standard that can be easily understood and implemented, and as such is one of the best known and most widely used metadata standards.

    Sponsored by the Dublin Core Metadata Initiative, Dublin Core was published as ISO Standard 15836 in February 2009.

  • Ecological Metadata Language (EML) is a metadata specification developed specifically for the ecology discipline. It is based on prior work done by the Ecological Society of America and associated efforts (Michener et al., 1997, Ecological Applications).

    Sponsored by ecoinformatics.org, EML Version 2.1.1 was released in 2011.

  • Encoded Archival Description (EAD) is a standard for encoding archival finding aids using XML in archival and manuscript repositories. It implements the recommendations of the International Council on Archives' ISAD(G): General International Standard Archival Description.

    The scheme is maintained by the Technical Subcommittee for Encoded Archival Standards of the Society of American Archivists, in partnership with the Library of Congress.

  • The Content Standard for Digital Geospatial Metadata (CSDGM) is a widely used but no longer current standard defining the information content for a set of digital geospatial data required by the US Federal Government.

    CSDGM was sponsored by the US Federal Geographic Data Committee. However, in September 2010 the FGDC endorsed ISO 19115 and began encouraging federal agencies to transition to ISO metadata.

  • FITS is an image data file format for encoding astronomical data. The WCS (World Coordinate System) conventions map elements in data arrays to standard physical coordinates in the sky. FITS has provisions for image metadata encoded in an ASCII header at the beginning of files.

  • Genome metadata on PATRIC consists of 61 different metadata fields, called attributes, which are organized into the following seven broad categories: Organism Info, Isolate Info, Host Info, Sequence Info, Phenotype Info, Project Info, and Others.
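
To make the FITS entry above more concrete, here is a minimal Python sketch of how the ASCII header can be read. It relies on the standard FITS layout, in which the header is a sequence of 80-character "cards" with the keyword in columns 1-8, a value indicator "= " in columns 9-10, and an optional comment after a "/". The function names (parse_fits_header, parse_value, make_card) and the sample header are hypothetical and used for illustration only. The sketch ignores continuation cards, quoted strings containing embedded quotes, and the 2880-byte block padding found in real files, so an established library such as astropy would be the better choice for real data.

```python
def parse_value(field: str):
    """Convert the value field of a header card to a Python value (simplified)."""
    field = field.strip()
    if field.startswith("'"):
        # Quoted string: take the text up to the closing quote.
        # Assumption: no embedded (doubled) quotes inside the string.
        end = field.index("'", 1)
        return field[1:end].rstrip()
    # Anything after "/" is a comment.
    token = field.split("/", 1)[0].strip()
    if token in ("T", "F"):
        return token == "T"  # FITS logical values
    try:
        return int(token)
    except ValueError:
        pass
    try:
        return float(token)
    except ValueError:
        return token  # leave anything unrecognised as text


def parse_fits_header(raw: bytes) -> dict:
    """Read 80-character cards until the END card; return keyword -> value."""
    values = {}
    for offset in range(0, len(raw), 80):
        card = raw[offset:offset + 80].decode("ascii")
        keyword = card[:8].strip()
        if keyword == "END":
            break
        # Only cards with "= " in columns 9-10 carry a value.
        if card[8:10] != "= ":
            continue
        values[keyword] = parse_value(card[10:])
    return values


def make_card(text: str) -> bytes:
    """Pad a line to the fixed 80-character card width (illustration helper)."""
    return text.ljust(80).encode("ascii")


# Hypothetical sample header, including two WCS keywords.
sample = b"".join(make_card(line) for line in [
    "SIMPLE  =                    T / conforms to FITS standard",
    "BITPIX  =                   16 / bits per pixel",
    "NAXIS   =                    2",
    "CTYPE1  = 'RA---TAN'           / WCS axis type",
    "CRVAL1  =            150.11916 / reference value",
    "END",
])
header = parse_fits_header(sample)
print(header)
```

In this sample, CTYPE1 and CRVAL1 are examples of the WCS keywords that tie array axes to sky coordinates; the other cards describe the data array itself.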
