
File Formats

Stephen Abrams, California Digital Library

Published: October 2007

The goal of digital curation is to ensure the appropriate usability of managed digital assets over time. Format is a fundamental characteristic of a digital asset that governs its ability to be used effectively.

Without strong format typing, a digital asset is merely an undifferentiated string of bits. The information content encoded in an asset's bits can only be interpreted properly and rendered in human-sensible form if that asset's format is known.
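As a small illustration of this point, the following sketch reads the same four bytes under three different format assumptions. The byte values are arbitrary and chosen only for the example; nothing here comes from the chapter itself.

```python
import struct

# Four arbitrary bytes; without a declared format they carry no inherent meaning.
raw = bytes([0x41, 0x42, 0x43, 0x44])

# Interpretation 1: the bytes are ASCII-encoded text.
as_text = raw.decode("ascii")

# Interpretation 2: the bytes are one unsigned 32-bit integer, big-endian.
as_big_endian = struct.unpack(">I", raw)[0]

# Interpretation 3: the same integer type, but little-endian byte order.
as_little_endian = struct.unpack("<I", raw)[0]

print(as_text)           # ABCD
print(as_big_endian)     # 1094861636
print(as_little_endian)  # 1145258561
```

All three readings are equally valid decodings of the same bits. Only knowledge of the intended format tells us which one recovers the original information.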

While it is possible for bits to be preserved indefinitely without consideration of format, it is only through the careful management of format that the meaning of those bits remains accessible over time.

This instalment investigates aspects of format description, validation, and characterisation that may assist with long-term curation and usability of data.
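One simple starting point for characterisation is identifying a format from the leading bytes of a file (its signature, or "magic number"). The sketch below is a minimal illustration under stated assumptions: the function name `identify_format` and the small signature table are hypothetical and cover only a handful of well-known formats. It does not validate the file; a matching signature only suggests a format.

```python
from typing import Optional

# A deliberately small table of well-known leading-byte signatures.
# Real identification tools rely on far larger, curated signature registries.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
]


def identify_format(data: bytes) -> Optional[str]:
    """Return a format name if the data begins with a known signature, else None."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    return None


def identify_file(path: str) -> Optional[str]:
    """Read only the first few bytes of a file and try to identify its format."""
    with open(path, "rb") as handle:
        header = handle.read(16)  # longer than any signature in the table
    return identify_format(header)


print(identify_format(b"%PDF-1.4\n..."))   # PDF
print(identify_format(b"plain text here"))  # None
```

A result of `None` means only that the signature is not in this table, not that the data has no format. Confirming that a file actually conforms to its apparent format is the separate task of validation.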

Download the File Formats chapter (pdf)

Key Points

  • Concerns about file format structure
  • The need for open, long-term file formats
  • Guidance on selecting suitable file formats (what's best for your needs)
  • Documentation of file formats
  • Directing the development of file formats
  • Transformation of file formats
  • Functionalities of file formats (e.g. compression and Digital Rights Management)