I create data - why is the model relevant to me?

It is a common misconception that data is created or captured and then passed on to someone else to curate. In fact, much of the most crucial information required for effective long-term curation and reuse must be captured at the conceptualisation and collection stages.

Information captured at the conceptualisation stage may include references to funding requirements and specific research aims, planning for the creation and/or collection of data (including capture methods and storage options), as well as any legal constraints that will affect the use of the data created/collected. Recording such constraints is especially important in medical research projects.

Information about the data created/collected may include details of data capture tools and their calibration, and the particular schemas used to record administrative, descriptive, structural and technical metadata.
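As a rough illustration of capturing this information at the point of creation, the sketch below groups a record under the four metadata types named above and writes it out as JSON. The class name, field names and values are all hypothetical and are not drawn from any particular metadata schema; a real project would follow the schema its discipline or archive requires.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class DatasetRecord:
    """Hypothetical record grouping metadata by the four types in the text."""
    descriptive: dict = field(default_factory=dict)
    administrative: dict = field(default_factory=dict)
    structural: dict = field(default_factory=dict)
    technical: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # Sorted keys keep the output stable between runs.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# All values below are made up for illustration.
record = DatasetRecord(
    descriptive={"title": "Example temperature readings"},
    administrative={"funder": "Example Funder", "legal_constraints": "none recorded"},
    structural={"files": ["readings.csv"]},
    technical={"capture_tool": "example-logger", "calibration_date": "2020-01-01"},
)

print(record.to_json())
```

Keeping such a record alongside the data from the outset means the capture tool and calibration details do not have to be reconstructed later.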

Primary research that creates data of any kind is therefore an intrinsic part of the curation lifecycle, because the way in which the data (and its associated representation information and metadata) is designed and captured/created affects:

  • How meaningful the data is to other users.
  • Whether it can be accessed, shared, and reused in the short or long term.
  • Whether the data may be selected for ingest into an archive (i.e. does the data conform to archival standards, and can it be stored and preserved?).
  • Which transformations can be performed on the data (e.g. migration to new file formats).
  • How easily other researchers can find and understand the data for reuse.
  • Whether the data can be proven to be authentic and to have integrity (i.e. that it is what it purports to be and has not been changed or tampered with since creation, a crucial characteristic of scientific data).
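One common way to support the integrity point in the last bullet is to record a checksum when the data is created and compare against it later. The following is a minimal sketch using SHA-256 from the Python standard library; the function names are hypothetical. A matching checksum only shows that the bytes have not changed since the checksum was recorded. It does not by itself establish authenticity, which also depends on provenance information.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_checksum):
    """Compare the file's current checksum with one recorded earlier."""
    return sha256_of(path) == recorded_checksum
```

In use, the value returned by `sha256_of` would be stored with the dataset's metadata at creation time, and `is_unchanged` would be run whenever the file is transferred or ingested.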

Whether simple or complex objects, or structured data such as databases, data becomes the centre of the Curation Lifecycle once it has been conceived and created.