The DCC Curation Lifecycle Model provides a graphical, high-level overview of the stages required for successful curation and preservation of data, from initial conceptualisation through the iterative curation cycle. The model can be used to plan activities within a specific research project, organisation, or consortium to ensure that all necessary stages are undertaken, each in the correct sequence. It is important to note that the description, preservation planning, community watch, and curate and preserve elements of the model should be considered at all stages of activity.
It is a common misconception that data is created or captured and then passed on to someone else to curate. In fact, much of the most crucial information required for effective long-term curation and reuse must be captured at the conceptualisation and collection stages.
Information captured at the conceptualisation stage may include references to funding requirements and specific research aims, planning for the creation and/or collection of data (including capture methods and storage options), as well as any legal constraints that will affect the use of the data created/collected. This is especially important with regard to medical research projects. Information about the data created/collected may include references to data capture tools and calibrations, and the use of particular schemas for recording administrative, descriptive, structural and technical metadata.
Therefore primary research which creates data of any kind is an intrinsic part of the curation lifecycle because the way in which the data (and its associated representation information and metadata) are designed and captured/created affects:
Be they simple or complex objects or structured data such as databases, data — once conceived and created — become the centre of the Curation Lifecycle.
The DCC Curation Lifecycle Model supports activities undertaken by data archivists and preservation experts, and was designed in consultation with practitioners and experts at all stages of the curation cycle.
The model shows the logical sequence of receiving, appraising, selecting (or disposing of) data, followed by ingest and subsequent actions such as preservation, storage, access, and possibly transformations or reappraisals of the data. The model allows curators to identify potential weaknesses in policies, or gaps in the archival chain. It also identifies ongoing concerns, such as community watch, which could be incorporated into working practice, and identifies other stakeholders as sources or users of data, or as people who could pick up the process where your institution's responsibilities end.
Data archiving both preserves and adds value to data. For example:
It is in the interest of data users to be able to access high-quality data which can be proven to be authentic and have integrity. The ways in which data can be accessed depend on good practice in curation. For example, retrieval and querying of data are affected by how the data have been described in the creation and preservation stages of the cycle. Furthermore, data use can produce new results, which themselves need to be curated, so data users become data creators and feed their research back into the cycle.
The DCC Curation Lifecycle Model supports data use and reuse by:
Full lifecycle actions are shown in concentric rings around the data objects at the centre of the model. These are activities which take place at any time during the digital curation lifecycle and are relevant to many different sequential actions. For example, preservation planning should be taken into account as the data is conceptualised, when it is being preserved, and when it is used and re-used. As different people may be responsible for different steps in the lifecycle, there is a risk of repeated effort and an opportunity for different roles to collaborate when undertaking full lifecycle actions, for instance, sharing data on community needs between data creators, archivists, and users.
Sequential actions are the steps which are repeatedly taken to ensure that data is curated according to best practice. This sequence is not simply performed once from start to finish but forms the basis of the curation chain and continues as long as data is being curated. Re-use and transformation of data can lead to the creation of a subset, by selection or query, or create newly derived results which themselves need to be curated.
Occasional actions are those which interrupt or reorder the sequential actions as a result of a decision. For example, upon appraisal it may be decided that the data in question does not fit the remit of a digital repository in which case data may be transferred to another archive, repository, data centre or other custodian. In some instances data is destroyed, perhaps for legal reasons. Other occasional actions are the reappraisal of data which fails validation procedures or the migration of data to a different format to protect it against hardware or software obsolescence.
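The decisions described above can be sketched as a small routing function. This is only an illustration: the `Action` names, the `occasional_action` helper, its parameters, and the order in which the checks are made are all assumptions for this sketch, not something the model prescribes.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical labels for the outcomes described in the text.
    CONTINUE = "continue with the next sequential action"
    TRANSFER = "transfer to another custodian"
    DISPOSE = "dispose of the data"
    REAPPRAISE = "return the data for reappraisal"
    MIGRATE = "migrate to a different format"


def occasional_action(
    fits_remit: bool,
    must_be_destroyed: bool = False,
    passed_validation: bool = True,
    format_at_risk: bool = False,
) -> Action:
    """Pick the occasional action, if any, that interrupts the sequence.

    The priority order of the checks is an assumption made for this sketch.
    """
    if must_be_destroyed:  # for example, for legal reasons
        return Action.DISPOSE
    if not fits_remit:  # data does not fit the repository's remit
        return Action.TRANSFER
    if not passed_validation:  # data failed validation procedures
        return Action.REAPPRAISE
    if format_at_risk:  # hardware or software obsolescence
        return Action.MIGRATE
    return Action.CONTINUE


print(occasional_action(fits_remit=False).value)
# prints "transfer to another custodian"
```

In practice each of these decisions is a matter of institutional policy rather than a boolean flag; the sketch only shows how an occasional action diverts data from the normal sequence.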
Representation information is any information required to understand and render both the digital material and the associated metadata. Digital objects are stored as bitstreams which are not understandable to a human being without further data to interpret them. Representation information is the extra structural or semantic information which converts raw data into something more meaningful. For example, structure information could tell a computer to interpret a string of bits as ASCII characters, and semantic information could explain what a particular mathematical symbol means. Representation information should contain as much structure and semantic information as is required for a defined community (or Designated Community in OAIS terminology) to access the information stored within a digital object. The term can be applied to all levels of abstraction and refers to both the structural and semantic composition. It can therefore be recursive, and is dependent on the knowledge base of the designated community.
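To make the ASCII example concrete, here is a minimal Python sketch. The bit string and the `decode_bits` helper are illustrative assumptions, not part of the model or of OAIS. The point is only that a bitstream carries no meaning until structure information (here, "8-bit bytes, ASCII-encoded") says how to read it.

```python
def decode_bits(bits: str, encoding: str = "ascii") -> str:
    """Interpret a string of '0'/'1' characters as 8-bit bytes in `encoding`.

    The `encoding` argument plays the role of structure information:
    without it, the bits could just as well be a number or an image.
    """
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode(encoding)


# Hypothetical bitstream, written as three bytes for readability.
raw = "01000100" + "01000011" + "01000011"

print(decode_bits(raw))  # prints "DCC"
```

Semantic information would then be a further layer on top, for instance a note explaining what the decoded string "DCC" stands for.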
Representation information is not the same thing as metadata which describes data in administrative, descriptive, technical, structural and preservation terms. In the Lifecycle model, metadata is covered under the Description term.
For more information on representation information, see our DCC Digital Curation Manual Instalment on Representation Information; the Reference Model for an Open Archival Information System (OAIS), Section 2.2.1; and Alan's notes and thoughts about digital preservation, OAIS 7: Representation Information.
The Lifecycle model enables the mapping of granular functionality onto a series of practical activities, which allows creators, curators, and re-users of data to identify where they fit into the bigger picture. By applying the lifecycle model to your own working practices, you will be able to identify whether additional steps are required, whether there are 'missing links' in your data curation processes, and whether some steps can be eliminated for your particular working practices, and to define specific roles and responsibilities within your project/institution across the different stages. By separating discrete stages, the model also supports the identification of collaborators in the process of curation (for example, allowing a data creator to work with a repository in order to design the most appropriate metadata for the data objects instead of 'going in blind'). The sequential stages encourage the documentation of processes and policies by and between different stakeholders, support the building of frameworks of standards and technologies, and identify needs for particular tools and services to support data curators at every level.
A poster of the DCC Curation Lifecycle Model is available. In addition, why not attend one of our DCC Information Days (which can be hosted at your own institution) to help contextualise the range of activities, roles and responsibilities, and tools available for your particular domain? To find out more, write to us at info@dcc.ac.uk.
The model has been developed through internal and public consultation with experts and practitioners in all stages of digital curation. However, we expect the model to evolve over time to reflect actual working practices, and we value your contributions. If you have any suggestions for improvements, let us know by writing to info@dcc.ac.uk.