Lifecycle Model FAQ
- What is the DCC Curation Lifecycle Model?
- I create data, why is the model relevant to me?
- I'm a data archivist, why is the model relevant to me?
- I want to reuse other people's data, why is the model relevant to me?
- What is the difference between the types of action?
- What is representation information?
- How can I use the model in practice?
- What are the benefits of this model?
- Where can I learn more about the DCC Curation Lifecycle Model?
- Can I suggest additions/changes to the model?
Q1. What is the DCC Curation Lifecycle Model?
The DCC Curation Lifecycle Model provides a graphical, high-level overview of the stages required for successful curation and preservation of data, from initial conceptualisation through the iterative curation cycle. The model can be used to plan activities within a specific research project, organisation, or consortium to ensure that all necessary stages are undertaken, each in the correct sequence. It is important to note that the description, preservation planning, community watch, and curate and preserve elements of the model should be considered at all stages of activity.
Q2. I create data, why is the model relevant to me?
It is a common misconception that data is created or captured and then passed on to someone else to curate. In fact, much of the most crucial information required for effective long-term curation and reuse must be captured at the conceptualisation and collection stages.
Information captured at the conceptualisation stage may include references to funding requirements and specific research aims, planning for the creation and/or collection of data (including capture methods and storage options), as well as any legal constraints that will affect the use of the data created or collected. Legal constraints are especially important in medical research projects. Information about the data created or collected may include details of data capture tools and calibrations, and of the particular schemas used for recording administrative, descriptive, structural and technical metadata.
Therefore, primary research that creates data of any kind is an intrinsic part of the curation lifecycle, because the way in which the data (and their associated representation information and metadata) are designed and captured or created affects:
- How meaningful the data is to other users
- Whether it can be accessed, shared, and re-used in the short or long-term
- Whether the data may or may not be selected for ingest into an archive (i.e., does the data conform to archival standards, can the data be stored and preserved?)
- Which transformations can be performed on the data (e.g. migration to new file formats)
- How easily other researchers can find and understand the data for reuse
- Whether the data can be proven to be authentic and have integrity (i.e. is what it purports to be and has not been changed or tampered with since creation, a crucial characteristic of scientific data)
Whether they are simple objects, complex objects, or structured data such as databases, data become the centre of the Curation Lifecycle once they have been conceived and created.
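As an illustration of the integrity point above, the following is a minimal sketch of a checksum (fixity) check, assuming a digest is recorded when the data is created and compared again later. The function names and the sample data are hypothetical, and SHA-256 is just one possible choice of algorithm. The model itself does not prescribe any particular technique.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, recorded_digest: str) -> bool:
    """Compare the current digest with the one recorded at creation."""
    return sha256_digest(data) == recorded_digest


# Hypothetical data object and the digest recorded when it was created.
original = b"sample_id,temperature_c\n1,21.5\n"
recorded = sha256_digest(original)

# Later, re-check an unchanged copy and an altered copy.
print(is_unchanged(original, recorded))
print(is_unchanged(original + b"2,99.9\n", recorded))
```

A matching digest only shows that the bytes have not changed since the digest was recorded. Showing that the data is what it purports to be also depends on the descriptive and administrative information captured alongside it.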
Q3. I'm a data archivist, why is the model relevant to me?
The DCC Curation Lifecycle Model supports activities undertaken by data archivists and preservation experts, and was designed in consultation with practitioners and experts at all stages of the curation cycle.
The model shows the logical sequence of receiving, appraising, selecting (or disposing of) data, followed by ingest and subsequent actions such as preservation, storage, access, and possibly transformations or reappraisals of the data. The model allows curators to identify potential weaknesses in policies, or gaps in the archival chain. It also identifies ongoing concerns, such as community watch, which could be incorporated into working practice, and identifies other stakeholders as sources or users of data, or as people who could pick up the process where your institution's responsibilities end.
Data archiving both preserves and adds value to data. For example:
- Selection decisions affect which data are kept in the long term, and therefore which data are accessible to users
- Ingest and preservation action can lead to the addition of administrative metadata which describes the curation chain
- Data can be transformed into new formats
- Data are placed in a wider context in terms of their long-term management through, for example, the addition of annotations or developing relationships with other datasets
Q4. I want to reuse other people's data, why is the model relevant to me?
It is in the interest of data users to be able to access high-quality data which can be proven to be authentic and have integrity. The ways in which data can be accessed are dependent on good practice in curation: for example, retrieval and querying of data are affected by how the data have been described in the creation and preservation stages of the cycle. Furthermore, data use can produce new results, which themselves need to be curated, so data users become data creators and feed their research back into the cycle.
The DCC Curation Lifecycle Model supports data use and reuse by:
- Ensuring that steps are taken to make the data available in the first place
- Ensuring that data and their descriptions are in forms that are both accessible to and understandable by users
- Protecting data against unauthorised use by maintaining legal constraints or use rights from creation, through curation, to delivery
- Providing the means to assure users of data integrity and authenticity
Q5. What is the difference between the types of action?
Full lifecycle actions are shown in concentric rings around the data objects at the centre of the model. These are activities which take place at any time during the digital curation lifecycle and are relevant to many different sequential actions. For example, preservation planning should be taken into account as the data is conceptualised, when it is being preserved, and when it is used and re-used. As different people may be responsible for different steps in the lifecycle, there is a risk of repeated effort and an opportunity for different roles to collaborate when undertaking full lifecycle actions, for instance, sharing data on community needs between data creators, archivists, and users.
Sequential actions are the steps which are repeatedly taken to ensure that data is curated according to best practice. This sequence is not simply performed once from start to finish but forms the basis of the curation chain and continues as long as data is being curated. Re-use and transformation of data can lead to the creation of a subset, by selection or query, or create newly derived results which themselves need to be curated.
Occasional actions are those which interrupt or reorder the sequential actions as a result of a decision. For example, upon appraisal it may be decided that the data in question does not fit the remit of a digital repository in which case data may be transferred to another archive, repository, data centre or other custodian. In some instances data is destroyed, perhaps for legal reasons. Other occasional actions are the reappraisal of data which fails validation procedures or the migration of data to a different format to protect it against hardware or software obsolescence.
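The relationship between sequential and occasional actions can be sketched as a small model in code. This is only an illustration under stated assumptions: the action names paraphrase this FAQ and are not an official list, the function names are hypothetical, and the wrap-around from transformation back to receipt is our reading of how derived data re-enter the cycle.

```python
from enum import Enum


class Action(Enum):
    """Sequential actions, in order (names paraphrased from this FAQ)."""
    RECEIVE = "receive"
    APPRAISE_AND_SELECT = "appraise and select"
    INGEST = "ingest"
    PRESERVATION_ACTION = "preservation action"
    STORE = "store"
    ACCESS_AND_REUSE = "access, use and reuse"
    TRANSFORM = "transform"


SEQUENCE = list(Action)  # Enum members iterate in definition order.


def next_sequential(current: Action) -> Action:
    """Return the next sequential action. The chain wraps around because
    transformed or derived data themselves need to be curated."""
    index = SEQUENCE.index(current)
    return SEQUENCE[(index + 1) % len(SEQUENCE)]


def after_appraisal(fits_remit: bool) -> str:
    """Occasional action: data outside the repository's remit are disposed of
    (transferred elsewhere or destroyed) instead of being ingested."""
    return Action.INGEST.value if fits_remit else "dispose (transfer or destroy)"


def after_validation(passed: bool, current: Action) -> Action:
    """Occasional action: data that fail validation return for reappraisal."""
    return next_sequential(current) if passed else Action.APPRAISE_AND_SELECT


print(next_sequential(Action.TRANSFORM).value)
print(after_appraisal(fits_remit=False))
```

Full lifecycle actions such as description and preservation planning are deliberately left out of the sequence, since they apply at every step and do not occupy a single position.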
Q6. What is representation information?
Representation information is any information required to understand and render both the digital material and the associated metadata. Digital objects are stored as bitstreams which are not understandable to a human being without further data to interpret them. Representation information is the extra structural or semantic information which converts raw data into something more meaningful. For example, structure information could tell a computer to interpret a string of bits as ASCII characters, and semantic information could explain what a particular mathematical symbol means. Representation information should contain as much structure and semantic information as is required for a defined community (or Designated Community in OAIS terminology) to access the information stored within a digital object. The term can be applied to all levels of abstraction and refers to both the structural and semantic composition. It can therefore be recursive, and is dependent on the knowledge base of the designated community.
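The ASCII example above can be made concrete with a short sketch. It reads one stored bitstream under two different pieces of structure information. The three bytes and the variable names are made up for illustration, and semantic information (what the resulting text or number means) is not modelled here.

```python
# The same stored bitstream, with no representation information attached.
bitstream = bytes([0x44, 0x43, 0x43])

# Structure information A: "each byte is one ASCII character".
as_text = bitstream.decode("ascii")

# Structure information B: "the bytes form one big-endian unsigned integer".
as_number = int.from_bytes(bitstream, byteorder="big")

print(as_text)    # DCC
print(as_number)  # 4473667
```

Neither reading is wrong as far as the bits are concerned. Only the representation information kept with the object tells a future user which interpretation was intended.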
Representation information is not the same thing as metadata, which describes data in administrative, descriptive, technical, structural and preservation terms. In the Lifecycle Model, metadata is covered under the Description term.
For more information on representation information, see our DCC Digital Curation Manual Instalment on Representation Information; the Reference Model for an Open Archival Information System (OAIS), Section 2.2.1; and Alan's notes and thoughts about digital preservation, OAIS 7: Representation Information.
Q7. How can I use the model in practice?
- Firstly, identify which ongoing, sequential, or occasional activities you are currently undertaking. Examine how these stages interact with other stages and stakeholders. Which other steps are you engaged with indirectly (for example, perhaps you use data provided by a digital repository)?
- Identify your needs in terms of data curation. Which steps are problematic or inefficient for you (for example, do you have any problems describing or retrieving data)? Does the curation cycle break at any point, risking your data, perhaps due to gaps in provision at your institution? Are there any problems in your interactions with other stakeholders who form part of the chain?
- Work with the DCC to establish whether there are any existing tools or services which you could make use of to improve the quality and efficiency of your curation activities. Express your requirements for other tools (which perhaps don't exist yet). Look at your institutional policies for data curation — can they be improved? Even if you don't think of yourself as a curator, your requirements can help to shape policy.
- Work locally and with the DCC to identify collaborators who could potentially provide the services that are outside your remit/expertise.
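The first two steps above amount to comparing what you already do with what the model describes. A minimal sketch of that comparison follows, assuming you keep a simple list of your current activities. The activity labels paraphrase this FAQ and are not an official vocabulary, and `find_gaps` is a hypothetical helper.

```python
# Activity names paraphrased from this FAQ; the exact labels are assumptions.
MODEL_ACTIVITIES = {
    "description",
    "preservation planning",
    "community watch",
    "appraise and select",
    "ingest",
    "preservation action",
    "store",
    "access and reuse",
    "transform",
}


def find_gaps(current_activities: set) -> list:
    """Return model activities not yet covered, sorted for stable output."""
    return sorted(MODEL_ACTIVITIES - current_activities)


# Hypothetical research group that describes, stores and shares its data.
ours = {"description", "store", "access and reuse"}
for activity in find_gaps(ours):
    print("Not covered:", activity)
```

A gap in the output is not necessarily a problem. It may simply mark a step handled by another stakeholder, which is a prompt to check that the hand-over between you actually works.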
Q8. What are the benefits of this model?
The Lifecycle Model enables the mapping of granular functionality onto a series of practical activities, which allows creators, curators, and re-users of data to identify where they themselves fit into the bigger picture. By applying the model to your own working practices, you will be able to identify whether additional steps are required, whether there are 'missing links' in your data curation processes, and whether some steps can be eliminated for your particular working practices. You will also be able to define specific roles and responsibilities within your project or institution across the different stages. By separating discrete stages, the model also supports the identification of collaborators in the process of curation (for example, allowing a data creator to work with a repository in order to design the most appropriate metadata for the data objects instead of 'going in blind'). The sequential stages encourage the documentation of processes and policies by and between different stakeholders, support the building of frameworks of standards and technologies, and identify needs for particular tools and services to support data curators at every level.
Q9. Where can I learn more about the DCC Curation Lifecycle Model?
A poster of the DCC Curation Lifecycle Model is available. In addition, why not attend one of our DCC Information Days (which can be hosted at your own institution) to help contextualise the range of activities, roles and responsibilities, and tools available for your particular domain? To find out more, write to us.
Q10. Can I suggest additions/changes to the model?
The model has been developed through internal and public consultation with experts and practitioners in all stages of digital curation. However, we realise that the model will evolve over time to reflect actual working practices, and we value your contributions. If you have any suggestions for improvements, let us know.
