DCC Digital Curation Manual Instalments

Archiving Web Resources

Author:
Dave Thompson
Wellcome Trust

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland licence.

Note: Opinions expressed are those of the individual author and do not necessarily represent the views of the DCC or the Partner Institutions.

Abstract:

The World Wide Web is among the most important information resources, and is certainly the most voluminous. In a relatively short time, it has become a vital medium for a range of academic and commercial publishers. However, until recently, little effort has been directed towards ensuring the long-term preservation of the digital assets that reside online. The web's dynamic nature makes it prone to frequent change, and without a means of capture and preservation it is likely that vast quantities of content will be lost forever. Because the web hosts materials that vary widely in format, scale and behaviour, there are inevitable issues that must be overcome to facilitate their collection, management and preservation.

Key Points:

  • Automation of harvesting
  • Deposit approach
  • Selection, negotiation and capture
  • Issues associated with the "deep" web
  • Existing initiatives/products (e.g. Internet Archive, NWA, PANDAS)
  • Legal implications
  • Collaboration and responsibility
  • Non-standard media types
