Web Curator Tool
The Web Curator Tool (WCT) is a tool for managing the selective web harvesting process. It is designed for use in libraries and other collecting organisations, and supports collection by non-technical users while still allowing complete control of the web harvesting process.
The WCT Project is a collaborative effort by the National Library of New Zealand and the British Library, initiated by the International Internet Preservation Consortium. The WCT software was developed by Sytec Resources Ltd and is now available under the terms of the Apache License.
Functionality:
The tool's workflow encompasses the following tasks:
- Harvest Authorisation: seeking and recording permission to harvest web material, and to make it accessible to the general public.
- Selection and scoping: determining what material should be harvested, be it a web site, a web page, a partial web site, a group (or collection) of web sites, or any combination of these.
- Scheduling: determining when a harvest should occur, and when it should be repeated.
- Description: describing harvests with basic Dublin Core metadata and other specialized fields (or by providing a reference to an external catalogue).
- Harvesting: the Web Curator Tool will download the selected web material at the appointed time using the Internet Archive's Heritrix web crawler. Each installation can have multiple harvesters on different machines, each of which can perform several harvests simultaneously.
- Quality Review: tools are provided for making sure the harvest worked as expected, and correcting simple harvest errors.
- Endorsing and submitting: if the harvest was a success, it is endorsed and then submitted to an external digital archive.
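The task list above can be read as an ordered sequence of stages. The following is a minimal sketch of that reading in Python, not the WCT's actual data model or API: the `Stage` names and the `next_stage` and `walk` helpers are hypothetical, and the strictly linear order is an assumption made for illustration (a real installation may revisit earlier stages).

```python
from enum import Enum
from typing import List, Optional


class Stage(Enum):
    """Workflow tasks in the order listed above (names are illustrative)."""
    HARVEST_AUTHORISATION = 1
    SELECTION_AND_SCOPING = 2
    SCHEDULING = 3
    DESCRIPTION = 4
    HARVESTING = 5
    QUALITY_REVIEW = 6
    ENDORSING_AND_SUBMITTING = 7


def next_stage(current: Stage, harvest_ok: bool = True) -> Optional[Stage]:
    """Return the stage that follows `current`, or None when the workflow ends.

    Assumption: a harvest that does not pass quality review is not endorsed,
    so the workflow stops at review instead of moving on to submission.
    """
    if current is Stage.QUALITY_REVIEW and not harvest_ok:
        return None
    if current is Stage.ENDORSING_AND_SUBMITTING:
        return None
    return Stage(current.value + 1)


def walk(harvest_ok: bool = True) -> List[Stage]:
    """Collect every stage visited, starting from harvest authorisation."""
    visited = []
    stage: Optional[Stage] = Stage.HARVEST_AUTHORISATION
    while stage is not None:
        visited.append(stage)
        stage = next_stage(stage, harvest_ok)
    return visited
```

Calling `walk()` visits all seven stages in order, while `walk(harvest_ok=False)` stops at quality review, mirroring the condition that only a successful harvest is endorsed and submitted.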
Level of Expertise:
The WCT is designed for non-technical users in libraries and other collecting institutions who need to capture web material for archival purposes. It is designed to run in an enterprise setting, and would normally be installed by a system administrator (it is not a desktop application).
