SafeArchive

SafeArchive is a policy-driven auditing tool developed to make LOCKSS (Lots Of Copies Keeps Stuff Safe) networks easier to monitor and manage, though there are plans to extend it to other replication networks. It uses machine-readable policies (rules) to determine, at a particular node, which content to offer to the network for replication and which content from the network to replicate. It provides documentation of all such rules in place across the network, and checks compliance in practice. It can be used to answer questions such as how many times a particular resource has been replicated, how recently it has been replicated or validated, and whether all offered resources have been replicated. It also provides tools for detecting and repairing inconsistencies at each node.
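The kind of audit described above can be illustrated with a minimal sketch. This is not SafeArchive's own policy schema or API: the names `Policy`, `Replica` and `audit`, and the two thresholds, are hypothetical and chosen only to show how a machine-readable rule might be checked against the replicas a network actually holds.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Policy:
    # Hypothetical rule: thresholds an offered resource must satisfy.
    min_replicas: int
    max_days_since_validation: int


@dataclass(frozen=True)
class Replica:
    # Hypothetical record of one copy of a resource held at one node.
    resource: str
    node: str
    last_validated: datetime


def audit(offered, replicas, policy, now):
    """Return {resource: [problem, ...]} for each offered resource that breaks the policy."""
    cutoff = now - timedelta(days=policy.max_days_since_validation)
    problems = {}
    for resource in offered:
        copies = [r for r in replicas if r.resource == resource]
        nodes = {r.node for r in copies}
        fresh_nodes = {r.node for r in copies if r.last_validated >= cutoff}
        issues = []
        if not nodes:
            issues.append("not replicated")
        elif len(nodes) < policy.min_replicas:
            issues.append(
                f"only {len(nodes)} of {policy.min_replicas} required replicas"
            )
        if len(fresh_nodes) < len(nodes):
            stale = len(nodes) - len(fresh_nodes)
            issues.append(
                f"{stale} replica(s) not validated in the last "
                f"{policy.max_days_since_validation} days"
            )
        if issues:
            problems[resource] = issues
    return problems


# Illustrative data only; resource and node names are made up.
now = datetime(2014, 11, 21)
policy = Policy(min_replicas=2, max_days_since_validation=30)
replicas = [
    Replica("study-1", "node-a", datetime(2014, 11, 20)),
    Replica("study-1", "node-b", datetime(2014, 11, 1)),
    Replica("study-2", "node-a", datetime(2014, 9, 1)),
]
report = audit(["study-1", "study-2", "study-3"], replicas, policy, now)
for resource, issues in sorted(report.items()):
    print(resource, "->", "; ".join(issues))
```

In this sketch the report covers the three questions mentioned above: how many copies exist, how recently each was validated, and whether every offered resource has been replicated at all.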

Provider

SafeArchive was developed by the partners involved in Data-PASS (Data Preservation Alliance for the Social Sciences) with funding from the Institute of Museum and Library Services and NDIIPP (National Digital Information Infrastructure and Preservation Program of the Library of Congress). It is managed by IQSS (Institute for Quantitative Social Science), Harvard University.

Licensing and cost

The system is free to download and use. The source code has been released under the Apache Licence version 2.0.

Development activity

Version 2.0 was released on 17 July 2013. The source code repository shows that development has continued since that release.

Platform and interoperability

SafeArchive runs on a RHEL 6 or CentOS 6 server with at least 4 GB of memory and 60 GB of storage. It depends on MySQL version 5, Java version 1.6, GlassFish version 3, Apache Commons Logging and Subversion. An Amazon Machine Image is available for rapid deployment on the Amazon Web Services cloud.

It should be used in conjunction with a private LOCKSS network with at least one actively running member. SafeArchive should be able to access the LOCKSS daemon status, and read the debug account credentials from the LOCKSS XML file over the Web.

Documentation and user support

A set of task-based user manuals and installation guides is available from the SafeArchive website. Technical support is available from the Archival Replication Technology Google Group.

Usability

Interaction with the system is via a web-based interface.

Expertise required

Installation by hand requires some confidence working with Linux servers, though detailed guidance is provided. A script is provided that automates most of the process on a fresh installation of the RHEL 6 Server operating system. Setting up the Amazon Machine Image requires no special expertise.

Using the system also requires no special expertise, though prior familiarity with LOCKSS would be an advantage.

Standards compliance

The audit reports generated by SafeArchive are suitable as evidence in the context of the Data Seal of Approval, TRAC (Trustworthy Repositories Audit and Certification) and ISO 16363.

The system supports syndication of content via OAI-PMH and the protocols of the Dataverse Network.

Influence and take-up

The current level of use is unknown. SafeArchive has, however, been used by Data-PASS (which includes ICPSR, the Electronic and Special Media Records Service Division of NARA, and social science institutes at Harvard University, the University of North Carolina, Chapel Hill, the University of Connecticut, the University of California, Los Angeles, and Syracuse University), the USDocs Digital Federal Depository Library Program, and the Council of Prairie and Pacific University Libraries.

Last reviewed: 21 November 2014