PERICLES Extraction Tool

The PERICLES Extraction Tool (PET) captures information about the environment in which digital objects are created and modified. The nature of this information depends on the extraction modules used and how they have been configured. The following are some examples of information that can be collected:

  • List of open files at any given time
  • Changes made to certain tracked files, and when they occurred
  • System and platform information
  • External font dependencies for PDFs

Modules may be used on demand ('snapshot mode') or set to work continuously in the background (as 'daemons') to monitor changes, logging the information for later use. The information gathered is intended to support the validation, evaluation, preservation and re-use of the digital objects in question. The tool is aimed at researchers or workstation administrators.
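
The difference between the two modes can be illustrated with a minimal Python sketch. This is not PET code (PET itself is a Java tool). The function names `snapshot` and `watch`, the fields collected, and the use of polled file modification times are all assumptions made purely for illustration.

```python
import os
import platform
import time


def snapshot():
    """Collect system and platform information once, on demand."""
    return {
        "system": platform.system(),
        "release": platform.release(),
        "machine": platform.machine(),
        "python": platform.python_version(),
    }


def watch(paths, interval=1.0, rounds=3):
    """Poll tracked files and log each change in modification time.

    A real daemon would run until stopped; `rounds` bounds the loop
    here so that the sketch terminates.
    """
    # Remember the modification time of every file that exists at start-up.
    last_seen = {p: os.path.getmtime(p) for p in paths if os.path.exists(p)}
    log = []
    for _ in range(rounds):
        time.sleep(interval)
        for p in paths:
            if not os.path.exists(p):
                continue
            mtime = os.path.getmtime(p)
            if last_seen.get(p) != mtime:
                # Record what changed and when, for later use.
                log.append({"path": p, "modified": mtime})
                last_seen[p] = mtime
    return log
```

Calling `snapshot()` corresponds to on-demand use, while `watch([...])` stands in for a background daemon that accumulates a log of changes to tracked files.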

Provider

The tool was developed by PERICLES, a project funded under the 7th Framework Programme of the European Commission.

Licensing and cost

The tool is free of charge to download and use. The source code is released under the Apache Licence version 2.0.

Development activity

Version 1.0 of the tool was released on 31 October 2014. It will continue to be developed by the PERICLES project, which runs until February 2017.

Platform and interoperability

The tool requires Java 7 or higher to be installed. It is otherwise platform independent, though by their nature some extraction modules can only be used on certain operating systems. For example, the 'lsof' module may be used on UNIX-like operating systems but not Windows, for which a roughly equivalent 'handle' module is provided. Results may be exported as XML or JSON.
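
As a rough illustration of operating-system-dependent module selection and of the two export formats, consider the following Python sketch. It does not use PET's API: the function names, the record layout and the XML tag names are hypothetical, and only the module names 'lsof' and 'handle' come from the description above.

```python
import json
import platform
import xml.etree.ElementTree as ET


def open_files_module(system=None):
    """Pick the open-files module name for an operating system.

    Simplification: every system other than Windows is treated as
    UNIX-like.
    """
    system = system or platform.system()
    return "handle" if system == "Windows" else "lsof"


def to_json(record):
    """Serialise a flat record (hypothetical layout) as JSON."""
    return json.dumps(record, sort_keys=True)


def to_xml(record, root_tag="result"):
    """Serialise the same flat record as XML, one element per key."""
    root = ET.Element(root_tag)
    for key, value in sorted(record.items()):
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")
```

For example, `to_json({"module": open_files_module("Linux"), "os": "Linux"})` yields a JSON object naming the 'lsof' module, and `to_xml` renders the same record as nested elements.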

Documentation

A quick start guide is provided for users, alongside an overview of what the tool does, how it works, and how it might be used. There is also more detailed documentation aimed at developers.

Usability

The tool has a graphical user interface that will suit most users, though it can also be used from the command line. The developers warn that it has not been extensively tested and may require careful configuration to achieve useful results in any given context.

Expertise required

Running the software is straightforward. Configuring modules and daemons, however, involves editing JSON files, which some users may find unintuitive.

Standards compliance

The tool relies on commonly supported data formats such as JSON and XML, and supports several storage backends.

Last reviewed: 14 November 2014