NLNZ Metadata Extraction Tool
The Metadata Extraction Tool automatically extracts a limited set of metadata from the headers of digital files; it can process both individual files and larger batches. The Tool outputs this information as XML, with the goal of facilitating transfer into a preservation metadata repository.
Provider
The National Library of New Zealand (NLNZ)
Licensing and cost
Apache License, Version 2.0; free of charge.
Development activity
Version 3.6GA was released in June 2014.
The initial version of the tool was released in 2003; redevelopment for version 3 began in 2007. Contact information on the tool's site implies ongoing support; no firm information is available about ongoing development.
Platform and interoperability
The software uses Java (1.4 or later if building from source, 1.6 if using precompiled binaries) and XML, and has been tested in Windows and Linux/Unix environments.
Functional notes
The Metadata Extraction Tool uses a library of ‘adapters’ to extract metadata for specific file types. Adapters have been created for the following formats: BMP, GIF, JPEG and TIFF; MS Word, WordPerfect, Open Office, MS Works, MS Excel, MS PowerPoint, and PDF; WAV, MP3, BWF, and FLAC; HTML and XML; and ARC. If the file type is unknown, the Tool applies a generic adapter, which extracts a limited amount of baseline metadata.
The application opens all files as read-only, ensuring the integrity of original files.
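The adapter approach described above can be illustrated with a minimal sketch. This is not the Tool's own code (the Tool is written in Java), and every name in it (`generic_adapter`, `gif_adapter`, `ADAPTERS`, `extract`) is hypothetical. It assumes adapters are chosen by file extension, which may differ from how the Tool identifies formats, and it reads only the fixed GIF header fields as an example of header-based extraction. Files are opened in read-only binary mode, mirroring the integrity guarantee noted above.

```python
import os
import struct
from typing import Callable, Dict

Metadata = Dict[str, object]


def generic_adapter(path: str) -> Metadata:
    """Hypothetical fallback adapter: baseline metadata for any file."""
    with open(path, "rb") as fh:  # read-only; the file is never modified
        fh.seek(0, os.SEEK_END)
        size = fh.tell()
    return {
        "filename": os.path.basename(path),
        "size_bytes": size,
        "adapter": "generic",
    }


def gif_adapter(path: str) -> Metadata:
    """Hypothetical format adapter: adds width/height from the GIF header."""
    meta = generic_adapter(path)
    with open(path, "rb") as fh:
        header = fh.read(10)
    # A GIF file starts with a 6-byte signature, followed by the logical
    # screen width and height as little-endian unsigned 16-bit integers.
    if len(header) >= 10 and header[:6] in (b"GIF87a", b"GIF89a"):
        width, height = struct.unpack_from("<HH", header, 6)
        meta.update({"adapter": "gif", "width": width, "height": height})
    return meta


# Assumption: adapters are looked up by lower-cased file extension.
ADAPTERS: Dict[str, Callable[[str], Metadata]] = {
    ".gif": gif_adapter,
}


def extract(path: str) -> Metadata:
    """Pick the adapter for this file type, or fall back to the generic one."""
    ext = os.path.splitext(path)[1].lower()
    adapter = ADAPTERS.get(ext, generic_adapter)
    return adapter(path)
```

Calling `extract` on a `.gif` file returns the baseline fields plus `width` and `height`; any unregistered extension falls through to `generic_adapter`, which reports only the file name and size.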
Documentation and user support
The documentation page on the tool’s site includes user and installation guides, as well as a developer guide.
Users can report bugs through the SourceForge site, which also lists a contact email.
Usability
The tool has both a GUI and command line interface.
Expertise required
Installation and configuration require solid knowledge of application design and technologies. Users should have comprehensive knowledge of metadata standards and formats, particularly regarding preservation metadata.
Standards compliance
The Metadata Extraction Tool currently outputs its XML files using the NLNZ preservation metadata schema; however, the software can be configured to support other schemas.
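To make the idea of schema-configurable XML output concrete, here is a small sketch under stated assumptions. It does not reproduce the NLNZ preservation metadata schema or the Tool's actual configuration mechanism; the function `to_xml`, its parameters, and all element names are hypothetical. A simple renaming table stands in for whatever mapping the Tool uses to target a different schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def to_xml(
    meta: Dict[str, object],
    root_tag: str = "metadata",
    field_map: Optional[Dict[str, str]] = None,
) -> str:
    """Serialise extracted metadata as XML.

    field_map renames internal field names to the element names of a
    target schema; unmapped fields keep their internal names.
    """
    field_map = field_map or {}
    root = ET.Element(root_tag)
    for key, value in meta.items():
        child = ET.SubElement(root, field_map.get(key, key))
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")
```

With the defaults, `to_xml({"filename": "a.gif", "width": 3})` produces a `metadata` element containing `filename` and `width` children. Passing a different `root_tag` and `field_map` yields the same values under different element names, which is the essence of retargeting output to another schema.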
Influence and take-up
SourceForge statistics show over 98,000 downloads since 2007. Version 3.6GA was downloaded over 7,500 times between June and November 2014.
