Automated Metadata Generation
Automated metadata extraction is not yet widely used in digital preservation workflows. Yet it can improve the efficiency of time and resource management within preservation systems, and it can also alleviate the problems associated with the “metadata bottleneck”. Applying automated metadata extraction successfully requires informed solutions based on a broad understanding, and integration, of existing methods and tools. In particular, solutions should identify the weak links in the metadata collection workflow, so as to highlight the components that require further development, and should be firmly grounded in strict quality control at each stage of extraction.
This chapter provides an overview of existing methods and tools, paying special attention to quality-related issues, in particular the precision and recall of extracted metadata and the need for human intervention. It presents examples of ingest processes and illustrates the essential role that automated metadata extraction plays within them.
In addition, the chapter makes the case for using automated metadata extraction as part of a metadata enrichment scenario.
The chapter is relevant to the pre-ingest and ingest stages of the digital preservation workflow.
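As a rough illustration of what automated extraction at ingest can involve, the sketch below collects a handful of technical metadata elements for a single file using only the Python standard library. It is not taken from the chapter: the function name `extract_basic_metadata` and the choice of elements (file name, size, modification time, guessed MIME type, SHA-256 checksum) are assumptions made for this example, and a real workflow would normally rely on dedicated format identification and characterisation tools.

```python
import hashlib
import mimetypes
import os
from datetime import datetime, timezone


def extract_basic_metadata(path):
    """Collect a few technical metadata elements for one file.

    Hypothetical helper for illustration only; the element names
    are assumptions, not a schema prescribed by the chapter.
    """
    stat = os.stat(path)

    # Compute a fixity checksum by reading the file in chunks,
    # so that large files do not need to fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)

    # guess_type() looks only at the file extension, so the result
    # may be None or wrong; it is a weak form of format identification.
    mime_type, _encoding = mimetypes.guess_type(path)

    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "filename": os.path.basename(path),
        "size_bytes": stat.st_size,
        "modified": modified.isoformat(),
        "mime_type": mime_type,
        "sha256": digest.hexdigest(),
    }
```

Calling `extract_basic_metadata("report.txt")` on an existing file returns a dictionary that could be stored alongside the object at ingest. Elements that cannot be derived reliably in this way, such as title or subject, are where quality control and human intervention become important.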
Key Points
- Overview of methods for automated metadata extraction
  - file types
  - elements being extracted
  - quality metrics
- Case studies in the use of automated metadata extraction in the digital preservation lifecycle
  - Automated metadata extraction as part of the ingest
  - Automated metadata extraction for metadata enrichment
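The quality metrics named above, precision and recall, can be made concrete with a small calculation. The sketch below assumes that the values extracted for one element (here, keywords) are compared against a manually created reference set; the function name `precision_recall` and the sample keywords are invented for illustration. Precision is the share of extracted values that are correct, and recall is the share of reference values that the extractor found.

```python
def precision_recall(extracted, reference):
    """Return (precision, recall) for extracted values against a reference set.

    Both arguments are iterables of hashable values, for example
    keyword strings. Empty inputs yield 0.0 rather than raising.
    """
    extracted_set = set(extracted)
    reference_set = set(reference)

    # Values that the extractor produced and that the reference confirms.
    true_positives = len(extracted_set & reference_set)

    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(reference_set) if reference_set else 0.0
    return precision, recall


# Invented sample data: four extracted keywords, five reference keywords,
# three of which overlap.
extracted_keywords = {"metadata", "preservation", "ingest", "workflow"}
reference_keywords = {"metadata", "preservation", "ingest", "quality", "curation"}

precision, recall = precision_recall(extracted_keywords, reference_keywords)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.75 recall=0.60
```

Low precision suggests that extracted values need human review before they are accepted, while low recall suggests that the extractor misses values a cataloguer would have supplied.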
