iPres 2009: Kejser on Danish cost model on migration
[CR: I missed the start of this talk while posting the previous one!]
Using a cost model for digital curation, based on the functional breakdown from OAIS. Activities are broken down repeatedly until you get to costable components; it looks rather frightening. They have a use case for digital migration. Cost factors include format interpretation and software provision (development of reader, writer & translator). Interesting data in person weeks for development of migration, e.g. TIFF to PDF/A as 34.7 person weeks (!!)
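To make the breakdown idea concrete, here is a minimal sketch of how such a model might roll costs up from costable leaf components. This is not the Danish model itself: the `Activity` class, the tree shape and every person-week figure below are hypothetical placeholders, chosen only to mirror the cost factors named above.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    """One node in a functional breakdown (hypothetical structure)."""
    name: str
    person_weeks: float = 0.0  # only meaningful for leaf (costable) components
    children: list["Activity"] = field(default_factory=list)

    def total(self) -> float:
        # A leaf is costed directly; anything else is the sum of its parts.
        if not self.children:
            return self.person_weeks
        return sum(child.total() for child in self.children)


# Placeholder figures, NOT taken from the talk.
migration = Activity("migration", children=[
    Activity("format interpretation", person_weeks=1.0),
    Activity("software provision", children=[
        Activity("reader", person_weeks=2.0),
        Activity("writer", person_weeks=3.0),
        Activity("translator", person_weeks=4.0),
    ]),
])

print(f"{migration.name}: {migration.total()} person weeks")
```

The point of the sketch is only that the total is as good as the leaf estimates: any error in a leaf propagates straight up the tree.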
Reporting results of some earlier work: A-archives dating 1968-1998, very heterogeneous; B & C archives more recent and more homogeneous. Shows results from model predictions and actual costs; the differences are mostly because the A archives were so hard. Also, for the better archives, the model did well overall but under-estimated some parts and over-estimated other parts.
Second test case was migration of 6 TB of data in 2000 files (very big ones: 300 MByte each). They bought software; the model over-estimated the “development” time on this basis, but under-estimated the processing, perhaps because of the very big files; throughput was very low.
Overall, they found that detailed cost factors make the model not an accurate predictor (but still useful). Precision is an issue; models are inaccurate per se, but sometimes give an impression of accuracy.
Searching for studies on format life expectancy and migration frequency [CR: longer and less frequent respectively, in my view].
Question: how about software re-use? They cost on a first-mover basis. Also, migration tools themselves become obsolete.
Question: why did you think migrating from PDF was necessary? Hardly a format at risk. Turns out to be a move from proprietary to non-proprietary.
Question on scaling: going from thousands to hundreds of millions of objects, will these results still apply? Answer was that they will. [CR: doubt this; biggest flaw in LIFE so far has been devastating scaling problems.]