Navigating the uncertain waters of data archiving and curation
The DCC recognises that within institutions research data management initiatives have emanated from different departments (the library, the research office, IT services etc.) and from those with different roles. Many of the individuals now working in this area have had to learn new skills and deal with new challenges. One such individual is Gaz J Johnson (@llordllama), Repository Manager at the University of Leicester. Gaz has kindly written a guest blog post about his experiences of ‘navigating the uncertain waters of data archiving and curation’.
As those who might have read some of my posts on the University of Leicester’s Library blog will be aware, I’m one of a number of repository managers and librarians who are taking their first tentative steps into the terrifyingly uncertain waters of research data archiving.
Did I say terrifying? I meant of course exciting! Although when I start speculating about the scale and complexity of research data that my institution creates in the course of its average working day, a little of the mind-killing fear does return. And these waters seem awfully deep. And wide. And just how am I supposed to navigate them?
That said, archiving research data isn’t a total unknown quantity for me. Over the years, on my local institutional repository, the Leicester Research Archive, we’ve had a handful of pieces of code and the occasional data set to store from academics. Nothing massive and, to be honest, nothing especially complex: mostly simple Access-based databases or CSV-delimited outputs.
Certainly they’ve not been at the forefront of my mind when looking to expand our content. I almost typed “full text content” there, which is a real giveaway as to what we have been focussing on collecting.
Interestingly, though, in the last couple of years I’ve been having more and more conversations with academics looking for somewhere to host their project data outputs. I’m heartened they’ve looked towards the repository and our expertise with digital curation, although I’ve been acutely aware that practically there must be a lot of bridges to cross.
These calls have not been a flood, more of a trickle if I were honest. However, the frequency with which my phone buzzes and I’m suddenly talking to a concerned potential PI finishing off a funding proposal, which has a requirement to archive the data outputs, has slowly risen.
It’s something that’s not gone unnoticed in our institution, and with moves from funders like EPSRC to bring about a firm open data policy in the coming years, suddenly we’re up against the clock.
Of course, thinking about doing something, or even having a response to a policy in place, is one thing; having an operational response to cope with the practical side of things is another entirely, as any repository manager will tell you.
Personally I consider myself reasonably tech savvy, and while I’m pretty rusty at coding, say, I am fairly up to date on a lot of the issues around running an effective open access repository.
But when I started to think at the start of the year about working with data archiving, I had to take a long hard look at my skills and experiences and question whether I was equipped to deal with the issue. Fundamentally yes; practically… well, it came down to that often-asked day one interview question: “If you were managing the institution’s data repository, what’s the first thing you’d need to do?”
I think I’ve gone a little way towards answering this through attendance at one of the DCC’s residential data management forums and following this up with a JISC/BL DataCite workshop. Both of these have been useful for a number of reasons.
Firstly, I’ve met a good cross section of people who seemed to be at various stages of trying to answer the same question; and with the exception of a crystallographer I met, most of them weren’t much further along than I.
One of the highlights of attending the workshops was talking with various people about the type, scale and complexity of just what they considered research data. Personally I’d not even begun to give any thought to access to or curation of non-born-digital materials!
Secondly it’s raised my awareness of some of the projects and resources I might be able to tap into.
Certainly I suspect that the solution to managing research data is not going to come entirely from within any one institution’s or project’s resources.
Finally, it’s given me an awareness of where I need some upskilling. Some of my repository and library born skills like metadata handling, collection management and information organisation are all going to come into play; as it seems are my finely honed advocacy skills (I remain unconvinced that the mass of academics will take to data archiving like ducks to water any more than they have to open access).
But beyond this there are clearly new areas I need to understand, and fresher concepts, like minting DOIs, that will be totally new to me.
From where I stand now I feel encouraged that rather than being on the shores of a vast unknown data archiving lake, I now have at least a canoe and a couple of trusty oars. There are going to be uncertain currents ahead, but at least now I can see how I might get to the other side.
But I suspect there’s still a whole lot of paddling to go!