Review: DCC Roadshow in Oxford

Steve Walsh | 22 September 2011

As a rather out-of-practice ecologist I was looking forward to the DCC Roadshow with a little trepidation. Would the material be aimed at digital curators writing code to fine-tune their repository functionality? Or would it be of help to someone like me… drafted in from a different area to work for a few months, trying to get to grips with the management and sharing of spatial data for the JISC-funded IGIBS project?

Well, all concerns were rapidly dispelled as the first day got underway and the programme was delivered with a broad church in mind. We were accommodated in the rather glorious Wadham College and I began to feel like a well-cared-for data object: securely stored in safe surroundings, with metadata that had obviously been carefully read, as I was now being professionally transformed into a more up-to-date format just before my existing file type became redundant.

It was clear that Oxford had taken data management very seriously and was a centre for developing new ideas and services to aid its researchers and the wider community. Particularly inspiring was a presentation from David Shotton, who had used his position as Director of the Bioinformatics Research Group to add considerable weight to the case for developing a professional data management infrastructure. He was leading a project to manage, publish and cite datasets. Its use of infectious diseases as a theme was the icing on the cake for me. The freeing of data on this subject provides both an academic good and a very obvious public good, especially to countries that have less ability to access journals and data than more developed regions which, perversely, have less public need for the data.

Themes that came out of the day were the importance of standards in metadata to bridge across disciplines and the need for institutional repositories to hold the “long tail” data that most researchers produce en route to paper publication. It wasn’t until later that evening, when I found myself not in the pub but spending an hour watching Bryan Heidorn on YouTube presenting a version of his Curating the Dark Data in the Long Tail of Science paper, that I realised how the day had really got my remaining grey cells up and firing.

Day two consisted mainly of group work, where I began to understand the roles and problems facing many of the different data managers attending the Roadshow. The need for researchers to fully engage with the data management service providers became apparent as the group sessions developed. There are whole armies (well, maybe platoons) of professional data experts trying to herd the cat-like academic data producers into the pen of good practice. Having met the Roadshow participants and come to understand their roles and expertise, I found it hard to understand how there could be a problem with data management. It was not until I came back to Aberystwyth’s IGES department and saw individuals trying to finish off theses and papers as well as prepare for the new academic term that I really understood the problem. As was suggested on day one of the event, it is only when data management arrives at the top of an individual’s priority list that it actually gets done. Since then I have been thinking about trying to start a project using “Nudge” theory to move data management up the priority tree in an academic department.

Day three focused on the tools available to aid researchers in their data management. The DCC's data management planning tool shone out as a valuable asset for funding bids, which increasingly need to meet the demands and requests of the funding councils. For me it will provide the framework on which to hang my thoughts and conclusions about managing spatial data, as well as some of the structure for my work on the IGIBS project.

Now that I am back at Aberystwyth and slipping back into my routine, the measure of success for the Roadshow, at least from my perspective, will be both the new sections in my final report that I feel able to write and the changes in working practices that come about as a result of my attendance. Well, it’s too early to say, but as I have just moved metadata creation for some soon-to-be-used images and shapefiles to the top of my priority tree, maybe the digital curation my brain received from the DCC is beginning to show.

Steve Walsh

Interoperable Geospatial Data for Biosphere Study (IGIBS) Project
Institute of Geography and Earth Science
Aberystwyth University