IDCC13 Preview: Hans Pfeiffenberger

Magdalena Getler | 08 January 2013

The 8th International Digital Curation Conference is just around the corner and we are anticipating great discussions about data science when our international audience gather in Amsterdam in January 2013.

In the eighth of our series of preview posts, Hans Pfeiffenberger from the Alfred Wegener Institute for Polar and Marine Research gives us his insights into some of the current issues...

Your presentation will focus on preservation of data in the marine sciences. Are there any specific messages you would like people to take away from your talk?

There is much more to the sea than meets the eye - literally! We have a wild mixture of scientific topics and data-types to deal with. And we must find ways not to break the chain of digital evidence - from the sensor onboard a ship or undersea laboratory to the repository publishing quality assured primary data or data products. This will give this data so much more impact ... with less effort.

We address three areas in our call this year - Infrastructure, Intelligence and Innovation. What do you see as the most pressing challenges across these?

The challenge to scale up - before each and every bit of development is finished! And to make sure research is made more effective and efficient in the process. It shows nicely in the programme: Some usage of clouds on the horizon? Discussing cost. Heavy emphasis on how to roll out training. And talking about reality - which always interferes with good ideas.

And in terms of opportunities, do you see potential in data science as a new discipline?

I don't know if it can be a discipline - I am in favour of defining the data scientist as a disciplinary scientist (say: biologist) who has specialized in data management and analysis techniques. About one in ten to twenty scientists in a group/institute. So the question is: Who trains these people? Or: Who organizes their training? When in their career?

The conference theme recognises that the term ‘data’ can be applied to all manner of content. Do you also apply such a broad definition or are you less convinced that all data are equal?

At a low (bit) level they are certainly equal. But obviously, text, (moving) images and numeric data (say: water temperatures in the Antarctic) are very different beasts - and to be described by very different metadata (What do they share beyond the mandatory author and title fields?). And there is the area of rights (management) - also very different. I wouldn't want to be encumbered by a system for elaborate movies management when I simply want to store and publish facts (which are not copyrightable ...)

You’ll undoubtedly have looked at the programme in preparation for IDCC. Which speakers / sessions are you most looking forward to?

The RDA intrigues me very much. I am also looking forward to discussing "Processes and Procedures for Data Publication" with Sarah Callaghan. And I simply can't decide between "confidentiality" and "education". Which bomb will Herbert Van de Sompel set off?

Hans Pfeiffenberger's presentation is on Day 1 of the conference, 15 January. The programme is available.

If you have not already done so, you can still book your place.

Please share your attendance at IDCC13 via Lanyrd