Because good research needs good data

IDCC13 Preview: Herbert Van de Sompel

Magdalena Getler | 17 December 2012

The 8th International Digital Curation Conference is just around the corner and we are anticipating great discussions about data science when our international audience gather in Amsterdam in January 2013.

In the sixth of our series of preview posts, Herbert Van de Sompel from Los Alamos National Laboratory gives us his insights into some of the current issues...

Your presentation will focus on the Web as infrastructure for scholarly research and communication. Are there any specific messages you would like people to take away from your talk?

I am not sure yet what exactly I will discuss in my presentation. After all, I still have a few weeks to think about that. But I anticipate that, generally, I will observe some significant changes that are occurring in scholarly research and communication, specifically the increasingly dynamic nature of scholarly output, the rapid adoption of social media as a communication channel, and the addition of a machine-actionable component to the scholarly record. There are some interesting observations and challenges related to these changes that I will likely touch upon. In doing so, I will come from the perspective that, for reasons of sustainability among others, the Web with all its strengths and constraints provides the basic infrastructure that needs to be leveraged to meet the challenges at hand. But please do not hold it against me if my presentation ends up taking another direction.

We address three areas in our call this year - Infrastructure, Intelligence and Innovation. What do you see as the most pressing challenges across these?

I am intrigued by how information across these three areas and across disciplines can be represented in an interoperable manner so as to allow seamless discovery, access, use, interpretation, reuse, republication, etc. Interoperable in the sense that the information can just flow around and tools can consume it without the need for any significant intervention. Information flows into a researcher’s tool to achieve a certain task just as power flows into her desk lamp to light up her day. That’s what we meant by “data as infrastructure” in the Riding the Wave report (http://www.grdi2020.eu/Pages/Unlock.aspx).

And in terms of opportunities, do you see potential in data science as a new discipline?

I see people wrangling data all over the place and increasingly striving towards common methodologies and tools to do so, even across the boundaries of disciplines. I guess that makes it start to look like a discipline?

The conference theme recognises that the term ‘data’ can be applied to all manner of content. Do you also apply such a broad definition or are you less convinced that all data are equal?

I like the description used in the Blue Ribbon report on Sustainable Digital Preservation and Access (http://brtf.sdsc.edu/): Research data are the primary inputs into research, as well as the first-order results of that research. I think that makes data of anything that can be analyzed and processed. I find it interesting how things that originally were not necessarily considered data can eventually become data. Journals and Web Archives are examples that readily come to mind.

You’ll undoubtedly have looked at the programme in preparation for IDCC. Which speakers / sessions are you most looking forward to?

I’m afraid I have to admit that I have only just taken a brief look. It does look like a really interesting program, overall. I didn’t get to hear Chris Greer at the recent CNI meeting, so I’m looking forward to hearing about the status of the Research Data Alliance effort aiming for global interoperability of data infrastructures. In that regard, the session on Cross Disciplinary Data should also be interesting. Generally, I am very interested in all the talks related to Data Management, specifically those that focus on how to get efforts off the ground within institutions. We’ve been talking quite a bit about it in the Research Library where I work, but somehow nothing substantial has emerged yet. We’re behind the curve in this area, and I’m looking for inspiration and guidance. I noticed there is a talk on cool persistent identifiers, and there is no doubt I will be in that one, given my long-standing obsession with identifiers. I also noticed there are some great colleagues on the program that I’ve collaborated with in the past. I’m looking forward to catching up with them personally, and to hearing from their presentations how they have kept themselves busy recently.


Herbert's keynote talk is on Day 2 of the conference, 16 January. The programme is available.

If you have not already done so, you can still book your place.

Please share your attendance at IDCC13 via Lanyrd