Because good research needs good data


Pre-Conference Workshops 1 - 5

Monday 14 January 2013

Monday 14 January

Workshop 1 - Matterhorn 2

09.30am  - 1.00pm  (Free of charge)

Community Capability Model Framework for Data-Intensive Research – Applying the Model                          

Joint Workshop : UKOLN University of Bath, Microsoft Research Connections, European Commission

Intended audience: researchers, digital repository managers, staff from library, information and research organisations, data curators, data centre managers, data scientists, research funding organisations and research networks.

Overview: Microsoft Research Connections and UKOLN are working in partnership on a project to develop a Community Capability Model Framework for Data-Intensive Research, building upon the principles described in The Fourth Paradigm. 

This joint workshop has three main aims: 1) to present the Community Capability Model Framework (CCMF), which encompasses a series of capability factors including human (data skills and collaborative approaches), environmental (legal and socio-ethical issues) and technical aspects (adoption of common data infrastructures, standards, formats and ontologies), 2) to review the prototype CCMF Checklist Tool, and 3) to demonstrate its value and how it can be applied by funding bodies, policy makers and practitioners.

The ultimate aim is to provide a framework that is useful to various stakeholders, such as researchers, institutions and funders, in modelling a range of disciplinary and community behaviours associated with the adoption, usage, development and exploitation of e-infrastructure for data-intensive research.

This workshop will focus on presenting the CCMF White Paper and a demo of the Checklist Tool, together with perspectives from a leading European funder and a research practitioner case study.

Objectives: The workshop will include a mix of presentations, group work and discussion, and will allow time for networking and collaboration. This will involve:

1.    Presentation of the White Paper including an introduction to the CCMF concept, a description of the scope and functions of the model informed by a series of mini case studies, a review of existing models, information about the detailed model with supporting visualisations plus a taxonomy of terms.
2.    Reviewing the CCMF Checklist Tool from the perspective of different stakeholders.
3.    Understanding how the CCMF can be applied by the community.


Participants will have an understanding of the developing CCMF and the methodology by which it has been derived.

Participants will have reviewed the CCMF Checklist and shared ways to enhance the Tool and will consider practical applications of the Model.

The Project team will receive community input to inform further CCMF development.

Draft Programme

09.30   Coffee  

10.00   Welcome - Dr Liz Lyon, UKOLN, University of Bath & Kenji Takeda, Microsoft Research

10.05   Introducing the Community Capability Model Framework & White Paper - Alex Ball, UKOLN University of Bath

10.30   CCMF Checklist Tool: Demo and Review - Dr Manjula Patel, UKOLN University of Bath

11.00   Group work  

11.30   Coffee break   

11.45   Research funder perspective
Dr Carlos Morais-Pires, Head of Sector Research Data Infrastructures, European Commission

12.15   Domain practitioner perspective
Dr Ilya Zaslavsky, Director of the Spatial Information Systems Lab, University of California San Diego

12.45   Feedback Session, Discussion & Next Steps   Liz Lyon & Kenji Takeda      

Close & Lunch with Co-workshop delegates



Monday 14 January

Workshop 2 - Matterhorn 2

1pm - 5pm (Free of charge)

Europe and the Research Data Alliance: a meeting of minds

Joint Workshop : UKOLN University of Bath, Digital Curation Centre, European Commission

Intended audience: researchers, digital repository managers, staff from library, information and research organisations, data curators, data centre managers, data scientists, research funding organisations and research networks

Draft Programme

13.00    Lunch with CCMF Co-workshop delegates    

14.00    Welcome    Dr Liz Lyon, UKOLN-DCC, University of Bath

14.05   The Research Data Alliance and European Data Infrastructure  Dr Carlos Morais-Pires, Head of Sector Research Data Infrastructures Unit, European Commission

14.30    EUDAT perspective    Dr Peter Wittenburg, Max Planck Institute

15.00    UK eScience perspective    Juan Bicarregui, Rutherford Appleton Laboratory, STFC

15.30    Coffee break    

16.00    DCC perspective    Kevin Ashley, DCC

16.30    Discussion & Next Steps    Liz Lyon, UKOLN-DCC

17.00    Close



Monday 14 January

Workshop 3 - Matterhorn 3

10.30 - 16.00 hrs (Free of charge)

Data Management Planning: what's happened, what's happening and what's coming next?

Organisers: Martin Donnelly, Digital Curation Centre and Carly Strasser, California Digital Library

This workshop will provide overview, analysis and opportunity for discussion of recent developments in data-related policies in the UK, US, Europe and Australia. Speakers and participants will include representatives of national data support services, DMP tool providers, and major research funders. The workshop will conclude with a discussion of expected shifts in data planning policy and practice over the coming year.

Who should attend?

Anyone involved in the research data management process, notably data professionals (researchers, data managers, support officers), research funders, government agencies, publishers, and librarians and repository managers. The workshop will be relevant to delegates from all parts of the world.

Speakers and slides:

  • Martin Donnelly, Digital Curation Centre, University of Edinburgh - overview
  • Veerle Van den Eynden, Economic and Social Data Service - UK context
  • Rachel Frick, Digital Library Federation - US context
  • David Groenewegen, Australian National Data Service - Australian context
  • Carlos Morais Pires, European Commission - European context
  • Kerry Miller, Digital Curation Centre, University of Edinburgh - DMPonline
  • Sarah Shreeves, University of Illinois at Urbana-Champaign - DMPTool
  • Sarah Jones, Digital Curation Centre, University of Glasgow - institutional policies and DMP work



Monday 14 January

Workshop 4 - Matterhorn 1

9am - 5pm  (Invitation only)

Aligning Digital Preservation across nations

Organiser: Cal Lee, University of North Carolina

This workshop is designed to advance discussion and collective action for digital preservation across national boundaries, including plans and priorities for an Aligning National Approaches to Digital Preservation (ANADP) 2 conference to be held in autumn 2013 and strategies for establishing a sustainable social infrastructure for continuing events and activities.

NB. This workshop is by invitation only



Monday 14 January

Workshop 5 - Monte Rosa

Programme PDF

9.30 - 12.30 (Free of charge)

Designing Data Management Training Resources: Tools for the provision of interactive research data management workshops [view slides] [Open Exeter]

Organisers: Catherine Pink & Jez Cope - University of Bath. Gareth Cole, Hannah Lloyd-Jones & Jill Evans - University of Exeter

In order to effectively manage research data, both researchers and professional support staff need to develop and enhance new skills. Provision of training to facilitate this is therefore an essential component of any institutional data management infrastructure.

The JISC-funded ‘Open Exeter’ and ‘Research360’ projects have developed and delivered a range of interactive training workshops aimed at enabling both postgraduate and academic researchers to understand relevant challenges, risks and requirements and equipping them with the necessary skills to facilitate best practice in research data management.

In this half-day workshop, a range of exercises that have been used successfully in data management training courses will be presented. Workshop delegates will then have an opportunity to complete each exercise. Participants will experience an interactive research data management training event from the perspective of researchers and support staff, and will come away with tools and exemplar exercises that they can incorporate into their own training provision.


Post-Conference Workshops 6 - 9

Thursday 17 January 2013


Thursday 17 January

Workshop 6 - Monte Rosa

9.45am - 4.30pm (Free of charge *)

Data publishing, peer review and repository accreditation: everyone a winner?

Organisers: Angus Whyte (DCC), Jonathan Tedds (University of Leicester), Sarah Callaghan (BADC) and Fiona Murphy (Wiley-Blackwell)

Researchers, publishers and academic institutions are innovating in the search for faster and more visible pathways through the research record. Data journals and other data publishing models are opening up the means of knowledge production, linking data and other research products to better enable scrutiny and realise their potential to be reused for new studies and novel applications.

The PREPARDE project is funded by JISC to address key issues arising from data publication. It aims to produce guidelines applicable to a wide range of scientific disciplines and data publication models. The project initially focuses on earth science disciplines, and the Geoscience Data Journal, a partnership between the Royal Meteorological Society and Wiley-Blackwell.

Intended audience: the workshop will interest anyone involved in publishing and reviewing data


This workshop will highlight the challenges and opportunities data publishing raises for the various stakeholders in data management and curation. It aims to identify the mutual benefits to be gained from collaboration between data centres/repositories, publishers, learned societies and institutional repositories, in a fast-developing arena that includes other actors such as data repository directories. Critical issues the workshop aims to cover include:

  • How can publishers and repositories collaborate to exploit shared interests in how Trusted Data Repository standards may support the efficient exchange and effective peer review of data, in pursuit of their common need to engage with the research community?
  • What are the respective responsibilities of stakeholders to ensure published data are reviewed on minimum criteria for data quality?

Participating in the workshop will help you to: 

  1. Understand benefits and risks to stakeholders in scholarly publishing likely to arise from different data publishing models.
  2. Shape PREPARDE project guidelines on:- a) Dataset review criteria and the associated workflows b) Roles of trusted repository accreditation and related data standards in supporting the peer review of data.

Draft programme

9.30 – 9.45 Arrival

9.45 – 9.55 Aims and Introduction - Jonathan Tedds, University of Leicester/ PREPARDE

9.55 – 10.20 Sarah Callaghan, BADC/ PREPARDE - Data publication models: benefits, risks and peer review

10.20 – 10.50 Peter Doorn, DANS - Trusted Repository certification and its potential to improve data quality

10.50 – 11.20 Michael Diepenbroek, PANGAEA / MARUM, University of Bremen - Research data enters scholarly communication - towards an infrastructure for data publication in the empirical sciences

11.20 – 11.40 tea/coffee

11.40 – 12.30 Publishers & Learned Society perspectives

Eefke Smit, International Association of STM Publishers - Integration of data and publications

Richard Kidd, Royal Society of Chemistry - Chemists...and data

12.30 – 13.30 Lunch

13.30 – 14.20 Data Centre and Institution perspectives

Kerstin Lehnert, IEDA - Data Publication at IEDA - Making Data Fit for Reuse

Veerle Van den Eynden, UKDA and Research Data @ Essex project - Data review at the UK Data Archive

14.20 – 15.00 Discussion groups - Who should review on what criteria?

15.00 – 15.20 tea/coffee

15.20 – 16.00 Discussion groups - Collaboration: benefits, risks and workflows?

16.00 – 16.30 Report back, Discussion & Next Steps

Sarah Callaghan's storify of workshop tweets is here


* Supported by Wiley-Blackwell



Thursday 17 January

Workshop 7

9am - 5pm  - (Free of charge)

Open Digital Humanities: use and reuse of digital data in the Humanities

Organiser: Jean-Philippe Magué - ENS de Lyon, France

The use of digital data in the Humanities has become common practice. The multiplication of these data collections and the investment in building them raise questions about their accessibility and, more generally, their life cycle beyond the context in which they were created. How can these data support the statement of research results? How can they be reused for other research? How can their preservation be ensured? The answers to these questions simultaneously involve scientific considerations (how the data are constructed), technical considerations (how data formats and tools promote interoperability) and legal considerations (how property rights and licences can facilitate or hinder reuse). These are the questions this workshop proposes to explore.

This workshop is organized by the "Building and Developing Collections of Digital Data for Research" working group of NeDiMAH (Network for Digital Methods in the Arts and Humanities). NeDiMAH is funded by the European Science Foundation (ESF) to examine the practice of, and evidence for, advanced ICT methods in the arts and humanities across Europe by providing a locus of networking and interdisciplinary exchange of expertise among the trans-European community of digital arts and humanities researchers, as well as those engaged with creating and curating scholarly and cultural heritage digital collections.


9:15 – 9:30 : Registration

9:30 – 9:45 : Introduction

9:45 – 10:30 : William Kilbride, Digital Preservation Coalition - "What we’ve learned about Digital Preservation and Digital Humanities: emerging practice (good and bad)"

10:30 – 11:15 : Joris Pekel, Open Knowledge Foundation - "The Digital Commons and the Republic of Letters"

11:15 – 12:00 : Nuno Freire, Valentine Charles & Antoine Isaac, Europeana / The European Library - "Europeana and Research: Enabling the Use of Cultural Heritage Objects for Digital Humanities"

12:00 – 13:30 : Lunch

13:30 – 14:15 : Laurents Sesink, Data Archiving and Networked Services

14:15 – 15:00 : Johan Oomen, Nederlands Instituut voor Beeld en Geluid - "Towards more open, smart and connected audiovisual archives"

15:00 – 15:45 : Dominic Forest, University of Montreal - "Text mining, topic modelling and information discovery in cyberinfrastructures"

15:45 – 16:30 : Andreea Popa, University of Architecture and Urban Planning Ion Mincu - "Use of digital data in Landscape Planning - Trans-disciplinary approach"




Thursday 17 January

Workshop 8 - Winterthur

1.30 - 5pm (Free of charge)

Sustainability and the APARSEN Network of Excellence

Organisers: René van Horik, DANS, The Netherlands  and John Lindstrom, Lulea University, Sweden


1.30 - 1.45 Introduction

The introduction contains background information on the APARSEN network and introduces the "sustainability" topics that are covered by the project. These topics are: preservation services, storage solutions, cost issues related to digital preservation and business cases.

1.45 - 2.25 Preservation services (Prepared by: Science and Technology Facilities Council – STFC, Simon Lambert)

Sustainability may be built on services that transcend an individual organisation. Both the supply and demand will be considered. What services are needed for a sustainable infrastructure? Conversely, what service offerings already exist or are under development? How may they be delivered? Is there a common view on what services can be shared?

2.25 - 3.05 Storage solutions (Prepared by: National Library of the Netherlands - KB, Jeffrey van der Hoeven)

Storage is of course a central component in any preservation solution and requires special functionalities in order to adequately address the need of long-term preservation. APARSEN has surveyed the storage solutions currently in use among its partners, with emphasis on any exploitation of cloud based storage solutions. Business models and costs are the two main aspects considered.

3.05 - 3.25 Tea/Coffee

3.25 - 4.05 Cost issues related to digital preservation (Prepared by: British Library – BL, Kirnn Kaur)

The first part of this presentation deals with why and how cost models contribute to sustainability. This is followed by an overview of a number of existing cost models for the preservation of digital information. Finally an analysis will be given of the different cost models with respect to their contribution to the sustainability of digital archives.

4.05 - 4.45 Business cases (Prepared by: University of Patras, Giannis Tsakonas)

Based on recommendations of the Blue Ribbon Task Force (BRTF) on economically sustainable digital preservation, work is ongoing to produce:- a view of the current landscape in digital preservation activities and preparedness across Europe, and a roadmap that gives pointers and methodological guidelines on how to design, implement and operate digital preservation activities and services in the long run and under sustainable conditions.

The intention is that the roadmap will provide an efficient method for defining and reporting the interaction of memory institutions or repositories with other key stakeholders (policy makers, funders, private industry). The issues considered include preparedness for preservation management, digital curation of commercial content, Open Access content and Open Research Data, what infrastructures are available, what skills are needed, what funding is needed, etc.

4.45 - 5.00 Conclusion



Thursday 17 January

Workshop 9 - Winterthur

8:30-12:15 (Free of charge)

The Price of Keeping Knowledge: Financial Streams for Digital Preservation

Organiser: Knowledge Exchange, Hans Pfeiffenberger (AWI)


Over the next few years, a reasonable mapping of the structures of costs, prices and funding components for digital data management will have to be developed before full-scale infrastructure for long-term preservation and access can be built and operated sustainably. A number of projects and organizations have developed cost models in order to assess costs for individual services in the data management cycle. These models vary with regard to their coverage of the whole cycle, their methodological approach, and their level of detail. Furthermore, it has been difficult to compare actual costs, not only because of reluctance to share financial data but also because of the difficulty of calculating costs for individual components and services. In contrast to cost modeling activities, the pricing of services must be simple and transparent. Calculating, and thus knowing, price structures would not only help identify the level of detail required for cost modeling for individual institutions, but also help develop a "public" market for services and clarify the division of tasks and the modeling of funding and revenue streams for data preservation by public institutions. This workshop will build on the results of the workshop "The Costs and Benefits of Keeping Knowledge", which took place on 11 June 2012 in Copenhagen.

Aims of the workshop:

  • Identifying ways for data repositories to abstract from their complicated cost structures and arrive at one transparent pricing structure which can be aligned with available and plausible funding schemes. Those repositories will probably need a stable institutional funding stream for data management and preservation. Are there any estimates for this, in absolute terms or as a percentage of overall cost? Part of the revenue will probably have to come through data management fees upon ingest. How could that be priced? Per dataset, per GB or as a percentage of research cost? Will it be necessary to charge for access, even though this contradicts the open science paradigm?
  • What are the price components for individual services, and which prices are currently being paid, e.g. to commercial providers? What are the description and conditions of the service(s) delivered and guaranteed?
  • What types of risks are inherent in these pricing schemes?
  • How can services and prices be defined in an all-inclusive and simple manner, so as to enable researchers to apply for a specific amount when asking for funding of data-intensive projects?


8:30    Registration & Coffee

9:00    Welcome and introduction to the workshop: Hans Pfeiffenberger, AWI/KE

9:15    Experiences, expectations, service offerings - Perspectives shared by infrastructure/service providers, describing their services and price- or revenue stream structure.

Moderator: Jens Klump, GFZ Potsdam

9:15-9:20 Isaac Sanya, ENSURE

9:20-9:25 Jared Lyle, ICPSR

9:25-9:30 Martin Ostasz, EUDAT

9:30-9:35 Jens Ludwig, Göttingen University Library

9:35-9:40 Catherine Hardman, Archaeology Data Service

9:40-9:45 Sebastian Drude, Max Planck Institute

9:45-10:00 Discussion

10:00-10:30 Coffee break

10:30    Experiences, expectations, service offerings (continued)

Moderator: Jens Klump, GFZ Potsdam

10:30-10:35 Jens Nieschulze, Göttingen University

10:35-10:40 Neil Beagrie

10:40-11:00 Discussion

11:00    What to pay and who pays what? Are the suggestions for pricing schemes from the first session plausible, fair and sustainable?  Which elements are expected by funders? To what extent can/must price models depend on discipline, national peculiarities of funding, institutional structures?

Moderator: Jens Klump, GFZ Potsdam

Ron Dekker, NWO

Stefan Winkler-Nees, DFG

Open discussion

12:20    Wrapping up - Joy Davidson, DCC

12:30    Lunch