From Data Curation to Software Curation: Enhancing Reproducibility and Sustainability of Data and Software
04 February 2019
This workshop will build understanding of approaches to software curation, including its importance in enabling reproducible research. While a strong focus on data curation already exists, software curation is an emerging practice of equal importance. The workshop will present case studies to stimulate group discussion on how those engaged in facilitating reproducible research can support or actively engage with software curation.
The workshop will enable those concerned with reproducible research to understand issues in software curation. Software is fundamental to research, and plays a key role in creating and facilitating access to trusted research outputs. “Software curation encompasses the active practices related to the creation, acquisition, appraisal and selection, description, transformation, preservation, storage, and dissemination/access/reuse of software over short- and long- periods of time.” (Chassanoff, Building a Model for Software Curation).
This workshop aims to help participants understand the challenges in software curation and the support and resources researchers need to make software reproducible and reusable. It also invites participants to consider how those involved in digital curation can engage with researchers and research software engineers in support of software curation, including aspects related to provenance.
Whilst discussions around reproducibility and open science have often focussed on research data, research software is also critical to research. Nangia and Katz note that “a survey of academic faculty and staff at British universities found that 92% use research software, with 69% saying that their research would not be practical without it.” Similarly, their survey of Nature papers reveals that 32 of the 40 papers examined mention software, totalling 211 mentions of distinct pieces of software (Understanding Software in Research). Other studies clearly show the need for increased understanding of how to curate software. In a 2017 PresQT research study, over 88% of respondents, polled from among US National Science Foundation-funded researchers and others likely to participate in data-intensive research, reported that they create and/or use software, code, or scripts in their research. Of the 1700+ participants, 76% acknowledged that their software, code or scripts are needed for third parties to reproduce their results. Yet fewer than 20% of respondents reported that they felt “more than moderately familiar with tools used to share, publish, cite and preserve data or software” (Gesing, Johnson, Meyers & Wang, PresQT Needs Assessment).
The workshop will use a range of speakers and interactive small and large group activities to enable participants to explore and understand:
- the importance of software curation, alongside data curation, to achieving gains in reproducibility
- software curation best practice through case studies and policies
- software citation and aspects of software curation related to provenance
- challenges in software curation
- how those engaged with data curation can support or actively engage with these issues to cultivate and build capacity for collaborative efforts to collect, care for, and preserve software
- how tools, platforms and projects like ReproZip, CodeOcean, Gigantum, and WholeTale can be used in the context of software curation and reproducible research, through demos and discussions.
Organisers: Natalie Meyers, University of Notre Dame; Sophie Hou, National Center for Atmospheric Research; Jens Klump, CSIRO; Natasha Simons, Australian Research Data Commons; Matthias Liffers, Australian Research Data Commons; Gerry Ryder, Australian Research Data Commons
Agenda
Time | Session |
12:30 - 13:30 | Lunch & networking |
13:30 - 13:35 | Welcome, introductions, housekeeping |
13:35 - 13:50 | Icebreaker activity: who’s who in the room |
13:50 - 14:10 | From data curation to software curation: challenges and opportunities |
14:10 - 14:30 | Software curation best practice: tools and platforms for source code |
14:30 - 15:00 | Software curation best practice: tools and platforms for curating software in the context of publication and re-use. Possible case studies: |
15:00 - 15:30 | Afternoon tea break |
15:30 - 16:00 | Software curation best practice: case studies & policy |
16:00 - 16:30 | Software curation best practice: support networks and research collaborations. Possible case studies: |
16:30 - 16:50 | Table discussions |
16:50 - 17:00 | Wrap up and evaluation |
17:00 | Thank you and close |
Costs and Registration
£52
This is part of the excellent programme of workshops at the 14th International Digital Curation Conference.