Because good research needs good data

Call for Papers

Data quality and data limitations: working towards equality through data curation 

Data is increasingly transforming our personal and professional lives, and it is important that we think critically about how we manage and share it. The benefits of data-driven research and innovation have been unevenly realized, and the growth in data science has often exposed and amplified inequalities at individual, community and regional levels.

For example, bias in research design, in data collection, or in data quality assessment may result in inequalities. There are countless examples where data bias leads to products and medical procedures being tailored for male physiology, putting women and other underrepresented groups at risk. Data quality needs to be translated into formal criteria that can be applied to measure the coverage, representativeness and inclusiveness of data.

Similarly, minority and Indigenous voices have long been silenced in the archives, for example in historical recordings of events, and these communities have been given limited access to their own data. The CARE Principles for Indigenous Data Governance were written in response to this specific inequality, and in recognition of the tension that the increasing emphasis on greater data sharing creates as Indigenous Peoples assert, to a greater extent, control over the use of their data to ensure collective benefit.

Data sharing can help to improve data quality. Making data collection, curation and analysis more transparent also helps to highlight the often hidden labour that is required to ensure data quality and re-use over time. An example is the European COVID-19 Data Platform, launched by the European Commission in April 2020. It was a response to the urgent need to share, re-use, process and access research data and metadata on the SARS-CoV-2 virus and the related COVID-19 disease.

The key question we wish to ask at IDCC21 is:  

How can we ensure data collection and curation work for society at large?

Papers are invited that address one or more themes within the broad scope of data quality and data limitations, including but not limited to:

  • Documenting and avoiding biases in datasets (e.g. clinical trials, facial recognition training datasets, machine learning).  
  • Anticipating use, avoiding misuse: communicating the applicability or otherwise of data to research questions. 
  • Non-custodial archiving, "documenting the now", and strengthening the archives of marginalized communities.  
  • Curating propaganda, misinformation, disinformation and falsified data. How do we protect the integrity of research data? 
  • Research data and curation in the contexts of geopolitics and "data nationalism".  
  • Data documentation for FAIRness and data quality.  
  • Applying data ethics in curation and assessment of data quality. 
  • Developing digital skills programmes to promote diversity and inclusiveness in the data science and curation professions. 
  • Indigenous data sovereignty, community archives and application of the CARE Principles.
  • Addressing inequality: broadening the benefits of data-driven science to more, and more diverse, stakeholders.