
"Towards Automating the Data Analytics Process" - Keynote lecture by Professor Chris Williams

Diana Sisu | 18 February 2017

Along with his research and teaching roles, Chris Williams is the University of Edinburgh’s liaison director for the Alan Turing Institute (ATI), headquartered at the British Library, London. The Alan Turing Institute is a relatively new collaborative initiative between the universities of Cambridge, Edinburgh, Oxford and Warwick, and UCL, supporting interdisciplinary and multidisciplinary research in various aspects of data science.

In his keynote address, Professor Williams has two main goals. First, he will introduce the work and scope of the ATI, particularly in those areas of interest to the IDCC community. Second, he will argue that the process of going from raw data to a publishable analysis is often dominated by the tasks of data understanding and data preparation. Data preparation involves handling all sorts of problems. Chris will focus on five of these:

  • Data parsing
  • Data understanding
  • Data cleaning
  • Data integration
  • Data restructuring

The talk will set out these five areas as the main challenges that must be met before a reliable analysis can begin. For each, Professor Williams will give examples of the issues involved and of some of the tools that he and his team have found useful. He will also argue for developing advanced tools that help build capacity to perform these tasks effectively and at scale.

Finally, be prepared to volunteer your open datasets! Chris and his team want to locate and work with open datasets where there is a specific analysis task that needs to be done. This is a chance for the IDCC community to work together in bringing clarity and attention to the work that must happen before analysis, and to turn messy data into clean, integrated and useful open datasets.

"Towards Automating the Data Analytics Process", the keynote talk by Professor Chris Williams, is at 9.30am on 22 February 2017 at the 12th International Digital Curation Conference, Quincentenary Centre, Edinburgh.

Image: CC-BY by Walter