New Consultancy Project - The Role of Data in AI

Thordis Sveinsdottir | 29 October 2020

The Role of Data in AI is a new project sponsored by the Data Governance Working Group of the newly founded Global Partnership for AI (GPAI). The Partnership will bring together experts from industry, government, civil society and academia to advance cutting-edge research and pilot projects on AI priorities for societal good. Openness, transparency, diversity and collaboration are at the heart of this project and all of the group's work into the future.

A consultancy team led by the Digital Curation Centre, with partners Trilateral Research and the School of Informatics at the University of Edinburgh, will undertake focused research aimed at situating the importance of data to AI development. The research will identify both areas where more data would be useful (such as specific, open datasets that could be worthy of national support or international collaboration) and areas where harms arise due to the collection of, or access to, data.

Here are 5 reasons why The Role of Data was chosen as one of the Data Governance Working Group’s first projects:

How important is data to AI?

Data is the foundational element of AI systems and the source of their power.

AI systems need large quantities of high-quality data to perform well. The more data they can crunch through, the better they are at making decisions.

Data needs to be FAIR (Findable, Accessible, Interoperable, and Reusable). High-quality data is also clean, free of error, relevant, and unbiased.

Datasets used in AI development also need to meet legal and ethical considerations. This is especially important when using extremely personal information like healthcare data.

Why the focus on the ‘role of data’ in AI?

You can’t develop an AI system in isolation, feed it some real-world data and expect it to make the right decision. All AI algorithms require ‘training data’ on which to learn.

The project aims to illustrate the importance of data for AI development. This will help us identify and highlight areas where more data would be useful or where access could be made more open.

Why does data governance matter in the context of AI?

As citizens, we want to know that governments, industry and academia are using data responsibly.

Good data governance will guide how data is collected, created, used and shared in responsible and trustworthy ways that are consistent with human rights, inclusion, diversity, innovation, economic growth, and societal benefit.

Data governance also takes into consideration laws on protection of personal data and intellectual property rights.

Is there a need to share data internationally?

Sharing data can help us respond effectively to our most pressing social, economic and environmental challenges.

One aim of the project is therefore to identify datasets that could be considered for international collaboration.

How can we make sure data collected about individuals isn’t used unfairly or in a discriminatory way?

We don’t want to see existing biases or prejudices in data reproduced or accelerated through the implementation of AI systems.

The project will look at the benefits and harms that may arise from creating and having access to different types of data.

It will also identify challenges to responsible AI development that may arise from having or not having access to particular data.

When will I hear more about the project?

The project will be delivered by GPAI’s first Plenary in December 2020.