The next generation of data scientists

14 December, 2017 | in DCC News
By: Sarah Jones

The last two weeks have seen the first CODATA/RDA Research Data Science school in South America. We started the initiative in 2015, and after developing a curriculum to offer broad-based, introductory data skills to Early Career Researchers with a specific focus on those from Lower and Middle Income Countries, we ran our inaugural School at the International Centre for Theoretical Physics (ICTP) in Trieste, Italy in August 2016. 

From the start the Schools were a huge success, receiving hundreds of applications from researchers in a diverse range of countries and disciplines. We’ve continually iterated on the curriculum based on student feedback and developments in the field. The event in São Paulo was an important first step to branch out to regional schools and develop local hubs of expertise. We hope the School in South America will become an annual event and will shortly be inviting applications to host one in Africa in Autumn 2018 as we’ve had many requests from there as well. 

For my own part, the School has become one of the regular events I look forward to the most. The students are so enthused and keen to take the learning back to their institutions and colleagues that you really feel you are making an impact. Kevin and I have amended the Research Data Management curriculum over time, adding elements on FAIR data and new RDM services. We’re also in discussion with Gail Clements who runs Author Carpentry and Louise Bezuidenhout who teaches on open science and ethics, about how we can combine these three topics into one joint module for Trieste 2018.

In São Paulo we were joined by Steve Diggs from Scripps who put together an excellent data reuse lab. Students had to form mixed-skill teams and then review research papers for links to the underlying data. Donning their investigative deerstalkers, they then obtained the data and reproduced results. It was fantastic to see the determination and ingenuity displayed across the teams. They brought such creativity and inventiveness to the various pitfalls encountered, and the exercise drove home the message of why it’s so important to make your data FAIR.

It may surprise you all to learn that these Schools are an entirely volunteer effort. Hugh, Rob, Ciira and I give up our time to plan, coordinate and teach on the Schools, and this would not happen without the backing of our institutions. The host organisations (ICTP in Trieste and UNESP in São Paulo) invest a great deal of time and finances to make the Schools run. They provide the venue, accommodation and catering, cover student travel and administer all the visas, and provide the most excellent local support when we’re in town running the Schools. On top of that we receive a lot of small donations from too many organisations to mention. This covers the speaker travel and supports the helpers. 

This year we particularly want to thank Springer Nature and Wellcome Trust, whose support enabled the helpers participation and allowed us to run a weekend session to let this new cohort of students know how they can get involved. Oscar, Sara, Marcella and Silvia (pictured below) have all participated in previous Schools and are now bringing back their expertise to help others. At the weekend session, Sara explained to a packed room how different it is being a helper and how much it enriches your learning. Students approach the tasks differently so you’re troubleshooting a really wide range of problems and learning so much more about the technology by doing so.

The next two priorities are to increase the regions in which the Schools take place, and to move them on to a more sustainable footing which is not so reliant on volunteer effort and sponsorship. In 2018 we hope to run 3 Schools. One will take place in Trieste on 6-17 August 2018, and we anticipate others in Africa in Sept/Oct and Brazil in December. As part of the CODATA Task Force we’ll be reaching out to funders to seek support for a central office, and exploring business models to sustain the Schools. One idea is to run Schools in the USA and Europe with a delegate fee that is reinvested in supporting the Schools for LMIC. We hope to trial this in 2019.

With this being the Season of Goodwill and people looking for opportunities to give back to community, I would encourage you to think about what you could do for the Schools. Are you in a position to help us to coordinate them, to teach, to host events, to sponsor or help us develop a robust business model? There’s a huge demand for the training and we need lots of different inputs to make it scale.

CODATA Working Group Co-Chairs

  • Sarah Jones, Digital Curation Centre, Scotland
  • Ciira Maina, Dedan Kimathi University of Technology, Kenya
  • Rob Quick, Indiana university, USA
  • Hugh Shanahan, Royal Holloway University of London, England