Many citations flow from data...
I've been at the UK e-Science All Hands Meeting in Edinburgh over the past few days (easy, since it's being held in the building in which I work!). There were lots of interesting presentations; far too many to go to, let alone blog about. But I can't resist mentioning one short presentation (PPT), from Prof Michael Wilson of STFC. His pitch was simple: publishing data is good for your career, especially now. And he has evidence to back up his claims!
Michael and colleagues put together a psycholinguistic database many years ago, with funding from the MRC. At first (1981) it was available via postal request, then (1988) via FTP from the Oxford Text Archive, and now via web access from STFC and from UWA (since 1994). Not much change over the years, little effort, free data, no promotion.
The database is now publicly available, e.g. at the link above. You may see that users are requested to cite the relevant paper:
Wilson, M.D. (1988) The MRC Psycholinguistic Database: Machine Readable Dictionary, Version 2. Behavioural Research Methods, Instruments and Computers, 20(1), 6-11.
The vital piece of evidence was a plot of citations over the years (data extracted from Thomson ISI; I hope they don't mind my re-using it).
You'll notice that citations flowed very slowly in the early days, picked up a little after FTP access became available (it takes a few years to get the research done and published), and then really started to climb after web access was provided. Now Michael and his colleagues are getting around 80 citations per year!
To ram home his point, Michael did some quick investigations, and found "At least 7 of the top 20 UK-based Authors of High-Impact Papers, 2003-07 ranked by citations ... published data sets".
There was some questioning of whether citing data through papers (rather than citing the dataset itself) was the right approach. Michael is clear where he stands on this question: paper citations get counted in the critical organs by which he and others will be measured, so citations should be of papers.
Summary of the pitch: data publishing is easy, it's cheap, and it will boost your career when the new Research Evaluation Framework comes into effect.