Because good research needs good data

Negative Click, Positive Value Research Repository Systems

Chris Rusbridge | 02 July 2008

I promised to be more specific about what I would like to see in repositories that present more value for less work overall, by offering facilities that allow them to become part of the researcher’s workflow. I’m going to refer to this as “the Research Repository System (RRS)” for convenience.

At the top of this post is a mind map illustrating the RRS. A more complete mind map (in PDF form) is accessible here.

The main elements that I think the RRS should support are (not in any particular order):

Here’s a quick scenario to illustrate some of this. Sam works in a highly cross-disciplinary laboratory, supported by a Research Repository System. Some data comes from instruments in the lab, some from surveys that can be answered in both paper and web form, some from reading current and older publications. All project files are kept in the Persistent Storage system, after the disaster last year when both the PIs lost their laptops from a car overseas, and much precious but un-backed-up data were lost. The data are managed through the RRS Data Management element, and Sam has requested a checkpoint of data in the system because the group is near finalising an article, and they want to make sure that the data that support the article remain available, and are not over-written by later data.

Sam is the principal author, and has contributed a significant chunk of the article, along with a colleague from their partner group in Australia; colleagues from this partner group have the same access as members of Sam’s group. Everyone on the joint author list has access to the article and contributes small sections or changes; the author management and version control system does a pretty good job of ensuring that changes don’t conflict. The article is just about to be submitted to the publisher, after the RRS staff have negotiated the rights appropriately, and Sam is checking out a version to do final edits on the plane to a conference in Chile.

None of the data are public yet, but they are expecting the publisher to request anonymous access to the data for the reviewers they assign. Disclosure control will make selected check-pointed data public once the article is published.
Some of the data are primed to flow through to their designated Subject Repository at the same time.

One last synchronisation of her laptop with the Persistent Storage system, and Sam is off to get her taxi downstairs…

This blog post is really too big if I include everything, so I [have released] separate blog posts for each Research Repository System element, linking them all back to this post... and then come back here and link each element above to the corresponding detailed bits.

OK, I’m sure there’s more, although I’m not sure a Research Repository System of this kind can be built for general use. Want one? Nothing up my sleeve!