Negative click repositories

Chris Rusbridge | 10 June 2008

I wanted to write a bit more about this idea of a negative click repository (negative cost was a bad name, as there is a real positive $ and £ cost to the repository, rather than the depositor). First some ancient history...

When I joined the University of Glasgow in 2000, the Archives and Business Records Centre, with other collaborators within the University, was near the end of a short project on Effective Records Management (ERM, http://www.gla.ac.uk/infostrat/ERM/). During the course of that project, they surveyed committee clerks (who create many authoritative institutional records) on how much effort they were willing to put in, how many clicks they were willing to invest, to create records that would be easily maintainable in the digital era. The answer was: zero, none, nada! Rather than give up at this point, the team went on to create CDocS (http://www.gla.ac.uk/infostrat/ERM/Docs/ERM-Appendix2.pdf), an instrumented addition to MS Word that allowed the committee clerks to create their documents in university standard forms, with agreed metadata, with the documents and metadata automatically converted into XML for preservation and to HTML for display and sharing. ICE (see below) might be a contemporary system of a related kind, in a slightly different area. Thanks to James Currall for updating me on ERM and CDocS.

In April 2007, Peter Murray-Rust had an epiphany thinking about repositories on the road to Colorado, realising that SourceForge was a shared repository that he had been using for years, and speculating that it might be used for writing an article. The tool for managing versions and sharing in SourceForge is SVN… Peter wrote about the complex workflow in writing a collaborative article, but then wrote:

“BUT using SVN it’s trivial - assuming there is a repository. So we do not speak of an Institutional Repository, but an authoring support environment (ASE or any other meaningless acronym. ) A starts a project in institutional SVN. B joins, so do C, D, E, etc. They all edit the m/s. Everyone sees the latest version. The version sent to the publisher is annotated as such (this is trivial). All subsequent stuff is tracked automatically. When the paper is published, the institution simply liberates the authorised version - the authors don’t even need to be involved. The attractive point of this - over simple deposition - is that the repository supports the whole authoring process.”
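To make that workflow concrete, here is a toy sketch in Python. It is not SVN and not anything Peter proposed in code; the class and method names (Manuscript, commit, tag, release) are invented purely for illustration. It only models the three ideas in the quote: everyone commits to one shared history, the version sent to the publisher is annotated with a tag, and the institution can later release that tagged version without involving the authors.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    number: int
    author: str
    text: str


@dataclass
class Manuscript:
    """Hypothetical stand-in for one paper held in a shared, versioned store."""
    revisions: list = field(default_factory=list)
    tags: dict = field(default_factory=dict)  # tag name -> revision number

    def commit(self, author, text):
        # Every edit by any collaborator becomes a new numbered revision.
        rev = Revision(len(self.revisions) + 1, author, text)
        self.revisions.append(rev)
        return rev.number

    def latest(self):
        # "Everyone sees the latest version."
        return self.revisions[-1]

    def tag(self, name, number=None):
        # Annotate a revision, by default the latest one.
        self.tags[name] = number if number is not None else self.latest().number

    def release(self, name):
        # The institution retrieves the tagged revision; no author action needed.
        return self.revisions[self.tags[name] - 1]


paper = Manuscript()
paper.commit("A", "Draft outline")
paper.commit("B", "Draft outline plus methods")
paper.tag("sent-to-publisher")
paper.commit("C", "Post-submission notes")
released = paper.release("sent-to-publisher")
```

In this sketch, later commits by C do not disturb what was tagged: released is still B's revision, which is the property that would let a repository "liberate" the authorised version at no extra cost in clicks to the authors.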

Many of those who left comments disagreed that the technology would work directly as suggested, for various reasons. Google Docs was mentioned as an alternative (still flawed). Peter Sefton mentioned ICE (and Murray-Rust subsequently visited USQ to work briefly with the ICE team).

You may also remember Caveat Lector’s series of personae representing stakeholders in the repository game at fictional Achaea University, which I reported on before. Ulysses Acqua was her repository manager, and here’s a quote from his attempts to explain the advantages of his repository to faculty; they ask:

“Can it produce CVs or departmental-activity reports automatically? No. Can it be tweaked so that the Basketology collection looks like the Basketology website? No. (The software can do that, in fact, but Ulysses can’t.) Can it talk to campus IT’s file-storage-cum-website servers? No. Can it harvest faculty articles from disciplinary repositories? No. Can it deliver records straight to Achaea Library’s catalogue? No. Can it have access controls per item, such that items are shared with specific people only, with the list controlled by the depositor? No. Can it embargo items, for a certain length of time or indefinitely? No. Can it read a citation, check rights on the journal, and go fetch the paper if rights are cleared? Dream on. Can it restrict items for campus-only access by IP address? No. Does it talk to RefWorks and Zotero and similar bibliographic managers? No. Does it do version control? No.”

The problem here is that the repository adds work; it doesn’t take it away (there are other examples of this in some of the other personae). And overloaded people don’t accept extra work. They may promise to, but they (mostly) don’t do it.

Finally, I posted earlier on Nico Adams’s comments on repositories for scientists. He got stuck, he said: "I had to explain what a repository is from the point of view of functionality - when talking to non-specialist managers, it is the only way one can sell and explain these things…they do not care about the technological perspective…the stuff that’s under the hood. I found it impossible to explain what the value proposal of a repository is and how it differentiates itself in terms of its functionality from, say, the document management systems in their respective companies." It’s worth repeating a couple of extracts from his conclusions:

"Now today it struck me: a repository should be a place where information is (a) collected, preserved and disseminated (b) semantically enriched, (c) on the basis of the semantic enrichment put into a relationship with other repository content to enable knowledge discovery, collaboration and, ultimately, the creation of knowledge spaces."

And further:

"Now all of the technologies for this are, in principle, in place: we have DSpace, for example, for storage and dissemination, natural language processing systems such as OSCAR3 or parts of speech taggers for entity recognition, RDF to hold the data, OWL and SWRL for reasoning. And, although the example here, was chemistry specific, the same thing should be doable for any other discipline. … Yes it needs to be done in a subject specific manner. Yes, every department should have an embedded informatician to take care of the data structures that are specific for a particular discipline. But the important thing is to just make the damn data do work!"

So we need to develop repositories that make the data work to take human work away: negative click repositories. Or maybe not… (sorry about this). I’m just a bit concerned that we might get a set of monumental edifices built on DSpace or ePrints.org foundations, resulting in “take it or leave it” decisions. Institutional information environments are highly tailored (not always carefully) and at the department level even more eclectic, and things have to fit together. Maybe, as Nico was suggesting, what we need is an array of tools, connected together by technologies like Atom/RSS and/or OAI-ORE, that can be configured so as to link the components into an information management system that works to reduce the publishing effort on campus, and captures the intellectual product on the way.

People appear to have lots of ideas on what negative click repositories might do. We could have tools for supporting scientists in their information sharing (web sites, bibliographies and CVs). Tools for shared data management. Tools for shared writing. Tools even for the library to support faculty in dealing with publishers. And of course tools to help management count their beans. I’d like to begin to collect and order these and other suggestions if possible, so please leave comments or tag your blogs with “negative click”…
