SPRUCE Mashup Glasgow

Patrick McCann | 25 April 2012

The SPRUCE Project held a Mashup event in Glasgow from 16th to 18th April. The event was designed to bring together developers and digital preservation practitioners so that the former could find solutions to some of the issues being encountered by the latter.

The practitioners brought a wide range of issues with them. I was in attendance as a developer, and found the issue described by Toni Sant and Tony Grimaud of the Malta Music Memory Project (M3P) an interesting one. They run a wiki (using MediaWiki) intended to capture memories of the Maltese music scene, but there are problems with engagement - there are not as many users as they would like, and most of those only read the wiki rather than contribute to it. Meanwhile, concert photographs, posters and other items are being posted and discussed on Facebook, which is not a platform well suited to digital curation or preservation. How can users be encouraged to use the wiki, and how can Facebook content and discussions be copied across to it?

The first step was to look at easing the process of registering for and logging into the wiki. I set up an instance of MediaWiki on my laptop and created a Facebook application via Facebook’s developers’ portal. After installing the Facebook Open Graph extension for MediaWiki, very little configuration was necessary to enable authentication using Facebook Connect.

The next stage was to look at the possibility of getting data out of Facebook and into MediaWiki. Facebook provides a Graph API to enable access to data. It’s worth taking a look at the Graph API explorer, and in particular the difference that including an authentication token makes to the data returned following a request. The ease with which data can be extracted from Facebook is simultaneously alarming and reassuring - it’s not trapped there, but you should pay attention when an application seeks access to your information. There is also a MediaWiki API, as used by Wikipedia.
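To make the point about the authentication token concrete, here is a minimal sketch in Python (not the JavaScript used at the event) of how a Graph API request URL can be put together. The helper name `graph_url` and the placeholder token are hypothetical, and current Graph API deployments usually add a version segment to the path, so treat the base URL as an assumption. The function only builds the URL and makes no network call.

```python
from urllib.parse import urlencode

# Assumed base URL; newer Graph API versions typically insert a version segment.
GRAPH_BASE = "https://graph.facebook.com"


def graph_url(node, access_token=None, fields=None):
    """Build a Graph API URL for a node such as "me" (hypothetical helper).

    Without a token only a bare node URL is produced; with one, the request
    is made on behalf of a user and the response can include non-public data.
    """
    params = {}
    if fields:
        # The Graph API accepts a comma-separated list of field names.
        params["fields"] = ",".join(fields)
    if access_token:
        params["access_token"] = access_token
    query = urlencode(params)
    return f"{GRAPH_BASE}/{node}" + (f"?{query}" if query else "")


if __name__ == "__main__":
    print(graph_url("me"))
    print(graph_url("me", access_token="PLACEHOLDER_TOKEN", fields=["id", "name"]))
```

Comparing the responses to the two URLs, for example in the Graph API explorer, is one way to see how much depends on the token and on the permissions behind it.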

I looked at extracting user information from Facebook on registration and using it to create their user page within the wiki. The idea was that as well as proving that the data transfer is possible, it may help get new users engaged with editing the wiki. After a couple of false starts, the proof of concept was created by placing JavaScript in the MediaWiki:Common.js page of the wiki. Due to time constraints, it simply dumped the unparsed string of Facebook data into the user page, but it showed that (private) data could indeed be retrieved from Facebook and placed within the wiki.
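The proof of concept stopped at dumping the raw string, so the following is only a sketch of what the next step might look like, written in Python rather than as the Common.js script described above. It parses a JSON profile string into simple wikitext and prepares the parameters for a MediaWiki API `action=edit` request. The function names, the sample profile and the token value are all hypothetical; a real request would be sent as a POST with a valid edit token obtained from the wiki.

```python
import json


def profile_to_wikitext(raw_json):
    """Turn a JSON profile string into a wikitext bullet list.

    Only top-level scalar values are rendered; nested objects are skipped.
    Values are not escaped here, which a real tool would need to handle.
    """
    profile = json.loads(raw_json)
    lines = []
    for key in sorted(profile):
        value = profile[key]
        if isinstance(value, (str, int, float)):
            lines.append(f"* '''{key}''': {value}")
    return "\n".join(lines)


def build_edit_request(username, wikitext, edit_token):
    """Assemble POST parameters for the MediaWiki API edit action."""
    return {
        "action": "edit",
        "title": f"User:{username}",  # user pages live in the User: namespace
        "text": wikitext,
        "token": edit_token,  # placeholder; must come from the wiki itself
        "format": "json",
    }


if __name__ == "__main__":
    # Invented sample data standing in for a Graph API response.
    raw = '{"id": "1", "name": "Example User", "location": {"name": "Valletta"}}'
    text = profile_to_wikitext(raw)
    print(build_edit_request("Example User", text, "PLACEHOLDER_TOKEN"))
```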

There are clearly privacy issues with this technical proof of concept. It extracts private data from a Facebook account and places it on what would typically be a public page. Moreover, it does this silently in the background, without telling the user what’s happening. Most worryingly, the basic Facebook application I created could read all of a user’s data without explicitly asking permission to access the various classes of information. However, I feel that there is potential for this to become a usable tool. The population of a user page on registration could work as long as users are kept informed and placed in control of the data that is posted. It could also be adapted to enable a user to easily transfer any of their own Facebook content to the wiki, though there are unresolved issues around associated comments by other users.
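One way to keep users in control, sketched here in Python as an illustration and not as something built at the event, is to split the retrieved profile into the fields a user has explicitly approved and the fields that are withheld, and to show both before anything is posted. The function name `split_by_consent` and the sample data are hypothetical.

```python
def split_by_consent(profile, approved_fields):
    """Separate a profile dict into (to_publish, withheld) by user approval.

    Nothing is published by default: a field is included only if the user
    has named it in approved_fields.
    """
    approved = set(approved_fields)
    to_publish = {k: v for k, v in profile.items() if k in approved}
    withheld = {k: v for k, v in profile.items() if k not in approved}
    return to_publish, withheld


if __name__ == "__main__":
    # Invented sample profile; a real one would come from the Graph API.
    profile = {"name": "Example User", "email": "user@example.org", "hometown": "Valletta"}
    publish, keep_private = split_by_consent(profile, ["name", "hometown"])
    print("Will be posted to the wiki:", publish)
    print("Will stay on Facebook only:", keep_private)
```

Defaulting to an empty approved list means that a silent transfer, as in the proof of concept, would post nothing at all.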

Remarkably, this was enough to win me the best developer prize at the event. I was genuinely surprised - there were other developers there who were effortlessly tackling problems (plural - some finished ahead of time and moved on to work on other issues) which I wouldn’t have known how to get started on. Not only that, but many solutions were fully developed tools that the practitioners could take away and start using in their work. I’ll resist picking out favourites, but you can see the whole set of solutions on the SPRUCE wiki. Additionally, some other participants have blogged about the event.

I really enjoyed attending the Mashup and working with Toni and Tony, and I got the impression that the participants generally found it useful, not least because of the tools that were produced. Personally, I got to learn quite a bit about working with Facebook and with MediaWiki, and some of the other issues and solutions were fascinating. There should be more SPRUCE Mashup events in future - on this basis they should be well worth attending.