Open Data Licensing: is your data safe?

20 July, 2007
Over on the Nodalities blog, Rob Styles writes about some of the aspects of open data licensing, and the tricky questions of copyright versus database right. OK, yawn. Let me put that another way… over on the Nodalities blog, Rob Styles writes about whether you can make your data openly accessible on the web without getting totally ripped off in the process. A bit less of a yawn?

One key quote:
“Without appropriate protection of intellectual property we have only two extreme positions available: locked down with passwords and other technical means; or wide open and in the public-domain. Polarising the possibilities for data into these two extremes makes opening up an all or nothing decision for the creator of a database.

With only technical and contractual mechanisms for protecting data, creators of databases can only publish them in situations where the technical barriers can be maintained and contractual obligations can be enforced.”
It’s true: to put any conditions on the use of our data, we have to have an exclusive right to control it. Copyright gives its owner that right for a text. If I own the Copyright in my works, I can (and try to) put a Creative Commons licence on them, allowing others to use them provided they give me attribution.

The problem is that there is doubt… OK, more than doubt… about whether, and how, Copyright applies to databases. And if Copyright does not apply, you don’t get the exclusive control which allows you to apply a conditional licence like Creative Commons. Just to explore a bit further...

Science Commons was set up to look at helping make science data more openly available. But if you look at their FAQ, you can see some real concerns. They pick out several aspects of a database that might be subject to Copyright, including the structure, but also say:
"In the United States, data will be protected by copyright only if they express creativity. Some databases will satisfy this condition, such as a database containing poetry or a wiki containing prose. Many databases, however, contain factual information that may have taken a great deal of effort to gather, such as the results of a series of complicated and creative experiments. Nonetheless, that information is not protected by copyright and cannot be licensed under the terms of a Creative Commons license."
In a note to me, Mags McGinley, our legal officer, reinforces this, and adds:
"Copyright definitely applies to certain elements of a database. Copyright exists in the structure of a database if, by reason of the selection and arrangement, it constitutes the authors own intellectual creation. In addition the contents of database, depending on what they are, may attract their own copyright protection (a simple example might be a database of poems)."
But is there a glimmer of hope? The Science Commons FAQ goes on to say:
"Note - for databases subject to the laws of members of the European Union and certain other countries, the law supplies a special right for databases. Except in the Netherlands and Belgium Creative Commons Licenses, Creative Commons licenses do not apply to this right..."
Rob Styles also reminds us that in Europe we have this other right: “the EU adopted a robust database right in 1996 while the US ruled against such protection in 1991”.
“Database right in the EU is like Copyright. It is a monopoly, but only on that particular aggregation of the data. The underlying facts are still not protected and there is nothing to stop a second entrant from collecting them independently.”
Charlotte Waelde has written a report for the JISC-funded GRADE project on rights that apply to data in geospatial databases. She concluded that Database Copyright does not apply, but the Database Right does apply. She also concluded (my emphasis):
"• Unauthorised taking and making available of substantial parts of the contents of the database will infringe the right of extraction and re-utilisation"
and...
"• A lawful user of the database (e.g. the researcher or teacher in an educational institution) may not be prevented from extracting and re-utilising an insubstantial part of the contents of a database for any purposes whatsoever.
• A researcher or teacher may not be prevented from extracting a substantial part of the contents of the database for the purposes of non-commercial research or illustration for teaching so long as the source is indicated. Re-utilisation may only be enjoined if the output contains a substantial part of the contents of the protected database"
I am not a lawyer and (try as I might) I couldn't get all the nuances of what she is trying to say, particularly in the last sentence above. However, Mags tells me:
"The thing there is that there is a difference between extraction and reutilisation which are the two activities that can be prevented by the database right. The fair dealing exceptions for the database right are not as wide as those of copyright and are for some reason limited to the act of extraction."
"So Charlotte is highlighting the maximum you could do in such case where your activities fall within the research/teaching area. This is: extract a substantial part. And then reutilise an insubstantial part (because the database right only limits what you do with substantial parts of the database)."
Rob ends his blog entry by saying of such rights:
“They allow inventors to disclose their inventions when they might otherwise have had to keep them secret... That's why we've invested in a license to do this, properly, clearly and in a way that stays Open.”
He is referring to the Talis Community Licence, which attempts to base a conditional open licence on the Database Right. Trust me, I REALLY want this sort of thing to work, but I worry that the Database Right may not be sufficient as underlying protection to make this licence firm. And what law would apply to access FROM a jurisdiction like the US that does not have a Database Right?

As I’ve said before, I’m not a lawyer. Can a data-oriented lawyer comment?