Friday, 20 July 2007

Open Data Licensing: is your data safe?

Over on the Nodalities blog, Rob Styles writes about some aspects of open data licensing, and the tricky questions of copyright versus database right. OK, yawn. Let me put that another way… over on the Nodalities blog, Rob Styles writes about whether you can make your data openly accessible on the web without getting totally ripped off in the process. A bit less of a yawn?

One key quote:
“Without appropriate protection of intellectual property we have only two extreme positions available: locked down with passwords and other technical means; or wide open and in the public-domain. Polarising the possibilities for data into these two extremes makes opening up an all or nothing decision for the creator of a database. 

With only technical and contractual mechanisms for protecting data, creators of databases can only publish them in situations where the technical barriers can be maintained and contractual obligations can be enforced.”
It’s true: to put any conditions on the use of our data, we have to have an exclusive right to control it. Copyright gives its owner that right for a text. If I own the Copyright in my work, I can (and try to) put a Creative Commons licence on it, allowing others to use it while asking them to give me attribution if they do so.

The problem is that there is doubt… OK, more than doubt… about whether and how Copyright applies to databases. And if Copyright does not apply, you don’t get the exclusive control which allows you to apply a conditional licence like Creative Commons. Just to explore a bit further...

Science Commons was set up to look at helping make science data more openly available. But if you look at their FAQ, you can see some real concerns. They pick out several aspects of a database that might be subject to Copyright, including the structure, but also say:
"In the United States, data will be protected by copyright only if they express creativity. Some databases will satisfy this condition, such as a database containing poetry or a wiki containing prose. Many databases, however, contain factual information that may have taken a great deal of effort to gather, such as the results of a series of complicated and creative experiments. Nonetheless, that information is not protected by copyright and cannot be licensed under the terms of a Creative Commons license."
In a note to me, Mags McGinley, our legal officer, reinforces this, and adds:
"Copyright definitely applies to certain elements of a database. Copyright exists in the structure of a database if, by reason of the selection and arrangement, it constitutes the author's own intellectual creation. In addition the contents of a database, depending on what they are, may attract their own copyright protection (a simple example might be a database of poems)."
But is there a glimmer of hope? The Science Commons FAQ goes on to say:
"Note - for databases subject to the laws of members of the European Union and certain other countries, the law supplies a special right for databases. Except in the Netherlands and Belgium Creative Commons Licenses, Creative Commons licenses do not apply to this right..."
Rob Styles also reminds us that in Europe we have this other right: “the EU adopted a robust database right in 1996 while the US ruled against such protection in 1991”.
“Database right in the EU is like Copyright. It is a monopoly, but only on that particular aggregation of the data. The underlying facts are still not protected and there is nothing to stop a second entrant from collecting them independently.”
Charlotte Waelde has written a report for the JISC-funded GRADE project on rights that apply to data in geospatial databases. She concluded that Database Copyright does not apply, but the Database Right does apply. She also concluded (my emphasis):
"• Unauthorised taking and making available of substantial parts of the contents of the database will infringe the right of extraction and re-utilisation"
"• A lawful user of the database (e.g. the researcher or teacher in an educational institution) may not be prevented from extracting and re-utilising an insubstantial part of the contents of a database for any purposes whatsoever.
• A researcher or teacher may not be prevented from extracting a substantial part of the contents of the database for the purposes of non-commercial research or illustration for teaching so long as the source is indicated. Re-utilisation may only be enjoined if the output contains a substantial part of the contents of the protected database"
I am not a lawyer and (try as I might) I couldn't get all the nuances of what she is saying, particularly in the last sentence above; however, Mags tells me:
"The thing there is that there is a difference between extraction and reutilisation which are the two activities that can be prevented by the database right. The fair dealing exceptions for the database right are not as wide as those of copyright and are for some reason limited to the act of extraction."
"So Charlotte is highlighting the maximum you could do in such case where your activities fall within the research/teaching area. This is: extract a substantial part. And then reutilise an insubstantial part (because the database right only limits what you do with substantial parts of the database)."
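One way to read Charlotte's bullets and Mags's gloss together is as a small decision rule. The Python below is only a sketch of that reading, written by a non-lawyer and not a statement of the law: the names `Use` and `may_be_prevented` are invented for illustration, and "substantial" is treated as a simple yes/no flag even though in practice it is a matter of judgement.

```python
from dataclasses import dataclass

ACTS = ("extraction", "re-utilisation")


@dataclass(frozen=True)
class Use:
    """A proposed use of a database covered by the Database Right (hypothetical model)."""
    act: str                            # "extraction" or "re-utilisation"
    substantial: bool                   # does it involve a substantial part of the contents?
    lawful_user: bool = True            # e.g. a researcher or teacher with lawful access
    research_or_teaching: bool = False  # non-commercial research or illustration for teaching
    source_indicated: bool = False      # is the source acknowledged?


def may_be_prevented(use: Use) -> bool:
    """Return True if, on this reading, the right holder could prevent the use."""
    if use.act not in ACTS:
        raise ValueError(f"act must be one of {ACTS}")

    # Assumption, following Mags's gloss: the database right only limits
    # what is done with substantial parts of the database.
    if not use.substantial:
        return False

    # The research/teaching exception covers extraction only, not
    # re-utilisation, and only when the source is indicated.
    if (use.act == "extraction" and use.lawful_user
            and use.research_or_teaching and use.source_indicated):
        return False

    # Anything else involving a substantial part can be prevented.
    return True


# The "maximum" for a researcher: extract a substantial part...
researcher_extract = Use("extraction", substantial=True,
                         research_or_teaching=True, source_indicated=True)
# ...and then re-utilise only an insubstantial part of it.
researcher_reuse_small = Use("re-utilisation", substantial=False,
                             research_or_teaching=True, source_indicated=True)
# Re-utilising a substantial part is not covered by the exception.
researcher_reuse_large = Use("re-utilisation", substantial=True,
                             research_or_teaching=True, source_indicated=True)
```

On this reading the first two uses cannot be prevented and the third can, which is the asymmetry between extraction and re-utilisation that Mags describes.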
Rob goes on to end his blog entry, saying of rights:
“They allow inventors to disclose their inventions when they might otherwise have had to keep them secret... That's why we've invested in a license to do this, properly, clearly and in a way that stays Open.”
He is referring to the Talis Community Licence, which attempts to base a conditional open licence on the Database Right. Trust me, I REALLY want this sort of thing to work, but I worry that the Database Right may not be sufficient as underlying protection to make this licence firm. And what would be the law applying to access FROM a jurisdiction like the US that did not have a Database Right?

As I’ve said before, I’m not a lawyer. Can a data-oriented lawyer comment?


  1. Chris... :-)

    Excellent points, as usual...

    In order to ensure the applicability of a database right-dependent license such as the TCL in jurisdictions that do not respect such a right (i.e. the USA), I am in the final stages of funding some further legal work. IANAL, but I presume that this work will see the existing license re-expressed in the form of a set of Terms & Conditions, supported by the weight of Contract Law rather than Copyright (a la Creative Commons) or Database (a la TCL) law. As such, it would be enforceable in a far wider set of jurisdictions.

    I'm also talking to some pretty heavy-weight legal minds in California later today, once that part of the planet wakes up.

    As I've mentioned before, we're aligning this legal activity to some broader branding/positioning work that will see the (current) TCL renamed in more neutral terms, moved off our web sites, and offered to a neutral home capable of ensuring - and protecting - its long term relevance.

    We'd be happy to talk to the DCC about ways in which such a license might best meet your needs...

  2. Paul,

    I’m glad you mentioned the contract element as, having reviewed the TCL, I was thinking about this yesterday. Where there are no IPR available to base the Ts&Cs on, contract law could be utilised as a basis for dictating usage. I see a couple of difficulties here though. One is the uncertainty of whether contract can override IPR (not a problem if there is no IPR to override, but in some cases there may be). My understanding is that the current weight of academic opinion seems to suggest that, in the public interest, such contractual provisions should not be tolerated (although you will often see them in licences). However, it is logical that the public interest argument is relevant only where contractual terms look to limit or restrict a use that would otherwise be available via copyright/database right legislation. If the intention here is to enable, as opposed to restrict, the public interest objection is weakened.

    A second difficulty is how to ‘contract’ with unspecified people/organisations (as opposed to agreeing something with a specific individual). If someone wants to use the data and doesn’t agree with your terms they could just choose to ignore them – what can you do about it if you don’t have any rights in the data? You could choose to lock up the data, making it available only once someone has agreed to the contract terms. Would this be permissible? Does it go against the spirit of what we are trying to do here?

    Having said all this, it’s very possible that contract is the way ahead. But there would be a few things to iron out first… I think?

    I must also add that I’m really pleased that there are people out there, like yourself, giving this matter so much consideration.


Please note that this blog has a Creative Commons Attribution licence, and that by posting a comment you agree to your comment being published under this licence. You must be registered to comment, but I'm turning off moderation as an experiment.