Open access license

Revision as of 21:04, 7 July 2019 by P.petrelli (talk | contribs)
This is a stub page and needs expansion.

If you are sharing your data or code freely, you might wonder why you should use a license. Making your data freely available to others does not mean that you should not establish how it can be used. This is where open access licenses come in handy, and this is why the same open access license can come in different flavours.

Here we summarise the ones we commonly use. For a much more detailed view of these licenses, ANDS (the Australian National Data Service) has a comprehensive guide which also covers copyright. The important thing to understand about copyright is that it is usually held by your own institution, even if you are publishing with CLEx. This is because CLEx is not a legal entity and your institution owns the Intellectual Property of your work (see below). When you apply a license to your data, you are doing so on behalf of your institution. Unless there are particular circumstances around your data, for example if it is already covered by an agreement or is of a sensitive nature, most institutions have an open access policy and would be fine with an open access license. If you are in doubt, check with your institution whether they are fine with the license you want to use.

Before looking at the different flavours of licenses, we need to understand what they cover. First of all, these are all open access licenses, so by applying one of them you are usually giving free access to your data, which means that you are not charging for its use. However, you can still regulate how a potential user can access and use your data. Usually a license will cover some or all of the following points:

  • Attribution: a user is required to cite and/or acknowledge your data. This is usually included in any license, so make sure you also share the information on how to do so.
  • Commercial use and/or research use: you can limit the use of your data to research only, or exclude any commercial use and/or redistribution of your data.
  • Distribution vs private use: whether a user can distribute your data/software or can only use it for their own work.
  • License and copyright notice: a copy of the license and copyright notice has to be included with the data or software.
  • Modification: a user can modify, augment or transform your data to create a new dataset or software.
  • Same license: the dataset or software resulting from a modification of yours has to be distributed under the same license, so the user cannot restrict the use of their derived product.
  • No-Derivatives: if a user modifies, augments or transforms the data in any way, they cannot redistribute it. This can of course change if they contact the creator and the creator allows an exception. In fact, this is often the reason why this clause is added: to be kept informed of how the data is used.

The following two points apply only to software:

  • Disclose source: source code must be made available when derived product is distributed 
  • State changes: any changes made must be documented 

Finally, licenses usually contain a disclaimer to cover warranty and liability, i.e. that the data or software is provided "as is" and that the copyright owner cannot be held responsible for any damages arising from the use of the product.

As you can see, there are a lot of different flavours. Nearly every license covers "attribution", which is also usually the main reason to apply a license: to get your work recognised!

Licenses are really useful for both the "data" creator and the user. They help a potential user to work out quickly whether the data is suitable for their intended use, and they help them cite and use the data in the way the creator wants them to.
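To illustrate how the points above let a user check quickly whether an intended use fits a license, here is a minimal Python sketch. The `LicenseTerms` class, its field names and the example license are all hypothetical and only mirror the points listed on this page. This is not a legal tool and does not represent any real license text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseTerms:
    """Hypothetical summary of the points a license may cover."""
    name: str
    attribution_required: bool = True   # user must cite/acknowledge
    commercial_use: bool = True         # commercial use allowed
    distribution: bool = True           # redistribution allowed
    modification: bool = True           # derived products may be redistributed
    same_license: bool = False          # derived products keep the same license


def problems_with_use(terms, *, commercial, redistribute, modify):
    """Return the reasons (if any) why an intended use conflicts with the terms."""
    problems = []
    if commercial and not terms.commercial_use:
        problems.append("commercial use is not allowed")
    if redistribute and not terms.distribution:
        problems.append("redistribution is not allowed")
    if modify and redistribute and not terms.modification:
        problems.append("modified versions cannot be redistributed")
    return problems


def obligations(terms, *, redistribute_modified):
    """Return what the user has to do when the use itself is allowed."""
    duties = []
    if terms.attribution_required:
        duties.append("cite or acknowledge the creator")
    if redistribute_modified and terms.same_license:
        duties.append("release the derived product under the same license")
    return duties


# Made-up example: research-only data that must keep the same license.
example = LicenseTerms(name="example-research-only",
                       commercial_use=False, same_license=True)

print(problems_with_use(example, commercial=True, redistribute=False, modify=True))
print(obligations(example, redistribute_modified=True))
```

An empty list from `problems_with_use` means that none of the modelled points forbids the intended use. The license text itself remains the reference.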

Datasets license

While we consider both datasets and code as a form of data, from a licensing point of view they are treated differently. For datasets we suggest the Creative Commons licenses. As well as the International version, there is also an Australian Creative Commons. These licenses are simple to use: they offer 6 different flavours, which cover most use cases, and an online tool to help you choose between them. They have a human readable version as well as the legal text. Finally, the International version was created to cover your data independently of the potential user's country of origin.

Creative Commons uses an abbreviation for each of the options: BY for attribution, which is always present, NC for non-commercial, SA for share-alike (same license) and ND for no-derivatives.

Check this 5-minute video to learn more about the available combinations:


If you are unsure which one to use, the CoE advice is to use CC-BY-NC-SA.
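As a rough illustration of how the Creative Commons abbreviations combine into the 6 flavours, the following Python sketch builds the license code from the chosen options. The function name `cc_license_code` and its parameters are hypothetical, and the codes follow the hyphenated form used on this page. The sketch assumes that share-alike (SA) and no-derivatives (ND) are never combined, since SA only applies to derived products and ND does not allow them to be shared.

```python
from itertools import product


def cc_license_code(non_commercial=False, share_alike=False, no_derivatives=False):
    """Build a Creative Commons code such as 'CC-BY-NC-SA' (hypothetical helper)."""
    if share_alike and no_derivatives:
        # SA governs derived products, ND forbids sharing them.
        raise ValueError("SA and ND cannot be combined")
    parts = ["CC", "BY"]          # BY (attribution) is always present
    if non_commercial:
        parts.append("NC")        # non-commercial use only
    if share_alike:
        parts.append("SA")        # derived products keep the same license
    if no_derivatives:
        parts.append("ND")        # modified versions cannot be redistributed
    return "-".join(parts)


# Enumerate every valid combination of the three optional clauses.
all_codes = sorted(
    cc_license_code(nc, sa, nd)
    for nc, sa, nd in product([False, True], repeat=3)
    if not (sa and nd)
)

print(all_codes)
print(cc_license_code(non_commercial=True, share_alike=True))
```

The last call returns the code of the license suggested above as a default.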

Software licenses

For software you can refer to the Open Source Initiative and the Software Sustainability Institute (SSI). They both offer a good introduction. SSI also lists several other websites that cover the available licenses in a human readable way and offer useful comparison tools. Among these:

from GitHub:

from TLDRLegal:

They both summarise licenses by their "rules".

We usually apply the Apache 2.0 license to the code we produce. Creative Commons licenses are not suitable for licensing software, as stated on the Creative Commons website:

" ... Unlike software-specific licenses, CC licenses do not contain specific terms about the distribution of source code, which is often important to ensuring the free reuse and modifiability of software. Many software licenses also address patent rights, which are important to software but may not be applicable to other copyrightable works. Additionally, our licenses are currently not compatible with the major software licenses, so it would be difficult to integrate CC-licensed work with other free software. Existing software licenses were designed specifically for use with software and offer a similar set of rights to the Creative Commons licenses. "

Apache 2.0 applies the following:

Permissions
  • Commercial use
  • Distribution
  • Modification
  • Patent use
  • Private use

Conditions
  • License and copyright notice
  • State changes

Limitations
  • Liability
  • Trademark use
  • Warranty

In summary, this means that commercial and private use, distribution and modification are all allowed. A user can also use parts of the code covered by patents, should there be any. A user has to attach the original license and copyright notices to any derived product they want to distribute, and state any changes made. Finally, the software creator cannot be held liable for damages, the product is shared "as is" with no warranty, and the license does not grant trademark rights.
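In practice, the "license and copyright notice" condition is commonly met by shipping the full license text with the code and adding a short notice at the top of each source file. The sketch below formats the standard boilerplate notice suggested in the appendix of the Apache 2.0 license as a comment block. The helper name `apache_header`, the `comment_prefix` parameter and the owner "Example University" are placeholders, not something prescribed by the license or by CLEx. As noted above, the copyright owner is usually your institution.

```python
# Boilerplate notice from the appendix of the Apache License, Version 2.0,
# with the year and the copyright owner left as placeholders.
APACHE_2_NOTICE = """\
Copyright {year} {owner}

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""


def apache_header(year, owner, comment_prefix="# "):
    """Return the notice as a comment block (hypothetical helper)."""
    text = APACHE_2_NOTICE.format(year=year, owner=owner)
    # Prefix every line and drop trailing spaces left on empty lines.
    return "\n".join((comment_prefix + line).rstrip() for line in text.splitlines())


# "Example University" is a made-up copyright owner.
header = apache_header(2019, "Example University")
print(header)
```

For languages with a different comment syntax, pass another prefix, for example `comment_prefix="// "`.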

More on intellectual property and copyrights by institution


Melbourne Uni