Publishing data in Zenodo

Revision as of 00:07, 14 July 2021 by P.petrelli

CLEX Data Collection


You can now publish the data you use for your research with the new CLEX Data Collection on Zenodo (CDC). We are also using this collection to list all our published data in one place, regardless of where it was originally published. If you have already published your data on another repository, please contact us and we can replicate your record here. We do this by default if you are publishing with NCI.

Zenodo is an initiative funded by CERN that allows anyone to share their research outputs and attach a DOI to them. Zenodo is funded for at least the next 20 years, so it offers a good long-term solution, and it is widely used internationally.

We will curate the Collection for the duration of CLEX, but the records will still be available and visible well after CLEX ends.

How to publish

Publishing your data is easy and quick as long as your data is reasonably organised and has a detailed Readme file.

  1. Create a Zenodo account. You can use your ORCID to log in if you have one.
  2. Create a Zenodo record for your dataset and upload the relevant files. We have prepared a step-by-step guide on how to do this, which also covers the kind of information you should include.
  3. Choose the CLEX Data Collection as the community. We will get a notification of your request. Remember that your data can be listed in more than one community.
  4. We will receive your request to join the CDC and check that your data is well described and in line with the Collection policies. If any changes are needed, we will contact you.
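
Before creating the record in step 2, it can help to run a quick local check on the dataset directory. The following is only a minimal sketch under stated assumptions: the function name `check_dataset` is hypothetical, the Readme is assumed to sit at the top level of the directory with a name starting with "readme" (in any letter case), and the 50 GB limit mentioned in the tips below is interpreted as 50 * 10**9 bytes.

```python
from pathlib import Path

# Size limit quoted on this page; interpreting "50GB" as 50 * 10**9 bytes
# is an assumption made for this sketch.
MAX_BYTES = 50 * 10**9


def check_dataset(root):
    """Return a list of human-readable problems found under `root`."""
    root = Path(root)
    files = [p for p in root.rglob("*") if p.is_file()]
    problems = []

    # Look for a Readme at the top level of the dataset, in any letter case.
    has_readme = any(
        p.parent == root and p.name.lower().startswith("readme") for p in files
    )
    if not has_readme:
        problems.append("no Readme file at the top level")

    # Add up the size of every file, including those in subdirectories.
    total = sum(p.stat().st_size for p in files)
    if total >= MAX_BYTES:
        problems.append(f"total size {total} bytes is not below {MAX_BYTES}")

    return problems
```

Calling `check_dataset("path/to/my_dataset")` returns an empty list when neither problem is found. It does not replace the review in step 4, which looks at how well the data is described.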

Useful tips

  • A dataset can have several authors. They should all agree to the publication of the dataset and to listing the record in the CLEX Data Collection. All authors should have made a significant contribution to the data; refer to the CDC authorship policy if in doubt.
  • Make sure your files follow any relevant standards. NetCDF files should follow both the CF and ACDD conventions. We are happy to help you prepare them.
  • Make sure the files have descriptive names and are organised in directories in a way that facilitates their access and use.
  • If your data has already been published elsewhere and the Zenodo record is a copy, we will upload only the Readme file here and add links to the original records for data download.
  • Remember that you can only publish datasets smaller than 50 GB.
  • We will soon release code to help you upload files programmatically, so that they can be transferred directly from a server via the Zenodo API. In the meantime, if you want to use the API, just send us an email and we will facilitate the process.
  • If you do not want to create your own Zenodo account, we can create a record for you. In that case, let us know via the helpdesk that you want to publish. If your data is well documented and organised, it might take as little as 20 minutes for us to add a record to the Zenodo community. Please note that having your own Zenodo account will give you more control over the record. It is especially recommended if you are likely to release new versions in the future.
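
For readers curious about the API route mentioned in the tips above, the sketch below shows roughly what creating a record through Zenodo's deposit REST API involves. It is not the CLEX upload code, which is not described here. The helper names `build_metadata` and `make_create_request` are hypothetical, the community identifier and token are placeholders, and the field names reflect the Zenodo deposit API as we understand it, so check the current Zenodo documentation before relying on them. The request is only prepared, not sent, and uploading the files is a separate step that is not shown.

```python
import json
import urllib.request

# Endpoint of Zenodo's deposit REST API for creating a new record.
ZENODO_DEPOSIT_URL = "https://zenodo.org/api/deposit/depositions"


def build_metadata(title, description, creators, community_id):
    """Build the JSON body for a new dataset record.

    `creators` is a list of dicts such as {"name": "Family, Given"}.
    `community_id` is the Zenodo identifier of the community; the value
    used in the example below is a placeholder, not the real CDC identifier.
    """
    return {
        "metadata": {
            "upload_type": "dataset",
            "title": title,
            "description": description,
            "creators": creators,
            "communities": [{"identifier": community_id}],
        }
    }


def make_create_request(token, body):
    """Prepare, but do not send, the POST request that creates the record."""
    return urllib.request.Request(
        ZENODO_DEPOSIT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    body = build_metadata(
        title="Example dataset",
        description="Short description taken from the Readme file.",
        creators=[{"name": "Doe, Jane"}],
        community_id="your-community-id",  # placeholder
    )
    request = make_create_request("YOUR_ACCESS_TOKEN", body)  # placeholder token
    # urllib.request.urlopen(request) would send it; omitted on purpose.
    print(request.get_method(), request.full_url)
```

Keeping the metadata construction separate from the request makes it easy to inspect the JSON body before anything is submitted.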

Requirements

We have set up policies covering the use of and contribution to the Collection. These are shown in full on the Zenodo community page, but can also be downloaded here.

They include:

  • The CDC scope: outlines the scope of the Collection and the main requirements for a dataset to be accepted.
  • The Authorship policy: this is based on the Australian Code for the Responsible Conduct of Research (2018) and covers who can be considered an author or a collaborator.
  • Contributors guidelines: more in-depth guidelines on how to contribute data to the Collection.
  • Retention and Retraction policy: defining the few cases in which we might retract a record and the procedure we will follow.

Because we are using the Zenodo platform, any policy set by Zenodo must also be honoured.

We do not check the quality of your data: as long as the Collection policies are satisfied and your data is in line with the Collection scope, it will be accepted. The only requirement currently is that your description of the data is reasonably comprehensive and that the files are accessible. However, if you would like advice or feedback on your data, we are happy to help. As usual, you can contact us via the CWS helpdesk: cws_help<at>nci.org.au.