Publishing with NCI
NCI provides web services to publish data and metadata:
- a GeoNetwork catalogue (i.e. a metadata repository) to describe your dataset and its associated DOI. The catalogue record will be the landing page for the DOI and will include a link to access the dataset.
- a THREDDS Data Server (TDS): a public data repository that provides access to the data. Files can be downloaded from here or accessed via the OPeNDAP protocol.
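As a rough illustration of the two access routes, a THREDDS server conventionally exposes the same file under a "fileServer" path for direct download and under a "dodsC" path for OPeNDAP. The sketch below only builds the two URLs. The host name, the function name and the dataset path are assumptions made up for illustration; the real values are the ones shown on the dataset's TDS catalogue page.

```python
from urllib.parse import quote

# Assumed placeholder host, for illustration only; use the real TDS address.
TDS_HOST = "https://tds.example.org"


def tds_urls(dataset_path, host=TDS_HOST):
    """Return the HTTP download and OPeNDAP URLs for a file on a THREDDS server."""
    # Normalise slashes and percent-encode anything unsafe in the path.
    path = quote(dataset_path.strip("/"))
    base = host.rstrip("/") + "/thredds"
    return {
        "download": f"{base}/fileServer/{path}",  # plain HTTP download
        "opendap": f"{base}/dodsC/{path}",        # remote access via OPeNDAP
    }


# Hypothetical dataset path, not a real published file.
urls = tds_urls("ks32/CLEX_Data/my-dataset/file.nc")
print(urls["download"])
print(urls["opendap"])
```

The OPeNDAP URL can be passed to software that understands the protocol, such as netCDF libraries built with DAP support, to read part of a file remotely instead of downloading all of it.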
You have two options when publishing with NCI: publishing as part of the CLEX Collection or having your own data project.
If you publish your data as part of the CLEX Collection, your data will be hosted in the ks32 project and will be listed together with other CLEX data.
The second option usually applies only to special cases and is worthwhile only if you want to create a separate collection, for example because you will have more datasets to add in the future. It is a slower way to publish the data, as you will have to apply for a new data project code, which includes finding a way to fund the disk storage needed. In either case, we can help you by providing the required information and processing the files; the steps listed below apply to both options.
This diagram shows the publishing process, described in more detail below.
- Create a data management plan (DMP) for your dataset using the CLEX Roadmap tool, if you don't have one already. If you do, make sure you have filled in the third phase of the plan, which deals with the "Publishing details". You should then share the plan with us when you're ready (share with paola.petrelli<at>utas.edu.au).
- We create a directory in /g/data/ua8/Publishing/CLEX_Data/, named after the acronym you chose for your dataset. The directory will include a draft readme file and a file for the chosen license. You can request access to the ua8 project via https://my.nci.org.au .
- Move the data to your dataset directory: /g/data/ua8/Publishing/CLEX_Data/<your-dataset>/tmp
- Get the dataset to the quality level needed to share it successfully. Metadata in the files should satisfy the CF conventions. The readme file should contain a data description, and both the directory structure and the filenames should be understandable and carry information about the data. You can run a CF checker yourself, or we can check your files and tell you if anything needs to be done.
- Once NCI has minted a DOI for your dataset, we can finalise the files by adding the global attributes that describe the publication details (DOI, citation, title, ...) following the ACDD conventions. Most of these can be defined from the publishing section of the DMP. A draft bash script, "attributes_cf.sh", is generated with the directory to facilitate this step.
- Once the files are ready, we will ask NCI to run their QA/QC checker on them. When they pass these quality checks, the final version is copied to /g/data/ks32/CLEX_Data/ and the dataset is added to the TDS catalogue.
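The step of adding publication details as global attributes can be sketched as follows. This is not the generated "attributes_cf.sh" script, whose contents are not shown here. It is a minimal illustration, assuming the NCO tool ncatted is used to write the attributes, and it only prints the commands rather than running them. The attribute values, the dataset name "my-dataset", the file name and the function name are hypothetical placeholders.

```python
import shlex
from pathlib import PurePosixPath

# Hypothetical publication details; in practice these would come from the
# publishing section of the DMP and from the DOI minted by NCI.
ATTRS = {
    "title": "Example dataset v1.0",
    "license": "CC BY 4.0",
    "references": "https://doi.org/<your-doi>",
}


def ncatted_command(nc_file, attrs):
    """Build an ncatted command that sets global attributes on one file."""
    cmd = ["ncatted", "-O"]
    for name, value in attrs.items():
        # -a name,global,o,c,value : overwrite (o) a character (c) global attribute
        cmd += ["-a", f"{name},global,o,c,{value}"]
    cmd.append(str(nc_file))
    return cmd


# Placeholder staging directory and file name, following the layout above.
data_dir = PurePosixPath("/g/data/ua8/Publishing/CLEX_Data/my-dataset/tmp")
for nc_file in [data_dir / "example.nc"]:
    # Print a shell-quoted command for review instead of executing it.
    print(shlex.join(ncatted_command(nc_file, ATTRS)))
```

Printing the commands first makes it easy to review them before touching the files; the values here deliberately avoid commas, since commas also separate the fields of the ncatted "-a" argument.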
This process looks complex, but if you follow the guidelines when filling in the DMP and the variables in your files are reasonably well described, we can do most of the work for you.
NB: the DOI will not work until the publishing process is complete.
If you are in doubt, do not hesitate to ask for help by e-mailing our Helpdesk: cws_help<at>nci.org.au.
- Think carefully about the license and rights terms; you will find some options on the form itself. Contact the helpdesk if there are other collaborators involved or any other special terms to be taken into account. The license will be virtually null if CLEX and/or your university do not hold the copyright.
- If you do not have one already, create an ORCID identity, which we can list in the record as well. The ARC is now encouraging the use of researcher identities to reference your whole body of work when applying for a grant.
- Give some thought to what you want to publish. As well as satisfying the journal requirements, you want to make sure to include anything that could be useful to other researchers. This will increase the value of your data and potentially get more people to cite you. Check also our guidelines on which data you should publish.
- The title of your dataset is important: check our tips for creating a descriptive title.
- Have a good versioning strategy to accommodate corrections or new releases.
- Use keywords and controlled vocabularies in your metadata to increase discoverability.
Using the same information, we also create a record on Research Data Australia (RDA), a metadata catalogue service provided by ARDC. We do this because RDA has more visibility than the NCI catalogue, including being harvested by the Google Dataset Search tool.
We create the RDA record before uploading the GeoNetwork XML file, as this allows us to show you a draft of what your record would look like before publication, which is not possible with GeoNetwork. You can look at one of the ARCCSS/CLEX records as an example.
In a similar way, we will add your published record to the Zenodo CLEX Data Collection. We will not upload data here, only the readme file for the dataset. Both the RDA and Zenodo records will refer to the NCI GeoNetwork catalogue as the official metadata source and to the TDS as the dataset access point.