Data publishing guidelines
Journal editors have recently updated their data policies and now require authors to make available the data relating to a submitted paper. From the AGU data policy: "... all data necessary to understand, evaluate, replicate, and build upon the reported research must be made available and accessible whenever possible ..."
The aim of this policy change is to satisfy the principle that someone reading the paper should be able to reproduce your experiment. Again from the AGU policy: "For the purposes of this policy, data include, but are not limited to, the following:
- Data used to generate, or be displayed in, figures, graphs, plots, videos, animations, or tables in a paper.
- New protocols or methods used to generate the data in a paper.
- New code/computer software used to generate results or analyses reported in the paper.
- Derived data products reported or described in a paper."
There can be practical and even copyright limitations to doing this, but these can be taken into account and should not be an impediment to publication if properly documented. The JGR-Space Physics editor-in-chief has listed some of these challenges and clarified the scope of the policy on his blog. This includes references to model data, which apply to much of the CLEx data as well.
The publishing policies page offers a list of data policies and requirements by publisher.
How to publish
NCI now provides web services to publish data and metadata. These include a geonetwork catalogue (i.e. a metadata repository) to describe your data and provide links to other descriptions and to the dataset access point. Once you have a geonetwork record, NCI can mint a DOI for the dataset; as for papers, a DOI makes the dataset easy to cite. The files can be made accessible to the public through NCI's TDS catalogue (THREDDS).
We also create a collection record on Research Data Australia (RDA), a metadata catalogue service provided by ANDS. We do this because RDA has more visibility than the NCI catalogue, including being harvested by the new Google Dataset Search. All these records reference each other, so no matter where users find your record they will always reach the same DOI, data access point and information.
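As a rough illustration of why a DOI makes a dataset easy to cite, the sketch below turns a DOI into its resolver URL and assembles a citation string from a few metadata fields. The function names, the citation layout and the DOI `10.0000/example` are assumptions made for this example; they are not a format mandated by NCI, RDA or any journal.

```python
def doi_url(doi):
    """Return the https://doi.org/ resolver URL for a bare or prefixed DOI."""
    doi = doi.strip()
    # Accept the common ways a DOI is written and reduce them to the bare form.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return "https://doi.org/" + doi


def format_citation(authors, year, title, publisher, doi):
    """Build a one-line dataset citation (layout is an assumption of this sketch)."""
    return "{} ({}): {}. {}. {}".format(
        "; ".join(authors), year, title, publisher, doi_url(doi)
    )


# Hypothetical record: every value below is a placeholder, not a real dataset.
print(
    format_citation(
        ["Author, A.", "Author, B."],
        2019,
        "Example dataset v1.0",
        "Example Publisher",
        "doi:10.0000/example",
    )
)
```

Because every record points to the same DOI, a citation built this way resolves to the same landing page whichever catalogue the reader started from. Always check the citation style your journal asks for.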
There are a few necessary steps to make your data available:
- Create a data management plan for your collection using the CLEX Roadmap tool, if you don't have one already. If you do, make sure you have filled in the third phase of the plan, which deals with the publishing details. You should then share the plan with me when you're ready (share with email@example.com). I use the DMP to collect the information needed to create a metadata record on RDA on your behalf. You can look at one of the ARCCSS records as an example of the kind of information required.
- We will use the information in the plan to automatically generate the NCI and RDA records and a directory in /g/data/ua8/CLEX_Data/<your-dataset>, which will also contain a draft readme file and the license attached to the data. You need to request access to the ua8 project via https://my.nci.org.au .
- Move the data to your dataset directory: /g/data/ua8/CLEX_Data*/<your-dataset>/tmp . (* This could also be ARCCSS_Data, depending on your affiliation.)
- Get the dataset to a quality level good enough to share it successfully: the metadata in the files should satisfy the CF conventions; there should be a readme file or some other data description sitting with the data; and both the directory structure and the filenames should be understandable and contain information about the data. You can use a CF checker yourself, or we can check your files and tell you if anything needs to be done.
- Once the files are ready we will ask NCI to run their QA/QC checker on them. If they pass these quality checks, the final version is copied to /g/data/ks32/CLEX_Data/ and the dataset is added to the THREDDS catalogue.
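Before asking for a check, you may want to run a quick local pass over the dataset directory. The following is a minimal sketch, not the NCI QA/QC checker and not a CF checker: it only looks for a readme at the top level and flags file or directory names that are awkward to use in shells and URLs. The `preflight` function and the naming rule in `SAFE_NAME` are assumptions made for this example.

```python
import re
from pathlib import Path

# Assumed naming rule for this sketch: letters, digits, dot, underscore, hyphen.
SAFE_NAME = re.compile(r"^[A-Za-z0-9._-]+$")


def preflight(dataset_dir):
    """Return a list of human-readable problems found under dataset_dir."""
    root = Path(dataset_dir)
    problems = []

    # A readme (README, READ-ME, read_me.txt, ...) should sit with the data.
    def looks_like_readme(path):
        name = path.name.lower().replace("-", "").replace("_", "")
        return path.is_file() and name.startswith("readme")

    if not any(looks_like_readme(p) for p in root.iterdir()):
        problems.append("no readme file at the top level")

    # Walk the whole tree and flag names with spaces or unusual characters.
    for path in sorted(root.rglob("*")):
        if not SAFE_NAME.match(path.name):
            problems.append("awkward name: {}".format(path.relative_to(root)))

    return problems
```

You would call it as `preflight("/g/data/ua8/CLEX_Data/<your-dataset>/tmp")`, substituting your own dataset directory. An empty list means only that these two basic checks passed. The CF compliance of the file metadata still needs a proper CF checker.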
We are using a new version of the DMP tool. If you already had an account on the ARCCSS DMPonline, you should be able to log in with the same e-mail and password. If you have an issue with your password, you can reset it using the same e-mail as before. If you do not yet have an account, please use your university e-mail; only users from CLEx and the ARCCSS, or approved collaborators, can create an account.
If your e-mail is not on the approved list, Paola will receive an e-mail and will add your address. If you are in doubt, don't hesitate to ask for help by e-mailing our Helpdesk: firstname.lastname@example.org.
- Think carefully about the license and rights terms. You'll find some options on the form itself; contact the helpdesk if other collaborators are involved or if there are any other special terms to take into account. The license will be virtually null if CLEx and/or your university do not hold the copyright.
- If you don't have one already, create an ORCID identity, which we can list in the record as well. The ARC is now encouraging the use of researcher identities to reference your whole body of work when applying for a grant.
- Give some thought to what you want to publish: as well as satisfying the journal requirements, you want to make sure you include anything that could be useful to other researchers. This will increase the value of your data and potentially get more people to cite you. Look here for guidelines.
Managing your data is an essential part of the publishing process; for more detailed information, go to the Data management induction training.
You can find more information on geonetwork, RDA, THREDDS and the associated services on our data management tools page.
I am still working on this page and the DMPOnline tool; any feedback on both is welcome! Look also at the other wiki pages under data services dedicated to the tool, data management in general and researcher identities for more information.