Difference between revisions of "Publishing options"

 
(28 intermediate revisions by the same user not shown)
Line 1: Line 1:
[[Category: Data]]
 
  
Journal editors have recently updated their data policies and now require authors to make available the data relating to a submitted paper. From the [http://publications.agu.org/author-resource-center/publication-policies/data-policy/ AGU data policy]: "..all data necessary to understand, evaluate, replicate, and build upon the reported research must be made available and accessible whenever possible ..."
+
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">The main reasons to publish data are to share it with others and/or to meet a requirement from a publisher, funder or your own institution. The specific requirements are covered in the [[Institution_data_requirements|institutional policies]] and&nbsp;</span></span><span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">[[Publisher_policies|journal&nbsp;policies]]&nbsp;pages. While these requirements might differ in some details, they are all based on the [[FAIR|FAIR principles]]. We can help you decide how to publish your data or code in a way that satisfies these principles. This will depend&nbsp;on the kind of data and the reasons you want to publish.</span></span>
  
The aim of this change in the policy is to satisfy the principle that someone reading the paper should be able to reproduce your experiment. Again from the AGU policy: "For the purposes of this policy, data include, but are not limited to, the following:
+
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">If you are unsure about exactly which data you should be publishing, the [[CLEX_Data_policy|CLEX data policy]] and our [[Which_data_should_I_publish|guidelines]] should help you decide.</span></span>
  
*<span style="font-family: Arial,Helvetica,sans-serif;">Data used to generate, or be displayed in, figures, graphs, plots, videos, animations, or tables in a paper.</span>
+
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">Whichever way you choose, remember to report your record to CLEVER (see below).</span></span>
*<span style="font-family: Arial,Helvetica,sans-serif;">New protocols or methods used to generate the data in a paper.</span>
 
*<span style="font-family: Arial,Helvetica,sans-serif;">New code/computer software used to generate results or analyses reported in the paper.</span>  
 
*<span style="font-family: Arial,Helvetica,sans-serif;">Derived data products reported or described in a paper. "</span>  
 
  
There can be practical and even copyright limitations to doing this, but these can be taken into account and should not be an impediment to publication if properly documented. The JGR-Space Physics editor-in-chief has listed some of these challenges and clarified the scope of the policy on his [https://liemohnjgrspace.wordpress.com/category/publication-policy/ blog]. This includes references to model data, which apply to much of the CLEx data as well.
+
=== <span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">'''Where should I publish my data?'''</span></span> ===
  
The [[Publisher_policies|publishing policies]] page offers a&nbsp;list of data policies and&nbsp;requirements by publisher.
+
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">There are at least three&nbsp;options and no single right answer: it depends on what you are publishing and why.</span></span>
  
 +
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;"><u>CLEX&nbsp;Data Collection on NCI</u></span></span>
  
=== '''How to publish''' ===
+
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">&nbsp; This is the best option if you have data in NetCDF format and this data could be useful for other researchers. We will help you document&nbsp;your data and make it user friendly;&nbsp;it will be part of a climate data collection and so it will be easier to discover. NCI also has more storage capacity than&nbsp;other repositories, and offers services designed around the NetCDF&nbsp;format.</span></span>
  
NCI is now providing web services to publish data and metadata. These include a [https://geonetwork.nci.org.au geonetwork catalogue] to describe your data (i.e. a metadata repository) and provide links to other descriptions and to the dataset access point. Once you have a geonetwork record, NCI can mint a DOI for the dataset; as for papers, a DOI makes the dataset easy to cite. The files can be made accessible to the public through the NCI [http://dap.nci.org.au/thredds/remoteCatalogService?catalog=http: TDS catalogue]&nbsp;(THREDDS).
+
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;"><u>Institutional repository</u></span></span>
  
We also create a collection record on Research Data Australia (RDA), a metadata catalogue service provided by ANDS. We do this because RDA has more visibility than the NCI catalogue, including being harvested by the new [https://toolbox.google.com/datasetsearch Google dataset search toolbox]. All these records reference each other, so no matter where a user finds your record, they will always reach the same DOI, data access point and information.
+
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">&nbsp; This is adequate if you have a small dataset that is really specific to your study, or is only a subset or post-processed version of another dataset. While institutions offer some data curation, they usually will&nbsp;not check that the&nbsp;data&nbsp;is well described, consistent and user friendly, so you might get a DOI for your record&nbsp;but no added value. If your dataset is bigger than 50-100 GB, you might not be able to publish it with one of these repositories.</span></span>
  
There are a few necessary steps to make your data available:
+
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;"><u>Zenodo, Figshare, Mendeley</u></span></span>
  
#Create a data management plan for your collection using the [https://clex.dmponline.cloud.edu.au/ CLEX Roadmap]&nbsp;tool, if you don't have one already. If you do, make sure you have filled in the third phase of the plan, which deals with the publishing details. When you're ready, share the plan with me (paola.petrelli@utas.edu.au). I use the DMP to collect the information needed to create a metadata record on RDA on your behalf. You can look at one of the [https://researchdata.ands.org.au/access13b-model-output-experiment-v10/453996 ARCCSS/CLEX records] as an example of the kind of information required.
+
&nbsp;&nbsp;<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">These services are free, and you can create your own account and&nbsp;a record for your data, and get a DOI fairly easily and quickly. You can also publish different kinds of material here. This can be useful if you want to publish some very specific data, for example the code and data used to produce a specific figure required to publish&nbsp;a paper.&nbsp;</span></span>
#We will use the information in the plan to automatically generate the NCI and RDA records and a directory in /g/data/ua8/CLEX_Data/<your-dataset>, which will also contain a draft readme file and the license attached to the data. You need to request access to the ua8&nbsp;project via&nbsp;[https://my.nci.org.au/ https://my.nci.org.au].
 
#Move the data to your dataset directory: /g/data/ua8/CLEX_Data*/<your-dataset>/tmp . * This could also be ARCCSS_Data, depending on your affiliation.
 
#Get the dataset to a quality level good enough to share it successfully: metadata in the files should satisfy the&nbsp;[http://cfconventions.org/ CF conventions], there should be a README file or some other data description sitting with the data, and both the directory structure and the filenames should be understandable and contain information on the data. You can use a&nbsp;[[CF_checker|CF checker ]]&nbsp;yourself, or&nbsp;we can&nbsp;check your files&nbsp;and tell you if anything needs to be done.
 
#Once the files are ready, we will ask NCI to run their QA/QC checker on them. If they pass these quality checks, the final version is copied to /g/data/ks32/CLEX_Data/ and the&nbsp;dataset is added to the THREDDS catalogue.
 
  
We are using a new version of the DMP tool. If you already had an account on the ARCCSS DMPonline, you should be able to log in with the same e-mail and password. If you have an issue with your password, you can [https://clex.dmponline.cloud.edu.au/users/password/new reset it],&nbsp;still using the same e-mail as before.&nbsp;If you do not yet have an account, please use your university e-mail; only users from CLEx&nbsp;and the ARCCSS, or approved collaborators, can create an account.
+
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">However, no&nbsp;standards are required and no one checks your metadata. This means that it is up to you to make sure your data</span></span><span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">&nbsp;is as&nbsp;[[Data_terminology|FAIR]]&nbsp;as possible. That means [[FAIR_-_Findable|'''Findable''']], which is harder when your record is not part of a discipline repository or collection, or when you haven't used keywords effectively. It also means [[FAIR_-_Accessible|'''Accessible''']], [[FAIR_-_Interoperable|'''Interoperable''']] and '''[[FAIR_-_Reusable|Reusable]],'''&nbsp;so the data should have enough metadata, use discipline standards and&nbsp;be&nbsp;properly described.</span></span>
  
If your e-mail is not in the approved list, Paola will receive an e-mail and will add it. If you are in doubt, don't hesitate to ask for help by e-mailing our Helpdesk: cws_help@nci.org.au.
+
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">Finally, as with institutional repositories, the&nbsp;data size is limited to 50 GB and you will&nbsp;not get any additional data services apart from HTTP&nbsp;download. If you decide to go this way,&nbsp;please make sure you document your data properly;&nbsp;we are happy to provide support and review your record. If you use&nbsp;Zenodo, you can easily add your record to our Data Collection (see below).</span></span>
  
Other advice:
+
<span style="font-size:medium;"><u><span style="font-family:Arial,Helvetica,sans-serif;"><span style="caret-color:#000000"><span style="color:#000000">CLEX Data Collection</span></span></span></u></span>
  
#Think carefully about the [[Open_access_licenses|license and rights terms]]; you'll find some options on the form itself. Contact the helpdesk&nbsp;if other collaborators are involved or there are any other special terms to be taken into account. The license will be virtually null if CLEx&nbsp;and/or your university do not hold the copyright.
+
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;"><span style="caret-color:#000000"><span style="color:#000000">&nbsp;We have started a [https://zenodo.org/communities/arc-coe-clex-data/?page=1&size=20 CLEX Data Collection in Zenodo] to collect all our data records in one place, regardless of how they have been published. Having one place where all our data is listed makes it easier for anyone to discover CLEX data outputs, both external potential users and our own researchers and students.&nbsp;</span></span>If you have already published&nbsp;<span style="caret-color:#000000"><span style="color:#000000">your data with your own institution and/or with one of the freely available services, like Zenodo itself, let us know and we will list your record in our collection. We will use your original metadata record, data&nbsp;access URL and existing&nbsp;DOI (if available) as the official source.</span></span></span></span>
#If you don't have one already,&nbsp;create an&nbsp;[http://orcid.org/ ORCID]&nbsp;identity, which we can list in the record as well. The ARC is now encouraging&nbsp;the use&nbsp;of researcher identities to reference your whole body of work when applying for a grant.
 
#Give some thought to what you want to publish. As well as satisfying the journal requirements, you want to make sure to include anything that could be useful to other researchers. This will increase the value of your data and potentially get more people to cite you. Look [[Which_data_publish|here]] for guidelines.
 
  
Managing your data is an essential part of the publishing process. For more detailed information, go to the '''[[Data_induction|Data management induction training]].'''
+
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">If you are unsure, feel free to ask us; we are always happy to provide advice and support.</span></span>
  
You can find&nbsp;more information on geonetwork, RDA, thredds and the associated services in our [[Data_management_tools|data management tools]]&nbsp;page.
+
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">'''Where should I publish my code?'''</span></span>
  
'''I am&nbsp;still working on this page and the DMPOnline tool; any feedback on both is welcome!''' See also the other wiki pages under data services dedicated to the tool, data management in general and researcher identities for more information.
+
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">Like data, code represents part of your work, and funders are starting to&nbsp;look&nbsp;at all research products, not only papers, when reviewing grant applications. Some journals also require you to publish your code alongside the data. Putting your code on GitHub or another version control service helps you keep track of the code, expose it to others and manage potential issues and&nbsp;enhancements. However,&nbsp;GitHub is not ideal if you want to pinpoint the exact version of the code you used for a paper or to create some data.</span></span>
 +
 
 +
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">We started a Zenodo community for a&nbsp;[https://zenodo.org/communities/arc-coe-clex/?page=1&size=20 CLEX Code Collection (CCC)]&nbsp;in 2020. Zenodo is a platform that will mint a DOI for your code and integrates well with GitHub. Initially we published some of our own codes and&nbsp;code used to produce papers, as&nbsp;required by journal editors. We are now looking to broaden this and actively seek contributions of code, notebooks, etc. that might be useful to others. Zenodo has given our codes much more visibility than GitHub, and some of the codes have&nbsp;lots of views and downloads.</span></span>
 +
 
 +
'''<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">Publishing options diagram</span></span>'''
 +
 
 +
[[File:Where to publish.jpeg|800px|Publishing options]]
 +
 
 +
&nbsp;
 +
 
 +
 
 +
=== <span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">'''How to publish'''</span></span> ===
 +
 
 +
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">We do not provide specific advice here on publishing with an institutional repository or other non-specific repositories. However, the steps involved in both preparing the files and filling in the metadata form will be very similar to what we cover in "Publishing data in Zenodo", so you can refer to that wiki page and the step-by-step guide linked from it for useful advice.</span></span>
 +
 
 +
*<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">[[Publishing_with_NCI|Publishing with NCI]]</span></span>
 +
*[[Institution_data_requirements|<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Publishing with your institution</span></span>]]
 +
*<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">[[Publishing_software|Publishing code in Zenodo CLEX Code Collection]]</span></span>
 +
*[[Publishing_data_in_Zenodo|<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Publishing data in Zenodo CLEX Data Collection</span></span>]]
 +
 
 +
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">'''Reporting to CLEVER'''</span></span>
 +
 
 +
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">Whichever way you decide to publish your data, as part of the NCI collection or with a repository provided by your institution, remember to add your published record to [https://climateextremes.org.au/clever-dashboard/ CLEVER],&nbsp;the CLEX reporting hub, in the "Publications and Datasets" section.</span></span>
 +
 
 +
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">You will need to record only the main information: author, title, DOI&nbsp;and citation.&nbsp;It will only take a couple of minutes.&nbsp;Published datasets are part of the Centre&nbsp;KPIs and something we have to report to our funders.</span></span>
 +
 
 +
&nbsp;
 +
 
 +
[[Category:Data induction]]

Latest revision as of 00:39, 14 July 2021

The main reasons to publish data are to share it with others and/or to meet a requirement from a publisher, funder or your own institution. The specific requirements are covered in the institutional policies and journal policies pages. While these requirements might differ in some details, they are all based on the FAIR principles. We can help you decide how to publish your data or code in a way that satisfies these principles. This will depend on the kind of data and the reasons you want to publish.

If you are unsure about exactly which data you should be publishing, the CLEX data policy and our guidelines should help you decide.

Whichever way you choose, remember to report your record to CLEVER (see below).

Where should I publish my data?

There are at least three options and no single right answer: it depends on what you are publishing and why.

CLEX Data Collection on NCI

This is the best option if you have data in NetCDF format and this data could be useful for other researchers. We will help you document your data and make it user friendly; it will be part of a climate data collection and so it will be easier to discover. NCI also has more storage capacity than other repositories, and offers services designed around the NetCDF format.

Institutional repository

This is adequate if you have a small dataset that is really specific to your study, or is only a subset or post-processed version of another dataset. While institutions offer some data curation, they usually will not check that the data is well described, consistent and user friendly, so you might get a DOI for your record but no added value. If your dataset is bigger than 50-100 GB, you might not be able to publish it with one of these repositories.

Zenodo, Figshare, Mendeley

These services are free, and you can create your own account and a record for your data, and get a DOI fairly easily and quickly. You can also publish different kinds of material here. This can be useful if you want to publish some very specific data, for example the code and data used to produce a specific figure required to publish a paper.

However, no standards are required and no one checks your metadata. This means that it is up to you to make sure your data is as FAIR as possible. That means Findable, which is harder when your record is not part of a discipline repository or collection, or when you haven't used keywords effectively. It also means Accessible, Interoperable and Reusable, so the data should have enough metadata, use discipline standards and be properly described.

Finally, as with institutional repositories, the data size is limited to 50 GB and you will not get any additional data services apart from HTTP download. If you decide to go this way, please make sure you document your data properly; we are happy to provide support and review your record. If you use Zenodo, you can easily add your record to our Data Collection (see below).
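Since the size limit is often the deciding factor, it can help to measure a dataset directory before choosing a service. The following is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes decimal gigabytes (a repository may count sizes differently).

```python
import os

GB = 10**9  # assumption: decimal gigabytes; a given repository may use another convention


def dataset_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so linked data is not counted
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits_limit(size_bytes, limit_gb=50):
    """True if size_bytes is within limit_gb (50 GB is the limit quoted above)."""
    return size_bytes <= limit_gb * GB
```

For example, `fits_limit(dataset_size_bytes("my_dataset"))` would tell you whether a (hypothetical) directory `my_dataset` is within the 50 GB limit mentioned above.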

CLEX Data Collection

We have started a CLEX Data Collection in Zenodo to collect all our data records in one place, regardless of how they have been published. Having one place where all our data is listed makes it easier for anyone to discover CLEX data outputs, both external potential users and our own researchers and students. If you have already published your data with your own institution and/or with one of the freely available services, like Zenodo itself, let us know and we will list your record in our collection. We will use your original metadata record, data access URL and existing DOI (if available) as the official source.

If you are unsure, feel free to ask us; we are always happy to provide advice and support.
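The options in this section can be summarised as a rough rule of thumb. The sketch below is only an illustration of the reasoning described on this page, not a CLEX policy: the function name and the exact ordering of the checks are assumptions, and real cases often need a conversation with the data team.

```python
def suggest_repository(is_netcdf, useful_to_others, size_gb):
    """Suggest a publishing option from the three criteria discussed above.

    A deliberately simplified, hypothetical helper:
    - NetCDF data that other researchers could reuse goes to the NCI collection;
    - anything else above 50 GB exceeds the limits quoted for the other services;
    - small, study-specific data can go to an institutional or general repository.
    """
    if is_netcdf and useful_to_others:
        return "CLEX Data Collection on NCI"
    if size_gb > 50:
        return "ask the CLEX data team"
    return "institutional repository or Zenodo/Figshare/Mendeley"
```

For instance, a 2 GB set of files used to produce a single figure would return the last option, while a 500 GB NetCDF model output of general interest would return the NCI collection.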

Where should I publish my code?

Like data, code represents part of your work, and funders are starting to look at all research products, not only papers, when reviewing grant applications. Some journals also require you to publish your code alongside the data. Putting your code on GitHub or another version control service helps you keep track of the code, expose it to others and manage potential issues and enhancements. However, GitHub is not ideal if you want to pinpoint the exact version of the code you used for a paper or to create some data.

We started a Zenodo community for a CLEX Code Collection (CCC) in 2020. Zenodo is a platform that will mint a DOI for your code and integrates well with GitHub. Initially we published some of our own codes and code used to produce papers, as required by journal editors. We are now looking to broaden this and actively seek contributions of code, notebooks, etc. that might be useful to others. Zenodo has given our codes much more visibility than GitHub, and some of the codes have lots of views and downloads.
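When a GitHub repository is linked to Zenodo, the metadata of the archived release can be controlled with a `.zenodo.json` file in the repository root. The sketch below builds such a metadata dictionary in Python. It is an illustration under assumptions: the helper name is hypothetical, only a few common fields are shown, and the community identifier `arc-coe-clex` is inferred from the CCC community URL, so check it (and the accepted fields) against the Zenodo documentation before relying on it.

```python
import json


def zenodo_metadata(title, description, creators, community="arc-coe-clex"):
    """Build a minimal metadata dictionary for a .zenodo.json file.

    creators is a list of names, e.g. ["Surname, Given name"].
    The default community identifier is an assumption taken from the CCC URL.
    """
    return {
        "title": title,
        "description": description,
        "upload_type": "software",
        "creators": [{"name": name} for name in creators],
        "communities": [{"identifier": community}],
    }


if __name__ == "__main__":
    # Placeholder values only; replace them with your own details
    meta = zenodo_metadata(
        "Example analysis code",
        "Code used to produce the figures of an example paper.",
        ["Surname, Given name"],
    )
    # Print the JSON; save it as .zenodo.json in the repository root
    print(json.dumps(meta, indent=2))
```

Keeping the metadata in the repository means that every release archived by Zenodo carries the same title, authors and community without manual editing.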

Publishing options diagram

[Figure: Publishing options]

 


How to publish

We do not provide specific advice here on publishing with an institutional repository or other non-specific repositories. However, the steps involved in both preparing the files and filling in the metadata form will be very similar to what we cover in "Publishing data in Zenodo", so you can refer to that wiki page and the step-by-step guide linked from it for useful advice.

Reporting to CLEVER

Whichever way you decide to publish your data, as part of the NCI collection or with a repository provided by your institution, remember to add your published record to CLEVER, the CLEX reporting hub, in the "Publications and Datasets" section.

You will need to record only the main information: author, title, DOI and citation. It will only take a couple of minutes. Published datasets are part of the Centre KPIs and something we have to report to our funders.
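Before filling in the CLEVER form, it can be handy to gather the four items in one place and check that nothing is missing. The following is a minimal sketch: the class and method names are hypothetical, CLEVER itself is a web form rather than an API, and the DOI check is only a loose pattern match that does not confirm the DOI actually resolves.

```python
import re
from dataclasses import dataclass

# Loose pattern for DOI syntax ("10.", a registrant code, "/", a suffix);
# it does not verify that the DOI exists.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


@dataclass
class PublishedRecord:
    """The main information requested for a published dataset."""
    author: str
    title: str
    doi: str
    citation: str

    def problems(self):
        """Return a list of human-readable issues; empty if the record looks complete."""
        issues = [
            f"{name} is empty"
            for name in ("author", "title", "doi", "citation")
            if not getattr(self, name).strip()
        ]
        if self.doi.strip() and not DOI_PATTERN.match(self.doi.strip()):
            issues.append("doi does not look like a DOI")
        return issues
```

A record with all four fields filled in and a well-formed DOI returns an empty list from `problems()`, so it is ready to be copied into the form.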