Difference between revisions of "Leaving the Centre guidelines"

Revision as of 20:34, 11 March 2020


Sorting your data

Find out what you will be required to do well before you leave, so you can prepare for it. Sorting through your files might take longer than you think.

You might need to discuss the following questions with your supervisor:

  1. Will you require access to the files after you leave (e.g. for a paper review)?
  2. What files will be useful to others? Should they be published?
  3. What files won't be useful to others? Should they be archived or deleted?

If you need specific advice on how to actually transfer or archive your files, you can always ask our helpdesk for assistance: [1].

You require access to the files after you leave


Files at NCI

When you leave, you will keep your NCI credentials as long as you keep your contact details up to date through https://my.nci.org.au.

Make sure to quit all NCI projects you don't need access to anymore. You can be ruthless as it is quite easy to gain access to those later on if needed.

Some projects have strict license terms attached (e.g. access). Your membership might be revoked at any time after you leave the Centre if you haven't negotiated extended access to those projects. Please contact the Lead CI or the project manager to discuss your needs.

Files at your institution

Most universities will close your university and e-mail accounts when you leave. University data services are often accessible only via your university account. If that is the case, you need to arrange access for yourself before you leave, by contacting the IT services or the CI of the projects you used to deposit data. Specifics on university data services and advice on what happens when you leave can be found by following the relevant link on the data services page.

Your files will be used by others


Publish the files

You should publish your files if they are frequently used by several other people. Ideally, that data should be published as soon as its usefulness to others is clear, not just when you leave.

If your files are at NCI, that is all you need to do. If your files are held at your institution, you may need to make sure there is a copy that is accessible by others and owned by someone who is likely to stay for years to come. This might depend on your institution's data services.

Change ownership of the files

If someone takes over from you and simply uses your files to continue related work, you only need to change the ownership of the files. If the files are at NCI, you cannot change their ownership yourself: you and the Lead CI of the project owning the files will need to contact [[2]] so NCI staff can do it for you.

Files not useful anymore


It can be difficult to know what to keep and what to delete. A general rule might be to keep what is needed to reproduce your work and delete everything else. It gets complicated as this should be weighed against the cost, in money and time, of reproducing your work. This rule still allows us to clearly identify files that always need to be kept and files that never need to be kept.

To keep, always

Codes

There are 2 types of codes: codes distributed by others (e.g. climate models) and codes written by you.

For codes distributed by others, you simply need to keep a reference to the version used as long as no modification to the code was made by you. If you modified the code and this modification is part of a standard version of the model, you might be able to simply reference this version as long as it is exactly the version you used. If you modified the code for yourself only, this is now a code written by you and falls into the second category.

For codes written by you, the simplest approach is to save the code in a Git repository on GitHub, then publish this repository via Zenodo. We can help with the publication. It is absolutely fine to create one repository per project or paper containing all your codes. You can then use the README file of the GitHub repository to explain how to reproduce your results. Don't forget to clearly reference everything one might need in addition to this repository. Alternatively, you can have one repository per code, especially if you expect to reuse the same code for other work.

Configuration files for running the codes and some input files

In addition to the codes themselves, you need to keep everything that enables someone to run the codes in the same way you did. Usually, the most complicated configurations are for climate models. Some climate models save your configurations in version control repositories (e.g. UM, ACCESS, ACCESS-OM2), in which case you simply need to keep the information on how to retrieve these configurations. Other models don't save your configurations and you need to do it yourself.

Some input files are published data, in which case you need to keep a reference to this data (including the version). If you have written several codes, the output of one piece of code will be the input of the next, in which case you do not necessarily need to keep that data. But you do need to keep the information on your workflow.
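As an illustration only, the distinction between inputs to reference, intermediate files that can be regenerated, and final outputs could be recorded in a small machine-readable manifest. The sketch below is not a tool provided by the Centre; the script names, file names and the manifest layout are all hypothetical.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Step:
    """One piece of code in the workflow (all names here are hypothetical)."""
    code: str                                    # script or program that was run
    inputs: list = field(default_factory=list)   # files or datasets it reads
    outputs: list = field(default_factory=list)  # files it writes


def classify_files(steps):
    """Split the files of a workflow into three groups."""
    produced = {f for s in steps for f in s.outputs}
    consumed = {f for s in steps for f in s.inputs}
    return {
        # read but never produced: keep a reference, including the version
        "external_inputs": sorted(consumed - produced),
        # produced by one step and read by another: can be regenerated
        "intermediates": sorted(produced & consumed),
        # produced and not read by any later step
        "final_outputs": sorted(produced - consumed),
    }


# Hypothetical two-step workflow
steps = [
    Step("regrid.py", inputs=["published_dataset_v1"], outputs=["regridded.nc"]),
    Step("climatology.py", inputs=["regridded.nc"], outputs=["climatology.nc"]),
]

manifest = {"steps": [asdict(s) for s in steps], "files": classify_files(steps)}
print(json.dumps(manifest, indent=2))
```

A manifest like this could be stored next to the README of the code repository, so the workflow information survives even if the intermediate files are deleted.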

A description of your workflow



To delete, always


If you are leaving, or even if you are only changing position in the Centre, one or more of the following might apply to you.

NCI account

Your NCI user-id will stay the same unless you (or your new institution) specifically ask for it to be changed. NCI will suspend your account once you are no longer a member of any group or once your contact details are out of date. Projects occasionally review their active members by sending e-mails, so keep your contact e-mail updated through my.nci.org.au.

Leaving a project

If you leave a specific project:

  • Tidy up and document the files and directory structure, and contact the lead-CI or project representative
  • If you haven’t already, set r-X group access to all your files and directories
  • If the project representative agrees, ask help@nci.org.au to transfer ownership of your files to someone else in the project (specify the project and filesystem)
  • If you want to transfer files externally
    • Use sftp, scp or rsync to transfer files securely (rsync can be resumed)
    • Use the dedicated data-mover nodes, g-dm.nci.org.au, for large file transfers
    • Use copyq if you want to queue a job
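The group-access step above can be sketched in Python. This assumes that "r-X group access" means group read on everything, plus group execute on directories and on files that are already executable (the behaviour of `chmod -R g+rX`). The function names are hypothetical and symbolic links are deliberately left untouched.

```python
import os
import stat
from pathlib import Path


def group_readable_mode(mode: int, is_dir: bool) -> int:
    """Return `mode` with group read added, plus group execute for
    directories and for files that someone can already execute."""
    new_mode = mode | stat.S_IRGRP
    any_exec = mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    if is_dir or any_exec:
        new_mode |= stat.S_IXGRP
    return new_mode


def open_to_group(top: str) -> int:
    """Apply group_readable_mode to `top` and everything below it.
    Returns the number of entries whose mode was changed."""
    paths = [Path(top)]
    for root, dirs, files in os.walk(top):
        paths.extend(Path(root) / name for name in dirs + files)

    changed = 0
    for path in paths:
        if path.is_symlink():
            continue  # do not follow or modify symbolic links
        mode = stat.S_IMODE(path.stat().st_mode)
        new_mode = group_readable_mode(mode, path.is_dir())
        if new_mode != mode:
            os.chmod(path, new_mode)
            changed += 1
    return changed
```

For example, `open_to_group("my_experiment")` (a placeholder directory name) would change a private data file from mode 600 to 640 and a private directory from 700 to 750, leaving other permission bits as they were.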

Leaving your institution