This page lists FAQs for the different models supported by the CMS team, as well as FAQs about accessing and publishing data.
- 1 FAQ pages
- 2 FAQ miscellaneous
- 2.1 How to debug issues with gadi_jupyter?
- 2.2 How do I run independent jobs in parallel?
- 2.3 How to transform 1D lat/lon arrays into 2D lat/lon arrays with Python?
- 2.4 How do I push my local code changes on gadi back to github using ssh keys?
- 2.5 Why do I get segfaults?
- 2.6 No files found or non-existent path error while running a PBS job
- 2.7 Weird errors for a script that was previously working
- 2.8 Running out of memory when running a relatively small timeseries analysis using xarray/dask
How to debug issues with gadi_jupyter?
The script submitted to PBS and the PBS logs are saved in the tmp/runjp directory under the user's /scratch directory for the project used to run the script. These files often have more information when the job started correctly but you cannot connect to the Jupyter Notebook. This usually happens when the job started but then exited because of an error in the script.
For example, let's say I (ccc561) request gadi_jupyter to run using the project w35. The information will then be stored under /scratch/w35/ccc561/tmp/runjp
If I then run gadi_jupyter again but this time running on w40, the information will be stored under /scratch/w40/ccc561/tmp/runjp
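The path pattern above can be written down as a small helper. This is only a sketch: the function name runjp_log_dir is made up for this example, and the project and user values are the ones from the example above.

```python
from pathlib import Path


def runjp_log_dir(project: str, user: str) -> Path:
    """Build the directory where gadi_jupyter stores its PBS script and logs."""
    return Path("/scratch") / project / user / "tmp" / "runjp"


# The two cases described above
print(runjp_log_dir("w35", "ccc561"))
print(runjp_log_dir("w40", "ccc561"))
```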
How do I run independent jobs in parallel?
Suppose you have to run the same script with many independent sets of inputs, or you have many similar scripts to run at NCI, and each run uses a single processor. You could submit each job independently via PBS, but small jobs are not efficient at NCI. It is probably better to have a master script that dispatches your jobs to the available processors. You then submit this master script to PBS as a single job requesting more processors, which the scheduler may handle more efficiently. There is an example of such a master script on this page.
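As an illustration of the idea, here is a minimal Python sketch of a master script. It is not the example referred to above. The helper names run_command and run_all are made up, the jobs are placeholder commands, and the PBS_NCPUS environment variable is assumed to be set inside a PBS job (the sketch falls back to the local CPU count if it is not).

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_command(cmd):
    """Run one independent job as a separate process and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


def run_all(commands, max_workers=None):
    """Run the commands with at most max_workers at a time.

    Exit codes are returned in the same order as the input commands.
    """
    if max_workers is None:
        # Assumption: PBS_NCPUS holds the number of CPUs requested for the job
        max_workers = int(os.environ.get("PBS_NCPUS", os.cpu_count() or 1))
    # Threads only wait on the child processes; the real work runs in the children
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_command, commands))


if __name__ == "__main__":
    # Placeholder jobs: replace with your own script and input sets
    jobs = [
        [sys.executable, "-c", f"print('processing input set {i}')"]
        for i in range(4)
    ]
    print(run_all(jobs))
```

Each job runs as its own process, so the script works for any single-processor program, not only Python ones. A non-zero entry in the returned list marks a job that failed.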
How to transform 1D lat/lon arrays into 2D lat/lon arrays with Python?
If using numpy or masked arrays, one can use the resize() function. Assuming slat and slon are 1D latitude and longitude arrays, do this:
import numpy
tmp = numpy.resize(slat, (slon.size, slat.size))
slat2D = numpy.transpose(tmp)
slon2D = numpy.resize(slon, (slat.size, slon.size))
If you want masked arrays, replace "numpy." by "numpy.ma."
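An alternative for plain numpy arrays is numpy.meshgrid, which should give the same 2D arrays in one call. The sketch below uses made-up sample values for slat and slon.

```python
import numpy as np

# Made-up sample coordinates: 3 latitudes and 4 longitudes
slat = np.array([-10.0, 0.0, 10.0])
slon = np.array([100.0, 110.0, 120.0, 130.0])

# With the default indexing, both outputs have shape (slat.size, slon.size):
# every row of slon2D is slon, and every column of slat2D is slat
slon2D, slat2D = np.meshgrid(slon, slat)

print(slat2D.shape, slon2D.shape)
```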
How do I push my local code changes on gadi back to github using ssh keys?
You have an account on github. You have cloned a repository to gadi and made changes to it. Now you want to push those changes back to the github repository, but you get errors like this (where <username> is your github username):
error: The requested URL returned error: 403 Forbidden while accessing https://github.com/<username>/test.git/info/refs
fatal: HTTP request failed
First, you need to make an ssh key and add the public part of the key to your account on github. Then check the url to which you are trying to push your changes:
git remote -v
origin https://github.com/<username>/test.git (fetch)
origin https://github.com/<username>/test.git (push)
If it looks like the output above you need to alter the url and put in a login name in front of the address. For ssh access the login name is always git. The command to set the remote url is:
git remote set-url origin ssh://git@github.com/<username>/test.git
but replace <username> with your github username.
Why do I get segfaults?
A segmentation fault means the program you are using has tried to read or write a memory location it does not have access to. This can mean there is an error in the code, or that a system limit is too restrictive. If the program runs for others but not for you, it may well be a system limit issue. This seems to be quite common when trying to write netCDF files. In some cases this can be overcome by setting your stack size to unlimited.
For tcsh/csh add this to your .cshrc:
limit stacksize unlimited
For bash add this to your .bashrc:
ulimit -s unlimited
Note: you must log out and log back in again for this change to take effect.
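To check which stack limit a process actually sees after logging back in, something like the following sketch may help. It uses Python's standard resource module (Unix only). The helper name describe_limit is made up for this example.

```python
import resource


def describe_limit(value):
    """Format a limit given in bytes the way `ulimit -s` reports it (KiB)."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KiB"


# Soft limit is the one currently enforced; hard limit is the ceiling it can be raised to
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
print("soft stack limit:", describe_limit(soft))
print("hard stack limit:", describe_limit(hard))
```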
No files found or non-existent path error while running a PBS job
When submitting a job on gadi you need to declare explicitly which gdata and/or scratch projects your script needs to access.
This is done by setting the storage flag in the job you are submitting:
#PBS -l storage=gdata/hh5+gdata/ua8+gdata/e14
In the above case I want to use input data from gdata/ua8, save my results to gdata/e14, and use the conda modules which are in gdata/hh5.
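If you generate job scripts programmatically, the directive can be assembled from a list of projects. The helper below is a hypothetical sketch, not part of any NCI tool. The scratch/<project> form is an assumption modelled on the gdata entries shown above.

```python
def storage_directive(gdata=(), scratch=()):
    """Build a PBS storage directive from lists of project codes."""
    parts = [f"gdata/{p}" for p in gdata]
    # Assumption: scratch projects follow the same <filesystem>/<project> pattern
    parts += [f"scratch/{p}" for p in scratch]
    if not parts:
        raise ValueError("at least one project is required")
    return "#PBS -l storage=" + "+".join(parts)


# Reproduces the directive from the example above
print(storage_directive(gdata=["hh5", "ua8", "e14"]))
```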
Weird errors for a script that was previously working
Make sure you have loaded the right modules and only them! Avoid loading modules directly in your .bash_profile or .bashrc files. They will be loaded every time you log in and can occasionally interfere with the modules you are loading for a specific task.
It is useful and safe to add instructions such as
module use /g/data/hh5/public/modules
as this makes the modules available without loading any.
Sometimes you need to install a Python module in your user space ( $HOME/.local/... ) because it is not available in a managed conda environment. Occasionally, modules stored in your local user space can interfere with other loaded modules, for example if they have different dependencies. If you find this is causing issues, you can exclude the local environment by setting this environment variable:
PYTHONNOUSERSITE=x ( =True and =1 should also work as values)
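To confirm that the variable has the intended effect, you can start a fresh interpreter with it set and inspect sys.flags.no_user_site. The sketch below does that. The helper name user_site_disabled is made up for this example.

```python
import os
import subprocess
import sys


def user_site_disabled(env_value):
    """Start a fresh interpreter with PYTHONNOUSERSITE=env_value.

    Returns True if that interpreter reports the user site directory as disabled.
    """
    env = dict(os.environ, PYTHONNOUSERSITE=env_value)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.flags.no_user_site)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "1"


for value in ("x", "True", "1"):
    print(value, user_site_disabled(value))
```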
Running out of memory when running a relatively small timeseries analysis using xarray/dask