http://climate-cms.wikis.unsw.edu.au/api.php?action=feedcontributions&user=Aidanheerdegen&feedformat=atomclimate-cms wikis.unsw.edu.au - User contributions [en]2024-03-28T18:45:41ZUser contributionsMediaWiki 1.31.0http://climate-cms.wikis.unsw.edu.au/index.php?title=NetCDF_Compression_Tools&diff=253NetCDF Compression Tools2019-01-08T00:50:33Z<p>Aidanheerdegen: Imported from Wikispaces</p>
<hr />
<div>==Why compress netCDF files?== <br />
<br />
Space. Your supervisor/CI/system administrator is telling you your files are taking too much space. Compressing your netCDF data files will shrink them to about one third of their original size. This is the equivalent of being given three times as much disk space.<br />
<br />
NetCDF compression is lossless: the data read back is identical to the original, and it can still be read using the same programming interface. As long as the program reading the data has been compiled with the netCDF version 4 library, decompression is handled transparently by the library, and as far as the program is concerned there is no difference in the data. The usual tools, such as ncdump, can be used to examine the variables contained within the netCDF file. However, if you rely on an old piece of software that you think may not have been compiled with netCDF4, you should test that it can read compressed netCDF4 files before converting all your data.<br />
<br />
It is possible to simply compress the entire file on disk with tools such as gzip. This has the disadvantage that the file must be decompressed to be read and then recompressed again when you have finished, which can be time consuming and degrade your productivity, not to mention the data in question will take up much more room while it is being analysed. This is not recommended.<br />
<br />
=General guidelines= <br />
<br />
The netCDF library has several options for compressing data, which all compression tools expose, as they all use the underlying library to perform the compression. There is a [http://www.unidata.ucar.edu/blogs/developer/en/entry/netcdf_compression | more detailed explanation] if you wish to understand more, but briefly:<br />
<br />
===Deflate level=== <br />
<br />
This is an integer value ranging from 0 to 9. A value of 0 means no compression, and 9 is the highest level of compression possible. The higher this value, the smaller your file will be once compressed. However, there is a trade-off: the higher the deflate level, the longer compression takes, particularly at very high deflate levels. At deflate level 9 it can take six times longer to compress the data, with only a few percent improvement in compression. The recommended deflate level is 5, which combines good compression with a small increase in compression time.<br />
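The size/time trade-off can be illustrated without netCDF at all, because netCDF4 deflation uses the same DEFLATE algorithm as Python's standard zlib module. The sketch below compresses synthetic smooth data (a stand-in for model output) at levels 1, 5 and 9; it illustrates the trade-off only, not netCDF itself.<br />

```python
import math
import time
import zlib

# Synthetic stand-in for model output: a smooth field, which compresses
# well, much like many geophysical variables.
data = bytes((128 + int(100 * math.sin(i / 50.0))) % 256 for i in range(1_000_000))

for level in (1, 5, 9):
    start = time.perf_counter()
    size = len(zlib.compress(data, level))
    elapsed = time.perf_counter() - start
    print(f"deflate level {level}: {size} bytes in {elapsed:.3f} s")
```

On real data the curve is typically similar: most of the size reduction arrives well before level 9, while the time cost keeps growing.<br />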
<br />
===Shuffle=== <br />
<br />
Turn shuffle on. Simple. It usually results in a smaller compressed file with little performance overhead.<br />
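Why shuffle helps: in a smooth field, neighbouring values share their high-order bytes. The shuffle filter transposes the byte stream so those near-constant bytes sit together, giving deflate long runs to work with. Below is a stdlib-only sketch of the idea (not the actual HDF5 filter code), using the same zlib/DEFLATE algorithm that netCDF4 uses.<br />

```python
import struct
import zlib

# A smooth float64 field: neighbouring values share their high-order bytes.
values = [20.0 + 0.001 * i for i in range(50_000)]
raw = struct.pack(f"<{len(values)}d", *values)

# The shuffle filter transposes the byte stream: all first bytes, then all
# second bytes, and so on, grouping the near-constant bytes into long runs.
width = 8  # bytes per float64
shuffled = bytes(raw[j] for k in range(width) for j in range(k, len(raw), width))

plain = len(zlib.compress(raw, 5))
with_shuffle = len(zlib.compress(shuffled, 5))
print(f"no shuffle: {plain} bytes, with shuffle: {with_shuffle} bytes")
```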
<br />
===Chunking=== <br />
<br />
The netCDF library writes the data to disk in "chunks". There is a [http://www.unidata.ucar.edu/blogs/developer/entry/chunking_data_why_it_matters | very good description of chunking and how it works]. All you really have to know is that in order to use netCDF compression '''your data must be chunked'''. The question then is, do I care how the program I use chooses the size of my data chunks? The answer is almost certainly yes, but maybe not a lot. An optimal chunking strategy is [http://www.unidata.ucar.edu/blogs/developer/en/entry/chunking_data_choosing_shapes | largely determined by the structure of your data and how you will access it].<br />
<br />
Specific details about chunking strategies are largely dependent on the tool used to compress your data, and will be covered in more detail in the next section. However, all tools still utilise the underlying netCDF4 library, and so can implement the default chunking strategy, which has changed over time. For many versions the default strategy has been to create chunks that are simply the same size as the dimensions of the variable, which can be a disastrous choice in terms of performance if the data is also compressed. The entire variable must be read into memory to be uncompressed even if only a single slice is required.<br />
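The cost is easy to quantify with a back-of-envelope sketch. The shape below is the (time, lat, lon) = (365, 1080, 1440) daily field that appears in the ncinfo examples later on this page; float32 storage is assumed.<br />

```python
import math

# Hypothetical daily field shaped like the ncinfo examples on this page:
# (time, lat, lon) = (365, 1080, 1440), stored as 4-byte floats.
shape = (365, 1080, 1440)
bytes_per_value = 4

# With whole-variable chunks, reading ONE time slice still forces the
# library to read and decompress the entire variable.
whole_variable_chunk = math.prod(shape) * bytes_per_value
one_slice = shape[1] * shape[2] * bytes_per_value

print(f"wanted: {one_slice / 2**20:.0f} MiB, "
      f"must decompress: {whole_variable_chunk / 2**30:.1f} GiB")
```

Reading a single day costs roughly the whole variable, a factor of 365 more I/O and decompression than necessary.<br />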
<br />
=Compression tools= <br />
<br />
There are some software packages available on raijin that can be used to compress netCDF data:<br />
<br />
===nco=== <br />
<br />
The [http://nco.sourceforge.net | netCDF Operator (NCO) program suite] can compress netCDF files and has recently [http://nco.sourceforge.net/nco.html#Compression | included some ability to choose different chunking strategies]. For some cases this may be a reasonable solution based on the available options, but a weakness is the inability to use their optimised chunking strategy for variables with four dimensions or more.<br />
<br />
===cdo=== <br />
<br />
[https://code.zmaw.de/projects/cdo | Climate Data Operators] (cdo) can also compress netCDF files and offers limited chunking options: auto, grid or lines.<br />
<br />
===netcdf=== <br />
<br />
One of the standard tools included in a netCDF installation is [https://www.unidata.ucar.edu/software/netcdf/docs/netcdf/nccopy.html | nccopy]. nccopy can compress files and define the chunking using a command line argument (-c). nccopy is a good option if your data file structure changes little, so a chunking scheme can be decided upon and hard coded into scripts. It is not so useful if the dimensions and variables change. Another major limitation is that the chunking is defined by dimensions, not variables. If your data file has variables that share dimensions, but have different combinations or numbers of dimensions it is not possible to determine an optimal chunking strategy for each variable.<br />
<br />
===nccompress=== <br />
<br />
The nccompress package is available on raijin under [https://accessdev.nci.org.au/trac/wiki/Raijin%20Apps | ACCESS apps]. At present it consists of three python programs, ncfind, nc2nc and nccompress, written and supported by [http://www.climatescience.org.au/staff/profile/AHeerdegen | Aidan]. nc2nc can copy netCDF files with compression and an optimised chunking strategy that has reasonable performance for many datasets. It has two main limitations: it is slower than the other programs, and it can only compress netCDF3 or netCDF4 classic format files. There is more detail in the following sections.<br />
<br />
===ncinfo=== <br />
<br />
The convenience utility ncinfo is also included. Though it has no direct relevance to compression, it provides a quick summary of the contents of a netCDF file.<br />
<br />
==Identifying files to be compressed== <br />
<br />
ncfind, part of the nccompress package, can be used to find netCDF files and discriminate between compressed and uncompressed files:<br />
<br />
<syntaxhighlight><br />
$ ncfind -h<br />
usage: ncfind [-h] [-r] [-u | -c] [inputs [inputs ...]]<br />
<br />
Find netCDF files. Can discriminate by compression<br />
<br />
positional arguments:<br />
inputs netCDF files or directories (-r must be specified to<br />
recursively descend directories). Can accept piped<br />
arguments.<br />
<br />
optional arguments:<br />
-h, --help show this help message and exit<br />
-r, --recursive Recursively descend directories to find netCDF<br />
files (default False)<br />
-u, --uncompressed Find only uncompressed netCDF files (default False)<br />
-c, --compressed Find only compressed netCDF files (default False)<br />
<br />
</syntaxhighlight><br />
There are other methods for finding files, namely the unix find utility. For example, to find all files in the directory "directoryname" which end in ".nc":<br />
<syntaxhighlight><br />
find directoryname -iname "*.nc"<br />
<br />
</syntaxhighlight><br />
However, if your netCDF files do not follow the convention of ending in ".nc", or cannot be systematically found based on filename, you can use ncfind to recursively descend into a directory structure looking for netCDF files:<br />
<syntaxhighlight><br />
ncfind -r directoryname<br />
</syntaxhighlight><br />
You can refine the search further by requesting to return only those files that are uncompressed:<br />
<syntaxhighlight><br />
ncfind -r -u directoryname<br />
</syntaxhighlight><br />
If you want to find out how much space these uncompressed files occupy you can combine this command with other unix utilities such as xargs and du:<br />
<syntaxhighlight><br />
ncfind -r -u directoryname | xargs du -h<br />
</syntaxhighlight><br />
du is the disk usage utility. The output looks something like this:<br />
<syntaxhighlight><br />
67M output212/ice__212_223.nc<br />
1003M output212/ocean__212_223.nc<br />
1.1G total<br />
<br />
</syntaxhighlight><br />
It is even possible to combine the system find utility with ncfind, using a unix pipe (|). This command will find all files ending in ".nc", pipe the results to ncfind, and only those that are uncompressed will be printed to the screen:<br />
<syntaxhighlight><br />
find directoryname -iname "*.nc" | ncfind -u<br />
</syntaxhighlight><br />
<br />
==Batch Compressing files== <br />
<span id="nc_compress"></span><br />
Having identified where the netCDF files you wish to compress are located, there is a convenience program, nccompress, which can be used to easily step through and compress each file in turn:<br />
<syntaxhighlight><br />
$ nccompress -h<br />
usage: nccompress [-h] [-d {1-9}] [-n] [-b BUFFERSIZE] [-t TMPDIR] [-v] [-r]<br />
[-o] [-m MAXCOMPRESS] [-p] [-f] [-c] [-pa] [-np NUMPROC]<br />
[--nccopy]<br />
inputs [inputs ...]<br />
<br />
Run nc2nc (or nccopy) on a number of netCDF files<br />
<br />
positional arguments:<br />
inputs netCDF files or directories (-r must be specified to<br />
recursively descend directories)<br />
<br />
optional arguments:<br />
-h, --help show this help message and exit<br />
-d {1-9}, --dlevel {1-9}<br />
Set deflate level. Valid values 0-9 (default=5)<br />
-n, --noshuffle Don't shuffle on deflation (default is to shuffle)<br />
-b BUFFERSIZE, --buffersize BUFFERSIZE<br />
Set size of copy buffer in MB (default=50)<br />
-t TMPDIR, --tmpdir TMPDIR<br />
Specify temporary directory to save compressed files<br />
-v, --verbose Verbose output<br />
-r, --recursive Recursively descend directories compressing all netCDF<br />
files (default False)<br />
-o, --overwrite Overwrite original files with compressed versions<br />
(default is to not overwrite)<br />
-m MAXCOMPRESS, --maxcompress MAXCOMPRESS<br />
Set a maximum compression as a paranoid check on<br />
success of nccopy (default is 10, set to zero for no<br />
check)<br />
-p, --paranoid Paranoid check : run nco ndiff on the resulting file<br />
ensure no data has been altered<br />
-f, --force Force compression, even if input file is already<br />
compressed (default False)<br />
-c, --clean Clean tmpdir by removing existing compressed files<br />
before starting (default False)<br />
-pa, --parallel Compress files in parallel<br />
-np NUMPROC, --numproc NUMPROC<br />
Specify the number of processes to use in parallel<br />
operation<br />
--nccopy Use nccopy instead of nc2nc (default False)<br />
<br />
</syntaxhighlight><br />
The simplest way to invoke the program would be with a single file:<br />
<syntaxhighlight><br />
nccompress ice_daily_0001.nc<br />
</syntaxhighlight><br />
or using a wildcard expression:<br />
<syntaxhighlight><br />
nccompress ice*.nc<br />
</syntaxhighlight><br />
You can also specify one or more directory names in combination with the recursive flag (-r) and the program will recursively descend into those directories and find all netCDF files contained therein. For example, a directory listing might look like so:<br />
<syntaxhighlight><br />
$ ls data/<br />
output001 output003 output005 output007 output009 restart001 restart003 restart005 restart007 restart009<br />
output002 output004 output006 output008 output010 restart002 restart004 restart006 restart008 restart010<br />
</syntaxhighlight><br />
with a number of sub-directories, all containing netCDF files.<br />
<br />
It is a good idea to do a trial run and make sure it functions properly. For example, this will compress the netCDF files in just one of the directories:<br />
<syntaxhighlight><br />
nccompress -p -r data/output001<br />
</syntaxhighlight><br />
Once completed there will be a new subdirectory called tmp.nc_compress inside the directory output001. It will contain compressed copies of all the netCDF files from the directory above. You can check the compressed copies to make sure they are correct. The paranoid option (-p) calls an nco command to check that the variables contained in the two files are the same. You can use the paranoid option routinely, though it will make the process more time consuming; it is a good idea to use it at least in the testing phase. You should also check the compressed copies manually to make sure they look ok, and if so, re-run the command with the -o option (overwrite):<br />
<syntaxhighlight><br />
nccompress -r -o data/output001<br />
</syntaxhighlight><br />
<br />
and it will find the already compressed files, copy them over the originals and delete the temporary directory tmp.nc_compress. It won’t try to compress the files again. It also won’t compress already compressed files, so, for example, if you were happy that the compression was working well you could compress the entire data directory, and the already compressed files in output001 will not be re-compressed.<br />
<br />
So, by default, nccompress '''does not overwrite the original files'''. If you invoke it without the '-o' option it will create compressed copies in the tmp.nc_compress subdirectory and leave them there, which will consume more disk space! This is a feature, not a bug, but you need to be aware that this is how it functions.<br />
<br />
With large variables, which usually means large files (> 1GB), it is a good idea to specify a larger buffer size with the '-b' option, as it will run faster. On raijin this may mean you need to run interactively with a higher memory limit (~10GB) or submit it as a copyq job. A typical buffer size might be 1000 to 5000 (1 to 5 GB).<br />
<br />
It is also possible to use wildcard expressions, e.g.<br />
<br />
<syntaxhighlight><br />
nccompress -r -o output*<br />
<br />
nccompress -r -o output00[1-5]<br />
<br />
nccompress -r -o run[1-5]/output*/ocean*.nc random.nc ice*.nc<br />
</syntaxhighlight><br />
The nccompress program just sorts out finding files/directories etc; it calls nc2nc to do the compression. Using the '--nccopy' option forces nccompress to use the nccopy program in place of nc2nc, though the netcdf package must already be loaded for this to work.<br />
<br />
You can tell nccompress to work on multiple files simultaneously with the '-pa' option. By default this will use all the physical processors on the machine, or you can specify how many simultaneous processes you want with '-np', e.g.<br />
<syntaxhighlight><br />
nccompress -r -o -np 16 run[1-5]/output*/ocean*.nc random.nc ice*.nc<br />
</syntaxhighlight><br />
will compress 16 netCDF files at a time (the -np option implies the parallel option). As each directory is processed before beginning on a new directory, there will be little reduction in execution time if there are few netCDF files in each directory.<br />
<br />
==nc2nc<span id="nc2nc"></span>== <br />
<br />
The nc2nc program was written because no existing tool had a generalised per-variable chunking algorithm. The total chunk size is defined to be the file system block size (4096 bytes). The dimensions of the chunk are sized to be as close as possible to the same ratio as the dimensions of the data, with the limit that no dimension can be less than 1. This chunking scheme performs well for a wide range of data, but there will always be access patterns or variable shapes for which it is not optimal. In those cases a different approach may be required.<br />
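As an illustration only (this is not nc2nc's actual code), a proportional chunking rule of this kind can be sketched as follows: scale every dimension by a common factor so the chunk holds roughly one block's worth of values, flooring each chunk dimension at 1.<br />

```python
import math

# An illustrative proportional chunking rule (not nc2nc's exact code):
# choose chunk edges in the same ratio as the variable's dimensions so a
# chunk holds roughly one file-system block of values, never less than 1
# per dimension.
def chunk_shape(shape, value_size=4, block_size=4096):
    target = max(1, block_size // value_size)            # values per chunk
    scale = (target / math.prod(shape)) ** (1.0 / len(shape))
    return tuple(max(1, round(d * scale)) for d in shape)

# Daily (time, lat, lon) field from the ncinfo examples on this page.
print(chunk_shape((365, 1080, 1440)))  # → (4, 13, 18)
```

Note how the chunk edges keep the same proportions as the variable, so no single access direction is heavily penalised.<br />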
<br />
Be aware that nc2nc takes at least twice as long to compress an equivalent file as nccopy. In some cases with large files containing many variables it can be up to five times slower.<br />
<br />
You can use nc2nc stand-alone. It has a couple of extra features that can only be accessed by calling it directly:<br />
<syntaxhighlight><br />
$ nc2nc -h<br />
usage: nc2nc [-h] [-d {1-9}] [-m MINDIM] [-b BUFFERSIZE] [-n] [-v] [-c] [-f]<br />
[-va VARS] [-q QUANTIZE] [-o]<br />
origin destination<br />
<br />
Make a copy of a netCDF file with automatic chunk sizing<br />
<br />
positional arguments:<br />
origin netCDF file to be compressed<br />
destination netCDF output file<br />
<br />
optional arguments:<br />
-h, --help show this help message and exit<br />
-d {1-9}, --dlevel {1-9}<br />
Set deflate level. Valid values 0-9 (default=5)<br />
-m MINDIM, --mindim MINDIM<br />
Minimum dimension of chunk. Valid values 1-dimsize<br />
-b BUFFERSIZE, --buffersize BUFFERSIZE<br />
Set size of copy buffer in MB (default=50)<br />
-n, --noshuffle Don't shuffle on deflation (default is to shuffle)<br />
-v, --verbose Verbose output<br />
-c, --classic use NETCDF4_CLASSIC output instead of NETCDF4 (default<br />
true)<br />
-f, --fletcher32 Activate Fletcher32 checksum<br />
-va VARS, --vars VARS<br />
Specify variables to copy (default is to copy all)<br />
-q QUANTIZE, --quantize QUANTIZE<br />
Truncate data in variable to a given decimal<br />
precision, e.g. -q speed=2 -q temp=0 causes variable<br />
speed to be truncated to a precision of 0.01 and temp<br />
to a precision of 1<br />
-o, --overwrite Write output file even if already it exists (default<br />
is to not overwrite)<br />
</syntaxhighlight><br />
With the vars option (-va) it is possible to select only a subset of variables to be copied to the destination file. By default the output file is netCDF4 classic, but this can be changed to netCDF4 using the '-c' option. It is also possible to specify a minimum dimension size for the chunks (-m). This may be desirable for a dataset that has one particularly long dimension: the chunk dimensions would mirror this and be very large in that direction. If fast access is required from slices orthogonal to this direction, performance might be improved by setting this option to a number greater than 1.<br />
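The quantization behind -q can be sketched with the standard scale-round-unscale trick; nc2nc's exact rounding scheme may differ. Note that, unlike deflation, this step is lossy.<br />

```python
# A sketch of decimal quantization as described for -q: values are rounded
# to the requested number of decimal places, which discards noisy trailing
# digits and makes the data more compressible (and is lossy).
def quantize(values, decimals):
    scale = 10 ** decimals
    return [round(v * scale) / scale for v in values]

print(quantize([1.23456, 2.71828], 2))  # → [1.23, 2.72]
```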
<br />
==ncinfo<span id="ncinfo"></span>== <br />
<br />
ncinfo is a convenient way to get a summary of the contents of a netCDF file.<br />
<syntaxhighlight><br />
./ncinfo -h<br />
usage: ncinfo [-h] [-v] [-t] [-d] [-a] [-va VARS] inputs [inputs ...]<br />
<br />
Output summary information about a netCDF file<br />
<br />
positional arguments:<br />
inputs netCDF files<br />
<br />
optional arguments:<br />
-h, --help show this help message and exit<br />
-v, --verbose Verbose output<br />
-t, --time Show time variables<br />
-d, --dims Show dimensions<br />
-a, --aggregate Aggregate multiple netCDF files into one dataset<br />
-va VARS, --vars VARS<br />
Show info for only specify variables<br />
<br />
</syntaxhighlight><br />
By default it prints out a simple summary of the variables in a netCDF file, omitting dimensions and time-related variables, e.g.<br />
<syntaxhighlight><br />
ncinfo output096/ocean_daily.nc<br />
<br />
output096/ocean_daily.nc<br />
Time steps: 365 x 1.0 days<br />
tau_x :: (365, 1080, 1440) :: i-directed wind stress forcing u-velocity<br />
tau_y :: (365, 1080, 1440) :: j-directed wind stress forcing v-velocity<br />
geolon_t :: (1080, 1440) :: tracer longitude<br />
geolat_t :: (1080, 1440) :: tracer latitude<br />
geolon_c :: (1080, 1440) :: uv longitude<br />
geolat_c :: (1080, 1440) :: uv latitude<br />
<br />
</syntaxhighlight><br />
If you specify more than one file it will print the information for each file in turn<br />
<syntaxhighlight><br />
ncinfo output09?/ocean_daily.nc<br />
<br />
output096/ocean_daily.nc<br />
Time steps: 365 x 1.0 days<br />
tau_x :: (365, 1080, 1440) :: i-directed wind stress forcing u-velocity<br />
tau_y :: (365, 1080, 1440) :: j-directed wind stress forcing v-velocity<br />
geolon_t :: (1080, 1440) :: tracer longitude<br />
geolat_t :: (1080, 1440) :: tracer latitude<br />
geolon_c :: (1080, 1440) :: uv longitude<br />
geolat_c :: (1080, 1440) :: uv latitude<br />
<br />
output097/ocean_daily.nc<br />
Time steps: 365 x 1.0 days<br />
tau_x :: (365, 1080, 1440) :: i-directed wind stress forcing u-velocity<br />
tau_y :: (365, 1080, 1440) :: j-directed wind stress forcing v-velocity<br />
geolon_t :: (1080, 1440) :: tracer longitude<br />
geolat_t :: (1080, 1440) :: tracer latitude<br />
geolon_c :: (1080, 1440) :: uv longitude<br />
geolat_c :: (1080, 1440) :: uv latitude<br />
<br />
output098/ocean_daily.nc<br />
Time steps: 365 x 1.0 days<br />
tau_x :: (365, 1080, 1440) :: i-directed wind stress forcing u-velocity<br />
tau_y :: (365, 1080, 1440) :: j-directed wind stress forcing v-velocity<br />
geolon_t :: (1080, 1440) :: tracer longitude<br />
geolat_t :: (1080, 1440) :: tracer latitude<br />
geolon_c :: (1080, 1440) :: uv longitude<br />
geolat_c :: (1080, 1440) :: uv latitude<br />
<br />
output099/ocean_daily.nc<br />
Time steps: 365 x 1.0 days<br />
tau_x :: (365, 1080, 1440) :: i-directed wind stress forcing u-velocity<br />
tau_y :: (365, 1080, 1440) :: j-directed wind stress forcing v-velocity<br />
geolon_t :: (1080, 1440) :: tracer longitude<br />
geolat_t :: (1080, 1440) :: tracer latitude<br />
geolon_c :: (1080, 1440) :: uv longitude<br />
geolat_c :: (1080, 1440) :: uv latitude<br />
</syntaxhighlight><br />
If the files have the same structure it is possible to aggregate the data and display it as if it were contained in a single dataset:<br />
<syntaxhighlight><br />
ncinfo -a output09?/ocean_daily.nc<br />
<br />
Time steps: 1460 x 1.0 days<br />
tau_x :: (1460, 1080, 1440) :: i-directed wind stress forcing u-velocity<br />
tau_y :: (1460, 1080, 1440) :: j-directed wind stress forcing v-velocity<br />
geolon_t :: (1080, 1440) :: tracer longitude<br />
geolat_t :: (1080, 1440) :: tracer latitude<br />
geolon_c :: (1080, 1440) :: uv longitude<br />
geolat_c :: (1080, 1440) :: uv latitude<br />
</syntaxhighlight><br />
You can also just request variables you are interested in to be output:<br />
<syntaxhighlight><br />
ncinfo -va tau_x -va tau_y output09?/ocean_daily.nc<br />
<br />
output096/ocean_daily.nc<br />
Time steps: 365 x 1.0 days<br />
tau_x :: (365, 1080, 1440) :: i-directed wind stress forcing u-velocity<br />
tau_y :: (365, 1080, 1440) :: j-directed wind stress forcing v-velocity<br />
<br />
output097/ocean_daily.nc<br />
Time steps: 365 x 1.0 days<br />
tau_x :: (365, 1080, 1440) :: i-directed wind stress forcing u-velocity<br />
tau_y :: (365, 1080, 1440) :: j-directed wind stress forcing v-velocity<br />
<br />
output098/ocean_daily.nc<br />
Time steps: 365 x 1.0 days<br />
tau_x :: (365, 1080, 1440) :: i-directed wind stress forcing u-velocity<br />
tau_y :: (365, 1080, 1440) :: j-directed wind stress forcing v-velocity<br />
<br />
output099/ocean_daily.nc<br />
Time steps: 365 x 1.0 days<br />
tau_x :: (365, 1080, 1440) :: i-directed wind stress forcing u-velocity<br />
tau_y :: (365, 1080, 1440) :: j-directed wind stress forcing v-velocity<br />
</syntaxhighlight><br />
[[Category:netcdf]][[Category:compress]][[Category:python]]</div>Aidanheerdegenhttp://climate-cms.wikis.unsw.edu.au/index.php?title=JRA55-do&diff=193JRA55-do2018-07-31T00:54:45Z<p>Aidanheerdegen: Imported from Wikispaces</p>
<hr />
<div>JRA55-do (JRA-55 Driving Ocean) is a surface dataset based on JRA-55 for driving ocean and sea-ice models. The dataset consists of nine atmospheric variables necessary for computing surface fluxes, as well as fresh water runoff to the ocean. All of the atmospheric variables are derived from the forecast phase of JRA-55 and are three-hourly. At the time of writing the temporal coverage is from 1 January 1958 to 1 February 2018, but it is regularly updated to near present day.<br />
For more information on this dataset please refer to the [http://amaterasu.ees.hokudai.ac.jp/~tsujino/JRA55-do-v1.3/00README_v1_3.1st | readme file for v1-3] and to the UCAR website: [[http://www.cesm.ucar.edu/events/wg-meetings/2018/presentations/omwg/kim.pdf]]<br />
<br />
===Licence=== <br />
Terms of Use for this dataset are available [http://amaterasu.ees.hokudai.ac.jp/~tsujino/JRA55-do-v1.2/Terms_of_Use.txt | here]. NB: these terms should be updated soon, since the dataset will be officially published as part of the ana4MIPs project. The citation will also be updated once the paper is published.<br />
This dataset should be used only to drive ocean models, <u>it is not suitable for analysis</u>.<br />
<br />
'''Citation:'''<br />
Tsujino H, Urakawa S, Nakano H, Small RJ, Kim WM, Yeager SG, Danabasoglu G, Suzuki T, Bamber JL, Bentsen M, Böning C, Bozec A, Chassignet E, Curchitser E, Dias FB, Durack PJ, Griffies SM, Harada Y, Ilicak M, Josey SA, Kobayashi C, Kobayashi S, Komuro Y, Large WG, Le Sommer J, Marsland SJ, Masina S, Scheinert M, Tomita H, Valdivieso M, Yamazaki D (2018) JRA-55 based surface dataset for driving ocean-sea-ice models (JRA55-do). Ocean Modelling. https://doi.org/10.1016/j.ocemod.2018.07.002<br />
<br />
===JRA55-do on raijin=== <br />
<br />
JRA55-do is available on raijin under the ua8 project:<br />
<br />
/g/data/ua8/JRA55-do/<version>/<files><br />
<br />
We created a new directory, latest, in which the latest version's files are linked with user-friendly names.<br />
<br />
/g/data/ua8/JRA55-do/latest/<br />
<br />
contains files named <variable>.YYYY.nc instead of <variable>.YYYY.<created-date>.nc<br />
<br />
Also, superseded files which were in v1-3 have been moved to the directory<br />
<br />
/g/data/ua8/JRA55-do/v1-3/superseded_2017/<br />
<br />
This dataset is for internal use only, if you're not part of our ocean modelling group please contact climate_help@nci.org.au to arrange access.</div>Aidanheerdegenhttp://climate-cms.wikis.unsw.edu.au/index.php?title=How_to_run_WRF&diff=167How to run WRF2018-07-13T04:03:26Z<p>Aidanheerdegen: Imported from Wikispaces</p>
<hr />
<div>NCAR provides a detailed [http://www2.mmm.ucar.edu/wrf/OnLineTutorial/index.htm | tutorial] on how to run WRF, but note:<br />
<br />
* Skip the configure and compile steps of the NCAR tutorial and go straight to the Basics tab. You should have followed the NCI-specific [[WRF | installation instructions]] for building the model at NCI '''before''' attempting to run the model<br />
* Skip the geogrid, ungrib, and metgrid steps as these are included and compiled during the installation step<br />
* To run the [http://www2.mmm.ucar.edu/wrf/OnLineTutorial/CASES/JAN00/ungrib.htm | January 2000 tutorial case] do not download the metgrid data; it is available at NCI, see below<br />
* When the tutorial says to run the model see below for local run scripts<br />
==Data for the tutorial== <br />
The data needed to run the [http://www2.mmm.ucar.edu/wrf/OnLineTutorial/CASES/JAN00/ungrib.htm | January 2000 tutorial case] is stored under '''/projects/WRF/data/JAN00''' at NCI and the geographical data (needed for all runs) are under '''/projects/WRF/data/WPS_GEOG'''<br />
<br />
=='''Run scripts for real.exe and wrf.exe'''== <br />
Example scripts to submit real.exe and wrf.exe to the queues (do not try to run these on the login nodes) are provided under '''WRFV3/run'''. The files are named '''run_real.exe''' and '''run_mpi.exe'''.<br />
Those files are good to use as is for the tutorial. For other configurations, feel free to configure those to your needs.</div>Aidanheerdegenhttp://climate-cms.wikis.unsw.edu.au/index.php?title=Compiling_MOM6&diff=106Compiling MOM62018-07-03T02:32:28Z<p>Aidanheerdegen: Imported from Wikispaces</p>
<hr />
<div>There will be an easier method for compiling MOM6 on raijin in the future; in the meantime, here are the instructions for building the ocean-only version:<br />
<br />
=Load appropriate modules= <br />
<br />
<syntaxhighlight lang=bash><br />
module load intel-fc/17.0.1.132<br />
module load intel-cc/17.0.1.132<br />
module load openmpi/1.10.7<br />
module load netcdf/4.3.3.1<br />
</syntaxhighlight><br />
<br />
=Download source code= <br />
<br />
<syntaxhighlight lang=bash><br />
git clone --recursive https://github.com/NOAA-GFDL/MOM6-examples.git MOM6-examples<br />
</syntaxhighlight><br />
<br />
==Copy mkmf template== <br />
<br />
<syntaxhighlight lang=bash><br />
cd MOM6-examples<br />
cp /short/public/aph502/mom6/mkmf.template.nci .<br />
</syntaxhighlight><br />
<br />
=Shared Libraries= <br />
<br />
==Make build directory== <br />
<syntaxhighlight lang=bash><br />
mkdir -p build/shared/opt<br />
cd build/shared/opt<br />
</syntaxhighlight><br />
==Create file paths== <br />
<syntaxhighlight lang=bash><br />
cat > make_paths<<EOF<br />
rm -f path_names path_names.html<br />
../../../src/mkmf/bin/list_paths ../../../src/FMS/<br />
EOF<br />
chmod a+x make_paths<br />
./make_paths<br />
</syntaxhighlight><br />
==Create Makefile== <br />
<syntaxhighlight lang=bash><br />
cat > make_make<<EOF<br />
../../../src/mkmf/bin/mkmf -t ../../../mkmf.template.nci -p libfms.a -c "-Duse_libMPI -Duse_netCDF -DSPMD" path_names<br />
EOF<br />
chmod a+x make_make<br />
./make_make<br />
</syntaxhighlight><br />
==Compile== <br />
<syntaxhighlight lang=bash><br />
cat > make_shared<<EOF<br />
make NETCDF=4 OPT=1 libfms.a -j<br />
EOF<br />
chmod a+x make_shared<br />
./make_shared<br />
<br />
cd ../../../<br />
</syntaxhighlight><br />
=MOM ocean_only= <br />
<br />
==Make build directory== <br />
<syntaxhighlight lang=bash><br />
mkdir -p build/ocean_only/opt<br />
cd build/ocean_only/opt<br />
</syntaxhighlight><br />
==Create file paths== <br />
<syntaxhighlight lang=bash><br />
cat > make_paths<<EOF<br />
rm -f path_names path_names.html<br />
../../../src/mkmf/bin/list_paths ../../../src/MOM6/{config_src/dynamic,config_src/solo_driver,src/{*,*/*}}/<br />
EOF<br />
chmod a+x make_paths<br />
./make_paths<br />
</syntaxhighlight><br />
==Create Makefile== <br />
<syntaxhighlight lang=bash><br />
cat > make_make<<EOF<br />
../../../src/mkmf/bin/mkmf -t ../../../mkmf.template.nci -o '-I../../shared/opt' -p 'MOM6 -L../../shared/opt -lfms' -c "-Duse_libMPI -Duse_netCDF -DSPMD" path_names<br />
EOF<br />
chmod a+x make_make<br />
./make_make<br />
</syntaxhighlight><br />
==Compile== <br />
<syntaxhighlight lang=bash><br />
cat > make_mom<<EOF<br />
make NETCDF=4 OPT=1 MOM6 -j<br />
<br />
EOF<br />
chmod a+x make_mom<br />
./make_mom<br />
</syntaxhighlight><br />
[[Category:mom]][[Category:mom6]][[Category:compile]]</div>Aidanheerdegenhttp://climate-cms.wikis.unsw.edu.au/index.php?title=File_sharing&diff=142File sharing2018-05-20T23:33:03Z<p>Aidanheerdegen: Imported from Wikispaces</p>
<hr />
<div>Some of you are collaborating with researchers who are not in Australia. Sometimes you may need to share access to some of your unpublished data with such researchers. When your dataset is quite small, it is easy to set up existing commercial tools (e.g. Dropbox, Google Drive, ...) to give access to your data. But this is not possible once your dataset reaches a certain size.<br />
<br />
To help you with your collaborations, it is now possible to share your dataset with anyone directly from /g/data on raijin without publishing it. This page explains what you need to do.<br />
<br />
===Request access to the ua8 group on my.nci.org.au if you are not already part of this group.=== <br />
<br />
===Copy your data under /g/data1/ua8/tmp/tmp-arccss/=== <br />
* It is best to create a subdirectory for your data first, then put your data in that directory, even if it's only one file.<br />
* Make sure your files are world readable, and any directories are world readable and executable. If you don't know what this means or how to do this, contact [mailto:climate_help@nci.org.au | climate_help]<br />
* Anyone in ua8 can write or delete anything in this directory, so make copies: do not link to or move the original files there.<br />
* We will automatically clean this space every week, removing data older than two weeks.<br />
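The copy-and-permissions steps above can be sketched as follows. This is a minimal sketch with placeholder paths: "mydata" stands in for your own subdirectory name, and a scratch directory stands in for the real staging area, /g/data1/ua8/tmp/tmp-arccss.

```shell
# Placeholder staging area; on raijin this would be /g/data1/ua8/tmp/tmp-arccss
share=/tmp/tmp-arccss
mkdir -p "$share/mydata"

# Copy (never move or symlink) your files in; an empty file stands in here
touch "$share/mydata/example.nc"

# a+rX makes everything world readable, and directories world executable too,
# so anyone can list and read the shared files
chmod -R a+rX "$share/mydata"
```

Capital X (rather than x) adds the execute bit to directories (and to files that are already executable), so plain data files are not marked executable.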
<br />
===Write to your collaborator to download the dataset from:=== <br />
* [http://dap-wms.nci.org.au/thredds/catalog/ua8/tmp-arccss/catalog.html]<br />
<span class="s1">Note that the server is only updated in the evening, so it may take up to 24h for your collaborators to see your dataset.</span><br />
<span class="s1">If you want to give the exact link to your dataset to your collaborator, go to the website, navigate to your dataset, then copy the URL at the top of your browser into your email.</span><br />
<span class="s1">The download can be done either manually from the website (for a few files) or with [https://www.gnu.org/software/wget/ | wget] or [http://siphon.readthedocs.io/en/stable/ | Siphon for Python].</span><br />
<br />
===Delete the data once your collaborator accessed it!=== <br />
Your data will be discoverable by anyone for as long as it is stored in the ua8 directory, so make sure your collaborators copy it quickly, and remove it promptly if it is work in progress. It is also good practice to clean up, whether or not an automatic clean-up is in place.<br />
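In shell terms, the clean-up amounts to something like the sketch below. Paths are placeholders again, and the exact sweep command is an assumption; the real weekly sweep runs on /g/data1/ua8/tmp/tmp-arccss.

```shell
share=/tmp/tmp-arccss   # placeholder for /g/data1/ua8/tmp/tmp-arccss
mkdir -p "$share/mydata"

# Roughly what the automatic weekly sweep does: delete files older than two weeks
find "$share" -type f -mtime +14 -delete

# Your own prompt clean-up once your collaborators have downloaded the data
rm -rf "$share/mydata"
```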
<br />
===If you don't have an NCI account:=== <br />
You can still share your data. Contact us at [mailto:climate_help@nci.org.au | climate_help] and we will assist you with getting your data in place.</div>Aidanheerdegenhttp://climate-cms.wikis.unsw.edu.au/index.php?title=Storage-request&diff=326Storage-request2018-05-09T02:06:52Z<p>Aidanheerdegen: Imported from Wikispaces</p>
<hr />
<div>===Disk allocation on /g/data=== <br />
<br />
<span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">You can request a temporary allocation of disk space if your group's /short or /g/data allocation is insufficient for you to analyse your data, or if you need to share data with users from other groups. These allocations are temporary, so they are easily obtained, but you do need to clear out the disk when you finish your work. You cannot use this disk to store permanent data or to archive model output.</span><br />
<span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">The tape system massdata is the right tool for archiving data.</span><br />
<span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">If you have data which needs to be accessed more easily but in a more permanent way, you can still request an allocation, but you should state this clearly in your request.</span><br />
<span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">The extra disk allocation will be in "/g/data1/ua8" for permanent cases and in "/g/data3/hh5/tmp" for temporary cases.</span><br />
<br />
===How to submit a request=== <br />
<br />
<span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">To request the allocation you should use the new [http://dmp.climate-cms.org:3000/ | DMPonline tool]. You will need to create an account using your e-mail address and click on the "New request" button. This will open a form requesting the following information:</span><br />
# <span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">name of the new directory which will be placed under /g/data1/hh5/tmp/... This has to be unique and will also identify the request.</span><br />
# <span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">the amount of disk space requested, in TB</span><br />
# <span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">NCI id for users and/or groups which should have read & write access</span><br />
# <span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">NCI id for users and/or groups which should have only read access</span><br />
# <span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">a brief summary of why you need the space: this shouldn't be about your research project so much as the reason why you can't use your own group's /g/data1 allocation or /short etc. You should also include how you estimated the allocation size.</span><br />
# <span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">you can, if applicable, associate the request with one of your data management plans</span><br />
# <span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">indicate how long you will need the disk allocation for; depending on your request this might be reviewed by the CMS, and when this time gets nearer we will contact you to discuss if, and for how long, your allocation will continue.</span><br />
<br />
<span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">When you submit your request the tool will send us an e-mail with the details and we'll let you know the outcome as soon as possible. </span><br />
<span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">If a request has been submitted then only a few of the fields can be edited. If you want to make changes to the locked fields then you will need to execute the following steps:</span><br />
# <span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">edit the request first</span><br />
# <span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">click on the "cancel the submission" button; your fields will now all be unlocked</span><br />
# <span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">make your changes and "save" or "submit" again</span><br />
<span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">If your request has been approved then you will need to contact an administrator to unlock it, or you could submit a new request and call the directory "<my-request-name>-update".</span><br />
<span style="font-family: Arial,Helvetica,sans-serif; font-size: 110%;">This is necessary to prevent the information for an allocation which is current or under review from being changed without notifying the administrator.</span><br />
<br />
You will also need to be a member of the relevant group, ua8 or hh5. If you are not already a member, [https://my.nci.org.au/mancini/login?next=/mancini/ | login to the NCI system] and request membership of the relevant group.</div>Aidanheerdegenhttp://climate-cms.wikis.unsw.edu.au/index.php?title=Python&diff=283Python2017-07-25T06:02:17Z<p>Aidanheerdegen: Imported from Wikispaces</p>
<hr />
<div>=Learning Python= <br />
* [http://learnpythonthehardway.org/book/ | Learn Python the Hard Way] - Python tutorial for beginners to programming<br />
* [http://swcarpentry.github.io/python-novice-inflammation/ | Software Carpentry] - Programming for Scientists<br />
* [http://www.johnny-lin.com/pyintro/ | A Hands-On Introduction to Python in the Atmospheric and Oceanic Sciences]<br />
<br />
=Resources for Climate Scientists= <br />
* [[Using Python with Climate Data]]<br />
* [[Python Libraries on Raijin]]<br />
* [http://christopherbull.com.au/blog/?page_id=180 | Chris Bull's getting started list]<br />
* [http://oceanpython.org/ | OceanPython plotting examples]<br />
* [[Running IPython Notebook from Raijin]]<br />
* [[How to publish your Python code to PyPI | Publishing your code on PyPI]]<br />
<br />
=Useful Libraries= <br />
* [http://docs.scipy.org/doc/numpy/ | NumPy] - Numerical Python Library<br />
* [http://docs.scipy.org/doc/scipy/reference/ | SciPy] - Scientific Python Tools<br />
* [http://scitools.org.uk/iris/docs/latest/index.html | Iris] - Read, Analyse and Plot Climate Datasets (NetCDF, GRIB, UM output)<br />
* [http://scitools.org.uk/cartopy/ | Cartopy] - A library containing cartographic tools for python (alternative to basemap)<br />
<br />
* '''[https://accessdev.nci.org.au/trac/wiki/Raijin%20Apps#PythonLibraries | All libraries installed on Raijin]'''<br />
<br />
=<span style="font-size: 80%;">For using Python on your desktop computer, consider installing ''Enthought: Canopy''</span>= <br />
* <span style="font-size: 14px; line-height: 15.6px;">The full version of the software is free for academic users and provides a large library of ready-made python packages</span></div>Aidanheerdegenhttp://climate-cms.wikis.unsw.edu.au/index.php?title=How_to_publish_your_Python_code_to_PyPI&diff=165How to publish your Python code to PyPI2017-07-25T05:49:43Z<p>Aidanheerdegen: Imported from Wikispaces</p>
<hr />
<div>To make your code more visible, and easy for others to install and use, you can upload it to [http://pypi.org/ | PyPI (The Python Package Index)]. PyPI is a repository of software for the Python programming language. It helps people to find and install software developed and shared by the Python community.<br />
<br />
If you have some python code that you would like to make available to others the first step is to make sure it is in a publicly viewable repository at a site like [http://github.com | GitHub] or [http://bitbucket.org | BitBucket].<br />
<br />
The [https://packaging.python.org/tutorials/distributing-packages/ | documentation for creating a PyPI package] is the ultimate authority on packaging python code for distribution via PyPI. However, in many simple cases much of the information is superfluous and possibly confusing. If you have a single python source file (or perhaps a handful of source files) you can follow this [http://antrikshy.com/blog/publish-python-single-file-script-project-structure-pypi-pip-noobs-beginners | clear and simple explanation of the minimum requirement for publishing your code on PyPI].<br />
<br />
As an example, some python code to calculate rank histograms by CoE researcher Oliver Angélil was made into a PyPI package. The [https://github.com/oliverangelil/rankhistogram | GitHub repository for rank-histogram] shows the relatively simple directory structure required to create the [https://pypi.python.org/pypi/rank-histogram/0.2 | PyPI package for rank-histogram]. All that was required was to create an empty file called <span style="font-family:monospace"><nowiki>__init__.py</nowiki></span> and a <span style="font-family:monospace">setup.py</span> file:<br />
<syntaxhighlight><br />
from setuptools import setup<br />
<br />
setup(<br />
name='rank-histogram',<br />
description='Python function that takes model data, obs data, and a boolean mask to generate a rank histogram.',<br />
version='0.2',<br />
url='https://github.com/oliverangelil/rankhistogram',<br />
install_requires=['numpy','scipy'],<br />
author='Oliver Marc Angelil',<br />
author_email='molofishy@gmail.com',<br />
py_modules=['ranky'],<br />
license='MIT',<br />
keywords='rank histogram climate ensemble Hamil'<br />
)<br />
</syntaxhighlight><br />
<br />
The [http://antrikshy.com/blog/publish-python-single-file-script-project-structure-pypi-pip-noobs-beginners | instructions] use the python program <span style="font-family:monospace">twine</span>, which is available in the [https://accessdev.nci.org.au/trac/wiki/User%20Guides/conda | CMS conda distribution].<br />
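Putting the pieces together, a minimal build-and-upload session might look like the sketch below. The package name, module name, and /tmp paths are illustrative only, and the final twine command is commented out because it requires a PyPI account.

```shell
# Throwaway single-module package mirroring the layout described above
mkdir -p /tmp/rank-demo && cd /tmp/rank-demo
printf 'def hello():\n    return "hi"\n' > ranky.py
touch __init__.py                       # the empty file mentioned above

cat > setup.py <<'EOF'
from setuptools import setup
setup(name='rank-demo', version='0.1', py_modules=['ranky'])
EOF

python3 setup.py sdist                  # builds a source tarball into dist/
# twine upload dist/*                   # then upload (prompts for PyPI credentials)
```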
<br />
===Other resources=== <br />
<br />
* http://the-hitchhikers-guide-to-packaging.readthedocs.io/en/latest/quickstart.html<br />
* https://blog.ionelmc.ro/2014/05/25/python-packaging/<br />
* https://blog.jetbrains.com/pycharm/2017/05/how-to-publish-your-package-on-pypi/<br />
* https://tom-christie.github.io/articles/pypi/<br />
[[Category:python]][[Category:pypi]][[Category:publish]]</div>Aidanheerdegenhttp://climate-cms.wikis.unsw.edu.au/index.php?title=Running_MOM&diff=302Running MOM2017-07-05T04:22:09Z<p>Aidanheerdegen: Imported from Wikispaces</p>
<hr />
<div>=Payu= <br />
<br />
Currently the only supported method for running MOM directly, i.e. not in the ACCESS coupled model, is to use the model run tool [https://github.com/marshallward/payu | payu].<br />
<br />
There is [http://payu.readthedocs.io/en/latest/ | comprehensive documentation for payu], but initially it should be sufficient to follow the instructions in the [http://payu.readthedocs.io/en/latest/usage.html | Usage section]</div>Aidanheerdegenhttp://climate-cms.wikis.unsw.edu.au/index.php?title=Configuring_MOM&diff=109Configuring MOM2017-07-05T03:52:11Z<p>Aidanheerdegen: Imported from Wikispaces</p>
<hr />
<div>This page outlines basic configuration of MOM4 experiments.<br />
<br />
=MOM4/5 Configuration Files= <br />
<br />
MOM4 configuration is managed primarily through four text files:<br />
<br />
* <span style="font-family:monospace">input.nml</span>: Principal configuration<br />
* <span style="font-family:monospace">diag_table</span>: Diagnostic output management<br />
* <span style="font-family:monospace">data_table</span>: Input and boundary condition data field management<br />
* <span style="font-family:monospace">field_table</span>: Initial condition and advection scheme configuration<br />
<br />
=Basic Configuration= <br />
<br />
==Setting the simulation time== <br />
<br />
Simulation time (or integration time) is set in the <span style="font-family:monospace">input.nml</span> namelist file, in either <span style="font-family:monospace">ocean_solo_nml</span> (for ocean-only runs) or <span style="font-family:monospace">coupler_nml</span> (for coupled runs).<br />
<br />
===Ocean-only (solo) configuration=== <br />
<br />
An example <span style="font-family:monospace">ocean_solo_nml</span> namelist record is shown below:<br />
<syntaxhighlight lang=fortran><br />
&ocean_solo_nml<br />
date_init = 1980,1,1,0,0,0<br />
calendar = 'julian'<br />
<br />
months = 0<br />
days = 2<br />
hours = 0<br />
minutes = 0<br />
seconds = 0<br />
<br />
dt_cpld = 86400<br />
/<br />
</syntaxhighlight><br />
The major namelist fields are described below:<br />
<br />
* <span style="font-family:monospace">date_init</span>: Simulation start time (if an <span style="font-family:monospace">ocean_solo.res</span> timestamp is present, such as from a previous simulation, then this field is ignored)<br />
* <span style="font-family:monospace">calendar</span>: Simulation calendar type. Four calendars are supported:<br />
** <span style="font-family:monospace">gregorian</span>: Modern calendar with full leap-year support<br />
** <span style="font-family:monospace">julian</span>: Calendar with a leap year every four years (no Gregorian century corrections)<br />
** <span style="font-family:monospace">noleap</span>: 365-day calendar with no leap years<br />
** <span style="font-family:monospace">thirty_day</span>: 30-day months (or 360 days per year)<br />
<br />
* <span style="font-family:monospace">months</span>, <span style="font-family:monospace">days</span>, etc.: Simulation (or integration) time<br />
* <span style="font-family:monospace">dt_cpld</span>: Timestep (in seconds) for coupling to external data fields (usually atmospheric)<br />
<br />
===Coupled configuration=== <br />
<br />
An example <span style="font-family:monospace">coupler_nml</span> record follows a similar structure:<br />
<syntaxhighlight lang=fortran><br />
&coupler_nml<br />
current_date = 1980,1,1,0,0,0<br />
calendar = 'noleap'<br />
months = 12<br />
<br />
do_atmos = .false.<br />
do_land = .false.<br />
do_ice = .true.<br />
do_ocean = .true.<br />
<br />
dt_cpld = 1800<br />
dt_atmos = 1800<br />
/<br />
</syntaxhighlight><br />
Many of the fields are identical to the <span style="font-family:monospace">ocean_solo_nml</span> fields. The new or modified fields are listed below:<br />
<br />
* <span style="font-family:monospace">current_date</span>: Simulation start time, similar to <span style="font-family:monospace">date_init</span> in <span style="font-family:monospace">ocean_solo_nml</span> (in this case, the timestamp override file is named <span style="font-family:monospace">coupler.res</span>)<br />
* <span style="font-family:monospace">do_atmos</span>, <span style="font-family:monospace">do_land</span>, etc.: Used to enable or disable submodel components<br />
* <span style="font-family:monospace">dt_atmos</span>: Atmospheric model timestep (in seconds), including its coupling to land and ice (or the "fast" coupling timestep)<br />
* <span style="font-family:monospace">dt_cpld</span>: Ocean-atmosphere coupling timestep (or the "slow" coupling timestep). This must be a multiple of <span style="font-family:monospace">dt_atmos</span> and <span style="font-family:monospace">dt_ocean</span>.<br />
<br />
==Timestep configuration== <br />
<br />
The predominant numerical parameter in model configuration is timestep size. MOM4 timestepping is configured in the <span style="font-family:monospace">ocean_model_nml</span> namelist record. An example record is provided below:<br />
<syntaxhighlight lang=fortran><br />
&ocean_model_nml<br />
dt_ocean = 10800<br />
vertical_coordinate = 'zstar'<br />
barotropic_split = 60<br />
/<br />
</syntaxhighlight><br />
Some typical timestep settings are as follows:<br />
<br />
* <span style="font-family:monospace">dt_ocean</span>: Ocean model timestep size (in seconds)<br />
* <span style="font-family:monospace">vertical_coordinate</span>: Vertical coordinate type, the most common options are listed below:<br />
** <span style="font-family:monospace">geopotential</span>: Geopotential (equivalent to depth in many cases)<br />
** <span style="font-family:monospace">zstar</span>: Quasi-horizontal depth<br />
** <span style="font-family:monospace">pressure</span>: Pressure-based vertical coordinate<br />
** <span style="font-family:monospace">pstar</span>: Quasi-horizontal pressure<br />
<br />
* <span style="font-family:monospace">barotropic_split</span>: Split timestepping between the ocean free surface (or barotropic) and internal (or baroclinic) flow. A barotropic split of 60 means that there are sixty free surface timesteps per model timestep (set by <span style="font-family:monospace">dt_ocean</span>); with the <span style="font-family:monospace">dt_ocean = 10800</span> above, each free surface substep is 180 seconds.<br />
<br />
=Advanced Configuration= <br />
<br />
A typical <span style="font-family:monospace">input.nml</span> file will usually contain a large number of namelist records, sometimes as many as 100 for fully coupled models. Many of the configuration settings are determined through experiment design, which is a nontrivial task and often a subject of ongoing research.<br />
<br />
==data_table== <br />
<br />
The <span style="font-family:monospace">data_table</span> file is used to supply a MOM experiment with external data forcing fields, such as surface winds or radiative heating.<br />
<br />
''Note that MOM usually expects external fields to be in netCDF format, and that the grid variables must be formatted in a particular manner.''<br />
<br />
Example <span style="font-family:monospace">data_table</span> record:<br />
<syntaxhighlight lang=fortran><br />
"OCN", "u_flux", "taux", "INPUT/stress.nc", .true., 1.0<br />
</syntaxhighlight><br />
* <span style="font-family:monospace">"OCN"</span>: Identifies the class of the field. It contains one of the following values: <span style="font-family:monospace">ATM</span>, <span style="font-family:monospace">OCN</span>, <span style="font-family:monospace">LND</span>, <span style="font-family:monospace">ICE</span><br />
* <span style="font-family:monospace">"u_flux"</span>: The field variable name as defined in MOM4.<br />
* <span style="font-family:monospace">"taux"</span>: The field variable name as defined in the netCDF provided by the user.<br />
* <span style="font-family:monospace">"INPUT/stress.nc"</span>: The path (including filename) for the netCDF file containing the forcing field. Local paths can be used.<br />
* <span style="font-family:monospace">.true.</span>: Indicates whether or not the field is on the ocean model grid. A value of <span style="font-family:monospace">.false.</span> indicates that it does not match the model grid and requires interpolation (as computed by MOM).<br />
* <span style="font-family:monospace">1.0</span>: A rescaling factor applied to the field. A value of <span style="font-family:monospace">1.0</span> indicates no rescaling, while a value of <span style="font-family:monospace">0.</span> will set all values to zero.</div>Aidanheerdegenhttp://climate-cms.wikis.unsw.edu.au/index.php?title=Acquiring_and_compiling_MOM5&diff=39Acquiring and compiling MOM52017-07-05T03:50:11Z<p>Aidanheerdegen: Imported from Wikispaces</p>
<hr />
<div>=Acquiring the source code= <br />
<br />
The MOM source code can either be downloaded directly from the main sourcecode repository or through local repositories at NCI.<br />
<br />
To acquire the CMS-supported version of the source code, type the following:<br />
<syntaxhighlight lang=bash><br />
$ mkdir -p /short/${PROJECT}/${USER}/mom/codebase # Or some other directory<br />
$ cd /short/${PROJECT}/${USER}/mom/codebase<br />
$ git clone git@github.com:mom-ocean/MOM5.git mom<br />
</syntaxhighlight><br />
where <span style="font-family:monospace">${PROJECT}</span> is your default project group and <span style="font-family:monospace">${USER}</span> is your username. This will download the latest version of the source code into a directory named <span style="font-family:monospace">mom</span>.<br />
<br />
=Types of MOM5 build= <br />
<br />
A build script (<span style="font-family: monospace;">mom/exp/MOM_compile.csh</span>) for compiling MOM is included in the github repository.<br />
<br />
The build script has been pre-configured to build the ocean-only (<span style="font-family: monospace;">MOM_solo</span>) version of MOM. The script can also be modified to build other versions of MOM:<br />
* <span style="font-family:monospace">MOM_solo</span>: Ocean-only MOM, without ice, land, or atmosphere<br />
* <span style="font-family:monospace">MOM_SIS</span>: MOM coupled to the GFDL sea ice model (SIS)<br />
* <span style="font-family:monospace">EBM</span>: Ocean / sea ice / land model coupled to an energy-balanced (radiation) atmosphere model (EBM)<br />
* <span style="font-family:monospace">ICCM</span>: Ocean / sea ice / land / atmosphere<br />
* <span style="font-family:monospace">CM2M</span>: CMIP-based ocean / sea ice / land / atmosphere<br />
* <span style="font-family:monospace">ESM2M</span>: Coupled Earth System Model (biogeochemistry)<br />
<br />
=Building MOM5= <br />
<br />
To use the build script copy and paste the following (choosing which type of build you wish):<br />
<syntaxhighlight lang=bash><br />
# From the source code directory<br />
$ cd mom/exp<br />
$ ./MOM_compile.csh --platform nci --use_netcdf4 --type MOM_SIS<br />
</syntaxhighlight><br />
Compilation should require approximately 10-20 minutes, and will place an executable in the following directory path:<br />
<syntaxhighlight lang=bash><br />
mom/exec/nci/MOM_SIS/fms_MOM_SIS.x<br />
</syntaxhighlight><br />
relative to <span style="font-family:monospace">/short/${PROJECT}/${USER}/mom/codebase</span> if you followed the previous instructions.</div>Aidanheerdegenhttp://climate-cms.wikis.unsw.edu.au/index.php?title=Running_MOM5&diff=304Running MOM52017-02-02T06:25:35Z<p>Aidanheerdegen: Imported from Wikispaces</p>
<hr />
<div>Running an ocean simulation using MOM5 requires three steps:<br />
<br />
# Designing an experiment, including its numerical grids and forcing fields<br />
# Configuring MOM's numerical settings and physical parameterisations<br />
# Executing MOM and generating numerical output<br />
<br />
This section focusses on the final step of model execution, as well as a cursory discussion of model configuration. It assumes that you have been provided with a pre-configured experiment, such as one of the MOM5 examples.<br />
<br />
In general, model design and configuration is a complex process which can require a broad range of expertise. For more detailed discussions about experiment design, consult the MOM5 documentation and ocean modelling literature.<br />
<br />
=Running MOM= <br />
<br />
Before beginning, make sure that you have an experiment ready for MOM simulation, as well as a compiled MOM5 executable. Whilst there are other ways of running MOM5, the recommended (and supported) way is to use [https://github.com/marshallward/payu | payu], an experiment running program.<br />
<br />
==Loading payu== <br />
<br />
Payu is available as a system wide loadable software module.<br />
<syntaxhighlight lang=bash><br />
module load payu<br />
</syntaxhighlight><br />
should load the most recent version. You can check which versions are available like so:<br />
<syntaxhighlight lang=bash><br />
module avail payu<br />
</syntaxhighlight><br />
And which is currently loaded:<br />
<syntaxhighlight lang=bash><br />
module list<br />
</syntaxhighlight><br />
<br />
==Configuring your workspace== <br />
<br />
Payu places the experiment configuration in a separate directory tree to the large input and output files. This is a deliberate design decision so that the smaller model configuration files can be in the user's home directory. This means they are backed up in case of catastrophic disk failure, and can be restored.<br />
<br />
Create a directory for your model experiment configurations:<br />
<syntaxhighlight lang=bash><br />
mkdir -p experiments/mom<br />
cd experiments/mom<br />
</syntaxhighlight><br />
<br />
The easiest way to obtain the necessary configuration files is to copy an existing experiment. Preferably one that is as close as possible to the experimental configuration you wish to use.<br />
<br />
If the experiment you wish to copy has the automatic experiment logging feature turned on, then it can be as simple as<br />
<syntaxhighlight lang=bash><br />
git clone /path/to/experiment/config<br />
</syntaxhighlight><br />
or if it is available as a repository on a server:<br />
<syntaxhighlight lang=bash><br />
git clone url<br />
</syntaxhighlight><br />
<br />
For example, this will clone a standard 1/4 degree MOM-SIS configuration into a directory named gfdl_nyf_1080.<br />
<syntaxhighlight lang=bash><br />
git clone git@github.com:coecms/gfdl_nyf_1080.git<br />
</syntaxhighlight><br />
To clone into a different directory name, specify a new name. For example, the following will clone the same configuration into a directory gfdl_nyf_1080_mycopy<br />
<syntaxhighlight lang=bash><br />
git clone git@github.com:coecms/gfdl_nyf_1080.git gfdl_nyf_1080_mycopy<br />
</syntaxhighlight><br />
[[Category:mom5]][[Category:payu]]</div>Aidanheerdegenhttp://climate-cms.wikis.unsw.edu.au/index.php?title=Nmlcompare&diff=256Nmlcompare2015-03-26T04:00:26Z<p>Aidanheerdegen: Imported from Wikispaces</p>
<hr />
<div>=Introduction= <br />
<br />
Nmlcompare is a simple python script which utilises [https://github.com/marshallward/f90nml | f90nml, Marshall Ward's excellent FORTRAN namelist parser] to do a simple-minded comparison of two FORTRAN [http://www.jchmr.org/jules/documentation/user_guide/vn3.3/namelists/intro.html | namelists].<br />
<br />
<span style="background-color: #ffffff;">The nmlcompare package is available on raijin under </span><span style="color: #000000;">[https://accessdev.nci.org.au/trac/wiki/Raijin%20Apps | ACCESS apps]</span><span style="background-color: #ffffff;">.</span><br />
<syntaxhighlight><br />
$ nmlcompare -h<br />
usage: nmlcompare [-h] [-s] [-d] [-g GROUPS] first second<br />
<br />
Report the difference between two FORTRAN namelist files. (=) Means they are<br />
same, (-) missing in second, (+) missing in first, (?) exists in both, but<br />
with different values<br />
<br />
positional arguments:<br />
first namelist file<br />
second namelist file<br />
<br />
optional arguments:<br />
-h, --help show this help message and exit<br />
-s, --same Show variables that are the same (default False)<br />
-d, --diff Show variables that are the different (default True,<br />
use -d to toggle off)<br />
-g GROUPS, --groups GROUPS<br />
Specify particular namelist groups<br />
<br />
</syntaxhighlight><br />
<br />
=Examples= <br />
<br />
For example, given two namelists, short1.nml:<br />
<syntaxhighlight><br />
$ cat short1.nml<br />
&coupler_nml<br />
months = 6,<br />
days = 0,<br />
current_date = 1,1,1,0,0,0,<br />
calendar = 'noleap',<br />
dt_cpld = 1800,<br />
dt_atmos = 1800,<br />
do_atmos = .false.,<br />
do_land = .false.,<br />
do_ocean = .true.,<br />
atmos_npes = 0,<br />
ocean_npes = 0,<br />
use_lag_fluxes=.true.<br />
check_stocks=0<br />
/<br />
<br />
&data_override_nml<br />
<br />
/<br />
<br />
&diag_integral_nml<br />
file_name = 'diag_integral.out'<br />
time_units = 'days'<br />
output_interval = -1.0<br />
/<br />
<br />
&diag_manager_nml<br />
max_output_fields=700<br />
max_input_fields=700<br />
max_axes=300<br />
max_num_axis_sets=40<br />
max_files = 1000<br />
issue_oor_warnings=.false.<br />
/<br />
<br />
</syntaxhighlight><br />
and short2.nml:<br />
<syntaxhighlight><br />
$ cat short2.nml<br />
&coupler_nml<br />
months = 6,<br />
days = 5,<br />
current_date = 1,1,1,0,0,0,<br />
calendar = 'noleap',<br />
dt_cpld = 1800,<br />
dt_atmos = 1800,<br />
do_atmos = .false.,<br />
do_land = .false.,<br />
do_ice = .true.,<br />
atmos_npes = 0,<br />
ocean_npes = 0,<br />
check_stocks=0<br />
/<br />
<br />
&data_override_nml<br />
<br />
/<br />
<br />
&diag_manager_nml<br />
max_output_fields=700<br />
max_input_fields=700<br />
max_axes=100<br />
max_num_axis_sets=40<br />
max_files = 100<br />
issue_oor_warnings=.false.<br />
/<br />
<br />
&flux_exchange_nml<br />
do_area_weighted_flux=.true.<br />
/<br />
<br />
</syntaxhighlight><br />
A simple comparison, with no options:<br />
<syntaxhighlight><br />
$ nmlcompare short1.nml short2.nml<br />
(?)diag_manager_nml<br />
(?)max_axes : 300 -> 100<br />
(?)max_files : 1000 -> 100<br />
(?)coupler_nml<br />
(?)days : 0 -> 5<br />
(-)do_ocean : True<br />
(-)use_lag_fluxes : True<br />
(+)do_ice : True<br />
(-)diag_integral_nml<br />
(-)file_name : diag_integral.out<br />
(-)time_units : days<br />
(-)output_interval : -1.0<br />
(+)flux_exchange_nml<br />
(+)do_area_weighted_flux : True<br />
</syntaxhighlight><br />
As explained in the usage above, "(?)" indicates the group (or variable) is in both namelists but with different values, "(-)" that it is only in the first namelist, and "(+)" that it is only in the second.<br />
<br />
It is possible to only list specified namelist groups using the "-g" option, multiple times if more than one group is required:<br />
<syntaxhighlight><br />
$ nmlcompare short1.nml -g coupler_nml -g flux_exchange_nml short2.nml<br />
(?)coupler_nml<br />
(?)days : 0 -> 5<br />
(-)do_ocean : True<br />
(-)use_lag_fluxes : True<br />
(+)do_ice : True<br />
(+)flux_exchange_nml<br />
(+)do_area_weighted_flux : True<br />
<br />
</syntaxhighlight><br />
<br />
You can also output variables that are the same in both namelists:<br />
<syntaxhighlight><br />
$ nmlcompare short1.nml -s short2.nml<br />
(?)diag_manager_nml<br />
(=)max_num_axis_sets : 40<br />
(?)max_axes : 300 -> 100<br />
(?)max_files : 1000 -> 100<br />
(=)max_output_fields : 700<br />
(=)issue_oor_warnings : False<br />
(=)max_input_fields : 700<br />
(=)data_override_nml<br />
(?)coupler_nml<br />
(=)atmos_npes : 0<br />
(=)check_stocks : 0<br />
(=)months : 6<br />
(?)days : 0 -> 5<br />
(=)do_land : False<br />
(=)current_date : [1, 1, 1, 0, 0, 0]<br />
(=)do_atmos : False<br />
(=)ocean_npes : 0<br />
(=)calendar : noleap<br />
(=)dt_atmos : 1800<br />
(=)dt_cpld : 1800<br />
(-)do_ocean : True<br />
(-)use_lag_fluxes : True<br />
(+)do_ice : True<br />
(-)diag_integral_nml<br />
(-)file_name : diag_integral.out<br />
(-)time_units : days<br />
(-)output_interval : -1.0<br />
(+)flux_exchange_nml<br />
(+)do_area_weighted_flux : True<br />
<br />
</syntaxhighlight><br />
Equally it is possible to only show the variables that are the same in both namelists by "toggling off" the difference flag (-d):<br />
<syntaxhighlight><br />
$ nmlcompare short1.nml -s -d short2.nml<br />
(?)diag_manager_nml<br />
(=)max_num_axis_sets : 40<br />
(=)max_output_fields : 700<br />
(=)issue_oor_warnings : False<br />
(=)max_input_fields : 700<br />
(=)data_override_nml<br />
(?)coupler_nml<br />
(=)atmos_npes : 0<br />
(=)check_stocks : 0<br />
(=)months : 6<br />
(=)do_land : False<br />
(=)current_date : [1, 1, 1, 0, 0, 0]<br />
(=)do_atmos : False<br />
(=)ocean_npes : 0<br />
(=)calendar : noleap<br />
(=)dt_atmos : 1800<br />
(=)dt_cpld : 1800<br />
<br />
</syntaxhighlight><br />
<br />
This program is very simple, and no attempt has been made to test it against complicated namelist constructions. If you find it does not work for your namelists and would like it to, please contact the [http://www.climatescience.org.au/staff/profile/AHeerdegen | author] or the [[home | CMS team]].<br />
[[Category:namelist]][[Category:fortran]][[Category:compare]][[Category:difference]]</div>Aidanheerdegen