Tips to reduce WRF output size
- netCDF4 compression: you are probably using netCDF v4.0 or newer. In that case, you can enable compression when building WRF so that output files are compressed as they are written. If you later need an output file in classic format, set the namelist option use_netcdf_classic = .true. in the &time_control section. The netCDF format is chosen at the configuration step, so if you have already compiled the code, first clean it up with
./clean -a
> then follow these steps:
- Define the NETCDF4 environment variable:
>> for csh
setenv NETCDF4 1
>> for bash
export NETCDF4=1
- Configure and compile as usual.
- output a subset of variables: WRF has a built-in mechanism for choosing which variables are written to the output files. You do not have to write all the default variables, and you can add non-default ones. The details are in the WRF User's Guide.
- clean up the output file: after creation, it can be useful to clean up the output file. For example, every variable has a Time dimension even if it is constant in time (latitudes and longitudes are constant unless you use a moving nest). Removing this dimension from such variables saves storage space. With NCO, for example:
ncwa -a Time -v XLONG wrfout_d01_2000-01-24_12\:00\:00 time0.nc
ncks -x -v XLONG wrfout_d01_2000-01-24_12\:00\:00 no_longitude.nc
ncks -A -v XLONG time0.nc no_longitude.nc
mv no_longitude.nc wrfout_d01_2000-01-24_12\:00\:00
You can list several variables at once in each command by separating them with commas (for example -v XLONG,XLAT).
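For the "output a subset of variables" tip above, here is a minimal sketch of the runtime I/O mechanism as the WRF User's Guide describes it. The file name my_iofields_d01.txt, the helper name write_iofields_file, and the variables RAINC and RAINNC are placeholders chosen for illustration. Check the User's Guide for your WRF version before relying on the exact syntax.

```shell
#!/usr/bin/env bash
# Sketch: write a runtime I/O text file that removes two example fields
# from the history stream (stream 0). File name and fields are hypothetical.
write_iofields_file() {
    local target="${1:-my_iofields_d01.txt}"
    # Line format: <+ or ->:<h or i>:<stream number>:<comma-separated variables>
    #   "-" removes the fields, "+" adds them; "h" is history, "i" is input.
    cat > "$target" <<'EOF'
-:h:0:RAINC,RAINNC
EOF
}

# The matching &time_control entries in namelist.input would look like:
#   iofields_filename       = "my_iofields_d01.txt",
#   ignore_iofields_warning = .true.,
```

Calling write_iofields_file in the run directory creates the text file. WRF then reads it at start-up through the iofields_filename namelist entry, so no recompilation is needed.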
What to do if WRF does not compile on Raijin?
First make sure that you are using the WRF version stored in /projects/WRF on Raijin. Then try to compile with the dmpar option: clean (./clean -a), run configure again and choose option #3 (dmpar). WRF should then compile without problems. WRF has not yet been successfully compiled on Raijin with the other options.
Which processor crashed in my WRF simulation?
WRF writes its standard output to rsl.out files and its errors to rsl.error files. There is one pair of files for each processor, so if you are running on a large number of processors, checking every file is very time-consuming. The first thing to do when your simulation stops is to check the output file of your job script. If your script to launch WRF is called script.pbs, then at the end of the job PBS will create a file script.pbs.o1234567, where "1234567" is your job ID. Open this file and look at the end. If you see a message like:
--------------------------------------------------------------------------
mpirun has exited due to process rank 50 with PID 32659 on
node r2059 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
then the simulation did not finish normally. The first line of the message gives the rank of the process that terminated abnormally: "due to process rank 50". In this case the problem occurred on processor 50, so the error message should be in rsl.error.0050. Check rsl.out.0050 as well, in case a message was written to the standard output first.
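The lookup described above can be scripted. The following is a minimal sketch: the function name rsl_files_for_crash is made up for this example, and it assumes the job output contains an mpirun message with the wording "process rank N" as shown above.

```shell
#!/usr/bin/env bash
# Sketch: print the rsl file names for the rank named in an mpirun abort message.
# Usage: rsl_files_for_crash script.pbs.o1234567
rsl_files_for_crash() {
    local job_output="$1"
    local rank
    # Take the number that follows the first occurrence of "process rank".
    rank=$(grep -o 'process rank [0-9][0-9]*' "$job_output" | head -n 1 | awk '{print $3}')
    if [ -z "$rank" ]; then
        echo "no 'process rank' message found in $job_output" >&2
        return 1
    fi
    # rsl file names carry the rank zero-padded to four digits (e.g. 0050).
    printf 'rsl.error.%04d\nrsl.out.%04d\n' "$rank" "$rank"
}
```

For the message shown above, the function prints rsl.error.0050 and rsl.out.0050. If the job output contains no such message, it reports that on standard error and returns a non-zero status.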
What to do if I get a segmentation fault error with WRF?
If you get a message like this one in rsl.error:
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image      PC                Routine  Line     Source
wrf.exe    00000000017F59A1  Unknown  Unknown  Unknown
wrf.exe    00000000017F3655  Unknown  Unknown  Unknown
wrf.exe    0000000001ACFA07  Unknown  Unknown  Unknown
wrf.exe    0000000001387C8A  Unknown  Unknown  Unknown
wrf.exe    0000000000EC4CDD  Unknown  Unknown  Unknown
wrf.exe    0000000000DCCAB7  Unknown  Unknown  Unknown
you have a segmentation fault error. These can have multiple causes. The best way to find out what is causing the error is to recompile WRF with the debugging options and run again:
- Clean the previous compilation in WRFV3/ with ./clean -a
- Copy configure.wrf.backup to configure.wrf
- Open configure.wrf and search for FCDEBUG. Uncomment the options by removing the # signs. If there are two # signs, it is best to remove both unless you know what you are doing.
- Compile (DO NOT run configure, as it would overwrite your edited configure.wrf) and run again. Check the error message at the end of the run; it should now be more precise.
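The edit to configure.wrf can also be made from the command line. This is a minimal sketch: enable_fcdebug is a name invented for this example, and it assumes the FCDEBUG line starts at the beginning of a line and has its options disabled with # characters, as in "FCDEBUG = # -g $(FCNOOPT) -traceback # ...". Inspect the line in your own configure.wrf first, since its exact contents depend on the compiler option chosen at configure time.

```shell
#!/usr/bin/env bash
# Sketch: remove every "#" from the FCDEBUG line of a configure.wrf-style file.
# Usage: enable_fcdebug configure.wrf
enable_fcdebug() {
    local cfg="${1:-configure.wrf}"
    local tmp
    tmp=$(mktemp) || return 1
    # Only lines beginning with FCDEBUG are changed; all others are copied as-is.
    # Writing back with cat keeps the original file's permissions.
    sed '/^FCDEBUG/ s/#//g' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    local status=$?
    rm -f "$tmp"
    return $status
}
```

Run it after copying configure.wrf.backup to configure.wrf, then check the result with grep '^FCDEBUG' configure.wrf before compiling.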