Comparing UM Files
Note: The 'cumf' tool has been replaced by 'mule-cumf' in the Conda environment, which works very similarly to what is described here.
There is a special tool called 'cumf' for comparing two UM output files. Since UM files contain a timestamp of when the run was performed, using 'diff' to compare two output files won't give any useful information.
To use 'cumf' to compare two UM output files 'FILE1' and 'FILE2':
    module load um
    $UMDIR/vn8.5/normal/utils/cumf FILE1 FILE2
The 'um' module needs to be loaded in order to set up some necessary environment variables.
Once run, 'cumf' will give you a list of files that you can look at to see the differences between the files:
    CUMF successful
    Summary in: /short/w35/saw562/tmp/cumf_summ.saw562.d15202.t155457.1264
    Full output in: /short/w35/saw562/tmp/cumf_full.saw562.d15202.t155457.1264
    Difference maps (if available) in: /short/w35/saw562/tmp/cumf_diff.saw562.d15202.t155457.1264
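If you run 'cumf' from a script, it can be handy to pull the three file paths out of its output. The following is a minimal sketch that assumes the labels appear exactly as in the output above; the function name and the dictionary keys are made up for this example and are not part of 'cumf'.

```python
import re

# Labels copied from the cumf output above; the dictionary keys are names
# chosen for this sketch only.
LABELS = {
    "summary": "Summary in:",
    "full": "Full output in:",
    "diff": "Difference maps (if available) in:",
}


def parse_cumf_paths(output):
    """Map each known label to the path that follows it in the cumf output."""
    paths = {}
    for key, label in LABELS.items():
        # \s* lets the path sit on the same line as the label or on the next.
        match = re.search(re.escape(label) + r"\s*(\S+)", output)
        if match:
            paths[key] = match.group(1)
    return paths


example_output = """CUMF successful
Summary in: /short/w35/saw562/tmp/cumf_summ.saw562.d15202.t155457.1264
Full output in: /short/w35/saw562/tmp/cumf_full.saw562.d15202.t155457.1264
Difference maps (if available) in: /short/w35/saw562/tmp/cumf_diff.saw562.d15202.t155457.1264
"""

paths = parse_cumf_paths(example_output)
print(paths["summary"])
```

A label that is missing from the output is simply left out of the returned dictionary, so callers should check for the key before using it.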
The 'cumf_summ' file gives a summary of the differences:
    COMPARE - SUMMARY MODE
    -----------------------
    Number of fields in file 1 = 1619
    Number of fields in file 2 = 1619
    Number of fields compared = 1619
    FIXED LENGTH HEADER: Number of differences = 3
    INTEGER HEADER: Number of differences = 0
    REAL HEADER: Number of differences = 0
    LEVEL DEPENDENT CONSTANTS: Number of differences = 0
    LOOKUP: Number of differences = 1619
    DATA FIELDS: Number of fields with differences = 0
    files DO NOT compare
You can see at the bottom of this file that the files aren't exactly identical, with 3 differences in the 'FIXED LENGTH HEADER' and 1619 in the 'LOOKUP' section. The most important section here though is the 'DATA FIELDS' section, which contains the model fields. If there are differences in this section then the files are truly different - in this case the field values are the same but there are some differences in the metadata.
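The check described above (ignore metadata differences, look only at 'DATA FIELDS') can be automated. This is a rough sketch that assumes the summary file uses the "SECTION: Number of ... differences = N" layout shown above; the function names are hypothetical.

```python
import re

# Matches lines such as "LOOKUP: Number of differences = 1619" and
# "DATA FIELDS: Number of fields with differences = 0".
SECTION = re.compile(
    r"([A-Z][A-Z ]*?):\s*Number of (?:fields with )?differences\s*=\s*(\d+)"
)


def parse_cumf_summary(text):
    """Return a dict mapping each section name to its difference count."""
    return {name.strip(): int(count) for name, count in SECTION.findall(text)}


def data_fields_match(text):
    """True if the summary reports zero data fields with differences."""
    return parse_cumf_summary(text).get("DATA FIELDS") == 0


example_summary = """COMPARE - SUMMARY MODE
-----------------------
Number of fields in file 1 = 1619
Number of fields in file 2 = 1619
Number of fields compared = 1619
FIXED LENGTH HEADER: Number of differences = 3
INTEGER HEADER: Number of differences = 0
REAL HEADER: Number of differences = 0
LEVEL DEPENDENT CONSTANTS: Number of differences = 0
LOOKUP: Number of differences = 1619
DATA FIELDS: Number of fields with differences = 0
files DO NOT compare
"""

counts = parse_cumf_summary(example_summary)
print(counts)
print(data_fields_match(example_summary))
```

Note that 'data_fields_match' returns False when the 'DATA FIELDS' line is missing, so an unreadable summary is never mistaken for a match.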
You can see more information in the 'cumf_full' file, which lays out the structure of both files and highlights differences.
Fixed Length Header
    FIXED LENGTH HEADER
    -------------------
    Dump format version 20
    UM Version No 703
    Atmospheric data
    Charney-Phillips on radius levels
    Over global domain
    FIELDS file
    Exp No =-32768 Run Id = 0
    Gregorian calendar
    Arakawa C grid
                      Year Month  Day Hour  Min  Sec  DayNo
    Data time     =   2002     1    1    0    0    0      1
    Validity time =   2002     5    1    0   30    0    121
    Creation time =   2015     7   20   14   20    1 -32768
                             Start     1st dim     2nd dim    1st parm    2nd parm
    Integer Consts             257          46                      46
    Real Consts                303          38                      38
    Level Dep Consts           341          39           8          39           8
    Row Dep Consts               0 -1073741824 -1073741824           0           0
    Column Dep Consts            0 -1073741824 -1073741824           0           0
    Fields of Consts             0 -1073741824 -1073741824           0           0
    Extra Consts                 0 -1073741824                       0
    History Block                0 -1073741824                       0
    CFI No 1                     0 -1073741824                       0
    CFI No 2                     0 -1073741824                       0
    CFI No 3                     0 -1073741824                       0
    Lookup Tables              653          64        4096          64        4096
    Model Data              264193   107078676               107078676
    LEVEL DEPENDENT CONSTANTS 312 64-bit words long
    LOOKUP TABLE 262144 64-bit words long
This section includes a timestamp of when the file was created, so it is common for the 'Creation time' value to differ between two runs. It also includes the calendar and grid type. Any values that differ between the two files are listed, e.g.:
    FIXED LENGTH HEADER:
    ITEM = 38 Values = 14 and 18
    ITEM = 39 Values = 20 and 0
    ITEM = 40 Values = 1 and 39
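To decide quickly whether the fixed length header differs only in the creation timestamp, you can parse the 'ITEM' lines. The sketch below assumes that items 35 to 41 hold the creation time (year, month, day, hour, minute, second, day number). That is consistent with this example, where the first file's values for items 38 to 40 (14, 20, 1) match the hour, minute and second of the 'Creation time' shown earlier, but check it against the UM documentation for your version. The function names are made up for illustration.

```python
import re

# Assumption: fixed length header items 35-41 hold the creation time.
CREATION_TIME_ITEMS = range(35, 42)

ITEM_DIFF = re.compile(r"ITEM\s*=\s*(\d+)\s+Values\s*=\s*(-?\d+)\s+and\s+(-?\d+)")


def parse_header_diffs(text):
    """Return a list of (item, value_in_file1, value_in_file2) tuples."""
    return [(int(i), int(a), int(b)) for i, a, b in ITEM_DIFF.findall(text)]


def only_creation_time_differs(text):
    """True if there are differences and all fall in the creation-time items."""
    diffs = parse_header_diffs(text)
    return bool(diffs) and all(item in CREATION_TIME_ITEMS for item, _, _ in diffs)


example_diffs = """FIXED LENGTH HEADER:
ITEM = 38 Values = 14 and 18
ITEM = 39 Values = 20 and 0
ITEM = 40 Values = 1 and 39
"""

header_diffs = parse_header_diffs(example_diffs)
print(header_diffs)
print(only_creation_time_differs(example_diffs))
```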
The integer and real constant tables contain values such as the number of timesteps, the number of land points, and the grid size and spacing. See the UM documentation for the full list of constants. 'cumf' will show them like:
    INTEGER HEADER: OK
    REAL HEADER: OK
    LEVEL DEPENDENT CONSTS: OK
Fields are stored in UM files as a collection of two-dimensional arrays. Three-dimensional fields are stored using a separate 2D array for each level.
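As an illustration of this layout, the sketch below rebuilds a 3D field from per-level 2D slices. It does not read a UM file: the '(stash_code, level, rows)' tuples, the STASH codes and the data values are all invented for the example.

```python
from collections import defaultdict


def stack_levels(fields):
    """Group 2D slices by STASH code and order them by level.

    `fields` is an iterable of (stash_code, level, rows) tuples, where `rows`
    is a 2D list. This tuple layout is invented for the sketch.
    """
    slices = defaultdict(list)
    for stash_code, level, rows in fields:
        slices[stash_code].append((level, rows))
    return {
        code: [rows for _, rows in sorted(items, key=lambda item: item[0])]
        for code, items in slices.items()
    }


# Two levels of one field plus a single-level field, in arbitrary file order.
# All codes and values here are made up.
flat = [
    (436, 2, [[3.0, 4.0]]),
    (24, 1, [[280.0, 281.0]]),
    (436, 1, [[1.0, 2.0]]),
]

stacked = stack_levels(flat)
print(stacked[436])
```

Each value in 'stacked' is indexed as [level][row][column], which is why a comparison tool reports a 3D field as one entry per level.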
The lookup table contains header information for each of the fields in the file, such as the date, level, axes and processing information. 'cumf' will show differences like:
    LOOKUP:
    Header1: 1 Header2: 1 Item: 28 Values: 352331094 352331096
    Header1: 2 Header2: 2 Item: 28 Values: 352331094 352331096
    Header1: 3 Header2: 3 Item: 28 Values: 352331094 352331096
In this case item 28 of each lookup header differs. According to the UM documentation, this item is an encoding of the experiment id (e.g. '>vabcd'), so the two files were generated by two different experiments.
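With more than a thousand lookup entries, the 'LOOKUP' section of 'cumf_full' gets long. A small script can tally which items differ, so you can see at a glance whether only one item (such as item 28 here) is responsible. This is a sketch that assumes the 'Header1: ... Values: ...' layout shown above; the function name is hypothetical.

```python
import re
from collections import Counter

LOOKUP_DIFF = re.compile(
    r"Header1:\s*(\d+)\s+Header2:\s*(\d+)\s+Item:\s*(\d+)"
    r"\s+Values:\s*(-?\d+)\s+(-?\d+)"
)


def count_lookup_diffs(text):
    """Count the reported LOOKUP differences per item number."""
    return Counter(int(match.group(3)) for match in LOOKUP_DIFF.finditer(text))


example_lookup = """LOOKUP:
Header1: 1 Header2: 1 Item: 28 Values: 352331094 352331096
Header1: 2 Header2: 2 Item: 28 Values: 352331094 352331096
Header1: 3 Header2: 3 Item: 28 Values: 352331094 352331096
"""

item_counts = count_lookup_diffs(example_lookup)
print(item_counts)
```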
At the end of the comparison will be a list of the individual fields, looking like:
Field 1617 : Stash Code 436 : Dust division 6 mass mixing ratio : Level 36 OK
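To find the fields that are not reported as 'OK', you can parse these lines. The sketch below assumes every entry follows the layout of the line above, with a status token after the level. The source only shows the 'OK' case, so the sketch treats any other token as a difference rather than guessing what 'cumf' prints for differing fields. The function and key names are made up.

```python
import re

FIELD_LINE = re.compile(
    r"Field\s+(\d+)\s*:\s*Stash Code\s+(\d+)\s*:\s*(.*?)\s*:\s*Level\s+(\d+)\s+(\S+)"
)


def parse_field_lines(text):
    """Return one dict per field entry found in the text."""
    return [
        {
            "field": int(field),
            "stash_code": int(stash_code),
            "name": name,
            "level": int(level),
            "status": status,
        }
        for field, stash_code, name, level, status in FIELD_LINE.findall(text)
    ]


def fields_not_ok(text):
    """Return the entries whose status token is anything other than 'OK'."""
    return [entry for entry in parse_field_lines(text) if entry["status"] != "OK"]


example_fields = (
    "Field 1617 : Stash Code 436 : Dust division 6 mass mixing ratio : Level 36 OK"
)

entries = parse_field_lines(example_fields)
print(entries[0]["name"])
print(fields_not_ok(example_fields))
```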