Comparing UM Files

Revision as of 23:26, 4 December 2019
This page needs updating: update to mule-cumf.

Note: The 'cumf' tool has been replaced with 'mule-cumf' in the Conda environment, which works very similarly to what's described here.


There is a special tool called 'cumf' for comparing two UM output files. Because UM files contain a timestamp of when the run was performed, using 'diff' to compare two output files won't give any useful information: the files will always differ.

To use 'cumf' to compare two UM output files 'FILE1' and 'FILE2':

module load um
$UMDIR/vn8.5/normal/utils/cumf FILE1 FILE2

The 'um' module needs to be loaded in order to set up some necessary environment variables.

Once it has run, 'cumf' prints the paths of the output files that describe the differences between the two inputs:

CUMF successful
Summary in: /short/w35/saw562/tmp/cumf_summ.saw562.d15202.t155457.1264
Full output in: /short/w35/saw562/tmp/cumf_full.saw562.d15202.t155457.1264
Difference maps (if available) in: /short/w35/saw562/tmp/cumf_diff.saw562.d15202.t155457.1264
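
If 'cumf' is being run from a script, the three paths can be picked out of the printed message. The following is a minimal sketch that assumes the message has exactly the "<label> in: <path>" layout shown above; the function name parse_cumf_paths is hypothetical and is not part of 'cumf' itself.

```python
import re

# Matches lines such as "Summary in: /path/to/cumf_summ...." in the message
# that 'cumf' prints on success (layout assumed from the example above).
_PATH_LINE = re.compile(r"^\s*(?P<label>[^:]+?)\s+in:\s*(?P<path>\S+)\s*$")


def parse_cumf_paths(stdout_text):
    """Return a dict mapping each label (e.g. 'Summary') to its file path."""
    paths = {}
    for line in stdout_text.splitlines():
        match = _PATH_LINE.match(line)
        if match:
            paths[match.group("label")] = match.group("path")
    return paths
```

The captured text could be passed in from however the command was launched; lines without "in:" (such as "CUMF successful") are simply skipped.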


The 'cumf_summ' file gives a summary of the differences:


Number of fields in file 1 = 1619
Number of fields in file 2 = 1619
Number of fields compared = 1619

FIXED LENGTH HEADER: Number of differences = 3
INTEGER HEADER: Number of differences = 0
REAL HEADER: Number of differences = 0
LEVEL DEPENDENT CONSTANTS: Number of differences = 0
LOOKUP: Number of differences = 1619
DATA FIELDS: Number of fields with differences = 0
 files DO NOT compare

You can see at the bottom of this file that the files aren't exactly identical, with 3 differences in the 'FIXED LENGTH HEADER' and 1619 in the 'LOOKUP' section. The most important section here though is the 'DATA FIELDS' section, which contains the model fields. If there are differences in this section then the files are truly different - in this case the field values are the same but there are some differences in the metadata.
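
The check described above (ignore metadata differences, look only at 'DATA FIELDS') can be automated. This is a sketch under the assumption that the summary file uses the "SECTION: Number of ... differences = N" lines shown in the example; parse_cumf_summary and data_fields_match are hypothetical helper names.

```python
import re

# One count line per section, e.g.
#   "LOOKUP: Number of differences = 1619"
#   "DATA FIELDS: Number of fields with differences = 0"
_COUNT_LINE = re.compile(
    r"^\s*(?P<section>[A-Z ]+?):\s*"
    r"Number of (?:fields with )?differences\s*=\s*(?P<count>\d+)\s*$"
)


def parse_cumf_summary(text):
    """Return a dict mapping section name to its reported difference count."""
    counts = {}
    for line in text.splitlines():
        match = _COUNT_LINE.match(line)
        if match:
            counts[match.group("section")] = int(match.group("count"))
    return counts


def data_fields_match(counts):
    """True only if a 'DATA FIELDS' count was found and it is zero."""
    return counts.get("DATA FIELDS") == 0
```

A missing 'DATA FIELDS' line is deliberately treated as "not confirmed identical" rather than as a match.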

Full Details

You can see more information in the 'cumf_full' file, which lays out the structure of both files and highlights differences.

Fixed Length Header

 Dump format version 20
 UM Version No 703
 Atmospheric data
 Charney-Phillips on radius levels
 Over global domain
 FIELDS file
 Exp No =-32768 Run Id = 0
 Gregorian calendar
 Arakawa C grid
 Year Month Day Hour Min Sec DayNo
 Data time = 2002 1 1 0 0 0 1
 Validity time = 2002 5 1 0 30 0 121
 Creation time = 2015 7 20 14 20 1 -32768
 Start 1st dim 2nd dim 1st parm 2nd parm
 Integer Consts 257 46 46
 Real Consts 303 38 38
 Level Dep Consts 341 39 8 39 8
 Row Dep Consts 0-1073741824-1073741824 0 0
 Column Dep Consts 0-1073741824-1073741824 0 0
 Fields of Consts 0-1073741824-1073741824 0 0
 Extra Consts 0-1073741824 0
 History Block 0-1073741824 0
 CFI No 1 0-1073741824 0
 CFI No 2 0-1073741824 0
 CFI No 3 0-1073741824 0
 Lookup Tables 653 64 4096 64 4096
 Model Data 264193 107078676 107078676

 312 64-bit words long

 262144 64-bit words long

This section includes a timestamp of when the file was created, so it's common for the 'Creation time' value to differ between two runs. It also includes the calendar and grid type. Any values that differ between the two files will be stated, e.g.

ITEM = 38 Values = 14 and 18
ITEM = 39 Values = 20 and 0
ITEM = 40 Values = 1 and 39


The integer and real constant tables contain values such as the number of timesteps, the number of land points, and the grid size and spacing. See [1] for the full list of constants. 'cumf' will show them like:





Fields are stored in UM files as a collection of two-dimensional arrays. Three-dimensional fields are stored as a separate 2D array for each level.

The lookup table contains header information for each of the fields in the file, such as the date, level, axes and processing information. 'cumf' will show differences like:

Header1:     1 Header2:     1 Item:  28 Values:  352331094 352331096
Header1:     2 Header2:     2 Item:  28 Values:  352331094 352331096
Header1:     3 Header2:     3 Item:  28 Values:  352331094 352331096

In this case item 28 of the lookup table differs for each field. According to the UM documentation [1], this item is an encoding of the experiment id (e.g. '>vabcd'), so here the files were generated by two different experiments.
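
When every field differs in the same lookup item, as above, the full output contains one line per field. Counting the differences per item makes the pattern easier to see. This is a sketch assuming the "Header1: ... Header2: ... Item: ... Values: ..." layout shown above; count_lookup_diffs_by_item is a hypothetical name.

```python
import re
from collections import Counter

# Matches lines such as
#   "Header1:     1 Header2:     1 Item:  28 Values:  352331094 352331096"
_LOOKUP_LINE = re.compile(
    r"^\s*Header1:\s*(\d+)\s+Header2:\s*(\d+)\s+"
    r"Item:\s*(\d+)\s+Values:\s*(\S+)\s+(\S+)\s*$"
)


def count_lookup_diffs_by_item(text):
    """Return a Counter mapping lookup item number to number of differing fields."""
    counts = Counter()
    for line in text.splitlines():
        match = _LOOKUP_LINE.match(line)
        if match:
            counts[int(match.group(3))] += 1
    return counts
```

If only one item number appears, and it is a metadata item such as the experiment id, the difference is unlikely to matter for the model data itself.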


At the end of the comparison is a list of the individual fields, with one entry per 2D field (so one entry per level), looking like:

Field  1617 : Stash Code   436 : Dust division 6  mass mixing ratio   : Level   36
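
Because each level of a 3D field is listed as its own 2D field, it can be useful to group the entries by STASH code to see which levels belong together. The sketch below assumes the "Field n : Stash Code n : name : Level n" layout shown above and that the level is always printed as an integer; levels_by_stash is a hypothetical name.

```python
import re
from collections import defaultdict

# Matches lines such as
#   "Field  1617 : Stash Code   436 : Dust division 6  mass mixing ratio   : Level   36"
_FIELD_LINE = re.compile(
    r"^\s*Field\s+(\d+)\s*:\s*Stash Code\s+(\d+)\s*:\s*(.*?)\s*:\s*Level\s+(\d+)\s*$"
)


def levels_by_stash(text):
    """Return a dict mapping STASH code to the list of levels listed for it."""
    levels = defaultdict(list)
    for line in text.splitlines():
        match = _FIELD_LINE.match(line)
        if match:
            stash_code = int(match.group(2))
            levels[stash_code].append(int(match.group(4)))
    return dict(levels)
```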



[1] UMDP F03 File Formats