Comparing UM Files
Note: The 'cumf' tool has been replaced with 'mule-cumf' in the Conda environment, which works very similarly to what is described here.
---
There is a special tool called 'cumf' for comparing two UM output files. Because UM files contain a timestamp of when the run was performed, using 'diff' to compare two output files won't give any useful information: the files will differ even when the model fields are identical.
To use 'cumf' to compare two UM output files 'FILE1' and 'FILE2':
module load um
$UMDIR/vn8.5/normal/utils/cumf FILE1 FILE2
The 'um' module needs to be loaded in order to set up some necessary environment variables.
Once run, 'cumf' will give you a list of report files that you can look at to see the differences between the two input files:
CUMF successful
Summary in: /short/w35/saw562/tmp/cumf_summ.saw562.d15202.t155457.1264
Full output in: /short/w35/saw562/tmp/cumf_full.saw562.d15202.t155457.1264
Difference maps (if available) in: /short/w35/saw562/tmp/cumf_diff.saw562.d15202.t155457.1264
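When 'cumf' is run from a script, the report locations can be picked out of its output. The following is a minimal sketch, assuming the locations are printed on standard output in exactly the form shown above; the helper name 'cumf_report_path' is hypothetical and not part of the UM utilities.

```shell
# Hypothetical helper: read cumf's output on standard input and print the
# path that follows the given label (e.g. "Summary in").
cumf_report_path() {
    label="$1"
    # Drop the "<label>: " prefix from the matching line and print the rest.
    sed -n "s|^ *${label}: *||p"
}

# Example using the output shown above, stored in a variable.
sample_output='CUMF successful
Summary in: /short/w35/saw562/tmp/cumf_summ.saw562.d15202.t155457.1264
Full output in: /short/w35/saw562/tmp/cumf_full.saw562.d15202.t155457.1264'

summary_path=$(printf '%s\n' "$sample_output" | cumf_report_path "Summary in")
echo "$summary_path"
```

In real use the helper would sit at the end of a pipe from the 'cumf' command itself. Whether 'cumf' writes these lines to standard output or standard error is an assumption to check on your system.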
Summary
The 'cumf_summ' file gives a summary of the differences:
COMPARE - SUMMARY MODE
-----------------------
Number of fields in file 1 = 1619
Number of fields in file 2 = 1619
Number of fields compared = 1619
FIXED LENGTH HEADER: Number of differences = 3
INTEGER HEADER: Number of differences = 0
REAL HEADER: Number of differences = 0
LEVEL DEPENDENT CONSTANTS: Number of differences = 0
LOOKUP: Number of differences = 1619
DATA FIELDS: Number of fields with differences = 0
files DO NOT compare
The last line of this file shows that the files aren't exactly identical: there are 3 differences in the 'FIXED LENGTH HEADER' section and 1619 in the 'LOOKUP' section. The most important section, though, is 'DATA FIELDS', which contains the model fields. If there are differences in this section then the files are truly different. In this case the field values are the same, but there are some differences in the metadata.
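Since the 'DATA FIELDS' line is the one that matters, a script can test it directly. This is a rough sketch, assuming the summary file uses the 'DATA FIELDS: Number of fields with differences = N' layout shown above; the function name 'cumf_data_diff_count' is hypothetical.

```shell
# Hypothetical helper: print the number of data fields with differences
# reported in a cumf summary file (path given as the first argument).
cumf_data_diff_count() {
    # Split on "=", strip whitespace from the value, stop at the first match.
    awk -F'=' '/^[ \t]*DATA FIELDS:/ { gsub(/[ \t]/, "", $2); print $2; exit }' "$1"
}

# Example: a cut-down copy of the summary above, written to a temporary file.
summ=$(mktemp)
cat > "$summ" <<'EOF'
FIXED LENGTH HEADER: Number of differences = 3
LOOKUP: Number of differences = 1619
DATA FIELDS: Number of fields with differences = 0
files DO NOT compare
EOF

count=$(cumf_data_diff_count "$summ")
if [ "$count" -eq 0 ]; then
    echo "Field values match; only metadata differs"
else
    echo "$count data fields differ"
fi
rm -f "$summ"
```

Testing the count rather than the final 'files DO NOT compare' line avoids flagging runs that differ only in timestamps or experiment ids.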
Full Details
You can see more information in the 'cumf_full' file, which lays out the structure of both files and highlights differences.
Fixed Length Header
FIXED LENGTH HEADER
-------------------
Dump format version 20
UM Version No 703
Atmospheric data
Charney-Phillips on radius levels
Over global domain
FIELDS file
Exp No =-32768 Run Id = 0
Gregorian calendar
Arakawa C grid
Year Month Day Hour Min Sec DayNo
Data time = 2002 1 1 0 0 0 1
Validity time = 2002 5 1 0 30 0 121
Creation time = 2015 7 20 14 20 1 -32768
Start 1st dim 2nd dim 1st parm 2nd parm
Integer Consts 257 46 46
Real Consts 303 38 38
Level Dep Consts 341 39 8 39 8
Row Dep Consts 0-1073741824-1073741824 0 0
Column Dep Consts 0-1073741824-1073741824 0 0
Fields of Consts 0-1073741824-1073741824 0 0
Extra Consts 0-1073741824 0
History Block 0-1073741824 0
CFI No 1 0-1073741824 0
CFI No 2 0-1073741824 0
CFI No 3 0-1073741824 0
Lookup Tables 653 64 4096 64 4096
Model Data 264193 107078676 107078676
LEVEL DEPENDENT CONSTANTS
312 64-bit words long
LOOKUP TABLE
262144 64-bit words long
This section includes a timestamp of when the file was created, so the 'Creation time' value will commonly differ between two runs. It also includes the calendar and grid type. Any values that differ between the two files will be stated, e.g.
FIXED LENGTH HEADER:
ITEM = 38 Values = 14 and 18
ITEM = 39 Values = 20 and 0
ITEM = 40 Values = 1 and 39
Constants
The integer and real constant tables contain values such as the number of timesteps, the number of land points, and the grid size and spacing. See [1] for the full list of constants. 'cumf' will show them like:
INTEGER HEADER:
OK
REAL HEADER:
OK
LEVEL DEPENDENT CONSTS:
OK
Lookup
Fields are stored in UM files as a collection of two-dimensional arrays. Three-dimensional fields are stored using a separate 2D array for each level.
The lookup table contains header information for each of the fields in the file, such as the date, level, axes and processing information. 'cumf' will show differences like:
LOOKUP:
Header1: 1 Header2: 1 Item: 28 Values: 352331094 352331096
Header1: 2 Header2: 2 Item: 28 Values: 352331094 352331096
Header1: 3 Header2: 3 Item: 28 Values: 352331094 352331096
In this case item 28 of each lookup header is different. According to the UM documentation [1], this item is an encoding of the experiment id (e.g. '>vabcd'), so the files were generated by two different experiments.
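With one line per field, the 'LOOKUP' section can run to thousands of lines, so it helps to collapse it into a count per item number. The sketch below assumes the 'Header1: ... Item: N Values: ...' layout shown above, with whitespace between each label and its value; 'cumf_lookup_items' is a hypothetical name.

```shell
# Hypothetical helper: for a cumf full-output file, print each lookup item
# number that differs, followed by the number of headers in which it differs.
cumf_lookup_items() {
    awk '/^[ \t]*Header1:/ {
             # The item number is the word following the "Item:" label.
             for (i = 1; i < NF; i++)
                 if ($i == "Item:") count[$(i + 1)]++
         }
         END { for (item in count) print item, count[item] }' "$1" | sort -n
}

# Example using the three LOOKUP lines shown above.
full=$(mktemp)
cat > "$full" <<'EOF'
LOOKUP:
Header1: 1 Header2: 1 Item: 28 Values: 352331094 352331096
Header1: 2 Header2: 2 Item: 28 Values: 352331094 352331096
Header1: 3 Header2: 3 Item: 28 Values: 352331094 352331096
EOF

lookup_summary=$(cumf_lookup_items "$full")
echo "$lookup_summary"   # prints: 28 3
rm -f "$full"
```

If only item 28 appears, and its count equals the number of fields, the lookup differences are all explained by the experiment id.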
Fields
The comparison ends with a list of the individual fields, each looking like:
Field 1617 : Stash Code 436 : Dust division 6 mass mixing ratio : Level 36
OK