Difference between revisions of "Comparing UM Files"

 

Latest revision as of 23:28, 11 December 2019

This page needs updating: update to mule-cumf.

Note: The 'cumf' tool has been replaced with 'mule-cumf' in the Conda environment, which works very similarly to what's described here.

---

There is a special tool called 'cumf' for comparing two UM output files. Because UM files contain a timestamp of when the run was performed, using 'diff' to compare two output files won't give any useful information.

To use 'cumf' to compare two UM output files 'FILE1' and 'FILE2':

module load um
$UMDIR/vn8.5/normal/utils/cumf FILE1 FILE2

The 'um' module needs to be loaded in order to set up some necessary environment variables.

Once run, 'cumf' will give you a list of output files that you can look at to see the differences between the two input files:

CUMF successful
Summary in: /short/w35/saw562/tmp/cumf_summ.saw562.d15202.t155457.1264
Full output in: /short/w35/saw562/tmp/cumf_full.saw562.d15202.t155457.1264
Difference maps (if available) in: /short/w35/saw562/tmp/cumf_diff.saw562.d15202.t155457.1264
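
If you want to script the comparison, the three paths can be pulled out of that output. This is a minimal sketch, assuming the output always uses the "label in: path" layout shown above; `parse_cumf_paths` is a hypothetical helper written for this page, not part of cumf.

```python
def parse_cumf_paths(stdout_text):
    """Map each label printed by cumf to the path that follows ' in: '."""
    paths = {}
    for line in stdout_text.splitlines():
        # Lines of interest look like "Summary in: /some/path"
        label, sep, path = line.partition(" in: ")
        if sep:
            paths[label.strip()] = path.strip()
    return paths


# Sample text copied from the output shown above
example_output = """CUMF successful
Summary in: /short/w35/saw562/tmp/cumf_summ.saw562.d15202.t155457.1264
Full output in: /short/w35/saw562/tmp/cumf_full.saw562.d15202.t155457.1264
Difference maps (if available) in: /short/w35/saw562/tmp/cumf_diff.saw562.d15202.t155457.1264
"""

cumf_paths = parse_cumf_paths(example_output)
print(cumf_paths["Summary"])
```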

Summary

The 'cumf_summ' file gives a summary of the differences:

 COMPARE - SUMMARY MODE
 -----------------------

Number of fields in file 1 = 1619
Number of fields in file 2 = 1619
Number of fields compared = 1619

FIXED LENGTH HEADER: Number of differences = 3
INTEGER HEADER: Number of differences = 0
REAL HEADER: Number of differences = 0
LEVEL DEPENDENT CONSTANTS: Number of differences = 0
LOOKUP: Number of differences = 1619
DATA FIELDS: Number of fields with differences = 0
 files DO NOT compare

You can see at the bottom of this file that the files aren't exactly identical, with 3 differences in the 'FIXED LENGTH HEADER' section and 1619 in the 'LOOKUP' section. The most important section, though, is 'DATA FIELDS', which contains the model fields. If there are differences in this section then the files are truly different. In this case the field values are the same, but there are some differences in the metadata.
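
The check described above (metadata may differ, but the data fields must not) can be automated by reading the counts out of the summary. The following is a sketch only, assuming the summary keeps the "SECTION: Number of ... = N" layout shown above; `parse_summary` is a hypothetical helper name.

```python
import re

# Matches e.g. "LOOKUP: Number of differences = 1619"
SUMMARY_LINE = re.compile(r"^\s*([A-Z ]+):\s+Number of .*?=\s*(\d+)\s*$")


def parse_summary(text):
    """Return {section name: difference count} from a cumf summary."""
    counts = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line)
        if match:
            counts[match.group(1).strip()] = int(match.group(2))
    return counts


# Sample text copied from the summary shown above
summary_text = """Number of fields in file 1 = 1619
Number of fields in file 2 = 1619
Number of fields compared = 1619

FIXED LENGTH HEADER: Number of differences = 3
INTEGER HEADER: Number of differences = 0
REAL HEADER: Number of differences = 0
LEVEL DEPENDENT CONSTANTS: Number of differences = 0
LOOKUP: Number of differences = 1619
DATA FIELDS: Number of fields with differences = 0
 files DO NOT compare
"""

counts = parse_summary(summary_text)
# The model data agree when no field is reported as different
data_fields_match = counts.get("DATA FIELDS") == 0
print(counts)
print("Data fields match:", data_fields_match)
```

Testing only the 'DATA FIELDS' count, rather than the final "files DO NOT compare" line, lets a script accept files that differ only in metadata such as the creation time.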

Full Details

You can see more information in the 'cumf_full' file, which lays out the structure of both files and highlights differences.

Fixed Length Header

 FIXED LENGTH HEADER
 -------------------
 Dump format version 20
 UM Version No 703
 Atmospheric data
 Charney-Phillips on radius levels
 Over global domain
 FIELDS file
 Exp No =-32768 Run Id = 0
 Gregorian calendar
 Arakawa C grid
 Year Month Day Hour Min Sec DayNo
 Data time = 2002 1 1 0 0 0 1
 Validity time = 2002 5 1 0 30 0 121
 Creation time = 2015 7 20 14 20 1 -32768
 Start 1st dim 2nd dim 1st parm 2nd parm
 Integer Consts 257 46 46
 Real Consts 303 38 38
 Level Dep Consts 341 39 8 39 8
 Row Dep Consts 0-1073741824-1073741824 0 0
 Column Dep Consts 0-1073741824-1073741824 0 0
 Fields of Consts 0-1073741824-1073741824 0 0
 Extra Consts 0-1073741824 0
 History Block 0-1073741824 0
 CFI No 1 0-1073741824 0
 CFI No 2 0-1073741824 0
 CFI No 3 0-1073741824 0
 Lookup Tables 653 64 4096 64 4096
 Model Data 264193 107078676 107078676

 LEVEL DEPENDENT CONSTANTS
 312 64-bit words long

 LOOKUP TABLE
 262144 64-bit words long

This section includes a timestamp of when the file was created, so the 'Creation time' value will commonly differ between two runs. It also includes the calendar and grid type. Any values that differ between the two files are listed, e.g.

 FIXED LENGTH HEADER:
ITEM = 38 Values = 14 and 18
ITEM = 39 Values = 20 and 0
ITEM = 40 Values = 1 and 39
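
The first-file values here (14, 20 and 1) match the hour, minute and second of the 'Creation time' in the header listing above, which suggests that items 38 to 40 hold those parts of the creation timestamp; check [1] to confirm. As an illustrative sketch, assuming the "ITEM = n Values = a and b" layout shown above, the lines can be collected into a dictionary for inspection. `parse_header_diffs` is a hypothetical helper, not part of cumf.

```python
import re

# Matches e.g. "ITEM = 38 Values = 14 and 18"
ITEM_LINE = re.compile(r"ITEM\s*=\s*(\d+)\s+Values\s*=\s*(-?\d+)\s+and\s+(-?\d+)")


def parse_header_diffs(text):
    """Return {item number: (value in file 1, value in file 2)}."""
    diffs = {}
    for line in text.splitlines():
        match = ITEM_LINE.search(line)
        if match:
            item, first, second = (int(group) for group in match.groups())
            diffs[item] = (first, second)
    return diffs


# Sample text copied from the listing shown above
header_text = """ FIXED LENGTH HEADER:
ITEM = 38 Values = 14 and 18
ITEM = 39 Values = 20 and 0
ITEM = 40 Values = 1 and 39
"""

header_diffs = parse_header_diffs(header_text)
print(header_diffs)
```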

Constants

The integer and real constant tables contain values such as the number of timesteps, the number of land points, and the grid size and spacing. See [1] for the full list of constants. 'cumf' will show them like:

 INTEGER HEADER:
 OK

 REAL HEADER:
 OK

 LEVEL DEPENDENT CONSTS:
 OK

Lookup

Fields are stored in UM files as a collection of two-dimensional arrays. Three-dimensional fields are stored using a separate two-dimensional array for each level.

The lookup table contains header information for each of the fields in the file, such as the date, level, axes and processing information. 'cumf' will show differences like:

 LOOKUP:
Header1:     1 Header2:     1 Item:  28 Values:  352331094 352331096
Header1:     2 Header2:     2 Item:  28 Values:  352331094 352331096
Header1:     3 Header2:     3 Item:  28 Values:  352331094 352331096

In this case item 28 of each lookup header differs. According to the UM documentation [1], this item holds an encoding of the experiment id (e.g. 'vabcd'), so the two files were generated by different experiments.
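
With one line per differing header, this section can run to thousands of lines, so it helps to tally the differences by item number and see whether a single item accounts for all of them. This is a rough sketch under the assumption that every line follows the "Header1: ... Header2: ... Item: ... Values: ..." layout shown above; `count_lookup_diffs_by_item` is a hypothetical helper.

```python
import re
from collections import Counter

# Matches e.g. "Header1:     1 Header2:     1 Item:  28 Values:  352331094 352331096"
LOOKUP_LINE = re.compile(
    r"Header1:\s*(\d+)\s+Header2:\s*(\d+)\s+Item:\s*(\d+)\s+Values:\s*(-?\d+)\s+(-?\d+)"
)


def count_lookup_diffs_by_item(text):
    """Count how many lookup headers differ for each item number."""
    tally = Counter()
    for line in text.splitlines():
        match = LOOKUP_LINE.search(line)
        if match:
            tally[int(match.group(3))] += 1
    return tally


# Sample text copied from the listing shown above
lookup_text = """ LOOKUP:
Header1:     1 Header2:     1 Item:  28 Values:  352331094 352331096
Header1:     2 Header2:     2 Item:  28 Values:  352331094 352331096
Header1:     3 Header2:     3 Item:  28 Values:  352331094 352331096
"""

lookup_tally = count_lookup_diffs_by_item(lookup_text)
print(lookup_tally)
```

If the tally contains only item 28, the lookup differences come from the experiment id alone.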

Fields

The comparison ends with a list of the individual fields, looking like:

Field  1617 : Stash Code   436 : Dust division 6  mass mixing ratio   : Level   36

 OK

References

[1] UMDP F03 File Formats, https://code.metoffice.gov.uk/doc/um/latest/papers/umdp_F03.pdf