Modifying UM Files

Latest revision as of 23:38, 11 December 2019

You can manually modify UM files using the Mule python library.

Mule is included in the 'analysis27' conda environment, which you can load with

<syntaxhighlight lang=bash>
module use /g/data3/hh5/public/modules
module load conda/analysis27
</syntaxhighlight>

== Reading and writing files ==

There are several sub-types of UM file. The ones you'll come across most frequently are 'fields files', which are used for the model input and output, and 'ancil files', which are used for external data sources like emissions.

Fields files are read using `mule.ff.FieldsFile.from_file()`, ancil files are read using `mule.ancil.AncilFile.from_file()`.

Once you've opened a file, you can write it to disk again using `.to_file()`:

<syntaxhighlight lang=python>
import mule

filename_in = 'ab123.astart'
data = mule.ff.FieldsFile.from_file(filename_in)

filename_out = 'ab123.astart.modified'
data.to_file(filename_out)
</syntaxhighlight>

== Accessing Fields ==

The fields within the file can be accessed using the `.fields` property, which returns a list of all the fields in the file. Each field in the file is a 2d slice of a model variable, almost always a horizontal slice at a single level and time (ozone is an exception; it is often stored as a zonal average).

Each field has its LOOKUP header values available; see UMDP F03 for the details of what they all mean. As an example, `lbuser4` is the STASH code of the field.
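For quick orientation, here is a non-exhaustive summary of some commonly used LOOKUP entries. These names match the attributes Mule exposes on each field, but UMDP F03 remains the authoritative reference for their exact meanings:

```python
# A few commonly used LOOKUP header entries and their meanings,
# summarised from UMDP F03 (non-exhaustive).
LOOKUP_ENTRIES = {
    'lbyr':    'validity time: year',
    'lbmon':   'validity time: month',
    'lbdat':   'validity time: day',
    'lbhr':    'validity time: hour',
    'lbmin':   'validity time: minute',
    'lbft':    'forecast period (hours)',
    'lblev':   'level code',
    'lbproc':  'processing code (e.g. time mean)',
    'lbuser4': 'STASH code',
}

for name, meaning in LOOKUP_ENTRIES.items():
    print("%-8s %s" % (name, meaning))
```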

Fields are loaded lazily: only the headers are read when you open a file. You can get the values in a field as a numpy array using `.get_data()`:

<syntaxhighlight lang=python>
import mule

filename_in = 'ab123.astart'
data = mule.ff.FieldsFile.from_file(filename_in)

for field in data.fields:
    print("STASH code: %d" % field.lbuser4)
</syntaxhighlight>
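Since `.get_data()` returns an ordinary 2-D numpy array, the usual numpy tools apply once you have it. A minimal illustration in pure numpy (the array here is synthetic, standing in for a field's data):

```python
import numpy as np

# Stand-in for the 2-D array that field.get_data() would return
# (rows and columns correspond to the horizontal grid).
field_data = np.arange(12.0).reshape(3, 4)

# Typical quick-look statistics on a single field
print("shape:", field_data.shape)
print("min: %g  max: %g  mean: %g"
      % (field_data.min(), field_data.max(), field_data.mean()))
```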

== Modifying Fields ==

See also: Examples in the Mule docs

Fields in Mule get modified using 'operators'. Operators act on the fields as they are getting written - so rather than reading the whole file into memory, modifying all the fields and then writing back to disk, Mule will read one field at a time, apply any operators, then write out that field before moving on to the next. This dramatically reduces memory usage when working with large files.
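The streaming idea itself is independent of Mule. A minimal sketch in plain Python, with made-up `read_fields`/`write_field`/`double` stand-ins, of reading, transforming and writing one field at a time:

```python
def read_fields():
    # Stand-in for iterating over a file's fields one at a time
    # (each "field" here is just a list of numbers).
    for i in range(3):
        yield [float(i)] * 4

def write_field(out, field):
    # Stand-in for writing a single field to the output file.
    out.append(field)

def double(field):
    # Stand-in for an operator: transform one field's data.
    return [v * 2.0 for v in field]

output = []
for field in read_fields():
    # Only one field is in memory at a time: read, transform, write.
    write_field(output, double(field))

print(output)
```

Because each field is produced, transformed, and written before the next one is read, peak memory use stays at roughly one field rather than the whole file.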

An operator looks like this:

<syntaxhighlight lang=python>
import mule
import numpy as np

class SetConstantOperator(mule.DataOperator):
    """Set all field values to a constant"""
    def __init__(self, constant):
        self.constant = constant
    def new_field(self, source_field):
        """Creates the new field object"""
        return source_field.copy()
    def transform(self, source_field, new_field):
        """Performs the data manipulation"""
        data = source_field.get_data()
        # Multiply by 0 to keep the array shape
        return data * 0.0 + self.constant

# Sets all values in the field to '50'
setconst_50 = SetConstantOperator(50)

ff = mule.FieldsFile.from_file("InputFile")

for ifield, field in enumerate(ff.fields):
    # Set the surface temperature to 50
    if field.lbuser4 == 24:
        ff.fields[ifield] = setconst_50(field)

    # All other fields keep their original values

# The operators are applied as the fields are written out
ff.to_file("OutputFile")
</syntaxhighlight>
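The `data * 0.0 + self.constant` line in `transform` is just a shape-preserving way of filling an array with a constant. The same effect in plain numpy:

```python
import numpy as np

# Stand-in for the array returned by field.get_data()
data = np.arange(6.0).reshape(2, 3)

# Multiplying by zero then adding a constant keeps the original shape
filled = data * 0.0 + 50

print(filled.shape)  # same shape as the input
print(filled)
```

`np.full_like(data, 50)` is an equivalent, more direct idiom.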