Interpolating NetCDF files to different grids

Revision as of 20:16, 1 July 2019 by S.wales

To use files from a reanalysis dataset as input to a model, the input data must first be converted to the model's resolution.

Interpolating to a new resolution is a conceptually simple operation: it can be thought of as a sparse matrix multiplication s W = t, where the weight matrix W says how each point in the source grid s is mapped to the target grid t. A number of libraries are available both to calculate the weight matrix between different grids and to perform the matrix multiplication. The Oasis coupler uses the MCT library to convert between grids in coupled models. The Earth System Modelling Framework (ESMF) also has functions for interpolation, as well as a command-line tool to calculate weights offline, which we'll cover here.
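As a toy illustration of the weighted-sum idea above, the following sketch applies a hand-written set of weights to a two-point source grid using awk. The three-column text layout (target index, source index, weight) and the function name apply_weights are assumptions made for this example only; this is not the NetCDF layout that ESMF writes.

```shell
#!/bin/bash
# Toy sparse interpolation: every target value is a weighted sum of source values.
# Hypothetical inputs for illustration:
#   $1: text file of source values, one per line (point 1, 2, ...)
#   $2: text file of "target_index source_index weight" triplets (1-based)
apply_weights() {
    awk '
        NR == FNR { src[FNR] = $1; next }     # first file: load the source values
        {
            dst[$1] += $3 * src[$2]           # accumulate weight * source value
            if ($1 + 0 > n) n = $1 + 0        # remember the largest target index
        }
        END { for (i = 1; i <= n; i++) printf "%.2f\n", dst[i] }
    ' "$1" "$2"
}

# Two source points mapped onto three target points; the middle target
# point is the average of the two source points.
workdir=$(mktemp -d)
printf '10\n20\n' > "$workdir/source.txt"
printf '1 1 1.0\n2 1 0.5\n2 2 0.5\n3 2 1.0\n' > "$workdir/weights.txt"
apply_weights "$workdir/source.txt" "$workdir/weights.txt"
rm -r "$workdir"
```

Only the non-zero weights are stored, which is what makes the matrix sparse: here four triplets describe a mapping from two source points to three target points, and the script prints 10.00, 15.00 and 20.00.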

Doing the weight generation offline means you can re-use the weights matrix, which is handy if you have a lot of files to process. Weight generation is also the most resource-intensive part of the process.

Creating a weights file with ESMF_RegridWeightGen

The ESMF_RegridWeightGen program reads grid information directly from a CF-NetCDF file to create the regridding weights. You can also define masks for both source and target grids using the missing values from a field in each file.

Creating the weights can use quite a lot of resources for high-resolution files, so it is best to submit this to the queue. The following resources worked for interpolating a 7600x3600 source data set to 1536x1152 (UM n768):

#!/bin/bash
#PBS -l ncpus=16
#PBS -l mem=64gb
#PBS -l walltime=30:00
#PBS -l wd

module load esmf openmpi
ulimit -s unlimited
mpirun ESMF_RegridWeightGen -t GRIDSPEC -m neareststod \
    -s SOURCE.nc --src_missingvalue SOURCE_FIELD \
    -d TARGET.nc --dst_missingvalue TARGET_FIELD \
    -w weights.nc --netcdf4

Useful Flags

  • -t: Defines the type of the grid data. Possible options are:
    • GRIDSPEC: Read the grid from a CF-NetCDF file
    • SCRIP: Read the grid in SCRIP format
  • -m: Defines the regridding method. Possible options are:
    • bilinear: Bilinear interpolation
    • patch: Higher-order interpolation
    • neareststod: Map each target point to the closest point in the source grid (only a single source point will map to a given target point)
    • nearestdtos: Map each source point to the closest point in the target grid (multiple source points can map to a single target point)
    • conserve: Conservative remapping (requires extra boundary information to calculate the grid cell areas, see documentation)
  • -s SOURCE.nc: Sample file that you'd like to regrid
  • --src_missingvalue SOURCE_FIELD: A field from SOURCE.nc which defines the mask of the source grid (it should have either the _FillValue or missing_value attribute set)
  • -d TARGET.nc: Sample file with the target grid
  • --dst_missingvalue TARGET_FIELD: A field from TARGET.nc which defines the mask of the target grid (it should have either the _FillValue or missing_value attribute set)
  • -w weights.nc: Output file for the regridding weights
  • --netcdf4: Output in NetCDF4 format

For a full list of flags, as well as details on the source grid formats, see the ESMF_RegridWeightGen documentation.

Regridding Masked Fields

If either your source or target grid needs to be masked, some of the points in the target grid may not be present in the source grid, for instance if lakes are present in one file but not the other. In this case ESMF_RegridWeightGen will error out, printing a message to its log file. To avoid this you can either change to the neareststod interpolation method, or ignore the unmapped points error with the --ignore_unmapped flag (short form -i) and then add the missing points back in after interpolating your files.
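A minimal sketch of the second option follows, assuming the flag is spelled --ignore_unmapped as in the ESMF documentation. The wrapper function gen_weights_ignore_unmapped, the RUN variable and the choice of the bilinear method are illustrative assumptions, not part of ESMF; setting RUN=echo prints the command instead of running it.

```shell
#!/bin/bash
# Hypothetical wrapper around the weight generation command shown earlier.
# RUN is empty by default (the command is executed); RUN=echo gives a dry run.
RUN=${RUN:-}

gen_weights_ignore_unmapped() {
    local src=$1 src_field=$2 dst=$3 dst_field=$4 weights=$5
    # --ignore_unmapped: do not stop when a target point has no source point;
    # such points receive no weights and must be filled in afterwards.
    $RUN mpirun ESMF_RegridWeightGen -t GRIDSPEC -m bilinear \
        -s "$src" --src_missingvalue "$src_field" \
        -d "$dst" --dst_missingvalue "$dst_field" \
        -w "$weights" --netcdf4 --ignore_unmapped
}
```

Inside a PBS job like the one above this could be called as gen_weights_ignore_unmapped SOURCE.nc SOURCE_FIELD TARGET.nc TARGET_FIELD weights.nc.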

Regridding files with ncks

Once you have the weights file it's simple to convert data to the target grid using ncks:

module load nco
ncks --map weights.nc input.nc output.nc
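Because the weights file can be re-used, regridding a whole directory of files is just a loop around the ncks command above. The sketch below assumes all input files share the source grid used to generate the weights; the function name regrid_all, the NCKS variable and the directory names are hypothetical.

```shell
#!/bin/bash
# Regrid every NetCDF file in a directory with one pre-computed weights file.
# NCKS can be overridden (for example NCKS=echo) to print the commands only.
NCKS=${NCKS:-ncks}

regrid_all() {
    local weights=$1 indir=$2 outdir=$3
    local infile
    mkdir -p "$outdir"
    for infile in "$indir"/*.nc; do
        [ -e "$infile" ] || continue    # skip if the glob matched no files
        # Same invocation as above, writing to a file of the same name in outdir
        $NCKS --map "$weights" "$infile" "$outdir/$(basename "$infile")"
    done
}
```

For example, after module load nco, calling regrid_all weights.nc input_dir regridded_dir would write one regridded file per input file.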