Difference between revisions of "CodeBreak 29/9/2021"


Latest revision as of 16:21, 6 October 2021

Working with coordinates

When you pull data from different files, coordinates that appear to be the same can in fact differ slightly because of floating-point representation. When doing calculations with two DataArrays, xarray uses the coordinates to find data that is co-located. If the coordinates differ even slightly, xarray treats the data as not co-located, and will likely either silently drop the unmatched points or raise an indexing error. If you know the coordinates should be the same, the simplest fix is to assign the coordinates of one DataArray to the other so that xarray considers them identical. We've put a simple example on this blog (https://climate-cms.org/2021/10/01/different_coordinates.html) to illustrate the issue and how to solve it.
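The behaviour described above can be reproduced with a small, made-up example. This is only a sketch: the arrays, the `lat` dimension name and the values are invented for illustration, and it assumes you have already checked that the two coordinate sets are meant to be the same.

```python
import numpy as np
import xarray as xr

# Two latitude axes that print the same but differ in the last bit:
# 0.1 + 0.2 is not exactly 0.3 in floating point.
lat_a = np.array([0.1, 0.2, 0.3])
lat_b = np.array([0.1, 0.2, 0.1 + 0.2])

a = xr.DataArray([1.0, 2.0, 3.0], dims="lat", coords={"lat": lat_a})
b = xr.DataArray([10.0, 20.0, 30.0], dims="lat", coords={"lat": lat_b})

# xarray aligns on exact coordinate values, so the third point is dropped.
mismatched = a + b
print(mismatched.sizes["lat"])

# Only overwrite coordinates once you know they agree to within tolerance.
if np.allclose(a["lat"].values, b["lat"].values):
    b_fixed = b.assign_coords(lat=a["lat"])
    total = a + b_fixed
    print(total.sizes["lat"])
```

Checking with `np.allclose` first guards against silently relabelling data that really is on a different grid.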

Incompatible datetime and cftime error

When working with COSIMA Cookbook data, trying to add uniform coordinates threw an error because the time coordinates were not of the same type. The solution is to add the argument use_cftime=True to the getvar function. This forces xarray to always use the more general cftime type for time axes, so the time axes always have compatible types.

Background: The standard numpy datetime type only supports a subset of the time axes commonly found in earth system data files, particularly model output data. The xarray library defaults to the numpy datetime type because it is more compatible with other Python packages (e.g. pandas) and so potentially offers more functionality. See the xarray docs for more information (http://xarray.pydata.org/en/stable/user-guide/time-series.html).

Reprojecting data from metres to lat/lon in python

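Had some trouble converting the BedMachine Antarctic topography/bathymetry map (https://nsidc.org/data/NSIDC-0756/versions/2) from metres to lat/lon in Python. See this notebook (https://climate-cms.org/2021/10/01/pyproj.html) for an example using the pyproj library (https://pyproj4.github.io/pyproj/stable/). Another possibility: there are some scripts for performing coordinate transforms on this data (https://github.com/nsidc/nsidc0756-scripts).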
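As a rough sketch of the pyproj approach (not a copy of the linked notebook), the snippet below converts a few points from projected metres to longitude/latitude. It assumes the data is on the Antarctic Polar Stereographic grid, EPSG:3031; check the projection stated in the file's own metadata before relying on that. The sample points are invented.

```python
import numpy as np
from pyproj import Transformer

# Assumed source projection: EPSG:3031 (Antarctic Polar Stereographic).
# Target: EPSG:4326 (geographic lon/lat). always_xy=True keeps (x, y) -> (lon, lat) order.
transformer = Transformer.from_crs("EPSG:3031", "EPSG:4326", always_xy=True)

# Made-up sample points in metres: the pole, 1000 km along +x, 1000 km along +y.
x = np.array([0.0, 1.0e6, 0.0])
y = np.array([0.0, 0.0, 1.0e6])

lon, lat = transformer.transform(x, y)
for xi, yi, lo, la in zip(x, y, lon, lat):
    print(f"x={xi:.0f} m, y={yi:.0f} m -> lon={lo:.3f}, lat={la:.3f}")
```

For a full grid, the 1-D x and y axes can be expanded with `np.meshgrid` and passed to the same `transform` call to get 2-D longitude and latitude arrays.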

Calculate daily means of ERA5 pressure-level data using python

ERA5 is a big dataset: the spatial and temporal resolutions are 0.25 x 0.25 degrees and 1 hour respectively, on 37 pressure levels. The full timeseries for one pressure-level variable is ~17 TB. So while calculating a daily mean using xarray is quite straightforward, we need to handle the data size to manage the memory usage.

This notebook (https://github.com/ScottWales/training/blob/master/era5_analysis.ipynb) shows in detail some strategies to handle the data and parallelise the computation using xarray and dask. While the example uses surface variables and the older ERA5 collection, it offers a good step-by-step explanation of the strategy it adopts, and introduces the climtas module, which has some useful functions to make the task more manageable and efficient.
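The daily mean itself is a one-liner in xarray. The sketch below uses a tiny synthetic hourly series so it runs anywhere; it does not use climtas or the notebook's strategy, and the variable name is invented. With real ERA5 files you would open the data lazily (for example by passing `chunks=` to `xr.open_mfdataset`) so that dask processes it piece by piece instead of loading it all into memory.

```python
import numpy as np
import xarray as xr

# Two days of made-up hourly data (48 time steps).
time = np.arange("2000-01-01", "2000-01-03", dtype="datetime64[h]").astype("datetime64[ns]")
hourly = xr.DataArray(np.arange(48.0), dims="time", coords={"time": time}, name="t")

# Group the hourly values into calendar days and average each group.
daily = hourly.resample(time="1D").mean()
print(daily.values)  # [11.5 35.5]
```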

Converting UM outputs from pressure levels to model levels

A question arose about interpolating a field that was created on pressure levels to model levels.

The method, using metpy.interpolate.log_interpolate_1d, has been tested using air temperature, a field that was output on both model (UM Stash Code m1s30i204) and pressure (m1s16i004) levels.

Unfortunately, the pressure-level temperature interpolated to model levels did not agree with the model-level version: differences were up to 30 degrees, beyond what could be considered interpolation uncertainty.

It's correct to use log interpolation to convert between model levels and pressure levels; however, to do this you need to know the pressure on model levels. In this instance the pressure had been approximated with the ideal gas law, but what you really need is the actual pressure values from the model. The difference between the two was the most likely cause of the error; the model will need to be re-run with the pressure field output to verify.
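To make the dependence on model-level pressure concrete, here is a plain numpy sketch of interpolation that is linear in ln(p). It is a stand-in for the idea, not the metpy function used above, and all names and numbers are invented. The key input is `p_on_model`: if those pressures are only an approximation, the interpolated temperatures inherit that error.

```python
import numpy as np

def log_pressure_interp(target_p, source_p, source_field):
    """Interpolate a 1-D profile linearly in ln(pressure).

    Hypothetical helper for illustration. Targets outside the source
    pressure range are clamped to the nearest end value by np.interp.
    """
    order = np.argsort(source_p)  # np.interp needs increasing x values
    return np.interp(
        np.log(target_p), np.log(source_p[order]), source_field[order]
    )

# Made-up single-column profile on pressure levels (hPa, K).
p_levels = np.array([1000.0, 850.0, 700.0, 500.0])
t_on_p = np.array([288.0, 280.0, 272.0, 255.0])

# Pressure on model levels: this should come from the model output itself.
p_on_model = np.array([925.0, 600.0])

t_on_model = log_pressure_interp(p_on_model, p_levels, t_on_p)
print(t_on_model)
```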