Reading In and Accessing Grid Data

This notebook will show how to use UXarray to represent various unstructured grids in the UGRID convention and to access grid information, combined with Xarray to read in grid definition and data files.

Dependencies

  • uxarray

  • xarray

  • geocat-datafiles (for accessing test datasets)

Background Information

When working with unstructured grids, the grid definition and data variables are often stored as separate files (most of the time as netCDF). This means that there are multiple separate files that need to be read in at once.

For example, the NOAA Geoflow project consists of 4 files (1 grid file and 3 data files). This project follows the UGRID conventions. Special thanks to John Clyne, Shilpi Gupta, and the VAPOR team for providing these data!

geoflow-small
│   grid.nc
│   v1.nc
│   v2.nc
│   v3.nc

Opening The Grid and Data Files with Xarray

First, read in the grid definition and data variable netCCDF files by using xarray.open_dataset and/or xr.open_mf_dataset.

Imports

[1]:
import geocat.datafiles as gdf
import xarray as xr

Data

[2]:
# Base data path
base_path = "netcdf_files/geoflow-small/"

# Path to Grid file
grid_path = base_path + "grid.nc"

# Paths to Data Variable files
var_names = ['v1.nc', 'v2.nc', 'v3.nc']
data_paths = [base_path + name for name in var_names]
[3]:
# Open the Grid file
grid_ds = xr.open_dataset(gdf.get(grid_path))
[4]:
grid_ds
[4]:
<xarray.Dataset>
Dimensions:          (nMeshFaces: 3840, nFaceNodes: 4, nMeshNodes: 6000,
                      meshLayers: 20)
Coordinates:
    mesh_node_x      (nMeshNodes) float64 0.0 5.214 16.5 ... 62.7 68.8 72.0
    mesh_node_y      (nMeshNodes) float64 58.28 59.8 62.06 ... -31.68 -31.72
Dimensions without coordinates: nMeshFaces, nFaceNodes, nMeshNodes, meshLayers
Data variables:
    mesh             int32 -2147483647
    mesh_face_nodes  (nMeshFaces, nFaceNodes) float64 0.0 1.0 ... 5.998e+03
    mesh_depth       (meshLayers, nMeshNodes) float64 ...
[5]:
# Open a single file or multiple files
multi_data_ds = xr.open_mfdataset([
    gdf.get(data_paths[0]),
    gdf.get(data_paths[1]),
    gdf.get(data_paths[2])
])
[6]:
multi_data_ds
[6]:
<xarray.Dataset>
Dimensions:  (time: 1, meshLayers: 20, nMeshNodes: 6000)
Coordinates:
  * time     (time) float64 13.0
Dimensions without coordinates: meshLayers, nMeshNodes
Data variables:
    v1       (time, meshLayers, nMeshNodes) float64 dask.array<chunksize=(1, 20, 6000), meta=np.ndarray>
    v2       (time, meshLayers, nMeshNodes) float64 dask.array<chunksize=(1, 20, 6000), meta=np.ndarray>
    v3       (time, meshLayers, nMeshNodes) float64 dask.array<chunksize=(1, 20, 6000), meta=np.ndarray>

Representing The Unstructured Grid with UXarray

Import UXarray

[7]:
import uxarray as ux

UXarray.Grid object

[8]:
# Construct a UXarray Grid object from `grid_ds`
grid = ux.Grid(grid_ds)
[9]:
grid.ds
[9]:
<xarray.Dataset>
Dimensions:          (nMeshFaces: 3840, nFaceNodes: 4, nMeshNodes: 6000,
                      meshLayers: 20)
Coordinates:
    mesh_node_x      (nMeshNodes) float64 0.0 5.214 16.5 ... 62.7 68.8 72.0
    mesh_node_y      (nMeshNodes) float64 58.28 59.8 62.06 ... -31.68 -31.72
Dimensions without coordinates: nMeshFaces, nFaceNodes, nMeshNodes, meshLayers
Data variables:
    mesh             int32 -2147483647
    mesh_face_nodes  (nMeshFaces, nFaceNodes) float64 0.0 1.0 ... 5.998e+03
    mesh_depth       (meshLayers, nMeshNodes) float64 ...

As can be seen, the Grid object has its own grid.ds of type xarray.Dataset to define the grid topology. However, the Grid object has further attributes, variables, and functions to specify the unstructured grid and be executed on it, which can be explored in the next notebooks.

Accessing Grid Information

Unstructured grids can be represented in one of many different conventions (UGRID, SCRIP, EXODUS, etc). These conventions have different definitions and representations of the attributes and variables used to describe the unstructured grid topology. Even more, the UGRID conventions does not enforce standard variable namings for most of the attributes and variables (other than just a few required ones).

UXarray unifies all of these conventions at the data loading step by representing grids in the UGRID convention regardless of the original grid type that is read in from the file. Furthermore, it uses a set of standardized names for topology attributes and variables, while still providing the user with the original attribute names and variables that came from the grid definition file.

This section will showcase UXarray’s Standardized UGRID Names for accessing the grid topology attributes and variables stored in the UXarray.Grid object.

For more details on how to load in data, or multiple ways of accessing the grid topology information, check out our UXarray Usage Examples

Topology attribute and variable names coming from the grid file

Let’s first have a look at the original attribute and variable names that come from the grid file in UGRID convention with custom names:

[10]:
# Extract variable names
grid_names = list(grid.ds.keys()) + list(grid.ds.coords) + list(grid.ds.dims)

print("Original attribute and variable names from the grid file:\n", grid_names)
Original attribute and variable names from the grid file:
 ['mesh', 'mesh_face_nodes', 'mesh_depth', 'mesh_node_x', 'mesh_node_y', 'nMeshFaces', 'nFaceNodes', 'nMeshNodes', 'meshLayers']

UXarray’s standardized UGRID Names for topology attributes and variables

UXarray provides a dictionary, Grid.ds_var_names, to map the original topology attribute and variable names that come from the grid file into a standardized set of names. In other words, the dictionary uses a standardized set of UGRID attribute and variable names as keys, and the original variable names that come from the grid file as values. UXarray utilizes this dictionary under the hood to return references to the variables and attributes that are stored within UXarray.Grid.ds.

This eliminates the need to remember the original variable names and needing to index through the ds_var_names dictionary. Because of this, we find this as the most convenient approach:

[11]:
x_2 = grid.Mesh2_node_x
y_2 = grid.Mesh2_node_y
face_nodes_2 = grid.Mesh2_face_nodes
n_face_nodes_2 = grid.nMesh2_node

Please note, for instance, we accessed the actual variable mesh_node_x of grid via indexing the dictionary with the standardized name Mesh2_node_x. Regardless of the grid type we read in (e.g. UGRID, SCRIP, Exodus, etc.), this would be the same because of our representation of the grid in the UGRID convention with standardized names.

Other ways:

Indexing with original variable names

The original variable name can be used as an index into the grid dataset, Grid.ds:

[12]:
x_2 = grid.ds['mesh_node_x']
y_2 = grid.ds['mesh_node_y']
face_nodes_2 = grid.ds['mesh_face_nodes']
n_face_nodes_2 = grid.ds['nFaceNodes']

Indexing with UXarray variable dictionary

The dictionary, Grid.ds_var_names, can be used to index into the dataset. However, this makes the indexing code much longer than the other methods:

[13]:
var_names_dict = grid.ds_var_names
x_2 = grid.ds[var_names_dict['Mesh2_node_x']]
y_2 = grid.ds[var_names_dict['Mesh2_node_y']]
face_nodes_2 = grid.ds[var_names_dict['Mesh2_face_nodes']]
n_face_nodes_2 = grid.ds[var_names_dict['nMesh2_node']]

In conclusion, there are multiple ways of accessing the grid attributes and variables. Even though the UXarray developers recommend using the standardized UGRID names method, users can find each various pros/cons with each of these access ways.

Running Grid-Specific Functions

UXarray even has a few computational functions implemented already!

[14]:
grid_total_face_area = grid.calculate_total_face_area()
grid_face_areas = grid.compute_face_areas()
[15]:
grid_total_face_area
[15]:
12.566370618717663
[16]:
grid_face_areas
[16]:
array([0.00205959, 0.00363576, 0.00318908, ..., 0.00315344, 0.00361048,
       0.00205959])

There is also a Grid attribute for the result of the grid.compute_face_areas() function:

[17]:
grid.face_areas
[17]:
array([0.00205959, 0.00363576, 0.00318908, ..., 0.00315344, 0.00361048,
       0.00205959])