Simplifying ESM Analysis Through Standards
Project Team
Principal Investigator
Co-Principal Investigator
Project Participant
The goal of the Simplifying ESM Analysis Through Standards (SEATS) project is to make it easier to analyze the increasingly complex output from Earth system models (ESMs). To do so, SEATS will augment existing standards where needed, define missing standards, and implement software tools that we hope will become a standard part of modern analysis workflows. It is a collaborative effort between Argonne National Laboratory, Lawrence Livermore National Laboratory, and the University of California-Davis. By identifying and creating data and software standards, the project aims to simplify the construction and use of ESM analysis workflows.
Project Description:
Earth system model (ESM) developers and researchers frequently run automated analysis tools to generate metrics and diagnostics, which give them insight into model performance and inform model development. Domain scientists usually focus on one ESM component (atmosphere, ocean, land, etc.) and rely on a limited suite of tools to analyze that component.
However, when they start to look at more processes, variables, or components of a model (or even multiple models), a bewildering variety of data types and tools emerges. A new research project, sponsored by the Regional and Global Modeling and Analysis (RGMA) program area in DOE’s Earth and Environmental System Modeling program, aims to simplify the experience of ESM analysis.
As climate models have become more complex, the communities around individual components have tended to develop their own command-line software packages for performing common analyses. Moving from tool to tool requires remembering particular command-line options and structuring data, or a data set’s directories, in formats that may be incompatible across tools. Even within a model component, there can be multiple packages for different collections of metrics, each with its own usage idiosyncrasies. This context-switching can be mentally exhausting and inefficient, pushing scientists to rely on the one or two packages they know well and thereby limiting the scope of their research.
The goal of SEATS (Simplifying ESM Analysis Through Standards) is to simplify the experience of ESM analysis and lower the multiple learning curves within the ESM tool landscape. Led by principal investigator Robert Jacob of Argonne National Laboratory, SEATS (https://seatstandards.org/) expands the use of existing standards in current software, defines new standards or modifies existing ones where needed, and creates new software tools, intended to become standards themselves, that fill gaps in analysis workflows.
Inspired by CMEC: SEATS expands on the goals of the RGMA CMEC (Coordinated Model Evaluation Capabilities, https://cmec.llnl.gov/) project led by Paul Ullrich (a SEATS co-investigator). In collaboration with NOAA partners, CMEC has developed a set of standards for evaluation packages (including metrics modules and process-oriented diagnostics) that enable coordinated execution of distinct packages and unified visualization of their output. One of the goals for SEATS is to standardize the use of CMEC-compliant tools in E3SM's routine post-processing workflow, with ILAMB (https://www.ilamb.org/) as a first example, and to further simplify post-processing through the CMEC ecosystem.
Python as a Standard: The Python programming language has become a de facto standard for scientific analysis, so the developers of SEATS are looking to increase its use. SEATS is collaborating with E3SM to create xCDAT (https://xcdat.readthedocs.io/en/latest/), a pure Python replacement for key portions of the to-be-retired CDAT package (https://cdat.llnl.gov/). Many analysis tools within DOE depend on CDAT, which is partly written in C and no longer actively maintained.
SEATS and E3SM are leveraging the xarray Python package to create modern versions of CDAT routines that can serve as drop-in replacements wherever the originals are used, in tools like e3sm_diags (https://e3sm-project.github.io/e3sm_diags/_build/html/master/index.html), led by SEATS co-investigator Chengzhu Zhang.
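As a rough illustration of the kind of routine involved, the sketch below computes an area-weighted spatial mean on a structured latitude-longitude grid using only xarray's built-in weighted operations. It is not the xCDAT API: the function name area_weighted_mean, the variable name tas, and the synthetic data are assumptions made for this example, and cos(latitude) weighting is a common approximation for regular grids rather than the method any particular tool uses.

```python
import numpy as np
import xarray as xr


def area_weighted_mean(da: xr.DataArray) -> xr.DataArray:
    """Average over lat/lon, weighting each cell by cos(latitude).

    Hypothetical helper for illustration; assumes dimensions named
    "lat" and "lon" with latitude in degrees.
    """
    weights = np.cos(np.deg2rad(da["lat"]))
    weights.name = "weights"
    return da.weighted(weights).mean(dim=("lat", "lon"))


# Synthetic surface temperature on a 10-degree grid (made-up values).
lat = np.arange(-85.0, 90.0, 10.0)
lon = np.arange(5.0, 360.0, 10.0)
# Warm at the equator, cold toward the poles, no longitude dependence.
values = 250.0 + 50.0 * np.cos(np.deg2rad(lat))[:, None] * np.ones(lon.size)
tas = xr.DataArray(
    values,
    coords={"lat": lat, "lon": lon},
    dims=("lat", "lon"),
    name="tas",
)

weighted = float(area_weighted_mean(tas))
unweighted = float(tas.mean())
print(f"area-weighted mean: {weighted:.2f} K, simple mean: {unweighted:.2f} K")
```

The weighted mean comes out warmer than the simple mean because high-latitude cells, which cover less area, count for less. Building on xarray in this way keeps coordinate labels attached to the data throughout the calculation.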
While xCDAT will work with structured grids, E3SM and models from other modeling centers are increasingly using unstructured grids in their components. This means that output must be interpolated to structured grids before tools like xCDAT can be used. SEATS is therefore also working to create a new Python package called “uxarray”, which is intended to become a standard for performing climate analyses directly on unstructured grid output using Python. SEATS is collaborating with the NSF-funded Raijin (https://raijin.ucar.edu/) project, which shares the goal of an unstructured grid analysis package in Python.
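To make the idea of analyzing unstructured output directly more concrete, here is a minimal sketch in plain NumPy. It does not use or imitate the uxarray API. It assumes the model output provides one value and one area per grid cell (the names cell_value and cell_area and the numbers are hypothetical) and computes a global mean without first interpolating to a latitude-longitude grid.

```python
import numpy as np


def global_mean_unstructured(cell_value, cell_area):
    """Area-weighted mean over the cells of an unstructured grid.

    Hypothetical helper: both inputs are 1-D arrays with one entry
    per cell, in the same cell order.
    """
    cell_value = np.asarray(cell_value, dtype=float)
    cell_area = np.asarray(cell_area, dtype=float)
    if cell_value.shape != cell_area.shape:
        raise ValueError("cell_value and cell_area must have the same shape")
    return float(np.average(cell_value, weights=cell_area))


# Four cells of unequal size (made-up areas and temperatures in K).
cell_area = np.array([1.0, 2.0, 1.0, 4.0])
cell_value = np.array([280.0, 290.0, 300.0, 285.0])

mean = global_mean_unstructured(cell_value, cell_area)
print(f"global mean: {mean:.1f} K")  # prints "global mean: 287.5 K"
```

Because each cell carries its own area, no regridding step is needed for this kind of reduction. Operations that depend on neighboring cells, such as gradients, also require the grid's connectivity information, which this sketch leaves out.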