HOMMEXX 1.0: A Performance Portable Atmospheric Dynamical Core for the Energy Exascale Earth System Model
A rewrite of the atmosphere dynamics component of the whole-Earth, high-resolution, DOE climate model E3SM can now achieve high performance on current and future supercomputers, including exascale-class supercomputers built using thousands of energy-efficient General Purpose Graphics Processing Unit (GPGPU) accelerators.
A whole-Earth, high-resolution climate model is crucial for understanding Earth's current and future climate system and its impact on DOE energy and national security missions. To continue the march towards higher-fidelity climate models, the computer codes must now adapt to ongoing disruptions in top-end computer architectures, which now depend on multiple, heterogeneous levels of parallelism and memory for computational efficiency. Our research has confirmed a viable path for achieving performance and largely insulating the codes from continuing churn in computer hardware.
Under the CMDV Software Modernization project of DOE’s Biological and Environmental Research office, a major code rewrite was undertaken by Sandia Lab scientists. The atmospheric dynamical core (HOMME) of the Energy Exascale Earth System Model (E3SM) was rewritten from the current CPU-centric implementation, in optimized Fortran 90, to a performance-portable implementation (HOMMEXX) in C++ using the Kokkos performance-portability programming model. E3SM is a whole-Earth, high-resolution climate model that is being designed to answer important science questions related to climate impacts on DOE missions over the next decades. Within the model, HOMME simulates the dynamics and physical processes of the atmosphere, and it has been shown to perform well on traditional architectures. However, the newest and future supercomputers have multiple, heterogeneous levels of parallelism and memory, and one supercomputer's architecture can be very different from another's. Instead of rewriting the atmosphere code for each new architecture, and redoing verification and validation of the new model each time, we want a single performance-portable code base that does not depend on the computing architecture. In the new performance-portable version, HOMMEXX, the Kokkos library provides multidimensional arrays and intra-process parallel execution constructs that insulate the code from architecture-specific details. HOMMEXX is at least as fast as the original HOMME on traditional platforms; it is up to 1.3 times faster than HOMME on Intel Knights Landing, and it runs up to 3.2 times faster on GPGPUs than on a dual-socket Intel Haswell traditional CPU. This success in the atmosphere dynamics has provided an affordable strategy for getting the full E3SM model to run well on exascale architectures.
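The idea of insulating a kernel from architecture-specific details can be illustrated with a small sketch. The code below is not HOMMEXX code and does not use the Kokkos API; `View2D`, `LayoutRight`, `LayoutLeft`, `SerialExec`, and `fill_kernel` are hypothetical stand-ins written in plain C++ for illustration only. It assumes that the two things a kernel must not hard-code are the memory layout of a multidimensional array and the way a loop is executed, and it shows one kernel source working unchanged with two layouts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the Kokkos API.

// Row-major layout: the last index is contiguous (often suits CPU caches).
struct LayoutRight {
  static std::size_t index(std::size_t i, std::size_t j,
                           std::size_t /*n0*/, std::size_t n1) {
    return i * n1 + j;
  }
};

// Column-major layout: the first index is contiguous (often suits GPU
// memory coalescing when threads are mapped to the first index).
struct LayoutLeft {
  static std::size_t index(std::size_t i, std::size_t j,
                           std::size_t n0, std::size_t /*n1*/) {
    return i + j * n0;
  }
};

// A 2D array whose memory layout is a compile-time parameter.
template <class Layout>
class View2D {
 public:
  View2D(std::size_t n0, std::size_t n1)
      : n0_(n0), n1_(n1), data_(n0 * n1, 0.0) {}

  double& operator()(std::size_t i, std::size_t j) {
    assert(i < n0_ && j < n1_);
    return data_[Layout::index(i, j, n0_, n1_)];
  }

  std::size_t extent0() const { return n0_; }
  std::size_t extent1() const { return n1_; }
  const std::vector<double>& raw() const { return data_; }

 private:
  std::size_t n0_, n1_;
  std::vector<double> data_;
};

// Serial execution policy. A real backend would dispatch the same functor
// to threads or to an accelerator instead of running a plain loop.
struct SerialExec {
  template <class F>
  static void parallel_for(std::size_t n, F f) {
    for (std::size_t i = 0; i < n; ++i) f(i);
  }
};

// The kernel is written once, against the abstractions only.
template <class Exec, class View>
void fill_kernel(View& v) {
  Exec::parallel_for(v.extent0(), [&](std::size_t i) {
    for (std::size_t j = 0; j < v.extent1(); ++j) {
      v(i, j) = 10.0 * static_cast<double>(i) + static_cast<double>(j);
    }
  });
}
```

Calling `fill_kernel<SerialExec>(a)` on a `View2D<LayoutRight>` and on a `View2D<LayoutLeft>` gives the same logical values through `a(i, j)`, while the underlying storage order differs. In this sketch, changing the target means swapping the layout and execution policy types rather than editing the kernel body.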