TECA at 750,000 Cores: Lessons Learned from a Hero Run on Mira
The TECA (Toolkit for Extreme Climate Analysis) software, developed under the CASCADE project, enables petascale-class climate data analytics. TECA can be configured to extract extreme weather events, such as tropical cyclones, atmospheric rivers, and extra-tropical cyclones (ETCs), from massive datasets. In this work, we present lessons learned from applying TECA to a large fraction of the CMIP-5 archive. Our end-to-end data analysis workflow consists of the following steps:
- downloading 60 TB of CMIP-5 data via ESGF
- storing and pre-processing the data at NERSC
- transferring a 6 TB dataset to ALCF over ESnet
- running TECA at full concurrency on Mira
We present performance data on end-to-end network bandwidth and server utilization, along with the TECA MPI initialization, parallel I/O, and communication optimizations that enabled the full-scale Mira run. We also present preliminary results characterizing extra-tropical cyclone activity in CMIP-5 (historical period) and comparing it with several reanalysis products (NCEP2, NCEP-CFSR, JRA55, ERA-Interim). Preliminary investigations suggest a strong dependence of ETC counts on model resolution. We are exploring the applicability of scaling relationships to account for resolution, followed by comparisons with RCP8.5 results to determine changes in ETC activity.