Identifying Key Drivers of Wildfires in the Contiguous US Using Machine Learning and Game Theory Interpretation
Understanding the complex interrelationships between wildfire and its environmental and anthropogenic controls is crucial for wildfire modeling and management. Although machine learning (ML) models have yielded significant improvements in wildfire prediction, their limited interpretability has hindered their use in advancing the understanding of wildfires. This study builds an ML model that incorporates predictors describing local meteorology, land-surface characteristics, and socioeconomic conditions to predict monthly burned area at grid cells of 0.25° × 0.25° resolution over the contiguous United States (US). In addition to these predictors, we construct and include predictors representing the large-scale circulation patterns conducive to wildfires, which improve the temporal correlations in several regions by 14%–44%. The SHapley Additive exPlanations (SHAP) method is used to quantify the contributions of the predictors to burned area. Results show a key role of longitude and latitude in delineating fire regimes with different temporal patterns of burned area. The model captures the physical relationships between burned area and vapor pressure deficit, relative humidity (RH), and energy release component (ERC), in agreement with prior findings. Aggregating the predictor contributions over all grid cells by region shows that ERC is the major contributor to large burned areas in the western US, with a contribution of 14%–27%. In contrast, no single factor leads the contributions to large burned areas in the eastern US, although large-scale circulation patterns featuring a less active upper-level ridge-trough and low RH two months earlier in winter contribute relatively more to large burned areas in spring in the southeastern US.
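The Shapley-based attribution mentioned above can be illustrated with a minimal sketch. The code below computes exact Shapley values for a tiny toy function and is not the study's model or pipeline. The function `toy_model`, its coefficients, the feature values in `x`, and the reference values in `baseline` are all hypothetical and chosen only for illustration. Absent features are replaced by their baseline value, which is one common convention and an assumption here.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        # Average the marginal contribution of feature i over all coalitions
        # of the remaining features, using the standard Shapley weights.
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {i}) - value_fn(s))
        phi[i] = total
    return phi


# Hypothetical feature values for one grid cell and month (illustrative only).
x = {"erc": 80.0, "vpd": 2.0, "rh": 30.0}
# Hypothetical reference values used when a feature is "absent".
baseline = {"erc": 50.0, "vpd": 1.0, "rh": 50.0}


def toy_model(v):
    """Made-up response with one interaction term; not a fitted wildfire model."""
    return 0.5 * v["erc"] + 10.0 * v["vpd"] - 0.2 * v["rh"] + 0.1 * v["erc"] * v["vpd"]


def value_fn(present):
    """Evaluate the toy model with absent features set to their baseline."""
    v = {name: (x[name] if name in present else baseline[name]) for name in x}
    return toy_model(v)


phi = shapley_values(value_fn, list(x))

# Additivity: the contributions sum to the prediction minus the baseline prediction.
gap = toy_model(x) - toy_model(baseline)
```

Exact enumeration grows exponentially with the number of predictors, so it is only practical for a handful of features. Models with many predictors typically rely on approximate or model-specific algorithms instead.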