Developing High-Resolution Root Zone Soil Moisture Using Machine Learning Across the Contiguous United States
High-quality, high-resolution gridded soil moisture products are essential for Earth science research and operational applications. However, currently available reanalysis and remote sensing products often have coarse spatial resolution or represent only the surface soil layer. We developed a multi-layer root zone soil moisture dataset, extending from the surface to 2 m depth, using machine learning models trained on in situ measurements from the International Soil Moisture Network (ISMN) over the contiguous United States (CONUS). The dataset has a 1 km spatial resolution and a daily temporal resolution, and covers the period from 2000 to 2020. Our machine learning models incorporated multiple data sources related to soil moisture dynamics, including Daymet meteorological data, vegetation-related data (e.g., leaf area index, rooting depth), soil properties (e.g., percent sand, percent clay, soil organic matter), topographic data (elevation, slope, aspect), and hydrological data (e.g., ERA5-Land soil moisture reanalysis, runoff reanalysis, snow water equivalent, groundwater table depth). We developed and compared widely used ensemble machine learning models, such as Random Forest and XGBoost, and selected the best-performing ones. When compared against currently available coarse-resolution reanalysis data over CONUS, our dataset generally showed higher accuracy. This high-resolution, multi-layer dataset could be useful for model benchmarking, extreme event studies, agricultural planning, water resource management, and other applications, especially in the context of climate change.
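The model-selection step described above can be illustrated with a minimal sketch. It assumes root-mean-square error (RMSE) against held-out in situ measurements as the selection criterion, which the abstract does not specify. The helper names (`rmse`, `select_best`), the candidate labels, and all numeric values are hypothetical and invented purely for illustration; they are not results from this study.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def select_best(observed, candidates):
    """Score every candidate against the observations.

    Returns the name of the lowest-RMSE candidate and the full score table.
    """
    scores = {name: rmse(observed, preds) for name, preds in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Invented volumetric soil moisture values (m^3/m^3) at held-out stations.
observed = [0.20, 0.25, 0.30, 0.35]

# Hypothetical predictions from two trained models and a coarse reference product.
candidates = {
    "candidate_a": [0.22, 0.24, 0.31, 0.33],
    "candidate_b": [0.21, 0.25, 0.30, 0.34],
    "coarse_reference": [0.28, 0.30, 0.36, 0.40],
}

best_name, scores = select_best(observed, candidates)
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {score:.4f}")
print("selected:", best_name)
```

In practice the held-out set would need to be split by station rather than by individual sample, so that a model is not scored on sites it saw during training; that choice is likewise an assumption rather than something stated in the abstract.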