Observation-based benchmarks may improve representation of soil organic carbon dynamics in land surface models
Accurate representation of the environmental controls on soil organic carbon (SOC) in Earth system land models could reduce uncertainty in predicted carbon-climate feedbacks. Machine learning models can help quantify the relationships between environmental factors and SOC stocks. In this study, we used a large number of SOC field observations (n = 54,000), a geospatial dataset of environmental factors (n = 50), and two machine learning approaches, Random Forest (RF) and Generalized Additive Modeling (GAM), to: (1) identify the dominant environmental controllers of global SOC stocks, (2) derive functional relationships between the environmental controllers and SOC stocks, and (3) compare the environmental controllers of SOC stocks identified in observations with their representations in CMIP6 Earth system models (ESMs). Our results identified temperature, drought index, cation exchange capacity, and precipitation as the key environmental controllers of global SOC stocks in observations. At the global scale, the RF model used 14 environmental factors and yielded an R² of 0.61 and an RMSE of 0.46 kg m⁻². In the ESMs, precipitation, temperature, and net primary productivity explained more than 96% of the variability in modeled SOC stocks. Our results show that the control of temperature on SOC stocks in ESMs is consistent with observations only in the range of 3-20 °C. The controls of precipitation and net primary productivity on SOC stocks in ESMs are not consistent with observations. The SOC representation in ESMs could be improved significantly by including additional environmental controls (e.g., cation exchange capacity) and by representing the functional relationships between environmental controllers and SOC stocks in a manner consistent with observations.
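The two fit statistics quoted for the RF model, R² and RMSE, can be illustrated with a minimal sketch. It assumes the conventional definitions (R² = 1 − SS_res / SS_tot, RMSE as the square root of the mean squared error); the abstract does not state the exact formulas or whether SOC values were transformed before fitting. The function names and the sample values below are hypothetical and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the observations."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observations have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical SOC stocks (kg m^-2), for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

fit_r2 = r_squared(observed, predicted)
fit_rmse = rmse(observed, predicted)
print(f"R2 = {fit_r2:.2f}, RMSE = {fit_rmse:.2f} kg m^-2")
```

Read this way, an R² of 0.61 would mean the model accounts for roughly 61% of the variance in observed SOC stocks, while the RMSE expresses the typical prediction error in the units of the target variable.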