Partitioning of Ecosystem Autotrophic and Heterotrophic Respiration using Multi-source Data and Knowledge-guided Machine Learning
Autotrophic respiration (Ra) and heterotrophic respiration (Rh) are important components of the terrestrial carbon cycle, but they remain understudied because they are difficult to quantify separately in field measurements. Ecosystem respiration (Reco) can be partitioned from net ecosystem exchange (NEE) measured at eddy covariance (EC) flux towers, but Reco is rarely partitioned further into Ra and Rh with high confidence. The lack of continuous, ecosystem-level data for Ra and Rh hinders our understanding of the ecosystem carbon cycle and leads to significant disagreements among carbon budget simulations. Efforts have been made over the years to separate Ra and Rh, but existing methods are limited in spatial and temporal coverage, resolution, or accuracy. Existing datasets measure different components of the ecosystem respiration flux but cannot directly separate Ra and Rh, so a synthesis effort is needed to obtain a comprehensive understanding of both components.
To address this gap, this study aims to (1) synthesize available datasets for Reco, Ra, and Rh; (2) expand an existing knowledge-guided machine learning (KGML) framework, named KGML-Carbon, to generate separate Ra and Rh estimates at the field-to-regional level; and (3) investigate how the spatial and temporal patterns, key drivers, and climate change responses of Ra and Rh differ from existing knowledge, which is derived primarily from the combined Reco. The KGML-Carbon framework was originally designed by the research team to quantify carbon budgets (i.e., NEE, Reco, crop yield, and soil carbon stocks). The expanded framework will explicitly separate the component fluxes of Reco, namely above- and below-ground Ra (Ra_ab and Ra_bl) and Rh, for various ecosystems across the globe. It will integrate scientific knowledge of biogeochemical processes with machine learning models, using multi-task learning and transfer learning to synthesize information from multiple sources at different scales. This study will improve our understanding of the carbon budget by providing accurate, continuous, landscape-level data on Ra and Rh.
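As a minimal illustration of how a knowledge-guided constraint could enter such a partitioning model, the sketch below scores predicted Ra_ab, Ra_bl, and Rh against an observed Reco using the mass balance Reco = Ra_ab + Ra_bl + Rh. It is not the KGML-Carbon implementation. The function name `partition_loss`, the weights `w_rh` and `w_neg`, and the choice of squared-error terms are all assumptions made for illustration only.

```python
def partition_loss(ra_ab, ra_bl, rh, reco_obs, rh_obs=None, w_rh=1.0, w_neg=10.0):
    """Hypothetical knowledge-guided loss for one batch of flux predictions.

    All inputs are equal-length lists of floats in the same flux units.
    rh_obs may contain None where no direct Rh measurement exists.
    """
    n = len(reco_obs)

    # Mass-balance term: the predicted components should sum to observed Reco.
    balance = sum(
        (a + b + h - r) ** 2 for a, b, h, r in zip(ra_ab, ra_bl, rh, reco_obs)
    ) / n

    # Physical constraint (assumed): respiration fluxes should not be negative.
    negativity = sum(
        min(x, 0.0) ** 2 for comp in (ra_ab, ra_bl, rh) for x in comp
    ) / n

    # Auxiliary task (assumed): direct Rh supervision where observations exist.
    rh_term = 0.0
    if rh_obs is not None:
        pairs = [(p, o) for p, o in zip(rh, rh_obs) if o is not None]
        if pairs:
            rh_term = sum((p - o) ** 2 for p, o in pairs) / len(pairs)

    return balance + w_rh * rh_term + w_neg * negativity


if __name__ == "__main__":
    # Components sum exactly to Reco and are non-negative, so the loss is zero.
    print(partition_loss([1.0], [1.0], [2.0], [4.0]))
```

In a multi-task setting, the Rh term would apply only at sites with direct Rh measurements, while the mass-balance term could be evaluated wherever tower-based Reco is available. The relative weights would need tuning and are not taken from the source.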