Learning data fusion and atmospheric forcing corrections using a physics-informed, differentiable hydrologic model
Atmospheric forcing data suffer from inherent biases and errors due to limitations in either instrumentation or sampling. Such errors may accumulate and degrade hydrologic models, producing biased simulations of fluxes such as streamflow and evapotranspiration (ET). One way to minimize this effect is to train neural networks such as long short-term memory (LSTM) networks directly on multiple forcing datasets, which can implicitly correct or fuse the datasets, but such networks are complex, and interpreting them to understand the biases of individual datasets is difficult. Here, we propose a novel approach utilizing a differentiable hydrologic model that produces static or time-dynamic weights associated with the different datasets. In the current work, we use an LSTM to adaptively weight three precipitation datasets: Daymet, Maurer, and NLDAS2. These weights can serve as a medium for fusing multiple datasets. Notably, this data fusion scheme greatly improved the performance of our differentiable hydrologic model, enhancing both high- and low-flow simulations while maintaining prediction performance for ET. When multiple datasets were fused, the trained weights assigned almost equal importance to Daymet and NLDAS2 and the least importance to Maurer. We further diagnose why the fused data provide better accuracy, e.g., whether the fusion improved point-scale precipitation or corrected biases, and which features the neural network employed to achieve data fusion. We conclude by providing a new approach to fuse datasets and correct biases in a manner suitable for climate studies.
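To make the weighting idea concrete, the following is a minimal sketch of how static or time-dynamic weights could combine several precipitation series into one fused series. It is an illustration under stated assumptions, not the implementation used in this work: the abstract does not specify how the weights are normalized or applied, so the softmax normalization, the function names (`softmax`, `fuse_precipitation`), the hand-picked logits (which stand in for LSTM outputs), and the synthetic precipitation values are all hypothetical.

```python
import math


def softmax(logits):
    """Map raw scores to positive weights that sum to 1 (assumed normalization)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def fuse_precipitation(precip, logits):
    """Fuse K precipitation series into one weighted series.

    precip: list of T rows, each with K values (one per dataset).
    logits: one list of K scores (static weights), or T lists of K scores
            (time-dynamic weights). In the paper these would come from an LSTM;
            here they are supplied by hand.
    Returns (fused, weights), where fused has T values and weights has T rows.
    """
    is_static = not isinstance(logits[0], (list, tuple))
    fused, weights = [], []
    for t, row in enumerate(precip):
        w = softmax(logits if is_static else logits[t])
        weights.append(w)
        fused.append(sum(wi * pi for wi, pi in zip(w, row)))
    return fused, weights


# Synthetic daily precipitation in mm/day; the columns stand in for
# Daymet, Maurer, and NLDAS2 (values are made up for illustration).
precip = [
    [0.0, 0.0, 0.0],
    [4.0, 6.0, 5.0],
    [12.0, 9.0, 15.0],
]

# Static case: equal scores give equal weights, i.e. a plain average.
static_fused, static_w = fuse_precipitation(precip, [0.0, 0.0, 0.0])

# Time-dynamic case: a different weight vector at every time step.
dynamic_logits = [
    [0.0, 0.0, 0.0],
    [1.0, -1.0, 1.0],
    [2.0, -2.0, 0.5],
]
dynamic_fused, dynamic_w = fuse_precipitation(precip, dynamic_logits)
```

Because the assumed weights are non-negative and sum to one, the fused value at each time step stays between the smallest and largest dataset value. A scheme of this form can therefore blend datasets but cannot by itself correct a bias shared by all of them; a correction of that kind would need unnormalized weights or an additional scaling term.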