New Tool Synthesizing Global Agroeconomic Data for Multisector Agricultural Economic Modeling
Sound modeling depends heavily on data quality. FAOSTAT is the most important source of global agricultural data, covering agricultural production, land use, trade, food, nutrients, prices, and more. However, the raw data require cleaning, balancing, and synthesis, and each modeling team develops its own assumptions and methods for these steps. Although largely overlooked, the uncertainty introduced in this process contributes to disparities in model outcomes. To address this gap, we developed a community tool (gcamfaostat) for downloading, cleaning, synthesizing, and balancing FAOSTAT datasets in a traceable, transparent, and reproducible manner.
Our new tool can flexibly process, expand, and update agriculture and land use data for MultiSector Dynamic models and other classes of models that represent the global agricultural economy. Our initiative improves the accessibility and quality of data for the global MultiSector Dynamic and agricultural economic modeling communities, with the aim of fostering more robust and harmonized outcomes within a collaborative, efficient, and open-source framework. By establishing a standardized and streamlined process for data preparation and processing, this collaborative approach benefits all modeling groups. Reducing the effort required for data processing and harmonizing base data calibration together help reduce modeling uncertainty and improve overall research efficiency.
Agriculture and land use (AgLU) play a critical role in MultiSector Dynamic models, as well as in other classes of models that represent the global agricultural economic system. Most of the AgLU-related data used in modeling relies, directly or indirectly, on raw data sourced from the Food and Agriculture Organization Statistics (FAOSTAT). Here we develop the gcamfaostat R package to prepare, process, and synthesize FAOSTAT data while ensuring transparency, traceability, and reproducibility. We demonstrate its capabilities by generating and maintaining the agroeconomic data required for the Global Change Analysis Model (GCAM), which amount to 2.1 GB, or over 50%, of the model's input data. In other words, every data point from FAOSTAT used in GCAM is now traceable, which supports accuracy and reliability in future updates. More importantly, the package provides an open-source platform on which researchers can continually improve processing methods. A novel feature of the package is the construction of the FAO Food Balance Sheets at the disaggregated commodity level (with over 500 commodities), providing comprehensive and detailed data input for a variety of analytical and modeling applications. Additionally, the package can be valuable to a broader range of users interested in understanding global agricultural trends and dynamics, as it offers user-friendly data processing and visualization tools.
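To make the idea of "balancing" concrete, the following is a minimal sketch in Python of a supply-utilization check for a single commodity record. It is not the gcamfaostat implementation (the package is written in R), and the class name, field names, and the choice to absorb the residual into other uses are hypothetical assumptions made purely for illustration. It relies only on the accounting identity behind food balance sheets, namely that total supply should equal total utilization.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CommodityBalance:
    """One commodity in one region and year (hypothetical layout, e.g. in 1000 tonnes)."""
    production: float
    imports: float
    exports: float
    stock_withdrawal: float  # positive when stocks are drawn down
    food: float
    feed: float
    other_uses: float

    def supply(self) -> float:
        # Domestic supply: what is available for use within the region.
        return self.production + self.imports - self.exports + self.stock_withdrawal

    def utilization(self) -> float:
        # Total recorded uses of that supply.
        return self.food + self.feed + self.other_uses

    def residual(self) -> float:
        # Non-zero residual means the raw record is unbalanced.
        return self.supply() - self.utilization()


def balance(record: CommodityBalance) -> CommodityBalance:
    """Return a balanced copy by assigning the residual to other uses.

    This closure rule is an illustrative assumption, not the package's method.
    """
    return replace(record, other_uses=record.other_uses + record.residual())


raw = CommodityBalance(production=100.0, imports=20.0, exports=30.0,
                       stock_withdrawal=5.0, food=60.0, feed=25.0, other_uses=5.0)
balanced = balance(raw)
print(raw.residual(), balanced.residual(), balanced.other_uses)
```

In this sketch the residual is recorded explicitly before it is removed, which mirrors the traceability goal described above: each adjustment to the raw data can be attributed to a named step. A real workflow would also need to handle cases where the residual drives a use category negative.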