Multimodal Multiple Perspective Agent: An LLM-Based Agentic AI Framework for Interpretable Wildfire Evolution Prediction
Reliable prediction of wildfire evolution is essential for fire management and for mitigating the detrimental effects of wildfires on ecosystems and human society. While recent advances in deep learning offer new opportunities to improve fire predictability, it is equally important to understand whether the models learn the complex mechanisms of wildfire evolution. This work proposes a new Large Language Model (LLM) based agentic AI framework named Multimodal Multiple Perspective Agent (MMPA). MMPA takes multimodal inputs and produces interpretable wildfire evolution predictions. The inputs consist of multiple remote sensing images that represent the environmental and weather conditions and the current fire status. For each image, the name of the variable, the range of values in the image, and domain knowledge about the variable are also included as text. Leveraging the LLM’s capabilities and our custom prompt engineering framework based on a Tree of Thoughts (ToT), our method breaks down the input features, enabling multi-perspective predictions from the LLM. The LLM then resolves conflicts, refines interpretations, and reaches a cohesive conclusion on how the fire will evolve the next day, categorized into six options. Following other LLM quality evaluation approaches, the quality of MMPA’s predictions and their interpretability are evaluated by correctness and usefulness in three categories: very useful, somewhat useful, and not useful. Our preliminary results show that MMPA gives a very useful answer 33% of the time and a somewhat useful answer 38% of the time, a nearly 50% increase over the baseline Multi-modal Chain of Thought model. The same approach, leveraging the capabilities of LLMs for multimodal, multi-perspective analysis and decision-making, could be applicable to the mitigation and management of other natural hazards such as floods.
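For illustration, the per-variable branching and conflict-resolution flow described in the abstract could be sketched roughly as below. This is a minimal sketch, not the authors' implementation: every identifier (query_llm, VariableInput, CATEGORIES) and the prompt wording are assumptions, the six fire-evolution category labels are placeholders, and the actual MMPA prompt templates and ToT search are not shown.

```python
# Hypothetical sketch of the multi-perspective prompting flow: one prediction
# per input variable (image plus text metadata), then a conflict-resolution
# step that yields a single fire-evolution category. Illustrative only.
from dataclasses import dataclass
from typing import List, Optional

# The abstract mentions six options for next-day fire evolution but does not
# enumerate them; these labels are placeholders.
CATEGORIES = ["option_1", "option_2", "option_3",
              "option_4", "option_5", "option_6"]

@dataclass
class VariableInput:
    name: str          # e.g. "wind speed" (example variable, assumed)
    image_path: str    # remote sensing image for this variable
    value_range: str   # range of values in the image, e.g. "0-25 m/s"
    knowledge: str     # short domain note about the variable

def query_llm(prompt: str, image_path: Optional[str] = None) -> str:
    """Placeholder for a multimodal LLM call; returns the model's text answer."""
    raise NotImplementedError

def perspective_prediction(var: VariableInput) -> str:
    """Ask the LLM how the fire will evolve, considering only this variable."""
    prompt = (
        f"Variable: {var.name}\n"
        f"Value range in the image: {var.value_range}\n"
        f"Background knowledge: {var.knowledge}\n"
        f"Considering only this variable, predict how the fire will evolve "
        f"on the next day. Answer with one of: {', '.join(CATEGORIES)} "
        f"and give a short justification."
    )
    return query_llm(prompt, image_path=var.image_path)

def resolve(perspectives: List[str]) -> str:
    """Ask the LLM to reconcile possibly conflicting per-variable predictions."""
    joined = "\n".join(f"- {p}" for p in perspectives)
    prompt = (
        "The following per-variable predictions may conflict:\n"
        f"{joined}\n"
        "Resolve the conflicts, refine the interpretations, and give a single "
        "final category with the reasoning behind it."
    )
    return query_llm(prompt)

def predict(variables: List[VariableInput]) -> str:
    """Run one perspective per variable, then aggregate into a final answer."""
    return resolve([perspective_prediction(v) for v in variables])
```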