Comparative Analysis of Statistical, Machine Learning, and Transformer-based Deep Learning Models for Climate-driven Crop Yield Prediction in Maharashtra
Abstract
Maharashtra’s agricultural productivity is increasingly threatened by climate change, which manifests as shifting temperature regimes, altered rainfall patterns, and more frequent extreme weather events. Economically significant crops such as sugarcane, cotton, and soybean are particularly sensitive to these climatic perturbations, so reliable data-driven approaches are needed for climate-resilient planning and yield forecasting. This study presents a comparative evaluation of climate-driven crop yield prediction models at the district level in Maharashtra. A multi-year dataset integrating historical climatic variables (temperature, rainfall, humidity, and solar radiation) was employed. Five predictive approaches were examined: ARIMA as a statistical baseline, Random Forest and XGBoost as machine learning models, Long Short-Term Memory (LSTM) networks for sequential learning, and the Temporal Fusion Transformer (TFT) to capture long-term dependencies through attention mechanisms. All models were trained and tested under identical experimental conditions. Performance was evaluated using ROC-AUC alongside regression-based metrics. Results indicate that the machine learning and deep learning models consistently outperform the statistical baseline. Among these, TFT achieved the highest accuracy across all crops, followed by LSTM and XGBoost. These findings underscore the value of attention-based architectures for modeling climate–crop interactions and offer practical guidance for AI-driven agricultural decision-making in climate-vulnerable regions.
Copyright (c) 2026 Ketaki Ghawali, Abhishek Garg

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

