RealBench: Benchmarking Data-Driven Numerical Weather Forecasting Under Operational Conditions and Extreme Event Challenges
Researchers have introduced RealBench, a new benchmark designed to evaluate AI weather forecasting models more accurately under real-world operational conditions. Unlike previous benchmarks that relied on reanalysis data, RealBench uses low-latency operational analysis and in-situ observations, with a test set drawn from 2025 to prevent data leakage. It also includes dedicated metrics for high-impact extreme events such as heatwaves and tropical cyclones, and it reveals significant performance gaps relative to results on traditional benchmarks.
IMPACT: Provides a more realistic evaluation framework for AI weather models, potentially accelerating the development of more accurate forecasting systems.