AstroMind: A High-Fidelity Benchmark for Spacecraft Behavior Reasoning Based on Large Language Models
Researchers have introduced AstroMind, a new benchmark designed to improve spacecraft behavior reasoning for space domain awareness. The benchmark uses high-fidelity astrodynamics simulations and real observational data to create reasoning problems focused on intent inference, maneuver parameter estimation, and threat assessment. Initial evaluations of several open-weight models, including Qwen3 and GPT-OSS, showed that model size is not the sole determinant of performance: training data composition and reasoning prompt style also play significant roles.
IMPACT: AstroMind aims to advance AI's ability to interpret complex spacecraft maneuvers, a capability crucial for managing increasingly crowded orbital environments.