PEDESTRIANQA: A Benchmark for Vision-Language Models on Pedestrian Intention and Trajectory Prediction
Researchers have introduced PedestrianQA, a benchmark dataset for evaluating vision-language models (VLMs) on pedestrian intention and trajectory prediction. The dataset frames these tasks, which are critical for autonomous driving, as question-answering problems and includes structured rationales that serve as explanations. According to the study, state-of-the-art VLMs trained on PedestrianQA showed significant improvements in intention classification, trajectory forecasting, and rationale generation.
AI IMPACT: This benchmark could accelerate the development of safer autonomous driving systems by providing a standardized way to test and improve how well VLMs predict pedestrian behavior.