Video-LLMs suffer from directional motion blindness, researchers find

By PulseAugur Editorial · [2 sources] · 2026-05-21 17:59

Researchers have identified a limitation in current Video Large Language Models (Video-LLMs) that they term "directional motion blindness": the models struggle to perceive and report the direction in which an object moves. Motion direction information is present in the models' internal states, but a "direction binding gap" prevents it from being correctly tied to the verbal output. To address this, the researchers developed MoDirect, a dataset for tuning and evaluation, and DeltaDirect, an objective function that raises motion direction accuracy from near chance to over 85% on synthetic benchmarks and improves it by 21.9 points on real-world data.
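For a sense of what "near chance" means here, the sketch below builds toy clips of a single object moving left, right, up, or down and scores two stand-in predictors on them. It is an illustration only, not the paper's MoDirect data or DeltaDirect objective. The clip generator, the function names, and the balanced four-way setup (which puts chance at about 25%) are all assumptions made for this example.

```python
import random

# Image-plane convention (assumed): x grows to the right, y grows downward.
DIRECTIONS = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}


def make_clip(direction, size=16, frames=8):
    """Hypothetical toy clip: one lit pixel moving one cell per frame."""
    dx, dy = DIRECTIONS[direction]
    x, y = size // 2, size // 2  # start at the centre so the object stays in frame
    clip = []
    for _ in range(frames):
        frame = [[0] * size for _ in range(size)]
        frame[y][x] = 1
        clip.append(frame)
        x, y = x + dx, y + dy
    return clip


def centroid(frame):
    """Mean (x, y) position of the lit pixels in one frame."""
    pts = [(x, y) for y, row in enumerate(frame) for x, v in enumerate(row) if v]
    return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))


def centroid_oracle(clip, rng):
    """Reads the signed direction straight from first-to-last displacement."""
    (x0, y0), (x1, y1) = centroid(clip[0]), centroid(clip[-1])
    dx, dy = x1 - x0, y1 - y0
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"


def random_guess(clip, rng):
    """Stand-in for a direction-blind model: ignores the clip entirely."""
    return rng.choice(list(DIRECTIONS))


def accuracy(predict, n_clips=2000, seed=0):
    """Fraction of balanced four-way clips that `predict` labels correctly."""
    rng = random.Random(seed)
    labels = list(DIRECTIONS)
    correct = 0
    for _ in range(n_clips):
        label = rng.choice(labels)
        correct += predict(make_clip(label), rng) == label
    return correct / n_clips


if __name__ == "__main__":
    print("random guess:", accuracy(random_guess))      # close to 0.25
    print("centroid oracle:", accuracy(centroid_oracle))
```

The random guesser lands near one in four, which is the floor a four-way direction benchmark measures against, while the centroid rule shows that the signal itself is trivially recoverable from the pixels.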

IMPACT Identifies a critical perceptual flaw in Video-LLMs, potentially impacting their reliability for tasks requiring fine-grained temporal understanding.

RANK_REASON Academic paper detailing a new diagnostic method and proposed solution for a specific failure mode in Video-LLMs.


AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

arXiv cs.CV TIER_1 English(EN) · Jongseo Lee, Hyuntak Lee, Sunghun Kim, Sooa Kim, Jihoon Chung, Jinwoo Choi · 2026-05-22 04:00

Which Way Did It Move? Diagnosing and Overcoming Directional Motion Blindness in Video-LLMs

arXiv:2605.22823v1. Abstract: Video Large Language Models (Video-LLMs) have made rapid progress on temporal video understanding, yet many fail at a basic perceptual primitive: signed image-plane motion direction. On simple videos of a single object moving left, …

arXiv cs.CV TIER_1 English(EN) · Jinwoo Choi · 2026-05-21 17:59

Which Way Did It Move? Diagnosing and Overcoming Directional Motion Blindness in Video-LLMs

Video Large Language Models (Video-LLMs) have made rapid progress on temporal video understanding, yet many fail at a basic perceptual primitive: signed image-plane motion direction. On simple videos of a single object moving left, right, up, or down, most Video-LLMs perform near…
