A new research paper introduces the Unified Attribution Theory, which attributes Video-LLMs' struggles with physical reasoning to "Semantic Prior Dominance" rather than to perceptual limitations. To address this, the paper proposes the Programmatic Adversarial Curriculum (PACC) dataset and the Visual-Anchored Reasoning Chain (VARC) method. The paper's experiments show that fine-tuning with PACC significantly improves physical reasoning in state-of-the-art models without architectural changes.
AI impact: Introduces a novel dataset and method to improve physical reasoning in Video-LLMs, potentially enhancing their real-world applicability.
Ranking rationale: Academic paper detailing a new theory and dataset for improving Video-LLM physical reasoning.
AI-generated summary · Google Gemini · from 2 sources.