PulseAugur

New Benchmark Tests LMMs' Creative Physical Intelligence

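Researchers have developed MM-CreativityBench, a new benchmark for evaluating the creative physical intelligence of large multimodal models (LMMs). The benchmark tests whether an LMM can identify and repurpose objects in visually rich, physically constrained environments, a capability that current models often lack. To address this gap, the researchers propose an affordance-grounded alignment method based on Direct Preference Optimization, which encourages models to rely on visual evidence and reduces hallucinations, improving entity selection and affordance reasoning.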

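Impact: The benchmark could drive the development of LMMs with more sophisticated creative problem-solving abilities that go beyond pattern recognition.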

Ranking rationale: This cluster describes a new academic paper that introduces a novel benchmark and methodology for evaluating AI models.

Read on Hugging Face Daily Papers →

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

  1. arXiv cs.AI TIER_1 English(EN) · Cheng Qian, Hyeonjeong Ha, Jiayu Liu, Jeonghwan Kim, Emre Can Acikgoz, Bingxuan Li, Kunlun Zhu, Jiateng Liu, Aditi Tiwari, Zhenhailong Wang, Xiusi Chen, Mahdi Namazifar, Heng Ji ·

    Advancing Creative Physical Intelligence in Large Multimodal Models

    arXiv:2605.26396v1 Announce Type: new Abstract: Large multimodal models (LMMs) have rapidly advanced in perception and reasoning; however, it remains unclear whether these capabilities generalize to discovering visually grounded solutions in open-ended environments, beyond patter…

  2. Hugging Face Daily Papers TIER_1 English(EN) ·

    Advancing Creative Physical Intelligence in Large Multimodal Models

    Large multimodal models struggle with creative problem-solving in visually complex environments, but performance improves when trained with affordance-grounded alignment that prioritizes visual evidence over hallucinations.