New Benchmark Tests LMMs' Creative Physical Intelligence

By PulseAugur Editorial Team · [2 sources] · 2026-05-25 00:00

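Researchers have developed MM-CreativityBench, a new benchmark for evaluating the creative physical intelligence of large multimodal models (LMMs). It focuses on a model's ability to identify and repurpose objects in visually rich, physically constrained environments, a capability that current models often lack. To address this gap, the researchers propose an affordance-grounded alignment method based on Direct Preference Optimization that encourages models to rely on visual evidence and hallucinate less, improving entity selection and affordance reasoning.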

Impact: The benchmark could drive the development of LMMs whose creative problem-solving goes beyond pattern recognition.

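Ranking rationale: This cluster describes a new academic paper that introduces a novel benchmark and methodology for evaluating AI models.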

AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.AI TIER_1 English(EN) · Cheng Qian, Hyeonjeong Ha, Jiayu Liu, Jeonghwan Kim, Emre Can Acikgoz, Bingxuan Li, Kunlun Zhu, Jiateng Liu, Aditi Tiwari, Zhenhailong Wang, Xiusi Chen, Mahdi Namazifar, Heng Ji · 2026-05-27 04:00

Advancing Creative Physical Intelligence in Large Multimodal Models

arXiv:2605.26396v1 Announce Type: new Abstract: Large multimodal models (LMMs) have rapidly advanced in perception and reasoning; however, it remains unclear whether these capabilities generalize to discovering visually grounded solutions in open-ended environments, beyond patter…

Hugging Face Daily Papers TIER_1 English(EN) · 2026-05-25 00:00

Advancing Creative Physical Intelligence in Large Multimodal Models

Large multimodal models struggle with creative problem-solving in visually complex environments, but performance improves when trained with affordance-grounded alignment that prioritizes visual evidence over hallucinations.
