PulseAugur

New Benchmark Tests LMMs' Creative Physical Intelligence

Researchers have introduced MM-CreativityBench, a benchmark for evaluating the creative physical intelligence of large multimodal models (LMMs). It tests whether LMMs can identify and repurpose objects in visually rich, physically constrained environments, a capability that current models often lack. To close this gap, the researchers propose an affordance-grounded alignment method based on Direct Preference Optimization that encourages models to rely on visual evidence and reduces hallucinations, improving entity selection and grounded reasoning.

IMPACT This benchmark could drive development of LMMs with more sophisticated creative problem-solving abilities, moving beyond pattern recognition.

RANK_REASON The cluster describes a new academic paper introducing a novel benchmark and methodology for evaluating AI models.


AI-generated summary · Google Gemini · from 2 sources.


COVERAGE [2]

  1. arXiv cs.AI TIER_1 English(EN) · Cheng Qian, Hyeonjeong Ha, Jiayu Liu, Jeonghwan Kim, Emre Can Acikgoz, Bingxuan Li, Kunlun Zhu, Jiateng Liu, Aditi Tiwari, Zhenhailong Wang, Xiusi Chen, Mahdi Namazifar, Heng Ji

    Advancing Creative Physical Intelligence in Large Multimodal Models

    arXiv:2605.26396v1. Abstract: Large multimodal models (LMMs) have rapidly advanced in perception and reasoning; however, it remains unclear whether these capabilities generalize to discovering visually grounded solutions in open-ended environments, beyond patter…

  2. Hugging Face Daily Papers TIER_1 English(EN)

    Advancing Creative Physical Intelligence in Large Multimodal Models

    Large multimodal models struggle with creative problem-solving in visually complex environments, but performance improves when trained with affordance-grounded alignment that prioritizes visual evidence over hallucinations.