Researchers have developed a new method called HCM-GRPO to improve the physical plausibility reasoning capabilities of multimodal large language models (MLLMs). The approach integrates a Hard Cases Mining strategy and a Dynamic Proportional Accuracy reward into the Group Relative Policy Optimization (GRPO) framework. To support this, a dataset of over 128,000 samples, comprising around 640,000 images, was created to evaluate reasoning across appearance, shadow, layout, and extension rationality. Experiments showed that even advanced models like GPT5.2 and Gemini3-Pro struggle with this task, while the HCM-GRPO method achieved superior results with a smaller model.
AI IMPACT Enhances AI's ability to understand and generate physically plausible images, potentially improving image screening and content moderation.
RANK_REASON The cluster contains an academic paper detailing a new method and dataset for improving AI model capabilities.
AI-generated summary · Google Gemini · from 1 source.