Injecting Distributional Awareness into MLLMs via Reinforcement Learning for Deep Imbalanced Regression

Researchers use reinforcement learning to improve the regression performance of multimodal large language models on imbalanced data

By the PulseAugur editorial team · [1 source] · 2026-05-05 04:00

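Researchers have developed a new framework to improve how multimodal large language models (MLLMs) handle numerical regression tasks, particularly those with imbalanced data distributions. Existing training methods often perform poorly on low-frequency data points. The proposed solution uses a distribution-aware reinforcement learning approach with batch-level supervision to better align the predicted and actual data distributions. Experiments show significant performance gains over standard fine-tuning methods when training samples are limited.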
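To make the idea of batch-level supervision concrete, here is a minimal sketch of a reward that mixes a per-sample error term with a term computed over the whole batch. It is an illustration only, not the paper's reward: the summary does not say which distributional measure the authors use, so the sketch assumes the empirical 1-D Wasserstein-1 distance, and the names `pointwise_reward`, `batch_distribution_gap`, `batch_rewards`, `scale`, and `weight` are all hypothetical.

```python
from typing import List, Sequence


def pointwise_reward(pred: float, target: float, scale: float = 1.0) -> float:
    """Per-sample reward: 1.0 at zero error, decaying toward 0 as error grows."""
    return 1.0 / (1.0 + abs(pred - target) / scale)


def batch_distribution_gap(preds: Sequence[float], targets: Sequence[float]) -> float:
    """Empirical 1-D Wasserstein-1 distance between two equal-size samples.

    For equal-size samples this equals the mean absolute difference
    between the sorted predictions and the sorted targets.
    (Assumed measure; the source does not specify one.)
    """
    if len(preds) != len(targets) or len(preds) == 0:
        raise ValueError("preds and targets must be non-empty and equal in length")
    pairs = zip(sorted(preds), sorted(targets))
    return sum(abs(p - t) for p, t in pairs) / len(preds)


def batch_rewards(
    preds: Sequence[float],
    targets: Sequence[float],
    scale: float = 1.0,
    weight: float = 0.5,
) -> List[float]:
    """Mix each sample's own reward with a bonus shared by the whole batch.

    The shared bonus shrinks when the batch of predictions fails to cover
    the spread of the targets, for example when predictions collapse onto
    the most common value.
    """
    gap = batch_distribution_gap(preds, targets)
    shared_bonus = 1.0 / (1.0 + gap / scale)
    return [
        (1.0 - weight) * pointwise_reward(p, t, scale) + weight * shared_bonus
        for p, t in zip(preds, targets)
    ]


# Hypothetical long-tailed batch: three common targets and one rare one.
targets = [1.0, 2.0, 3.0, 10.0]
collapsed = [2.0, 2.0, 2.0, 2.0]  # every prediction sits in the dense region
spread = [1.0, 2.0, 3.0, 8.0]     # predictions also reach toward the tail

print(batch_distribution_gap(collapsed, targets))  # 2.5
print(batch_distribution_gap(spread, targets))     # 0.5
```

In this toy batch, the sample with target 2.0 is predicted exactly in both cases, yet it earns a lower total reward in the collapsed batch because the shared bonus is smaller. A purely per-sample reward cannot express that kind of pressure against collapsing onto high-density targets.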

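Impact: Introduces a novel training paradigm for multimodal large language models that improves their performance on imbalanced regression datasets, potentially increasing their usefulness in real-world applications with skewed data.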

Ranking rationale: This is a research paper detailing a new framework for improving the performance of multimodal large language models on specific regression tasks.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CV TIER_1 · Yao Du, Shanshan Li, Xiaomeng Li · 2026-05-05 04:00

Injecting Distributional Awareness into MLLMs via Reinforcement Learning for Deep Imbalanced Regression

arXiv:2605.01402v1 Announce Type: cross Abstract: Multimodal large language models (MLLMs) struggle with numerical regression under long-tailed target distributions. Token-level supervised fine-tuning (SFT) and point-wise regression rewards bias learning toward high-density regio…
