Researchers have developed a framework to improve how multimodal large language models (MLLMs) handle numerical regression tasks, particularly those with imbalanced data distributions. Existing training methods often perform poorly on infrequent target values. The proposed solution uses a distribution-aware reinforcement learning approach with batch-level supervision to better align the predicted distribution with the actual one. Experiments showed significant improvements over standard fine-tuning methods, especially in scenarios with limited training examples.
Impact: Introduces a training paradigm for MLLMs that improves performance on imbalanced regression datasets, which could make them more useful in real-world applications with skewed data.
Ranking rationale: This is a research paper detailing a new framework for improving MLLM performance on specific regression tasks.
- Concordance Correlation Coefficient
- Group Relative Policy Optimization
- MLLMs
- Reinforcement Learning
- Supervised Fine-Tuning
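
For readers unfamiliar with the first tag, the concordance correlation coefficient (CCC) is a score computed over a whole batch of predictions rather than one example at a time. The summary does not say how the paper uses it, so the following is only a minimal sketch of the standard metric itself; the function name and the zero-denominator convention are assumptions made for illustration.

```python
from statistics import fmean


def concordance_correlation(pred, target):
    """Lin's concordance correlation coefficient for one batch.

    Illustrative helper (hypothetical name), not code from the paper.
    """
    if len(pred) != len(target) or len(pred) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_p, mean_t = fmean(pred), fmean(target)

    # Population (1/n) variances and covariance over the batch.
    var_p = fmean([(p - mean_p) ** 2 for p in pred])
    var_t = fmean([(t - mean_t) ** 2 for t in target])
    cov = fmean([(p - mean_p) * (t - mean_t) for p, t in zip(pred, target)])

    denom = var_p + var_t + (mean_p - mean_t) ** 2
    # Assumption: return 0.0 when both sequences are the same constant,
    # where the coefficient is mathematically undefined.
    return 2 * cov / denom if denom else 0.0


targets = [1.0, 2.0, 3.0, 4.0]
exact = concordance_correlation([1.0, 2.0, 3.0, 4.0], targets)      # full agreement
shifted = concordance_correlation([2.0, 3.0, 4.0, 5.0], targets)    # constant offset
collapsed = concordance_correlation([2.5, 2.5, 2.5, 2.5], targets)  # always the mean
```

Unlike a per-sample error, the score drops when predictions are offset from the targets and falls to zero when a model predicts the same value for every input, which is the kind of collapse that skewed data can encourage.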
AI-generated summary · Google Gemini · From 1 source.