Pareto LoRA: Mitigating Modality Imbalance in Unified Multimodal Models via Pareto-Optimal Gradient Integration

New Pareto LoRA method balances text and image gradients in multimodal models

By PulseAugur Editorial · [1 source] · 2026-06-17 04:00

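Researchers have introduced Pareto LoRA, a method for addressing modality imbalance in unified multimodal models (UMMs) during parameter-efficient fine-tuning. The imbalance is especially common in LoRA-based fine-tuning, where language gradients overwhelm image generation and degrade visual quality. Pareto LoRA recasts multimodal instruction tuning as a bi-objective optimization problem and integrates the text and image gradients with a Pareto-optimal strategy that balances their direction and magnitude. Experiments with Emu2 on the CoMM benchmark show that Pareto LoRA markedly improves the balance of multimodal generation, raising perceptual image quality by 44.9% while maintaining text performance.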
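The summary does not spell out how the two gradients are combined, so the following is only a minimal sketch of one generic way to do Pareto-style gradient integration for two objectives: the closed-form min-norm weighting used in multiple-gradient descent. It is not the paper's algorithm. The function names (`min_norm_weight`, `combine_gradients`) and the toy gradient vectors are hypothetical, and plain Python lists stand in for the flattened LoRA parameter gradients.

```python
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def min_norm_weight(g_text: Sequence[float], g_image: Sequence[float]) -> float:
    """Weight alpha in [0, 1] minimizing ||alpha*g_text + (1-alpha)*g_image||^2.

    Generic two-objective min-norm solution (an assumption for illustration,
    not the method described in the paper).
    """
    diff = [t - i for t, i in zip(g_text, g_image)]
    denom = dot(diff, diff)
    if denom == 0.0:
        # Identical gradients: every weighting gives the same vector.
        return 0.5
    neg_diff = [-d for d in diff]  # g_image - g_text
    alpha = dot(neg_diff, g_image) / denom
    # Clip so the result stays a convex combination of the two gradients.
    return min(1.0, max(0.0, alpha))


def combine_gradients(
    g_text: Sequence[float], g_image: Sequence[float]
) -> Tuple[float, List[float]]:
    """Return (alpha, combined gradient) for the two modality gradients."""
    alpha = min_norm_weight(g_text, g_image)
    combined = [alpha * t + (1.0 - alpha) * i for t, i in zip(g_text, g_image)]
    return alpha, combined
```

With a toy text gradient `[10.0, 0.0]` that is ten times larger than the image gradient `[0.0, 1.0]`, `combine_gradients` assigns the text gradient a weight of 1/101 (about 0.0099). The combined vector has a positive inner product with both inputs, so a step against it reduces both losses to first order rather than letting the larger gradient dominate.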

Impact: The method could improve the quality and balance of image generation in multimodal AI systems.

Ranking rationale: This cluster contains a paper detailing a new method for multimodal models.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CV · Xiwen Wei, Mark Nutter, Madhusudhanan Srinivasan, Radu Marculescu · 2026-06-17 04:00

Pareto LoRA: Mitigating Modality Imbalance in Unified Multimodal Models via Pareto-Optimal Gradient Integration

arXiv:2606.17296v1 Announce Type: new Abstract: Unified multimodal models (UMMs) have recently emerged as a promising paradigm for integrating multimodal understanding and generation within a single autoregressive transformer. However, during multimodal instruction tuning, these …

