PulseAugur
English(EN) CRAM: Centroid-Routing and Adaptive MoE for Multimodal Continual Instruction Tuning

New methods strengthen continual learning for LLMs on multimodal tasks

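Researchers have developed new methods to improve multimodal continual instruction tuning for large language models. CRAM isolates task-specific patterns into separate modules and uses adaptive rank instantiation to allocate parameters efficiently. ProtoAda introduces format-aware task prototypes so that task assignment aligns with both semantics and output structure, and it consolidates updates geometrically. PROXYMIX learns a dynamic replay controller on a small proxy model and transfers it to a larger target model to mitigate forgetting and preserve alignment behavior.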
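To make the idea of routing inputs to isolated task modules concrete, here is a minimal sketch of generic nearest-centroid routing. It is an illustration only: the summary does not say how CRAM actually computes or uses centroids, so the cosine-similarity rule, the toy feature vectors, and the names `centroid`, `route`, and `task_features` are all assumptions introduced here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Element-wise mean of a list of feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def route(query, centroids):
    """Return the task whose centroid is most similar to the query."""
    return max(centroids, key=lambda task: cosine(query, centroids[task]))


# Hypothetical toy features for two tasks (not taken from any paper).
task_features = {
    "captioning": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "vqa": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}
centroids = {task: centroid(vecs) for task, vecs in task_features.items()}

# A new input is sent to the module of the closest task centroid.
chosen = route([0.85, 0.15, 0.05], centroids)
print(chosen)
```

In a setup like this, adding a new task only adds a new centroid and a new module, which is one plausible way to keep earlier tasks untouched; whether the papers do it this way is not stated in the summary.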

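Impact: These methods aim to improve the adaptability and robustness of multimodal LLMs in real-world, continually changing deployment scenarios.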

Ranking rationale: Multiple research papers propose novel methods for multimodal continual instruction tuning.


AI-generated summary · Google Gemini · from 5 sources.

Sources [5]

  1. arXiv cs.CL TIER_1 English(EN) · Jun-Tao Tang, Zhen-Hao Xie, Yu-Cheng Shi, Da-Wei Zhou ·

    CRAM: Centroid-Routing and Adaptive MoE for Multimodal Continual Instruction Tuning

    arXiv:2606.02502v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) unify heterogeneous vision-language tasks under a shared generative framework via instruction tuning, yet real-world deployment demands continuous capability expansion, making Multimodal Cont…

  2. arXiv cs.LG TIER_1 English(EN) · Ibne Farabi Shihab, Fariya Afrin, Anuj Sharma ·

    Dynamic Proxy-Mixing: Transferring Replay Controllers from Small to Large Models for Continual Instruction Tuning

    arXiv:2606.00400v1 Announce Type: new Abstract: Continual instruction tuning updates a language model through a sequence of new domains, yet each update can progressively erode previously learned capabilities and alignment behavior. Replay is the standard mitigation, but fixed re…

  3. arXiv cs.LG TIER_1 English(EN) · Yu-Cheng Shi, Zhen-Hao Xie, Jun-Tao Tang, Da-Wei Zhou ·

    ProtoAda: Prototype-Guided Adaptive Adapter Expansion and Geometric Consolidation for Multimodal Continual Instruction Tuning

    arXiv:2606.02576v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) achieve strong performance through instruction tuning, but real-world deployment requires them to continually acquire new vision-language capabilities, making Multimodal Continual Instructi…

  4. arXiv cs.LG TIER_1 English(EN) · Da-Wei Zhou ·

    ProtoAda: Prototype-Guided Adaptive Adapter Expansion and Geometric Consolidation for Multimodal Continual Instruction Tuning

    Multimodal Large Language Models (MLLMs) achieve strong performance through instruction tuning, but real-world deployment requires them to continually acquire new vision-language capabilities, making Multimodal Continual Instruction Tuning (MCIT) essential. To reduce inter-task i…

  5. arXiv cs.CL TIER_1 English(EN) · Da-Wei Zhou ·

    CRAM: Centroid-Routing and Adaptive MoE for Multimodal Continual Instruction Tuning

    Multimodal Large Language Models (MLLMs) unify heterogeneous vision-language tasks under a shared generative framework via instruction tuning, yet real-world deployment demands continuous capability expansion, making Multimodal Continual Instruction Tuning (MCIT) essential. Exist…