PulseAugur
New Methods Improve LLM Fine-Tuning for Better Performance

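Researchers have developed new methods to improve supervised fine-tuning (SFT) of large language models. One method, FisherAdapTune, uses Fisher information geometry to dynamically select which parameter groups to adapt, improving both in-distribution performance and zero-shot transfer. Another set of methods, including Target-SFT and PriFT, reframes SFT as a problem of target distribution design. By better aligning fine-tuning with the model's pretrained knowledge, these techniques aim to build more stable and effective training objectives, and they report state-of-the-art results on a range of reasoning and code generation tasks.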
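The idea of scoring parameter groups by Fisher information can be illustrated with a toy example. The sketch below is not the FisherAdapTune algorithm, whose details are not given in this summary; it assumes a plain logistic-regression model, an empirical diagonal Fisher estimate (the mean of squared per-example gradients), and a simple top-k rule. The group names `embed` and `head`, the dataset, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def per_example_grad(weights, x, y):
    """Gradient of log p(y | x) w.r.t. each weight of a logistic regression."""
    p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
    return [(y - p) * xi for xi in x]

def diagonal_fisher(weights, data):
    """Empirical diagonal Fisher: mean of squared per-example gradients."""
    fisher = [0.0] * len(weights)
    for x, y in data:
        for i, g in enumerate(per_example_grad(weights, x, y)):
            fisher[i] += g * g
    return [f / len(data) for f in fisher]

def select_groups(fisher, groups, k):
    """Rank groups by mean Fisher score (assumed rule) and keep the top k."""
    scores = {name: sum(fisher[i] for i in idx) / len(idx)
              for name, idx in groups.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Hypothetical toy data: four features, binary labels.
data = [
    ([0.1, -0.1, 1.0, 2.0], 1),
    ([0.2, 0.0, -1.0, 1.0], 0),
    ([-0.1, 0.1, 2.0, -1.0], 1),
    ([0.0, 0.2, -2.0, -2.0], 0),
]
weights = [0.0, 0.0, 0.0, 0.0]                # stand-in for pretrained weights
groups = {"embed": [0, 1], "head": [2, 3]}    # hypothetical parameter groups

fisher = diagonal_fisher(weights, data)
selected, scores = select_groups(fisher, groups, k=1)
print(selected, scores)
```

Only the selected groups would then be unfrozen for training. A dynamic scheme would recompute the scores as training proceeds, rather than fixing the trainable subset up front.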

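Impact: These advances in fine-tuning techniques could make adapting large language models to specific downstream tasks more efficient and more effective.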

Ranking rationale: Multiple academic papers introduce novel approaches to supervised fine-tuning.

AI-generated summary · Google Gemini · from 6 sources.

Sources [6]

  1. arXiv cs.AI TIER_1 English(EN) · Ghodsiyeh Rostami, Po-Han Chen, Mahdi S. Hosseini

    Fisher-Guided Progressive Parameter Selection for Adaptive Fine-Tuning

    arXiv:2606.10196v1. Abstract: Parameter-efficient fine-tuning (PEFT) aims to adapt pretrained models with a small trainable parameter subset, however, most existing methods choose this subset from fixed architectural heuristics rather than using dynamic, task-…

  2. arXiv cs.AI TIER_1 English(EN) · Tong Xie, Yuanhao Ban, Yunqi Hong, Sohyun An, Yihang Chen, Cho-Jui Hsieh

    A Unifying Lens on Supervised Fine-Tuning Through Target Distribution Design

    arXiv:2606.11189v1. Abstract: Supervised fine-tuning (SFT) typically maximizes the likelihood of every token in a demonstrated trajectory. However, an observed token can be non-unique, noisy, or misaligned with the model prior. Strictly fitting toward this one…

  3. arXiv cs.CL TIER_1 English(EN) · Cho-Jui Hsieh

    A Unifying Lens on Supervised Fine-Tuning Through Target Distribution Design

    Supervised fine-tuning (SFT) typically maximizes the likelihood of every token in a demonstrated trajectory. However, an observed token can be non-unique, noisy, or misaligned with the model prior. Strictly fitting toward this one-hot target may be suboptimal, especially when the…

  4. arXiv cs.LG TIER_1 English(EN) · Ke Wang, Shuangqi Li, Mathieu Salzmann, Pascal Frossard

    PriFT: Prior-Knowledge-Guided Supervised Fine-Tuning

    arXiv:2606.09396v1. Abstract: Supervised fine-tuning (SFT) is an efficient approach for downstream task adaptation and often serves as the initialization stage for reinforcement learning (RL), but it can show weaker generalization than RL. A key limitation is …

  5. arXiv cs.CL TIER_1 English(EN) · Pascal Frossard

    PriFT: Prior-Support-Guided Supervised Fine-Tuning

    Supervised fine-tuning (SFT) is an efficient approach for downstream task adaptation and often serves as the initialization stage for reinforcement learning (RL), but it can show weaker generalization than RL. A key limitation is its off-policy objective: SFT fits fixed demonstra…

  6. Medium (fine-tuning tag) TIER_1 English(EN) · Panisetti Prudhviraj

    Understanding Fine-Tuning: From Zero to Hero (basics and why)

    Imagine I just hired a professional pianist who already knows how to play all kinds of music (Jazz, Pop, Classical… everything). https://infiniteknowledge.medium.com/unders…