PulseAugur

New SSM adapters outperform LoRA for long-context fine-tuning

Researchers have proposed a parameter-efficient fine-tuning (PEFT) method, Hankel Reduced-order Model (HRM) adapters, which uses state space models (SSMs) for long-context fine-tuning. Whereas PEFT methods typically target attention projections, HRM adapters are injected into MLP blocks and exploit the time-invariance of SSMs for efficient computation. In evaluations with Mistral-7B on long-context benchmarks such as LongBench, HRM adapters outperformed LoRA variants, with reported gains in accuracy and ROUGE-1.
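To make the time-invariance point concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the summary does not say how HRM adapters are parameterized, so the function names, shapes, and the random system below are all assumptions chosen for illustration. It shows two standard facts about a discrete linear time-invariant SSM: the step-by-step recurrence equals a causal convolution with the fixed kernel K[k] = C A^k B, and the block Hankel matrix built from that kernel has rank no larger than the state dimension, which is the property Hankel-based reduced-order modeling relies on.

```python
import numpy as np

def ssm_recurrent(A, B, C, u):
    """Step the LTI system x[t] = A x[t-1] + B u[t], y[t] = C x[t], with x[-1] = 0."""
    x = np.zeros(A.shape[0])
    ys = []
    for u_t in u:
        x = A @ x + B @ u_t
        ys.append(C @ x)
    return np.stack(ys)

def markov_kernel(A, B, C, T):
    """Return K[k] = C A^k B for k = 0..T-1, shape (T, d_out, d_in)."""
    K, M = [], B.copy()  # M holds A^k B
    for _ in range(T):
        K.append(C @ M)
        M = A @ M
    return np.stack(K)

def ssm_convolutional(A, B, C, u):
    """Same output as ssm_recurrent, computed as y[t] = sum_k K[k] u[t-k]."""
    T = u.shape[0]
    K = markov_kernel(A, B, C, T)
    y = np.zeros((T, C.shape[0]))
    for t in range(T):
        for k in range(t + 1):
            y[t] += K[k] @ u[t - k]
    return y

def block_hankel(K, m):
    """Block Hankel matrix whose (i, j) block is K[i + j]; needs len(K) >= 2m - 1."""
    return np.block([[K[i + j] for j in range(m)] for i in range(m)])

def random_stable_system(n, d, seed=0):
    """Hypothetical test system: random A rescaled to spectral radius 0.9."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))
    B = rng.standard_normal((n, d))
    C = rng.standard_normal((d, n))
    return A, B, C

A, B, C = random_stable_system(n=4, d=2)
u = np.random.default_rng(1).standard_normal((16, 2))  # 16 time steps, 2 channels

y_rec = ssm_recurrent(A, B, C, u)
y_conv = ssm_convolutional(A, B, C, u)
print("max difference:", np.max(np.abs(y_rec - y_conv)))

H = block_hankel(markov_kernel(A, B, C, 11), 6)  # 12 x 12 matrix
print("Hankel rank:", np.linalg.matrix_rank(H, tol=1e-8), "state dim:", A.shape[0])
```

Because the kernel depends only on the lag k and not on the position t, it can be computed once and reused across the whole sequence. The double loop here is written for clarity; a practical implementation would more likely use an FFT-based convolution or a parallel scan.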

IMPACT Introduces a novel PEFT method that improves performance on long-context tasks, potentially influencing future model fine-tuning strategies.

RANK_REASON The cluster contains a research paper detailing a new method for fine-tuning language models.


AI-generated summary · Google Gemini · from 1 source.


COVERAGE [1]

  1. arXiv cs.AI · TIER_1 · English (EN) · Omanshu Thapliyal

    SSM Adapters via Hankel Reduced-order Modeling: Injection Site Determines Task Suitability in Long-Context Fine-Tuning

    arXiv:2606.26290v1. Announce Type: cross. Abstract: While parameter-efficient fine-tuning (PEFT) typically targets attention projectors, its efficacy for tasks requiring sequential state accumulation remains under-explored. We examine if PEFT for such tasks can benefit from state s…