ARCA: Adapter-Residual Credit Assignment When Token Signals Degenerate
Researchers have introduced Adapter-Residual Credit Assignment (ARCA), a method for assigning token-level credit in reinforcement learning for language models. ARCA targets a failure mode of parameter-efficient fine-tuning methods such as LoRA, in which standard token-level credit signals can become degenerate. Instead of relying on changes in the output distribution, ARCA measures the adapter's actual impact on the model's hidden states. The approach requires no additional learned components and has shown competitive results in experiments on the MATH dataset with Qwen3-1.7B.
AI IMPACT Introduces a token-level credit assignment technique aimed at making reinforcement learning with parameter-efficient fine-tuning of language models more effective.
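
The summary above does not give ARCA's actual formula, so the following is only a rough illustration of the general idea of measuring an adapter's effect on hidden states. It assumes a single LoRA layer whose update to a token's hidden state h is scale * B(Ah), and it assumes, purely for illustration, that a sequence-level advantage is spread across tokens in proportion to the norm of that update. The function names, the normalization, and the toy values are all hypothetical and are not taken from the paper.

```python
import math


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def adapter_residual_norms(hidden, lora_a, lora_b, scale):
    """L2 norm of the LoRA update scale * B @ (A @ h) for each token's hidden state h.

    Assumes scale is non-negative (as with the usual alpha / r).
    """
    norms = []
    for h in hidden:
        delta = matvec(lora_b, matvec(lora_a, h))
        norms.append(scale * math.sqrt(sum(d * d for d in delta)))
    return norms


def token_credit(seq_advantage, norms, eps=1e-8):
    """Spread one sequence-level advantage over tokens in proportion to residual norm.

    Illustrative assumption only: weights are norms divided by their mean,
    so the average token weight is about 1.
    """
    mean = sum(norms) / len(norms)
    return [seq_advantage * n / (mean + eps) for n in norms]


# Toy example: 3 tokens, hidden size 2, rank-1 adapter (all values made up).
hidden = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
lora_a = [[1.0, 0.0]]      # shape (r=1, d=2)
lora_b = [[0.5], [0.0]]    # shape (d=2, r=1)

norms = adapter_residual_norms(hidden, lora_a, lora_b, scale=2.0)
credits = token_credit(1.0, norms)
```

In this toy setup the second token lies in a direction the adapter ignores, so it receives zero credit, while the third token receives twice the credit of the first. A signal of this kind depends on what the adapter does to the hidden state rather than on how the output distribution shifts, which is the contrast the summary draws.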