Researchers have developed Representation-Guided Low-rank Unlearning (REGLU), a method for removing specific information from large language models (LLMs) without degrading their overall performance. Existing techniques often struggle to balance forgetting unwanted data against retaining useful knowledge, because they are limited in how well they identify the parameters that matter. REGLU uses the geometric properties of representation spaces and a novel LoRA initialization to pinpoint the parameters for selective forgetting, while a regularization loss limits the impact on the model's retained knowledge. Evaluations on benchmarks such as TOFU and WMDP show that REGLU surpasses current methods in both unlearning quality and model utility.
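To make the ingredients named above more concrete, here is a minimal toy sketch of a generic LoRA-style low-rank update combined with a forget/retain trade-off. It is not the REGLU algorithm: the summary does not describe the actual initialization or loss, so the hand-picked adapter direction, the function names (`lora_forward`, `unlearning_objective`), and the weight `lam` are all illustrative assumptions.

```python
# Illustrative sketch only: a generic low-rank (LoRA-style) update and a
# forget/retain objective. NOT the REGLU method; all names are hypothetical.

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lora_forward(W, A, B, x):
    """Compute (W + B A) x without forming the full update matrix."""
    base = matvec(W, x)                  # frozen pretrained weights
    low_rank = matvec(B, matvec(A, x))   # rank-r correction: B is d_out x r, A is r x d_in
    return [b + l for b, l in zip(base, low_rank)]

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def unlearning_objective(forget_shift, retain_drift, lam=1.0):
    """Generic trade-off: reward change on forget data, penalize drift on retain data."""
    return -forget_shift + lam * retain_drift

W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]  # toy frozen layer, d_out=2, d_in=3
A = [[0.5, -0.5, 0.0]]                   # assumed adapter direction, rank r=1
B = [[1.0], [2.0]]
x_retain = [1.0, 1.0, 3.0]               # A is orthogonal to this input by construction
x_forget = [1.0, -1.0, 0.0]

retain_drift = sq_dist(lora_forward(W, A, B, x_retain), matvec(W, x_retain))
forget_shift = sq_dist(lora_forward(W, A, B, x_forget), matvec(W, x_forget))
print(retain_drift, forget_shift)        # 0.0 5.0
print(unlearning_objective(forget_shift, retain_drift))  # -5.0
```

In this toy setup the adapter changes the output for the forget input while leaving the retained input untouched, because the row of `A` was chosen by hand to be orthogonal to `x_retain`. A real method would have to derive such directions from model representations and train the adapter rather than fix it manually.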
Ranking rationale: This is a research paper detailing a new method for LLM unlearning.
Read on Hugging Face Daily Papers →
AI-generated summary · Google Gemini · from 1 source.