Introducing Nested Learning: A new ML paradigm for continual learning
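AI continual learning research tackles catastrophic forgetting
By the PulseAugur editorial team · [11 sources]
Researchers are exploring new approaches to continual learning in AI, aiming to overcome "catastrophic forgetting," in which a model loses previously learned information when it learns new skills. Google Research has introduced "Nested Learning," a paradigm that treats a model as a set of interconnected optimization problems, to mitigate this problem. Other work focuses on efficient methods such as CIRCLE, which uses fixed repository features, and SAE-guided activation regularization for LLMs, which operates in activation space rather than weight space. In addition, new metrics are being developed to better characterize forgetting, and novel optimizers such as CoVON have been proposed to balance stability and plasticity in continual learning systems. Survey results indicate that AI safety researchers lack consensus on the future timelines and risks of widespread continual learning agents.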
AI
arXiv:2606.27095v1 Announce Type: cross Abstract: Cold-start exemplar-free class-incremental learning requires learning a growing set of classes without replay, external pretraining, or a large initial task. Existing cold-start methods typically either train the backbone throughout the stream and compensate for semantic drift, o…
arXiv:2606.26629v1 Announce Type: cross Abstract: Weight-space regularization methods such as Elastic Weight Consolidation (EWC) are the standard approach to catastrophic forgetting in continual learning. However, those methods tend to underperform when applied to large language models. We argue that such underperformance can be…
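For readers unfamiliar with the weight-space regularization mentioned in the abstract above, the following is a minimal sketch of the quadratic EWC penalty as it is commonly described: (λ/2) · Σ F_i (θ_i − θ*_i)², where θ* are the weights after the earlier task and F is a diagonal Fisher information estimate. The function names, the plain-list representation, and the toy numbers are assumptions made for illustration; none of them come from the cited abstract.

```python
def ewc_penalty(params, anchor_params, fisher_diag, lam):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2.

    params        -- current weights (theta)
    anchor_params -- weights saved after the earlier task (theta_star)
    fisher_diag   -- per-weight importance (diagonal Fisher estimate)
    lam           -- regularization strength (hypothetical value chosen by the user)
    """
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )


def ewc_penalty_grad(params, anchor_params, fisher_diag, lam):
    """Gradient of the penalty with respect to each weight: lam * F_i * (theta_i - theta_star_i)."""
    return [lam * f * (p - a) for p, a, f in zip(params, anchor_params, fisher_diag)]


# Toy numbers (illustrative only): the first weight is "important" to the old task.
theta_star = [1.0, -2.0]
fisher = [4.0, 0.1]
theta = [1.5, 0.0]

penalty = ewc_penalty(theta, theta_star, fisher, lam=2.0)
grad = ewc_penalty_grad(theta, theta_star, fisher, lam=2.0)
```

In this toy case the first weight moved only 0.5 from its anchor but contributes more to the penalty than the second weight, which moved 2.0, because its importance estimate is much larger. During training on a new task the penalty would be added to the new task's loss.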
arXiv:2512.18471v2 Announce Type: replace Abstract: Continual learning systems face a fundamental geometric obstacle: as experience accumulates on a fixed-capacity manifold, covering numbers grow linearly with time, eventually forcing representational overlap and catastrophic int…
arXiv cs.LG
Ahmed Anwar, Andreas Wagner, Federico Raue, Tobias Nauen, Andreas Dengel
arXiv:2606.25165v1 Announce Type: new Abstract: Accuracy degradation is the standard metric for Catastrophic Forgetting (CF), however, it records only whether forgetting occurred or not. It saturates at the extremes and collapses discretely at task boundaries, hiding the internal…
arXiv cs.AI
Subarnaduti Paul, Yohan Jung, Mohammad Emtiyaz Khan, Siddharth Swaroop, Thomas Möllenhoff, Martin Mundt
arXiv:2606.24007v1 Announce Type: cross Abstract: Continual learning remains a major challenge for modern deep networks, partly because commonly used optimizers lack inherent mechanisms for continual adaptation. One such natural mechanism is fast and slow adaptation to balance stability and plasticity. This mechanism has deep ro…
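One generic way to picture the "fast and slow adaptation" idea in the abstract above is a pair of weight copies: a fast copy that follows each gradient step (plasticity) and a slow copy that tracks an exponential moving average of the fast one (stability). The sketch below is an illustration under that assumption only. It is not the optimizer proposed in the paper, and `fast_slow_step`, the learning rate, and the averaging factor are all hypothetical choices.

```python
def fast_slow_step(fast, slow, grad, lr=0.5, ema=0.9):
    """One update of a generic fast/slow weight pair (illustrative, not the paper's method).

    fast -- weights updated by a plain gradient step (adapts quickly)
    slow -- exponential moving average of the fast weights (changes slowly)
    """
    new_fast = [w - lr * g for w, g in zip(fast, grad)]
    new_slow = [ema * s + (1.0 - ema) * f for s, f in zip(slow, new_fast)]
    return new_fast, new_slow


def quadratic_grad(weights, target):
    """Gradient of 0.5 * (w - target)^2 for each weight."""
    return [w - target for w in weights]


fast, slow = [0.0], [0.0]

# "Task A": pull the weight toward +1 for many steps.
for _ in range(50):
    fast, slow = fast_slow_step(fast, slow, quadratic_grad(fast, 1.0))

# "Task B": the target flips to -1 for a few steps.
for _ in range(3):
    fast, slow = fast_slow_step(fast, slow, quadratic_grad(fast, -1.0))
```

After the short second phase the fast weight has already crossed over toward the new target, while the slow weight still sits close to the old solution, which is the stability and plasticity trade-off in miniature.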
This is the fifth post in the sequence Implications of Continual Learning for LLM Agents (https://www.lesswrong.com/s/oc5Auteiibo56kNXw). Summary: While writing our continua…
My question on live continual learning use cases was removed by the moderators here because they thought I had asked a basic-level question about live continual learning, which I thought was frontier-level research. But anyway, is anyone interested in talki…