wikitext
PulseAugur coverage of wikitext — every cluster mentioning wikitext across labs, papers, and developer communities, ranked by signal.
-
New framework enables linear merging of billion-parameter transformers
Researchers have developed a framework for merging large pretrained transformers with billions of parameters. The method addresses limitations of previous approaches by optimizing interpolation …
-
New research explores merging large transformers and improving looped model stability
Two new research papers explore novel techniques for enhancing the capabilities and stability of large transformer models. The first paper introduces a scalable framework for linear mode connectivity (LMC) that allows f…
-
New methods like SMF and SAM reduce catastrophic forgetting in LLMs
Two new research papers explore methods to mitigate catastrophic forgetting in language models during fine-tuning. One paper introduces Sparse Memory Finetuning (SMF), which adds memory layers and updates only heavily a…