logarithmic loss
PulseAugur coverage of logarithmic loss: every cluster mentioning logarithmic loss across labs, papers, and developer communities, ranked by signal.
-
New SFT objectives outperform NLL for capable LLMs
Researchers have explored alternative objectives for supervised fine-tuning (SFT) of large language models, moving beyond the standard negative log likelihood (NLL). Their study, involving extensive experiments across v…
-
DeepSeek-V4, LoRA, and other LLM techniques detailed in new blog posts
A series of six blog posts has been published on Outcome School, detailing fundamental components of contemporary large language models. The posts cover technical concepts such as RMSNorm, DeepSeek-V4, LoRA, RoPE, GQA, …
-
New framework optimizes deep learning training by separating layers
Researchers have introduced a novel framework called Layer Separation Optimization to address challenges in training deep learning models with cross-entropy loss. This method aims to mitigate the strong nonconvexity iss…