PulseAugur / Brief

last 24h
[2/2] 221 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. A Data-Efficient Path to Multilingual LLMs: Language Expansion via Post-training PARAMΔ Integration into Upcycled MoE

    Researchers have developed a method for extending Large Language Models (LLMs) to more languages without extensive retraining. The technique converts a dense model into a Mixture-of-Experts (MoE) architecture in which different experts handle different languages. New language capabilities are then integrated through post-training parameter deltas, which avoids complex alignment phases and preserves the model's original abilities.

    IMPACT This method could significantly reduce the cost and complexity of making LLMs multilingual, potentially accelerating global access to advanced AI capabilities.

  2. @manicely6005 The public documentation can be found here too (3/3)

    NVIDIA has open-sourced parts of its cuDNN library, which had been closed-source for 12 years. The release includes more than 20 Mixture-of-Experts (MoE) kernels and NSA sparse attention kernels. Most of the kernel code is written in Python CuTe-DSL, and public documentation is now available.

    IMPACT Open-sourcing cuDNN kernels could accelerate research and development in AI infrastructure and model optimization.