PulseAugur

Mamba-2

PulseAugur coverage of Mamba-2: every cluster that mentions Mamba-2 across labs, papers, and developer communities, ranked by signal.

Total · 30d: 20 (20 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 16 (16 over 90d)
SENTIMENT · 30D: 7 days with sentiment data

RECENT · PAGE 1/1 · 20 TOTAL
  1. SIGNIFICANT · CL_106351 ·

    NVIDIA Nemotron 3 Nano: Open Model for Efficient AI Agents

    NVIDIA has released Nemotron 3 Nano, a 30-billion parameter open model designed for efficient reasoning and long-context applications. This model utilizes a hybrid Mixture-of-Experts architecture, activating only a frac…

  2. SIGNIFICANT · CL_100955 ·

    NVIDIA unveils efficient Nemotron 3 LLM family with hybrid architecture

    NVIDIA has released two new large language models, Nemotron 3 Nano and Nemotron 3 Ultra, focusing on efficiency and advanced capabilities. Nemotron 3 Nano is a 30B-class model designed for private inference and agentic …

  3. RESEARCH · CL_95821 ·

    Ternary Mamba achieves 3.61x compression via QAT with knowledge distillation

    Researchers have developed a new method for compressing State Space Models (SSMs) like Mamba-2, significantly reducing their memory footprint for edge deployment. By employing grouped quantization-aware training (QAT) w…

  4. RESEARCH · CL_95877 ·

    New N-VSSM Model Outperforms Claude Opus 4.5 in Long-Form Narrative Consistency

    Researchers have developed NarrativeWorldBench, a new benchmark designed to evaluate large language models (LLMs) on their ability to maintain narrative consistency in long-form audio dramas. Current frontier LLMs strug…

  5. TOOL · CL_84911 ·

    Compiler-first duality enables portable O(1) Mamba-2 inference

    Researchers have developed a new method for optimizing Mamba-2 inference, focusing on compiler-first state space duality. This approach enables portable autoregressive caching with O(1) complexity, eliminating the nee…

  6. RESEARCH · CL_84478 ·

    xLSTM outperforms Mamba-2 and DeltaNet in sequence modeling tasks

    A new research paper compares three subquadratic architectures—xLSTM, Mamba-2, and Gated DeltaNet—for sequence modeling tasks. The study found that xLSTM outperformed the others in code-model pre-training, distillation,…

  7. TOOL · CL_82633 ·

    DF-SSM compresses Mamba-2 to 1-bit, boosting speed and reducing size

    Researchers have developed Density Field State Space Models (DF-SSM), a novel framework for compressing large SSMs into a 1-bit scaffold with minimal performance loss. Applied to Mamba-2 1.3B, this method resulted in a …

  8. RESEARCH · CL_68175 ·

    Dynamic convolutions boost Transformer performance in LLMs

    Researchers have introduced dynamic short convolutions as a new primitive to enhance Transformer architectures used in large language models. These dynamic convolutions utilize input-dependent filters, increasing expres…

  9. TOOL · CL_65518 ·

    Mamba-2 interpretation probes miss half of state sink

    Researchers have identified a significant limitation in how Mamba-2's internal workings are understood. They found that standard probing techniques, which aim to link representational signatures to computational executi…

  10. RESEARCH · CL_62204 ·

    New framework unifies sequence models using Bayesian memory

    Researchers have introduced a "design-model" framework for creating efficient recurrent sequence maps based on memory assumptions. This framework uses Bayesian filtering to write evidence into memory and a query-depende…

  11. RESEARCH · CL_56423 ·

    New Oryx Model Flexibly Switches Between Attention and Recurrent Mixers

    Researchers have introduced Oryx, a novel hybrid model designed to flexibly switch between different sequence mixers, such as quadratic attention and linear recurrences, throughout a given sequence. This approach allows…

  12. TOOL · CL_48179 ·

    PapersWithCode adds multi-metric leaderboards and external paper support

    Hugging Face has launched new features for PapersWithCode, a platform tracking AI state-of-the-art. The updates include support for multiple metrics on leaderboards, such as for Automatic Speech Recognition and Object D…

  13. TOOL · CL_44790 ·

    WriteSAE enables direct manipulation of recurrent language model states

    Researchers have developed WriteSAE, a novel sparse autoencoder designed to manipulate the matrix updates within recurrent language model states. This method learns rank-1 matrix atoms that directly replace the model's …

  14. RESEARCH · CL_43909 ·

    NVIDIA unveils Gated DeltaNet-2 for improved linear attention

    NVIDIA has introduced Gated DeltaNet-2, a new linear attention layer designed to improve memory editing in recurrent neural networks. This model separates the processes of erasing old information and writing new informa…

  15. RESEARCH · CL_43911 ·

    MambaGaze framework uses Mamba-2 for cognitive load assessment

    Researchers have developed MambaGaze, a new framework designed to accurately assess cognitive load using eye-gaze tracking data. This system utilizes bidirectional Mamba-2 to efficiently model long-range temporal depend…

  16. FRONTIER RELEASE · CL_71083 ·

    NVIDIA releases Nemotron-3 Ultra 550B LLM for advanced reasoning

    NVIDIA has released its Nemotron-3 Ultra 550B model, a large language model designed for advanced reasoning and agentic workflows. This model features a hybrid LatentMoE architecture with Mamba-2 and attention layers, s…

  17. TOOL · CL_32672 ·

    REALM framework enables real-time LFP decoding for BCIs

    Researchers have developed REALM, a new framework for real-time decoding of local field potentials (LFPs) in brain-computer interfaces. This method uses a retrospective distillation process to transfer knowledge from a …

  18. TOOL · CL_15849 ·

    Component-aware self-speculative decoding boosts hybrid language model inference

    Researchers have developed a new method called component-aware self-speculative decoding, which enhances the efficiency of hybrid language models. This technique leverages the internal architectural differences within t…

  19. RESEARCH · CL_04999 ·

    Researchers explore optimal LoRA placement in hybrid language models

    A new paper explores the optimal placement of LoRA adapters in hybrid language models, which combine attention and recurrent components. The research demonstrates that adapting the attention pathway is more effective th…

  20. SIGNIFICANT · CL_47662 ·

    Together AI releases Mamba-3, prioritizing inference speed over training

    Together AI has released Mamba-3, a new state space model (SSM) prioritizing inference efficiency over training speed. This model features a more expressive recurrence formula, complex-valued state tracking, and a multi…