Mamba-Transformer

PulseAugur coverage of Mamba-Transformer: every cluster that mentions it across labs, papers, and developer communities, ranked by signal.

Total · 30d

5 over 90d

Releases · 30d

0 over 90d

Papers · 30d

4 over 90d

TIER MIX · 90D

frontier release 1
research 2
tool 2

TOPICS

SENTIMENT · 30D

2 days with sentiment data

RECENT · PAGE 1/1 · 5 TOTAL

RESEARCH · CL_93241 · Jun 12 · 00:00

Nemotron 3 Ultra: Open-Source LLM Boasts 1M Context, 6x Throughput

Researchers have introduced Nemotron 3 Ultra, a 550 billion parameter language model that utilizes a hybrid Mamba-Transformer architecture with a Mixture-of-Experts approach. The model was trained on 20 trillion tokens …

FRONTIER RELEASE · CL_71132 · Jun 4 · 13:01

NVIDIA releases open 550B Nemotron 3 models for agents and ASR

NVIDIA has released its Nemotron 3 family of open-source models, including Nemotron 3 Ultra and Nemotron 3.5 ASR. Nemotron 3 Ultra is a 550 billion parameter model designed for long-running AI agents, featuring a hybrid…

RESEARCH · CL_36662 · May 18 · 08:42

NVIDIA unveils 4-bit pretraining method, NVFP4, for LLMs

NVIDIA has developed a new 4-bit pretraining methodology called NVFP4, designed to overcome the challenges of reduced dynamic range and increased quantization error in narrower floating-point formats. This method was su…

TOOL · CL_21962 · May 8 · 04:00

New nGPT architecture enables native 4-bit training for LLMs

Researchers have developed a new neural network architecture called nGPT that natively supports 4-bit precision training for large language models. This architecture constrains weights and hidden representations to a un…

RESEARCH · CL_01012 · Feb 4 · 18:00

Why Nvidia builds open models with Bryan Catanzaro

Nvidia is significantly expanding its open model program, releasing higher quality models and datasets. This strategy benefits Nvidia by capturing value from open language models, creating a sustainable advantage. The c…
