Megatron-LM
PulseAugur coverage of Megatron-LM — every cluster mentioning Megatron-LM across labs, papers, and developer communities, ranked by signal.
-
New benchmarks and methods advance multimodal LLM capabilities
Researchers are developing new methods for multimodal large language models (MLLMs) to improve their understanding of sequential audio-video data and their performance on large-scale visual recognition. One approach, DLLM-VSR, uses diffusio…
-
PyTorch tutorial simplifies distributed AI model inference
This article explains distributed inference techniques for large AI models using PyTorch. It details how to implement Data Parallelism (DP), Tensor Parallelism (TP), and Pipeline Parallelism (PP) with minimal code. The …
-
New 1.58-bit LLM family achieves 6x inference memory reduction
A new family of large language models, BitCPM-CANN, has been developed using a novel 1.58-bit ternary quantization technique. These models, ranging from 0.5B to 8B parameters, achieve significant memory reduction for in…
-
New methods tackle LLM quantization for improved efficiency and accuracy
Researchers have developed several new methods to improve the efficiency of large language models (LLMs) through quantization. OSAQ focuses on suppressing weight outliers using a low-rank Hessian property for accurate l…
-
Why Nvidia builds open models with Bryan Catanzaro
Nvidia is significantly expanding its open model program, releasing higher-quality models and datasets. The strategy lets Nvidia capture value from open language models and build a sustainable advantage. The c…