Horizon 2020
PulseAugur coverage of Horizon 2020 — every cluster mentioning Horizon 2020 across labs, papers, and developer communities, ranked by signal.
No coverage in the last 90 days.
1 day with sentiment data
-
LoKA framework enables low-precision FP8 for large recommendation models
Researchers have developed LoKA, a framework designed to make low-precision arithmetic, specifically FP8, practical for large recommendation models (LRMs). Unlike previous attempts that often degraded model quality, LoK…
-
Superhuman and Databricks build 200K QPS AI inference platform
Superhuman and Databricks engineers collaborated to build a high-throughput inference platform capable of handling over 200,000 queries per second. This joint effort modernized Superhuman's serving stack, migrating from…
-
LLM Study Diary #3: PyTorch tensors, float types, and training infrastructure
This LLM study diary entry focuses on PyTorch fundamentals for training large language models. It details tensor basics, exploring various floating-point data types like FP32, BF16, and FP8 for efficiency and stability…
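The dtype trade-off the diary walks through is easy to see directly. A minimal sketch, assuming PyTorch 2.1+ (where the float8_e4m3fn dtype landed); the tensor and shapes are illustrative:

```python
import torch

# Cast the same values across precisions and compare each format's range,
# epsilon, and round-trip error against the FP32 original.
x = torch.randn(1024, dtype=torch.float32)

for dtype in (torch.float32, torch.bfloat16, torch.float8_e4m3fn):
    y = x.to(dtype)
    info = torch.finfo(dtype)
    err = (y.float() - x).abs().max().item()
    print(f"{str(dtype):24} bits={info.bits:2d} eps={info.eps:.1e} "
          f"max={info.max:.1e} roundtrip_err={err:.1e}")
```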
-
Chinese chipmakers adopt DeepSeek's V4 AI model, boosting domestic hardware
Chinese technology firms, including Huawei and Cambricon, are rapidly adopting DeepSeek's new V4 AI model. This integration is happening across various hardware architectures within China, driven partly by geopolitical …
-
SnapMLA paper details hardware-aware FP8 quantized pipelining for efficient long-context MLA decoding
Researchers have developed SnapMLA, a new framework designed to enhance the efficiency of long-context decoding in Multi-head Latent Attention (MLA) architectures. This approach utilizes hardware-aware FP8 quantization …
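The summary doesn't spell out SnapMLA's quantization recipe, so as orientation only: a common per-tensor FP8 baseline (an assumption here, not necessarily the paper's scheme) scales by the tensor's absolute maximum so values fit the e4m3 range:

```python
import torch

F8 = torch.float8_e4m3fn
F8_MAX = torch.finfo(F8).max  # 448.0 for e4m3

def quantize_fp8(t: torch.Tensor):
    # One fp32 scale per tensor, chosen so the largest value maps to F8_MAX.
    scale = t.abs().amax().clamp(min=1e-12) / F8_MAX
    return (t / scale).to(F8), scale

def dequantize_fp8(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.to(torch.float32) * scale

kv = torch.randn(8, 64)  # hypothetical stand-in for a latent KV block
q, s = quantize_fp8(kv)
print((dequantize_fp8(q, s) - kv).abs().max().item())  # quantization error
```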
-
NVIDIA launches Nemotron 3 Nano Omni, unifying multimodal AI for efficiency
NVIDIA has released Nemotron 3 Nano Omni, an open multimodal model capable of processing text, images, audio, and video. This model aims to unify these modalities into a single architecture, improving efficiency and ena…
-
No Jensen, Not All Compute is Created Equal
Nvidia CEO Jensen Huang suggested China could overcome restrictions on advanced chips by using larger numbers of less advanced chips for AI training. However, this perspective overlooks the critical differences in chip capabili…
-
TACO framework boosts LLM training throughput by 1.87X with tensor compression
Researchers have introduced TACO, a novel framework designed to enhance the efficiency of training large-scale tensor-parallel Large Language Models (LLMs). TACO addresses communication overhead by employing an FP8-base…
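The summary is cut off before the scheme's details, but the bandwidth arithmetic motivating FP8 compression of tensor-parallel traffic is straightforward. A back-of-the-envelope sketch with hypothetical sizes (not TACO's reported configuration):

```python
# Bytes moved per layer when activations/gradients cross GPUs in a
# tensor-parallel step: FP32 vs FP8 plus one fp32 scale per tensor.
batch, seq, hidden = 8, 4096, 8192
elements = batch * seq * hidden

fp32_bytes = elements * 4
fp8_bytes = elements * 1 + 4  # payload + per-tensor scale

print(f"fp32: {fp32_bytes / 2**20:.0f} MiB, "
      f"fp8: {fp8_bytes / 2**20:.0f} MiB, "
      f"ratio: {fp32_bytes / fp8_bytes:.1f}x")  # ~4x less on the wire
```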
-
Qwen3.6-35B quantization tests show FP8 quality worse than INT8; poster calls NVFP4 'a lie'
A user on Reddit's LocalLLaMA community shared findings on the Qwen3.6-35B model, focusing on Kullback-Leibler divergence (KLD) metrics for different quantization formats like INT8, FP8, and NVFP4. The analysis, conduct…
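For readers unfamiliar with the metric: KLD here measures how far a quantized model's next-token distribution drifts from the full-precision reference, averaged over positions. A minimal sketch of how such a comparison is typically computed (the poster's actual harness isn't shown; the shapes and noise stand-in are made up):

```python
import torch
import torch.nn.functional as F

def mean_kld(logits_ref: torch.Tensor, logits_q: torch.Tensor) -> float:
    # Mean KL(P_ref || P_quant) over token positions, from raw logits.
    log_p = F.log_softmax(logits_ref.float(), dim=-1)
    log_q = F.log_softmax(logits_q.float(), dim=-1)
    return F.kl_div(log_q, log_p, reduction="batchmean", log_target=True).item()

ref = torch.randn(32, 50_000)               # [positions, vocab], reference model
quant = ref + 0.05 * torch.randn_like(ref)  # stand-in for quantized-model logits
print(mean_kld(ref, quant))
```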
-
AI safety research proposes formal framework for computational substrates
This series of posts explores the concept of 'substrates' in AI: the computational context layers needed to implement AI systems. The authors argue that current AI safety research lacks a clear fr…
-
DeepSeek V4 models offer high performance with reduced inference costs and NPU support
DeepSeek has released its V4 family of open-weight large language models, featuring a 1.6 trillion parameter model and a smaller 284 billion parameter Flash MoE model. The new models reportedly rival top proprietary LLM…
-
SpikingBrain2.0 model offers efficient long-context and cross-platform AI inference
Researchers have introduced SpikingBrain2.0 (SpB2.0), a 5 billion parameter model designed for efficient long-context processing and cross-platform inference. The model features a novel Dual-Space Sparse Attention mecha…