ENTITY half-precision floating-point format

PulseAugur coverage of half-precision floating-point format — every cluster mentioning half-precision floating-point format across labs, papers, and developer communities, ranked by signal.

Total · 30d

0 over 90d

Releases · 30d

0 over 90d

Papers · 30d

0 over 90d

TIER MIX · 90D

No coverage in the last 90 days.

RECENT · PAGE 1/1 · 7 TOTAL

TOOL · CL_22592 · May 8 · 06:19

INT8 quantization can slow down AI inference, study finds

A recent analysis explored the performance of INT8 quantization versus FP16 precision on NVIDIA's Ada Lovelace architecture, specifically using an L40S datacenter GPU and an RTX 4090 consumer card. The findings indicate…

RESEARCH · CL_15546 · May 4 · 06:52

EdgeLPR paper explores neural network precision vs performance trade-offs for LiDAR place recognition

Researchers have developed EdgeLPR, a method for efficient LiDAR-based place recognition on edge devices. The approach utilizes Bird's Eye View representations to enable lightweight image-based networks for autonomous n…

RESEARCH · CL_14350 · May 4 · 04:00

Object detection models show mixed robustness to quantization and input degradations

A new study investigates how post-training quantization (PTQ) affects the robustness of YOLO object detection models when faced with real-world input degradations like noise and blur. Researchers evaluated various preci…

RESEARCH · CL_06527 · Apr 28 · 04:00

New methods QFlash and ELSA boost Vision Transformer attention efficiency

Researchers have developed two new methods to improve the efficiency of attention mechanisms in vision transformers. QFlash focuses on enabling integer-only operations for FlashAttention, achieving significant speedups …

RESEARCH · CL_03804 · Apr 25 · 16:08

AI safety research proposes formal framework for computational substrates

This series of posts explores the concept of 'substrates' in AI, which refers to the computational context layers necessary for implementing AI systems. The authors argue that current AI safety research lacks a clear fr…

TOOL · CL_17754 · Apr 6 · 08:53

Apple's SeedLM compresses LLM weights using pseudo-random generators

Researchers have developed SeedLM, a novel post-training compression technique for large language models that utilizes pseudo-random generator seeds to encode model weights. This method aims to reduce the high runtime c…

RESEARCH · CL_01035 · Jan 10 · 17:00

Optimizing Transformer Inference: Techniques for Faster, Cheaper Large Models

Large transformer models present significant inference challenges due to their substantial memory footprint and computation costs, which scale quadratically with input length. Researchers and practitioners are exploring…
