PulseAugur / Brief

last 24h
[1/1] 221 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. FlashSinkhorn: IO-Aware Entropic Optimal Transport on GPU

    Researchers have developed FlashSinkhorn, a GPU solver for entropic optimal transport (EOT) designed to reduce memory input/output (IO). It rewrites the stabilized log-domain Sinkhorn updates in the same form as the softmax normalization used in transformer attention, which allows fused kernels that stream data through on-chip SRAM. The reported speedups over existing methods on A100 GPUs reach up to 32x for forward passes and 161x end-to-end on tasks such as point-cloud OT.

    IMPACT This IO-aware solver could speed up machine learning applications that rely on optimal transport and help them scale to larger problems.
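
    The link between log-domain Sinkhorn and attention-style normalization can be illustrated with a small NumPy sketch. This is not the FlashSinkhorn implementation: it runs on the CPU, assumes a squared Euclidean cost, and uses hypothetical names (`streaming_lse`, `sinkhorn_potentials`, `block`). It only shows how each Sinkhorn update reduces to a row-wise log-sum-exp that can be accumulated block by block with a running maximum, so the full cost matrix never has to be stored.

```python
import numpy as np

def streaming_lse(x, y, pot, logw, eps, block=128):
    """Row-wise logsumexp over j of logw_j + (pot_j - C_ij) / eps.

    Assumes C_ij = 0.5 * ||x_i - y_j||^2 (an illustrative choice).
    Columns are processed in blocks, keeping only a running maximum
    and a rescaled running sum per row, as in online softmax.
    """
    n = x.shape[0]
    run_max = np.full(n, -np.inf)
    run_sum = np.zeros(n)
    for start in range(0, y.shape[0], block):
        yb = y[start:start + block]
        # Cost tile of shape (n, block); the full (n, m) matrix is never built.
        cost = 0.5 * ((x[:, None, :] - yb[None, :, :]) ** 2).sum(axis=-1)
        scores = logw[None, start:start + block] + (pot[None, start:start + block] - cost) / eps
        new_max = np.maximum(run_max, scores.max(axis=1))
        # Rescale the old sum to the new maximum, then add this tile.
        run_sum = run_sum * np.exp(run_max - new_max) + np.exp(scores - new_max[:, None]).sum(axis=1)
        run_max = new_max
    return run_max + np.log(run_sum)

def sinkhorn_potentials(x, y, a, b, eps=0.5, iters=50, block=128):
    """Stabilized log-domain Sinkhorn iterations for the dual potentials f, g."""
    f = np.zeros(x.shape[0])
    g = np.zeros(y.shape[0])
    log_a, log_b = np.log(a), np.log(b)
    for _ in range(iters):
        f = -eps * streaming_lse(x, y, g, log_b, eps, block)
        g = -eps * streaming_lse(y, x, f, log_a, eps, block)
    return f, g
```

    Called as `f, g = sinkhorn_potentials(x, y, a, b)` with point clouds `x`, `y` and positive weight vectors `a`, `b` that each sum to 1, the function returns the two dual potentials. A fused GPU kernel would keep each tile in on-chip memory instead of materializing it as a NumPy array, but the running-maximum bookkeeping is the same idea.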