PulseAugur / Brief

last 24h
[2/2] 221 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. Inside the Together AI kernels team

    The Together AI kernels team, which includes researchers Dan Fu and Tri Dao, developed FlashAttention, a GPU attention kernel that speeds up transformer models by cutting memory traffic. By applying database-systems principles to GPU memory movement, the team achieved 2-3x speedups, challenging the notion that transformer attention was already fully optimized. Its subsequent work, including the ThunderKittens library, aims to speed up kernel development for new hardware such as NVIDIA's Blackwell GPUs and to narrow the software-hardware gap in AI infrastructure.

    IMPACT Optimizes AI inference and training by bridging the software-hardware gap, potentially lowering costs and improving responsiveness.
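
    The memory-movement idea behind FlashAttention can be sketched in a few lines. The sketch below is a simplified, pure-Python illustration for a single query vector, not the team's GPU kernel: it processes keys and values one block at a time and keeps a running ("online") softmax, so the full row of attention scores is never held at once. The function names, the `block_size` parameter, and the list-of-lists data layout are assumptions made for illustration.

```python
import math


def naive_attention(q, keys, values):
    """Reference version: compute every score first, then softmax, then mix values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim)]


def tiled_attention(q, keys, values, block_size=2):
    """Blockwise version (hypothetical helper): one pass over key/value blocks.

    Only one block of scores exists at a time. A running maximum and running
    sum let earlier partial results be rescaled when a larger score appears.
    """
    scale = 1.0 / math.sqrt(len(q))
    dim = len(values[0])
    running_max = float("-inf")
    running_sum = 0.0
    acc = [0.0] * dim  # unnormalized weighted sum of values seen so far

    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        v_blk = values[start:start + block_size]
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in k_blk]

        new_max = max(running_max, max(scores))
        # Rescale what was accumulated under the old maximum.
        # On the first block this is exp(-inf) == 0.0, which is harmless.
        correction = math.exp(running_max - new_max)
        weights = [math.exp(s - new_max) for s in scores]

        running_sum = running_sum * correction + sum(weights)
        acc = [a * correction + sum(w * v[j] for w, v in zip(weights, v_blk))
               for j, a in enumerate(acc)]
        running_max = new_max

    return [a / running_sum for a in acc]
```

    Both functions return the same result up to floating-point rounding; the blockwise one simply never needs all scores in memory together, which is the property a GPU kernel can exploit to keep working data in fast on-chip memory.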

  2. Salesforce, Zoom, InVideo Train Faster with Together AI Turbocharged with NVIDIA Blackwell

    Together AI has launched new GPU clusters built on NVIDIA's Blackwell platform, offering significant speedups for AI training and inference. Powered by the Together Kernel Collection, these clusters achieve up to 90% faster training than previous-generation NVIDIA H100 hardware, processing over 15,000 tokens per second for large models. Early-access customers such as Salesforce and Zoom have reported substantial performance gains, with some seeing double the training speed. Together AI's optimization work spans custom kernels, inference engines, and speculative decoding, with the aim of improving efficiency in AI model development and deployment.

    IMPACT Accelerates AI training and inference, potentially lowering costs and increasing the pace of model development and deployment for enterprises.