PulseAugur

Together AI releases open-source Parallel Kernel Builder for LLM inference

Together AI has released Parallel Kernel Builder (PKB), an open-source tool designed to optimize inference performance for large language models. According to the company, even single-shot generation surfaces new kernels that have no public reference implementation, including NeMo vocab-parallel log-probs, Hyena context parallelism, and SAM 3 mask suppression. One GEMM + All-Gather kernel ran in 87.9µs versus 320.6µs for NCCL, roughly 3.6 times faster. PKB is open, and Together AI invites community contributions.

IMPACT Optimizes LLM inference performance, potentially leading to faster and more efficient AI deployments.

RANK_REASON Release of an open-source tool for optimizing LLM inference.

Read on X — Together (inference / OSS) →

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. X — Together (inference / OSS) TIER_1 English(EN) · togethercompute ·

    Single-shot generation still surfaces net-new kernels with no public reference: NeMo vocab-parallel log-probs, Hyena context parallelism, SAM 3 mask suppression. One GEMM + All-Gather kernel hit 87.9µs vs 320.6µs for NCCL. PKB is open. Read more and contribute below. Blog: …
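
To put the two quoted latencies in perspective, the short Python sketch below works out the implied speedup and latency reduction. The only inputs taken from the post are 87.9µs and 320.6µs; the function and variable names are illustrative, and the post does not state the hardware, problem size, or measurement method, so treat the result as a ratio of the reported numbers rather than a benchmark.

```python
def speedup(baseline_us: float, candidate_us: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if baseline_us <= 0 or candidate_us <= 0:
        raise ValueError("latencies must be positive")
    return baseline_us / candidate_us


def latency_reduction_pct(baseline_us: float, candidate_us: float) -> float:
    """Return the percentage of baseline latency removed by the candidate."""
    if baseline_us <= 0 or candidate_us <= 0:
        raise ValueError("latencies must be positive")
    return 100.0 * (1.0 - candidate_us / baseline_us)


# Figures quoted in the post (microseconds); the names are hypothetical.
NCCL_US = 320.6  # GEMM + All-Gather using NCCL
PKB_US = 87.9    # GEMM + All-Gather kernel reported for PKB

print(f"speedup: {speedup(NCCL_US, PKB_US):.2f}x")
print(f"latency reduction: {latency_reduction_pct(NCCL_US, PKB_US):.1f}%")
```

On the quoted figures this comes to about 3.65x, or a latency reduction of roughly 72.6%.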