PulseAugur

New compiler DCC optimizes ML kernels for Processing-In-Memory architectures

Researchers have developed DCC, a data-centric compiler that optimizes machine learning kernels for Processing-In-Memory (PIM) architectures. Data rearrangement and compute code optimization are interdependent, so DCC optimizes them jointly rather than one after the other. It supports multiple PIM backends through a multi-layer abstraction and achieves speedups of up to 7.68x on HBM-PIM and 13.17x on AttAcc PIM over GPU-only execution. For end-to-end LLM inference, DCC on AttAcc accelerates GPT-3 and LLaMA-2 by 4.52x on average.

IMPACT Enables significant acceleration for LLM inference and other ML workloads on specialized Processing-In-Memory hardware.

RANK_REASON The cluster contains an academic paper detailing a new compiler for optimizing machine learning kernels on specific hardware architectures.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.LG TIER_1 English(EN) · Peiming Yang, Sankeerth Durvasula, Ivan Fernandez, Mohammad Sadrosadati, Onur Mutlu, Gennady Pekhimenko, Christina Giannoula

    DCC: Data-Centric Compilation of Machine Learning Kernels for Processing-In-Memory Architectures

    arXiv:2511.15503v2. Abstract: High-performance Host processors can integrate Processing-In-Memory (PIM) devices, which can accelerate memory-intensive kernels of Machine Learning (ML) models, including Large Language Models (LLMs), by leveraging the la…