Brief · PulseAugur

TOOL · arXiv cs.LG English(EN) · 1w

KForge: LLM-Driven Cross-Platform Kernel Generation for AI Accelerators

Researchers have developed KForge, a framework in which LLM-driven agents automatically generate optimized kernels for AI accelerators. To address the difficulty of writing efficient code for diverse hardware, the system runs an iterative refinement loop: one agent generates and refines kernels based on compilation feedback, while a second agent analyzes performance data to guide optimization. KForge has demonstrated improvements over existing solutions on NVIDIA and Intel hardware.
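
The two-agent loop described above can be illustrated with a minimal Python sketch. This is not KForge's actual code or API: every name here (`refine_kernel`, `CompileResult`, `Candidate`, and the four callables) is hypothetical, and the agents, compiler, and profiler are passed in as plain functions so the control flow stands on its own. It assumes the generation agent receives a text feedback string each round, which is either a compiler error log or the analysis agent's performance notes.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CompileResult:
    ok: bool   # True if the kernel compiled
    log: str   # compiler output, used as feedback on failure


@dataclass
class Candidate:
    source: str        # kernel source text
    runtime_ms: float  # measured runtime of that kernel


def refine_kernel(
    generate: Callable[[str], str],            # hypothetical generation agent
    compile_kernel: Callable[[str], CompileResult],
    benchmark: Callable[[str], float],         # returns runtime in ms
    analyze: Callable[[str, float], str],      # hypothetical analysis agent
    max_iters: int = 5,
) -> Optional[Candidate]:
    """Run a generate/compile/profile loop and return the fastest kernel seen."""
    feedback = ""  # first round has no feedback
    best: Optional[Candidate] = None
    for _ in range(max_iters):
        source = generate(feedback)
        result = compile_kernel(source)
        if not result.ok:
            # Compilation failed: hand the error log back to the generator.
            feedback = "compile error: " + result.log
            continue
        runtime_ms = benchmark(source)
        if best is None or runtime_ms < best.runtime_ms:
            best = Candidate(source, runtime_ms)
        # Compilation succeeded: the analysis agent turns timing data into hints.
        feedback = analyze(source, runtime_ms)
    return best
```

In a real system the `generate` and `analyze` callables would wrap LLM calls and the other two would wrap a compiler toolchain and a profiler. Returning the best candidate rather than the last one is a design choice of this sketch, so that a later regression does not discard an earlier, faster kernel.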

IMPACT Automates the creation of high-performance code for diverse AI hardware, potentially speeding up inference and reducing development costs.

LLM
PyTorch
NVIDIA B200
AI accelerators
TensorRT-LLM
KForge