From Tokens to Regions: CUDA-Sensitive Instruction Tuning for GPU Kernel Generation
Researchers have developed a new method called CUDA-Sensitive Instruction Tuning (CuSeT) to improve the generation of CUDA kernels by large language models. This technique addresses the challenge of implicit execution constraints in CUDA code, which existing methods struggle to model effectively. CuSeT operates on the principle of "from tokens to regions," combining adaptive token-level masking with region-aware sample reweighting to enhance functional correctness. Experiments demonstrate that CuSeT outperforms standard and advanced supervised fine-tuning methods, achieving competitive results with lower inference costs. AI
IMPACT This research could lead to more efficient and accurate AI systems by improving the generation of specialized GPU code.