Knowledge Boundary Probing and Demand-Guided Intervention for LLM-Based Power System Code Generation
Researchers have developed a method to improve the reliability of large language models (LLMs) for power system code generation, particularly in on-premise deployments. The approach targets API knowledge boundary errors, such as incorrect function names or parameters, and introduces a benchmark generator called PowerCodeBench together with a boundary-aware intervention technique. The intervention combines API demand estimation with documentation injection and correction, significantly improving accuracy across a range of open-weight and commercial LLMs.
IMPACT: Enhances the reliability of LLMs for critical infrastructure code generation, enabling safer on-premise deployments.
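To make the three stages named in the summary more concrete (API demand estimation, documentation injection, and correction), here is a minimal Python sketch. It is not the authors' implementation: the library name `toygrid`, its function names, the keyword table, and the fuzzy-matching correction step are all hypothetical stand-ins chosen for illustration.

```python
import difflib
import re

# Hypothetical documentation store for a made-up library "toygrid".
# None of these names come from the paper or from a real package.
API_DOCS = {
    "run_power_flow": "run_power_flow(net, max_iter=20): solve the AC power flow.",
    "add_bus": "add_bus(net, vn_kv): add a bus with nominal voltage in kV.",
    "add_line": "add_line(net, from_bus, to_bus, length_km): connect two buses.",
}

# Assumed keyword table used as a crude stand-in for demand estimation.
KEYWORDS = {
    "run_power_flow": {"power", "flow", "solve"},
    "add_bus": {"bus", "voltage"},
    "add_line": {"line", "connect"},
}


def estimate_api_demand(task):
    """Guess which APIs a task needs by keyword overlap (toy heuristic)."""
    words = set(re.findall(r"[a-z]+", task.lower()))
    return [name for name, keys in KEYWORDS.items() if words & keys]


def build_prompt(task):
    """Inject the documentation of the estimated APIs ahead of the task."""
    docs = "\n".join(API_DOCS[name] for name in estimate_api_demand(task))
    return f"Relevant API documentation:\n{docs}\n\nTask: {task}"


def correct_api_names(code):
    """Replace unknown toygrid.<name> calls with the closest documented name."""
    def fix(match):
        name = match.group(1)
        if name in API_DOCS:
            return match.group(0)
        close = difflib.get_close_matches(name, list(API_DOCS), n=1, cutoff=0.6)
        # Leave the call untouched when no documented name is close enough.
        return f"toygrid.{close[0]}" if close else match.group(0)

    return re.sub(r"toygrid\.(\w+)", fix, code)
```

In this sketch, the prompt returned by `build_prompt` would be sent to the model, and `correct_api_names` would be applied to the generated code afterwards. A real system would likely replace the keyword table with a learned or retrieval-based estimator and would also need to check parameters, not only function names.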