Brief · PulseAugur

TOOL · arXiv cs.AI English(EN) · 1w

Who Deserves the Reward? SHARP: Shapley Credit-based Optimization for Multi-Agent System

Researchers have developed SHARP, a framework for training multi-agent systems that integrate large language models with external tools. It targets the credit-assignment problem: determining how much each agent contributed to a successful outcome, which is crucial for efficient learning. SHARP uses a decomposed reward mechanism that includes a Shapley-based marginal-credit reward to attribute contributions to individual agents and stabilize training. Reported experiments show SHARP significantly outperforming existing methods in both accuracy and efficiency.
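To make the idea of a Shapley-based marginal credit concrete, here is a minimal sketch of the textbook Shapley value computed over a small team of agents. It is not SHARP's actual reward formulation, which the brief does not specify. The agent names (`planner`, `solver`, `idle`) and the `toy_value` scoring function are hypothetical, chosen only for illustration.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(
    agents: Sequence[str],
    value: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Exact Shapley value: average marginal contribution over all join orders."""
    totals = {agent: 0.0 for agent in agents}
    orderings = list(permutations(agents))
    for order in orderings:
        coalition: FrozenSet[str] = frozenset()
        for agent in order:
            with_agent = coalition | {agent}
            # Marginal credit: how much the team score rises when this agent joins.
            totals[agent] += value(with_agent) - value(coalition)
            coalition = with_agent
    return {agent: total / len(orderings) for agent, total in totals.items()}


def toy_value(coalition: FrozenSet[str]) -> float:
    """Hypothetical team score (an assumption, not taken from the paper)."""
    score = 0.0
    if "planner" in coalition:
        score += 0.2
    if "solver" in coalition:
        score += 0.3
    if {"planner", "solver"} <= coalition:
        score += 0.4  # synergy bonus only when both agents cooperate
    return score


credits = shapley_values(["planner", "solver", "idle"], toy_value)
for agent, credit in credits.items():
    print(f"{agent}: {credit:.3f}")
```

In this toy setup the synergy bonus is split evenly between the two agents that produce it, the agent that adds nothing receives zero credit, and the credits sum to the full team score. Exact enumeration scales factorially with the number of agents, so larger teams would need sampling or another approximation.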

IMPACT Improves training efficiency for complex multi-agent LLM systems, which could accelerate their adoption in real-world problem-solving.

Large Language Models
SHARP
Xuelin Zhang