ProtocolBench: Which LLM Multi-Agent Protocol to Choose?
Researchers have introduced ProtocolBench, a new benchmark designed to systematically evaluate the performance and reliability of communication protocols used in large-scale multi-agent systems. The benchmark measures task success, latency, message overhead, and robustness under failures, and it reveals significant performance variation across protocols. The study also presents ProtocolRouter, an adaptive system that selects the most suitable protocol based on scenario requirements and runtime signals, demonstrating improved recovery times and task success rates compared to static protocol choices.
IMPACT: Standardizes evaluation of LLM multi-agent communication, potentially improving reliability and efficiency in complex AI systems.