Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts
Researchers have developed two new frameworks, Retrospective Harness Optimization (RHO) and HarnessFix, aimed at improving the reliability and performance of AI agents. RHO uses a self-supervised approach to optimize an agent's harness by analyzing past trajectories and selecting the most effective updates through self-preference. HarnessFix, on the other hand, focuses on diagnosing and repairing flaws within the agent's harness by compiling execution traces into a specialized intermediate representation, allowing for targeted fixes. Both methods have demonstrated significant improvements in agent performance on various benchmarks, including software engineering tasks, without requiring external validation data.
AI IMPACT: These methods offer new ways to enhance AI agent performance and reliability by enabling self-improvement and targeted flaw correction without external supervision.
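
The summary describes RHO as choosing among harness updates through self-preference over trajectory rollouts. A minimal sketch of that general idea follows, assuming a pairwise-comparison scheme in which the candidate whose rollouts win the most comparisons is kept. This is an illustration under stated assumptions, not the authors' algorithm: the function names, the string representation of a harness, and the toy `toy_rollout` and `toy_prefer` stand-ins are all hypothetical. In practice the rollout would run the agent and the preference would come from the model judging its own trajectories.

```python
from itertools import combinations
from typing import Callable, Sequence


def select_by_self_preference(
    candidates: Sequence[str],
    tasks: Sequence[str],
    rollout: Callable[[str, str], str],
    prefer: Callable[[str, str, str], int],
) -> str:
    """Return the candidate harness whose rollouts win the most pairwise comparisons.

    rollout(harness, task) returns a trajectory (here just a string).
    prefer(task, traj_a, traj_b) returns 0 if traj_a is preferred, otherwise 1.
    Both callables are hypothetical placeholders for an agent run and a
    model-based self-preference judgment.
    """
    wins = {c: 0 for c in candidates}
    for task in tasks:
        # Roll out every candidate harness on the same task.
        trajectories = {c: rollout(c, task) for c in candidates}
        # Compare each pair of trajectories and credit the preferred one.
        for a, b in combinations(candidates, 2):
            choice = prefer(task, trajectories[a], trajectories[b])
            wins[a if choice == 0 else b] += 1
    return max(candidates, key=lambda c: wins[c])


# Toy stand-ins (assumptions for illustration only, no model is called).
def toy_rollout(harness: str, task: str) -> str:
    return f"{harness} | {task}"


def toy_prefer(task: str, traj_a: str, traj_b: str) -> int:
    # Pretend the judge favors trajectories that mention running tests.
    a_has = "run tests" in traj_a
    b_has = "run tests" in traj_b
    return 1 if (b_has and not a_has) else 0


candidates = ["baseline", "baseline + run tests before submitting"]
tasks = ["fix failing import", "patch off-by-one bug"]
best = select_by_self_preference(candidates, tasks, toy_rollout, toy_prefer)
print(best)
```

Note that no labeled validation set appears anywhere in the sketch: the only selection signal is the preference judgment over rollouts, which is the property the summary attributes to the method.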