Predicting Performance of Symbolic and Prompt Programs with Examples
Researchers have developed a method called RAP (Retrieved Approximate Prior) to predict the performance of both symbolic and prompt-based programs. The system analyzes a few in-domain examples to estimate how well a program will perform on unseen tasks. This approach accounts for the distinct prior performance distributions of symbolic programs, which tend to be all-or-nothing, versus prompt programs, which often exhibit near-correctness.
AI IMPACT: Provides a framework for more reliably assessing the performance of LLM-based programs before deployment.
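
To illustrate why the shape of the prior matters when only a few examples are available, here is a minimal sketch using a standard Beta-Binomial update. It is not the RAP method itself: the summary does not say how RAP retrieves or approximates its prior, so the function name `posterior_mean` and both sets of prior parameters below are hypothetical values chosen only to mimic an "all-or-nothing" shape and a "near-correct" shape.

```python
def posterior_mean(prior_a: float, prior_b: float, successes: int, trials: int) -> float:
    """Posterior mean accuracy under a Beta(prior_a, prior_b) prior
    after observing `successes` correct outputs in `trials` examples."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    # Beta-Binomial conjugacy: posterior is Beta(a + k, b + n - k).
    return (prior_a + successes) / (prior_a + prior_b + trials)


# Hypothetical priors (assumptions, not taken from the source):
# U-shaped prior with mass near 0 and 1, mimicking all-or-nothing symbolic programs.
SYMBOLIC_PRIOR = (0.2, 0.2)
# Prior concentrated at high accuracy, mimicking near-correct prompt programs.
PROMPT_PRIOR = (8.0, 2.0)

if __name__ == "__main__":
    for successes, trials in [(3, 3), (2, 3)]:
        symbolic = posterior_mean(*SYMBOLIC_PRIOR, successes, trials)
        prompt = posterior_mean(*PROMPT_PRIOR, successes, trials)
        print(f"{successes}/{trials} correct: symbolic={symbolic:.3f}, prompt={prompt:.3f}")
```

Under these assumed priors, a single failure in three examples moves the symbolic estimate much further than the prompt estimate, which is the kind of difference a program-type-aware prior is meant to capture.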