PrefBench: Evaluating Zero-Shot LLM Agents in Hidden-Preference Personalized Pricing Negotiations
Researchers have introduced PrefBench, a benchmark for evaluating Large Language Model (LLM) agents in personalized pricing negotiations where buyer preferences are hidden. LLM agents closed deals almost every time, with deal rates above 0.99, but their profit outcomes were notably weak. The best-performing LLM agent's average profit was only marginally better than a random baseline and significantly lower than that of a simple concession heuristic, indicating a gap between compliance and profitable bargaining.
AI IMPACT: Introduces a benchmark for evaluating LLM agents in complex negotiation scenarios, highlighting current limitations in profitable strategic bargaining.