Brief · PulseAugur

TOOL · arXiv cs.CL · English (EN) · 7h

EIBench: A Simulator-Based Benchmark and Turn-Credit RL for Emotion Management

Researchers have introduced EIBench, a simulator-based benchmark for evaluating and training large language models (LLMs) on interactive emotion management. The benchmark comprises 2,222 scenarios spanning support, defense, repair, and charm; an LLM simulator plays the user and updates an emotion-relation state after each turn. Current LLMs perform well in supportive interactions but struggle with boundary maintenance. To address this, the team developed CTC-GRPO, a reinforcement learning method that uses the simulator's per-turn state updates as dense feedback, significantly improving the performance of Qwen3-8B on EIBench and on other evaluations.
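
The idea of using per-turn state updates as dense feedback can be illustrated with a small sketch. The brief does not say how CTC-GRPO actually assigns credit, so everything below is an assumption made for illustration: the emotion-relation state is reduced to a single hypothetical scalar score per turn, each turn is credited with the change in that score, and credits are normalized across a group of rollouts in the group-relative style that GRPO is known for. The function names `turn_credits` and `group_relative_advantages` and the sample trajectories are invented here and are not taken from the paper.

```python
from statistics import mean, pstdev


def turn_credits(state_scores):
    """Credit each turn with the change in a scalar state score.

    Assumption: `state_scores` holds a hypothetical scalar summary of the
    simulator's emotion-relation state, starting with the initial state
    and followed by one entry per dialogue turn.
    """
    return [after - before for before, after in zip(state_scores, state_scores[1:])]


def group_relative_advantages(group_credits, eps=1e-8):
    """Normalize credits at each turn index across a group of rollouts.

    For every turn position, subtract the group mean and divide by the
    group standard deviation, so a turn is rewarded for doing better than
    the other rollouts did at the same position.
    """
    n_turns = min(len(credits) for credits in group_credits)
    advantages = [[0.0] * n_turns for _ in group_credits]
    for t in range(n_turns):
        column = [credits[t] for credits in group_credits]
        mu, sigma = mean(column), pstdev(column)
        for i, value in enumerate(column):
            advantages[i][t] = (value - mu) / (sigma + eps)
    return advantages


# Hypothetical rollouts for one scenario: score before turn 1, then after each turn.
rollouts = [
    [0.0, 0.2, 0.5, 0.9],
    [0.0, 0.1, 0.1, 0.3],
    [0.0, -0.2, 0.0, 0.1],
]

credits = [turn_credits(scores) for scores in rollouts]
advantages = group_relative_advantages(credits)

for i, row in enumerate(advantages):
    print(f"rollout {i}:", [round(a, 3) for a in row])
```

Compared with a single end-of-dialogue reward, a per-turn signal of this kind tells the policy which turn moved the state, which is the general motivation for dense feedback in multi-turn settings.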

IMPACT This benchmark and training method could lead to more emotionally intelligent and interactive AI agents capable of nuanced, multi-turn communication.

large language models
SAGE
Qwen3-8B
EIBench
CTC-GRPO
EQBench3