Brief · PulseAugur

RESEARCH · arXiv cs.CL English(EN) · 17h · [2 sources]

DEEPRUBRIC: Evidence-Tree Rubric Supervision for Efficient Reinforcement Learning of Deep Research Agents

Researchers have introduced DeepRubric, a framework for constructing query-rubric pairs that makes reinforcement learning for deep research agents more efficient. The method synthesizes aligned query-rubric pairs by first identifying evaluation targets and then building an evidence tree, so that each rubric reflects the information needs of its query. The researchers report that DeepRubric-8B, trained with this approach, performs comparably to existing state-of-the-art models while using significantly fewer computational resources.
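To make the idea of deriving a rubric from an evidence tree more concrete, here is a minimal Python sketch. It is not the paper's implementation: the names `EvidenceNode`, `RubricItem`, `build_rubric`, and `rubric_reward` are hypothetical, the depth-based weighting (`decay ** depth`) is an assumption, and the substring match stands in for whatever judge a real system would use to decide whether a criterion is satisfied.

```python
from dataclasses import dataclass, field


@dataclass
class EvidenceNode:
    """One node of a hypothetical evidence tree: a claim and its supporting sub-claims."""
    claim: str
    children: list["EvidenceNode"] = field(default_factory=list)


@dataclass
class RubricItem:
    """One rubric criterion with a weight (assumed, not from the paper)."""
    criterion: str
    weight: float


def build_rubric(root: EvidenceNode, decay: float = 0.5) -> list[RubricItem]:
    """Walk the tree depth-first and emit one criterion per node.

    Assumption: weight = decay ** depth, so deeper evidence counts less.
    """
    items = []
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        items.append(RubricItem(node.claim, decay ** depth))
        # Push children in reverse so they are visited left to right.
        for child in reversed(node.children):
            stack.append((child, depth + 1))
    return items


def rubric_reward(answer: str, rubric: list[RubricItem]) -> float:
    """Return the weighted fraction of criteria satisfied, in [0, 1].

    Assumption: a criterion is satisfied when its text appears in the
    answer (case-insensitive); this is a stand-in for a judge model.
    """
    total = sum(item.weight for item in rubric)
    if total == 0:
        return 0.0
    text = answer.lower()
    hit = sum(item.weight for item in rubric if item.criterion.lower() in text)
    return hit / total


# Toy example: the root is the evaluation target, the children are evidence.
tree = EvidenceNode(
    "solid-state batteries",
    [EvidenceNode("energy density"), EvidenceNode("dendrite formation")],
)
rubric = build_rubric(tree)
reward = rubric_reward(
    "Solid-state batteries promise higher energy density.", rubric
)
```

In this toy case the answer covers the root claim and one of the two pieces of evidence, so it earns partial credit. A scalar of this kind is the sort of signal a reinforcement learning loop could consume as a reward.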

IMPACT This framework could lead to more efficient training of AI agents for complex research tasks, reducing computational costs.

GRPO
DEEPRUBRIC
DeepRubric-8B