Entity: OGRBench

OGRBench

PulseAugur coverage of OGRBench — every cluster mentioning OGRBench across labs, papers, and developer communities, ranked by signal.

Total · 30 days

1

Past 90 days: 1

Releases · 30 days

0

Past 90 days: 0

Papers · 30 days

1

Past 90 days: 1

Tier distribution · 90 days

Topics

Sentiment · 30 days

1 day with sentiment data

Recent · Page 1 of 1 · 1 item

RESEARCH · CL_77162 · Jun 5 · 08:17

StainFlow improves GUI agent training with a novel reward model

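Researchers have introduced StainFlow, a novel process reward model designed to improve the training of GUI agents. The method addresses the problem of sparse feedback in reinforcement learning by providing finer-grained training signals. StainFlow uses entity taint tracking to objectively separate task stages, and it dynamically links local evidence to improve the accuracy of verification at key nodes.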