PulseAugur

New ALMANAC dataset trains AI agents on human collaboration

Researchers have introduced ALMANAC, a dataset designed to improve the collaborative abilities of AI agents. It comprises over 2,900 human collaboration actions, each annotated with detailed mental model information, including self-reasoning, perceived partner intent, and team goals. Six large language models were benchmarked on ALMANAC to evaluate how well they simulate human collaborative behaviors and infer mental models. The goal is to guide AI agents toward better process-level collaboration, moving beyond simple task completion.

IMPACT Enables development of AI agents that can better understand and participate in human-like collaboration.

RANK_REASON The cluster contains a research paper introducing a new dataset for AI research.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.CL TIER_1 English(EN) · Jiaju Chen, Yuxuan Lu, Jiayi Su, Chaoran Chen, Songlin Xiao, Zheng Zhang, Yun Wang, Yunyao Li, Jian Zhao, Tongshuang Wu, Toby Jia-Jun Li, Dakuo Wang, Bingsheng Yao

    Humans' ALMANAC: A Human Collaboration Dataset of Action-Level Mental Model Annotations for Agent Collaboration

    arXiv:2606.06388v1 · Abstract: Recent advances in LLM agents have enabled complex cognitive capabilities, such as multi-step reasoning, planning, and tool use, that increasingly position these agents as human collaborators. Effective collaboration, however, req…

  2. arXiv cs.AI TIER_1 English(EN) · Bingsheng Yao

    Humans' ALMANAC: A Human Collaboration Dataset of Action-Level Mental Model Annotations for Agent Collaboration

    Recent advances in LLM agents have enabled complex cognitive capabilities, such as multi-step reasoning, planning, and tool use, that increasingly position these agents as human collaborators. Effective collaboration, however, requires collaborators to continuously maintain and a…