PIVOTSBench: Evaluating Fine-Grained Interpersonal Relationship Reasoning in Multimodal Large Language Models

New PIVOTSBench benchmark evaluates interpersonal relationship reasoning in MLLMs

By the PulseAugur editorial team · [2 sources] · 2026-06-22 09:38

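Researchers have introduced PIVOTSBench, a new benchmark designed to evaluate how well multimodal large language models (MLLMs) understand and reason about interpersonal relationships. Derived from Social-IQ 2.0 and YouTube data, the benchmark includes tasks that assess a model's ability to predict relationship dimensions and to identify key visual cues. The evaluation covers both proprietary and open-source MLLMs, and the study examines the effects of the visual modality and of conversational context.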

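Impact: The benchmark is expected to encourage the development of MLLMs with stronger social reasoning, which is essential for more natural human-computer interaction.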

Ranking rationale: This cluster describes a new academic paper that introduces an evaluation benchmark for AI models.


AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

Hugging Face Daily Papers TIER_1 English(EN) · 2026-06-22 09:38

PIVOTSBench: Evaluating Fine-Grained Interpersonal Relationship Reasoning in Multimodal Large Language Models

Humans possess an innate ability to understand fine-grained interpersonal relationships, which is central to everyday social interactions. Although such reasoning is inherently multimodal, it remains largely unexplored by existing multimodal large language models (MLLMs). To addr…
arXiv cs.CL TIER_1 English(EN) · Miao Liu · 2026-06-22 09:38

PIVOTSBench: Evaluating Fine-Grained Interpersonal Relationship Reasoning in Multimodal Large Language Models

Humans possess an innate ability to understand fine-grained interpersonal relationships, which is central to everyday social interactions. Although such reasoning is inherently multimodal, it remains largely unexplored by existing multimodal large language models (MLLMs). To addr…

