Universal Skeleton Understanding via Differentiable Rendering and MLLMs

SkeletonLLM lets large language models process human skeleton data

By the PulseAugur editorial team · [1 source] · 2026-05-22 04:00

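Researchers have developed SkeletonLLM, a new method that enables multimodal large language models (MLLMs) to understand structured, non-visual data such as human skeletons. The system uses DrAction, a differentiable renderer that converts skeleton motion into image sequences, so that MLLMs can process the data directly. The approach supports open-vocabulary action recognition, motion captioning, and question answering across different skeleton formats, and it points to a way for MLLMs to handle data types they do not natively support.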
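To make the idea of rendering a skeleton into images concrete, here is a minimal sketch in plain Python. It is not DrAction, whose internals this summary does not describe. It assumes 2D joints in pixel coordinates, a hypothetical bone list, and a soft Gaussian "splat" for each point, which is one common way to keep pixel values smooth (and therefore differentiable) with respect to joint positions. All function names and parameters are illustrative.

```python
import math


def splat(px, py, jx, jy, sigma):
    """Gaussian falloff of a point at (jx, jy) as seen from pixel (px, py)."""
    d2 = (px - jx) ** 2 + (py - jy) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))


def d_splat_d_jx(px, py, jx, jy, sigma):
    """Analytic derivative of splat() with respect to the joint's x coordinate."""
    return splat(px, py, jx, jy, sigma) * (px - jx) / sigma ** 2


def render_frame(joints, bones, size=16, sigma=1.0, samples=5):
    """Render one skeleton frame as a size x size grayscale image (list of rows).

    joints: list of (x, y) pixel coordinates (assumed layout).
    bones:  list of (i, j) joint-index pairs (assumed layout).
    """
    # Joints plus evenly spaced points along each bone act as soft ink sources.
    points = list(joints)
    for i, j in bones:
        (x0, y0), (x1, y1) = joints[i], joints[j]
        for s in range(1, samples):
            t = s / samples
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))

    image = []
    for row in range(size):
        line = []
        for col in range(size):
            # Soft union of all splats: smooth in the joint coordinates
            # and always within [0, 1].
            miss = math.prod(1.0 - splat(col, row, x, y, sigma) for x, y in points)
            line.append(1.0 - miss)
        image.append(line)
    return image


def render_sequence(frames, bones, **kwargs):
    """Render a motion clip (list of per-frame joint lists) as a list of images."""
    return [render_frame(joints, bones, **kwargs) for joints in frames]


# Hypothetical two-joint, one-bone skeleton moving over two frames.
frames = [
    [(4.0, 4.0), (10.0, 10.0)],
    [(5.0, 4.0), (10.0, 11.0)],
]
bones = [(0, 1)]
clip = render_sequence(frames, bones)
```

Because every pixel is a smooth function of the joint coordinates, a loss computed on the rendered images could in principle be back-propagated to the skeleton or to renderer parameters. An autodiff framework would normally handle that step; `d_splat_d_jx` only spells out a single derivative by hand.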

Impact: Enabling large language models to process structured, non-visual data such as human skeletons broadens their range of applications.

Ranking rationale: This cluster contains one academic paper detailing a new method for processing non-visual data with large language models.

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CV · Ziyi Wang, Peiming Li, Xinshun Wang, Yang Tang, Kai-Kuang Ma, Mengyuan Liu · 2026-05-22 04:00

Universal Skeleton Understanding via Differentiable Rendering and MLLMs

arXiv:2603.18003v5 Announce Type: replace Abstract: Multimodal large language models (MLLMs) exhibit strong visual-language reasoning, yet cannot process structured, non-visual data such as human skeletons. Existing methods either compress skeleton dynamics into lossy feature vec…
