RhinoVLA Technical Report

RhinoVLA model enables real-time robot control on edge hardware

By the PulseAugur editorial team · [2 sources] · 2026-06-05 15:21

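Researchers have developed RhinoVLA, a Vision-Language-Action (VLA) model for real-time robotic manipulation on edge hardware. The model uses a token-efficient Qwen3-VL backbone and a continuous Action Expert to reduce computational load and latency. RhinoVLA also introduces a unified interface for cross-robot learning and is optimized for hardware deployment, achieving downstream performance comparable to existing models while meeting a 10 Hz real-time control target.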
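As a rough illustration of what the 10 Hz target implies, each control step has to finish within 1/10 s, or 100 ms. The sketch below only does that arithmetic. The function names and the sample latencies are hypothetical and are not taken from the report.

```python
def control_period_ms(rate_hz: float) -> float:
    """Return the time budget per control step, in milliseconds."""
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    return 1000.0 / rate_hz


def meets_realtime_target(step_latency_ms: float, rate_hz: float = 10.0) -> bool:
    """True if one full control step fits inside the period for rate_hz."""
    return step_latency_ms <= control_period_ms(rate_hz)


if __name__ == "__main__":
    # Hypothetical end-to-end latencies (ms), for illustration only.
    for latency in (80.0, 120.0):
        print(latency, meets_realtime_target(latency))
```

A latency here would have to cover the whole step (perception, model inference, and action output), since the budget applies to the complete control loop rather than to the model alone.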

Impact: Enables real-time robotic manipulation on edge devices, which may accelerate the development of autonomous systems.

Ranking rationale: This cluster contains a technical report detailing a new model and its performance on specific hardware.


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.LG TIER_1 English(EN) · Huixi Intelligence: Chen Zhang, Chenyang Zhou, Guanglei Ding, Guanghui He, Haibin Gao, Jiajia Chen, Jianyong Zhang, Lianyi Yu, Ningyi Xu, Ping Xu, Qingchen Li, Yingjun Hu, Yijia Zhang, Yuxi Liu · 2026-06-08 04:00

RhinoVLA Technical Report

arXiv:2606.07383v1 · Announce Type: cross · Abstract: Vision-Language-Action (VLA) models have shown strong potential for robotic manipulation, but real-time deployment on edge hardware remains challenging. In this work, we identify VLM visual and context tokens as a major source of …

arXiv cs.LG TIER_1 English(EN) · Yuxi Liu · 2026-06-05 15:21

RhinoVLA Technical Report

Vision-Language-Action (VLA) models have shown strong potential for robotic manipulation, but real-time deployment on edge hardware remains challenging. In this work, we identify VLM visual and context tokens as a major source of deployment latency: for GEMM-dominated projection …
