DiffAttn: Diffusion-Based Drivers' Visual Attention Prediction with LLM-Enhanced Semantic Reasoning

LLM-Enhanced Diffusion Model Improves Driver Attention Prediction

By the PulseAugur Editorial Team · [1 source] · 2026-06-17 04:00

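Researchers have developed DiffAttn, a novel diffusion-based framework for predicting drivers' visual attention. The system uses a Swin Transformer to extract scene features and a feature fusion pyramid to strengthen denoising and context modeling. A key innovation is a large language model (LLM) layer that improves semantic reasoning and helps identify safety-critical cues. Experiments on multiple datasets show that DiffAttn outperforms existing methods, which could help improve intelligent-vehicle safety and driver understanding.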
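The summary does not describe DiffAttn's noise schedule or network internals. As a generic illustration of what "diffusion-based" refers to, the following is a minimal sketch of the standard DDPM forward noising step applied to a toy attention map. The linear schedule values, the 3x3 map, and every function name here are assumptions made for illustration; none of them come from the paper.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (a common DDPM default, assumed here)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product of (1 - beta_s) for all s up to and including t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(attention_map, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [
        [signal_scale * value + noise_scale * rng.gauss(0.0, 1.0) for value in row]
        for row in attention_map
    ]


# Hypothetical 3x3 attention map with a single central fixation.
toy_map = [
    [0.0, 0.1, 0.0],
    [0.1, 1.0, 0.1],
    [0.0, 0.1, 0.0],
]

alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
rng = random.Random(0)

noisy_early = add_noise(toy_map, 10, alpha_bars, rng)    # still close to toy_map
noisy_late = add_noise(toy_map, 999, alpha_bars, rng)    # almost pure noise
```

A diffusion model is trained to reverse this corruption step by step. In a conditional setting such as the one the summary describes, the denoiser would additionally receive scene features, but how DiffAttn wires those in is not stated here.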

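Impact: By improving vehicles' ability to understand and predict where humans look, this research could lead to more sophisticated driver-assistance systems.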

Ranking rationale: This cluster contains an academic paper that details a new model and its experimental results.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.AI · Weimin Liu, Qingkun Li, Jiyuan Qiu, Wenjun Wang, Joshua H. Meng · 2026-06-17 04:00

DiffAttn: Diffusion-Based Drivers' Visual Attention Prediction with LLM-Enhanced Semantic Reasoning

arXiv:2603.28251v3 Announce Type: replace-cross Abstract: Drivers' visual attention provides critical cues for anticipating latent hazards and directly shapes decision-making and control maneuvers, where its absence can compromise traffic safety. To emulate drivers' perception pa…

