AV-SyncBench: Decoupled Benchmarking of Temporal and Semantic Audio-Visual Synchronization

New Benchmark Decouples Audio-Visual Synchronization Evaluation

By the PulseAugur Editorial Team · [2 sources] · 2026-07-01 10:12

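Researchers have introduced AV-SyncBench, a new benchmark for evaluating audio-visual synchronization in multimodal AI models. The benchmark evaluates temporal consistency and semantic consistency separately, which allows a finer-grained analysis of feature extraction models. AV-SyncBench uses a dataset of 3,269 in-the-wild videos covering speech, music, and sound across a range of scenarios, with 38,390 samples that were automatically filtered and manually verified to confirm an on-screen sound source. The benchmark aims to give a more accurate measure of model performance for alignment and downstream tasks.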
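To make the idea of decoupled evaluation concrete, here is a minimal sketch that scores temporal and semantic predictions as two separate metrics instead of one combined number. The source does not describe the benchmark's actual data schema or metrics, so the `Sample` fields and the `decoupled_accuracy` helper are hypothetical and are not the authors' protocol.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Sample:
    # Hypothetical fields: the source does not specify the benchmark's schema.
    temporal_label: bool  # ground truth: audio is in sync with the video in time
    semantic_label: bool  # ground truth: audio content matches what is on screen
    temporal_pred: bool   # model's temporal judgement
    semantic_pred: bool   # model's semantic judgement


def decoupled_accuracy(samples: List[Sample]) -> Dict[str, float]:
    """Report temporal and semantic accuracy separately, plus the joint rate."""
    if not samples:
        raise ValueError("need at least one sample")
    n = len(samples)
    temporal_hits = sum(s.temporal_pred == s.temporal_label for s in samples)
    semantic_hits = sum(s.semantic_pred == s.semantic_label for s in samples)
    # Joint accuracy counts a sample only when both judgements are correct.
    joint_hits = sum(
        s.temporal_pred == s.temporal_label and s.semantic_pred == s.semantic_label
        for s in samples
    )
    return {
        "temporal": temporal_hits / n,
        "semantic": semantic_hits / n,
        "joint": joint_hits / n,
    }
```

Reporting the two axes separately shows whether a model that looks strong on a combined score is actually weak on one of them, which is the kind of bias a single-axis protocol could hide.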

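Impact: Provides a more precise evaluation framework for audio-visual AI models, which may improve multimodal understanding and generation capabilities.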

Ranking rationale: This cluster describes a new academic benchmark for evaluating AI models.


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.CV · Tianhong Zhou, Mingyang Han, Boyu Li, Yuxuan Jiang, Jiaxin Ye, Dongxiao Wang, Haoxiang Shi, Kunpeng Wang, Jun Song, Cheng Yu, Bo Zheng · 2026-07-02 04:00

AV-SyncBench: Decoupled Benchmarking of Temporal and Semantic Audio-Visual Synchronization

arXiv:2607.00726v1. Abstract: Audio-visual feature extraction is a fundamental component of multimodal understanding and generation tasks. However, existing evaluation protocols for feature extraction models exhibit dimensional bias, typically focusing on either…

arXiv cs.CV · Bo Zheng · 2026-07-01 10:12

AV-SyncBench: Decoupled Benchmarking of Temporal and Semantic Audio-Visual Synchronization

Audio-visual feature extraction is a fundamental component of multimodal understanding and generation tasks. However, existing evaluation protocols for feature extraction models exhibit dimensional bias, typically focusing on either semantic matching or temporal offset detection.…
