PulseAugur

Qwen2.5 Omni

PulseAugur coverage of Qwen2.5 Omni — every cluster mentioning Qwen2.5 Omni across labs, papers, and developer communities, ranked by signal.

Total · 30 days: 5 (90 days: 5)
Releases · 30 days: 0 (90 days: 0)
Papers · 30 days: 5 (90 days: 5)
Tier distribution · 90 days
Sentiment · 30 days

2 days with sentiment data

Recent · Page 1 of 1 · 5 items
  1. TOOL · CL_50892 ·

    Raon-Speech launches as a 9-billion-parameter model for speech understanding and generation

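    Researchers have introduced Raon-Speech, a 9-billion-parameter speech language model that can understand, answer, and generate speech in English and Korean. Trained on more than 1.38 million hours of curated speech and text data, the model outperforms audio foundation models of comparable size on speech-centric tasks while retaining strong text question-answering ability. An extension called Raon-SpeechChat, trained on additional conversational data, further strengthens real-time full-duplex conversation and performs well on turn-taking and interruption sensitivity.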

  2. RESEARCH · CL_49714 ·

    SEATS method cuts LLM compute by pruning audio-video tokens

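    Researchers have developed SEATS, a new method for improving the efficiency of omni-modal large language models (om-LLMs). SEATS prunes redundant audio-video tokens across the model's layers and adapts token selection according to cross-modal fusion. The approach significantly reduces computational load and speeds up inference while maintaining high performance.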

  3. TOOL · CL_40907 ·

    AffectVerse model predicts future emotions using temporal imagination

    Researchers have introduced AffectVerse, a new multimodal model designed for affective computing that integrates temporal prediction into its reasoning process. Unlike previous models that treated emotion recognition st…

  4. TOOL · CL_15635 ·

    Omni-Encoder unifies vision and audio processing for human-like motion perception

    Researchers have developed Omni-Encoder, a novel Transformer backbone that unifies visual and audio signals for more holistic perception. Unlike previous models that process modalities separately and at different rates,…

  5. RESEARCH · CL_06508 ·

    New framework reveals audio hallucinations in egocentric video models

    Researchers have developed a new framework to evaluate audio hallucinations in egocentric videos, where models infer sounds from visual cues that are not actually heard. Their study found that advanced audio-visual lang…