PulseAugur

New NVMOS model assesses non-verbal vocalization quality in speech

Researchers have developed NVMOS, a model that assesses the perceptual quality of non-verbal vocalizations (NVs) in speech, such as laughter and sighs. Existing methods and general-purpose multimodal models such as Gemini evaluate these NV events inconsistently. NVMOS is trained on a dataset of NV-TTS system outputs and natural NVs rated by acoustic experts, and aims to reach expert-level agreement when predicting NV quality.

IMPACT Introduces a specialized model for evaluating non-verbal vocalizations, potentially improving TTS systems and analysis of human-computer interaction.

RANK_REASON The cluster contains an academic paper detailing a new model for speech quality assessment.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.AI TIER_1 English(EN) · Jialong Mai, Jinxin Ji, Xiaofen Xing, Wencui Liu, Xiangmin Xu

    NVMOS: Non-Verbal Vocalization Quality Assessment in Speech

    arXiv:2606.15888v1 Announce Type: cross Abstract: Non-verbal vocalizations (NVs), such as laughter, sighs, and coughs, are important acoustic cues for emotion and intent. Existing speech quality assessment methods typically focus on overall naturalness, while non-verbal TTS evalu…