Indic Languages
PulseAugur coverage of Indic Languages — every cluster mentioning Indic Languages across labs, papers, and developer communities, ranked by signal.
2 days with sentiment data
-
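New pipeline targets abusive comments in Indic languages
Researchers have developed a multi-stage training pipeline for detecting abusive comments in Indic languages. The system uses language-specific preprocessing and model ensembling to identify harmful content on social media. A key focus of the research is minimizing false positives, so that online safety improves without compromising free speech.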
-
SCRIBE framework improves ASR for Indic languages with new error analysis
Researchers have introduced SCRIBE, a new diagnostic framework designed to improve automatic speech recognition (ASR) for Indic languages. Unlike traditional metrics like Word Error Rate (WER), SCRIBE categorizes errors…
-
New benchmark FinVQA and FIND framework tackle multilingual financial reasoning
Researchers have introduced FinVQA, a new benchmark designed to evaluate financial reasoning and question answering capabilities across multiple Indic languages. This benchmark includes 18,900 samples in English, Hindi,…
-
New dataset and model boost medical dialogue for Indic languages
Researchers have developed IndicMedDialog, a new dataset designed to improve medical dialogue systems for Indic languages. This dataset includes parallel multi-turn conversations in English and nine Indic languages, cre…
-
Movie subtitle translation enhanced by visual cues for Indic languages
Researchers have developed a method for visually-guided movie subtitle translation, focusing on low-resource Indic languages. Their study compares two lightweight visual grounding strategies, finding that temporal misal…
-
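New benchmark evaluates accent fidelity of Indic-language TTS across six dimensions
Researchers have introduced PSP, a new benchmark designed to evaluate the accent accuracy of text-to-speech (TTS) systems for Indic languages. Unlike existing metrics that focus on intelligibility and naturalness, PSP measures accent specifically by decomposing it into six distinct dimensions, including retroflex merger and prosodic feature divergence. Initial tests on systems such as ElevenLabs v3 and Sarvam Bulbul show that the systems with the best word error rate do not necessarily excel at accent fidelity, highlighting the need for more fine-grained evaluation methods.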
-
New study evaluates 7 TTS systems for 10 Indian languages
Researchers have developed a new framework for evaluating Text-to-Speech (TTS) systems in Indian languages, addressing the high variance typically seen in crowdsourced evaluations. This framework uses controlled, multid…