Bangla
PulseAugur coverage of Bangla — every cluster mentioning Bangla across labs, papers, and developer communities, ranked by signal.
4 days with sentiment data
-
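New BLADE dataset improves honorifics in multilingual Bangla large language models
Researchers have developed BLADE, a new dataset and benchmarking framework that addresses honorific failures in multilingual Bangla text generation. The dataset contains more than 4,000 curated interaction pairs and is intended to improve how large language models handle cultural nuance and context-dependent communication. Fine-tuning models such as DeepSeek-8B and LLaMA-3.2-3B on BLADE has shown significant improvements in structural fidelity and honorific alignment for low-resource languages.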
-
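LLMs improve multilingual speech correction by tuning for fluency
Researchers have developed a new method that uses large language models (LLMs) to correct disfluencies in multilingual speech transcripts. The pipeline first identifies disfluent tokens, then uses those signals to fine-tune an LLM to rewrite the transcript as fluent text. A contrastive learning objective is added to penalize the reproduction of disfluent tokens while ensuring that grammar and meaning are preserved. Experiments in Hindi, Bangla, and Marathi show significant improvements over existing baselines, offering a practical solution for speech-driven NLP systems.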
-
Diffusion augmentation boosts Bangla character recognition accuracy
Researchers have developed a confidence-guided diffusion augmentation method to improve the recognition of handwritten Bangla compound characters. This approach uses diffusion models to generate high-quality synthetic c…
-
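New defense framework tackles multilingual prompt injection attacks
Researchers have developed MIPIAD, a framework for defending multilingual large language model systems against indirect prompt injection attacks. It combines a Qwen2.5-1.5B model fine-tuned with LoRA, TF-IDF lexical features, and ensemble learning methods. Evaluated on English and Bangla, MIPIAD achieved an F1 score of 0.9205 with a hybrid ensemble and an AUROC of 0.9378 with a boosting ensemble, demonstrating its effectiveness in narrowing the cross-lingual gap.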
-
Bengali AI models show identity biases despite similar data, study finds
A new paper investigates biases in sentiment analysis models for the Bengali language, a low-resource context. Researchers audited models like mBERT and BanglaBERT, fine-tuned on Bengali sentiment analysis datasets, and…
-
Llama-3.2-3B model achieves 92% accuracy in parsing blood donation requests
Researchers have developed the Cognitive Blood Request System (CBRS), a framework designed to efficiently filter and parse urgent blood donation requests from social media streams. This system utilizes a novel bilingual…
-
New dataset and benchmark advance Bangla text-to-gloss translation for BdSL
Researchers have developed the first dataset and benchmark for Bangla text-to-gloss translation, addressing a significant gap for the Bangla Sign Language (BdSL) community. The dataset includes manually annotated and sy…
-
LLM Augmentation Boosts Bangla Fake News Detection in Low-Resource Settings
Researchers have developed a method to improve fake news detection for the Bangla language by using the Gemma 3 27B IT model to generate synthetic news articles. This approach addresses the scarcity of data in under-res…
-
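BanglaSocialBench benchmark reveals LLMs struggle with cultural nuance
Researchers have introduced BanglaSocialBench, a new benchmark designed to evaluate how well large language models understand and use sociopragmatics and cultural nuance in Bangla. The benchmark focuses on context-dependent language use, including forms of address, kinship reasoning, and social customs, rather than factual recall. An evaluation of twelve current LLMs shows persistent cultural mismatches, such as defaulting to overly formal language and confusing kinship terms, highlighting their limitations in culturally appropriate communication.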
-
Multilingual models show significant sentiment misalignment, especially for Bengali
A new research paper highlights significant cross-lingual sentiment misalignment in multilingual language models, particularly affecting low-resource languages like Bengali. The study found that a compressed model archi…
-
AI advances in 3D simulation, Bengali TTS, and Google Cloud Next trends
A researcher named Jousef Murad has introduced a new AI framework called Rigid-Deformation Decomposition for simulating 3D vehicle crash dynamics. Separately, a user named Himu is urging Google developers to integrate n…
-
DeepL adds Bangla language support with impressive translation quality
DeepL has recently added support for the Bangla language, and users are noting the quality of its translations. The machine translation service's new capability is being discussed on social media platforms.