Korean
PulseAugur coverage of Korean — every cluster mentioning Korean across labs, papers, and developer communities, ranked by signal.
-
LoRA Fine-Tuning Boosts Low-Resource TTS Quality for Khmer
Researchers have developed a method to improve the quality of text-to-speech (TTS) for low-resource languages like Khmer and Korean. By fine-tuning the 2.4B-parameter VoxCPM2 model using a single Low-Rank Adaptation (Lo…
-
Korean NLP research proposes eojeol-based constituency representation
This paper proposes an eojeol-based constituency representation for Korean treebanks, arguing that using morphemes as terminals can obscure syntactic structure and create mismatches with dependency resources. The authors demonst…
-
AI models tackle dysarthria severity with cross-lingual and transfer learning
Researchers have developed new methods for assessing dysarthria severity using AI, addressing the challenge of limited labeled speech data. One approach, CRAC, utilizes cross-lingual retrieval-augmented classification b…
-
Korean spoken QA research highlights ASR error impact on LLMs
A new research paper analyzes how errors in Korean speech recognition impact the performance of large language models (LLMs) in spoken question answering (SQA). The study found that the degradation caused by speech reco…
-
Montreal Forced Aligner updated to version 3.0, achieving state-of-the-art speech-to-text alignment
A research paper details the advancements in the Montreal Forced Aligner (MFA) up to version 3.0, a decade after its initial release. The updated MFA tool now supports more languages and dialects, incorporates larger op…
-
Korean language datasets curated in new research report
Researchers have compiled and reviewed a list of Korean language datasets, addressing the perception of Korean as a low-resource language. The report details institutional efforts in resource development and highlights …
-
Korean toddler pronunciation evaluated by AI
Researchers have developed an automated system to evaluate the pronunciation of Korean toddlers, addressing a gap in current assessment tools. The system utilizes neural speaker diarization and self-supervised learning …
-
New method adapts LLM safety tests for Asian cultural contexts
A new research paper introduces a methodology for culturally-adapted red-teaming of large language models (LLMs) across East and Southeast Asian contexts. The study found that direct translation of English benchmarks si…
-
New method aligns LLMs with Korean cultural norms
Researchers have developed a new method to align large language models with Korean cultural norms, moving beyond simply suppressing harmful content. The approach involves creating a culturally adapted safe-response poli…
-
Dual-output L2 speech recognition faces representational entanglement
A new research paper explores the challenges of multi-task learning (MTL) in second-language speech recognition, specifically for Korean and English. The study found that while MTL can improve the recognition of intende…
-
Claude and HWP-MCP automate Korean document tasks
A guide demonstrates how to automate tasks involving Korean HWP files using Claude and the HWP-MCP tool. The process allows users to summarize documents, extract tables into CSV format, and perform batch processing with…
-
Korean GEC annotation refined for morpheme-level errors
Researchers have developed a refined word-based annotation system for Korean grammatical error correction (K-GEC) to address the mismatch between word-level evaluation and morpheme-level errors. The new approach reconst…
-
New Korean Dataset Tackles Obfuscated Toxic Language in LLMs
Researchers have introduced KOTOX, a new dataset designed to improve the detection and detoxification of toxic language in Korean, particularly when users employ obfuscation techniques. The dataset categorizes Korean ob…
-
Multilingual Code-Switching Boosts LLM Performance Across Four Languages
Researchers have explored the impact of multilingual code-switching data (CSD) on large language models (LLMs) across four languages: English, Japanese, Korean, and Chinese. Their experiments demonstrated that incorpora…
-
New benchmark KSAFE-MM tests MLLM safety in Korean cultural context
Researchers have developed KSAFE-MM, a new benchmark designed to evaluate the safety of multimodal large language models (MLLMs) specifically within the context of Korean culture. Existing MLLM safety tools are often li…
-
Bilingual TTS architecture sought for seamless English-Korean speech
A user is seeking the best architecture for a bilingual text-to-speech (TTS) system that handles English and Korean seamlessly within a single sentence. They are encountering issues with Azure Cognitive Services, where …
-
Korean sentiment analysis boosted by new multiword expression resource
Researchers have developed DECO-MWE, a new linguistic resource for analyzing sentiment in Korean text, specifically focusing on multiword expressions (MWEs). This resource utilizes the Local Grammar Graph (LGG) methodol…
-
Korean linguistic resource FIAD aids banking chatbot NLU data generation
Researchers have developed FIAD, a Korean linguistic resource designed to generate Natural Language Understanding (NLU) training data for banking customer service dialog systems. By analyzing banking app reviews, they i…
-
Korean legal chatbot uses novel dataset generation for 91% accuracy
Researchers have developed a novel method for generating large, labeled datasets for Korean legal chatbots, addressing the challenge of high labeling costs. Their approach utilizes local grammar graphs (LGGs) to create …
-
L2 Korean annotation uses parser agreement for human-in-the-loop workflow
Researchers have developed a new human-in-the-loop annotation workflow for L2 Korean using agreement between two parsers. This method leverages parser agreement as a proxy for annotation correctness, showing strong corr…