Arabic
PulseAugur coverage of Arabic: every cluster mentioning Arabic across labs, papers, and developer communities, ranked by signal.
14 days with sentiment data
-
MARBERT model enhances Arabic tweet analysis for STC customer service
Researchers have developed a new method for sentiment and spam detection in Arabic tweets using the MARBERT model. This approach aims to improve customer service for Saudi Telecom Company (STC) by analyzing feedback on …
-
New Red Teaming Framework Exposes LLM Faithfulness Vulnerabilities
Researchers have developed a novel red teaming framework to systematically uncover vulnerabilities in large language models (LLMs). This framework utilizes a multi-role architecture with target, attacker, and jury model…
-
New Arabic Text Deduplication System Uses CTC for Improved LLM Efficiency
Researchers have developed CANDLE, a novel system for deduplicating characters in Arabic text, particularly addressing the challenge of distinguishing intentional character elongation from informal usage on social media…
-
New system CANDLE uses CTC for Arabic text noise deduplication
Researchers have developed CANDLE, a novel system for character-level Arabic noise deduplication. This system utilizes Connectionist Temporal Classification (CTC) to frame normalization as a sequence alignment problem, …
-
New framework evaluates LLMs' ability to control Arabic text readability
Researchers have developed a new framework to evaluate whether Large Language Models (LLMs) can control readability when generating Arabic text. The framework assesses how well LLMs can adhere to specific Common European Frame…
-
LLM Cross-Lingual Transfer: Task Alignment Over Linguistic Family
A new research paper explores cross-lingual transfer in large language models, specifically examining Arabic fine-tuning and its impact on Semitic languages. The study found no evidence of Semitic-specific transfer, ind…
-
Arabic NER model fine-tuned for Egyptian legal documents
This article details the process of fine-tuning a named-entity recognition (NER) model to specifically handle Arabic legal documents from Egypt. The goal is to accurately identify and anonymize sensitive information suc…
-
Voynich Manuscript analysis reveals cipher-like structural constraints
A new research paper published on arXiv details a systematic analysis of the Voynich Manuscript's grapheme sequences, revealing two distinct structural layers. The study found character-level right-to-left optimization …
-
New methodology digitizes Arabic-English dictionary for computational use
This paper details a method for digitizing and encoding the Al-Mawrid Arabic-English dictionary using the ISO Lexical Markup Framework and TEI Lex-0 guidelines. The research addresses inconsistencies in legacy dictionar…
-
New AI method improves detection and explanation of hateful memes
Researchers have developed a new method using reinforcement learning and Chain-of-Thought (CoT) supervision to improve the detection and explanation of hateful and propagandistic memes. This approach enhances multimodal…
-
New research tackles spoofed speech detection with advanced AI models
Researchers are developing advanced methods to detect spoofed speech, a growing challenge due to realistic synthesis and voice conversion technologies. One approach, the Temporal Pyramid Adapter, uses parallel temporal …
-
New Arabic NLP Model Enhances Mental Health Disorder Detection
Researchers have developed a novel framework, MentalMARBERT, to improve the detection of mental health disorders from Arabic social media text. The approach involves domain-adaptive pre-training of existing Arabic langu…
-
New hybrid framework tackles Algerian dialect rumor detection
Researchers have developed a novel hybrid framework for detecting rumors in the Algerian dialect of Arabic, a challenging task due to informal language and limited resources. The framework combines transformer embedding…
-
New Benchmark Tests LLMs on Arabic-Hebrew Cognate Ambiguity
Researchers have developed SemCog Bench, a new benchmark designed to evaluate how well large language models (LLMs) handle cognates between Arabic and Hebrew. The benchmark includes 1,858 word pairs and sentence-level a…
-
New taxonomy structures Arabic grammar error explanations
Researchers have introduced ArabiGEE, a novel hierarchical taxonomy designed to categorize and explain grammatical errors in the Arabic language. This system moves beyond free-form text explanations by structuring error…
-
AI research tackles Arabic text challenges in scoring and segmentation
Two new research papers explore the challenges and advancements in processing Arabic text with AI. One paper reviews the use of Large Language Models (LLMs) for automated scoring of Arabic text, highlighting the need fo…
-
New Arabic LLM Leaderboard and Earth Observation Models Released
The QIMMA leaderboard has been released, focusing on the quality of Arabic Large Language Models (LLMs). Separately, Allen Institute for AI has launched OlmoEarth v1.1, a collection of more efficient models designed for…
-
LLMs show geographic bias in medical triage recommendations
A new study using Gemini 3.5 Flash found that large language models provide different medical triage recommendations based on the language of the patient's prompt, even when symptoms are identical. The model recommended…
-
Arabic ASR model training stalls, user seeks community help
A user on Reddit is seeking help with an Arabic Automatic Speech Recognition (ASR) model that is failing to converge during training. The model, based on a SpeechBrain Conformer-Transformer architecture, uses a combinat…
-
New Open-Source Arabic LLM 'RightNow-Arabic-0.5B-Turbo' Released
Researchers have developed RightNow-Arabic-0.5B-Turbo, a new open-source Arabic language model with 518 million parameters. This model is built upon Qwen2.5-0.5B and incorporates a specialized Arabic vocabulary through …