PulseAugur

New Arabic-Russian parallel corpus and benchmark improve scientific translation

Researchers have built a parallel corpus and benchmark for Arabic-Russian scientific translation. The benchmark contains approximately 27,000 sentence pairs drawn from scientific abstracts and general texts. Fine-tuning multilingual language models such as Qwen2.5-7B-Instruct with LoRA substantially improved translation quality, suggesting that domain-specific fine-tuning is needed for this task and that few-shot prompting alone is not enough.
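For readers unfamiliar with LoRA, the sketch below illustrates the general idea: the pretrained weight matrix W stays frozen and only a low-rank update (alpha / r) * B @ A is trained. This is a generic illustration and not the paper's training code. The function names, the 4096 x 4096 layer size, and the rank of 16 are assumptions chosen for the example, not values reported in the summary.

```python
# Minimal pure-Python sketch of the LoRA idea (illustrative only).
# All names, shapes, and hyperparameters here are hypothetical.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A).

    W: frozen weight, shape d_out x d_in
    B: trainable, shape d_out x r
    A: trainable, shape r x d_in
    """
    delta = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

def full_params(d_in, d_out):
    """Parameters updated when fine-tuning the whole matrix."""
    return d_in * d_out

def lora_trainable_params(d_in, d_out, r):
    """Parameters updated when only A and B are trained."""
    return r * (d_in + d_out)

# Hypothetical layer size and rank, for scale only.
full = full_params(4096, 4096)
lora = lora_trainable_params(4096, 4096, 16)
print(full, lora, lora / full)
```

With B initialised to zeros, as is common practice, the effective weight equals the pretrained weight at the start of training, so fine-tuning begins from the base model's behaviour.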

IMPACT This work facilitates knowledge exchange between Arabic and Russian scientific communities, potentially accelerating research collaboration and innovation.

RANK_REASON The cluster describes a new academic paper presenting a parallel corpus and benchmark for a specific language pair, along with fine-tuned models.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.CL TIER_1 English(EN) · M. K. Arabov

    Bridging Scientific Heritage: An Arabic--Russian Parallel Corpus and LLM Benchmark for Sustainable Knowledge Transfer

    arXiv:2606.30943v1. Russian and Arabic are among the major languages of scientific communication. Language barriers impede the exchange of research results between these communities, which affects international collaboration and the progress of sustain…

  2. arXiv cs.CL TIER_1 English(EN) · M. K. Arabov

    Bridging Scientific Heritage: An Arabic--Russian Parallel Corpus and LLM Benchmark for Sustainable Knowledge Transfer

    Russian and Arabic are among the major languages of scientific communication. Language barriers impede the exchange of research results between these communities, which affects international collaboration and the progress of sustainability-related research. We present a benchmark…