PulseAugur

New Multi-LCB benchmark tests LLMs on code generation across 12 languages

Researchers have introduced Multi-LCB, a benchmark that evaluates large language models (LLMs) on code generation across twelve programming languages rather than Python alone. It addresses the Python-centric design of the existing LiveCodeBench (LCB) by converting LCB's Python tasks into equivalent problems in other languages while keeping LCB's contamination controls. Initial evaluations of 24 LLMs on Multi-LCB showed evidence of Python overfitting, language-specific contamination, and substantial performance disparities across languages.
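
The contamination control mentioned here rests on filtering problems by release date, as the LCB abstract quoted under COVERAGE describes. A minimal sketch of that idea follows. The `Problem` record, its field names, and the `cutoff` parameter are hypothetical illustrations, not the schema or API of LCB or Multi-LCB, and the assumption that the cutoff stands for a model's training-data cutoff is ours.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Problem:
    # Hypothetical record; field names are illustrative, not LCB's schema.
    problem_id: str
    release_date: date


def filter_by_release_date(problems, cutoff):
    """Keep only problems released strictly after the cutoff date.

    Assumption: `cutoff` stands in for a model's training-data cutoff,
    so the surviving problems could not have appeared in its training set.
    """
    return [p for p in problems if p.release_date > cutoff]


# Made-up sample data for illustration only.
problems = [
    Problem("p1", date(2023, 5, 1)),
    Problem("p2", date(2024, 2, 15)),
    Problem("p3", date(2024, 9, 30)),
]

fresh = filter_by_release_date(problems, cutoff=date(2024, 1, 1))
print([p.problem_id for p in fresh])
```

Because the filter depends only on the problem's release date, the same date window could in principle be applied to a task regardless of the language it has been converted to.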

IMPACT This benchmark will help identify and address LLM limitations in multilingual code generation, pushing for more robust and versatile AI coding assistants.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.AI TIER_1 English(EN) · Maria Ivanova, Pavel Zadorozhny, Rodion Levichev, Ivan Petrov, Adamenko Pavel, Ivan Lopatin, Alexey Kutalev, Dmitrii Babaev

    Multi-LCB: Extending LiveCodeBench to Multiple Programming Languages

    arXiv:2606.20517v1. Abstract: LiveCodeBench (LCB) has recently become a widely adopted benchmark for evaluating large language models (LLMs) on code-generation tasks. By curating competitive programming problems, constantly adding fresh problems to the set, and …

  2. arXiv cs.AI TIER_1 English(EN) · Dmitrii Babaev

    Multi-LCB: Extending LiveCodeBench to Multiple Programming Languages

    LiveCodeBench (LCB) has recently become a widely adopted benchmark for evaluating large language models (LLMs) on code-generation tasks. By curating competitive programming problems, constantly adding fresh problems to the set, and filtering them by release dates, LCB provides co…