RaguTeam wins SemEval-2026 LLM task with judge-orchestrated ensemble

By PulseAugur Editorial · [2 sources] · 2026-05-06 06:04

RaguTeam built the winning system for Task B (generation with reference passages) of SemEval-2026 Task 8, MTRAGEval, which targets faithful multi-turn response generation. The system is a heterogeneous ensemble of seven large language models run with two prompting variants, and a GPT-4o-mini judge selects the best candidate response for each instance. The ensemble outperformed 26 other teams with a harmonic mean of 0.7827, suggesting that combining diverse model families and prompting strategies is effective.
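For readers unfamiliar with the headline metric, a harmonic mean combines several scores so that a weak score pulls the total down more than it would in an arithmetic average. The sketch below uses `statistics.harmonic_mean` from the Python standard library. The component scores are made-up placeholders, since the summary does not say which metrics are combined.

```python
from statistics import harmonic_mean, mean

# Hypothetical component scores, invented for illustration only;
# they are NOT the metrics or values reported by the task.
scores = [0.5, 1.0]

# Harmonic mean: n / sum(1 / s) = 2 / (1/0.5 + 1/1.0) = 2/3
hm = harmonic_mean(scores)

# Arithmetic mean for comparison: (0.5 + 1.0) / 2 = 0.75
am = mean(scores)

print(f"harmonic={hm:.4f} arithmetic={am:.4f}")
```

Because the harmonic mean never exceeds the arithmetic mean, a system has to do well on every component to score highly.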

IMPACT Demonstrates an effective ensemble strategy for multi-turn response generation, potentially influencing future research in faithful dialogue systems.

RANK_REASON This is a research paper detailing a system's performance in a specific academic task.


AI-generated summary · Google Gemini · from 2 sources.


COVERAGE [2]

arXiv cs.LG TIER_1 English(EN) · Ivan Bondarenko, Roman Derunets, Oleg Sedukhin, Mikhail Komarov, Ivan Chernov, Mikhail Kulakov · 2026-05-07 04:00

RaguTeam at SemEval-2026 Task 8: Meno and Friends in a Judge-Orchestrated LLM Ensemble for Faithful Multi-Turn Response Generation

arXiv:2605.04523v1. We present our winning system for Task B (generation with reference passages) in SemEval-2026 Task 8: MTRAGEval. Our method is a heterogeneous ensemble of seven LLMs with two prompting variants, where a GPT-4o-mini judge selects t…

arXiv cs.CL TIER_1 English(EN) · Mikhail Kulakov · 2026-05-06 06:04

RaguTeam at SemEval-2026 Task 8: Meno and Friends in a Judge-Orchestrated LLM Ensemble for Faithful Multi-Turn Response Generation

We present our winning system for Task B (generation with reference passages) in SemEval-2026 Task 8: MTRAGEval. Our method is a heterogeneous ensemble of seven LLMs with two prompting variants, where a GPT-4o-mini judge selects the best candidate per instance. We ranked 1st out …
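The judge-selects-best-candidate pattern described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `select_response`, `overlap_judge`, and the toy generators are hypothetical names, and the word-overlap judge is only a stand-in for an LLM judge such as GPT-4o-mini.

```python
from typing import Callable, Sequence

# Hypothetical interfaces (assumptions, not from the paper):
# a generator maps (conversation, passages) to one candidate response;
# a judge returns the index of the candidate it prefers.
Generator = Callable[[str, Sequence[str]], str]
Judge = Callable[[str, Sequence[str], Sequence[str]], int]


def select_response(conversation: str, passages: Sequence[str],
                    generators: Sequence[Generator], judge: Judge) -> str:
    """Collect one candidate per generator and return the judge's pick."""
    candidates = [g(conversation, passages) for g in generators]
    choice = judge(conversation, passages, candidates)
    if not 0 <= choice < len(candidates):
        choice = 0  # fall back to the first candidate on an invalid pick
    return candidates[choice]


def overlap_judge(conversation: str, passages: Sequence[str],
                  candidates: Sequence[str]) -> int:
    """Toy judge: prefer the candidate whose words best match the passages."""
    passage_words = set(" ".join(passages).lower().split())

    def score(text: str) -> float:
        words = text.lower().split()
        return sum(w in passage_words for w in words) / max(len(words), 1)

    return max(range(len(candidates)), key=lambda i: score(candidates[i]))


# Toy stand-ins for the ensemble members (a real system would call LLMs).
demo_generators = [
    lambda conv, psg: "Paris is nice",
    lambda conv, psg: "the capital of france is paris",
]
result = select_response(
    "What is the capital of France?",
    ["the capital of france is paris"],
    demo_generators,
    overlap_judge,
)
print(result)
```

Keeping the judge as a plain callable means the selection loop stays the same whether the judge is a heuristic like this one or a prompt sent to a hosted model.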
