A Benchmark Construction and Evaluation Framework for Specialist Domains: Case Study on Defense-related Documents

New framework DoRA builds RAG benchmarks for specialist domains

By PulseAugur Editorial · [1 source] · 2026-05-28 04:00

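Researchers have developed DoRA, a framework for building evaluation benchmarks for retrieval-augmented generation (RAG) systems in specialist domains, aimed specifically at the problem of scarce labeled data. Starting from a small set of domain documents, DoRA systematically generates synthetic question-answering datasets and uses different LLM families for training and testing to avoid circularity. In a case study on defense-related documents, a LoRA-adapted Llama3.1-8B model trained with DoRA significantly reduced hallucinations and improved performance across metrics compared with other baselines.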
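The family-separation idea in the summary can be sketched as a simple check on a synthetic QA dataset. This is a minimal illustration, not the paper's implementation: the `QAPair` record, the `generator_family` field, and the family names are hypothetical, and it assumes one reading of the summary, namely that an LLM family used to produce training data should not also produce test data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAPair:
    """One synthetic question-answer pair (hypothetical record layout)."""
    question: str
    answer: str
    source_doc: str
    generator_family: str  # assumed label for the LLM family that wrote the pair


def check_no_circularity(train, test):
    """Return True when no generator family contributes to both splits."""
    train_families = {qa.generator_family for qa in train}
    test_families = {qa.generator_family for qa in test}
    return train_families.isdisjoint(test_families)


# Placeholder data: family names are illustrative, not taken from the paper.
train = [QAPair("q1", "a1", "doc1", "family_a")]
test = [QAPair("q2", "a2", "doc2", "family_b")]
leaky_test = test + [QAPair("q3", "a3", "doc3", "family_a")]

print(check_no_circularity(train, test))
print(check_no_circularity(train, leaky_test))
```

A check like this would run before fine-tuning, so that a model is never scored on questions written by the same family that wrote its training data.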

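Impact: Offers a method for building evaluation benchmarks for specialist AI applications, which could accelerate RAG adoption in niche industries.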

Ranking rationale: This cluster contains an academic paper detailing a new framework and evaluation method for RAG systems.


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CL · Bao Gia Doan, Aditya Joshi, Pantelis Elinas, Aarya Bodhankar, Oscar Leslie, Tom Marchant, Flora Salim · 2026-05-28 04:00

A Benchmark Construction and Evaluation Framework for Specialist Domains: Case Study on Defense-related Documents

arXiv:2604.17943v2. Abstract: RAG-based question-answering (QA) in specialist domains faces a cold-start problem: lack of evaluative benchmarks and absence of labeled data for post-training. We present DoRA (Domain-oriented RAG Assessment), a novel benchmark…

