Mind Your Tone: Does Tone Alter LLM Performance?

Study: Prompt tone significantly affects large language model performance, with effects varying by model

By the PulseAugur editorial team · [1 source] · 2026-05-29 04:00

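A study recently posted on arXiv examines how different tones in prompts affect the performance of large language models (LLMs) on objective multiple-choice questions. The researchers tested four LLMs (ChatGPT-4o, ChatGPT-5-nano, Gemini 2.5 Flash, and Gemini 2.5 Flash Lite) on a corpus of prompts written in varying tones. The results indicate that the effect of tone is systematic but highly model-dependent, with some models showing marked swings in accuracy from one tone to another. The study also found that tone sensitivity differs by topic, proposes a routing framework to explain these differences, and cautions users not to assume that a model is robust to tone when deploying it.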
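To make the kind of measurement described above concrete, here is a minimal Python sketch of how per-tone accuracy and its spread across tones could be tabulated for each model. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the placeholder model names, the tone labels "polite" and "rude", and the toy records are all hypothetical and are not data or results from the study.

```python
from collections import defaultdict


def accuracy_by_tone(records):
    """Return {model: {tone: accuracy}} from (model, tone, is_correct) records."""
    counts = defaultdict(lambda: [0, 0])  # (model, tone) -> [correct, total]
    for model, tone, is_correct in records:
        counts[(model, tone)][0] += int(is_correct)
        counts[(model, tone)][1] += 1
    table = defaultdict(dict)
    for (model, tone), (correct, total) in counts.items():
        table[model][tone] = correct / total
    return {model: dict(tones) for model, tones in table.items()}


def tone_spread(table):
    """Return {model: highest per-tone accuracy minus lowest per-tone accuracy}."""
    return {model: max(t.values()) - min(t.values()) for model, t in table.items()}


# Hypothetical toy records for illustration only, NOT results from the study.
records = [
    ("model_a", "polite", True), ("model_a", "polite", True),
    ("model_a", "rude", True), ("model_a", "rude", False),
    ("model_b", "polite", True), ("model_b", "polite", False),
    ("model_b", "rude", True), ("model_b", "rude", False),
]

table = accuracy_by_tone(records)
spread = tone_spread(table)
```

A spread near zero would suggest that a model's accuracy barely moves with tone on the tested questions, while a large spread would flag the kind of model-specific sensitivity the summary describes. Grouping the records by topic as well as by model would give a topic-level view in the same way.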

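Impact: Prompt tone can significantly change the accuracy of large language models, so reliable output requires careful prompt engineering and model selection.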

Ranking rationale: Academic paper detailing a new study of large language model performance.


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.AI · Om Dobariya, Akhil Kumar · 2026-05-29 04:00

Mind Your Tone: Does Tone Alter LLM Performance?

arXiv:2605.29027v1. Abstract: The use of Large Language Models (LLMs) is proliferating, yet their performance is observed to vary based on prompting styles and tones. In this study, we investigate both whether and how tonal variations in prompts lead to disparat…
