LLMs likely don't use 'correct answer feature' for multiple choice

By PulseAugur Editorial · [1 sources] · 2026-06-30 22:46

A new analysis suggests that large language models do not solely rely on a 'correct answer feature' to solve multiple-choice questions. Theoretical arguments indicate that this feature, which supposedly scores the correctness of an option based on its tokens, cannot account for all question types, particularly those where correctness depends on subsequent information or intransitive options. Empirical evidence from Llama models further supports this, showing that the same attention heads are utilized for both question-first and option-first prompts, casting doubt on the 'correct answer feature' as the primary mechanism for multiple-choice question answering. AI

IMPACT Challenges a common hypothesis about LLM reasoning, suggesting deeper mechanisms for multiple-choice question answering.

RANK_REASON The item is a research paper presenting theoretical and empirical evidence about LLM capabilities. [lever_c_demoted from research: ic=1 ai=1.0]

Read on LessWrong (AI tag) →

AI-generated summary · Google Gemini · from 1 sources. How we write summaries →

LLMs likely don't use 'correct answer feature' for multiple choice

COVERAGE [1]

LessWrong (AI tag) TIER_1 English(EN) · Koby Lewis · 2026-06-30 22:46

"Correct Answer Features" Cannot Explain Multiple Choice Capabilities

<p><span>TL;DR: I present theoretical and empirical evidence that LLMs cannot be (exclusively) using a "correct answer feature" as the main mechanism by which they perform multiple choice question answering. A hypothetical correct-answer feature would indicate the "correctness" o…

COVERAGE [1]

"Correct Answer Features" Cannot Explain Multiple Choice Capabilities

RELATED ENTITIES

RELATED TOPICS