GPT-3.5-Turbo struggles with information in the middle of long prompts

By PulseAugur Editorial · [1 source] · 2026-06-05 11:02

A study found that GPT-3.5-Turbo's accuracy drops significantly when the answer sits in the middle of a long prompt (20k tokens in the cited example) rather than at the start or end. The phenomenon, documented in the paper "Lost in the Middle: How Language Models Use Long Contexts," is attributed to attention patterns in transformer models that favor information at the beginning or end of a prompt over the middle. The failure is not a retrieval error: the relevant text is already in the prompt, but the model's attention weights decay toward the center, an effect attributed to training data limitations.

IMPACT Highlights a critical limitation in current LLMs for tasks requiring retrieval from long documents, necessitating re-ranking strategies over simply increasing context window size.
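One way to act on this is to reorder retrieved passages so the strongest ones sit at the edges of the prompt. The following is a minimal sketch under stated assumptions, not a method taken from the paper or the source article: it assumes the passages have already been ranked most-relevant-first by some retriever, and the names `reorder_for_edges` and `build_prompt` are hypothetical helpers introduced only for illustration.

```python
from typing import List, Sequence


def reorder_for_edges(passages_by_relevance: Sequence[str]) -> List[str]:
    """Return passages with the most relevant ones at the start and end.

    Assumes the input is sorted most-relevant-first. Passages are dealt
    alternately to the front and the back of the output, so the least
    relevant passages land in the middle of the list.
    """
    front: List[str] = []
    back: List[str] = []
    for i, passage in enumerate(passages_by_relevance):
        if i % 2 == 0:
            front.append(passage)  # ranks 1, 3, 5, ... fill from the start
        else:
            back.append(passage)   # ranks 2, 4, 6, ... fill from the end
    return front + back[::-1]


def build_prompt(question: str, passages_by_relevance: Sequence[str]) -> str:
    """Assemble a plain-text prompt (hypothetical layout) from reordered passages."""
    ordered = reorder_for_edges(passages_by_relevance)
    numbered = [f"Document {n}: {text}" for n, text in enumerate(ordered, start=1)]
    return "\n\n".join(numbered + [f"Question: {question}"])


if __name__ == "__main__":
    ranked = ["A", "B", "C", "D", "E"]  # "A" is the most relevant passage
    print(reorder_for_edges(ranked))
    print(build_prompt("Which document mentions X?", ranked))
```

Whether this layout actually recovers accuracy depends on the model and the task, so it is worth measuring against an unordered baseline before relying on it.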

RANK_REASON The cluster describes findings from a research paper about a specific model's behavior with long contexts.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

dev.to — LLM tag TIER_1 English(EN) · A3E Ecosystem · 2026-06-05 11:02

GPT-3.5-Turbo drops from 90% accuracy to 50% when the answer sits in the middle of a 20k-token prompt instead of the start or end. Liu et al. (2023) documented this in "Lost in the Middle: How Language Models Use Long Contexts" at ACL. The edges of your context window are prim…

