Researchers investigated how well in-context learning classifies Turkish idiomatic light verb constructions (LVCs). They compared a supervised BERTurk baseline against instruction-tuned large language models (LLMs) under zero-shot, one-shot, and few-shot prompting. The LLMs struggled with LVC recall in the zero-shot setting, but few-shot prompting with carefully constructed demonstrations improved performance: GPT-OSS-20B and Qwen 2.5-14B gave robust results that matched or exceeded the supervised baseline.
AI IMPACT Demonstrates that prompt design, in particular the choice of demonstrations, significantly affects LLM performance on nuanced linguistic tasks, which bears on how models are deployed for specialized NLP applications.
RANK_REASON Academic paper detailing a new evaluation of LLM performance on a specific linguistic task.
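To make the zero-shot, one-shot, and few-shot settings concrete, here is a minimal Python sketch of how such prompts might be assembled. Everything in it is an assumption for illustration: the instruction wording, the LVC/LITERAL label names, the `Demo` and `build_prompt` helpers, and the Turkish demonstration sentences are hypothetical and are not taken from the paper, whose actual prompts and annotation scheme may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Demo:
    """One labeled demonstration shown to the model (hypothetical format)."""
    sentence: str
    label: str  # "LVC" or "LITERAL" (assumed label set, not the paper's)


# Assumed instruction text; the paper's real prompt is not given in the summary.
INSTRUCTION = (
    "Decide whether the verb phrase in the Turkish input is an idiomatic "
    "light verb construction or a literal use. Answer LVC or LITERAL."
)

# Illustrative demonstrations only, not drawn from the paper's data.
DEMOS = [
    Demo("Sonunda bir karar verdi.", "LVC"),     # karar vermek: to make a decision
    Demo("Bana bir kitap verdi.", "LITERAL"),    # kitap vermek: to hand over a book
    Demo("Komşusuna yardım etti.", "LVC"),       # yardım etmek: to help
]


def build_prompt(sentence: str, demos: list[Demo], k: int) -> str:
    """Build a prompt with the first k demonstrations.

    k=0 gives a zero-shot prompt, k=1 one-shot, k>1 few-shot.
    """
    parts = [INSTRUCTION]
    for demo in demos[:k]:
        parts.append(f"Sentence: {demo.sentence}\nLabel: {demo.label}")
    # The query goes last, with the label left open for the model to complete.
    parts.append(f"Sentence: {sentence}\nLabel:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    query = "Ona söz verdi."
    for k in (0, 1, 3):
        print(f"--- k={k} ---")
        print(build_prompt(query, DEMOS, k))
```

The only thing that changes between settings is `k`, the number of demonstrations placed before the query. In a sketch like this, the "carefully constructed demonstrations" the summary mentions would correspond to how the contents and order of `DEMOS` are chosen, for example mixing idiomatic and literal uses of the same verb.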
AI-generated summary · Google Gemini · from 2 sources.