PulseAugur

New Benchmark Reveals Citations Increase LLM Hallucinations

AuthorityBench, a new benchmark of 220,564 prompts spanning general knowledge, science, law, and medicine, was developed to study how the presence of citations influences the behavior of large language models. The research found that prompts containing citations, even fabricated ones, consistently produce higher hallucination rates than prompts without citations. The effect is most pronounced when false citations accompany true claims, particularly in the general knowledge domain.
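The comparison described above amounts to grouping model responses by prompt condition and computing a hallucination rate for each group. The following is a minimal sketch of that bookkeeping, not the paper's evaluation code: the record fields (`citation`, `claim_is_true`, `hallucinated`) and the condition labels are assumptions made for illustration, and the toy records are invented solely to exercise the function.

```python
from collections import defaultdict


def hallucination_rates(records):
    """Return {(citation, claim_is_true): fraction of hallucinated responses}.

    Each record is a dict with hypothetical keys:
      citation       -- condition label, e.g. "none" or "fabricated"
      claim_is_true  -- whether the claim in the prompt is factually true
      hallucinated   -- whether the model's response was judged a hallucination
    """
    totals = defaultdict(int)
    hallucinated = defaultdict(int)
    for rec in records:
        key = (rec["citation"], rec["claim_is_true"])
        totals[key] += 1
        hallucinated[key] += int(rec["hallucinated"])
    return {key: hallucinated[key] / totals[key] for key in totals}


# Toy, made-up records purely to exercise the function; NOT results from the paper.
toy_records = [
    {"citation": "none", "claim_is_true": True, "hallucinated": False},
    {"citation": "none", "claim_is_true": True, "hallucinated": False},
    {"citation": "fabricated", "claim_is_true": True, "hallucinated": True},
    {"citation": "fabricated", "claim_is_true": True, "hallucinated": False},
]

rates = hallucination_rates(toy_records)
for condition, rate in sorted(rates.items()):
    print(condition, f"{rate:.2f}")
```

Keying on the pair (citation condition, claim truth) keeps the effect of citation presence separate from the effect of factual content, which is the separation the benchmark is described as targeting.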

IMPACT This research highlights a critical vulnerability in LLMs, suggesting that citation-augmented systems may require significant re-evaluation to mitigate increased hallucination rates.



AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.LG TIER_1 English (EN) · Dhruv Kumar

    Authority, Truth, and Citation Bias: A Large-Scale Multi-Domain Benchmark for Studying Epistemic Susceptibility in Large Language Models

    Large language models are increasingly deployed in citation-augmented settings, yet the effect of citation presence on model behavior independent of factual content remains poorly understood. We introduce AuthorityBench, a 220,564-prompt multi-domain benchmark that isolates how c…

  2. Hugging Face Daily Papers TIER_1 English (EN)

    Authority, Truth, and Citation Bias: A Large-Scale Multi-Domain Benchmark for Studying Epistemic Susceptibility in Large Language Models
