PulseAugur / Brief

last 24h
[1/1] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
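As a rough illustration of how a 0–100 score over these four signals could be composed, here is a minimal sketch. The weights, the 12-hour half-life, the multiplicative use of time decay, and the function name `brief_score` are all assumptions made for the example; the page names the signals but does not say how they are combined.

```python
# Hypothetical weights: the source names the signals but not their relative importance.
WEIGHTS = {"authority": 0.4, "cluster_strength": 0.3, "headline_signal": 0.3}

# Assumed half-life: a story's score halves every 12 hours.
HALF_LIFE_HOURS = 12.0


def brief_score(authority, cluster_strength, headline_signal, age_hours):
    """Combine three signals in [0, 1] with exponential time decay into a 0-100 score."""
    signals = {
        "authority": authority,
        "cluster_strength": cluster_strength,
        "headline_signal": headline_signal,
    }
    for name, value in signals.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    if age_hours < 0:
        raise ValueError("age_hours must be non-negative")

    # Weighted average of the three content signals (weights sum to 1).
    base = sum(WEIGHTS[name] * value for name, value in signals.items())
    # Exponential decay: 1.0 for a brand-new story, 0.5 after one half-life.
    decay = 0.5 ** (age_hours / HALF_LIFE_HOURS)
    return round(100.0 * base * decay, 1)


if __name__ == "__main__":
    # A well-sourced, strongly clustered story that is six hours old.
    print(brief_score(authority=0.9, cluster_strength=0.8, headline_signal=0.6, age_hours=6))
```

Treating decay as a multiplier keeps the score inside 0–100 and guarantees that an older story never outranks an otherwise identical newer one; an additive penalty would be an equally plausible design.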

  1. Whose Name Comes Up? II: Benchmarking and Intervention-Based Auditing of LLM-Based Scholar Recommendation

    Researchers have developed LLMScholarBench, a benchmark for auditing Large Language Models (LLMs) used for academic expert recommendation. The benchmark evaluates both a model's inherent capabilities and the effect of user interventions during the recommendation process. Experiments across 22 LLMs on physics expert recommendation showed that interventions such as temperature adjustment, diversity-focused prompting, and retrieval-augmented generation (RAG) each carry distinct trade-offs across metrics such as factuality, diversity, and representation.

    IMPACT: Provides a framework for evaluating and improving the fairness and accuracy of LLM-driven academic discovery tools.