A new benchmark has been developed to evaluate the impact of persona prompting on Large Language Models (LLMs) used for scholar recommendations. The study audited 43 LLMs across six scientific disciplines, analyzing how variations in language, location, and role-and-task prompts affect the technical quality and social representativeness of recommendations. Findings indicate that while model choice primarily influences technical quality, prompt design significantly impacts diversity and factuality, with specific location prompts yielding distinct outcomes in terms of accuracy and homogeneity.
IMPACT Highlights the critical role of prompt engineering in shaping AI outputs, particularly in academic contexts, where it influences perceived expertise and diversity.
Read on arXiv cs.IR (Information Retrieval) →
AI-generated summary · Google Gemini · from 2 sources.