Researchers have introduced SciZoom, a large-scale benchmark for evaluating hierarchical scientific summarization and analyzing writing trends in the era of large language models (LLMs). The benchmark includes over 44,000 papers from top machine learning conferences between 2020 and 2025, divided into pre-LLM and post-LLM periods. SciZoom offers three levels of summarization (Abstract, Contributions, and TL;DR) and reveals significant shifts in scientific writing: prose has become more confident and more homogeneous, with up to a 10x increase in formulaic expressions and a 23% decline in hedging.
AI IMPACT Provides a new resource for evaluating LLM summarization capabilities and understanding the evolving nature of scientific discourse influenced by AI writing tools.
RANK_REASON The cluster describes a new academic benchmark and dataset for evaluating LLM capabilities in scientific summarization and analyzing writing trends.
AI-generated summary · Google Gemini · from 1 source.