ScholaWrite: A Dataset of End-to-End Scholarly Writing Process
Researchers have introduced ScholaWrite, a dataset that captures the scholarly writing process end to end. It was collected with a Chrome extension that recorded keystrokes in Overleaf, documenting the multi-month path from initial draft to final manuscript for five computer science preprints. The data comprises roughly 62,000 text changes and offers insight into the cognitive demands and task switching involved in academic writing, and it highlights the current limitations of LLMs in assisting with that process.
AI IMPACT: Provides data for developing more effective AI writing assistants that account for authors' cognitive processes.