PulseAugur

Estonian subjectivity dataset created, LLM scoring tested

Researchers have developed a new Estonian-language dataset for document-level subjectivity analysis, comprising 1,000 texts, each rated on a scale from 0 to 100. Analysis of the annotations showed moderate inter-annotator agreement among the human raters, which prompted re-annotation of texts with divergent scores. An experiment using GPT-5 for automatic subjectivity scoring indicated that the approach is feasible but produced scores that differ from human annotations, suggesting that LLM-based scoring is not a direct substitute for human judgment.
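To make the agreement check and the re-annotation step concrete, here is a minimal Python sketch. It is an illustration only, not the authors' procedure: the toy scores, the rater names, the use of mean pairwise Pearson correlation as the agreement measure, and the `max_spread` threshold are all assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_pairwise_correlation(scores):
    """Average Pearson correlation over all pairs of raters.

    `scores` maps a rater name to that rater's 0-100 scores, one per text.
    """
    pairs = list(combinations(scores.values(), 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def divergent_texts(scores, max_spread=40):
    """Indices of texts whose highest and lowest score differ by more than
    `max_spread` points (a hypothetical threshold), i.e. candidates for
    re-annotation."""
    per_text = zip(*scores.values())
    return [i for i, col in enumerate(per_text) if max(col) - min(col) > max_spread]


# Invented toy data: three raters (two human, one LLM), four texts.
scores = {
    "annotator_a": [10, 80, 45, 5],
    "annotator_b": [15, 70, 90, 10],
    "llm": [20, 75, 50, 0],
}

agreement = mean_pairwise_correlation(scores)
to_reannotate = divergent_texts(scores, max_spread=40)
```

In this toy data only the third text (index 2) has a score spread above 40 points, so it is the only one flagged. The same comparison could be run between the LLM's scores and each human rater to see where automatic scoring departs from human judgment.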

IMPACT Provides a new resource for evaluating LLM understanding of subjective content in Estonian.

RANK_REASON The cluster contains an academic paper detailing the creation of a new dataset and an initial experiment.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.CL · English (EN) · Karl Gustav Gailit, Kadri Muischnek, Kairit Sirts

    Creation of the Estonian Subjectivity Dataset: Assessing the Degree of Subjectivity on a Scale

    arXiv:2512.09634v2. Abstract: This article presents the creation of an Estonian-language dataset for document-level subjectivity, analyzes the resulting annotations, and reports an initial experiment of automatic subjectivity analysis using a large language …