Cohen's kappa

PulseAugur coverage of Cohen's kappa: every cluster that mentions it across labs, papers, and developer communities, ranked by signal.

Total · 30d: 7 (7 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 6 (6 over 90d)
Sentiment · 30d: 5 days with sentiment data

RECENT · PAGE 1/1 · 7 TOTAL
  1. RESEARCH · CL_106950

    LLM-as-judge tools fail to prioritize human validation, study finds

    A recent evaluation of six LLM-as-judge tools revealed that most prioritize generating scores over ensuring the trustworthiness of those scores. The author argues that a judge's validation against human labels, measured…

  2. RESEARCH · CL_105153

    LLMs analyzed for self-stigma support in drug use communities · 2 sources tracked

    Researchers have developed methods to analyze self-stigma expressed by individuals who use drugs in online communities, specifically on Reddit. One study created a codebook to categorize self-stigma into cognitive, affe…

  3. TOOL · CL_100060

    New framework measures university CS curriculum alignment with global standards

    A new framework has been developed to measure how well university computer science programs align with international curricular guidelines, specifically CS2013 and CS2023. This human-in-the-loop pipeline represents prog…

  4. RESEARCH · CL_99671

    LLM-as-a-Judge models show significant reliability and bias issues, study finds

    A new study evaluating LLM-as-a-Judge models reveals significant issues with their reliability and validity. The research, which analyzed 21 judges across multiple benchmarks and over 541,000 judgments, found that commo…

  5. TOOL · CL_93144

    LLMs show promise in identifying discourse units for aphasia assessment

    A new research paper explores the use of instruction-tuned large language models (LLMs) for classifying Correct Information Units (CIUs) in aphasic discourse. The study found that while zero-shot prompting was insuffici…

  6. TOOL · CL_52901

    LLM judge evaluations require hundreds of labels for reliable results

    A recent article highlights the critical need for larger evaluation datasets when using LLMs as judges in AI model assessments. The author explains that the common practice of using small, ad-hoc datasets is insufficient fo…

  7. TOOL · CL_18536

    LLM system aids explainable defect analysis in laser powder bed fusion

    Researchers have developed a new decision-support system that combines structured knowledge about defects with large language models (LLMs) to analyze and guide mitigation strategies in laser powder bed fusion (LPBF) ma…