METR AI productivity study faces criticism over methodology

By PulseAugur Editorial · [1 source] · 2026-05-30 12:38

A recent METR study of AI's effect on software development productivity is drawing criticism over its methodology. A commenter on Mastodon points to the small sample of 57 open-source developers, who worked on tasks they had submitted themselves, compensation set at one-third of that in METR's 2025 study, and unclear task content and developer skill levels. The commenter also argues that the reported confidence intervals are too wide to be meaningful.

IMPACT Questions raised about the methodology of AI productivity studies may impact how AI's role in software development is measured and understood.

RANK_REASON The cluster contains commentary and criticism of a research study, rather than the study itself or a new release.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

Mastodon — mastodon.social TIER_1 Japanese (JA) · 2026-05-30 12:38

Hmm, the scale of the article's claims bothers me, or rather, the METR study itself looks questionable. According to the report, it tried to have 57 OSS developers work on development tasks they had submitted themselves, for one-third of the compensation paid in the 2025 study. The sample size is small, the task content and the developers' skill levels are unclear, and the reported confidence intervals are so wide as to be meaningless… https://metr.org/blog/2026-02-24-uplift-update/#wider-adoption-of-ai-has-made-it-more-difficult-to-measure-task-level-productivit…

LINKS metr.org/…/2026-02-24-uplift-update
