In-Domain Supervised Pathology Report Classification: A Reproducible Pipeline from Data Curation to Production-Matched Evaluation

New Pipeline Improves Pathology Report Classification Accuracy

By the PulseAugur Editorial Team · [1 source] · 2026-06-16 04:00

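Researchers have developed a reproducible pipeline for supervised pathology report classification that addresses the performance drop seen when models are applied to data from a different cancer registry. The pipeline standardizes data curation and includes manual review to identify label noise. A model trained with this approach (referred to as the Kentucky model) achieved a significantly lower false negative rate and a higher F1 score than a baseline model trained in Seattle, indicating improved accuracy and a reduced workload for reviewers.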
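The two metrics named in the summary, false negative rate and F1 score, can both be derived from confusion-matrix counts. The following is a minimal sketch of the standard definitions, not the paper's evaluation code; the function names and the example counts are hypothetical and are not taken from the study.

```python
def false_negative_rate(tp: int, fn: int) -> float:
    """FNR = FN / (FN + TP): share of truly positive reports the model missed."""
    positives = tp + fn
    return fn / positives if positives else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN): harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical counts for illustration only (not results from the paper).
tp, fp, fn = 80, 10, 20

print(f"FNR: {false_negative_rate(tp, fn):.3f}")
print(f"F1:  {f1_score(tp, fp, fn):.3f}")
```

In a workflow where missed positive reports must be caught by human reviewers, a lower false negative rate means fewer reportable cases slip past the classifier, which is one plausible reading of the "reduced reviewer workload" claim.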

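Impact: This research offers a standardized approach to improving the accuracy and reliability of AI models that process sensitive medical data from different sources.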

Ranking rationale: This cluster describes a research paper published on arXiv that details a new method for supervised pathology report classification.

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CL · Isaac Hands, Bin Huang, Adam Spannaus, John Gounley, Heidi Hanson, Eric Durbin, Sally R. Ellingson · 2026-06-16 04:00

In-Domain Supervised Pathology Report Classification: A Reproducible Pipeline from Data Curation to Production-Matched Evaluation

arXiv:2606.16026v1. Abstract: We introduce an in-domain supervised pipeline designed to counter the out-of-distribution performance drop that hampers supervised biomedical NLP models, a problem observed when models trained on pathology reports are moved across c…