PulseAugur
DECSELFMASK: Leveraging Unlabeled Text via Self-Relevance-Guided Masking for Decoder-Only Classification

New method uses unlabeled data to improve classification with decoder-only models

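Researchers have developed DecSelfMask, a method that uses unlabeled data to improve the classification performance of decoder-only language models. It applies a relevance-guided masking strategy: the key spans of a text are identified and the model is trained to reconstruct them. On a dataset of 1.9 million clinical notes, DecSelfMask improved macro-F1 by nearly 20 percentage points over standard supervised fine-tuning.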
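The masking step described above can be illustrated with a small sketch. The summary does not say how relevance is scored or how spans are chosen, so everything here is an assumption for illustration: the function name `mask_by_relevance`, the `<mask>` placeholder, the `ratio` parameter, and the hand-written scores are hypothetical and are not taken from the paper. The sketch only shows the general idea of hiding the most relevant tokens and keeping them as reconstruction targets.

```python
from typing import List, Tuple

MASK = "<mask>"  # hypothetical placeholder token, not from the paper


def mask_by_relevance(
    tokens: List[str], scores: List[float], ratio: float = 0.3
) -> Tuple[List[str], List[str]]:
    """Mask the highest-scoring tokens; return (masked_tokens, targets)."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    if not tokens:
        return [], []
    # Number of tokens to hide; always at least one.
    k = max(1, round(len(tokens) * ratio))
    # Indices of the k most relevant tokens (ties broken by position).
    ranked = sorted(range(len(tokens)), key=lambda i: (-scores[i], i))
    top = set(ranked[:k])
    masked = [MASK if i in top else tok for i, tok in enumerate(tokens)]
    # Reconstruction targets, in their original order.
    targets = [tokens[i] for i in sorted(top)]
    return masked, targets


tokens = ["patient", "reports", "severe", "chest", "pain", "today"]
scores = [0.1, 0.05, 0.6, 0.9, 0.8, 0.02]  # made-up relevance scores
masked, targets = mask_by_relevance(tokens, scores, ratio=0.5)
print(masked)
print(targets)
```

In a training setup along these lines, the masked sequence would serve as the input and the hidden tokens as the text the model learns to regenerate; how DecSelfMask actually derives its "self-relevance" scores is not stated in the summary.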

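Impact: Strengthens the classification ability of decoder-only models and may reduce reliance on costly labeled data in specialized domains.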

Ranking rationale: The cluster contains an academic paper describing a new method for improving language model performance.


AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

  1. arXiv cs.CL TIER_1 English(EN) · Pietro Ferrazzi, Matteo Merler, Giovanni Bonetta, Alberto Lavelli, Bernardo Magnini

    DECSELFMASK: Leveraging Unlabeled Text via Self-Relevance-Guided Masking for Decoder-Only Classification

    arXiv:2606.09466v2. Announce type: replace. Abstract: Classification tasks require annotated data, which can often be expensive, time-consuming, or even unfeasible to collect. This is the case of the medical domain, where large datasets often have few annotated examples. To address…

  2. arXiv cs.CL TIER_1 English(EN) · Bernardo Magnini

    DECSELFMASK: Leveraging Unlabeled Text via Self-Relevance-Guided Masking for Decoder-Only Classification

    Classification tasks require annotated data, which can often be expensive, time-consuming, or even unfeasible to collect. This is the case of the medical domain, where large datasets often have few annotated examples. To address this, we propose DecSelfMask (Decoder Self-learning…