GPT-4.1 nano
PulseAugur coverage of GPT-4.1 nano — every cluster mentioning GPT-4.1 nano across labs, papers, and developer communities, ranked by signal.
1 day with sentiment data
-
New HiPP method improves propaganda detection through hierarchical prompting
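Researchers have developed HiPP, a new hierarchical prompting method, to improve propaganda detection in social media text. The method predicts fine-grained propaganda techniques before aggregating them, which proved especially beneficial when fine-tuning models on more ambiguous datasets. The study evaluated four language models and found that the Qwen models performed best overall, while Phi-4 14B consistently outperformed GPT-4.1-nano. The findings highlight the importance of fine-tuning for robust propaganda classification, and the study introduces a new dataset for future research.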
-
Study finds large language models show significant gender bias in medical triage
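A new audit called EQUITRIAGE evaluated five large language models for gender bias in emergency department triage and found that all of them showed bias above the 5% threshold. DeepSeek-V3.1 and Gemini-3-Flash showed a significant tendency to under-triage female patients, with flip rates between 9.9% and 43.8%. Although anonymizing demographic information reduced Gemini's bias, DeepSeek still showed residual bias, suggesting that age is one contributing factor. The study emphasizes that different models have different underlying bias mechanisms, and stresses that in clinical deployment…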
-
SF20K competition shows narrative understanding, not model size, is key for video QA
The first Short-Films 20K (SF20K) Competition, held alongside ICCV 2025, focused on advancing story-level video understanding through an open-ended question-answering task. Using a benchmark of amateur short films and e…
-
Introducing gpt-realtime and Realtime API updates
OpenAI has released GPT-4.1, a new series of models for its API that offer significant improvements in coding, instruction following, and long context comprehension, outperforming previous models like GPT-4o. The compan…