New Research Explores LLM-Based Patent Analysis and Benchmarking

By the PulseAugur editorial team · [4 sources] · 2026-05-22 23:51

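Two new research papers explore ways to improve patent representation learning with large language models. The first introduces PHAGE, a novel encoder that uses a heterogeneous dependency graph to better capture the hierarchical structure of patent claims; it outperforms existing baselines on classification, retrieval, and clustering. The second benchmarks 22 embedding models, from 22M-parameter encoders to 12B instruction-tuned LLMs, on retrieval, classification, and clustering. It finds that the best fine-tuning strategy depends on the task, and that fine-tuning on a single patent landscape can hurt performance on other landscapes.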
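To make the idea of a claim dependency hierarchy concrete, here is a minimal sketch that builds a parent-to-dependent map from claim text. It is only an illustration and is not PHAGE's method: the regex-based reference detection, the `build_claim_graph` name, and the sample claims are all assumptions introduced here, and the sketch models a single edge type rather than a heterogeneous graph.

```python
import re
from collections import defaultdict

# Assumed pattern: matches references such as "claim 1" inside a claim's text.
CLAIM_REF = re.compile(r"\bclaims?\s+(\d+)", re.IGNORECASE)


def build_claim_graph(claims):
    """Return (independent_claims, children) for a dict {number: text}.

    A claim that references no other known claim is treated as independent.
    `children` maps a parent claim number to the claims that depend on it.
    """
    children = defaultdict(list)
    independent = []
    for number in sorted(claims):
        referenced = {int(m) for m in CLAIM_REF.findall(claims[number])}
        # Keep only references to claims that exist, excluding self-references.
        parents = sorted(p for p in referenced if p in claims and p != number)
        if not parents:
            independent.append(number)
        for parent in parents:
            children[parent].append(number)
    return independent, dict(children)


# Illustrative claims, invented for this example.
sample_claims = {
    1: "A device comprising a sensor and a processor.",
    2: "The device of claim 1, wherein the sensor is optical.",
    3: "The device of claim 2, further comprising a display.",
    4: "A method of operating a sensor.",
}

independent, children = build_claim_graph(sample_claims)
print(independent)  # claims with no parent
print(children)     # parent -> dependent claims
```

An encoder that treats claims as one flat token sequence would not see this parent/dependent structure explicitly, which is the gap the first paper's abstract describes.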

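Impact: These studies highlight progress in applying LLMs to specialized domains such as patent analysis, pointing to potential gains in the efficiency and accuracy of information retrieval and classification.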

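Ranking rationale: This cluster contains two academic papers that detail new methods and benchmarks for patent representation learning with LLMs.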


AI-generated summary · Google Gemini · from 4 sources.

Sources [4]

arXiv cs.CL TIER_1 English(EN) · Yongmin Yoo, Qiongkai Xu, Zhangkai Wu, Longbing Cao · 2026-05-28 04:00

Heterogeneous Dependency Graph-Guided Attention for Patent Representation Learning

arXiv:2605.10073v2. Abstract: Pre-trained language models advance patent classification and retrieval via encoding claims as flat token sequences, yet overlooking the dependency hierarchy among claims. Incorporating the hierarchy into self-attention poses tw…

arXiv cs.AI TIER_1 English(EN) · Amirhossein Yousefiramandi, Ciaran Cooney · 2026-05-26 04:00

Benchmarking Patent Embeddings: A Multi-Task Evaluation of 22 Models on Retrieval, Classification, and Clustering

arXiv:2605.24297v1. Abstract: Which fine-tuning signals improve patent embedding models, and do gains transfer across patent landscapes? We benchmark 22 embedding models, from 22M-parameter encoders to 12B instruction-tuned LLMs, on retrieval, classification, …

arXiv cs.IR (Information Retrieval) TIER_1 English(EN) · Ciaran Cooney · 2026-05-22 23:51

Benchmarking Patent Embeddings: A Multi-Task Evaluation of 22 Models on Retrieval, Classification, and Clustering

Which fine-tuning signals improve patent embedding models, and do gains transfer across patent landscapes? We benchmark 22 embedding models, from 22M-parameter encoders to 12B instruction-tuned LLMs, on retrieval, classification, and clustering. The study uses 113,148 WIPO assist…

arXiv cs.IR (Information Retrieval) TIER_1 English(EN) · Ciaran Cooney · 2026-05-22 23:51

Benchmarking Patent Embeddings: A Multi-Task Evaluation of 22 Models on Retrieval, Classification, and Clustering

Two questions regarding practitioners' use of patent embeddings arise: (i) Does one fine-tuning recipe suffice for all downstream applications? (ii) Is fine-tuning on one patent landscape sufficient for downstream application on other landscapes? By evaluating 22 pre-trained embe…
