LLMs
PulseAugur coverage of LLMs — every cluster mentioning LLMs across labs, papers, and developer communities, ranked by signal.
- instance of large-language models 95%
- instance of Llama 2 95%
- instance of generative artificial intelligence 90%
- instance of LLM 90%
- instance of Llama 90%
- instance of BERT 90%
- instance of Qwen 90%
- used by transformer 90%
- used by English 90%
- instance of Gemma 90%
- used by KV cache 90%
- instance of Claude Haiku 4.5 90%
- 2026-05-20 research_milestone A study identified significant hallucination and abuse risks in web-deployed medical LLMs.
- 2026-05-19 research_milestone A new theoretical framework for LLM alignment was proposed in a research paper.
- 2026-05-15 research_milestone A paper was published exploring the use of few-shot large language models for actionable triage categorization of online patient inquiries.
- 2026-05-13 research_milestone A new paper identifies a 'Representation-Action Gap' in omnimodal LLMs, where models fail to act on detected contradictions between text and sensory input.
- 2026-05-13 research_milestone A new paper details a method for fine-tuning compact LLMs to generate children's stories with controllable difficulty and safety.
- 2026-05-13 research_milestone A new framework using LLMs for dynamic content expiration prediction in web search was presented in a research paper.
- 2026-05-12 research_milestone A new paper proposes a disfluency-aware objective tuning method for multilingual speech correction using LLMs.
- 2026-04-21 research_milestone Multiple studies published in prominent medical journals indicate significant limitations and safety concerns regarding the use of large language models for medical advice.
27 days with sentiment data
-
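Answer engine optimization is key to AI content visibility
Content creators need to practice answer engine optimization (AEO), because AI-driven answer engines rely on search engine indexes. To be included in AI summaries and chats, publishers need to make sure their content ranks well in traditional search engines. Although LLMs are trained on some content, their real-time answers usually come from search results, which makes search engine visibility essential to an online presence.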
-
New method detects adversarial LLM prompts using sequential entropy changes
Researchers have developed a new method called CPD Online to detect adversarial prompts that attempt to jailbreak large language models. This technique treats prompt detection as an online change-point detection problem…
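To illustrate the general idea of online change-point detection on a stream of per-token entropies, here is a minimal sketch using a one-sided CUSUM detector. This is a generic textbook technique swapped in for illustration, not the CPD Online method itself, whose details are not given in the summary above. The function name `cusum_first_alarm`, the `drift` and `threshold` parameters, and the entropy values are all hypothetical.

```python
from typing import Optional, Sequence


def cusum_first_alarm(
    values: Sequence[float],
    baseline_mean: float,
    drift: float = 0.5,
    threshold: float = 4.0,
) -> Optional[int]:
    """Return the index at which a one-sided CUSUM statistic first exceeds
    `threshold`, or None if it never does.

    The statistic accumulates how far each value sits above
    `baseline_mean + drift` and is clipped at zero, so short fluctuations
    decay while a sustained upward shift keeps adding up.
    """
    s = 0.0
    for i, x in enumerate(values):
        s = max(0.0, s + (x - baseline_mean - drift))
        if s > threshold:
            return i
    return None


# Hypothetical per-token entropies: a stable prefix, then a sustained jump.
entropies = [2.0] * 10 + [5.0] * 5
alarm_index = cusum_first_alarm(entropies, baseline_mean=2.0)
print(alarm_index)  # 11: the second token after the shift
```

Because the detector processes one value at a time and keeps only a running sum, it can flag a shift while the prompt is still being read. The `drift` term trades sensitivity for fewer false alarms.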
-
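AI models are trained mainly in English, limiting global reach
Despite claims of multilingual capability, most AI systems operate mainly in English because of imbalanced training data. Large language models are trained predominantly on English content, and research suggests that as many as 90% of training tokens are English. This linguistic bias means AI often processes information through an English-centric lens, even when it translates its output, and may miss cultural nuance and local context. As a result, AI can perform worse and make more errors in non-English languages, which limits its effectiveness in diverse global applications.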
-
New framework synthesizes long-term medical dialogues for AI evaluation
Researchers have developed a novel framework for synthesizing long-term medical dialogues to address the lack of realistic datasets for evaluating healthcare agents. This framework constructs synthetic patient profiles,…
-
LLMs supercharge cyber attacks, creating new defense challenges
Commercial large language models are increasingly being used by cybercriminals to automate and scale traditional attacks like phishing and malware development. These LLMs enable attackers to generate highly personalized…
-
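New AI agent uses memory and reinforcement learning for complex CAD generation
Researchers have developed a new memory-augmented reinforcement learning agent designed to improve the generation of complex computer-aided design (CAD) models. The framework integrates a geometry kernel into the toolchain, creating a closed-loop system for design-intent understanding, planning, execution, and verification. The agent uses a dual-track memory module with case and skill libraries, along with a dynamic retrieval algorithm, to support online self-correction and continual improvement without large amounts of new labeled data.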
-
LLMs enhance graph anomaly detection with structure-aware text embeddings
Researchers have developed TERGAD, a new framework for graph anomaly detection that leverages Large Language Models (LLMs). TERGAD translates a node's structural properties into natural language narratives, which are th…
-
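Quantization affects LLM performance, and larger models show greater resilience
A new research paper examines how quantization affects large language model performance, covering models at 2-bit to 6-bit precision. The study finds that higher precision generally yields better performance, yet aggressive quantization often preserves acceptable accuracy, although some models degrade significantly. Larger models tend to be more resilient to quantization, while mid-sized models (7 to 9 billion parameters) offer a good balance between efficiency and performance.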
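The precision trade-off can be made concrete with a minimal sketch of symmetric round-to-nearest quantization applied to synthetic weights at 2 to 6 bits. This is a generic scheme chosen for illustration, not the procedure evaluated in the paper. The helper names, the Gaussian weight distribution, and the sample size are assumptions.

```python
import random
from typing import Dict, List


def quantize_dequantize(weights: List[float], bits: int) -> List[float]:
    """Map weights to signed `bits`-bit integers with one shared scale,
    then map them back to floats (symmetric round-to-nearest)."""
    qmax = 2 ** (bits - 1) - 1          # e.g. 7 for 4 bits
    scale = max(abs(w) for w in weights) / qmax
    if scale == 0.0:                     # all-zero input: nothing to quantize
        return list(weights)
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def mean_abs_error(original: List[float], restored: List[float]) -> float:
    """Average absolute difference between two equal-length lists."""
    return sum(abs(a - b) for a, b in zip(original, restored)) / len(original)


random.seed(0)
# Hypothetical weight tensor: 4096 values drawn from a narrow Gaussian.
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]

errors: Dict[int, float] = {
    bits: mean_abs_error(weights, quantize_dequantize(weights, bits))
    for bits in range(2, 7)
}
for bits, err in errors.items():
    print(f"{bits}-bit mean absolute error: {err:.5f}")
```

The reconstruction error shrinks as the bit width grows, because each extra bit roughly halves the spacing between representable values. Weight-level error is only a proxy: it does not by itself predict task accuracy, which is what the study measures.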
-
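TORQ framework improves LLM accuracy under MXFP4 quantization
Researchers have developed TORQ, a new framework for quantizing large language models (LLMs) in the MXFP4 format. The method addresses accuracy degradation by analyzing and correcting imbalances in activation quantization. TORQ uses a two-level orthogonal rotation strategy to optimize the activation space, significantly improving LLM accuracy under 4-bit floating-point quantization.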
-
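Vaughn Vernon discusses the practical impact of AI on software development
In an interview at SAG 2025, Vaughn Vernon discussed the practical impact of AI on software development, separating real value from hype. He shared insights on large language models (LLMs), productivity gains, hallucinations, and the practical trade-offs developers face when adopting AI tools. The conversation also touched on developer craftsmanship in the age of AI.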
-
AI automates healthcare data to improve clinical decision support
Modern healthcare faces a data liquidity problem, where a significant portion of patient information remains trapped in unstructured formats like scanned documents and free-text notes. This necessitates manual data entr…
-
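Author warns of the cognitive risks of AI in software development
The author argues against widespread use of AI in software development, citing potential downsides such as atrophied thinking skills, a weaker sense of ownership, and privacy concerns. While acknowledging that AI is useful for small tasks, they argue that over-reliance leads to a heavier maintenance burden and more errors. The article recommends not using AI for tasks where cognitive output is the main product, and advocates human-centered development.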
-
Algebra and LLMs verify flight-plan bug fix in Lean
Researchers have utilized large language models (LLMs) in conjunction with algebraic methods to verify a bug fix within the Lean theorem prover. This approach focused on a specific flight-plan software component, demons…
-
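LLM progress in coding agents and personal assistants detailed
Simon Willison gave a five-minute talk at PyCon US 2026 summarizing LLM developments since November 2025. Key advances include significant improvements in coding agents, which have become reliable enough for daily use, and the emergence of "Claws": personal AI assistants such as OpenClaw, which have driven sales of Mac Minis for local hosting.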
-
LLM clinical accuracy varies significantly by prompting language, study finds
A new study published on arXiv reveals that the language used to prompt large language models significantly impacts their diagnostic reasoning and accuracy in clinical settings. Researchers found that four out of five e…
-
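KV cache optimization tackles the LLM GPU memory bottleneck
Large language models (LLMs) face a significant serving-efficiency bottleneck from the memory demands of the KV cache, which stores the attention keys and values computed for earlier tokens. The KV cache is essential for faster responses and longer context windows, but it can consume up to 80% of GPU memory. Innovations such as vLLM's PagedAttention, inspired by operating-system memory management, address this by optimizing KV cache storage and reducing memory fragmentation, significantly improving inference throughput.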
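A back-of-the-envelope sketch can show why the KV cache grows so large and why block-based allocation wastes less memory than reserving a full-length buffer per request. The model dimensions, the request lengths, and the block size of 16 tokens are illustrative assumptions, and the helper functions are hypothetical rather than part of vLLM's API.

```python
from typing import Sequence


def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_value: int = 2,  # 2 bytes per value for fp16/bf16
) -> int:
    """Bytes needed to hold keys and values for every layer and token.
    The leading factor of 2 accounts for storing both K and V."""
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


def wasted_slots_contiguous(seq_lens: Sequence[int], max_len: int) -> int:
    """Unused token slots when each request reserves `max_len` slots up front."""
    return sum(max_len - n for n in seq_lens)


def wasted_slots_paged(seq_lens: Sequence[int], block_size: int = 16) -> int:
    """Unused token slots when each request holds only whole blocks;
    waste is confined to the last, partially filled block."""
    def blocks(n: int) -> int:
        return -(-n // block_size)  # ceiling division
    return sum(blocks(n) * block_size - n for n in seq_lens)


# Hypothetical model: 32 layers, 32 KV heads, head dimension 128, fp16.
size = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128, seq_len=4096)
print(size / 1024**3)  # 2.0 GiB for a single 4096-token sequence

# Three hypothetical requests of different lengths.
lengths = [100, 500, 2000]
print(wasted_slots_contiguous(lengths, max_len=2048))  # 3544
print(wasted_slots_paged(lengths, block_size=16))      # 24
```

Under these assumptions a single long sequence already needs gigabytes of cache, and allocating in small fixed-size blocks cuts the unused slots from thousands to a few dozen, which is the fragmentation effect the paragraph above describes.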
-
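Large language models become a key factor in purchase decisions and a target for manipulation
Large language models increasingly influence consumers' purchase decisions and may shift consumers away from relying on platforms such as YouTube for reviews and comparisons. As LLMs become more central to these decisions, they are likely to become targets for individuals and businesses trying to shape their "opinions." This trend suggests that how LLMs perceive entities matters more and more, which will lead to growing efforts to influence those perceptions.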
-
Spatial AI poised to redefine computing beyond flat screens
The next major computing platform shift will not be driven by AI alone, but by the integration of AI with spatial computing. Current AI applications are largely confined to flat screens, ignoring the physical environmen…
-
LLM factual recall scales with model size and training data frequency
Researchers have identified a predictable relationship between factual recall in large language models, their size, and the frequency of topics in their training data. By evaluating 38 models on over 8,900 scholarly ref…
-
AI primer stresses education for responsible LLM use and ethics
An individual is developing a primer on responsible AI and LLM usage, emphasizing the need for education on their capabilities, limitations, and ethical considerations. The primer highlights the risks of professional an…