PulseAugur

GPT-5.2

PulseAugur coverage of GPT-5.2: every cluster mentioning GPT-5.2 across labs, papers, and developer communities, ranked by signal.

Total · 30d: 65 (65 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 52 (52 over 90d)
SENTIMENT · 30D

21 days with sentiment data

RECENT · PAGE 2/4 · 65 TOTAL
  1. RESEARCH · CL_62272 ·

    Local LLMs show promise for confidential translation work

    A new research paper benchmarks locally runnable language models for confidential translation tasks, expanding upon previous work with a multilingual corpus. The study evaluates several local LLMs using Ollama across fo…

  2. TOOL · CL_57927 ·

    Open-Source LLMs Evolve: Attention, Multimodality, and Efficiency Gains

    The open-source LLM landscape has seen significant shifts in recent months, with Sliding Window Attention becoming mainstream, enabling much larger context windows. QK-Norm is also gaining traction as a training stabili…

  3. TOOL · CL_53673 ·

    New AI Framework Automates Complex Materials Science Calculations

    Researchers have developed AutoDFT, a novel closed-loop multi-agent framework designed to automate Density Functional Theory (DFT) calculations in materials science. Unlike previous LLM-based agents that only plan upfro…

  4. TOOL · CL_51307 ·

    New benchmark and model advance AI customer service capabilities

    Researchers have developed a new benchmark, OlaBench, and a corresponding model, OlaMind, to better evaluate and improve customer service AI systems. Existing benchmarks often fail to capture real-world dialogue nuances…

  5. TOOL · CL_51196 ·

    LLMs struggle with partially correct answers in automated scoring

    A new research paper explores the challenges of automated short answer scoring (ASAS) using large language models (LLMs). The study found that while LLMs like GPT-5.2, GPT-4o, and Claude Opus 4.5 perform well on fully c…

  6. RESEARCH · CL_51036 ·

    New AI text detector READER outperforms larger models

    Researchers have developed READER, a novel system for detecting AI-generated text that outperforms larger models by incorporating a reasoning-based approach. This system, fine-tuned on a curated dataset of rationales an…

  7. TOOL · CL_81354 ·

    LLMs as mutation operators boost evolutionary search in DEI framework

    Researchers have developed DEI, a distributed Quality-Diversity search framework that leverages heterogeneous large language models as mutation operators. This approach enhances evolutionary inference by utilizing the d…

  8. TOOL · CL_50829 ·

    AI models show improved adherence to behavioral constitutions

    A new audit pipeline reveals that while AI models are improving at adhering to their specified behavioral constitutions, they still exhibit significant failure rates. The pipeline, which decomposes specifications into t…

  9. TOOL · CL_44724 ·

    New ERM framework critiques LLM causal reasoning without labels

    A new framework called Epistemic Regret Minimization (ERM) has been introduced to improve the causal reasoning of large language models. Unlike traditional methods that only reward correct answers, ERM critiques the und…

  10. TOOL · CL_43174 ·

    GPT-5.2 shows expert-level performance in scientific peer review

    A recent evaluation suggests that GPT-5.2 is performing at an expert level in scientific peer review. In a study involving 45 scientists and 469 hours, AI reviews were found to be competitive with top human reviewers on…

  11. RESEARCH · CL_44020 ·

    LLMs outperform fine-tuned models on rare suicide circumstances

    A new research paper compares the performance of large language models (LLMs) against fine-tuned RoBERTa models for extracting complex circumstances from death investigation narratives. The study introduces a "Complexit…

  12. RESEARCH · CL_44807 ·

    New benchmark tests LLMs on rare clinical cases beyond guidelines

    Researchers have developed OGCaReBench, a new benchmark designed to evaluate how well large language models can answer complex clinical questions that fall outside standard medical guidelines. The benchmark, derived fro…

  13. RESEARCH · CL_41794 ·

    AI reviewers outperform humans on scientific paper critiques, study finds

    A new study evaluated AI reviewers against human experts in assessing scientific papers, finding that AI models like GPT-5.2, Gemini 3.0 Pro, and Claude Opus 4.5 can outperform top human reviewers on certain metrics. Wh…

  14. COMMENTARY · CL_35534 ·

    Developer shares structured methodology for AI-assisted coding

    A developer outlines a methodology for effectively using AI coding assistants like Anthropic's Claude Code, emphasizing a structured approach over simply prompting for entire applications. The process involves detailed …

  15. TOOL · CL_32749 ·

    Sea Limited deploys OpenAI Codex AI agents to speed up software development

    Sea Limited is deploying OpenAI's Codex AI agents across its engineering teams to accelerate AI-native software development. This initiative aims to transform internal workflows by leveraging Codex as a 'command center'…

  16. TOOL · CL_36570 ·

    LLM pipeline extracts clinical data from nurse-patient transcripts

    Researchers have developed a retrieval-augmented generation (RAG) pipeline to extract structured clinical information from nurse-patient conversations. This system, utilizing models like Llama-4-Scout and GPT-5.2, aims …

  17. TOOL · CL_32553 ·

    VLMs show promise in signature verification but struggle with skilled forgeries

    Researchers explored the use of advanced Vision-Language Models (VLMs) for online signature verification, testing GPT-5.2 and Gemini 2.5 Pro in a zero-shot capacity. The study converted kinematic data into images and us…

  18. TOOL · CL_27593 ·

    New system MemPrivacy shields user data in edge-cloud AI agents

    Researchers have developed MemPrivacy, a system designed to protect sensitive user information in LLM-powered agents that utilize cloud-assisted memory management. MemPrivacy identifies and masks private data on edge de…

  19. TOOL · CL_25584 ·

    LLMs struggle with nuanced answers in automated scoring, study finds

    A new paper explores how large language models (LLMs) perform on automated short answer scoring (ASAS), particularly with partially correct responses. Researchers found that while LLMs like GPT-5.2, GPT-4o, and Claude O…

  20. TOOL · CL_21267 ·

    Cursor AI uses older models despite newer options being available

    A user on Reddit's Cursor subreddit asks why the Cursor IDE's subagent feature defaults to older models such as GPT-5.1 and GPT-5.2 for coding tasks. Despite configuring the system to use newer and potential…