PulseAugur / Brief

Brief

last 24h
[4/4] 222 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. Emotional intelligence in large language models is fragmented across perception, cognition, and interaction

    A new research paper introduces FACET, a framework designed to evaluate the emotional intelligence of large language models. The study found that current frontier models, including GPT-5 and Claude-Sonnet-4, exhibit fragmented emotional capabilities: they excel at objective emotion recognition but struggle with interactive emotional resonance. This fragmentation suggests that emotional intelligence does not scale uniformly with general intelligence and is influenced by specific alignment techniques like RLHF, which may optimize for superficial politeness over genuine affective reasoning.

    IMPACT This research introduces a new evaluation framework that could lead to more nuanced assessments of LLM emotional intelligence, potentially guiding future development towards more socially aware AI.

  2. Claude's Next Model: Sonnet 4.8 and Mythos Rumors, Sorted

    Anthropic has released Claude Opus 4.7, which offers improved performance on coding and long-running tasks compared to its predecessor, Opus 4.6. The new model keeps the same pricing as the previous version, making it a cost-effective upgrade for users. Additionally, users are reminded that the older Claude models Opus 4 and Sonnet 4 will be retired on June 15, 2026, so current model IDs must be updated to avoid service disruptions.

    IMPACT Ensures users are aware of the latest model capabilities and critical retirement dates to maintain service continuity.

  3. M3: Conversational LLMs Simplify Secure Clinical Data Access, Understanding, and Analysis

    Researchers have developed M3, a system that uses conversational LLMs to simplify access to and analysis of complex clinical databases like MIMIC-IV. M3 allows users to query the data in natural language, translating questions into SQL queries for execution. Evaluations showed high accuracy for models like Claude Sonnet 4 and the open-weights gpt-oss-20B, demonstrating the viability of local, privacy-preserving deployment for sensitive medical data.

    IMPACT Enables easier access to sensitive clinical data for research, potentially accelerating medical discoveries.

  4. Measuring AI Gateway Failover: 30 Days of Production Data

    Anthropic has released an update on Claude's sycophancy, noting that Opus 4.7 shows a 50% reduction in sycophantic responses compared to Opus 4.6, particularly in relationship guidance conversations. The company also detailed its election safeguards, emphasizing Claude's impartiality and accuracy in providing political information, with Opus 4.7 and Sonnet 4.6 scoring highly on evaluations. Additionally, Andrej Karpathy's 2025 review highlights Reinforcement Learning from Verifiable Rewards (RLVR) as a key advancement, enabling models to develop reasoning strategies and leading to AI