PulseAugur / Brief

last 24h
[3/3] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. Hierarchical Modeling of ICD Codes in EHR Foundation Models

    Researchers have developed new methods that let electronic health record (EHR) foundation models exploit the hierarchical structure of ICD diagnosis codes. Current models treat these codes as flat tokens and ignore the relationships between them. The work explores two approaches: augmenting BERT-style transformers with hierarchical tokens, and incorporating the hierarchy into graph-based code representations. Experiments on the MIMIC-IV and eICU datasets show that explicitly encoding the ICD hierarchy improves downstream prediction accuracy and cross-dataset transferability, and that the most effective hierarchy level varies by task and model.
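
    To make the idea of hierarchical tokens concrete, here is a minimal sketch of how a single ICD-10-style code could be expanded into coarse-to-fine prefix tokens. The function names and the choice of levels (leading letter, 3-character category, then each longer prefix) are illustrative assumptions; the brief does not describe the paper's actual tokenization.

```python
from typing import List


def hierarchy_tokens(code: str) -> List[str]:
    """Expand an ICD-10-style code into coarse-to-fine prefix tokens.

    Assumed levels (hypothetical, not taken from the paper):
    leading letter, 3-character category, then every longer prefix.
    """
    compact = code.replace(".", "").upper()
    tokens = [compact[:1]]  # coarsest level: leading letter
    for end in range(3, len(compact) + 1):
        tokens.append(compact[:end])  # category, then finer subdivisions
    return tokens


def shared_depth(code_a: str, code_b: str) -> int:
    """Count how many hierarchy levels two codes have in common."""
    depth = 0
    for a, b in zip(hierarchy_tokens(code_a), hierarchy_tokens(code_b)):
        if a != b:
            break
        depth += 1
    return depth


if __name__ == "__main__":
    print(hierarchy_tokens("I50.9"))       # tokens for one heart-failure code
    print(shared_depth("I50.9", "I50.1"))  # siblings share the upper levels
```

    A flat-token model sees `I50.9` and `I50.1` as unrelated symbols; with prefix tokens like these, the two codes share their upper-level tokens, which is the kind of relationship the summary says current models ignore.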

  2. Generalistic or Specific Embeddings, Which is Better? An Empirical Study on Search for Clinical Coding in Non-English Languages

    Researchers compared general-purpose and domain-specific embeddings for semantic search in clinical coding across non-English languages. They found that fine-tuning a Spanish biomedical encoder on LLM-generated synthetic data significantly improved performance in Spanish, Catalan, French, and Portuguese. The resulting pipeline, a bi-encoder followed by a cross-encoder reranker, surpassed existing English-based models on certain metrics despite having no English biomedical pretraining.

    IMPACT Demonstrates a method for improving non-English language model performance in specialized domains using synthetic data.

  3. RAG-Coding: Enhancing LLM Medical Coding with Structured External Knowledge

    Researchers have developed RAG-Coding, a method that uses four large language model (LLM) agents to improve the accuracy of automated ICD-10-CM medical coding. The approach grounds the LLMs' decisions in external knowledge sources such as the official coding guidelines and tabular lists, which improves clinical compliance. In evaluations on the MDACE dataset, RAG-Coding significantly outperformed existing LLM-based baselines on both micro-F1 and macro-F1. The study also introduces an updated dataset, MDACE-2025, which incorporates the 2025 ICD-10-CM guidelines for more precise evaluation.

    IMPACT This research could lead to more accurate and compliant automated medical coding systems, reducing errors and improving healthcare administration.
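
    For readers unfamiliar with the two metrics named above, the sketch below computes micro-F1 and macro-F1 for multi-label code assignment using the standard definitions. The example codes and documents are made up, and averaging macro-F1 over only the labels seen in the gold or predicted sets is an assumed convention; the brief does not describe the exact evaluation protocol used on MDACE.

```python
from collections import Counter
from typing import List, Set, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float]:
    """Return (micro-F1, macro-F1) over per-document sets of codes."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        for code in g & p:
            tp[code] += 1  # predicted and correct
        for code in p - g:
            fp[code] += 1  # predicted but not in gold
        for code in g - p:
            fn[code] += 1  # in gold but missed
    labels = set(tp) | set(fp) | set(fn)
    # Micro: pool all counts, so frequent codes dominate the score.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: average per-code F1, so rare codes weigh as much as common ones.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels) if labels else 0.0
    return micro, macro


if __name__ == "__main__":
    # Made-up example: three documents with gold and predicted code sets.
    gold = [{"I10", "E11.9"}, {"I10"}, {"J45.909"}]
    pred = [{"I10"}, {"I10", "E11.9"}, {"J45.909"}]
    micro, macro = micro_macro_f1(gold, pred)
    print(f"micro-F1={micro:.3f} macro-F1={macro:.3f}")
```

    Because medical code distributions are heavily long-tailed, the two numbers can diverge: a system that handles common codes well but misses rare ones scores higher on micro-F1 than on macro-F1.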