HealthBench

PulseAugur coverage of HealthBench: every cluster mentioning HealthBench across labs, papers, and developer communities, ranked by signal.

Total · 30d · 10 over 90d
Releases · 30d · 0 over 90d
Papers · 30d · 8 over 90d
Sentiment · 30d · 4 day(s) with sentiment data

RECENT · PAGE 1/1 · 10 TOTAL

RESEARCH · CL_104746 · Jun 21 · 10:12

LLMs for Medical Q&A: New Reasoning Prompts and Knowledge-Graph Grounding Explored

Researchers are exploring methods to improve Large Language Models (LLMs) for open-ended medical question answering. One approach involves a Chain of Thought (CoT) reasoning prompt called CLINICR, which aims to mimic cl…
SIGNIFICANT · CL_98845 · Jun 18 · 13:46

Baichuan-M4 enhances AI medical diagnosis with multi-turn consultations and long-term memory

Baichuan Intelligence has released its Baichuan-M4 model, which is specifically enhanced for medical applications. This new model demonstrates significant improvements in multi-turn medical consultations, evidence-based…
RESEARCH · CL_95812 · Jun 16 · 17:34

New RubricsTree framework enhances evaluation of personal health AI agents

Researchers have developed RubricsTree, a new framework designed to address the challenges in evaluating personal health AI agents. This system utilizes a hierarchical taxonomy of over 100 clinically verifiable rubrics,…
TOOL · CL_93421 · Jun 16 · 04:00

New JADE framework enhances AI agent evaluation with expert-grounded dynamic assessment

Researchers have introduced JADE, a novel two-layer evaluation framework designed to address the challenges of assessing AI agents on open-ended professional tasks. The first layer of JADE encodes expert knowledge into …
TOOL · CL_72632 · Jun 5 · 04:00

LLMs improve heart medical Q&A with new GRPO reward framework

Researchers have developed a new method to improve the accuracy of Large Language Models (LLMs) in answering heart-related medical questions. Their approach utilizes Group Relative Policy Optimization (GRPO) with a nove…
RESEARCH · CL_45577 · May 23 · 06:53

Baichuan Intelligence Pivots to Medical AI, Launches M4 Model and Agent

Wang Xiaochuan, founder of Baichuan Intelligence, has pivoted the company's focus from general AI models to a specialized medical AI. This strategic shift involves developing the M4 medical large model and an AI doctor …
TOOL · CL_32658 · May 14 · 16:17

COTCAgent improves LLM analysis of patient health records

Researchers have developed COTCAgent, a new framework designed to improve how large language models analyze longitudinal electronic health records. This agent addresses limitations in current models by incorporating sta…
TOOL · CL_30793 · May 13 · 06:15

LLMs learn to actively seek external info for better task adaptation

Researchers have developed a new method for adapting large language models (LLMs) by enabling them to actively seek information from external sources like Wikipedia and web browsers. This approach, termed "active inform…
RESEARCH · CL_21935 · May 8 · 00:00

Apple's RVPO framework enhances LLM alignment by penalizing reward variance

Researchers have introduced Reward-Variance Policy Optimization (RVPO), a novel framework designed to improve the alignment of large language models with multiple objectives. Unlike existing methods that average rewards…
RESEARCH · CL_22198 · May 7 · 10:10

TheraAgent AI improves medical treatment planning with iterative refinement

Researchers have developed TheraAgent, a new framework designed to improve the precision and safety of treatment plans generated by large language models. Unlike traditional one-shot generation, TheraAgent employs an it…
