Brief

last 24h

[3/3] 221 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
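As a rough illustration of the scoring described above, the sketch below combines the four named signals into a single 0–100 value. The source lists the signals but not how they are weighted or combined, so the weights, the 24-hour half-life, and the function name `score_item` are all hypothetical assumptions, not the feed's actual formula.

```python
# Hypothetical weights: the source names the signals but not their relative importance.
WEIGHTS = {"authority": 0.4, "cluster_strength": 0.3, "headline_signal": 0.3}
HALF_LIFE_HOURS = 24.0  # assumed half-life for the time-decay factor


def score_item(authority, cluster_strength, headline_signal, age_hours):
    """Return a 0-100 score from three signals in [0, 1] and the item's age in hours."""
    signals = (authority, cluster_strength, headline_signal)
    if any(not 0.0 <= s <= 1.0 for s in signals):
        raise ValueError("signals must lie in [0, 1]")
    if age_hours < 0:
        raise ValueError("age_hours must be non-negative")

    # Weighted sum of the three content signals (weights sum to 1).
    base = (
        WEIGHTS["authority"] * authority
        + WEIGHTS["cluster_strength"] * cluster_strength
        + WEIGHTS["headline_signal"] * headline_signal
    )
    # Exponential decay: the score halves every HALF_LIFE_HOURS.
    decay = 0.5 ** (age_hours / HALF_LIFE_HOURS)
    return round(100.0 * base * decay, 1)
```

Under these assumptions, an item with maximal signals loses half its score after one half-life, so older items drop down the list even when they come from strong sources.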

RESEARCH · Mastodon — fosstodon.org English(EN) · 4d

OpenAI o3 disproves an Erdős conjecture with 125 pages of reasoning, while OpenAI files for IPO at 850B valuation and Cohere returns with an open-weights MoE mo

OpenAI's latest model, o3, has reportedly disproven an Erdős conjecture through extensive reasoning. Concurrently, OpenAI is rumored to be preparing for an IPO with a valuation of $850 billion. In related news, Cohere has released a new open-weights Mixture-of-Experts (MoE) model.

IMPACT Potential IPO signals massive market confidence in AI, while new models and research breakthroughs push the frontier.

TOOL · arXiv cs.AI English(EN) · 3d

DrugRAG: Enhancing Pharmacy LLM Performance Through A Novel Retrieval-Augmented Generation Pipeline

Researchers have developed DrugRAG, a retrieval-augmented generation pipeline designed to improve the performance of large language models (LLMs) on pharmacy-related question-answering tasks. The study evaluated ten LLMs on a 141-question dataset and found that GPT-5 and o3 performed best. DrugRAG, which integrates structured drug information without altering model architecture, significantly improved accuracy across several models, particularly smaller open-source ones, by up to 21 percentage points.

IMPACT Provides a practical method to enhance LLM accuracy for specialized knowledge domains like pharmacy.

- Llama 3.1 8B
- o3
- GPT-5
- LLM
- Gemma 3 27B
- Houman Kazemzadeh
- DrugRAG

TOOL · arXiv cs.CL English(EN) · 6d

Prompting language influences diagnostic reasoning and accuracy of large language models

A new study published on arXiv finds that the language used to prompt large language models significantly affects their diagnostic reasoning and accuracy in clinical settings. Four of the five evaluated models performed better when prompted in English than in French, with English yielding higher scores in differential diagnosis, logical structure, and internal validity. Only one model, o3, showed no significant language-based performance difference. The findings highlight the need to consider linguistic and cultural factors for equitable global deployment of LLMs in healthcare.

IMPACT Highlights potential disparities in LLM clinical decision support based on language, impacting equitable access to AI healthcare tools.
