Language Models
PulseAugur coverage of Language Models — every cluster mentioning Language Models across labs, papers, and developer communities, ranked by signal.
Sentiment data available for 7 days
-
Language models reconstruct flow fields from sparse data
Researchers have developed a novel operator learning framework using language model architectures to reconstruct flow fields from sparse data. This method treats sparse measurements as context and unobserved locations a…
-
Language models can now forecast research success, outperforming GPT-5
Researchers have developed a method for language models to predict the success of scientific research ideas before experimentation. By training models on a dataset of comparative idea evaluations, they achieved signific…
-
New framework measures LLM awareness of evaluations
Researchers have developed a new framework to measure and understand how large language models recognize when they are being evaluated. This framework, grounded in social psychology, decomposes "evaluation awareness" in…
-
New RAS method boosts language model Cypher query accuracy
Researchers have developed a new method called Reflection-Augmented Scaling (RAS) to improve the accuracy of language models generating Cypher queries for property graph databases. RAS leverages error messages from fail…
-
Researchers induce pathology-like behaviors in language models via fine-tuning
Researchers have developed a new framework to fine-tune language models, inducing specific behavioral patterns like depression and paranoia. This process modifies the models' policies, leading to stable, context-general…
-
AI expert suggests training models on copyrighted material
Wolfgang Stille from DNB presented on AI competence at the BiblioCon26 conference. He argued that current language models would improve significantly if they could be trained on copyrighted materials under controlled co…
-
New DeepWeb-Bench tests frontier AI models on complex research tasks
Researchers have introduced DeepWeb-Bench, a new benchmark designed to evaluate the deep research capabilities of frontier language models. This benchmark is significantly more challenging than existing ones, requiring …
-
Persona vectors reduce AI sycophancy, study finds
Researchers have found that using pre-existing persona vectors, originally designed for general role-playing, can effectively reduce sycophancy in language models. These persona vectors, when steering models towards dou…
-
New benchmarks tackle AI reward hacking in agents
Researchers have introduced new benchmarks to evaluate "reward hacking" in AI agents, where agents appear to succeed by exploiting evaluation signals rather than fulfilling intended objectives. One benchmark, Hack-Verif…
-
Perplexity explained as key LLM evaluation metric
Perplexity is a crucial metric for evaluating language models, measuring their ability to predict text and indicating their uncertainty. A lower perplexity score signifies better predictive performance, making it a valu…
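As a rough illustration of the metric described above, perplexity is commonly computed as the exponential of the average negative log-likelihood per token. The sketch below assumes the per-token log-probabilities have already been obtained from some model; the function name `perplexity` and the sample probabilities are hypothetical and are not taken from the linked article.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) for a list of natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Hypothetical per-token probabilities assigned by a model to the correct next token.
confident = [math.log(p) for p in (0.9, 0.8, 0.95)]
uncertain = [math.log(p) for p in (0.2, 0.1, 0.25)]

print("confident model perplexity:", perplexity(confident))
print("uncertain model perplexity:", perplexity(uncertain))
```

A model that assigns probability 0.25 to every token has a perplexity of 4, which is one way to read the score: the model is about as uncertain as if it were choosing uniformly among four options at each step.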
-
New self-distillation methods boost LLM performance on reasoning tasks
Researchers have developed new self-distillation techniques for large language models to improve their performance without relying on external feedback. AVSD (Adaptive-View Self-Distillation) balances consensus signals …
-
Adam optimizer corrects SGD's frequency bias in language model training
New research highlights a frequency bias in Stochastic Gradient Descent (SGD) when training language models on imbalanced token distributions. This bias causes parameters for common tokens to converge quickly, while tho…
-
AI medical advice models show deterministic values, risk ethical monoculture
A new study has developed a framework to audit the ethical values embedded in large language models used for medical advice. The research found that while frontier models exhibit a range of ethical priorities similar to…
-
Tool visualizes LLM token generation speeds from 5 to 800 tokens/sec
A new interactive tool allows users to visualize the speed of language model token generation, from 5 to 800 tokens per second. Developed by Mike Veerman, this web application helps users understand advertised speeds li…
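To put the 5 to 800 tokens per second range in concrete terms, the arithmetic is simply response length divided by generation rate. The sketch below is not the tool's code; the function name `generation_seconds` and the 500-token response length are assumptions chosen for illustration.

```python
def generation_seconds(num_tokens, tokens_per_second):
    """Time in seconds to stream num_tokens at a constant generation rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Hypothetical 500-token response at three rates within the range the tool covers.
for rate in (5, 50, 800):
    print(f"{rate:>4} tokens/sec: {generation_seconds(500, rate):.3f} s")
```

Under these assumptions, the same response takes 100 seconds at 5 tokens per second and well under a second at 800.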
-
AI researchers simulate human aphasias to map language model functions
Researchers have developed a novel method to understand the internal workings of language models by simulating aphasias, which are language impairments caused by brain damage in humans. By selectively disabling parts of…
-
Code embeddings boost neural architecture search efficiency
Researchers have developed a novel method called Code-Oriented LM Embeddings (COLE) to improve Neural Architecture Search (NAS). This technique uses off-the-shelf language models to generate embeddings from code represe…
-
Reasoning LLMs show distinct internal trajectories beyond generation length
Researchers have developed a method to analyze the internal trajectories of reasoning-trained language models, distinguishing between simply taking more steps and following different computational paths. By adjusting fo…
-
Language models and humans differ in sentence surprise
Researchers have investigated why language models exhibit less surprise than humans when processing ambiguous sentences. They tested the hypothesis that language models can consider more interpretations simultaneously t…
-
BSO method simplifies AI safety alignment via density ratio matching
Researchers have introduced Bregman Safety Optimization (BSO), a novel method for aligning language models for both helpfulness and safety. BSO simplifies existing complex pipelines by reducing safety alignment to a den…
-
New benchmark GKnow reveals entanglement of gender bias and factual knowledge in LLMs
Researchers have developed GKnow, a new benchmark designed to measure both factual gender knowledge and gender bias in language models. This benchmark aims to disentangle stereotypical outputs from factually gendered on…