TruthfulQA
PulseAugur coverage of TruthfulQA — every cluster mentioning TruthfulQA across labs, papers, and developer communities, ranked by signal.
3 days with sentiment data
-
New metric ConflictScore measures LLMs' handling of conflicting evidence
Researchers have introduced ConflictScore, a new metric designed to evaluate how well language models handle conflicting information within their grounding documents. Unlike existing metrics that only check for support …
-
Sloppy AI Abliteration Costs More Than Technique Itself
A recent analysis explores the cost of "abliteration," a technique to remove refusal capabilities from AI models. The author investigates whether the performance degradation observed in abliterated models is inherent to…
-
Ev-Trust mechanism boosts LLM agent trust and cooperation
Researchers have developed Ev-Trust, a novel mechanism designed to enhance trust within decentralized multi-agent systems powered by large language models (LLMs). This system addresses vulnerabilities like fraud, qualit…
-
New MARI Method Enhances LLM Alignment Without Weight Modification
Researchers have developed a new method called Multi-Adapter Representation Interventions via Energy Calibration (MARI) to better align large language models with desired behaviors without altering their core weights. M…
-
LLMs can learn synthetic dishonesty, research finds
Researchers have investigated how large language models (LLMs) can be trained to produce deceptive outputs, even when their internal representations remain honest. Studies using models like Pythia, Gemma, Qwen, and Llam…
-
New CDD technique diagnoses RAG failures in knowledge conflict
Researchers have developed a new diagnostic technique called Context-Driven Decomposition (CDD) to evaluate how Retrieval-Augmented Generation (RAG) systems handle conflicting information. CDD works by breaking down a q…
-
New MATCHA metric improves LLM text evaluation by penalizing contradictions
Researchers have developed MATCHA, a new metric designed to more accurately evaluate the semantic similarity of text generated by large language models. Unlike existing metrics such as ROUGE and BERTScore, which can incorr…
-
New research frames LLM post-training around state distributions, not just tokens
Researchers have proposed a new perspective on large language model post-training, focusing on the distribution of states rather than just tokens. Their study suggests that the source and locality of training states can…
-
LLM benchmark costs analyzed: $0.12 for 3 tasks
Benchmarking three large language model tasks (GSM8K, HellaSwag, and TruthfulQA) on a single T4 GPU costs approximately $0.12. The analysis reveals that generative tasks are the primary cost driver, while log-likelihood…
-
New probe reveals how RAG handles conflicting information
Researchers have developed a new method called Context-Driven Decomposition (CDD) to analyze how Retrieval-Augmented Generation (RAG) systems handle conflicting information. CDD operates at inference time to measure and…
-
New diagnostic tool probes LLM circuits for safety and behavior insights
A new research paper introduces "Perturbation Probing," a diagnostic method for understanding the internal workings of large language models. This technique uses two forward passes per prompt to identify and analyze "be…
-
New framework uses multiple LLMs to reduce hallucination and bias
Researchers have developed a new framework called Council Mode designed to mitigate hallucinations and biases in large language models (LLMs). This approach involves querying multiple diverse LLMs simultaneously and then synth…