MS MARCO
PulseAugur coverage of MS MARCO — every cluster mentioning MS MARCO across labs, papers, and developer communities, ranked by signal.
-
GPUSparse system accelerates learned sparse retrieval using GPU parallelization
Researchers have developed GPUSparse, a novel system designed to accelerate learned sparse retrieval models by leveraging GPU parallelization. This system addresses the CPU-bound bottleneck in current sparse retrieval m…
-
TileMaxSim kernel boosts GPU retrieval model speed by 220x
Researchers have developed TileMaxSim, a new IO-aware kernel for GPUs designed to significantly accelerate the MaxSim scoring process used in multi-vector retrieval models like ColBERT. Existing implementations are inef…
-
AI agents fail due to flawed search index distribution, not prompting
A common issue in AI agents is that their search results appear correct but lead to factually wrong answers due to problems with the underlying search index. This is not a prompting issue but a distribution problem, whe…
-
New indexing framework SPI boosts RAG performance in vector databases
Researchers have introduced Semantic Pyramid Indexing (SPI), a novel indexing framework for vector databases designed to enhance retrieval-augmented generation (RAG) pipelines. SPI adapts the retrieval depth based on qu…
-
New research tackles noise and efficiency in full-duplex dialogue systems
Two new research papers explore advancements in full-duplex spoken dialogue systems, which allow for simultaneous listening and speaking. One paper introduces Interference-Resilient Adaptive Fusion (IRAF) to improve rob…
-
New ECI method ranks hard-negatives for dense retrieval without training
Researchers have developed a new training-free method called Effective Contrastive Information (ECI) to evaluate hard-negative sources for dense retrieval systems. This technique ranks candidate negatives using frozen e…
-
Web search queries reveal 18% geospatial focus, exceeding GIS capabilities
Researchers have analyzed over a million web search queries and found that nearly 18% relate to geospatial information, substantially more than previously estimated. The study categ…
-
SilentRetrieval attack hijacks RAG systems with poisoned documents
Researchers have developed "SilentRetrieval," a novel two-stage attack designed to compromise Retrieval-Augmented Generation (RAG) systems. This method uses adversarial data poisoning to inject manipulated documents tha…
-
New SCI-Defense framework combats LLM ranking manipulation attacks
Researchers have developed SCI-Defense, a novel framework designed to counter manipulation attacks targeting LLM-based ranking systems. These attacks, known as Generative Engine Optimization (GEO), involve adversaries i…
-
New Layer-wise Token Compression boosts document reranking speed
Researchers have developed a new method called Layer-wise Token Compression (LTC) to improve the efficiency of transformer-based document reranking models used in information retrieval. Unlike previous token compression…
-
LLMs power new adversarial attacks on neural ranking models
Researchers have developed a new framework called CRAFT to attack neural ranking models used in information retrieval. This framework utilizes large language models to generate adversarial content, which is then used to…
-
Researchers propose Parametric Memory Head to improve generative retrieval models
Researchers have developed a novel approach called Post-Adaptation Memory Tuning (PAMT) to address the challenge of catastrophic forgetting in generative information retrieval models. PAMT introduces a modular parametri…
-
Rabtriever model efficiently retrieves rationales, reducing LLM computational costs
Researchers have developed Rabtriever, a novel method to improve the efficiency of rationale-based information retrieval. This approach uses on-policy distillation from generative rerankers, inspired by the Joint-Embedd…