BEIR
PulseAugur coverage of BEIR: every cluster mentioning BEIR across labs, papers, and developer communities, ranked by signal.
9 days with sentiment data
-
TileMaxSim kernel boosts GPU retrieval model speed by 220x
Researchers have developed TileMaxSim, a new IO-aware kernel for GPUs designed to significantly accelerate the MaxSim scoring process used in multi-vector retrieval models like ColBERT. Existing implementations are inef…
-
DREAM paper proposes autoregressive modeling for dense retrieval training
Researchers have developed DREAM (Dense Retrieval Embeddings via Autoregressive Modeling), a novel method for training dense retrieval systems. Unlike traditional methods that rely on costly labeled data, DREAM leverage…
-
AI agents fail due to flawed search index distribution, not prompting
AI agents often retrieve search results that look correct yet produce factually wrong answers because of problems in the underlying search index. This is not a prompting issue but a distribution problem, whe…
-
KaLM-Reranker-V1: efficient document reranking model unveiled
Researchers have introduced KaLM-Reranker-V1, a novel reranking model designed for efficiency in large-scale retrieval systems. This model decouples query and passage computation using an encoder-decoder architecture wi…
-
HAKARI-Bench offers lightweight evaluation for retrieval models · 2 sources tracked
Researchers have introduced HAKARI-Bench, a lightweight benchmark designed to streamline the evaluation of retrieval architectures and efficiency settings for retrieval-augmented generation and semantic search. This new…
-
New multilingual reranker models trained efficiently for diverse tasks
Researchers have developed Querit-Reranker, a new family of multilingual cross-encoder rerankers designed for efficient adaptation to various ranking tasks without requiring extensive labeled data. The models are traine…
-
New ADORE framework improves LLM query expansion with iterative feedback
Researchers have introduced ADORE, an iterative framework designed to enhance Large Language Model (LLM)-based query expansion for information retrieval. Unlike generation-driven methods that can lead to retrieval drift…
-
CompRank framework boosts LLM reranking efficiency
Researchers have developed CompRank, a new framework designed to make large language model (LLM) rerankers more computationally efficient for information retrieval tasks. CompRank achieves this by reducing redundant com…
-
STORM framework enhances lexical query expansion for retrieval
Researchers have developed STORM, a self-supervised framework for lexical query expansion that improves information retrieval. This method uses a reward-guided beam search to optimize token generation, making it more ef…
-
New ECI method ranks hard-negatives for dense retrieval without training
Researchers have developed a new training-free method called Effective Contrastive Information (ECI) to evaluate hard-negative sources for dense retrieval systems. This technique ranks candidate negatives using frozen e…
-
New methods enhance search result reranking with adaptive and long-context AI
Researchers have developed new methods to improve the reranking of search results, particularly in zero-resource scenarios where traditional supervised training is not feasible. One approach, DART, adapts a scoring func…
-
Google Embeddings 2 leads retrieval benchmarks but lags in speed
A new paper benchmarks Google Embeddings 2 (GE2) against several open-source models for multilingual dense retrieval and RAG systems. GE2 achieved top performance across multiple tasks, including BEIR and an Italian RAG…
-
New DIVE method compresses LLM embeddings for efficient vector search
Researchers have developed DIVE, a new method for compressing high-dimensional embeddings from large language models to reduce storage and computational costs in vector search systems. DIVE employs a self-limiting tripl…
-
Deduplication in RAG systems cuts context size without quality loss
A new preprint details an empirical analysis of byte-exact deduplication in Retrieval-Augmented Generation (RAG) systems. The study found significant context reduction across academic, enterprise, and conversational AI …
-
New RAG methods aim to boost AI factuality and reduce hallucinations
Several research papers published on arXiv in May 2026 introduce novel methods to enhance Retrieval-Augmented Generation (RAG) systems. These approaches focus on improving the robustness and trustworthiness of RAG by ad…
-
Rabtriever model efficiently retrieves rationales, reducing LLM computational costs
Researchers have developed Rabtriever, a novel method to improve the efficiency of rationale-based information retrieval. This approach uses on-policy distillation from generative rerankers, inspired by the Joint-Embedd…
-
UnIte method improves information retrieval domain adaptation with uncertainty sampling
Researchers have developed a new method called UnIte for unsupervised domain adaptation in information retrieval. This technique improves how neural retrievers generalize to new domains by strategically selecting docume…
-
A Reproducibility Study of LLM-Based Query Reformulation
Two new research papers explore the application and efficiency of Large Language Models (LLMs) in information retrieval. The first paper, a reproducibility study, evaluates ten LLM-based query reformulation methods acro…