BEIR
PulseAugur coverage of BEIR: every cluster mentioning BEIR across labs, papers, and developer communities, ranked by signal.
2 days with sentiment data
-
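Google Embeddings 2 leads retrieval benchmarks but is slower
A new paper benchmarks Google Embeddings 2 (GE2) against several open-source models on multilingual dense retrieval and RAG systems. GE2 achieved the best performance on multiple tasks, including BEIR and an Italian RAG corpus, but its latency was significantly higher than that of local models. Multilingual-E5-large (mE5-L) delivered comparable performance on Italian retrieval at much lower latency, making it the more practical choice for applications with strict response-time requirements.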
-
New DIVE method compresses LLM embeddings for efficient vector search
Researchers have developed DIVE, a new method for compressing high-dimensional embeddings from large language models to reduce storage and computational costs in vector search systems. DIVE employs a self-limiting tripl…
-
Deduplication in RAG systems cuts context size without quality loss
A new preprint details an empirical analysis of byte-exact deduplication in Retrieval-Augmented Generation (RAG) systems. The study found significant context reduction across academic, enterprise, and conversational AI …
-
New RAG methods aim to boost AI factuality and reduce hallucinations
Several research papers published on arXiv in May 2026 introduce novel methods to enhance Retrieval-Augmented Generation (RAG) systems. These approaches focus on improving the robustness and trustworthiness of RAG by ad…
-
Rabtriever model efficiently retrieves rationales, reducing LLM computational costs
Researchers have developed Rabtriever, a novel method to improve the efficiency of rationale-based information retrieval. This approach uses on-policy distillation from generative rerankers, inspired by the Joint-Embedd…
-
UnIte method improves information retrieval domain adaptation with uncertainty sampling
Researchers have developed a new method called UnIte for unsupervised domain adaptation in information retrieval. This technique improves how neural retrievers generalize to new domains by strategically selecting docume…
-
A Reproducibility Study of LLM-Based Query Reformulation
Two new research papers explore the application and efficiency of Large Language Models (LLMs) in information retrieval. The first paper, a reproducibility study, evaluates ten LLM-based query reformulation methods acro…