PulseAugur

DREAM paper proposes autoregressive modeling for dense retrieval training

Researchers have developed DREAM (Dense Retrieval Embeddings via Autoregressive Modeling), a method for training dense retrieval systems. Most dense retrievers are trained with contrastive objectives, which require labeled positive and negative document pairs that are costly to obtain. DREAM instead uses the next-token prediction objective of a large language model (LLM) as the supervision signal. It injects query-document similarity scores into the LLM's attention heads, so the prediction loss produces gradients that flow back to the retriever. In evaluations on retrieval benchmarks, DREAM consistently outperforms existing baselines across various model scales.
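To make the gradient path concrete, here is a minimal toy sketch in plain Python. It is not the paper's implementation: the single attention head, the mixture model of the next-token probability, and all names (`loss_and_similarity_grads`, `base_logits`, `token_prob_given_doc`) are assumptions made for illustration. It only shows how a next-token loss can yield a gradient on each query-document similarity score once those scores are added to attention logits.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def loss_and_similarity_grads(query_emb, doc_embs, base_logits, token_prob_given_doc):
    """Toy model (assumption, not the DREAM architecture).

    - Retriever similarity s_i = <query_emb, doc_embs[i]>.
    - The scores are added to the attention logits over documents.
    - The next-token probability is the attention-weighted mixture
      p = sum_i a_i * q_i, where q_i is the target-token probability
      when attending to document i.
    Returns the loss -log p, the similarities, and dLoss/ds_i.
    """
    sims = [dot(query_emb, d) for d in doc_embs]
    attn = softmax([b + s for b, s in zip(base_logits, sims)])
    p = sum(a * q for a, q in zip(attn, token_prob_given_doc))
    loss = -math.log(p)
    # Softmax derivative gives dp/ds_i = a_i * (q_i - p),
    # hence dLoss/ds_i = -a_i * (q_i - p) / p.
    grads = [-a * (q - p) / p for a, q in zip(attn, token_prob_given_doc)]
    return loss, sims, grads

# Two documents with equal initial similarity to the query.
query_emb = [1.0, 0.0]
doc_embs = [[0.5, 0.1], [0.5, -0.3]]
base_logits = [0.0, 0.0]
# Document 0 makes the target token likely, document 1 does not.
token_prob_given_doc = [0.9, 0.1]

loss, sims, grads = loss_and_similarity_grads(
    query_emb, doc_embs, base_logits, token_prob_given_doc
)
print("loss:", loss)
print("similarities:", sims)
print("dLoss/dsimilarity:", grads)
```

In this toy setup the gradient is negative for the document that helps predict the target token and positive for the one that does not, so a gradient-descent step raises the first similarity and lowers the second without any relevance label.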

IMPACT This approach could reduce reliance on expensive labeled datasets when training retrieval systems, which may speed up their development.

RANK_REASON The cluster describes a new research paper detailing a novel method for training AI models.


AI-generated summary · Google Gemini · from 3 sources.


COVERAGE [3]

  1. arXiv cs.CL TIER_1 (AF) · Yixuan Tang, Yi Yang ·

    DREAM: Dense Retrieval Embeddings via Autoregressive Modeling

    arXiv:2606.24667v1. Dense retrieval embedding models are a fundamental component of modern retrieval-based AI systems. Most dense retrievers are trained with contrastive objectives, which require labeled positive and negative document pairs that are of…

  2. arXiv cs.CL TIER_1 (AF) · Yi Yang ·

    DREAM: Dense Retrieval Embeddings via Autoregressive Modeling

    Dense retrieval embedding models are a fundamental component of modern retrieval-based AI systems. Most dense retrievers are trained with contrastive objectives, which require labeled positive and negative document pairs that are often costly and difficult to obtain. In this work…

  3. Hugging Face Daily Papers TIER_1 (AF) ·

    DREAM: Dense Retrieval Embeddings via Autoregressive Modeling

    DREAM trains dense retrieval embeddings using autoregressive language model attention mechanisms to supervise document-query similarity without requiring labeled examples.