This paper investigates the effectiveness of mean pooling in text embedding generation, a common technique that averages token embeddings into a single vector. The researchers developed a metric to quantify information loss, specifically in second-order statistics, which occurs when distinct token-embedding distributions are mapped to similar text embeddings. Their findings indicate that modern text encoders, particularly those fine-tuned with contrastive learning, are robust against this collapse, and that this robustness correlates positively with downstream task performance.
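The collapse the summary describes can be sketched in a few lines: two sets of token embeddings with the same mean but different covariances (second-order statistics) mean-pool to the same text embedding. This is an illustrative example only, not the paper's metric; all names and shapes here are hypothetical.

```python
import numpy as np

def mean_pool(token_embeddings):
    # Average per-token embeddings (num_tokens, dim) into one (dim,) vector.
    return token_embeddings.mean(axis=0)

rng = np.random.default_rng(0)
base = rng.normal(size=(8, 4))                      # 8 tokens, dim 4
spread = base + 3.0 * rng.normal(size=(8, 4))       # different covariance
spread -= spread.mean(axis=0) - base.mean(axis=0)   # equalize the means only

# Identical text embeddings despite distinct distributions:
print(np.allclose(mean_pool(base), mean_pool(spread)))  # True
print(np.allclose(np.cov(base.T), np.cov(spread.T)))    # False
```

Mean pooling keeps only the first moment, so any second-order difference between the two token distributions is invisible in the pooled output.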
Summary written by gemini-2.5-flash-lite from 2 sources.
IMPACT Provides a new metric for evaluating text encoders, potentially guiding future model development and fine-tuning strategies.
RANK_REASON Academic paper published on arXiv detailing a new metric for text embeddings.