What Limits Does Quantization Place on Dense Top-$k$ Retrieval? A Theoretical Study
A new theoretical study published on arXiv examines the limits that quantization imposes on dense top-k retrieval. It shows that perfect retrieval with B bits per coordinate requires the embedding dimension to grow logarithmically with the corpus size N, in contrast to the infinite-precision setting, where the required dimension was assumed to be independent of corpus size. The findings suggest that practical vector databases and retrieval systems must increase their embedding dimension, and potentially their precision, as the corpus grows.
IMPACT: Highlights that practical vector databases need to scale embedding dimension with corpus size because of quantization limits.
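
To give a feel for why a fixed bit budget ties dimension to corpus size, here is a minimal counting sketch. It is not the study's theorem: it only assumes that every one of the N documents needs a distinct quantized embedding, and that a d-dimensional vector with B bits per coordinate can take at most 2^(B·d) distinct values, so d must be at least log2(N) / B. The function name `min_dim_counting` is hypothetical, and the study's actual bound for top-k retrieval may differ in constants and in its dependence on k.

```python
def min_dim_counting(n_docs: int, bits: int) -> int:
    """Smallest dimension d such that 2**(bits * d) >= n_docs.

    Illustrative counting bound only (an assumption of this sketch),
    not the bound derived in the study.
    """
    if n_docs < 1 or bits < 1:
        raise ValueError("n_docs and bits must be positive integers")
    # ceil(log2(n_docs)) computed with integer arithmetic to avoid float error
    ceil_log2_n = (n_docs - 1).bit_length()
    # ceil(ceil_log2_n / bits), with a floor of one dimension
    return max(1, -(-ceil_log2_n // bits))


if __name__ == "__main__":
    # The bound grows with log2 of the corpus size and shrinks with more bits.
    for n in (10**3, 10**6, 10**9):
        for b in (1, 4, 8):
            print(f"N={n:>10}  B={b}  d >= {min_dim_counting(n, b)}")
```

Even this crude argument shows the qualitative pattern in the summary: at fixed precision the required dimension rises with log N, and adding bits per coordinate lowers it proportionally.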