Ascend-RaBitQ: Heterogeneous NPU-CPU Acceleration of Billion-Scale Similarity Search with 1-bit Quantization
Researchers have developed Ascend-RaBitQ, a system that accelerates billion-scale vector similarity search on heterogeneous NPU-CPU architectures. It splits the search into two stages: coarse ranking runs on NPUs over 1-bit quantized vectors, and fine ranking runs on CPUs over full-precision vectors, which addresses limitations of traditional CPU-only methods. The system reports significant improvements in index construction speed and query throughput over CPU-only baselines, and it shows promising scalability on distributed multi-NPU systems.
AI IMPACT: Enables more efficient and scalable vector similarity search, which is crucial for large-scale AI applications.
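
The coarse-then-fine split described in the summary can be illustrated with a small sketch. This is not the Ascend-RaBitQ implementation and not the RaBitQ estimator itself: it assumes a plain sign-bit code per dimension and Hamming distance for the coarse stage, runs entirely on the CPU in pure Python, and uses hypothetical names (`sign_code`, `two_stage_search`, `shortlist_size`). It only shows how a cheap 1-bit pass can shortlist candidates before exact distances are computed on the full-precision vectors.

```python
import random


def sign_code(vec):
    """Pack the sign of each coordinate into one int (1 bit per dimension).

    Assumption: a plain sign-bit code, used here only for illustration.
    """
    code = 0
    for i, x in enumerate(vec):
        if x >= 0.0:
            code |= 1 << i
    return code


def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")


def squared_l2(u, v):
    """Exact squared Euclidean distance on full-precision vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def two_stage_search(query, vectors, codes, k, shortlist_size):
    # Stage 1 (coarse): rank every vector by Hamming distance on 1-bit codes.
    # In the system described above this stage is the one placed on the NPU.
    q_code = sign_code(query)
    shortlist = sorted(
        range(len(vectors)), key=lambda i: hamming(q_code, codes[i])
    )[:shortlist_size]
    # Stage 2 (fine): rerank only the shortlist with exact distances.
    # In the system described above this stage is the one placed on the CPU.
    shortlist.sort(key=lambda i: squared_l2(query, vectors[i]))
    return shortlist[:k]


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n = 64, 1000
    vectors = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    codes = [sign_code(v) for v in vectors]
    # Query is a slightly perturbed copy of one stored vector.
    query = [x + rng.gauss(0, 0.05) for x in vectors[42]]
    print(two_stage_search(query, vectors, codes, k=5, shortlist_size=50))
```

In this sketch `shortlist_size` trades recall against the cost of the fine stage: a larger shortlist means more exact distance computations but a lower chance that the true nearest neighbor is dropped by the 1-bit pass.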