Indexing
Exact nearest-neighbor search is O(N·d). Approximate indexes trade a tiny bit of recall for orders of magnitude in speed.
HNSW — Hierarchical Navigable Small World
A multi-layer graph where each node is connected to a handful of neighbors. Queries enter at the top sparse layer and greedily descend, dramatically cutting comparisons. The default choice in Qdrant, Weaviate, Pinecone serverless and pgvector ≥ 0.5.
IVF — Inverted File
Partitions vectors into nlist Voronoi cells via k-means. A query only scans the closest nprobe cells. Combined with product quantization (IVF-PQ) it compresses each vector to a few bytes — perfect for billion-scale indexes that don't fit in RAM.
recall ↑ nprobe ↑ latency ↑
recall ↓ nprobe ↓ latency ↓DiskANN & Vamana
Graph indexes optimized for SSD. The graph lives on disk and only the search path is paged in. Used by Milvus 2.x and Microsoft Bing for tens of billions of vectors on a single node.