Milvus

A distributed vector database with separated compute and storage. Built to scale to tens of billions of vectors across a cluster.

At a glance

License: Apache 2.0
Architecture: Disaggregated compute/storage
Indexes: HNSW, IVF-PQ, DiskANN, SCANN
Strength: Billion-scale clusters

Define a collection

py
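from pymilvus import MilvusClient, DataType

# Connect to a Milvus instance on the default port.
client = MilvusClient("http://localhost:19530")

# Schema: an INT64 primary key plus a 1536-dimensional float vector.
schema = client.create_schema()
schema.add_field("id",     DataType.INT64, is_primary=True)
schema.add_field("vector", DataType.FLOAT_VECTOR, dim=1536)

client.create_collection(collection_name="docs", schema=schema)

# MilvusClient.create_index expects an IndexParams object built with
# prepare_index_params(), not a list of plain dicts.
index_params = client.prepare_index_params()
index_params.add_index(
    field_name="vector",
    index_type="HNSW",
    metric_type="COSINE",
    params={"M": 16, "efConstruction": 200},
)
client.create_index(collection_name="docs", index_params=index_params)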

When to pick Milvus

Choose Milvus when a single node is not enough: typically more than 500M vectors, or when you need to mix multiple index types behind one API. For smaller workloads, its operational footprint (etcd, MinIO, Pulsar) is overkill.
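
To make the 500M threshold concrete, here is a rough back-of-the-envelope sketch of the memory an in-memory HNSW index would need at that scale. The helper names are hypothetical. The figures assume float32 components (4 bytes each) and roughly 2*M four-byte neighbor links per vector at the HNSW base layer; upper layers, metadata, and replicas are ignored, so treat the result as a lower bound and not a sizing guarantee.

```python
def raw_vector_bytes(num_vectors: int, dim: int, bytes_per_component: int = 4) -> int:
    """Bytes needed to hold the uncompressed vectors (float32 by default)."""
    return num_vectors * dim * bytes_per_component


def hnsw_link_bytes(num_vectors: int, m: int, bytes_per_link: int = 4) -> int:
    """Approximate bytes for HNSW graph links.

    Assumption: about 2*M neighbor links per vector at the base layer,
    each stored as a 4-byte id. Upper layers are ignored.
    """
    return num_vectors * 2 * m * bytes_per_link


if __name__ == "__main__":
    n, dim, m = 500_000_000, 1536, 16  # values taken from the text and schema above
    vectors = raw_vector_bytes(n, dim)
    links = hnsw_link_bytes(n, m)
    print(f"raw vectors: {vectors / 1e12:.3f} TB")
    print(f"graph links: {links / 1e9:.1f} GB")
    print(f"total:       {(vectors + links) / 1e12:.3f} TB")
```

Under these assumptions the raw vectors alone come to about 3 TB, which is more memory than a typical single machine offers and is why a sharded cluster becomes attractive at this size.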