
Chroma

An embedded vector database designed for fast iteration. Run it in-process during development, then switch to client-server mode for production.

At a glance

License: Apache 2.0
Hosting: Embedded / self-host
Default index: HNSW (hnswlib)
Strength: Prototyping + LangChain ergonomics
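
The move from in-process use to a server, mentioned in the introduction, can be kept to a single place in your code. The sketch below is one possible arrangement, not a Chroma convention: the environment variable names (APP_CHROMA_HOST, APP_CHROMA_PORT, APP_CHROMA_PATH) and the helpers client_config and make_client are hypothetical. It assumes a Chroma server is already running and reachable when the HTTP branch is taken.

```python
import os


def client_config(env=None):
    """Decide which kind of Chroma client to build.

    The variable names below are made up for this example.
    """
    env = os.environ if env is None else env
    host = env.get("APP_CHROMA_HOST")
    if host:
        # A host is configured: talk to a Chroma server over HTTP.
        port = int(env.get("APP_CHROMA_PORT", "8000"))
        return {"mode": "http", "host": host, "port": port}
    # No host configured: run Chroma in-process with on-disk persistence.
    return {"mode": "embedded", "path": env.get("APP_CHROMA_PATH", "./chroma")}


def make_client(env=None):
    """Build the client described by client_config()."""
    import chromadb  # imported lazily so client_config() works without it

    cfg = client_config(env)
    if cfg["mode"] == "http":
        return chromadb.HttpClient(host=cfg["host"], port=cfg["port"])
    return chromadb.PersistentClient(path=cfg["path"])
```

With no variables set, make_client() returns a PersistentClient, so the rest of the code on this page runs unchanged. Setting APP_CHROMA_HOST in production switches the same code to HttpClient.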

Add documents

py
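import chromadb

# Store the database in a local directory so data survives restarts.
client = chromadb.PersistentClient(path="./chroma")
# Reuse the "docs" collection if it exists; otherwise create it.
collection = client.get_or_create_collection("docs")

# ids must be unique; documents and metadatas are matched to ids by position.
collection.add(
    ids=["doc-1", "doc-2"],
    documents=["embeddings are dense vectors", "ANN trades recall for speed"],
    metadatas=[{"src": "intro"}, {"src": "indexing"}],
)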

If you add documents without embeddings, Chroma embeds them for you with a default sentence-transformer model (all-MiniLM-L6-v2). To use a specific provider, pass the embedding_function argument when you create the collection.
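
To show where embedding_function plugs in, here is a minimal sketch that swaps in a toy embedder. The names toy_embed, ToyEmbedding, build_collection, and the "docs-toy" collection are hypothetical. The hash-based vectors carry no semantic meaning and only stand in for a real provider. The class follows the custom embedding function pattern (a subclass of chromadb.EmbeddingFunction whose __call__ takes input), but the exact interface may differ between Chroma versions, so treat it as an assumption to check against your installed release.

```python
import hashlib


def toy_embed(texts, dim=8):
    """Map each text to a deterministic vector of `dim` floats in [0, 1].

    Toy stand-in for a real model: similar texts do NOT get similar vectors.
    """
    vectors = []
    for text in texts:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        vectors.append([byte / 255.0 for byte in digest[:dim]])
    return vectors


def build_collection(path="./chroma"):
    """Create a collection that embeds documents with toy_embed."""
    # Imported lazily so toy_embed can be used without chromadb installed.
    import chromadb
    from chromadb import Documents, EmbeddingFunction, Embeddings

    class ToyEmbedding(EmbeddingFunction):
        def __call__(self, input: Documents) -> Embeddings:
            return toy_embed(input)

    client = chromadb.PersistentClient(path=path)
    # A separate collection name avoids mixing vectors from different embedders.
    return client.get_or_create_collection(
        "docs-toy", embedding_function=ToyEmbedding()
    )
```

A collection should be queried with the same embedding function that populated it. Vectors from different models usually differ in dimension and are not comparable, which is why the sketch uses its own collection rather than reusing "docs".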

Query

py
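# Return up to 3 nearest documents, restricted to those whose metadata has src == "indexing".
results = collection.query(
    query_texts=["how do approximate indexes work?"],
    n_results=3,
    where={"src": "indexing"},  # metadata filter
)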
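
The value returned by query is a dictionary of parallel lists with one inner list per query text, which can be awkward to read directly. The helper below is a small sketch for flattening it; top_hits is a hypothetical name. It assumes the result includes "documents", "metadatas", and "distances", which is the default when no include argument is passed. The sample dictionary is hand-written to mimic that shape, and its distance value is made up.

```python
def top_hits(results, query_index=0):
    """Turn one query's slice of a Chroma result dict into a list of hits."""
    ids = results["ids"][query_index]
    documents = results["documents"][query_index]
    metadatas = results["metadatas"][query_index]
    distances = results["distances"][query_index]
    return [
        {"id": i, "document": doc, "metadata": meta, "distance": dist}
        for i, doc, meta, dist in zip(ids, documents, metadatas, distances)
    ]


# Hand-written stand-in for what collection.query(...) returns (illustrative values).
sample_results = {
    "ids": [["doc-2"]],
    "documents": [["ANN trades recall for speed"]],
    "metadatas": [[{"src": "indexing"}]],
    "distances": [[0.42]],
}

for hit in top_hits(sample_results):
    print(hit["id"], hit["distance"], hit["document"])
```

Smaller distances mean closer matches. With several query_texts, pass query_index to pick which query's hits to read.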