
Similarity search

Three distance functions cover nearly all vector workloads. Picking the right one is mostly a matter of how your embeddings were trained.

Cosine similarity

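Measures the cosine of the angle between two vectors, ignoring magnitude: 1 means the vectors point the same way, 0 means they are orthogonal, and -1 means they point in opposite directions. This is the default for text embeddings from OpenAI, Cohere, and most sentence-transformer models.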

ts
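function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same dimension");
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; // numerator: a · b
    na  += a[i] * a[i]; // squared norm of a
    nb  += b[i] * b[i]; // squared norm of b
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb);
  // Cosine is undefined for a zero vector; return 0 here instead of NaN.
  return denom === 0 ? 0 : dot / denom;
}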

Dot product

Equivalent to cosine similarity when both vectors are unit-normalized, but cheaper to compute because it skips the two norm calculations. Many production indexes use it when they know the vectors are normalized.
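
As a minimal sketch of that equivalence: if vectors are normalized once up front, a plain dot product gives the same score as the full cosine formula. The helper names `dotProduct` and `normalize` and the sample vectors are assumptions made for illustration, not part of any particular library.

```ts
// Hypothetical helpers for illustration only.
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same dimension");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Scale a vector to unit length so that its norm is 1.
function normalize(v: number[]): number[] {
  const norm = Math.sqrt(dotProduct(v, v));
  if (norm === 0) throw new Error("cannot normalize a zero vector");
  return v.map((x) => x / norm);
}

const rawA = [3, 4];
const rawB = [4, 3];

// Full cosine formula on the raw vectors.
const cosineDirect =
  dotProduct(rawA, rawB) /
  (Math.sqrt(dotProduct(rawA, rawA)) * Math.sqrt(dotProduct(rawB, rawB)));

// Normalize once (typically at insert time), then score with a plain dot product.
const unitA = normalize(rawA);
const unitB = normalize(rawB);
const dotOnUnit = dotProduct(unitA, unitB);

// Both values are approximately 0.96, up to floating-point rounding.
console.log(cosineDirect, dotOnUnit);
```

The saving comes from paying for the square roots once per vector at write time rather than once per comparison at query time.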

Euclidean (L2)

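Measures the straight-line distance between two vectors, so smaller values mean closer matches and magnitude counts as much as direction. It fits any model where magnitude carries information. It is also common with image embeddings (CLIP, DINOv2), though CLIP itself is trained against cosine similarity, so check the model's documentation before settling on L2.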
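
A small sketch of L2 distance, written in the same style as the cosine function above. The function name `euclidean` and the sample vectors are illustrative assumptions. The example uses two vectors that point in the same direction but differ in length, which is exactly the case where L2 and cosine disagree.

```ts
// Hypothetical helper for illustration only.
function euclidean(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same dimension");
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d; // accumulate squared per-dimension differences
  }
  return Math.sqrt(sum);
}

// Same direction, different magnitude: cosine similarity would be 1,
// but L2 distance still reports a gap between the two vectors.
const small = [3, 4];
const scaled = [6, 8];
const gap = euclidean(small, scaled); // sqrt(3*3 + 4*4) = 5
console.log(gap);
```

For unit-normalized vectors the two metrics rank results identically, since the squared L2 distance equals 2 - 2 * cosine.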

Choosing

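Rule of thumb: use whatever the model's documentation recommends. Using a different metric from the one the embeddings were trained for, or indexing with one metric and querying with another, raises no error; it silently degrades recall.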
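
To illustrate how silent the failure is, the sketch below ranks the same two candidates under cosine and under dot product without normalizing first. Everything here (`topMatch`, the `Metric` type, the candidate ids and vectors) is made up for the example. Nothing throws; the top result simply changes.

```ts
// Hypothetical types and helpers for illustration only.
type Metric = (a: number[], b: number[]) => number;
type Candidate = { id: string; vec: number[] };

const dotScore: Metric = (a, b) => a.reduce((s, x, i) => s + x * b[i], 0);
const cosineScore: Metric = (a, b) =>
  dotScore(a, b) / Math.sqrt(dotScore(a, a) * dotScore(b, b));

// Returns the id of the highest-scoring candidate under the given metric.
function topMatch(query: number[], candidates: Candidate[], metric: Metric): string {
  let bestId = "";
  let bestScore = -Infinity;
  for (const c of candidates) {
    const s = metric(query, c.vec);
    if (s > bestScore) {
      bestScore = s;
      bestId = c.id;
    }
  }
  return bestId;
}

const query = [1, 0];
const candidates: Candidate[] = [
  { id: "near", vec: [0.9, 0.1] }, // almost the same direction as the query
  { id: "big", vec: [5, 5] },      // 45 degrees off, but much larger magnitude
];

const byCosine = topMatch(query, candidates, cosineScore); // "near"
const byDot = topMatch(query, candidates, dotScore);       // "big"
console.log(byCosine, byDot);
```

The dot product rewards the larger vector even though its direction is a worse match, which is why unnormalized embeddings scored with the wrong metric return plausible-looking but degraded results.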