What is the cosine similarity range for embeddings?

In theory, cosine similarity is bounded by -1 and 1. For text embeddings from modern models, scores almost always land between 0 and 1, because language models place related concepts in a roughly positive subspace and truly opposite content rarely produces a negative score in practice.

Is cosine similarity the same as dot product?

Only when both vectors are unit-normalized (length 1). In that case the denominator of the cosine formula equals 1, and the dot product equals the cosine. Most production embedding models normalize their outputs, which is why many vector databases use dot product internally and call it cosine similarity.

Why not use Euclidean distance for embeddings?

Euclidean distance is sensitive to vector magnitude, so longer or denser documents can dominate or be penalized in ways that have nothing to do with meaning. Cosine similarity strips out length and scores direction only, which matches how semantic closeness behaves in embedding space.

What is a good cosine similarity threshold for RAG?

It depends on the embedding model and the task, but 0.7 to 0.9 is a typical range for chunks considered 'related' in retrieval-augmented generation. The right number is best found empirically by inspecting top results for your own queries rather than copying a fixed cutoff.

Does pgvector support cosine similarity?

Yes. The cosine distance operator is `<=>` and the matching index operator class is `vector_cosine_ops`. Cosine similarity equals 1 minus cosine distance. For already-normalized vectors, the inner product operator `<#>` gives the same ranking with less arithmetic.

What is Cosine Similarity? (How Vector Search Ranks Documents)

What cosine similarity actually is

Cosine similarity is a number between -1 and 1 that tells you how aligned two vectors are. If the vectors point in exactly the same direction, the score is 1. If they are perpendicular (totally unrelated, in the geometric sense), the score is 0. If they point in opposite directions, the score is -1.

The key idea: cosine similarity looks at direction only. It throws away length. That property is the whole reason it became the default metric for ranking text by semantic closeness.

In a retrieval pipeline, you turn every document chunk and every user question into a high-dimensional vector with an embedding model. Cosine similarity is then the scoring function that decides which chunk best answers the question. ChatRaj's semantic search ranks document chunks by cosine similarity between the question embedding and each chunk's embedding.

How cosine similarity is computed

The formula is straightforward:

cos(θ) = (A · B) / (||A|| × ||B||)

Where:

A · B is the dot product of the two vectors (sum of pairwise products of their components).
||A|| and ||B|| are the Euclidean lengths (magnitudes) of each vector.

So the numerator captures how much the two vectors "agree" component by component, and the denominator divides out their lengths. What survives is the cosine of the angle between them.
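As a minimal sketch of the formula above, here is one way to compute it in plain Python using only the standard library. The function name `cosine_similarity` and the toy 2-dimensional vectors are illustrative choices, not part of any particular library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))      # A · B
    norm_a = math.sqrt(sum(x * x for x in a))   # ||A||
    norm_b = math.sqrt(sum(y * y for y in b))   # ||B||
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors chosen to hit the three reference points of the range.
same = cosine_similarity([1.0, 2.0], [2.0, 4.0])            # same direction, close to 1
perpendicular = cosine_similarity([1.0, 0.0], [0.0, 3.0])   # right angle, 0
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])      # opposite direction, close to -1
```

Real embeddings have hundreds or thousands of dimensions, and a vectorized library would normally replace the Python loops, but the arithmetic is the same.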

The general range is [-1, 1]. For real text embeddings, the score almost always lands in [0, 1] because language models put related concepts in a roughly positive subspace, and truly antonymous content does not produce a vector pointing in the exact opposite direction. In practice a "good" match in a RAG system sits somewhere around 0.7 to 0.9, depending on the model.

Concrete example. Imagine two short queries: "How do I reset my password?" and "I forgot my login credentials, what now?" These sentences are different lengths and share almost no words, so a keyword scorer would underrate the second one. But their embeddings point in nearly the same direction, so the cosine similarity between them is high (often above 0.85 with a modern model). Cosine similarity is what lets the chatbot recognize them as the same intent.

Why cosine similarity matters for AI chatbots

Magnitude invariance is the practical reason cosine wins for embedding search. When a model does not normalize its outputs, a long document's embedding can end up with a different magnitude than a short snippet's. With raw Euclidean distance, that difference could let the long document either dominate or be unfairly penalized, depending on the data. Cosine sidesteps that entirely. It asks: "regardless of how 'loud' each vector is, are they pointing the same way?"

For a chatbot doing dense retrieval, this matters because you want the model to find the chunk whose meaning matches the question, not the chunk whose embedding happens to have the right size. A two-sentence FAQ answer and a five-paragraph blog post should be ranked by how relevant they are, not by length.
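The ranking step described above might be sketched as follows. Everything here is hypothetical: the `rank_chunks` helper, the `min_score` cutoff, and the 3-dimensional "embeddings" are invented for illustration, and a real system would get its vectors from an embedding model.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_chunks(query_vec, chunks, min_score=0.0):
    """Return (score, text) pairs sorted from most to least similar.

    `chunks` is a list of (text, embedding) pairs. `min_score` is a
    hypothetical cutoff; tune it on your own data.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    kept = [pair for pair in scored if pair[0] >= min_score]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)

# Toy vectors, invented for illustration only.
query = [0.9, 0.1, 0.0]
chunks = [
    ("Short FAQ answer about password resets", [0.8, 0.2, 0.0]),
    ("Long blog post about password resets", [8.0, 2.0, 0.5]),   # ~10x the magnitude
    ("Pricing page", [0.0, 0.1, 0.9]),
]
ranked = rank_chunks(query, chunks, min_score=0.5)
```

In this toy setup the short FAQ answer and the much "louder" blog post score almost the same, because only direction counts, while the pricing page falls below the cutoff.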

There is a useful shortcut that production systems lean on heavily. If you normalize every vector to unit length (divide by its magnitude up front), then the denominator in the cosine formula becomes 1, and:

cos(θ) = A · B   (when ||A|| = ||B|| = 1)

The cosine similarity is exactly the dot product. Most modern embedding models, including OpenAI's text-embedding-3 family and the BGE series, return unit-normalized vectors by default. So many vector databases compute a plain inner product under the hood and call it cosine similarity. It is faster (no division, no square roots) and gives the same scores, up to floating-point rounding.
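A small sketch of that shortcut, assuming nothing beyond the standard library. The helper names (`dot`, `normalize`, `cosine_similarity`) and the example vectors are illustrative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a vector to unit length (raises on the zero vector)."""
    length = math.sqrt(dot(v, v))
    if length == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / length for x in v]

def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [3.0, 4.0]
b = [4.0, 3.0]

full_formula = cosine_similarity(a, b)       # divides by both lengths on every call
shortcut = dot(normalize(a), normalize(b))   # plain dot product on unit vectors
```

The two values agree up to floating-point rounding. The practical gain comes from normalizing each stored vector once at write time, so that every query afterwards needs only the dot product.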

Postgres with the pgvector extension exposes this directly. The <=> operator computes cosine distance, where:

cosine_distance = 1 - cosine_similarity

To index for it you create an HNSW or IVFFlat index with vector_cosine_ops. For already-normalized vectors, pgvector's inner product operator <#> (negative inner product) gives the same ranking as cosine distance with less arithmetic, which is why production teams often switch to it once they confirm their model normalizes outputs.
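To see why the two operators agree on ordering, the sketch below mimics their arithmetic in plain Python. These functions are stand-ins written for illustration, not pgvector's API; they only reproduce the definitions given in the text (`<=>` as 1 minus cosine similarity, `<#>` as negative inner product). The document names and vectors are made up.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]

def cosine_distance(a, b):
    """Stand-in for `<=>` as described in the text: 1 - cosine similarity."""
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def negative_inner_product(a, b):
    """Stand-in for `<#>` as described in the text: the inner product, negated."""
    return -dot(a, b)

# Unit-normalized toy vectors, invented for illustration.
query = normalize([1.0, 0.2, 0.0])
docs = {
    "reset-password": normalize([0.9, 0.3, 0.1]),
    "billing": normalize([0.1, 0.9, 0.4]),
    "changelog": normalize([0.0, 0.2, 1.0]),
}

# Both sorts are ascending, the way an ascending ORDER BY would behave.
by_cosine = sorted(docs, key=lambda name: cosine_distance(query, docs[name]))
by_inner = sorted(docs, key=lambda name: negative_inner_product(query, docs[name]))
```

Because the inner product is negated, smaller still means closer, so both keys sort the nearest document first. The equivalence holds only when every vector is unit length.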

Cosine similarity vs dot product vs Euclidean distance

These three are the metrics you will encounter in any vector store. They are related but not interchangeable.

Cosine similarity measures angle. Magnitude does not matter. Range [-1, 1], higher is more similar. Use it for embedding-based semantic search by default.

Dot product measures angle and magnitude together. It is equal to cosine similarity only when both vectors are unit-normalized. If your model does not normalize, dot product will reward longer vectors, which is sometimes what you want (some recommendation systems exploit this on purpose) and sometimes a bug.

Euclidean distance (also called L2) measures straight-line distance between the two vector tips. Two vectors pointing the same direction but with very different magnitudes are far apart in Euclidean terms but identical in cosine terms. For text embeddings that are not unit-normalized, Euclidean tends to underperform cosine because it lets vector length leak into the score; on unit-normalized vectors the two produce the same ranking.
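The difference between the three metrics shows up clearly on a pair of vectors that share a direction but not a length. This is a toy sketch with invented 2-dimensional vectors and illustrative helper names.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

base = [1.0, 2.0]
scaled = [10.0, 20.0]   # same direction, ten times the magnitude

cos_score = cosine_similarity(base, scaled)   # close to 1: direction is identical
dot_score = dot(base, scaled)                 # 50.0: grows with magnitude
l2_dist = euclidean_distance(base, scaled)    # about 20.1: the tips are far apart
```

Cosine treats the pair as a perfect match, dot product rewards the larger vector, and Euclidean distance reports them as far apart.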

Rule of thumb: if your embedding model normalizes outputs (most do), pick cosine or inner product; they give the same ranking. If your model does not normalize and length carries meaning, use dot product. If you are doing geometric clustering on positions in physical space, use Euclidean. For a chatbot retrieving documentation chunks, cosine similarity is the right answer almost every time.

