
Embeddings and Vector Search, Without the Math

Embeddings turn meaning into coordinates. Move the dimension, top-k, and metric sliders to see how a vector store finds the nearest neighbours.


The map-of-meaning analogy

Imagine a vast city where every shop is placed by what it sells. Bakeries cluster on one street. Bookshops down the next. Vegan bakeries sit at the intersection of bakery and health-food. Walk anywhere and the shops nearby will feel related.

An embedding model is the mapmaker. It places every chunk of text (or image, or audio) at coordinates in a high-dimensional space, such that similar things end up close together. Vector search is just asking: "what is nearest to this point?"

What an embedding actually is

An embedding is a list of numbers — typically 384 to 3072 dimensions — produced by a model that has learned what "similar" means from billions of examples.

"how do I reset my password" → [0.014, -0.221, 0.087, … ]   (768 numbers)
"forgot my login"             → [0.019, -0.213, 0.091, … ]   (very close)
"recipe for sourdough"        → [-0.41,  0.318, 0.002, … ]   (far away)

You can't read the numbers, and you don't have to. All you need is the distance between two such lists.

The two distances you'll meet

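  • Cosine similarity: the angle between two vectors, ignoring their length. The default for text. It ranges from −1 to 1, where 1 means the vectors point in exactly the same direction (read: near-identical meaning).
  • Dot product / inner product: fast, but sensitive to magnitude. Common when vectors are normalised to unit length, in which case it equals cosine similarity.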

For most RAG systems, cosine is the right answer until you have a reason otherwise.
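
To make the two measures concrete, here is a minimal sketch in plain Python. The three toy vectors reuse only the first three numbers shown in the example above, so they are illustrative stand-ins, not real embeddings; the function names are my own.

```python
import math

def dot(a, b):
    """Dot (inner) product: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; vector length cancels out."""
    return dot(a, b) / (norm(a) * norm(b))

def normalise(a):
    """Scale a vector to unit length."""
    n = norm(a)
    return [x / n for x in a]

# Toy 3-number vectors: only the components visible above, not real embeddings.
reset_password = [0.014, -0.221, 0.087]
forgot_login = [0.019, -0.213, 0.091]
sourdough = [-0.41, 0.318, 0.002]

print(cosine_similarity(reset_password, forgot_login))  # close to 1
print(cosine_similarity(reset_password, sourdough))     # negative

# On unit-length vectors, the dot product and cosine similarity agree.
print(dot(normalise(reset_password), normalise(forgot_login)))
```

The last line shows why the dot product is popular with normalised vectors: it gives the cosine score without the division.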

Why we don't just compare every pair

A million chunks × one query = a million distance calculations. Doable, but slow. Vector databases use Approximate Nearest Neighbour (ANN) indexes — HNSW, IVF, ScaNN — that trade a tiny bit of recall for huge speedups. A good ANN index returns top-k in milliseconds over hundreds of millions of vectors.
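
For reference, this is roughly what the exact, compare-everything search looks like. It is a brute-force sketch in plain Python, not an ANN index; `top_k_exact` and `recall_at_k` are hypothetical helper names. An ANN index such as HNSW aims to return nearly the same ids while scoring far fewer vectors, and recall measures how close it gets.

```python
import heapq
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k_exact(query, vectors, k):
    """Score every stored vector against the query; return the k best as (index, score)."""
    scored = ((cosine_similarity(query, v), i) for i, v in enumerate(vectors))
    best = heapq.nlargest(k, scored)  # one distance calculation per stored vector
    return [(i, score) for score, i in best]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k that an approximate result also returned."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

# Tiny hand-made 2-D store, for illustration only.
store = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
exact = [i for i, _ in top_k_exact([1.0, 0.0], store, k=2)]
print(exact)
```

Comparing an index's output against `top_k_exact` on a sample of queries is a simple way to check how much recall you are actually trading away.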

What you actually need to choose

Decision            Sane default
Embedding model     A modern hosted model matched to your text language
Vector dimensions   What the model gives you; do not truncate without testing
Index type          HNSW for most workloads
Distance metric     Cosine for text
Top-k               5–10, then rerank
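
The "top-k, then rerank" row can be sketched as a two-stage function. Everything here is an assumption for illustration: `toy_rerank_score` is a hypothetical word-overlap scorer standing in for a real reranking model, and chunks are plain `(text, vector)` tuples rather than records from any particular vector store.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def toy_rerank_score(query_text, chunk_text):
    """Hypothetical stand-in for a reranker: share of query words present in the chunk."""
    q = set(query_text.lower().split())
    c = set(chunk_text.lower().split())
    return len(q & c) / len(q) if q else 0.0

def retrieve_then_rerank(query_text, query_vec, chunks, k=10, final_n=3):
    """chunks is a list of (text, vector) pairs."""
    # Stage 1: cheap vector search keeps only the k nearest chunks.
    candidates = sorted(
        chunks, key=lambda ch: cosine_similarity(query_vec, ch[1]), reverse=True
    )[:k]
    # Stage 2: the slower, more careful scorer reorders just those k.
    reranked = sorted(
        candidates, key=lambda ch: toy_rerank_score(query_text, ch[0]), reverse=True
    )
    return [text for text, _ in reranked[:final_n]]

# Hand-made 2-D vectors, for illustration only.
chunks = [
    ("reset your password from the login page", [0.9, 0.1]),
    ("password rules and length", [1.0, 0.0]),
    ("sourdough recipe", [-1.0, 0.2]),
]
print(retrieve_then_rerank("reset password", [1.0, 0.0], chunks, k=2, final_n=1))
```

The point of the split is cost: the vector stage touches everything cheaply, so the expensive scorer only ever sees a handful of candidates.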

Where embeddings break

  • Domain shift: a general-purpose embedder may not know your jargon. Try domain-specific or fine-tuned variants.
  • Multi-language: pick a multilingual embedder, or embed and query in the same language.
  • Long documents: embedding a 10-page PDF as one vector blurs its many topics into a single point and loses most of the detail. Chunk first, then embed.
  • Symmetry trap: some models embed queries and passages differently, with separate encoders or different input prefixes. Read the docs and use the matching mode on each side.
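
As a rough illustration of "chunk first, then embed", here is a minimal word-window chunker. The sizes are assumptions chosen for the example, not recommendations, and `chunk_words` is a hypothetical helper; real pipelines often split on headings, paragraphs, or token counts instead.

```python
def chunk_words(text, size=200, overlap=40):
    """Split text into windows of `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks

# Ten "words" with a window of 4 and an overlap of 1.
sample = " ".join(str(i) for i in range(10))
print(chunk_words(sample, size=4, overlap=1))
# → ['0 1 2 3', '3 4 5 6', '6 7 8 9']
```

Each chunk is then embedded on its own, and the overlap keeps a sentence that straddles a boundary from being cut off in both neighbours.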
Engr Mejba Ahmed
