“Beta, what on earth is a ‘vector embedding’ and why are people so excited about it?”
A vector embedding is a way to turn messy, human things (like words, sentences, images, songs, products) into lists of numbers so a computer can measure how similar they are. Things that “mean” similar things end up as vectors that point in similar directions—so the computer can find, group, and recommend them.
The Kitchen-Shelf Analogy
Imagine you’re arranging spices for your mom. You decide to sort them by three qualities:
Spiciness (mild → spicy)
Sweetness (not sweet → sweet)
Aroma (low → high)
Now each spice can be described by three numbers—one for each quality:
Chili powder → [0.9, 0.0, 0.7]
Cinnamon → [0.2, 0.8, 0.6]
Turmeric → [0.3, 0.0, 0.4]
These little number lists are vectors. You’ve just created a tiny embedding space for spices. Spices that are “similar” sit closer together in this 3D world.
That’s exactly what vector embeddings do with language, images, audio, etc. They map them into a space where nearness ≈ similarity.
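To make "nearness" concrete, here is a minimal sketch that measures the straight-line distance between the three spice vectors above. The helper names (`distance`, `pairwise_distances`) are made up for this illustration, and Euclidean distance is just one reasonable choice of measure, not the only one.

```python
import math

# Spice vectors from the example above: [spiciness, sweetness, aroma]
spices = {
    "chili powder": [0.9, 0.0, 0.7],
    "cinnamon": [0.2, 0.8, 0.6],
    "turmeric": [0.3, 0.0, 0.4],
}


def distance(a, b):
    """Straight-line (Euclidean) distance between two equal-length vectors."""
    return math.dist(a, b)


def pairwise_distances(items):
    """Map each unordered pair of names to the distance between their vectors."""
    names = sorted(items)
    return {
        (x, y): distance(items[x], items[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Print the pairs from closest to farthest.
for pair, d in sorted(pairwise_distances(spices).items(), key=lambda kv: kv[1]):
    print(pair, round(d, 3))
```

With these numbers, chili powder and turmeric come out as the closest pair: neither is sweet, so they differ only in spiciness and aroma.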
A vector is just an ordered list of numbers, like [0.12, -0.08, 0.77, ... ].
If you’ve ever used coordinates on a map, you’ve used vectors. On Google Maps, a place might be (latitude, longitude). With embeddings, a sentence might be (x1, x2, x3, …, x768)—many dimensions instead of two.
An embedding is the result of passing something (a word, sentence, image, etc.) through a trained model that outputs a vector. The model learns to place similar things near each other. After training, the embedding of:
“Doctor” will be close to “physician.”
“How to cook rice” will be close to “best way to make rice.”
A photo of a dog will be near photos of other dogs.
Let’s pretend we rate drinks on Energy, Sweetness, Temperature:
Drink → Vector (E, S, T)
Coffee → [0.9, 0.1, 0.8]
Tea → [0.6, 0.2, 0.8]
Smoothie → [0.2, 0.8, 0.5]
If Mom says, “I want something warm and mildly sweet,” that query might embed to [0.5, 0.3, 0.8].
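Which drink’s vector is closest? Tea: it differs from the query by only 0.1 in Energy and 0.1 in Sweetness, and matches it exactly on Temperature. Coffee is too energetic, and the smoothie is too sweet and too cold. That’s recommendations in a nutshell.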
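The same lookup can be sketched in a few lines of code. This version ranks the drinks by cosine similarity, which checks how closely two vectors point in the same direction. The function names `cosine_similarity` and `rank` are hypothetical helpers for this example, and a real recommender would search many more items with a dedicated library rather than a loop.

```python
import math

# Drink vectors from the example: [energy, sweetness, temperature]
drinks = {
    "coffee": [0.9, 0.1, 0.8],
    "tea": [0.6, 0.2, 0.8],
    "smoothie": [0.2, 0.8, 0.5],
}

# Mom's request, "something warm and mildly sweet", as a vector.
query = [0.5, 0.3, 0.8]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vec, items):
    """Return (name, similarity) pairs, most similar first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in items.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


for name, score in rank(query, drinks):
    print(f"{name}: {score:.3f}")
```

Tea ranks first, then coffee, then smoothie. Swapping in straight-line distance gives the same order for these particular numbers, though the two measures can disagree on other data.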