Embedding vector
An embedding vector is a numerical representation of a piece of content (text, image, audio) as a fixed-length vector — typically 384 to 3072 dimensions — produced by a neural network trained so that semantically similar inputs produce vectors close to each other in the embedding space. Embeddings power semantic search, recommendation, and clustering.
The defining property is that cosine similarity or dot product between embedding vectors tracks semantic similarity: 'cat' and 'kitten' produce vectors close together, while 'cat' and 'spaceship' produce distant ones. This lets semantic search match queries to documents by meaning rather than keyword overlap, which is useful for content discovery, RAG retrieval, and clustering. Production embedding models vary by dimensionality (smaller vectors are cheaper to store and search; larger ones often, though not always, give better quality), language support, and training corpus (general vs domain-specific). The choice of model matters significantly for retrieval quality. Benchmarks like MTEB are a useful starting point for comparing models, but scores on a public benchmark do not guarantee the same ranking on your own data, so evaluate candidates on a sample of real queries.
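To make the similarity property concrete, here is a minimal sketch of cosine similarity in plain Python. The three-dimensional vectors are hand-made for illustration only; they are not the output of any real embedding model, which would produce hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

# Hypothetical toy vectors, chosen by hand (assumption, not model output).
cat = [0.9, 0.8, 0.1]
kitten = [0.85, 0.75, 0.2]
spaceship = [0.1, 0.2, 0.95]

sim_close = cosine_similarity(cat, kitten)     # near 1: similar direction
sim_far = cosine_similarity(cat, spaceship)    # much lower: different direction

# On unit-length vectors, the plain dot product equals cosine similarity.
dot_on_unit = sum(x * y for x, y in zip(normalize(cat), normalize(kitten)))
```

The last line shows why the two scores are often used interchangeably: once vectors are normalized to unit length, the dot product and cosine similarity give the same value.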
Related terms
- Vector database
A vector database stores embedding vectors and supports fast approximate-nearest-neighbour search at scale — answering 'find the K vectors most similar to this query vector' in milliseconds across millions or billions of vectors.
- Semantic search
Semantic search retrieves documents based on meaning rather than keyword overlap — using embedding vectors and similarity scoring to match queries to documents that express the same concept in different words.
- Retrieval-augmented generation (RAG)
Retrieval-augmented generation is the pattern where an LLM is given relevant context retrieved from an external source — typically via semantic search over a vector database — before generating its response.
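The three related terms fit together as one pipeline, which can be sketched as follows. This is a toy illustration under stated assumptions: the document IDs, texts, and vectors are made up, `top_k` is a hypothetical helper that does an exact brute-force scan (a real vector database would use an approximate-nearest-neighbour index instead), and the final prompt format is only one possible way to pass retrieved context to an LLM.

```python
import heapq
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, index, k):
    """Exact top-K search: score every stored vector, keep the K best.

    Hypothetical helper; stands in for what a vector database does
    approximately and at much larger scale.
    """
    scored = ((cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in index.items())
    best = heapq.nlargest(k, scored)
    return [(doc_id, score) for score, doc_id in best]

# Made-up documents and hand-made vectors (assumption, not model output).
documents = {
    "doc-pets": "Kittens need frequent small meals.",
    "doc-space": "The rocket reached orbit in nine minutes.",
    "doc-cats": "Adult cats sleep most of the day.",
}
index = {
    "doc-pets": [0.85, 0.75, 0.2],
    "doc-space": [0.1, 0.2, 0.95],
    "doc-cats": [0.9, 0.8, 0.1],
}

# Stands in for the embedding of a cat-related question.
query_vec = [0.88, 0.79, 0.12]

# Semantic search: rank documents by vector similarity, not shared keywords.
hits = top_k(query_vec, index, k=2)

# RAG: place the retrieved text in the prompt before calling the LLM.
context = "\n".join(documents[doc_id] for doc_id, _ in hits)
prompt = f"Context:\n{context}\n\nQuestion: What do cats eat?"
```

The brute-force scan costs one similarity computation per stored vector, which is fine for thousands of vectors but is the reason dedicated vector databases rely on approximate indexes at the scale of millions or billions.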