All glossary terms
Cross-cutting

Vector database

A vector database stores embedding vectors and supports fast approximate-nearest-neighbour search at scale — answering 'find the K vectors most similar to this query vector' in milliseconds across millions or billions of vectors. Examples: Pinecone, Weaviate, Qdrant, Milvus, pgvector (Postgres extension), Chroma.

Exact nearest-neighbour is O(N) and infeasible at scale. Vector databases use approximate algorithms — HNSW (hierarchical navigable small world graphs) is the dominant choice — that trade a small recall loss for orders-of-magnitude speed improvement. The choice between dedicated vector databases and adding vector columns to existing operational databases is mostly an operational question: dedicated vector DBs are faster and more scalable; the operational simplicity of one database (pgvector in your existing Postgres) often wins for smaller corpora. The break-even is roughly 10M+ vectors; below that, pgvector or Mongo Atlas Vector Search usually suffice.

Related terms