The Vector Database Hype Cycle
In 2023, vector databases became one of the most hyped technologies in the AI stack. Pinecone, Weaviate, Qdrant, Milvus, Chroma, and a dozen others competed for developer mindshare, and embedding pipelines became a standard component of seemingly every AI application architecture. Then came the correction: many teams that had adopted vector databases found they were solving problems they didn't have, adding complexity without proportional benefit.
This article provides a clear-eyed view of when vector databases genuinely improve search quality and when simpler alternatives perform as well or better.
What Vector Search Is
Vector search, also called dense retrieval, converts both queries and documents into dense vector representations (embeddings) using a neural encoder (e.g., E5, BGE, OpenAI text-embedding-ada-002). Retrieval then finds the query's nearest neighbors in embedding space: the documents most semantically similar to the query, regardless of exact term overlap. This addresses the vocabulary mismatch problem that BM25 struggles with, where a relevant document uses different words than the query.
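To make the nearest-neighbor step concrete, here is a minimal sketch of brute-force retrieval by cosine similarity. The 3-dimensional vectors and document IDs are hypothetical stand-ins; a real encoder would produce vectors with hundreds of dimensions, and a vector database would typically replace the exhaustive scan with an approximate index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def nearest_neighbors(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, similarity) pairs most similar to the query."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical embeddings, invented for illustration only.
DOC_VECS = {
    "hedging-strategies": [0.9, 0.1, 0.0],
    "credit-exposure": [0.8, 0.3, 0.1],
    "office-catering": [0.0, 0.1, 0.9],
}

# Stand-in for the embedding of "something about risk management".
QUERY_VEC = [0.85, 0.2, 0.05]

TOP = nearest_neighbors(QUERY_VEC, DOC_VECS, k=2)
for doc_id, score in TOP:
    print(f"{doc_id}: {score:.3f}")
```

Note that neither of the two top documents needs to share a single word with the query; the ranking depends only on vector geometry.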
When Vector Search Genuinely Helps
- Short, ambiguous queries on large corpora: "Something about risk management" — BM25 requires matching terms; dense retrieval finds related concepts.
- Cross-lingual search: Multilingual embeddings (mE5, LaBSE) can find relevant documents written in a different language from the query.
- Product/recommendation search: "Show me shirts like this one" — similarity-based retrieval with image or product embeddings.
- High-value, low-query-volume applications: Where the cost of false negatives (missing relevant documents) is high enough to justify the infrastructure overhead.
When BM25 + AI Expansion Beats Vector Search
Counterintuitively, pure vector search often underperforms pipelines that keep BM25 as the retriever and add query expansion or reranking on top:
- For technical queries with specific terminology, BM25 exact matching outperforms semantic search, which may surface "related" documents but miss the specific one.
- For named entity search (specific products, people, error codes), term matching is more reliable than semantic embedding.
- For small-to-medium corpora (<1M documents), BM25 + expansion + reranking typically matches vector search quality at lower infrastructure cost.
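The exact-match behavior described above can be seen in a small BM25 sketch. This is an illustrative implementation, not a production one: it assumes a naive whitespace tokenizer, the commonly used parameters k1=1.5 and b=0.75, and the IDF variant with a +1 inside the logarithm. The documents and error codes are made up.

```python
import math
from collections import Counter

def tokenize(text):
    """Naive lowercase whitespace tokenizer (an assumption for this sketch)."""
    return text.lower().split()

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document in docs against the query with BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # no term overlap, no contribution
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores

# Hypothetical corpus: only the first document mentions the exact code.
DOCS = [
    "fix for error e1042 in the payment gateway",
    "general troubleshooting guide for payment errors",
    "error e2077 when the gateway times out",
]

SCORES = bm25_scores("E1042", DOCS)
for doc, score in zip(DOCS, SCORES):
    print(f"{score:.3f}  {doc}")
```

Only the document containing the literal token `e1042` receives a nonzero score. An embedding model may place `e1042` and `e2077` close together because both look like error codes, which is the failure mode the list above describes.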
The Pragmatic Decision
Practical guidance:
1. Start with BM25. Measure your baseline quality on representative queries.
2. Add LLM query expansion. Measure improvement. Cost: roughly $0.001/query, depending on the model and prompt length.
3. Add a cross-encoder reranker. Measure improvement. Cost: compute to run the encoder on the top-K results.
4. If you still need better recall, add dense retrieval as a second retrieval path (hybrid search with reciprocal rank fusion, RRF). Cost: embedding infrastructure.
5. Only replace BM25 entirely with vector search if you have a strong reason specific to your use case.
Most teams that follow this progression find they reach their quality targets at step 2 or 3 and never need full vector search infrastructure.
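If the progression does lead to hybrid search, the fusion step itself is small. The sketch below shows reciprocal rank fusion (RRF) over two ranked lists, plus a recall@k helper for the "measure improvement" checks. The constant k=60 is the commonly used default, and the document IDs, rankings, and relevance labels are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc IDs: each doc scores sum(1 / (k + rank))."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(fused, key=lambda doc_id: fused[doc_id], reverse=True)

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = len(set(ranked_ids[:k]) & relevant)
    return hits / len(relevant)

# Hypothetical outputs of the two retrieval paths for one query.
BM25_RANKING = ["doc_a", "doc_b", "doc_c"]
DENSE_RANKING = ["doc_d", "doc_a", "doc_b"]

# Hypothetical relevance judgments for that query.
RELEVANT = {"doc_a", "doc_d"}

FUSED = reciprocal_rank_fusion([BM25_RANKING, DENSE_RANKING])

print("fused order:", FUSED)
print("BM25 recall@3: ", recall_at_k(BM25_RANKING, RELEVANT, 3))
print("fused recall@3:", recall_at_k(FUSED, RELEVANT, 3))
```

RRF uses only rank positions, so BM25 scores and cosine similarities never have to be put on a common scale. Running the same recall@k measurement before and after each step is what tells you whether the added infrastructure is paying for itself.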