RAG · Retrieval · 6 min

RAG Is Not a Vector Database

Retrieval quality lives in chunking, query rewriting, and rerankers — not in which vector store you picked.

May 30, 2026

Teams spend weeks benchmarking pgvector vs. Pinecone vs. Qdrant and ship the exact same mediocre answers, because the vector store was never the bottleneck.

The actual stack

  1. Chunking — semantic, not fixed-size. Respect document structure.
  2. Query rewriting — a small LLM turns a vague question into 2–3 search queries.
  3. Hybrid retrieval — BM25 + dense, then fuse.
  4. Reranker — a cross-encoder over the top 50.
  5. Context packing — fit the budget, cite the source.
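Steps 3 and 5 are small enough to sketch in plain Python. This is a minimal illustration, not a reference implementation: the post does not say which fusion method to use, so the sketch assumes reciprocal rank fusion (RRF), one common choice, with the conventional constant k=60. The function names `rrf_fuse` and `pack_context`, the document ids, and the whitespace token counter are all hypothetical stand-ins. In practice you would count tokens with your model's tokenizer.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of doc ids with reciprocal rank fusion (assumed method)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); higher ranks contribute more.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def pack_context(chunks, budget_tokens, count_tokens=lambda text: len(text.split())):
    """Greedily pack (source, text) chunks, in ranked order, into a token budget.

    count_tokens defaults to a whitespace split, a crude stand-in for a real tokenizer.
    """
    packed, used = [], 0
    for source, text in chunks:
        cost = count_tokens(text)
        if used + cost > budget_tokens:
            continue  # this chunk does not fit; a shorter one later still might
        packed.append(f"[{source}] {text}")  # keep the source attached for citation
        used += cost
    return "\n\n".join(packed)


# Hypothetical hits from a BM25 index and a dense index.
bm25_hits = ["d3", "d1", "d7"]
dense_hits = ["d1", "d4", "d3"]
fused = rrf_fuse([bm25_hits, dense_hits])
print(fused)  # ['d1', 'd3', 'd4', 'd7']
```

Documents that both retrievers agree on (d1, d3) rise to the top, which is the point of fusing. The fused list is what you would hand to the reranker in step 4, and the reranked chunks are what `pack_context` would trim to the budget.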

Swap one vector store for another and your answers move by single-digit percentages. Add a reranker and they move by 20% or more.