As vector search becomes a core building block for modern AI applications, many teams face a practical question: do we need a dedicated vector database, or can we start with what we already have?
For teams running PostgreSQL, pgvector is often the first answer they explore.
This article explains what pgvector is, why it matters, how to use it in practice, and where its limits start to show as traffic and data scale.
What Is pgvector?
pgvector is an open-source PostgreSQL extension that adds support for vector data types and similarity search directly inside Postgres. It allows you to store embedding vectors (such as those generated by OpenAI or other models) and run similarity queries using distance metrics like cosine similarity, L2 distance, or inner product.
From a system perspective, pgvector turns PostgreSQL into a vector-capable relational database, rather than a dedicated vector database. You keep SQL, transactions, and existing schemas, while adding vector search as another query primitive.
This design choice—extending Postgres instead of replacing it—is the core reason pgvector is both attractive and constrained.
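The three distance metrics mentioned above can be sketched in plain Python. pgvector exposes them as SQL operators: `<->` for L2 (Euclidean) distance, `<=>` for cosine distance, and `<#>` for (negative) inner product. The toy 3-dimensional vectors here are purely illustrative; real embeddings have hundreds or thousands of dimensions, but the math is the same:

```python
import math

# Toy 3-dimensional "embeddings" -- real ones (e.g. from OpenAI models)
# have hundreds or thousands of dimensions, but the metrics are identical.
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]

def l2_distance(u, v):
    """Euclidean distance -- pgvector's <-> operator."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def inner_product(u, v):
    """Inner product -- pgvector's <#> operator returns its negative."""
    return sum(x * y for x, y in zip(u, v))

def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) -- pgvector's <=> operator."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - inner_product(u, v) / (norm_u * norm_v)

# b is a scaled copy of a: their cosine distance is ~0, but their
# L2 distance is not -- the metrics answer different questions.
print(cosine_distance(a, b))
print(l2_distance(a, b))
```

Which metric to use depends on how the embedding model was trained; cosine distance is the common default for normalized text embeddings.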
Why pgvector (and Vector Databases) Matter
Vector databases exist because modern applications increasingly rely on semantic similarity, not exact matches.
Typical use cases include:
- Semantic search over documents
- Recommendation systems
- Retrieval-augmented generation (RAG)
- Deduplication and clustering
pgvector matters because it lowers the barrier to entry. If you already use PostgreSQL, you can experiment with vector search without introducing a new system.
That convenience makes pgvector ideal for:
- Early-stage AI features
- Prototypes and internal tools
- Low- to moderate-scale semantic search
However, as vector workloads grow, the trade-offs between “Postgres + extension” and purpose-built systems become more visible.
How pgvector Works (Conceptually)
pgvector introduces a new column type—vector—and allows similarity queries directly in SQL.
Conceptually:
- Embeddings are stored as fixed-length numeric arrays
- Similarity is computed at query time using a distance function
- Optional indexes reduce scan cost for larger datasets
This means vector search becomes part of your transactional database workload, sharing CPU, memory, and I/O with everything else running on Postgres.
That tight coupling is both a strength and a limitation.
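Without an index, the conceptual model above is just a brute-force scan: compute the distance from the query vector to every stored vector, sort, and take the top k. A minimal sketch (the in-memory "table" stands in for a real `documents` table):

```python
import math

def l2(u, v):
    """Euclidean distance, as in pgvector's <-> operator."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

# Stand-in for a "documents" table: (id, embedding) rows.
table = [
    (1, [0.0, 0.0]),
    (2, [1.0, 1.0]),
    (3, [5.0, 5.0]),
]

def top_k(query, k):
    # Equivalent of: ORDER BY embedding <-> query LIMIT k
    # (a full sequential scan -- cost grows linearly with row count)
    return sorted(table, key=lambda row: l2(row[1], query))[:k]

print([row_id for row_id, _ in top_k([0.9, 0.9], 2)])
```

This linear cost is exactly why larger datasets need the approximate indexes discussed later.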
Using pgvector in Practice
Prerequisites
Before using pgvector, you need:
- PostgreSQL 12 or newer
- A workload where vector search latency in the tens to hundreds of milliseconds is acceptable
- A clear understanding of your expected vector count and query rate
pgvector works best when vectors are measured in tens or hundreds of thousands, not hundreds of millions.
Installing pgvector
Installation is straightforward:
```sql
CREATE EXTENSION vector;
```
On managed Postgres services, availability depends on provider support. In self-managed environments, it’s typically installed via package manager or built from source.
Basic pgvector Usage
A typical table looks like this:
```sql
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding VECTOR(1536)
);
```
Similarity queries are plain SQL (here using the `<->` Euclidean-distance operator; `<=>` gives cosine distance):

```sql
SELECT id, content
FROM documents
ORDER BY embedding <-> '[…]'
LIMIT 5;
```
This simplicity is one of pgvector’s biggest advantages: no new query language, no new infrastructure.
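From application code, the query embedding is typically passed as pgvector's bracketed text literal. A small sketch of the formatting step (the helper name is invented, and the commented-out psycopg call illustrates the shape of the query, not a complete client):

```python
def to_vector_literal(embedding):
    """Format a Python list as pgvector's text representation,
    e.g. [0.1, 0.2] -> '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"

literal = to_vector_literal([0.1, 0.2, 0.3])
print(literal)

# With a driver such as psycopg, the literal is passed as an ordinary
# query parameter (sketch -- connection setup omitted):
# cur.execute(
#     "SELECT id, content FROM documents ORDER BY embedding <-> %s LIMIT 5",
#     (literal,),
# )
```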
Indexing Vector Data with pgvector
For larger datasets, sequential scans become too slow. pgvector supports approximate indexes such as IVFFlat and HNSW; for example:
```sql
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops);
```
Indexes improve performance but introduce trade-offs:
- Index build time increases with data size
- Accuracy becomes approximate
- Maintenance cost grows as data changes
This is often the point where teams begin to feel pgvector’s limits.
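The IVFFlat idea behind these trade-offs can be sketched in a few lines: vectors are bucketed under the nearest of a small set of cluster centroids at build time, and a query scans only the buckets ("lists") whose centroids are closest. The centroids and data here are invented; real indexes learn centroids from the data:

```python
import math

def l2(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

centroids = [[0.0, 0.0], [10.0, 10.0]]
vectors = {1: [0.5, 0.5], 2: [9.5, 9.5], 3: [0.2, 0.8]}

# Build: assign every vector to its nearest centroid's list.
lists = {i: [] for i in range(len(centroids))}
for vid, vec in vectors.items():
    nearest = min(range(len(centroids)), key=lambda i: l2(centroids[i], vec))
    lists[nearest].append(vid)

def search(query, probes=1):
    # Probe only the `probes` closest lists, then scan just their members.
    # Scanning fewer lists is faster but can miss the true nearest
    # neighbor -- this is why IVFFlat results are approximate.
    order = sorted(range(len(centroids)), key=lambda i: l2(centroids[i], query))
    candidates = [vid for i in order[:probes] for vid in lists[i]]
    return min(candidates, key=lambda vid: l2(vectors[vid], query))

print(search([0.4, 0.6]))
```

Rebuilding after heavy inserts matters because the centroids chosen at build time stop reflecting the data distribution.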
Integrating pgvector with Other Tools
pgvector integrates naturally with:
- ORMs and SQL-based applications
- Python and Node.js backends
- LLM pipelines that generate embeddings (OpenAI, Hugging Face, etc.)
Because everything is SQL-based, pgvector fits well into existing Postgres-centric stacks without major refactoring.
Building a Simple pgvector + OpenAI Application
A common pattern looks like this:
- Generate embeddings using OpenAI
- Store vectors in Postgres with pgvector
- Run similarity search during user queries
- Use results to enrich prompts (RAG)
This works well for demos and early production use—but it’s also where scalability questions usually start to surface.
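The four steps above can be sketched end to end. Here `embed` is a deterministic toy stand-in for a real embedding model call (e.g. OpenAI's API), and the in-memory list stands in for a pgvector-backed table:

```python
import math

def embed(text):
    # Hypothetical toy embedding: normalized character-frequency vector
    # over a-z. A real pipeline would call an embedding model instead.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

documents = ["postgres stores relational data", "pgvector adds vector search"]
store = [(doc, embed(doc)) for doc in documents]  # stand-in for INSERTs

def retrieve(question, k=1):
    # Stand-in for: SELECT ... ORDER BY embedding <=> query LIMIT k
    q = embed(question)
    scored = sorted(store, key=lambda item: -sum(a * b for a, b in zip(item[1], q)))
    return [doc for doc, _ in scored[:k]]

def build_prompt(question):
    # Enrich the prompt with retrieved context (the RAG step).
    context = "\n".join(retrieve(question))
    return f"Context:\n{context}\n\nQuestion: {question}"

print(build_prompt("how does vector search work?"))
```

Every moving part here maps onto the list above, which is why this pattern is so quick to stand up on Postgres.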
Optimizing pgvector for High Traffic
As traffic grows, pgvector workloads start competing with transactional queries. At this stage, optimization becomes critical.
Key techniques include:
Connection Pooling
Prevent embedding queries from exhausting Postgres connections.
Query Performance Tuning
Limit result sets aggressively and avoid unnecessary joins.
Caching
Cache frequent similarity results outside Postgres to reduce load.
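One lightweight way to sketch this: key an in-process cache on the (rounded) query embedding so near-identical queries reuse recent results instead of hitting Postgres. `run_similarity_query` is a hypothetical stand-in for the real pgvector query:

```python
from functools import lru_cache

db_hits = {"count": 0}

def run_similarity_query(vector_key):
    # Stand-in for: SELECT ... ORDER BY embedding <-> %s LIMIT 5
    db_hits["count"] += 1
    return [1, 2, 3]

@lru_cache(maxsize=1024)
def cached_search(vector_key):
    # The key must be hashable: a tuple of rounded floats, so that
    # near-identical query embeddings share one cache entry.
    return tuple(run_similarity_query(vector_key))

key = tuple(round(x, 4) for x in [0.1234567, 0.7654321])
cached_search(key)
cached_search(key)          # second call served from the cache
print(db_hits["count"])     # the database was queried only once
```

In production this is more often an external cache (e.g. Redis) shared across application instances, but the keying idea is the same.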
Read Replicas
Offload vector queries to replicas when possible.
PostgreSQL Configuration
Tune memory, work_mem, and autovacuum for mixed workloads.
Index Maintenance
Rebuild or adjust IVFFlat indexes as data grows.
Monitoring
Track query latency, CPU usage, and index effectiveness.
Even with tuning, pgvector eventually reaches a point where Postgres becomes the bottleneck, not vector math.
pgvector and Its Alternatives
pgvector vs. Pinecone
Pinecone is a managed, purpose-built vector database optimized for large-scale, low-latency similarity search.
- pgvector: simpler, cheaper, SQL-based
- Pinecone: scalable, fully managed, but a separate system with its own integration and cost
pgvector is often used first; Pinecone appears later when scale demands it.
pgvector vs. Milvus
Milvus is designed for massive vector workloads with specialized indexing and storage layers.
- pgvector: tightly coupled to Postgres
- Milvus: optimized for vector-first architectures
Milvus fits better when vectors are the primary workload.
pgvector vs. Weaviate
Weaviate combines vector search with metadata filtering and schema awareness.
- pgvector: relational-first
- Weaviate: vector-first with built-in semantic features
The choice depends on whether your system is still fundamentally relational.
Where VeloDB Fits In
As vector workloads mature, many teams discover that their real challenge is not just vector similarity, but combining vector search with fast analytical queries, filtering, and real-time insights at scale.
In these scenarios, some teams move beyond embedding storage inside Postgres and adopt analytical databases designed for high-throughput querying. Platforms like VeloDB, built on Apache Doris, are used when teams need to:
- Run vector search alongside large-scale analytics
- Support low-latency queries under high concurrency
- Avoid overloading transactional databases with analytical workloads
The key shift is architectural: separating transactional concerns from analytical and vector-heavy workloads.
Conclusion
pgvector is a pragmatic and elegant solution for bringing vector search into PostgreSQL. It shines when:
- You want to move fast
- Your data size is moderate
- Your system is already Postgres-centric
However, pgvector is not a silver bullet. As vector counts, query rates, and analytical complexity grow, its tight coupling to Postgres becomes a constraint rather than an advantage.
Understanding when pgvector is enough—and when it’s time to look beyond it—is the real key to building scalable vector-powered systems.