
Key takeaway: High-performance RAG requires more than just an embedding model; it requires a database that can handle vector similarity at scale. By consolidating on Upsun’s managed PostgreSQL with pgvector, you eliminate the "Egress Tax" and gain a database that scales with your agentic demand.
TL;DR: The RAG infrastructure blueprint
Many teams start their RAG journey by bolting a standalone vector database onto their existing stack.
In 2026, this is recognized as a primary driver of the "DevOps Tax." Every time your AI agent moves data between your primary database and a third-party vector store, you are paying in latency, egress costs, and "context drift."
The solution is consolidation. By using PostgreSQL with the pgvector extension on Upsun, your embeddings live in the same table as your application data. One backup strategy. One security model. One source of truth.
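As a sketch of what that consolidation looks like in practice, the snippet below defines a hypothetical `documents` table with an `embedding` column sitting alongside relational data, plus a single hybrid query that filters and ranks in one statement. The table and column names are illustrative; `<=>` is pgvector's cosine-distance operator.

```python
# Illustrative only: a consolidated schema where embeddings live next to
# relational data, so one SQL statement handles filtering and similarity.
SCHEMA = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id        bigserial PRIMARY KEY,
    tenant_id integer NOT NULL,
    content   text NOT NULL,
    embedding vector(1536)  -- dimension depends on your embedding model
);
"""

# One round trip: relational filter plus vector ranking, with no egress to
# a separate vector store. %(...)s placeholders follow the psycopg convention.
HYBRID_QUERY = """
SELECT id, content
FROM documents
WHERE tenant_id = %(tenant_id)s
ORDER BY embedding <=> %(query_embedding)s  -- cosine distance
LIMIT 5;
"""

print(HYBRID_QUERY.strip())
```

Because the filter and the similarity search run in one statement, there is no cross-service hop to pay for in latency or egress.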
Key takeaway: For workloads under 5 million vectors, HNSW (Hierarchical Navigable Small World) indexes on a properly sized Upsun instance provide single-digit millisecond queries.
To achieve production-grade performance, the configuration of your vector index is critical. On Upsun, you have the vertical headroom to tune your database for high-dimensional search.
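A sketch of what that tuning can look like with pgvector's HNSW index is shown below. The table name and the specific `m`, `ef_construction`, `maintenance_work_mem`, and `ef_search` values are illustrative starting points, not prescriptions.

```python
# Build-time parameters: m controls graph connectivity, ef_construction
# controls build quality. Higher values improve recall at the cost of
# index size and build time (pgvector defaults: m = 16, ef_construction = 64).
CREATE_INDEX = """
CREATE INDEX documents_embedding_hnsw
ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 128);
"""

# HNSW builds are memory-hungry; give the build session headroom.
BUILD_MEMORY = "SET maintenance_work_mem = '2GB';"

# Query-time knob: raise ef_search for better recall, lower it for latency
# (pgvector default: 40).
TUNE_SEARCH = "SET hnsw.ef_search = 80;"

for stmt in (BUILD_MEMORY, CREATE_INDEX, TUNE_SEARCH):
    print(stmt.strip())
```

The `maintenance_work_mem` setting is why independent RAM allocation matters: the index build needs memory headroom that the rest of your stack does not.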
Key takeaway: RAG pipelines are notoriously "bursty." Upsun allows you to scale your database resources independently and surgically, ensuring your vector search remains performant during indexing spikes without overpaying for idle compute.
A sudden influx of user queries or a massive document re-indexing job can spike database load instantly. Traditional managed primitives often force you into rigid instance tiers where you pay for high CPU just to get the RAM required for vector indexing.
Surgical scaling in action:
Today, you can use the Upsun CLI or console to vertically scale your PostgreSQL instance in seconds. Because the platform allows for independent allocation of vCPU and RAM, you can provide the specific memory overhead required for heavy HNSW indexing without over-provisioning the rest of your stack. This ensures that your self-correction loops and search queries remain responsive, regardless of data volume.
Key takeaway: You should never test a new HNSW index or a schema migration in production. Upsun’s byte-level clones provide the only safe proving ground for RAG.
As discussed in The data context gap: why agents fail on fragmented stacks, the greatest risk to a RAG pipeline is the "Reality Gap."
When you create a branch, Upsun creates a data-complete preview: a 1:1 clone of your production data, including vector indexes such as those built with vector_cosine_ops.

Don't let fragmented infrastructure be the reason your AI fails. By consolidating your vector and relational data on a platform designed for environment parity, you reclaim the innovation budget wasted on infrastructure plumbing.
Future-proof your data strategy:
Doesn't cloning production data violate privacy regulations like GDPR?
It would if you cloned it blindly. Upsun allows you to define sanitization hooks in your deployment pipeline. The moment a branch is created, a byte-level clone is made, and a sanitization script (e.g., masking emails or stripping PII) runs automatically before any developer or AI agent gains access. You get the shape and scale of production data without the compliance risk.
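As an illustration of the kind of logic such a sanitization script might contain (the regex and placeholder address below are assumptions for the sketch, not Upsun's implementation), here is a minimal email-masking helper:

```python
import re

# Rough email pattern for illustration; real PII masking should be driven
# by your compliance requirements, not this regex alone.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def mask_email(text: str) -> str:
    """Replace any email address in the input with a fixed placeholder."""
    return EMAIL_RE.sub("redacted@example.com", text)

print(mask_email("Reach alice@corp.io or bob@example.org"))
# → Reach redacted@example.com or redacted@example.com
```

Run against the cloned tables before access is granted, a script like this preserves the shape and volume of production data while stripping the values that make it sensitive.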
Does cloning a 500GB database for every branch explode our storage costs?
No. Upsun uses Copy-on-Write technology. When you clone an environment, you aren't physically duplicating 500GB of data. You are creating a "virtual" pointer to the existing data blocks. You only pay for the changes (diffs) made within that specific branch. This makes "Data-Complete Previews" economically viable even for massive datasets.
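The economics are easy to sanity-check with hypothetical numbers (the 2 GB per-branch diff below is an assumption for illustration):

```python
FULL_DATASET_GB = 500      # production database size from the example above
BRANCHES = 10              # concurrent preview environments
DIFF_GB_PER_BRANCH = 2     # hypothetical churn written inside each branch

naive_copies_gb = FULL_DATASET_GB * BRANCHES   # physical duplication
cow_gb = DIFF_GB_PER_BRANCH * BRANCHES         # copy-on-write: diffs only

print(naive_copies_gb, cow_gb)  # → 5000 20
```

Under these assumptions, physical duplication would consume 5 TB of storage, while copy-on-write clones bill for only the 20 GB of accumulated diffs.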
Will running an AI agent against a clone slow down our live production site?
Not at all. Because the clone is a logically isolated environment with its own dedicated resources, the AI agent can run heavy queries, re-index vector stores, or execute complex migrations without consuming a single CPU cycle from your production cluster.
How is this different from a traditional "Staging" database?
Traditional staging is a "shared" resource that quickly becomes a graveyard of stale data and conflicting migrations. Upsun provides Ephemeral Parity: every single Git branch gets its own unique, fresh clone. When you delete the branch, the environment (and its data) vanishes, ensuring no "Shadow Data" sprawl.
Can AI agents actually understand the infrastructure?
Yes, through the Upsun MCP Server. Instead of scripting API calls, your agent can create environments, add services, and monitor deployments using natural-language commands, grounded in the live state of your Upsun project rather than guesses about how your infrastructure is shaped.