5 reasons why I'm building our RAG application on Upsun

AI, Django, PostgreSQL, PaaS, API

18 October 2024

Robert Douglass

Developers are racing to harness Large Language Models (LLMs) and apply advanced natural language processing across a wide range of applications. One technique that has surged in popularity is Retrieval-Augmented Generation (RAG): retrieving relevant passages from your own data with semantic search and feeding them into an LLM alongside the prompt. As an Entrepreneur in Residence at Open Strategy Partners, I’m developing such tools to amplify the value our B2B tech customers derive from our strategic marketing collaborations.

I chose Upsun to develop our applications for five key reasons that I believe are relevant to anyone working with RAG. Here’s a detailed look at each, roughly in order of importance:

1. Managed vector databases out of the box

RAG relies on search techniques that involve LLMs in preparing the search index. This calls for a specialized database capable of semantic querying rather than only traditional exact-match SQL querying. For instance, I can query a vector database with "A rose by any other name would smell as sweet" and receive results that are semantically similar, even if they don't contain the exact words.
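To make the idea of semantic querying concrete, here is a minimal sketch of how a vector search ranks results by cosine similarity. The three-dimensional vectors and the example texts are made up for illustration; a real embedding model returns vectors with hundreds or thousands of dimensions, and a managed vector database performs this ranking for you.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_search(query_vector, index, top_k=2):
    """Return the top_k (text, score) pairs most similar to the query vector."""
    scored = [(text, cosine_similarity(query_vector, vec)) for text, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Made-up toy "embeddings" (assumption: real ones come from an embedding model).
toy_index = {
    "What we call a flower matters less than its scent": [0.9, 0.1, 0.2],
    "Quarterly revenue grew by four percent": [0.1, 0.9, 0.1],
    "Names do not change the nature of things": [0.8, 0.2, 0.3],
}

# Pretend this is the embedding of the rose quotation.
query = [0.85, 0.15, 0.25]
results = semantic_search(query, toy_index)
```

None of the stored texts shares the word "rose" with the query, yet the two texts about names and flowers rank above the unrelated revenue sentence because their vectors point in a similar direction.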

Upsun offers several managed database options that cater to this need.

These options provide robust backends for modern RAG applications, eliminating the need to procure additional third-party services. Keeping the data and query engine within the same network as the application not only streamlines operations but also enhances performance.

2. Efficient cloning for testing (e.g., chunking)

RAG pre-processes texts by chunking them, that is, breaking them into similarly sized pieces that fit within the input limits of the embedding model. The chunking method significantly affects the quality of a RAG application's results, which makes the choice of chunking algorithm crucial.
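As an illustration, here is a minimal sketch of one common chunking strategy: fixed-size windows with a small overlap so that sentences cut at a boundary still appear whole in one chunk. The function name and default sizes are hypothetical, and counting words is only a rough stand-in for the token limits an embedding model actually enforces.

```python
def chunk_words(text, max_words=200, overlap=20):
    """Split text into chunks of at most max_words words.

    Consecutive chunks share `overlap` words. Word counts are an
    approximation (assumption); real limits are measured in tokens.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


sample = " ".join(str(i) for i in range(10))
small_chunks = chunk_words(sample, max_words=4, overlap=1)
```

Changing `max_words`, `overlap`, or the splitting rule (sentences or paragraphs instead of words) yields a different chunker, and each variant is a candidate worth testing against the same source texts.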

Upsun simplifies this process. I keep a main environment whose database contains all the unchunked texts, then create multiple branches to test different chunking algorithms. Since Upsun clones data from the parent environment when creating a new branch, I end up with identical copies of my application and data, each running independently with its own testable URL. This setup enables direct comparison of different chunking strategies. Once I have identified the best chunker, I merge it into the main environment and delete the branches I no longer need. I haven't found another system that offers such an efficient workflow for this process.

3. Cost-effective embeddings sharing across environments

After chunking the texts, the next step is to create embeddings by sending these chunks to an embedding model, which converts each one into a vector that represents the meaning of the text. Each chunk processed incurs an API call, which translates to time, computational resources, and costs. These embeddings are valuable assets, and regenerating them unnecessarily would be wasteful.

With Upsun, embeddings can be generated once in a parent environment and shared with all developers by synchronizing that parent environment with their individual development environments. This approach means the same embeddings are never paid for twice, which saves costs and keeps the data consistent across the team. It also eliminates the need for complex DevOps setups, as synchronization is achieved with a single command on Upsun.
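On the application side, one way to benefit from shared embeddings is to key each stored embedding by a hash of its chunk text, so that anything already present (for example, after a sync from the parent environment) is reused rather than requested again. The sketch below is an assumption about how such a cache could look, not a description of Upsun itself: `EmbeddingCache` and `fake_embed` are hypothetical names, the dictionary stands in for a database table, and `fake_embed` replaces a paid API call.

```python
import hashlib


class EmbeddingCache:
    """Reuse embeddings for chunks that have already been processed."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn
        self._store = {}  # stand-in for a database table (assumption)
        self.api_calls = 0

    def get(self, chunk):
        # Identical chunk text always maps to the same key.
        key = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = self._embed_fn(chunk)
            self.api_calls += 1
        return self._store[key]


def fake_embed(chunk):
    """Hypothetical stand-in for an embedding API; returns a tiny vector."""
    return [len(chunk), sum(map(ord, chunk)) % 97]


cache = EmbeddingCache(fake_embed)
for chunk in ["alpha", "beta", "alpha"]:
    cache.get(chunk)
```

After the loop, `cache.api_calls` is 2 rather than 3, because the second request for "alpha" is served from the store.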

4. Secure management of API keys and secrets

Interacting with LLMs necessitates careful handling of API keys and other sensitive secrets. Mismanagement can lead to API keys leaking into public repositories like GitHub, accidental invalidation of production keys, or unauthorized usage leading to unpredictable costs.

Upsun addresses these challenges with a robust system for storing and managing secret keys. It allows for:

Environment-Specific Keys: Different keys can be reserved exclusively for production, testing, or specific projects.
Access Control: Keys can be restricted to certain environments or projects, enhancing security.
Expense Monitoring: By controlling API key usage, it's easier to monitor and manage associated costs.

For example, I recently used Upsun to provide OpenAI keys to an intern, ensuring they could only perform specific types of operations while keeping a close eye on API call expenses.
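In application code, the practical consequence is that keys are read from the environment at runtime and never written into the repository. Here is a minimal sketch, assuming the platform exposes the secret as an environment variable; the variable name `OPENAI_API_KEY` and the helper `load_api_key` are illustrative assumptions.

```python
import os


def load_api_key(name="OPENAI_API_KEY", environ=None):
    """Return the API key stored in the given environment variable.

    Raises RuntimeError when the variable is missing or blank, so a
    misconfigured environment fails at startup instead of mid-request.
    """
    environ = os.environ if environ is None else environ
    key = environ.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; define it as an environment variable "
            "instead of committing it to the repository"
        )
    return key


# Usage with an explicit mapping; in production the default os.environ is used.
demo_key = load_api_key(environ={"OPENAI_API_KEY": " sk-example "})
```

Because each environment can hold a different value under the same name, the same code runs unchanged in production and in a development branch while using different keys.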

5. Modern development tools and secure builds

Upsun isn't just about databases and environment management; it also excels in supporting modern application development practices. In our setup, we utilize:

Django (Python): For building robust web applications.
FastAPI: As our REST framework for efficient API development.
Celery: For background workers handling long-running tasks, ensuring the main Django app remains responsive.

Key Advantages of Upsun:

Powerful Request Routing: Facilitates the design of custom REST APIs tailored to our needs.
Deterministic and Immutable Builds: Upsun builds the codebase deterministically from requirements.txt and packages the code into immutable images. This ensures that our application remains secure from unauthorized changes and is protected against a range of potential cyber threats that exploit writable file systems.

For more information, I’ve written a series of articles about developing Django projects on Upsun.

Conclusion

Choosing the right platform is crucial for the success of RAG applications. Upsun stands out by offering comprehensive managed services, efficient testing workflows, cost-effective embedding management, secure secret handling, and support for modern development practices. These features collectively make Upsun an ideal choice for building scalable, secure, and high-performance RAG applications. If you're venturing into the world of Retrieval-Augmented Generation, Upsun is certainly a platform worth considering.

About the author

Robert Douglass, a former member of the Platform.sh team, helps product teams bring their greatest innovations to life. He is currently building applications that amplify the value of strategic marketing assets for B2B tech companies as Entrepreneur in Residence at Open Strategy Partners GmbH.