a diagram of a RAG application deployed on GKE
Scalable RAG with GKE and Qdrant

Have you ever struggled to locate that perfect piece of code you wrote months ago? In this article, I show you how to build an LLM application with LlamaIndex and Qdrant that lets you query your GitHub repositories, making it easier to find forgotten code snippets. We’ll deploy the application on Google Kubernetes Engine (GKE) with Docker and FastAPI and provide an intuitive Streamlit UI for sending queries.

This article was featured in the GKE Newsletter (This week in GKE, ISSUE#19, 12 July 2024).

a diagram of a RAG application deployed on AWS
LLM App with AWS Lambda and Qdrant

In this post, I explain how to build a serverless application that performs semantic search over academic papers using AWS Lambda and Qdrant. I used LangChain and OpenAI’s embeddings to create vector representations of document chunks and store them in Qdrant. A simple shell script builds the Docker image, pushes it to AWS ECR, and deploys it as an AWS Lambda function. After testing the Lambda function, I created an API Gateway endpoint and built a Streamlit application to interact with it.
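The retrieval step behind this kind of semantic search can be sketched in plain Python. This is only an illustration under stated assumptions: the three-dimensional vectors are made up and stand in for OpenAI embeddings, the in-memory list stands in for a Qdrant collection, and the names `cosine_similarity`, `search`, and `CHUNKS` are hypothetical rather than taken from the article.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vector, chunks, top_k=2):
    """Return the top_k (score, text) pairs, most similar first."""
    scored = [
        (cosine_similarity(query_vector, chunk["vector"]), chunk["text"])
        for chunk in chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy stand-in for a vector collection: hand-written vectors, not real embeddings.
CHUNKS = [
    {"text": "Transformers use self-attention.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Gradient descent minimizes a loss.", "vector": [0.1, 0.9, 0.0]},
    {"text": "Attention weights sum to one.", "vector": [0.8, 0.2, 0.1]},
]

if __name__ == "__main__":
    # A made-up query vector pointing along the first axis.
    for score, text in search([1.0, 0.0, 0.0], CHUNKS):
        print(f"{score:.3f}  {text}")
```

In a real deployment the vector database performs this ranking at scale with an approximate nearest-neighbor index; the brute-force loop here only shows the idea.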

Articles

List of published articles

a photo of a multimodal RAG application
Multimodal LLM with Qdrant and Gemini

Are you feeling hungry and craving your favorite recipes? Imagine having a YouTube playlist filled with your top recipe videos, complete with image frames and detailed descriptions. In this article, I guide you through the process of extracting videos and descriptions from a playlist, capturing images as frames, and storing everything in a Qdrant cluster. You’ll have two separate collections: one for text and one for images. It’s a perfect blend of technology and culinary delight!

RAG App with AWS CDK, Qdrant and LlamaIndex

Infrastructure as Code (IaC) is a modern technique for managing and provisioning infrastructure resources through code. Rather than manually configuring these resources, you specify them in machine-readable configuration files. AWS offers two IaC tools: AWS CloudFormation and AWS CDK. CloudFormation provisions AWS resources using templates written in JSON or YAML, while the AWS CDK allows you to provision resources using familiar programming languages like Python. The CDK acts as an abstraction layer that simplifies the creation of CloudFormation templates.
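To make the "template generated from code" idea concrete, here is a minimal sketch that uses only the Python standard library. It is not the AWS CDK API: the helper `bucket_template` and the names `DocsBucket` and `my-example-docs-bucket` are hypothetical, and the real CDK adds constructs, sensible defaults, and deployment tooling on top of this basic idea.

```python
import json


def bucket_template(logical_id, bucket_name):
    """Build a minimal CloudFormation template (as a dict) declaring one S3 bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            # The logical ID is the name CloudFormation uses inside the template.
            logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {"BucketName": bucket_name},
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical names, chosen only for illustration.
    template = bucket_template("DocsBucket", "my-example-docs-bucket")
    print(json.dumps(template, indent=2))
```

The printed JSON is the kind of machine-readable configuration file the paragraph above describes; writing it in Python means loops, functions, and tests can be used to produce it.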

Agentic RAG Using Claude, LlamaIndex, and Milvus

As AI systems continue to evolve rapidly, relying solely on large language models (LLMs) is no longer sufficient to meet the diverse needs of today’s industries. Meeting those needs requires more complex architectures that can solve problems more efficiently and effectively. At the Unstructured Data Meetup hosted by Zilliz, Bill Zhang, Director of Engineering at Zilliz, introduced the concept of Compound AI Systems, which was featured in the Berkeley AI Research (BAIR) blog. This modular approach integrates multiple components to handle various tasks rather than relying on a single AI model, delivering more tailored and efficient results. You can watch Bill’s presentation on the Zilliz YouTube channel.