Scalable RAG with GKE and Qdrant
Have you ever struggled to locate that perfect piece of code you wrote months ago? In this article, I show how to build an LLM application with LlamaIndex and Qdrant that lets you query your GitHub repositories, making it easier to find forgotten code snippets. We’ll deploy the application on Google Kubernetes Engine (GKE) with Docker and FastAPI, and provide an intuitive Streamlit UI for sending queries.
This article was featured in the GKE Newsletter (This week in GKE, Issue #19, 12 July 2024).
LLM App with AWS Lambda and Qdrant
In this post, I explain how to build a serverless application that performs semantic search over academic papers using AWS Lambda and Qdrant. I used LangChain and OpenAI’s embeddings to create vector representations of document chunks and store them in Qdrant. A simple shell script builds the Docker image, pushes it to AWS ECR, and deploys it as an AWS Lambda function. After testing the Lambda function, I created an API Gateway endpoint and built a Streamlit application to interact with it.
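As a rough illustration of the chunking step mentioned above, here is a minimal sketch in plain Python. The post itself uses LangChain for this; the helper `chunk_text`, its parameters, and the use of overlapping character windows are assumptions made for illustration, not the article's code or LangChain's API.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: the overlap keeps a sentence that straddles a
    boundary visible in both neighbouring chunks before they are embedded.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

Each returned chunk would then be passed to an embedding model and stored as one vector; that part depends on the embedding provider and the Qdrant client and is left out here.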
Articles
List of published articles
Multimodal LLM with Qdrant and Gemini
Are you feeling hungry and craving your favorite recipes? Imagine having a YouTube playlist filled with your top recipe videos, complete with image frames and detailed descriptions. In this article, I guide you through extracting videos and their descriptions from a playlist, capturing video frames as images, and storing everything in a Qdrant cluster. You’ll have two separate collections: one for text and one for images. It’s a perfect blend of technology and culinary delight!
RAG App with AWS CDK, Qdrant and LlamaIndex
Infrastructure as Code (IaC) is a modern technique for managing and provisioning infrastructure resources through code. Rather than manually configuring these resources, you specify them in machine-readable configuration files. AWS offers two IaC tools: AWS CloudFormation and AWS CDK. CloudFormation provisions AWS resources using templates written in JSON or YAML, while the AWS CDK allows you to provision resources using familiar programming languages like Python. The CDK acts as an abstraction layer that simplifies the creation of CloudFormation templates.
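To make the idea of code producing a CloudFormation template concrete, here is a toy sketch using only the Python standard library. It is not the AWS CDK and does not use its API; the function `bucket_template` and the logical ID `DocumentsBucket` are hypothetical names chosen for illustration. It only shows the general pattern of ordinary code emitting the JSON template that CloudFormation consumes.

```python
import json


def bucket_template(logical_id: str, versioned: bool = False) -> dict:
    """Build a minimal CloudFormation template declaring one S3 bucket.

    Hypothetical helper: a real CDK app would declare constructs and let
    the CDK synthesize a (much richer) template on your behalf.
    """
    resource = {"Type": "AWS::S3::Bucket"}
    if versioned:
        # Optional property, added only when requested.
        resource["Properties"] = {
            "VersioningConfiguration": {"Status": "Enabled"}
        }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {logical_id: resource},
    }


if __name__ == "__main__":
    # Print the template as JSON, the form CloudFormation accepts.
    print(json.dumps(bucket_template("DocumentsBucket", versioned=True), indent=2))
```

The benefit of the programmatic route is that loops, conditionals, and functions replace hand-edited JSON or YAML, which is the convenience the CDK provides at a much larger scale.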
Agentic RAG Using Claude, LlamaIndex, and Milvus
As AI systems continue to evolve rapidly, relying solely on large language models (LLMs) is no longer sufficient to meet the diverse needs of today’s industries. Meeting these needs requires more complex architectures that can solve problems more efficiently and effectively. At the Unstructured Data Meetup hosted by Zilliz, Bill Zhang, Director of Engineering at Zilliz, introduced the concept of Compound AI Systems, which was featured in the Berkeley AI Research (BAIR) blog. This modular approach integrates multiple components to handle various tasks rather than relying on a single AI model, delivering more tailored and efficient results. You can watch Bill’s presentation on the Zilliz YouTube channel.