
Creating a Simple RAG in Python with AzureOpenAI and LlamaIndex

Published at: 10/24/2024
Categories: llamaindex, rag, azure, ai
Author: snikidev

During a recent stream, I explored the process of ingesting a PDF using LlamaIndex and AzureOpenAI. This blog post will guide you through the steps to accomplish this task.

The objective was straightforward: answer questions based on information contained in PDF files stored in a folder. Here's the process we'll follow:

  1. Embed the PDF data: Convert PDFs into a format comprehensible for AI
  2. Initialise AzureOpenAI and provide it with the embedded data
  3. Ask a question and analyse the response

While the LlamaIndex documentation provides an excellent guide, I've included some visual aids to enhance clarity.

Setting Up the Environment

First, let's install the necessary packages:

pip install python-dotenv llama-index llama-index-llms-azure-openai llama-index-embeddings-azure-openai

Note that AzureOpenAI is not included in the llama-index package and must be installed separately.
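The configuration snippets below read credentials from environment variables. With `python-dotenv` installed, these can live in a `.env` file at the project root (the values here are placeholders; use your own resource's key and endpoints):

```
AZURE_OPENAI_API_KEY=<your-api-key>
AZURE_OPENAI_ENDPOINT=https://<your-resource>.openai.azure.com/
AZURE_OPENAI_EMBEDDING_ENDPOINT=https://<your-embedding-resource>.openai.azure.com/
```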

Configuring AzureOpenAI

Navigate to Azure AI Studio and deploy your chosen model.

Deploying LLM Model

Once deployed, configure LlamaIndex to use it with the following settings:

import os

from dotenv import load_dotenv
from llama_index.core import Settings
from llama_index.llms.azure_openai import AzureOpenAI

load_dotenv()

Settings.llm = AzureOpenAI(
    engine="gpt-4o-mini",
    api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    api_version="2024-05-01-preview",
)

But there's a catch.

Unlike standard OpenAI resources, an Azure OpenAI chat deployment can't generate embeddings on its own: embeddings require a dedicated embedding model.

This means we need to deploy a separate embedding model. Let's do that.
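To see why embeddings matter: "embedding" turns each chunk of text into a vector of numbers, and retrieval then compares the query's vector against the chunks' vectors, typically by cosine similarity. A minimal illustration in plain Python (toy 3-dimensional vectors; real embedding models produce hundreds of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" -- in practice these come from the embedding model.
query = [0.9, 0.1, 0.0]
chunk_about_gambling = [0.8, 0.2, 0.1]
chunk_about_weather = [0.0, 0.1, 0.9]

print(cosine_similarity(query, chunk_about_gambling))  # close to 1.0
print(cosine_similarity(query, chunk_about_weather))   # close to 0.0
```

The query engine we build later does essentially this comparison (at scale, over every chunk of every PDF) to pick which passages to hand to the LLM.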

Deploy an embedding model that starts with text-embedding-*, since in this case we're working exclusively with text.

Deploying Embedding Model

Configure LlamaIndex to use the embedding model:

from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding

Settings.embed_model = AzureOpenAIEmbedding(
    model="text-embedding-3-small",
    deployment_name="text-embedding-3-small",
    api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
    azure_endpoint=os.environ.get("AZURE_OPENAI_EMBEDDING_ENDPOINT"),
    api_version="2023-05-15",
)

Implementing the RAG System

With the setup complete, place your PDF files in a data folder, load them, and query the model:

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

def main():
    # Load every file in ./data and build an in-memory vector index
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    query_engine = index.as_query_engine()
    response = query_engine.query("Can we advertise online gambling to under 18 year olds?")
    print(response)

if __name__ == "__main__":
    main()

To run the script in our virtual environment:

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python ./src/main.py
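The `pip install -r requirements.txt` step assumes a `requirements.txt` listing the same packages we installed earlier, for example:

```
python-dotenv
llama-index
llama-index-llms-azure-openai
llama-index-embeddings-azure-openai
```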

And we get the response:

No, advertising online gambling to under 18 year olds is not permitted. Marketing communications must not exploit the vulnerabilities of this age group, and any advertisements that feature under-18s or are directed at them are likely to be considered irresponsible and in breach of established rules.

Future Improvements

We can improve this by embedding the data once and storing it in a database such as Azure Cosmos DB, which offers a MongoDB-compatible API. That removes the repeated embedding overhead on every run and lets the LLM query pre-embedded data straight from the database. I'll cover this in future posts.
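Even without a database, we can get the same "embed once, query many times" benefit today: LlamaIndex can persist a built index to disk and reload it on later runs. A minimal sketch of that pattern (the `./storage` directory name is arbitrary):

```
import os

from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

PERSIST_DIR = "./storage"

if not os.path.exists(PERSIST_DIR):
    # First run: embed the documents and save the index to disk
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    # Later runs: reload the pre-built index, no re-embedding needed
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)

print(index.as_query_engine().query("Can we advertise online gambling to under 18 year olds?"))
```

Queries still call the LLM, but the expensive embedding of the PDFs happens only on the first run.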

And here is the repo with this and any future code.
