
End to end LLMOps Pipeline - Part 2 - FastAPI

Published: 8/13/2024
Categories: llm, rag, fastapi, llmops
Author: lakhera2015

Welcome to Day 2. Yesterday, we explored Hugging Face, a leading platform in Natural Language Processing (NLP), which simplifies the process of building and deploying state-of-the-art machine learning models. Today, we will build on the code we wrote yesterday and integrate it with FastAPI.

✅ What is FastAPI?
FastAPI is designed to create robust, fast, and secure APIs with minimal effort. It leverages Python type hints to enable features like automatic generation of interactive API documentation, which is a significant advantage for both development and user experience. Whether you're a beginner or an experienced developer, FastAPI offers tools that streamline API development, from easy parameter validation to detailed error messages.

✅ Key Features:
✔️ Ease of Use: FastAPI simplifies API development by providing automatic interactive documentation via Swagger UI and ReDoc. This interactive interface not only makes it easier to understand and test your API but also enhances collaboration between developers and users.
✔️ Type Hints: FastAPI heavily relies on Python type hints, which improve code quality, readability, and development speed. Type hints also enable powerful IDE support, such as inline errors and code completion, making the coding process smoother and less error-prone.
✔️ Performance: Known for being one of the fastest Python frameworks, FastAPI achieves remarkable performance by using Starlette and Pydantic, which ensure that your web applications are both scalable and efficient.
✔️ Async Support: FastAPI natively supports asynchronous programming, making it ideal for building high-performance applications that can handle numerous simultaneous users. This is a crucial feature for modern, scalable web applications.
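
To make these features concrete, here is a minimal standalone sketch (separate from the chatbot we build below) showing how type hints and async support come together in a single endpoint:

from fastapi import FastAPI

app = FastAPI()

@app.get("/items/{item_id}")
async def read_item(item_id: int, verbose: bool = False):
    # item_id is parsed and validated from the URL path as an integer;
    # verbose is an optional query parameter with a default value.
    return {"item_id": item_id, "verbose": verbose}

Requesting /items/abc returns a descriptive validation error with no extra code, and the same type hints populate the interactive documentation at /docs (Swagger UI) and /redoc (ReDoc).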

✅ Getting Started with FastAPI
To begin using FastAPI, you need to set up a few prerequisites:

Prerequisites:

Python 3.8 or higher
FastAPI library
Uvicorn for running the server
Transformers library by Hugging Face
Pydantic for data validation

✅ Installation
You can install the necessary libraries using pip:

pip install fastapi uvicorn transformers pydantic


Note: While manual installation via pip is demonstrated here, it's recommended to create a requirements.txt file and install these dependencies through a GitHub Actions workflow for a more streamlined and reproducible setup. Also note that the Transformers pipeline needs a deep-learning backend such as PyTorch (pip install torch) if one isn't already installed.
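
As a rough illustration of that setup (the file names and workflow below are examples, not files from this series), a requirements.txt could simply list the packages:

fastapi
uvicorn
transformers
pydantic
torch

and a minimal GitHub Actions workflow, e.g. .github/workflows/install.yml, could install them on every push:

name: install-dependencies
on: [push]
jobs:
  install:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt

Pinning exact versions in requirements.txt makes the environment reproducible across local machines and CI.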

✅ Importing Necessary Libraries
Let's start by importing the required libraries:

from fastapi import FastAPI, HTTPException 
from pydantic import BaseModel 
from transformers import pipeline
import uvicorn

FastAPI and HTTPException from FastAPI are used to build the API and handle exceptions.
BaseModel from Pydantic is used to define request and response data models.
pipeline from Transformers initializes the Hugging Face question-answering model.
uvicorn is used to run the FastAPI server.

✅ Creating the FastAPI Application
The core of your API is initialized as follows:

app = FastAPI()

✅ Initializing the Question-Answering Pipeline
We use the Hugging Face Transformers library to initialize a question-answering model:

qa_pipeline = pipeline("question-answering", model="distilbert-base-uncased-distilled-squad")

This pipeline will handle the question-answering task by leveraging the distilbert-base-uncased-distilled-squad model from Hugging Face's model hub.
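
If you'd like to sanity-check the pipeline before wiring it into the API, you can call it directly; the question and context below are made-up examples. The pipeline returns a dictionary with the extracted answer plus a confidence score and character offsets:

# Quick standalone check of the pipeline (example inputs).
result = qa_pipeline(
    question="What does FastAPI use for data validation?",
    context="FastAPI uses Pydantic models to validate request and response data.",
)
print(result)  # dict with 'score', 'start', 'end' and 'answer' keys
print(result["answer"])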

✅ Defining Data Models
Pydantic models are defined to structure the request and response data:

class ChatRequest(BaseModel):
    question: str
    context: str

class ChatResponse(BaseModel):
    answer: str

ChatRequest expects two fields: question (the question to be answered) and context (the context in which to search for the answer).
ChatResponse contains a single field: answer, which holds the model's answer.
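
As a quick illustration (with made-up values), Pydantic enforces these fields before your endpoint logic ever runs, and FastAPI automatically turns a failed validation into a 422 response:

from pydantic import ValidationError

# A well-formed request body validates cleanly.
ok = ChatRequest(question="What is FastAPI?", context="FastAPI is a Python web framework.")

# A body missing the required 'context' field raises a ValidationError,
# which FastAPI reports back to the client as a 422 error.
try:
    ChatRequest(question="What is FastAPI?")
except ValidationError as exc:
    print(exc)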

✅ Creating the /chat Endpoint
Here's how to define an endpoint for the chat functionality:

@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    try:
        result = qa_pipeline(question=request.question, context=request.context)
        return ChatResponse(answer=result['answer'])
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

The @app.post("/chat") decorator creates a POST endpoint at /chat.
The chat function takes a ChatRequest object, uses the qa_pipeline to find the answer, and returns it as a ChatResponse.
If any error occurs, an HTTP 500 error is raised with the corresponding exception message.

✅ Running the Server
To start the FastAPI server, use the following script:

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
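
Assuming the code above is saved in a single file (main.py is used here only as an example name), you can start the server with python main.py, or with uvicorn main:app --reload during development. A short client sketch using the requests library (again with made-up question and context text) exercises the endpoint:

import requests

payload = {
    "question": "What does FastAPI use for data validation?",
    "context": "FastAPI uses Pydantic models to validate request and response data.",
}

# POST to the /chat endpoint and print the model's answer.
response = requests.post("http://localhost:8000/chat", json=payload)
response.raise_for_status()
print(response.json()["answer"])

While the server is running, you can also open http://localhost:8000/docs and try the endpoint interactively through the auto-generated Swagger UI.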

✅ Conclusion
FastAPI is an excellent choice for developers looking to build fast, modern, and efficient APIs with Python. Its native support for asynchronous programming, automatic documentation generation, and ease of use make it a standout framework. Whether you're building a small application or a large-scale project, FastAPI provides the tools and features needed to create a robust API effortlessly.


📚 If you'd like to learn more about this topic, please check out my book, Building an LLMOps Pipeline Using Hugging Face.
