
Integrating LangChain with FastAPI for Asynchronous Streaming

Published: 12/12/2024
Categories: langchain, fastapi, llm
Author: louis-sanna

LangChain and FastAPI working in tandem provide a strong setup for the asynchronous streaming endpoints that LLM-integrated applications need. Modern chat applications live or die by how effectively they handle live data streams and how quickly they can respond.

Introduction to LangChain

LangChain is a library that simplifies the incorporation of language models into applications. It provides an abstracted layer over various components such as large language models (LLMs), data retrievers, and vector storage solutions. This abstraction allows developers to integrate and switch between different backend providers or technologies seamlessly.
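For example, because LangChain chat models share a common interface, swapping providers is typically a one-line change. A minimal sketch, assuming the langchain-openai and langchain-anthropic integration packages are installed and the corresponding API keys are set (the model names here are illustrative):

from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

# Both classes expose the same interface (invoke, stream, astream_log, ...),
# so the calling code stays unchanged when the backend changes.
chat = ChatOpenAI(model="gpt-4o-mini")
# chat = ChatAnthropic(model="claude-3-5-sonnet-20240620")  # drop-in swap

reply = chat.invoke("Who are you?")
print(reply.content)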

Introduction to FastAPI

FastAPI is a modern, high-performance web framework for building APIs with Python 3.7+ based on standard Python type hints. It is designed for creating RESTful APIs quickly and efficiently, with automatic interactive API documentation provided by Swagger UI and ReDoc.

Combining LangChain with FastAPI

By combining LangChain with FastAPI, developers can create robust, asynchronous streaming APIs that handle real-time data efficiently. This integration is particularly useful for applications that require live updates, such as chat applications or real-time analytics dashboards.

Setting Up the FastAPI Project

First, install the necessary packages:

pip install fastapi langchain langchain-openai pydantic uvicorn

Next, define the FastAPI router and Pydantic models to structure and validate incoming messages.

from fastapi import FastAPI, APIRouter
from pydantic import BaseModel
from typing import List

app = FastAPI()
router = APIRouter()

class Message(BaseModel):
    role: str      # e.g. "user", "assistant", or "system"
    content: str   # the message text

class ChatPayload(BaseModel):
    messages: List[Message]

    # Pydantic v2 config; with Pydantic v1, use `class Config: schema_extra = {...}` instead
    model_config = {
        "json_schema_extra": {
            "example": {
                "messages": [{"role": "user", "content": "Who are you?"}]
            }
        }
    }

Creating the Streaming API Endpoint

Create an endpoint to receive chat messages and stream responses back to the client using LangChain. This is done by emitting server-sent events (SSE).

from fastapi import Request
from fastapi.responses import StreamingResponse
from langchain_openai import ChatOpenAI
import json

@router.post("/api/completion")
async def stream(request: Request, payload: ChatPayload):
    chat = ChatOpenAI()  # reads OPENAI_API_KEY from the environment
    # Convert the Pydantic models into (role, content) tuples, a message
    # format LangChain chat models accept as input.
    messages = [(m.role, m.content) for m in payload.messages]
    return StreamingResponse(
        send_completion_events(messages, chat=chat),
        media_type="text/event-stream",
    )

async def send_completion_events(messages, chat):
    async for patch in chat.astream_log(messages):
        for op in patch.ops:
            # Each new token arrives as an "add" operation on /streamed_output/-
            if op["op"] == "add" and op["path"] == "/streamed_output/-":
                content = op["value"] if isinstance(op["value"], str) else op["value"].content
                json_dict = {"type": "llm_chunk", "content": content}
                json_str = json.dumps(json_dict)
                yield f"data: {json_str}\n\n"  # SSE framing: "data: ..." plus a blank line

app.include_router(router)
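With this in place, each chunk the model produces reaches the client as one server-sent event. The exact chunking depends on the model, but the wire format looks like this:

data: {"type": "llm_chunk", "content": "Hello"}

data: {"type": "llm_chunk", "content": " there"}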

Running the FastAPI Application

Run the FastAPI application using Uvicorn, an ASGI server implementation for Python (this assumes the code above lives in main.py):

uvicorn main:app --reload

Navigate to http://127.0.0.1:8000/docs to see the interactive API documentation generated by FastAPI. This documentation provides an easy way to test the API endpoints.
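Note that the interactive docs may not render the response token by token. To watch the events arrive in real time, you can call the endpoint directly, for example with curl (-N disables output buffering; the payload mirrors the ChatPayload example above):

curl -N -X POST http://127.0.0.1:8000/api/completion \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Who are you?"}]}'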

Why JSON Patch?

LangChain's astream_log method uses JSON Patch to stream events, which is why understanding JSON Patch is essential for implementing this integration effectively. JSON Patch provides an efficient way to update parts of a JSON document incrementally without needing to send the entire document. This is particularly useful in real-time applications where data needs to be updated frequently and incrementally.
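To see these patches first-hand, you can print the operations astream_log emits outside of FastAPI. A minimal sketch, assuming OPENAI_API_KEY is set:

import asyncio
from langchain_openai import ChatOpenAI

async def main():
    chat = ChatOpenAI()
    # Each iteration yields a RunLogPatch holding a list of JSON Patch operations.
    async for patch in chat.astream_log("Say hello"):
        for op in patch.ops:
            print(op["op"], op["path"])

asyncio.run(main())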

Brief Overview of JSON Patch

JSON Patch (RFC 6902) defines six operation types:

  • Add: Inserts a new value into the JSON document at the specified path.
  • Remove: Removes the value at the specified path.
  • Replace: Replaces the value at the specified path with a new value.
  • Move: Moves a value from one path to another in the document.
  • Copy: Copies a value from one path to another.
  • Test: Tests that the value at a specified path matches a specified value.

Consider the original document:

{
  "baz": "qux",
  "foo": "bar"
}

Applying the patch:

[
  { "op": "replace", "path": "/baz", "value": "boo" },
  { "op": "add", "path": "/hello", "value": ["world"] },
  { "op": "remove", "path": "/foo" }
]

Results in:

{
  "baz": "boo",
  "hello": ["world"]
}

JSON Patch allows for efficient, incremental updates, making it ideal for applications that require frequent or real-time updates to their data.
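If you want to apply such patches yourself, for instance to mirror the run state on a client, the third-party jsonpatch package implements RFC 6902. A minimal sketch reproducing the example above:

import jsonpatch

doc = {"baz": "qux", "foo": "bar"}
patch = jsonpatch.JsonPatch([
    {"op": "replace", "path": "/baz", "value": "boo"},
    {"op": "add", "path": "/hello", "value": ["world"]},
    {"op": "remove", "path": "/foo"},
])
print(patch.apply(doc))  # {'baz': 'boo', 'hello': ['world']}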

Conclusion

By integrating LangChain with FastAPI, developers can build efficient asynchronous streaming APIs capable of handling real-time data. This setup is ideal for applications like chatbots, where timely responses and data processing are crucial. FastAPI's ease of use and LangChain's abstraction capabilities, combined with the efficiency of JSON Patch, make this combination a powerful tool for modern web development.

Want to learn more about building responsive LLM applications? Check out my course on newline: Responsive LLM Applications with Server-Sent Events

I cover:

  • How to design systems for AI applications
  • How to stream the answer of a Large Language Model
  • Differences between Server-Sent Events and WebSockets
  • Importance of real-time for GenAI UI
  • How asynchronous programming in Python works
  • How to integrate LangChain with FastAPI
  • What problems Retrieval Augmented Generation can solve
  • How to create an AI agent ... and much more.

Worth checking out if you want to build your own LLM applications: the course provides extensive code examples and helps you go from concept to deployment.
