In this blog post, I'll share my journey of building a React.dev AI assistant that retrieves information from the react.dev website and answers prompts provided by the user. The main goal is to use the information available on the react.dev website as the assistant's knowledge base.
Tech Stack
Next.js: A React-based framework for server-side rendering and routing.
OpenAI GPT-4o: The generative AI LLM that powers the assistant, accessed via the OpenAI API.
Vercel AI SDK: A TypeScript toolkit designed to help developers build AI-powered applications; see their documentation here for more details.
Supabase: A Postgres database used to store vector embeddings via the pgvector extension.
Tailwind CSS: A utility-first CSS framework that provides flexibility while keeping the UI clean and customisable.
Drizzle ORM: A TypeScript ORM that provides utilities to interact with the database and run vector similarity searches against the pgvector store.
Prerequisites:
This project requires the following prerequisites to set up.
Supabase database with the pgvector extension enabled - this will be used to store vector embeddings of the website content.
OpenAI API key - if you don't have an OpenAI API key, get one from here.
Please follow this link to enable the pgvector extension for the database provisioned in Supabase.
Project setup
This project uses Next.js, Supabase, and Tailwind CSS for the frontend UI, and the Vercel AI SDK for the backend chat integration with the OpenAI API. It uses cheerio to scrape web pages and stores the embeddings in a Supabase pgvector database.
Step 1: scrape the website content using the cheerio plugin. The following code takes a URL as input and returns the document content scraped from the website.
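As a rough sketch (not necessarily the author's exact code), a cheerio-based scraper might look like the following. It assumes the article text lives inside react.dev's main element, that a global fetch is available (Node 18+ / Next.js), and the scrapePage helper name is hypothetical.

import * as cheerio from 'cheerio';

// Hypothetical helper: fetches a react.dev page and returns its readable text content.
export async function scrapePage(url: string): Promise<{ url: string; content: string }> {
  const response = await fetch(url);
  const html = await response.text();
  const $ = cheerio.load(html);

  // Remove elements that would add noise to the knowledge base.
  $('script, style, nav, footer').remove();

  // react.dev renders article text inside <main>; fall back to the whole body.
  const content = ($('main').text() || $('body').text())
    .replace(/\s+/g, ' ')
    .trim();

  return { url, content };
}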
Once the knowledge base is built from the react.dev links, the document content is chunked, embedded, and stored in a Supabase table, as sketched below.
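The storage step might look roughly like this. It assumes a Drizzle table named embeddings with resource, content, and embedding (pgvector) columns, uses LangChain's RecursiveCharacterTextSplitter to chunk the scraped text (the exact splitter and import path depend on your LangChain version), and embeds the chunks with the AI SDK's embedMany helper; the table schema and import paths are assumptions, not the project's actual code.

import { openai } from '@ai-sdk/openai';
import { embedMany } from 'ai';
import { RecursiveCharacterTextSplitter } from '@langchain/textsplitters';
import { db } from '@/lib/db';                // assumed Drizzle client
import { embeddings } from '@/lib/db/schema'; // assumed table: resource, content, embedding (vector)

// Hypothetical helper: chunks a scraped document, embeds the chunks, and stores them in pgvector.
export async function storeDocument(url: string, content: string) {
  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 1000,
    chunkOverlap: 200,
  });
  const chunks = await splitter.splitText(content);

  // Embed all chunks in a single call.
  const { embeddings: vectors } = await embedMany({
    model: openai.embedding('text-embedding-3-small'),
    values: chunks,
  });

  await db.insert(embeddings).values(
    chunks.map((chunk, i) => ({
      resource: url,
      content: chunk,
      embedding: vectors[i],
    })),
  );
}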
With the embeddings in place, the chat API can be implemented with the AI SDK, based on their documentation available here. The following is sample code to stream text from the server:
import { openai } from '@ai-sdk/openai';
import { streamText, tool } from 'ai';
import { z } from 'zod';
// findRelevantContent is the project's own vector-search helper; adjust the import path to wherever it is defined.
import { findRelevantContent } from '@/lib/ai/embedding';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: openai('gpt-4o'),
    messages,
    system: `You are a helpful assistant. Check your knowledge base before answering any questions.
    Only respond to questions using information from tool calls.
    If no relevant information is found in the tool calls, respond, "Sorry, I don't know."`,
    tools: {
      getInformation: tool({
        description: `get information from your knowledge base to answer questions.`,
        parameters: z.object({
          question: z.string().describe("the user's question"),
        }),
        execute: async ({ question }) => findRelevantContent(question),
      }),
    },
  });

  return result.toDataStreamResponse();
}
With the above prompt, the model always checks the knowledge base and triggers the getInformation tool, which in turn executes the findRelevantContent function. This function queries the vector database using cosine similarity to find matches and returns the relevant context to the LLM. The LLM then uses that knowledge base information to generate a response to the user's query.
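For reference, a findRelevantContent implementation along these lines could use Drizzle ORM's pgvector helpers as sketched below. The db client, the embeddings table, the similarity threshold, and the embedding model are assumptions for illustration, not the author's exact implementation.

import { openai } from '@ai-sdk/openai';
import { embed } from 'ai';
import { cosineDistance, desc, gt, sql } from 'drizzle-orm';
import { db } from '@/lib/db';                // assumed Drizzle client
import { embeddings } from '@/lib/db/schema'; // assumed table with content + embedding columns

export async function findRelevantContent(question: string) {
  // Embed the question with the same model used for the stored documents.
  const { embedding } = await embed({
    model: openai.embedding('text-embedding-3-small'),
    value: question,
  });

  // Cosine similarity = 1 - cosine distance; keep only reasonably close matches.
  const similarity = sql<number>`1 - (${cosineDistance(embeddings.embedding, embedding)})`;

  return db
    .select({ content: embeddings.content, similarity })
    .from(embeddings)
    .where(gt(similarity, 0.5))
    .orderBy(desc(similarity))
    .limit(4);
}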
The AI SDK includes AI SDK UI, a framework-agnostic toolkit that streamlines the integration of advanced AI functionality into your applications. It contains the useChat hook, which makes it easy to wire up the chat API created earlier.
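A minimal chat component built on useChat could look like the sketch below. The import path (@ai-sdk/react vs. ai/react) depends on your AI SDK version, and maxSteps is set so the model can produce a final answer after the getInformation tool call returns; the styling is just placeholder Tailwind classes.

'use client';

// Import path may be '@ai-sdk/react' or 'ai/react' depending on the AI SDK version.
import { useChat } from '@ai-sdk/react';

export default function Chat() {
  // useChat posts to /api/chat by default and streams the assistant's reply.
  const { messages, input, handleInputChange, handleSubmit } = useChat({ maxSteps: 3 });

  return (
    <div className="mx-auto flex max-w-xl flex-col gap-2 p-4">
      {messages.map((m) => (
        <div key={m.id} className="whitespace-pre-wrap">
          <strong>{m.role === 'user' ? 'You: ' : 'AI: '}</strong>
          {m.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input
          className="w-full rounded border p-2"
          value={input}
          placeholder="Ask something about React..."
          onChange={handleInputChange}
        />
      </form>
    </div>
  );
}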
Once the UI is implemented, it can be tested with any question from the react.dev website.
Conclusion
The AI SDK has multiple features for integrating LLMs directly into any React application without much hassle. It provides utilities for both the frontend and the backend, which makes it unique and easy to integrate.
In this guide, I have covered the process of implementing RAG with the AI SDK, which can be used to build a knowledge base from any website.
A fully functional example of this project is included at the end of this article.
React chatbot with RAG on the React docs from the react.dev website
React Docs Generative AI chatbot
This is a generative AI chatbot that can answer questions about the React documentation using a RAG (retrieval-augmented generation) pipeline.
It is built with Next.js, the AI SDK, cheerio for web scraping, and LangChain to split the text into sentences. It uses Drizzle ORM to store the embeddings of the scraped data and uses those embeddings to generate answers to the questions asked by the user.
Getting Started
First, run the development server:
npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev