Build a RAG Chatbot with OpenAI and LangChain

Published at
9/8/2024
Categories
llms
rag
chatbot
Author
abdulganiyy

Introduction

In this tutorial, we will build a custom chatbot that uses private data to answer users' questions about a specific domain. The model itself is not retrained; instead, relevant passages from our data are retrieved and supplied to it when a question is asked. This tutorial was inspired by the Scrimba course Build LLM Apps with JavaScript and OpenAI by Tom Chant. We will use a Scrimba FAQs document as our knowledge base, along with OpenAI's language model.

Key Concepts

AI (Artificial Intelligence)

Artificial Intelligence is the ability of a machine to simulate human-like intelligence and mimic human skills such as learning, problem-solving, and decision-making. This is achieved using various mechanisms and techniques, like neural networks, machine learning, deep learning, and data science.

LLM (Large Language Models)

Large Language Models are AI systems with a very large number of parameters, trained on vast amounts of text data. When provided with a prompt, input, or question, they attempt to generate the most probable and relevant response.

RAG (Retrieval Augmented Generation)

RAG stands for Retrieval Augmented Generation. While LLMs are trained on public data, they may lack knowledge about specific domains. RAG retrieves relevant passages from a knowledge base and passes them to the LLM together with the query, so the model can ground its answer in domain-specific information.

Embeddings

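Computers only understand numbers. An embedding is a numerical representation of a piece of data, such as a chunk of text, stored as an array of numbers (a vector) with a fixed number of dimensions. Pieces of text with similar meanings get embeddings that lie close together, which is what makes searching by meaning possible.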
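To make the idea concrete, here is a minimal sketch of how two embeddings can be compared. The three-dimensional vectors are made up for illustration (real OpenAI embeddings are much longer; the SQL later in this article assumes 1536 dimensions), and cosineSimilarity is a hypothetical helper written for this example, not part of any library.

```javascript
// Toy embeddings: hand-written 3-dimensional vectors, purely illustrative.
const embeddings = {
  cat: [0.9, 0.1, 0.0],
  kitten: [0.8, 0.2, 0.1],
  invoice: [0.0, 0.1, 0.9],
};

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Related words score close to 1, unrelated words score close to 0.
console.log(cosineSimilarity(embeddings.cat, embeddings.kitten));
console.log(cosineSimilarity(embeddings.cat, embeddings.invoice));
```

A score near 1 means the two vectors point in almost the same direction, which is the sense of "nearest match" used in the rest of this tutorial.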

Vector Store

A vector store is a database for storing vector representations of data, optimized for storing and querying high-dimensional vectors (embeddings).

Creating Our RAG Chatbot

To create our RAG chatbot, we will follow these steps:

  1. Set up a Vector Store on Supabase.
  2. Create an OpenAI account, buy some credits, and generate an API key.
  3. Generate embeddings and upload them to the Vector Store.
  4. Process user questions and generate responses.

In the following sections, we'll dive deeper into each of these steps and implement our chatbot using OpenAI, HTML, CSS, JavaScript, and LangChain.

Setting up Vector Store on Supabase

  1. Go to Supabase and set up an account.
  2. Create a new project.
  3. Go to the table editor on the left panel and create the table “documents.”
  4. Go to the query editor on the left panel and name the query “match_documents.”
  5. Paste the SQL query below into the query editor:
-- Enable the pgvector extension to work with embedding vectors
create extension vector;

-- Create a table to store your documents
create table documents (
  id bigserial primary key,
  content text, -- corresponds to Document.pageContent
  metadata jsonb, -- corresponds to Document.metadata
  embedding vector(1536) -- 1536 works for OpenAI embeddings, change if needed
);

-- Create a function to search for documents
create function match_documents (
  query_embedding vector(1536),
  match_count int DEFAULT null,
  filter jsonb DEFAULT '{}'
) returns table (
  id bigint,
  content text,
  metadata jsonb,
  embedding jsonb,
  similarity float
)
language plpgsql
as $$
#variable_conflict use_column
begin
  return query
  select
    id,
    content,
    metadata,
    (embedding::text)::jsonb as embedding,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where metadata @> filter
  order by documents.embedding <=> query_embedding
  limit match_count;
end;
$$;

This SQL script enables the pgvector extension, creates the documents table, and defines the match_documents function, which finds the nearest matches in the vector store when the user enters their input.
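To show what match_documents computes, here is a rough in-memory equivalent in plain JavaScript. It is only an illustration: the real work happens inside Postgres, where pgvector's <=> operator returns cosine distance. The helper names (cosineDistance, containsAll, matchDocuments) and the sample rows are made up for this sketch, and containsAll only compares top-level keys, which is a simplification of the jsonb @> operator.

```javascript
// Cosine distance, the quantity pgvector's <=> operator returns.
function cosineDistance(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] ** 2;
    normB += b[i] ** 2;
  }
  return 1 - dot / Math.sqrt(normA * normB);
}

// Simplified stand-in for "metadata @> filter": top-level keys only.
function containsAll(metadata, filter) {
  return Object.entries(filter).every(([key, value]) => metadata[key] === value);
}

// Mirrors the SQL: filter by metadata, order by distance, then apply the limit.
function matchDocuments(rows, queryEmbedding, matchCount = null, filter = {}) {
  const ranked = rows
    .filter((row) => containsAll(row.metadata, filter))
    .map((row) => {
      const distance = cosineDistance(row.embedding, queryEmbedding);
      return { ...row, similarity: 1 - distance };
    })
    .sort((a, b) => b.similarity - a.similarity); // nearest first
  return matchCount === null ? ranked : ranked.slice(0, matchCount);
}

// Made-up rows with 2-dimensional embeddings.
const rows = [
  { id: 1, content: "Pricing", metadata: { source: "faq" }, embedding: [1, 0] },
  { id: 2, content: "Refunds", metadata: { source: "faq" }, embedding: [0.6, 0.8] },
  { id: 3, content: "Blog post", metadata: { source: "blog" }, embedding: [0.9, 0.1] },
];

const top = matchDocuments(rows, [1, 0], 2, { source: "faq" });
console.log(top.map((row) => row.id)); // [ 1, 2 ]
```

Because similarity is defined as 1 minus the distance, ordering by ascending distance (as the SQL does) and ordering by descending similarity (as this sketch does) give the same ranking.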

Generating Embeddings and Uploading to Vector Store

[Figure: Information Store]

The illustration above shows the flow for generating embeddings from our document and uploading them to the Supabase vector store. The document is split into multiple chunks, an embedding is created for each chunk, and the chunks are stored together with their embeddings in the database. Below is the code to do that:

import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { createClient } from "@supabase/supabase-js";
import { SupabaseVectorStore } from "@langchain/community/vectorstores/supabase";
import { OpenAIEmbeddings } from "@langchain/openai";
import { promises as fs } from 'fs';

try {
    const text = await fs.readFile("scrimba-info.text", "utf-8");

    const splitter = new RecursiveCharacterTextSplitter({
        chunkSize: 500,
        separators: ['\n\n', '\n', ' ', ''],
        chunkOverlap: 50
    });

    const output = await splitter.createDocuments([text]);

    const sbApiKey = process.env.SUPABASE_API_KEY;
    const sbUrl = process.env.SUPABASE_URL_CHAT_BOT;
    const openAIApiKey = process.env.OPENAI_API_KEY;

    const embeddings = new OpenAIEmbeddings({ openAIApiKey });

    const client = createClient(sbUrl, sbApiKey);

    await SupabaseVectorStore.fromDocuments(output, embeddings, { client, tableName: 'documents' });

} catch (error) {
    console.log(error);
}
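
For intuition about chunkSize and chunkOverlap, here is a deliberately simplified fixed-width splitter. It is not how RecursiveCharacterTextSplitter works internally (that class also uses the listed separators so that chunks tend to break at paragraph, line, or word boundaries); splitWithOverlap is a hypothetical helper that only shows how consecutive chunks share text.

```javascript
// Simplified sketch: slide a window of chunkSize characters over the text,
// advancing by (chunkSize - chunkOverlap) so consecutive chunks overlap.
function splitWithOverlap(text, chunkSize, chunkOverlap) {
  if (chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be smaller than chunkSize");
  }
  const step = chunkSize - chunkOverlap;
  const chunks = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this window reached the end
  }
  return chunks;
}

const sampleChunks = splitWithOverlap("abcdefghijklmnopqrstuvwxyz", 10, 3);
console.log(sampleChunks);
// [ 'abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz' ]
```

Each chunk repeats the last three characters of the previous one. The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, which helps retrieval.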

Getting User Questions and Generating Responses

[Figure: Information Flow]

To get the most intuitive and user-friendly response to user input, the input will go through the following steps:

  1. We create a standalone question from the user input: a concise version that keeps only the information needed to understand the question.
  2. An embedding is created from the standalone question.
  3. The question embedding is used to find the nearest matches in the vector store.
  4. We then use OpenAI to generate an answer, combining the nearest matches retrieved from our vector store with the original user input. We can also add conversation memory so that the chatbot is aware of earlier messages. This is optional and can be challenging to implement.
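
The four steps can be sketched as a single async function, with the model, embedding, and search calls passed in as parameters. Everything here is a placeholder: answerQuestion and the functions in deps are hypothetical names, and the stubs return canned values so the flow can be followed without any API keys. The LangChain version of this flow is in the code that follows.

```javascript
// deps = { rephrase, embed, search, generate }, all hypothetical async functions.
async function answerQuestion(userInput, deps) {
  // 1. Reduce the raw input to a standalone question.
  const standaloneQuestion = await deps.rephrase(userInput);
  // 2. Turn the standalone question into an embedding.
  const queryEmbedding = await deps.embed(standaloneQuestion);
  // 3. Find the nearest chunks in the vector store.
  const matches = await deps.search(queryEmbedding);
  const context = matches.join("\n\n");
  // 4. Generate the answer from the context plus the original input.
  return deps.generate({ context, question: userInput });
}

// Stubs with canned values, standing in for the OpenAI and Supabase calls.
const stubDeps = {
  rephrase: async (input) => input.trim(),
  embed: async () => [0.1, 0.2, 0.3],
  search: async () => ["(retrieved FAQ chunk)"],
  generate: async ({ context, question }) => `Q: ${question} | Context: ${context}`,
};

answerQuestion("What is Scrimba?", stubDeps).then((answer) => console.log(answer));
// Q: What is Scrimba? | Context: (retrieved FAQ chunk)
```

Note that the standalone question is used only for retrieval, while the original input is what gets answered. The chain below makes the same choice.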
// retriever.js

import { createClient } from "@supabase/supabase-js";
import { SupabaseVectorStore } from "@langchain/community/vectorstores/supabase";
import { OpenAIEmbeddings } from "@langchain/openai";

const sbApiKey = process.env.SUPABASE_API_KEY;
const sbUrl = process.env.SUPABASE_URL_CHAT_BOT;
const openAIApiKey = process.env.OPENAI_API_KEY;

const embeddings = new OpenAIEmbeddings({ openAIApiKey });

const client = createClient(sbUrl, sbApiKey);

const vectorStore = new SupabaseVectorStore(embeddings, {
    client,
    tableName: 'documents',
    queryName: "match_documents"
});

const retriever = vectorStore.asRetriever();

export { retriever };
// combineDocuments.js

function combineDocuments(docs) {
    return docs.map((doc) => doc.pageContent).join(`\n\n`);
}

export { combineDocuments };
import { ChatOpenAI } from "@langchain/openai";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { RunnableSequence, RunnablePassthrough } from "@langchain/core/runnables";
import { retriever } from "./retriever.js";
import { combineDocuments } from "./combineDocuments.js";
import { PromptTemplate } from "@langchain/core/prompts";

const openAIApiKey = process.env.OPENAI_API_KEY;

const llm = new ChatOpenAI({ openAIApiKey });

const standaloneQuestionTemplate = `Given a question, 
convert the question to a standalone question. 
Question: {question} 
Standalone question: `;

const standaloneQuestionPrompt = PromptTemplate.fromTemplate(standaloneQuestionTemplate);

const standaloneQuestionChain = standaloneQuestionPrompt.pipe(llm).pipe(new StringOutputParser());

const retrievalChain = RunnableSequence.from([
    prevResult => prevResult.standalone_question,
    retriever,
    combineDocuments
]);

const answerTemplate = `You are a helpful and enthusiastic support bot who can answer a given question about Scrimba based on 
the context provided. Try to find the answer in the context. If you really don't know the answer, say "I'm sorry, I don't know 
the answer to that." Direct the questioner to email [email protected]. Don't try to make up an answer. Always speak as if you are 
chatting with a friend.
Context: {context}
Question: {question}
Answer: `;

const answerPrompt = PromptTemplate.fromTemplate(answerTemplate);

const answerChain = answerPrompt.pipe(llm).pipe(new StringOutputParser());

const chain = RunnableSequence.from([
    {
        standalone_question: standaloneQuestionChain,
        original_input: new RunnablePassthrough()
    },
    {
        context: retrievalChain,
        // original_input is the object passed to chain.invoke(), i.e. { question }
        question: ({ original_input }) => original_input.question
    },
    answerChain
]);

export async function progressConversation(question) {
    const response = await chain.invoke({ question });
    return response;
}
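
The optional conversation memory mentioned earlier could start from a helper like the one below. It is a sketch under assumptions: formatConvHistory is a hypothetical helper, the history is assumed to be an array of strings that alternates between user and bot messages, and the prompt templates would need an extra placeholder such as {conv_history} (not present in the code above) to receive the result.

```javascript
// Assumes messages alternate: user, bot, user, bot, ...
function formatConvHistory(messages) {
  return messages
    .map((message, i) => (i % 2 === 0 ? `Human: ${message}` : `AI: ${message}`))
    .join("\n");
}

// Illustrative history, oldest message first.
const convHistory = ["What is a vector store?", "A database for storing embeddings."];
console.log(formatConvHistory(convHistory));
// Human: What is a vector store?
// AI: A database for storing embeddings.
```

After each exchange, the new question and answer would be pushed onto the array, and the formatted string passed to the chain alongside the question so the standalone-question step can resolve references such as "it" or "that course".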

Conclusion

You can find the template here. It contains the full code for both the UI, built with HTML, CSS, and JavaScript, and the backend, built with Node.js. I am not an AI expert, so any feedback or questions in the comment section will be appreciated.
