LangGraph with LLM and Pinecone Integration
What is LangGraph?
Getting started with GenAI can be overwhelming, especially with the abundance of LLMs and extensive documentation to navigate. One of the most popular frameworks in this space is LangChain. In this article, I’ll focus specifically on LangGraph, which offers a powerful way to build workflows that maintain state throughout the process, allowing you to incorporate conditional operations (nodes) seamlessly.
LangGraph is a framework that orchestrates workflows with stateful interactions between tools, annotations, and large language models (LLMs). By structuring workflows as state graphs, LangGraph makes it easier to design, debug, and execute complex interactions, such as querying databases and refining responses using LLMs.
LangGraph addresses the problem of building complex, structured workflows for conversational AI and decision-making systems, enabling developers to create flexible, stateful, and dynamic logic flows that integrate with multiple tools and AI models.
Why Use LangGraph with LLMs?
Structured Workflows:
- Without LangGraph, combining multiple tools and LLMs often involves hard-to-manage code with ad hoc state handling.
- LangGraph provides a structured approach using nodes (tools or functions) and edges (dependencies).
State Management:
- Tracks the flow of data using annotations, ensuring smooth transitions between nodes.
- Avoids repetitive boilerplate code by automating state transitions.
LLM Integration:
- Enables chaining of LLMs with tools like Pinecone for vector database querying.
- Allows post-processing and response refinement to produce user-specific outputs.
Reusability:
- Modular nodes and reusable workflows make LangGraph ideal for scalable applications.
What does the code in this article solve?
This application queries a Pinecone vector database to retrieve relevant information (e.g., "extract the Phase 5 deliverables for project ABC and summarise all the functional requirements") and then refines the results using an LLM. The user specifies what to query, and the program:
1. Fetches data from Pinecone.
2. Summarises and filters the results using an LLM.
Workflow Overview
The workflow is implemented using a StateGraph, which manages the interactions between the following components (a minimal skeleton of this structure follows the overview below):
1. Nodes:
pineConeTool:
- Queries Pinecone for relevant results.
refineResponse:
- Processes and summarises the results using an LLM.
Edges:
- Define the order of execution: Start → Query Pinecone → Refine Response → End.
2. Annotations:
- prompt: The user's query.
- nameSpace: Namespace in the Pinecone index.
- indexName: Pinecone index to query.
- messages: Tracks tool outputs and LLM refinements.
3. User Query:
The user specifies the prompt, nameSpace, and indexName.
Example: "give me data about phase 5 deliverables and functional requirement. extract the key points. do not include phase 6 of the project".
4. Pinecone Query (pineConeTool):
- Uses embeddings to query Pinecone for relevant results.
- Filters results and returns the matches.
5. LLM Refinement (refineResponse):
- Passes the raw results to the LLM.
- Instructs the LLM to summarise and filter the results (e.g., exclude Phase 6).
6. Final Output:
- Returns a refined and concise response to the user.
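To make that structure concrete before we look at the real implementation, here is a minimal, self-contained skeleton of the same shape. The node names and placeholder functions are illustrative only; the actual pineConeTool and refineResponse nodes are covered in the walkthrough below.
// minimal skeleton: two placeholder nodes wired in sequence
import { StateGraph, Annotation, START, END } from "@langchain/langgraph";

const SketchState = Annotation.Root({
  input: Annotation<string>(),
  output: Annotation<string>(),
});

// stand-in for the Pinecone query node
const fetchNode = async (state: typeof SketchState.State) => ({
  output: `fetched data for: ${state.input}`,
});

// stand-in for the LLM refinement node
const refineNode = async (state: typeof SketchState.State) => ({
  output: `refined: ${state.output}`,
});

const sketch = new StateGraph(SketchState)
  .addNode("fetch", fetchNode)
  .addNode("refine", refineNode)
  .addEdge(START, "fetch")
  .addEdge("fetch", "refine")
  .addEdge("refine", END)
  .compile();

// usage: await sketch.invoke({ input: "phase 5 deliverables" });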
Key Components
I have put together some code that we can look at together. I have written it in TypeScript / Node.js, as the TypeScript community is growing rapidly in this space.
For the sake of the article I have written the code as a single file to make it simpler to follow. However, in production I would split the logic out into classes that specialise in specific tasks, as sketched below.
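As a rough sketch of what that split could look like (the class and method names here are hypothetical and not part of the code below), the Pinecone querying could live in its own class so the graph node only has to call a single method:
import { OpenAIEmbeddings } from "@langchain/openai";
import { Pinecone as PineconeClient } from "@pinecone-database/pinecone";

// hypothetical class that owns the Pinecone query logic
class PineconeRetriever {
  constructor(
    private readonly pinecone: PineconeClient,
    private readonly embeddings: OpenAIEmbeddings
  ) {}

  // embed the prompt and run a similarity search in the given index/namespace
  async search(indexName: string, nameSpace: string, prompt: string, topK = 3) {
    const vector = await this.embeddings.embedQuery(prompt);
    const result = await this.pinecone
      .Index(indexName)
      .namespace(nameSpace)
      .query({ vector, topK, includeMetadata: true });
    return result.matches.map((m) => m.metadata?.text).join("\n");
  }
}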
Code walkthrough
First, we import the necessary packages.
import { AIMessage, BaseMessage, HumanMessage } from "@langchain/core/messages";
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
import {
StateGraph,
MemorySaver,
Annotation,
messagesStateReducer,
START,
END,
} from "@langchain/langgraph";
import { Pinecone as PineconeClient } from "@pinecone-database/pinecone";
import dotenv from "dotenv";
dotenv.config();
Constants that we will need at global scope.
const DEBUG = false;
const OPEN_AI_TEXT_EMB_3_SMALL = "text-embedding-3-small";
const MODEL_NAME = "gpt-4o";
const MODEL_TEMP = 0.2;
Create an instance of the Pinecone client. For people who are less familiar with Pinecone, it is a serverless vector database for storing and querying embeddings. You can create a free account to start testing.
// create pinecone client connection.
// this is used to run the queries
const pinecone = new PineconeClient({
apiKey: process.env.PINECONE_API_KEY!,
});
Create the state that will be used during the execution of the LangGraph. The messages returned from the Pinecone similarity searches and the LLM responses will all be stored in this state.
// this is the Root Annotation, it will hold
// the state information as the langGraph
// progresses
const StateAnnotations = Annotation.Root({
messages: Annotation<BaseMessage[]>({
reducer: messagesStateReducer, // append new messages
}),
prompt: Annotation<string>(),
nameSpace: Annotation<string>(),
indexName: Annotation<string>(),
});
This section is very much about how to run queries against Pinecone. If you would like more detail, I have also included a link on using Pinecone here:
https://github.com/EmiRoberti77/pinecone_ts
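One thing the code below takes for granted is that the index and namespace already contain vectors whose metadata includes a text field, because that is what the query results are read back from. If you are starting from an empty index, a minimal sketch of seeding it might look like the following (the index and namespace names simply mirror the ones used later in this article, the chunking is deliberately naive, and the index must already exist with a dimension that matches the embedding model, 1536 for text-embedding-3-small):
import { OpenAIEmbeddings } from "@langchain/openai";
import { Pinecone as PineconeClient } from "@pinecone-database/pinecone";

// sketch: embed a few text chunks and upsert them with a `text` metadata
// field, which is what pineConeTool reads back out of the matches
async function seedIndex(chunks: string[]) {
  const pinecone = new PineconeClient({ apiKey: process.env.PINECONE_API_KEY! });
  const embeddings = new OpenAIEmbeddings({ model: "text-embedding-3-small" });

  const vectors = await embeddings.embedDocuments(chunks);
  const records = chunks.map((text, i) => ({
    id: `chunk-${i}`,
    values: vectors[i],
    metadata: { text },
  }));

  await pinecone.Index("emi-pc-example-index").namespace("emi_pdf").upsert(records);
}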
This code extracts the query from the "_state" variable that is passed into the agent tool by the LangGraph. In order to pass the query to Pinecone, it must be embedded: we have to convert the human text into an embedding (number[]) so that Pinecone can perform a similarity search against the vectors already stored in the database. I am using OpenAI for this sample code, but there are many other options available; Hugging Face is a great resource for this, and I have more embedding samples here: https://github.com/EmiRoberti77/bedrock_ts
// this is where we create our tool
// each tool is added to the langGraph
const pineConeTool = async (_state: typeof StateAnnotations.State) => {
if (DEBUG) console.log("In Pinecone Tool");
//create embedding instance of openAI
const embedding = new OpenAIEmbeddings({
openAIApiKey: process.env.OPENAI_API_KEY!,
batchSize: 100,
model: OPEN_AI_TEXT_EMB_3_SMALL,
});
// extract the correct index in pinecone database
const index = pinecone.Index(_state.indexName);
// embed the user's prompt so it can be compared against stored vectors
const embeddedPrompt = await embedding.embedQuery(_state.prompt);
// query the relevant nameSpace in the pinecone index
const result = await index.namespace(_state.nameSpace).query({
vector: embeddedPrompt,
topK: 3,
includeMetadata: true,
});
// iterate over the results from the relevant vectors
const mappedResult = result.matches.map((item) => item.metadata?.text + "\n");
if (DEBUG) console.log("Mapped Result:", mappedResult);
// add the response into the StateAnnotations state
return {
messages: [new AIMessage(mappedResult.join(""))],
};
};
This is the next tool in the LangGraph. Before presenting a polished response with good context in relation to the question, we refine the results: we extract the context we have stored in "_state", create a new template message, and pass it to the LLM (this can be an LLM of your choice; I have used OpenAI, and I also have samples using AWS Bedrock).
// this is the second tool that is called
// within the workflow of the langGraph to
// refine the results
const refineResponse = async (_state: typeof StateAnnotations.State) => {
if (DEBUG) console.log("Processing with LLM...");
// extract the last message on the graph
const lastToolMessage = _state.messages[
_state.messages.length - 1
] as AIMessage;
// Generate a refined prompt with explicit instructions
// this is to refine the results from the extracted vectors
const refinedPrompt = `
Based on the following information, summarise only the deliverables related to Phase 5.
Do not include details about any other phases.
${lastToolMessage.content}
`;
const refinedMessage = new HumanMessage(refinedPrompt);
// Add the refined message to the state
_state.messages.push(refinedMessage);
const response = await model.invoke(_state.messages);
if (DEBUG) console.log("Refined LLM Response:", response.content);
return {
messages: [response],
};
};
Creating the LLM model. Note that model is referenced inside refineResponse above; this works because refineResponse only runs after all the top-level declarations have executed.
//create the LLM
const model = new ChatOpenAI({
model: MODEL_NAME,
temperature: MODEL_TEMP,
});
This is the key part of the article. Here we define the LangGraph and all of its nodes. To keep it simple, in this example I have used two nodes: one to extract the data from the Pinecone database, and one to formulate a response with added context back to the user. "addNode" adds the tool to the graph and "addEdge" defines the sequence of execution. We can also add conditional execution based on what each node returns (a small sketch of this follows the workflow definition below). The "START" and "END" constants are defined in our imports at the top of the program; they mark the beginning and end of our workflow.
// define the graph
// this is how the workflow is created
// the tools are added to the nodes StateGraph
// then the order of execution is set
const workflow = new StateGraph(StateAnnotations)
.addNode("__pinecone__", pineConeTool)
.addNode("__refine__", refineResponse)
.addEdge(START, "__pinecone__")
.addEdge("__pinecone__", "__refine__")
.addEdge("__pinecone__", END);
Before completing the LangGraph, I am adding a checkpointer so that state is persisted between the steps (nodes) of the execution and across invocations that share a thread_id.
// set the memory between states
const checkpointer = new MemorySaver();
// compile the graph workflow
const app = workflow.compile({ checkpointer });
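Because the checkpointer saves the graph state per thread_id, invoking the compiled app again with the same thread_id continues from the previously saved state instead of starting empty, so messages from earlier runs remain in the state. A small sketch of that, inside an async function (the follow-up prompt is only illustrative):
// both invocations share a thread_id, so the second run starts from the
// state checkpointed by the first and its messages accumulate on top of it
const config = { configurable: { thread_id: "emi-77" } };
const first = await app.invoke(
  { messages: [], prompt: "phase 5 deliverables", nameSpace: "emi_pdf", indexName: "emi-pc-example-index" },
  config
);
const second = await app.invoke(
  { messages: [], prompt: "now just the functional requirements", nameSpace: "emi_pdf", indexName: "emi-pc-example-index" },
  config
);
// second.messages also contains the messages produced during the first run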
Finally, we are at the point where we can run the chain. I have set up an async call with a payload that is passed into the state. This can look however you need it to for your project, but for my sample there is some information the agent tools (nodes) need in order to execute properly and answer the prompt that has been input. This example assumes we are reading documents and want to extract information about a specific part of a document. To fulfil this request we need to tell Pinecone where to find the information, and then give the LLM the raw response together with context on how to shape the final answer. Note that the code expects PINECONE_API_KEY and OPENAI_API_KEY to be set in a .env file, which is loaded by dotenv at the top of the program.
//finally run a query to start the workflow
async function run() {
// prepare the payload that will be passed between states in
// the langGraph
const payload = {
messages: [],
prompt: "give me data about phase 5 deliverables for project ABC and extract the key points. do not include phase 6",
nameSpace: "emi_pdf",
indexName: "emi-pc-example-index",
};
const finalState = await app.invoke(payload, {
configurable: { thread_id: "emi-77" },
});
//extract results
console.log(finalState.messages[finalState.messages.length - 1].content);
}
// run the graph
run();
I hope you found this article helpful and informative. While it’s challenging to cover everything in a single article, I’m happy to answer any questions or assist with specific issues you may be facing.
If you’d like to view all the code in one file, I’ve included a link below.
Happy Coding - Emi Roberti