
How to Connect to Milvus Lite Using LangChain and LlamaIndex

Published at: 7/25/2024
Categories: langchain, llamaindex, tutorial, vectordatabase
Author: chloewilliams

Milvus Lite, released just one week ago on May 31, is now the default method for third-party connectors like LangChain and LlamaIndex to connect to Milvus, the popular open-source vector database.

Method            Control Level for Retrieval Process   Time (seconds)
LlamaIndex        No control                            2156
LangChain         Full control                          8
Milvus Lite API   Full control                          28

Table: Timings using the same HuggingFace embedding model (BAAI/bge-large-en-v1.5) and the same HTML data files.

The result? If you're looking for the best balance between high control over Milvus settings and fast setup, using the Milvus Lite APIs directly is the optimal choice. The full code and timings are available on my GitHub.

In the following sections, weโ€™ll cover:

  1. Connecting to Milvus Lite using LlamaIndex

  2. Connecting to Milvus Lite using LangChain

  3. Connecting to Milvus Lite using Milvus APIs

Connecting to Milvus Lite Using LlamaIndex

It's easy to get started using LlamaIndex, but it is slow: connecting and creating a collection took about 2156 seconds (roughly 36 minutes) in my test.

from pymilvus import MilvusClient
from llama_index.core import (
   Settings,
   ServiceContext,
   StorageContext,
   VectorStoreIndex,
)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.milvus import MilvusVectorStore


# 1. Define the embedding model.
service_context = ServiceContext.from_defaults(
   # LlamaIndex local: translates to the same location as default HF cache.
   embed_model="local:BAAI/bge-large-en-v1.5")
# LlamaIndex hides this but we need it to create the vector store!
EMBEDDING_DIM = 1024


# 2. Create a Milvus collection from the documents and embeddings.
milvus_client = MilvusClient()
vector_store = MilvusVectorStore(
   client=milvus_client,
   dim=EMBEDDING_DIM,
   overwrite=True
)
storage_context = StorageContext.from_defaults(
   vector_store=vector_store
)
llamaindex = VectorStoreIndex.from_documents(
   # `docs` is a list of LlamaIndex Documents loaded earlier (full code on GitHub).
   # Chunking, embedding, and inserting every document is too slow! Just use one.
   docs[:1],
   storage_context=storage_context,
   service_context=service_context
)

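Once the index is built, querying it takes a couple of lines. A minimal sketch, assuming an OpenAI key is configured for LlamaIndex's default LLM; the question text is illustrative:

# Query the index we just built (the question is illustrative).
query_engine = llamaindex.as_query_engine()
response = query_engine.query("How do I start the Milvus Lite server?")
print(response)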

Connecting to Milvus Lite Using LangChain

It's easy to get started with LangChain. It takes about 8 seconds to connect and create a collection.

import time

from langchain_milvus import Milvus
from langchain_huggingface import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter


# 1. Define the embedding model.
model_name = "BAAI/bge-large-en-v1.5"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': True}
embed_model = HuggingFaceEmbeddings(
   model_name=model_name,
   model_kwargs=model_kwargs,
   encode_kwargs=encode_kwargs)
# Pull the embedding dimension from the underlying SentenceTransformer model.
EMBEDDING_DIM = embed_model.dict()['client'].get_sentence_embedding_dimension()


# 2. Create a Milvus collection from the documents and embeddings.
start_time = time.time()
vectorstore = Milvus.from_documents(
   # `docs` is a list of LangChain Documents loaded earlier (full code on GitHub).
   documents=docs,
   embedding=embed_model,
   connection_args={"uri": "./milvus_demo.db"},
   # Override LangChain default values for Milvus.
   consistency_level="Eventually",
   drop_old=True,
   index_params={
       "metric_type": "COSINE",
       "index_type": "AUTOINDEX",
       "params": {}},
)
print(f"Created collection in {time.time() - start_time:.1f} seconds.")

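Once the collection is built, retrieval is a single call. A minimal sketch; the query text is illustrative:

# Retrieve the top-3 most similar chunks (query text is illustrative).
results = vectorstore.similarity_search(
    "How do I start the Milvus Lite server?", k=3)
for doc in results:
    print(doc.metadata.get("source", ""), doc.page_content[:100])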

Connecting to Milvus Lite Using Milvus Lite APIs

But what's happening behind the scenes? Let's break down the actual steps and make the default values more explicit:

  1. Start the Milvus Lite server and connect.

  2. Select an embedding model.

  3. Create a Milvus database collection (an explicit version of this step is sketched after this list).

    1. Define a schema.
    2. Choose an index (the data structure used for Approximate Nearest Neighbor search).
    3. Choose a distance metric (the definition of "close" in vector space).
    4. Choose the consistency level for inserting data.
  4. Select a chunking strategy.

  5. Transform chunks of data into vectors using the embedding model inference.

  6. Insert vector data into Milvus.
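Before the full walkthrough, here is a minimal sketch of what step 3 looks like when the schema, index, and distance metric are spelled out instead of left to defaults. The field names and dimension are illustrative; the code below uses the simpler defaulted form:

from pymilvus import MilvusClient, DataType

mc = MilvusClient("milvus_demo.db")

# Explicit schema: auto-generated integer primary key plus a vector field.
# Dynamic fields let us store extra keys like 'chunk' and 'source' undeclared.
schema = mc.create_schema(auto_id=True, enable_dynamic_field=True)
schema.add_field("pk", DataType.INT64, is_primary=True)
schema.add_field("vector", DataType.FLOAT_VECTOR, dim=1024)

# Explicit index and distance metric (the AUTOINDEX defaults, spelled out).
index_params = mc.prepare_index_params()
index_params.add_index(field_name="vector", index_type="AUTOINDEX", metric_type="COSINE")

mc.create_collection("MilvusDocs", schema=schema, index_params=index_params,
                     consistency_level="Eventually")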

Here is the Python code using the Milvus Lite API directly. It takes about 28 seconds to connect and create a collection.

import numpy as np
import torch


# STEP 1. CONNECT A CLIENT TO THE MILVUS LITE PYTHON SERVER.
from pymilvus import MilvusClient
mc = MilvusClient("milvus_demo.db")


# STEP 2. DOWNLOAD AN OPEN SOURCE EMBEDDING MODEL.
from sentence_transformers import SentenceTransformer
model_name = "BAAI/bge-large-en-v1.5"
encoder = SentenceTransformer(model_name, device='cpu')
EMBEDDING_DIM = encoder.get_sentence_embedding_dimension()


# STEP 3. CREATE A MILVUS COLLECTION AND DEFINE THE DATABASE INDEX.
# Uses Milvus AUTOINDEX, which defaults to HNSW.
COLLECTION_NAME = "MilvusDocs"
mc.create_collection(COLLECTION_NAME,
       EMBEDDING_DIM,
       consistency_level="Eventually",
       auto_id=True, 
       overwrite=True,)


# STEP 4. CHUNK DATA INTO VECTORS.
from langchain_community.document_transformers import BeautifulSoupTransformer
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Define chunk size and overlap (10% of chunk size).
chunk_size = 512
chunk_overlap = int(np.round(chunk_size * 0.10))
# Split the documents into recursive, overlapping chunks.
# `docs` is a list of LangChain Documents loaded earlier (full code on GitHub).
child_splitter = RecursiveCharacterTextSplitter(
   chunk_size=chunk_size,
   chunk_overlap=chunk_overlap,
   length_function=len,  # use the built-in Python len function
)
chunks = child_splitter.split_documents(docs)


# STEP 5. TRANSFORM CHUNKS INTO VECTORS USING EMBEDDING MODEL INFERENCE.
list_of_strings = [doc.page_content for doc in chunks if hasattr(doc, 'page_content')]
embeddings = torch.tensor(encoder.encode(list_of_strings))


# STEP 6. INSERT CHUNK LIST INTO MILVUS.
# Build a list of dicts, one per chunk, matching the collection's fields.
dict_list = []
for chunk, vector in zip(chunks, embeddings):
   chunk_dict = {
       'chunk': chunk.page_content,
       'source': chunk.metadata.get('source', ""),
       'vector': vector.tolist()
   }
   dict_list.append(chunk_dict)
mc.insert(
   COLLECTION_NAME,
   data=dict_list,
   progress_bar=True)

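With the data inserted, semantic search is one call: embed the question with the same model and ask Milvus for the nearest chunks. A minimal sketch; the question is illustrative:

# Embed an illustrative question with the same model, then search the collection.
question = "How do I start the Milvus Lite server?"
query_vector = encoder.encode(question).tolist()
results = mc.search(
   COLLECTION_NAME,
   data=[query_vector],
   limit=3,
   output_fields=["chunk", "source"])
for hit in results[0]:
   print(round(hit["distance"], 3), hit["entity"]["source"])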

Choosing the Right Milvus Lite Method

Each connection method offers its own conveniences, but they come with trade-offs in how much control you keep over retrieval and chunking settings and in how fast they run.

Using the Milvus Lite APIs directly provides the best balance: the highest control over Milvus retrieval settings with collection creation that is still fast (28 seconds, versus 8 for LangChain and 2156 for LlamaIndex).

Resources and Further Reading

Milvus Lite docs

Milvus Lite LlamaIndex docs

Milvus Lite LangChain docs

LangChain Milvus docs

LlamaIndex Milvus docs
