Milvus
Overview
Milvus is an open-source vector database designed for managing and searching large-scale vector data. It efficiently processes unstructured data generated by AI applications and machine learning systems (embedding vectors from images, audio, and text), enabling high-speed similarity search. With its cloud-native architecture, Milvus can perform millisecond-level searches even on billions of vectors.
Details
Milvus was developed by Zilliz and open-sourced in 2019; it is now a graduate project of the LF AI & Data Foundation. Its distributed architecture separates compute from storage, providing high scalability and availability. It is widely adopted in AI applications that require vector similarity search, including recommendation systems, image/video search, natural language processing, and anomaly detection.
Key features of Milvus:
- Cloud-native distributed architecture
- Support for multiple vector index types (HNSW, IVF, DiskANN, etc.)
- Hardware acceleration (GPU, SIMD)
- Hybrid search (vector search + scalar filtering)
- Multi-tenancy support
- Cost optimization with hot/cold storage
- Tunable consistency levels (Strong, Bounded, Session, Eventually)
- Rich SDKs (Python, Go, Java, Node.js)
- RESTful/gRPC APIs
- Kubernetes-native deployment
Architecture
Milvus adopts a four-layer architecture:
- Access Layer: Load balancing with stateless proxies
- Coordinator Layer: Metadata management and task scheduling
- Worker Node Layer: Data processing and query execution
- Storage Layer: Object storage and message queue
Advantages and Disadvantages
Advantages
- High Performance: Ultra-fast search with vector-specific optimizations
- Scalability: Horizontal scaling supporting tens of billions of vectors
- High Availability: Replication and automatic failover
- Flexibility: Choose from 10+ index types
- Cost-Effective: Optimized with hot/cold storage
- Ecosystem: Integration with LangChain, LlamaIndex, etc.
- Multi-Modal: Unified search for images, text, and audio
Disadvantages
- Learning Curve: Requires understanding of vector database concepts
- Resource Consumption: High memory usage with large-scale data
- Complexity: Requires distributed system management
- Limited Transactions: No full ACID transaction support; consistency is tunable rather than transactional
- Vector-Specific: Not suitable for traditional SQL queries
Code Examples
Installation and Setup
# Run with Docker (Milvus Standalone)
wget https://github.com/milvus-io/milvus/releases/download/v2.6.0/milvus-standalone-docker-compose.yml -O docker-compose.yml
docker compose up -d
# Run on Kubernetes (Milvus Distributed)
helm repo add milvus https://zilliztech.github.io/milvus-helm/
helm repo update
helm install my-milvus milvus/milvus
# Install Python SDK
pip install pymilvus
# Milvus Lite (for development) is bundled with pymilvus; same package, no separate server
pip install -U pymilvus
# Usable directly in Python (see the sketch after this block)
# Install Node.js SDK
npm install @zilliz/milvus2-sdk-node
# Install Go SDK
go get -u github.com/milvus-io/milvus-sdk-go/v2
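Milvus Lite runs embedded in the Python process and stores data in a local file, which makes it convenient for development and testing. A minimal sketch (the database file name is arbitrary):
from pymilvus import MilvusClient
# Milvus Lite: pass a local file path instead of a server URI
client = MilvusClient("./milvus_demo.db")
# Quick-start collection: default schema with an int64 primary key and a float vector field
client.create_collection(collection_name="quickstart", dimension=768)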
Basic Operations (Python SDK)
from pymilvus import MilvusClient, DataType
import numpy as np
# Connect
client = MilvusClient(
uri="http://localhost:19530", # Milvus Standalone
# uri="./milvus_demo.db" # Milvus Lite
)
# Create collection
schema = MilvusClient.create_schema(
auto_id=False,
enable_dynamic_field=True,
)
# Define fields
schema.add_field(
field_name="id",
datatype=DataType.INT64,
is_primary=True,
)
schema.add_field(
field_name="vector",
datatype=DataType.FLOAT_VECTOR,
dim=768, # Embedding vector dimension
)
schema.add_field(
field_name="text",
datatype=DataType.VARCHAR,
max_length=65535,
)
# Index parameters
index_params = client.prepare_index_params()
index_params.add_index(
field_name="vector",
index_type="HNSW",
metric_type="L2",
params={"M": 16, "efConstruction": 256}
)
# Create collection
client.create_collection(
collection_name="documents",
schema=schema,
index_params=index_params
)
# Insert data
docs = [
{"id": 1, "vector": np.random.rand(768).tolist(), "text": "Milvus is a high-performance vector database"},
{"id": 2, "vector": np.random.rand(768).tolist(), "text": "Perfect for AI applications"},
{"id": 3, "vector": np.random.rand(768).tolist(), "text": "Enables large-scale similarity search"},
]
client.insert(collection_name="documents", data=docs)
# Vector search
query_vector = np.random.rand(768).tolist()
results = client.search(
collection_name="documents",
data=[query_vector],
limit=3,
output_fields=["id", "text"]
)
for hits in results:
    for hit in hits:
        # output_fields are returned under the 'entity' key
        print(f"ID: {hit['id']}, Distance: {hit['distance']}, Text: {hit['entity']['text']}")
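Beyond insert and search, MilvusClient also covers point reads, updates, and deletes. A brief sketch against the collection created above:
# Scalar query: filter expression only, no vector similarity
rows = client.query(
    collection_name="documents",
    filter="id in [1, 2]",
    output_fields=["id", "text"]
)
# Fetch entities by primary key
entities = client.get(collection_name="documents", ids=[1])
# Upsert: insert new or overwrite existing entities by primary key
client.upsert(
    collection_name="documents",
    data=[{"id": 2, "vector": np.random.rand(768).tolist(), "text": "Updated text"}]
)
# Delete by primary key (a filter expression works here too)
client.delete(collection_name="documents", ids=[3])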
Advanced Search Features
# Hybrid search (vector + scalar filter)
results = client.search(
collection_name="documents",
data=[query_vector],
filter="id > 1", # Scalar filtering
limit=5,
output_fields=["id", "text", "category"]
)
# Range search
results = client.search(
collection_name="documents",
data=[query_vector],
limit=10,
search_params={"metric_type": "L2", "params": {"radius": 1.0}}
)
# Multiple vector search (batch processing)
query_vectors = [np.random.rand(768).tolist() for _ in range(5)]
results = client.search(
collection_name="documents",
data=query_vectors,
limit=3
)
# Grouping search
results = client.search(
collection_name="products",
data=[query_vector],
limit=10,
group_by_field="category",
output_fields=["id", "name", "category", "price"]
)
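For collections with more than one vector field, recent pymilvus versions (2.4+) also expose multi-vector hybrid search. The sketch below assumes a hypothetical second field sparse_vector and its query embedding sparse_query_vector; neither exists in the schema defined above:
from pymilvus import AnnSearchRequest, RRFRanker
# One ANN request per vector field
dense_req = AnnSearchRequest(
    data=[query_vector],
    anns_field="vector",
    param={"metric_type": "L2"},
    limit=10
)
sparse_req = AnnSearchRequest(
    data=[sparse_query_vector],  # hypothetical sparse embedding of the same query
    anns_field="sparse_vector",  # hypothetical SPARSE_FLOAT_VECTOR field
    param={"metric_type": "IP"},
    limit=10
)
# Fuse the per-field rankings with reciprocal rank fusion
results = client.hybrid_search(
    collection_name="documents",
    reqs=[dense_req, sparse_req],
    ranker=RRFRanker(),
    limit=5
)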
Integration with Embedding Functions
from pymilvus import model
# OpenAI embeddings
embedding_fn = model.dense.OpenAIEmbeddingFunction(
model_name="text-embedding-3-small",
api_key="your-api-key"
)
# Sentence Transformers
embedding_fn = model.dense.SentenceTransformerEmbeddingFunction(
model_name="all-MiniLM-L6-v2"
)
# Generate embedding vectors from text
texts = ["How to use Milvus", "Implementing vector search"]
embeddings = embedding_fn.encode_documents(texts)
# Insert data with embeddings
docs_with_embeddings = [
{"id": i, "vector": embedding, "text": text}
for i, (text, embedding) in enumerate(zip(texts, embeddings))
]
client.insert(collection_name="documents", data=docs_with_embeddings)
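On the query side, the embedding functions provide encode_queries, since some models embed queries differently from documents. A short sketch using the embedding_fn defined above:
# Encode the query text with the same model, then search
query_embeddings = embedding_fn.encode_queries(["How do I use Milvus?"])
results = client.search(
    collection_name="documents",
    data=[query_embeddings[0].tolist()],
    limit=3,
    output_fields=["text"]
)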
Index Management
# Available index types
# FLAT: Brute-force search (100% accuracy)
# IVF_FLAT: Inverted file index
# IVF_SQ8: Memory reduction with quantization
# HNSW: Hierarchical Navigable Small World graph
# DISKANN: Disk-based ANN index
# Create index (IVF_FLAT) via prepare_index_params
index_params = client.prepare_index_params()
index_params.add_index(
    field_name="vector",
    index_type="IVF_FLAT",
    metric_type="L2",
    params={"nlist": 1024}
)
client.create_index(
    collection_name="documents",
    index_params=index_params
)
# Get index information (index names default to the indexed field's name)
index_info = client.describe_index(
    collection_name="documents",
    index_name="vector"
)
# Drop the index to rebuild it (the collection must be released first)
client.release_collection(collection_name="documents")
client.drop_index(
    collection_name="documents",
    index_name="vector"
)
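Rebuilding simply means calling create_index again with new parameters; list_indexes shows what currently exists. A short sketch:
# List index names on the collection
print(client.list_indexes(collection_name="documents"))
# Recreate the index, then load the collection for searching again
client.create_index(collection_name="documents", index_params=index_params)
client.load_collection(collection_name="documents")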
Partition Management
# Create partition
client.create_partition(
collection_name="documents",
partition_name="2024_data"
)
# Insert data into specific partition
client.insert(
collection_name="documents",
data=docs,
partition_name="2024_data"
)
# Search in specific partition
results = client.search(
collection_name="documents",
data=[query_vector],
limit=5,
partition_names=["2024_data"]
)
# List partitions
partitions = client.list_partitions(collection_name="documents")
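Partitions can also be loaded, released, and dropped individually, keeping memory usage proportional to the data actually being searched. A short sketch:
# Load a single partition into memory
client.load_partitions(collection_name="documents", partition_names=["2024_data"])
# Release it when it is no longer queried
client.release_partitions(collection_name="documents", partition_names=["2024_data"])
# Drop a partition that is no longer needed (release it first)
client.drop_partition(collection_name="documents", partition_name="2024_data")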
Practical Example: Image Search System
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor
# CLIP embeds images and text into a shared vector space, so text queries can retrieve images
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
# Create collection (an index on the vector field is required before searching)
schema = MilvusClient.create_schema()
schema.add_field("id", DataType.INT64, is_primary=True)
schema.add_field("image_vector", DataType.FLOAT_VECTOR, dim=512)  # CLIP ViT-B/32 produces 512-d vectors
schema.add_field("image_path", DataType.VARCHAR, max_length=500)
schema.add_field("metadata", DataType.JSON)
index_params = client.prepare_index_params()
index_params.add_index(field_name="image_vector", index_type="HNSW", metric_type="COSINE")  # COSINE suits CLIP embeddings
client.create_collection("image_search", schema=schema, index_params=index_params)
# Generate image embedding
def get_image_embedding(image_path):
    image = Image.open(image_path)
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        image_features = model.get_image_features(**inputs)
    return image_features.numpy().flatten().tolist()
# Insert image data (image_paths is assumed to be a list of image file paths)
image_data = []
for i, img_path in enumerate(image_paths):
    embedding = get_image_embedding(img_path)
    image_data.append({
        "id": i,
        "image_vector": embedding,
        "image_path": img_path,
        "metadata": {"category": "product", "date": "2024-01-15"}
    })
client.insert("image_search", data=image_data)
# Search images by text query
def search_images_by_text(text_query, limit=10):
    inputs = processor(text=[text_query], return_tensors="pt")
    with torch.no_grad():
        text_features = model.get_text_features(**inputs)
    query_vector = text_features.numpy().flatten().tolist()
    results = client.search(
        collection_name="image_search",
        data=[query_vector],
        limit=limit,
        output_fields=["image_path", "metadata"]
    )
    return results
# Execute search
results = search_images_by_text("red car")
RAG Application Integration
from langchain_milvus import Milvus
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Split documents
text_splitter = RecursiveCharacterTextSplitter(
chunk_size=1000,
chunk_overlap=200
)
documents = text_splitter.split_documents(raw_documents)  # raw_documents: LangChain Documents loaded earlier
# Milvus vector store
embeddings = OpenAIEmbeddings()
vectorstore = Milvus(
embedding_function=embeddings,
collection_name="rag_documents",
connection_args={"uri": "http://localhost:19530"}
)
# Add documents
vectorstore.add_documents(documents)
# Similarity search
query = "What are the main features of Milvus?"
results = vectorstore.similarity_search(query, k=5)
# Build RAG chain
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(temperature=0)
qa_chain = RetrievalQA.from_chain_type(
llm=llm,
retriever=vectorstore.as_retriever(),
return_source_documents=True
)
response = qa_chain.invoke({"query": query})
print(response["result"])
Management and Monitoring
# Collection statistics
stats = client.get_collection_stats(collection_name="documents")
print(f"Row count: {stats['row_count']}")
# Check load progress
client.get_loading_progress(collection_name="documents")
# Flush collection (persist in-memory segments to storage)
client.flush(collection_name="documents")
# Simple data export via query (for production backups, consider the milvus-backup tool)
client.query(
    collection_name="documents",
    filter="",  # an empty filter matches all entities when a limit is given
    output_fields=["id", "vector", "text"],
    limit=1000000
)
# Drop collection
client.drop_collection(collection_name="documents")
# Close connection
client.close()
Performance Tuning
# Optimize search parameters (use the key that matches your index type)
search_params = {
    "metric_type": "L2",
    "params": {
        "nprobe": 16,  # IVF indexes: number of clusters to probe
        # "ef": 64,    # HNSW indexes: candidate list size (use instead of nprobe)
    }
}
# Collection load settings
client.load_collection(
collection_name="documents",
replica_number=2 # Increase replicas for high availability
)
# Enable memory mapping (for large-scale data)
schema.add_field(
field_name="large_text",
datatype=DataType.VARCHAR,
max_length=65535,
mmap_enabled=True # Enable memory mapping
)
# Optimize batch insertion (all_data: the full list of entity dicts)
batch_size = 1000
for i in range(0, len(all_data), batch_size):
    batch = all_data[i:i+batch_size]
    client.insert(collection_name="documents", data=batch)
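Consistency can also be tuned per collection, trading result freshness against latency; see the tunable consistency item under Key Features. A sketch, with the collection name hypothetical:
# Set the consistency level at creation time (Strong, Bounded, Session, or Eventually)
client.create_collection(
    collection_name="documents_strong",  # hypothetical collection
    schema=schema,
    index_params=index_params,
    consistency_level="Strong"
)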