GitHub Overview

milvus-io/milvus

Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

Stars: 36,232
Watchers: 308
Forks: 3,321
Created: September 16, 2019
Language: Go
License: Apache License 2.0

Topics

anns, cloud-native, diskann, distributed, embedding-database, embedding-similarity, embedding-store, faiss, golang, hnsw, image-search, llm, nearest-neighbor-search, rag, vector-database, vector-search, vector-similarity, vector-store

Star History

Star history chart for milvus-io/milvus (data as of July 30, 2025)

Milvus

Overview

Milvus is an open-source vector database designed for managing and searching large-scale vector data. It stores and indexes the embedding vectors that AI and machine-learning systems derive from unstructured data such as images, audio, and text, enabling high-speed similarity search. With its cloud-native architecture, Milvus can perform millisecond-level searches even on billions of vectors.

Details

Milvus was developed by Zilliz in 2019 and is now a graduated project of the LF AI & Data Foundation. Its distributed architecture separates compute from storage, achieving high scalability and availability. It is widely adopted in AI applications that require vector similarity search, including recommendation systems, image and video search, natural language processing, and anomaly detection.

Key features of Milvus:

  • Cloud-native distributed architecture
  • Support for multiple vector index types (HNSW, IVF, DiskANN, etc.)
  • Hardware acceleration (GPU, SIMD)
  • Hybrid search (vector search + scalar filtering)
  • Multi-tenancy support (see the sketch after this list)
  • Cost optimization with hot/cold storage
  • Tunable consistency levels (Strong, Bounded, Session, Eventually)
  • Rich SDKs (Python, Go, Java, Node.js)
  • RESTful/gRPC APIs
  • Kubernetes-native deployment
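
Multi-tenancy, for instance, can be implemented with a partition-key field, so many tenants share one collection while queries stay isolated per tenant. A minimal sketch (the field name tenant_id and the collection name are illustrative):

from pymilvus import MilvusClient, DataType

client = MilvusClient(uri="http://localhost:19530")

# Partition-key field: Milvus hashes tenant_id into internal partitions
schema = MilvusClient.create_schema()
schema.add_field("id", DataType.INT64, is_primary=True)
schema.add_field("vector", DataType.FLOAT_VECTOR, dim=768)
schema.add_field("tenant_id", DataType.VARCHAR, max_length=64, is_partition_key=True)

index_params = client.prepare_index_params()
index_params.add_index(field_name="vector", index_type="HNSW", metric_type="L2")

client.create_collection("tenant_docs", schema=schema, index_params=index_params)

# Filtering on the partition key restricts the search to that tenant's partitions
results = client.search(
    collection_name="tenant_docs",
    data=[[0.1] * 768],
    filter='tenant_id == "tenant_a"',
    limit=5,
)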

Architecture

Milvus adopts a four-layer architecture:

  1. Access Layer: Load balancing with stateless proxies
  2. Coordinator Layer: Metadata management and task scheduling
  3. Worker Node Layer: Data processing and query execution
  4. Storage Layer: Object storage and message queue

Advantages and Disadvantages

Advantages

  • High Performance: Ultra-fast search with vector-specific optimizations
  • Scalability: Horizontal scaling supporting tens of billions of vectors
  • High Availability: Replication and automatic failover
  • Flexibility: Choose from 10+ index types
  • Cost-Effective: Optimized with hot/cold storage
  • Ecosystem: Integration with LangChain, LlamaIndex, etc.
  • Multi-Modal: Unified search for images, text, and audio

Disadvantages

  • Learning Curve: Requires understanding of vector database concepts
  • Resource Consumption: High memory usage with large-scale data
  • Complexity: Requires distributed system management
  • Limited Transactions: No full ACID transaction support; consistency is tunable instead (see the sketch after this list)
  • Vector-Specific: Not suitable for traditional SQL queries
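
Milvus offers tunable consistency in place of transactions: a consistency level is set per collection and can be overridden per request. A minimal sketch, assuming the documents collection, schema, index_params, and query_vector from the examples below:

# "Strong" sees all committed writes; "Bounded" and "Eventually" trade freshness for latency
client.create_collection(
    collection_name="documents",
    schema=schema,
    index_params=index_params,
    consistency_level="Bounded",
)

# Override per request when a query must see the latest inserts
results = client.search(
    collection_name="documents",
    data=[query_vector],
    limit=5,
    consistency_level="Strong",
)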

Code Examples

Installation and Setup

# Run with Docker (Milvus Standalone)
wget https://github.com/milvus-io/milvus/releases/download/v2.6.0/milvus-standalone-docker-compose.yml -O docker-compose.yml
docker compose up -d

# Run on Kubernetes (Milvus Distributed)
helm repo add milvus https://zilliztech.github.io/milvus-helm/
helm repo update
helm install my-milvus milvus/milvus

# Install Python SDK (recent pymilvus releases bundle Milvus Lite,
# which runs embedded in Python for local development)
pip install pymilvus

# Install Node.js SDK
npm install @zilliz/milvus2-sdk-node

# Install Go SDK
go get -u github.com/milvus-io/milvus-sdk-go/v2

Basic Operations (Python SDK)

from pymilvus import MilvusClient, DataType
import numpy as np

# Connect
client = MilvusClient(
    uri="http://localhost:19530",  # Milvus Standalone
    # uri="./milvus_demo.db"  # Milvus Lite
)

# Create collection
schema = MilvusClient.create_schema(
    auto_id=False,
    enable_dynamic_field=True,
)

# Define fields
schema.add_field(
    field_name="id",
    datatype=DataType.INT64,
    is_primary=True,
)
schema.add_field(
    field_name="vector",
    datatype=DataType.FLOAT_VECTOR,
    dim=768,  # Embedding vector dimension
)
schema.add_field(
    field_name="text",
    datatype=DataType.VARCHAR,
    max_length=65535,
)

# Index parameters
index_params = client.prepare_index_params()
index_params.add_index(
    field_name="vector",
    index_type="HNSW",
    metric_type="L2",
    params={"M": 16, "efConstruction": 256}
)

# Create collection
client.create_collection(
    collection_name="documents",
    schema=schema,
    index_params=index_params
)

# Insert data
docs = [
    {"id": 1, "vector": np.random.rand(768).tolist(), "text": "Milvus is a high-performance vector database"},
    {"id": 2, "vector": np.random.rand(768).tolist(), "text": "Perfect for AI applications"},
    {"id": 3, "vector": np.random.rand(768).tolist(), "text": "Enables large-scale similarity search"},
]

client.insert(collection_name="documents", data=docs)

# Vector search
query_vector = np.random.rand(768).tolist()

results = client.search(
    collection_name="documents",
    data=[query_vector],
    limit=3,
    output_fields=["id", "text"]
)

for hits in results:
    for hit in hits:
        # output_fields are returned under the "entity" key
        print(f"ID: {hit['id']}, Distance: {hit['distance']}, Text: {hit['entity']['text']}")

Advanced Search Features

# Hybrid search (vector + scalar filter)
results = client.search(
    collection_name="documents",
    data=[query_vector],
    filter="id > 1",  # Scalar filtering
    limit=5,
    output_fields=["id", "text", "category"]
)

# Range search (with L2, only hits whose distance is below radius are returned)
results = client.search(
    collection_name="documents",
    data=[query_vector],
    limit=10,
    search_params={"metric_type": "L2", "params": {"radius": 1.0}}
)

# Multiple vector search (batch processing)
query_vectors = [np.random.rand(768).tolist() for _ in range(5)]
results = client.search(
    collection_name="documents",
    data=query_vectors,
    limit=3
)

# Grouping search
results = client.search(
    collection_name="products",
    data=[query_vector],
    limit=10,
    group_by_field="category",
    output_fields=["id", "name", "category", "price"]
)

Integration with Embedding Functions

from pymilvus import model

# OpenAI embeddings
embedding_fn = model.dense.OpenAIEmbeddingFunction(
    model_name="text-embedding-3-small",
    api_key="your-api-key"
)

# Sentence Transformers
embedding_fn = model.dense.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)

# Generate embedding vectors from text
texts = ["How to use Milvus", "Implementing vector search"]
embeddings = embedding_fn.encode_documents(texts)

# Insert data with embeddings
docs_with_embeddings = [
    {"id": i, "vector": embedding, "text": text}
    for i, (text, embedding) in enumerate(zip(texts, embeddings))
]

client.insert(collection_name="documents", data=docs_with_embeddings)

Index Management

# Available index types
# FLAT: Brute-force search (100% accuracy)
# IVF_FLAT: Inverted file index
# IVF_SQ8: Memory reduction with quantization
# HNSW: Hierarchical Navigable Small World graph
# DISKANN: Disk-based ANN index

# Create index (IVF_FLAT)
index_params = client.prepare_index_params()
index_params.add_index(
    field_name="vector",
    index_type="IVF_FLAT",
    metric_type="L2",
    params={"nlist": 1024}
)

client.create_index(
    collection_name="documents",
    index_params=index_params
)

# Get index information (the index name defaults to the field name)
index_info = client.describe_index(
    collection_name="documents",
    index_name="vector"
)

# Drop the index before rebuilding it with new parameters
client.drop_index(
    collection_name="documents",
    index_name="vector"
)
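
After rebuilding, the collection must be loaded again before it can be searched; list_indexes is a quick sanity check:

# Recreate the index, then load the collection into memory
client.create_index(collection_name="documents", index_params=index_params)
client.load_collection(collection_name="documents")

# Verify which indexes exist
print(client.list_indexes(collection_name="documents"))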

Partition Management

# Create partition
client.create_partition(
    collection_name="documents",
    partition_name="2024_data"
)

# Insert data into specific partition
client.insert(
    collection_name="documents",
    data=docs,
    partition_name="2024_data"
)

# Search in specific partition
results = client.search(
    collection_name="documents",
    data=[query_vector],
    limit=5,
    partition_names=["2024_data"]
)

# List partitions
partitions = client.list_partitions(collection_name="documents")
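
Partitions that are no longer queried can be released from memory, and dropped once their data is obsolete (a partition must be released before it can be dropped):

# Release a partition from memory, then drop it
client.release_partitions(
    collection_name="documents",
    partition_names=["2024_data"]
)
client.drop_partition(
    collection_name="documents",
    partition_name="2024_data"
)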

Practical Example: Image Search System

from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

# Image embeddings with CLIP model
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Create collection (define an index so the collection can be loaded and searched)
schema = MilvusClient.create_schema()
schema.add_field("id", DataType.INT64, is_primary=True)
schema.add_field("image_vector", DataType.FLOAT_VECTOR, dim=512)
schema.add_field("image_path", DataType.VARCHAR, max_length=500)
schema.add_field("metadata", DataType.JSON)

index_params = client.prepare_index_params()
index_params.add_index(field_name="image_vector", index_type="HNSW", metric_type="COSINE")

client.create_collection("image_search", schema=schema, index_params=index_params)

# Generate image embedding
def get_image_embedding(image_path):
    image = Image.open(image_path)
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        image_features = model.get_image_features(**inputs)
    return image_features.numpy().flatten().tolist()

# Insert image data (image_paths: a list of image file paths prepared beforehand)
image_data = []
for i, img_path in enumerate(image_paths):
    embedding = get_image_embedding(img_path)
    image_data.append({
        "id": i,
        "image_vector": embedding,
        "image_path": img_path,
        "metadata": {"category": "product", "date": "2024-01-15"}
    })

client.insert("image_search", data=image_data)

# Search images by text query
def search_images_by_text(text_query, limit=10):
    inputs = processor(text=[text_query], return_tensors="pt")
    with torch.no_grad():
        text_features = model.get_text_features(**inputs)
    query_vector = text_features.numpy().flatten().tolist()
    
    results = client.search(
        collection_name="image_search",
        data=[query_vector],
        limit=limit,
        output_fields=["image_path", "metadata"]
    )
    return results

# Execute search
results = search_images_by_text("red car")

RAG Application Integration

from langchain_milvus import Milvus
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split documents (raw_documents: documents loaded beforehand, e.g., with a LangChain document loader)
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
documents = text_splitter.split_documents(raw_documents)

# Milvus vector store
embeddings = OpenAIEmbeddings()
vectorstore = Milvus(
    embedding_function=embeddings,
    collection_name="rag_documents",
    connection_args={"uri": "http://localhost:19530"}
)

# Add documents
vectorstore.add_documents(documents)

# Similarity search
query = "What are the main features of Milvus?"
results = vectorstore.similarity_search(query, k=5)

# Build RAG chain
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectorstore.as_retriever(),
    return_source_documents=True
)

response = qa_chain.invoke({"query": query})
print(response["result"])

Management and Monitoring

# Collection statistics
stats = client.get_collection_stats(collection_name="documents")
print(f"Row count: {stats['row_count']}")

# Check collection load progress
progress = client.get_loading_progress(collection_name="documents")

# Flush collection (seal growing segments and persist them to storage)
client.flush(collection_names=["documents"])
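
# Compact small segments in the background (assumes a recent pymilvus
# where MilvusClient exposes compact / get_compaction_state)
job_id = client.compact(collection_name="documents")
print(client.get_compaction_state(job_id))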

# Simple backup via export (for production backups, see the milvus-backup tool)
all_rows = client.query(
    collection_name="documents",
    filter="",  # Empty filter matches all entities
    output_fields=["id", "vector", "text"],
    limit=1000000
)

# Drop collection
client.drop_collection(collection_name="documents")

# Close connection
client.close()

Performance Tuning

# Optimize search parameters (set the parameter that matches your index type)
search_params = {
    "metric_type": "L2",
    "params": {
        "nprobe": 16,  # IVF indexes: clusters to probe (higher = more accurate, slower)
        # "ef": 64,    # HNSW indexes: candidate list size (use instead of nprobe)
    }
}

# Collection load settings
client.load_collection(
    collection_name="documents",
    replica_number=2  # Increase replicas for high availability
)

# Enable memory mapping (for large-scale data)
schema.add_field(
    field_name="large_text",
    datatype=DataType.VARCHAR,
    max_length=65535,
    mmap_enabled=True  # Enable memory mapping
)

# Optimize batch insertion
batch_size = 1000
for i in range(0, len(all_data), batch_size):
    batch = all_data[i:i+batch_size]
    client.insert(collection_name="documents", data=batch)