GitHub Overview

qdrant/qdrant

Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Stars: 24,978
Watchers: 138
Forks: 1,722
Created: May 30, 2020
Language: Rust
License: Apache License 2.0

Topics

ai-search, ai-search-engine, embeddings-similarity, hnsw, image-search, knn-algorithm, machine-learning, mlops, nearest-neighbor-search, neural-network, neural-search, recommender-system, search, search-engine, search-engines, similarity-search, vector-database, vector-search, vector-search-engine

Star History

Star history chart for qdrant/qdrant (data as of 7/30/2025, 12:10 AM).

Database

Qdrant

Overview

Qdrant is a high-performance vector similarity search engine and vector database written in Rust. It is designed for efficient storage, search, and management of high-dimensional vectors together with their associated JSON payloads. In benchmarks published by Qdrant in 2025, it reports among the highest RPS (requests per second) and lowest latencies of the engines compared, positioning it as a performance leader. It is optimized for neural-network-based matching, semantic search, and recommendation systems.

Details

Qdrant leverages Rust's speed, memory safety, and concurrency features to deliver fast and reliable performance even under high load. With extended support for complex filtering capabilities, dynamic query planning, and payload indexes, it is ideal for applications requiring sophisticated metadata filtering.

Key features of Qdrant:

  • High-performance Rust implementation: Achieves maximum speed through memory safety and low-level control
  • Vector similarity search: Efficient nearest neighbor search using HNSW algorithm
  • Payload filtering: Complex filtering on JSON metadata (keywords, full-text, numerical ranges, geo-locations)
  • Hybrid search: Supports both dense and sparse vectors (a sketch follows this list)
  • Distributed architecture: Horizontal scaling through sharding and replication
  • Multiple interfaces: Provides both REST and gRPC APIs
  • Vector quantization: Reduces RAM usage by up to 97% and manages trade-offs between speed and precision
  • On-disk storage: Support for large-scale datasets
  • Write-Ahead Logging: Ensures data persistence and update confirmation
  • SIMD optimization: Hardware acceleration on x86-64 and ARM Neon architectures
  • Async I/O: Maximizes disk throughput using io_uring
  • Binary quantization: Up to 40x faster search for high-dimensional vectors
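
The hybrid dense/sparse support listed above is not covered by the examples later in this document, so a minimal sketch with the Python client follows. The collection name ("hybrid_collection") and vector names ("text_dense", "text_sparse") are illustrative placeholders, and the sparse-vector models (SparseVectorParams, SparseVector, NamedSparseVector) assume a reasonably recent qdrant-client release.

from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, VectorParams, SparseVectorParams,
    SparseVector, NamedSparseVector, PointStruct,
)

client = QdrantClient("localhost", port=6333)

# One dense and one sparse named vector per point (names are illustrative)
client.create_collection(
    collection_name="hybrid_collection",
    vectors_config={"text_dense": VectorParams(size=768, distance=Distance.COSINE)},
    sparse_vectors_config={"text_sparse": SparseVectorParams()},
)

# Sparse vectors are (index, value) pairs, e.g. produced by a BM25/SPLADE encoder
client.upsert(
    collection_name="hybrid_collection",
    points=[
        PointStruct(
            id=1,
            vector={
                "text_dense": [0.1] * 768,
                "text_sparse": SparseVector(indices=[17, 42, 105], values=[0.8, 0.4, 0.2]),
            },
            payload={"text": "Qdrant supports dense and sparse vectors"},
        )
    ],
)

# Query the sparse vector by name
sparse_hits = client.search(
    collection_name="hybrid_collection",
    query_vector=NamedSparseVector(
        name="text_sparse",
        vector=SparseVector(indices=[42, 105], values=[0.9, 0.3]),
    ),
    limit=5,
)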

Architecture

Qdrant employs a layered architecture:

  1. API Layer: Endpoints via actix-web (REST) and tonic (gRPC)
  2. Application Core: Request dispatcher and Raft consensus
  3. Collection Management: Metadata management and distribution logic
  4. Data Layer: Shard replication and local/remote shards
  5. Storage & Indexing: Segment-based vector and payload storage
  6. Persistence Layer: WriteAheadLog, RocksDB, memory-mapped files

2025 Updates

  • Development of lightweight version for edge devices
  • Introduction of custom Rust-based Gridstore key-value store (migrating from RocksDB)
  • Industry leadership in performance benchmarks
  • Reported ability to query over 35 million embeddings in under 1 second

Pros and Cons

Pros

  • Top performance: Reports the highest RPS and lowest latency in Qdrant's 2025 benchmarks
  • Rust implementation: Fast and stable through memory safety and concurrency
  • Advanced filtering: Complex metadata filtering via payload indexes
  • Memory efficiency: Up to 97% RAM reduction through vector quantization
  • Scalability: Distributed architecture supports large-scale deployments
  • Developer experience: Simple API with rich SDKs (Python, Go, Java, Node.js, Rust)
  • Hardware optimization: Fast vector operations using SIMD instructions
  • Flexible search: Supports both dense and sparse vectors
  • Open source: Apache 2.0 license

Cons

  • Learning curve: Requires understanding of vector search concepts
  • Rust expertise: Customization requires Rust knowledge
  • Resource consumption: Large indexes consume significant memory
  • Limited ecosystem: Fewer integration tools compared to Milvus
  • Enterprise features: Limited managed service options

Code Examples

Installation and Setup

# Run with Docker
docker pull qdrant/qdrant
docker run -p 6333:6333 -p 6334:6334 \
    -v $(pwd)/qdrant_storage:/qdrant/storage:z \
    qdrant/qdrant

# Run with Docker Compose
cat > docker-compose.yml << EOF
version: '3.4'
services:
  qdrant:
    image: qdrant/qdrant
    restart: always
    ports:
      - 6333:6333
      - 6334:6334
    volumes:
      - ./qdrant_storage:/qdrant/storage:z
EOF
docker-compose up -d

# Build from source with Rust
git clone https://github.com/qdrant/qdrant.git
cd qdrant
cargo build --release
./target/release/qdrant

# Install Python client
pip install qdrant-client

# Install Node.js client
npm install @qdrant/js-client-rest

# Install Go client
go get -u github.com/qdrant/go-client

Basic Operations (Python SDK)

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct
import numpy as np

# Initialize client
client = QdrantClient("localhost", port=6333)
# or Qdrant Cloud
# client = QdrantClient(url="https://xxx.qdrant.io", api_key="your-api-key")

# Create collection
client.create_collection(
    collection_name="my_collection",
    vectors_config=VectorParams(size=768, distance=Distance.COSINE),
)

# Insert points
points = [
    PointStruct(
        id=1,
        vector=np.random.rand(768).tolist(),
        payload={"text": "Qdrant is a high-performance vector database", "category": "database"}
    ),
    PointStruct(
        id=2,
        vector=np.random.rand(768).tolist(),
        payload={"text": "Search engine implemented in Rust", "category": "search"}
    ),
    PointStruct(
        id=3,
        vector=np.random.rand(768).tolist(),
        payload={"text": "2025 performance leader", "category": "benchmark"}
    )
]

client.upsert(
    collection_name="my_collection",
    points=points
)

# Vector search
query_vector = np.random.rand(768).tolist()
search_result = client.search(
    collection_name="my_collection",
    query_vector=query_vector,
    limit=3
)

for hit in search_result:
    print(f"ID: {hit.id}, Score: {hit.score}, Payload: {hit.payload}")
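
Points can also be fetched directly by ID or paged through with the scroll API; the following is a small supplementary sketch against the same collection and client as above.

# Fetch specific points by ID (payloads are returned by default)
points = client.retrieve(
    collection_name="my_collection",
    ids=[1, 2],
    with_vectors=False
)

# Page through the collection with the scroll API
records, next_offset = client.scroll(
    collection_name="my_collection",
    limit=2,
    with_payload=True
)
for record in records:
    print(record.id, record.payload)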

Advanced Search Features

from qdrant_client.models import Filter, FieldCondition, MatchValue, Range

# Search with filtering
search_result = client.search(
    collection_name="my_collection",
    query_vector=query_vector,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="category",
                match=MatchValue(value="database")
            )
        ]
    ),
    limit=5
)

# Range filtering
search_result = client.search(
    collection_name="my_collection",
    query_vector=query_vector,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="price",
                range=Range(
                    gte=100,
                    lte=1000
                )
            )
        ]
    ),
    limit=10
)

# Multiple conditions
from qdrant_client.models import MatchAny

search_result = client.search(
    collection_name="products",
    query_vector=query_vector,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="category",
                match=MatchAny(any=["electronics", "computers"])
            )
        ],
        must_not=[
            FieldCondition(
                key="discontinued",
                match=MatchValue(value=True)
            )
        ]
    ),
    limit=20
)

# Search with score threshold
search_result = client.search(
    collection_name="my_collection",
    query_vector=query_vector,
    score_threshold=0.8,
    limit=10
)
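
Qdrant can also recommend points by using already-stored points as positive and negative examples; a brief sketch reusing the IDs inserted earlier (the choice of IDs is only illustrative).

# Recommend points similar to point 1 and dissimilar to point 2
recommendations = client.recommend(
    collection_name="my_collection",
    positive=[1],
    negative=[2],
    limit=3
)

for hit in recommendations:
    print(f"ID: {hit.id}, Score: {hit.score}")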

Creating Payload Indexes

from qdrant_client.models import PayloadSchemaType

# Create text field index
client.create_payload_index(
    collection_name="my_collection",
    field_name="text",
    field_schema=PayloadSchemaType.TEXT
)

# Create numeric field index
client.create_payload_index(
    collection_name="my_collection",
    field_name="price",
    field_schema=PayloadSchemaType.FLOAT
)

# Create keyword field index
client.create_payload_index(
    collection_name="my_collection",
    field_name="category",
    field_schema=PayloadSchemaType.KEYWORD
)

# Create geo field index
client.create_payload_index(
    collection_name="locations",
    field_name="location",
    field_schema=PayloadSchemaType.GEO
)
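
The text and geo indexes created above can then be used in query filters. A short sketch follows; the "location" payload field and its coordinates are assumptions for illustration.

from qdrant_client.models import MatchText, GeoRadius, GeoPoint

# Full-text match against the indexed "text" field
text_filtered = client.search(
    collection_name="my_collection",
    query_vector=query_vector,
    query_filter=Filter(
        must=[FieldCondition(key="text", match=MatchText(text="vector database"))]
    ),
    limit=5
)

# Geo-radius condition against the indexed "location" field (radius in meters)
geo_filter = Filter(
    must=[
        FieldCondition(
            key="location",
            geo_radius=GeoRadius(
                center=GeoPoint(lon=139.69, lat=35.68),
                radius=10_000.0
            )
        )
    ]
)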

Batch Operations and Optimization

# Batch insertion (efficient for large data)
batch_size = 100
all_points = []

for i in range(10000):
    point = PointStruct(
        id=i,
        vector=np.random.rand(768).tolist(),
        payload={
            "text": f"Document {i}",
            "category": f"cat_{i % 10}",
            "timestamp": i * 1000
        }
    )
    all_points.append(point)
    
    if len(all_points) >= batch_size:
        client.upsert(
            collection_name="my_collection",
            points=all_points
        )
        all_points = []

# Insert remaining points
if all_points:
    client.upsert(
        collection_name="my_collection",
        points=all_points
    )

# Optimize collection (tune the optimizer settings)
from qdrant_client.models import OptimizersConfigDiff

client.update_collection(
    collection_name="my_collection",
    optimizers_config=OptimizersConfigDiff(
        deleted_threshold=0.2,
        vacuum_min_vector_number=1000,
        default_segment_number=2,
        max_segment_size=200000,
        memmap_threshold=50000,
        indexing_threshold=10000,
        flush_interval_sec=5,
        max_optimization_threads=2
    )
)
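
Recent qdrant-client releases also provide a built-in helper that batches and parallelizes uploads, which can replace the manual loop above; a sketch assuming a sufficiently new client version.

# Let the client handle batching and parallel upload of a point generator
client.upload_points(
    collection_name="my_collection",
    points=(
        PointStruct(
            id=i,
            vector=np.random.rand(768).tolist(),
            payload={"text": f"Document {i}", "category": f"cat_{i % 10}"}
        )
        for i in range(10000)
    ),
    batch_size=100,
    parallel=2
)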

Integration with Embedding Models

from sentence_transformers import SentenceTransformer

# Initialize embedding model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Text data
texts = [
    "Qdrant is a fast vector search engine",
    "Implemented in Rust with high performance",
    "Supports complex filtering and payload management"
]

# Generate embeddings (all-MiniLM-L6-v2 outputs 384-dimensional vectors)
embeddings = model.encode(texts)

# Create a collection matching the embedding size
client.create_collection(
    collection_name="text_collection",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

# Insert into Qdrant
points = []
for idx, (text, embedding) in enumerate(zip(texts, embeddings)):
    points.append(
        PointStruct(
            id=idx,
            vector=embedding.tolist(),
            payload={"text": text}
        )
    )

client.upsert(
    collection_name="text_collection",
    points=points
)

# Semantic search
query_text = "vector database performance"
query_embedding = model.encode(query_text)

results = client.search(
    collection_name="text_collection",
    query_vector=query_embedding.tolist(),
    limit=3,
    with_payload=True
)

for result in results:
    print(f"Score: {result.score:.4f}, Text: {result.payload['text']}")

Snapshots and Backup

# Create snapshot
client.create_snapshot(collection_name="my_collection")

# List snapshots
snapshots = client.list_snapshots(collection_name="my_collection")
for snapshot in snapshots:
    print(f"Snapshot: {snapshot.name}, Created: {snapshot.creation_time}")

# Create full backup
client.create_full_snapshot()

# Restore from snapshot
client.recover_snapshot(
    collection_name="my_collection",
    location="http://localhost:6333/collections/my_collection/snapshots/snapshot_2024.snapshot"
)

Using gRPC API

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

# Initialize gRPC client (lower overhead than REST)
client = QdrantClient(
    host="localhost",
    grpc_port=6334,   # gRPC port (REST stays on 6333)
    prefer_grpc=True
)

# Operations remain the same
client.create_collection(
    collection_name="grpc_collection",
    vectors_config=VectorParams(size=384, distance=Distance.DOT),
)

Multi-Vector Search

from qdrant_client.models import NamedVector, VectorParams

# Create collection with multiple vectors
client.create_collection(
    collection_name="multi_vector_collection",
    vectors_config={
        "text": VectorParams(size=768, distance=Distance.COSINE),
        "image": VectorParams(size=512, distance=Distance.EUCLID)
    }
)

# Insert multi-vector points
client.upsert(
    collection_name="multi_vector_collection",
    points=[
        PointStruct(
            id=1,
            vector={
                "text": np.random.rand(768).tolist(),
                "image": np.random.rand(512).tolist()
            },
            payload={"description": "Product 1"}
        )
    ]
)

# Search against a specific named vector
text_results = client.search(
    collection_name="multi_vector_collection",
    query_vector=NamedVector(name="text", vector=np.random.rand(768).tolist()),
    limit=5
)

image_results = client.search(
    collection_name="multi_vector_collection",
    query_vector=NamedVector(name="image", vector=np.random.rand(512).tolist()),
    limit=5
)

Performance Tuning

from qdrant_client.models import OptimizersConfigDiff, HnswConfigDiff

# Adjust HNSW parameters
client.update_collection(
    collection_name="my_collection",
    hnsw_config=HnswConfigDiff(
        m=32,  # More connections for higher precision
        ef_construct=200,  # Construction-time precision
        full_scan_threshold=20000  # Full scan threshold
    )
)

# Search-time parameters
from qdrant_client.models import SearchParams

search_result = client.search(
    collection_name="my_collection",
    query_vector=query_vector,
    search_params=SearchParams(
        hnsw_ef=128,  # Search-time precision parameter
        exact=False   # Approximate (HNSW) search rather than brute force
    ),
    limit=10
)

# Quantization configuration
from qdrant_client.models import (
    ScalarQuantization,
    ScalarQuantizationConfig,
    ScalarType,
    QuantizationSearchParams,
)

client.update_collection(
    collection_name="my_collection",
    quantization_config=ScalarQuantization(
        scalar=ScalarQuantizationConfig(
            type=ScalarType.INT8,
            quantile=0.99,
            always_ram=True
        )
    )
)

# Search with quantization controls
search_result = client.search(
    collection_name="my_collection",
    query_vector=query_vector,
    search_params=SearchParams(
        quantization=QuantizationSearchParams(
            ignore=False,      # Use the quantized index
            rescore=True,      # Re-score candidates with original vectors
            oversampling=2.0   # Fetch extra candidates before rescoring
        )
    ),
    limit=10
)
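
The binary quantization mentioned in the feature list is configured the same way as scalar quantization; a minimal sketch follows, with the always_ram setting chosen purely for illustration.

from qdrant_client.models import BinaryQuantization, BinaryQuantizationConfig

# Switch the collection to binary quantization (compact 1-bit representation)
client.update_collection(
    collection_name="my_collection",
    quantization_config=BinaryQuantization(
        binary=BinaryQuantizationConfig(
            always_ram=True  # Keep the binary index in RAM
        )
    )
)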