GitHub Overview

weaviate/weaviate

Weaviate is an open-source vector database that stores both objects and vectors, combining vector search with structured filtering while offering the fault tolerance and scalability of a cloud-native database.

Stars: 14,045
Watchers: 135
Forks: 1,014
Created: March 30, 2016
Language: Go
License: BSD 3-Clause "New" or "Revised" License

Topics

approximate-nearest-neighbor-search, generative-search, grpc, hnsw, hybrid-search, image-search, information-retrieval, mlops, nearest-neighbor-search, neural-search, recommender-system, search-engine, semantic-search, semantic-search-engine, similarity-search, vector-database, vector-search, vector-search-engine, vectors, weaviate

Star History

weaviate/weaviate Star History
Data as of: 7/30/2025, 12:10 AM

Overview

Weaviate is an open-source vector database that enables efficient data storage, search, and retrieval using machine learning models and vector representations. It supports semantic search, recommendation, and classification capabilities, providing intuitive data operations through its GraphQL API.

Details

GraphQL Support

One of Weaviate's key features is its comprehensive GraphQL API support:

  • Three Main Functions:

    • Get: Data search when the class name is known
    • Explore: Fuzzy search when schema and class names are unknown
    • Aggregate: Metadata search and data aggregation
  • Module Extensions: Ability to add GraphQL filters and custom properties (_additional)
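As a rough illustration of the three query roots, the sketch below builds minimal query strings in Python and wraps them in the JSON body that a GraphQL endpoint such as `/v1/graphql` expects. The `Article` class, its `title` property, and all helper names are hypothetical and chosen only for this example.

```python
import json


def get_query(class_name, properties, limit=5):
    """Build a Get query for a known class (class and property names are assumed)."""
    fields = " ".join(properties)
    return f"{{ Get {{ {class_name}(limit: {limit}) {{ {fields} }} }} }}"


def explore_query(concept, limit=5):
    """Build an Explore query that searches across classes by concept."""
    concepts = json.dumps([concept])  # yields a quoted GraphQL list such as ["AI"]
    return (
        f"{{ Explore(nearText: {{concepts: {concepts}}}, limit: {limit}) "
        f"{{ beacon className distance }} }}"
    )


def aggregate_count_query(class_name):
    """Build an Aggregate query that returns the object count of a class."""
    return f"{{ Aggregate {{ {class_name} {{ meta {{ count }} }} }} }}"


def graphql_body(query):
    """Wrap a query string in the JSON body of a GraphQL POST request."""
    return json.dumps({"query": query})


if __name__ == "__main__":
    print(get_query("Article", ["title"], limit=3))
    print(explore_query("AI"))
    print(aggregate_count_query("Article"))
```

Keeping the query text in small builder functions makes it easy to inspect what is sent before involving a client library.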

Knowledge Graph Capabilities

While being a vector database, Weaviate also provides knowledge graph functionality:

  • Class-Property Structure: Each data object has an attached vector, enabling complex filtering through GraphQL
  • Relationship Management: Express relationships between objects with GraphQL reference resolution support
  • Semantic Interpretation: Semantically interprets schemas (ontologies), allowing searches by concepts rather than formal entities
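To make the class-property structure and cross-references more concrete, here is a minimal sketch of two class definitions expressed as Python dictionaries, plus a GraphQL query that resolves the reference. The `Article` and `Author` classes, the `hasAuthor` property, and the helper functions are assumptions made for illustration; they are not part of any existing schema.

```python
def text_property(name):
    """A scalar text property."""
    return {"name": name, "dataType": ["text"]}


def reference_property(name, target_class):
    """A cross-reference: the data type names another class instead of a scalar."""
    return {"name": name, "dataType": [target_class]}


# Hypothetical classes: an Article that points at its Author.
author_class = {
    "class": "Author",
    "properties": [text_property("name")],
}

article_class = {
    "class": "Article",
    "properties": [
        text_property("title"),
        reference_property("hasAuthor", "Author"),
    ],
}

# GraphQL resolves a reference with an inline fragment on the target class.
RESOLVE_AUTHOR_QUERY = """
{
  Get {
    Article {
      title
      hasAuthor {
        ... on Author {
          name
        }
      }
    }
  }
}
"""

if __name__ == "__main__":
    print(article_class)
    print(RESOLVE_AUTHOR_QUERY)
```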

Hybrid Search

Weaviate combines vector search with a traditional inverted index, so a single query can mix semantic, keyword, and scalar criteria:

  • Filter by scalar values (text, numbers, etc.) simultaneously with vector search
  • Utilize both search methods in a single query
  • Combine BM25 keyword search with vector search
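The idea of blending keyword and vector results can be sketched in a few lines of Python. This is a simplified illustration of score fusion, not Weaviate's actual implementation: each score set is min-max normalized and then mixed with a weight `alpha`, following the convention that `alpha = 1` means pure vector search and `alpha = 0` means pure keyword search. The function names and the sample scores are made up for the example.

```python
def min_max_normalize(scores):
    """Rescale a {id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def fuse(vector_scores, keyword_scores, alpha=0.5):
    """Blend two score sets; higher is better in both inputs.

    Objects missing from one result set contribute 0 for that side.
    Returns (id, fused_score) pairs sorted from best to worst.
    """
    vec = min_max_normalize(vector_scores)
    kw = min_max_normalize(keyword_scores)
    ids = set(vec) | set(kw)
    fused = {
        obj_id: alpha * vec.get(obj_id, 0.0) + (1 - alpha) * kw.get(obj_id, 0.0)
        for obj_id in ids
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical similarity scores and BM25 scores for four objects.
    vector_scores = {"a": 0.9, "b": 0.5, "c": 0.1}
    keyword_scores = {"b": 12.0, "c": 3.0, "d": 6.0}
    for obj_id, score in fuse(vector_scores, keyword_scores, alpha=0.5):
        print(obj_id, round(score, 3))
```

Object `b` ranks first here because it scores reasonably well on both sides, which is the behavior hybrid search is meant to reward.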

Docker Support

Weaviate supports Docker for everything from local development to production environments:

services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:latest
    ports:
      - 8080:8080
      - 50051:50051
    environment:
      ENABLE_MODULES: 'text2vec-transformers,generative-openai'

Advantages

  • Speed: Millisecond-scale 10-NN search on millions of objects
  • Flexibility: Automatic vectorization at import or upload of pre-vectorized data
  • Production-Ready: Designed for scaling, replication, and security
  • Multi-Modal: Supports various data types including text, images, and audio
  • Distributed Architecture: High availability through sharding, replication, and RAFT consensus
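"10-NN search" means retrieving the 10 nearest neighbors of a query vector. For reference, the sketch below shows an exact, brute-force version using cosine distance in plain Python. It scans every vector, which is precisely what an approximate index such as HNSW (listed in the topics above) is designed to avoid; the function names and sample vectors are illustrative only.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 0 for identical direction, 2 for opposite direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def k_nearest(query, vectors, k=10):
    """Return the ids of the k vectors closest to the query (exact search)."""
    ranked = sorted(vectors.items(), key=lambda item: cosine_distance(query, item[1]))
    return [obj_id for obj_id, _ in ranked[:k]]


if __name__ == "__main__":
    vectors = {"x": [1.0, 0.0], "y": [0.0, 1.0], "z": [1.0, 1.0]}
    print(k_nearest([1.0, 0.1], vectors, k=2))
```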

Disadvantages

  • Learning Curve: Requires understanding of GraphQL and vector search concepts
  • Resource Consumption: Large datasets require significant memory and storage
  • Module Dependencies: Advanced features may require integration with external AI services

Code Examples

Basic GraphQL Query

{
  Get {
    Article(
      nearText: {
        concepts: ["AI technology"]
        distance: 0.6
      }
      limit: 5
    ) {
      title
      content
      _additional {
        distance
        certainty
      }
    }
  }
}
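The query above requests both `distance` and `certainty`. Assuming the cosine distance metric, where distance ranges from 0 to 2 and certainty from 0 to 1, the two values are related by certainty = 1 - distance / 2. The helper names below are hypothetical; the sketch only shows the arithmetic.

```python
def distance_to_certainty(distance):
    """Convert a cosine distance in [0, 2] to a certainty in [0, 1]."""
    if not 0.0 <= distance <= 2.0:
        raise ValueError("cosine distance must be between 0 and 2")
    return 1.0 - distance / 2.0


def certainty_to_distance(certainty):
    """Convert a certainty in [0, 1] back to a cosine distance in [0, 2]."""
    if not 0.0 <= certainty <= 1.0:
        raise ValueError("certainty must be between 0 and 1")
    return 2.0 * (1.0 - certainty)


if __name__ == "__main__":
    # The distance threshold used in the query above.
    print(distance_to_certainty(0.6))
```

Under this assumption, the `distance: 0.6` threshold in the query corresponds to a certainty of roughly 0.7.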

Python Client Usage

import weaviate

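# Initialize the Weaviate client.
# Note: weaviate.Client is the v3 Python client API; v4 of the client
# connects with weaviate.connect_to_local() and uses a different query API.
client = weaviate.Client("http://localhost:8080")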

# Search with nearText
result = client.query.get(
    "Article",
    ["title", "content"]
).with_near_text({
    "concepts": ["machine learning", "deep learning"],
    "distance": 0.7
}).with_limit(10).do()

print(result)
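The printed result follows the shape of a GraphQL response: matches are nested under `data`, then the query root, then the class name, and failures are reported under `errors`. The helper below is a minimal sketch for unpacking such a dictionary; `extract_objects` and the sample response are hypothetical and stand in for a real server reply.

```python
def extract_objects(result, class_name, root="Get"):
    """Return the list of objects for a class from a GraphQL-style response."""
    if "errors" in result:
        messages = [err.get("message", "unknown error") for err in result["errors"]]
        raise RuntimeError("; ".join(messages))
    return result.get("data", {}).get(root, {}).get(class_name) or []


if __name__ == "__main__":
    # Hand-written sample standing in for a real response.
    sample = {
        "data": {
            "Get": {
                "Article": [
                    {"title": "Sample title", "content": "Sample content"},
                ]
            }
        }
    }
    for article in extract_objects(sample, "Article"):
        print(article["title"])
```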

Hybrid Search Example

# Combine vector search (nearText) with a scalar filter (where).
# Reuses the `client` created in the previous example.
where_filter = {
    "path": ["category"],
    "operator": "Equal",
    "valueText": "technology"
}

result = client.query.get(
    "Article",
    ["title", "content", "category"]
).with_near_text({
    "concepts": ["AI innovation"]
}).with_where(where_filter).with_limit(5).do()
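Filters like `where_filter` above are plain dictionaries, so several conditions can be combined by nesting them under an `And` operator with an `operands` list. The sketch below builds such a filter; the helper functions and the `wordCount` property are assumptions introduced for the example.

```python
def equal_text(path, value):
    """Filter: a text property equals the given value."""
    return {"path": [path], "operator": "Equal", "valueText": value}


def greater_than_int(path, value):
    """Filter: an integer property is greater than the given value."""
    return {"path": [path], "operator": "GreaterThan", "valueInt": value}


def all_of(*filters):
    """Combine filters so that every one of them must match."""
    return {"operator": "And", "operands": list(filters)}


if __name__ == "__main__":
    combined = all_of(
        equal_text("category", "technology"),
        greater_than_int("wordCount", 500),  # hypothetical property
    )
    print(combined)
```

The resulting dictionary can be passed to `with_where` in the same way as the single-condition filter.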

Docker Compose Setup

version: '3.8'
services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:latest
    restart: on-failure:0
    ports:
      - "8080:8080"
      - "50051:50051"
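    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'text2vec-transformers'
      ENABLE_MODULES: 'text2vec-transformers,generative-openai'
      # text2vec-transformers needs the URL of an inference container
      TRANSFORMERS_INFERENCE_API: 'http://t2v-transformers:8080'
      CLUSTER_HOSTNAME: 'node1'
    volumes:
      - weaviate_data:/var/lib/weaviate
  # Inference service backing text2vec-transformers (the model tag is one example)
  t2v-transformers:
    image: cr.weaviate.io/semitechnologies/transformers-inference:sentence-transformers-multi-qa-MiniLM-L6-cos-v1
    environment:
      ENABLE_CUDA: '0'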

volumes:
  weaviate_data:
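After `docker compose up`, the container can take a moment before it accepts requests. A small polling loop against the readiness endpoint (`/v1/.well-known/ready`) is one way to wait for it. This is a sketch under the assumption that the instance listens on `http://localhost:8080` as configured above; the function names are hypothetical, and the probe is injectable so the loop can be exercised without a running server.

```python
import time
import urllib.error
import urllib.request


def http_probe(base_url):
    """Return True if the readiness endpoint answers with HTTP 200."""
    url = base_url.rstrip("/") + "/v1/.well-known/ready"
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(base_url, attempts=30, delay=1.0, probe=http_probe, sleep=time.sleep):
    """Poll the probe until it succeeds or the attempts run out."""
    for _ in range(attempts):
        if probe(base_url):
            return True
        sleep(delay)
    return False


if __name__ == "__main__":
    if wait_until_ready("http://localhost:8080"):
        print("Weaviate is ready")
    else:
        print("Weaviate did not become ready in time")
```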