GitHub Overview

vespa-engine/vespa

AI + Data, online. https://vespa.ai

Stars6,261
Watchers159
Forks653
Created:June 3, 2016
Language:Java
License:Apache License 2.0

Topics

ai, big-data, cpp, java, machine-learning, search-engine, server, serving, serving-recommendation, tensorflow, vector-search, vespa

Star History

Data as of: 7/30/2025, 03:04 AM

What is Vespa

Vespa is a leading open-source text search engine and one of the most capable vector databases available, originally developed at Yahoo. It integrates the features of a database, a search engine, and a machine learning framework into a single platform, enabling scalable applications that need fast data processing and real-time search.

Key Features

AI + Data Unified Platform

  • Multi-Data Type Support: Unified processing of vectors, tensors, text, and structured data
  • Real-time Data Processing: Continuously update data corpus while serving queries in less than 100ms
  • Machine Learning Model Integration: Evaluate ML models over selected data at serving time
  • Hybrid Search: Combines vector similarity, relevance models, and multi-vector representations
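
The real-time update path goes through Vespa's document/v1 HTTP API, where a partial update changes only the listed fields. A minimal sketch of building such an update body (the endpoint path and field name below are illustrative assumptions, mirroring the feed example later in this document):

```python
import json

# Sketch of a Vespa partial update (document/v1 API): only the fields
# listed under "assign" change; the rest of the document is untouched.
# The namespace/doctype/id path segments are illustrative.
update = {
    "fields": {
        "title": {"assign": "How to use Vespa (updated)"}
    }
}

# Would be sent as:
#   PUT http://localhost:8080/document/v1/mynamespace/document/docid/1
body = json.dumps(update)
```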

Distributed Architecture

  • Horizontal Scalability: Handle billions of documents and hundreds of thousands of queries per second
  • Automatic Sharding: Automatically distribute data across multiple nodes with load balancing
  • Fault Tolerance: Automatic data replication and failover capabilities
  • Parallel Processing: Evaluate large datasets across multiple nodes in parallel

High-Performance Vector Search

  • HNSW Algorithm: Implementation of HNSW algorithm for approximate vector search
  • Multi-threaded Search: Use multiple search threads per query to reduce latency
  • High Throughput: Achieve 80,000 vector writes/second without HNSW enabled
  • Scale-out: Horizontally expand throughput by increasing content cluster nodes
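
At query time, the choice between the HNSW path and exact brute-force search is made per query via annotations on the nearestNeighbor operator. A sketch of building such a query (the annotation names follow Vespa's query language; the field and profile names are assumptions):

```python
# Build a nearestNeighbor query, toggling between approximate (HNSW)
# and exact search via the "approximate" annotation.
def knn_query(field, vector, target_hits=10, approximate=True):
    ann = "{targetHits:%d,approximate:%s}" % (
        target_hits, "true" if approximate else "false")
    return {
        "yql": f"select * from sources * where {ann}nearestNeighbor({field},q)",
        "ranking.features.query(q)": vector,
        "ranking.profile": "default",  # assumed profile name
    }

# Exact (brute-force) search over 100 target hits
q = knn_query("embedding", [0.2] * 512, target_hits=100, approximate=False)
```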

ML Framework Integration

  • Major Framework Support: Integration with TensorFlow, PyTorch, XGBoost, and LightGBM
  • ONNX Runtime: Accelerated inference for large deep neural network models
  • Data-to-Vector Transformation: Run embedding models inside Vespa to turn text into vectors at feed and query time

Architecture

Vespa consists of three main subsystems:

1. Stateless Container (Java)

  • jDisc Core: Application execution model and protocol-independent request-response handling
  • Search Middleware: Query/result APIs and query execution logic
  • Document Operations: Document model and asynchronous message passing APIs

2. Content Nodes (C++)

  • searchcore: Core functionality for indexes, matching, data storage, and content node server
  • searchlib: Libraries for ranking, index implementations, and attributes (forward indexes)
  • storage: Elastic and auto-recovering data storage system across clusters
  • eval: Library for efficient evaluation of ranking expressions and tensor API

3. Configuration and Administration (Java)

  • configserver: Server for application deployment and configuration requests
  • config-model: System model based on deployed applications
  • config: Client-side library for subscribing to and reading configurations

Pros and Cons

Pros

  • Unified platform combining search, vector search, and real-time ML inference
  • Enterprise-grade track record and reliability (used by Yahoo! Mail, News, etc.)
  • Advanced scalability and fault tolerance
  • Real-time data updates and ML model updates without downtime
  • SQL-like query language integrating structured and unstructured data

Cons

  • Steep learning curve with complex configuration requirements
  • Overkill for small-scale projects
  • High resource consumption and infrastructure costs
  • Limited Japanese documentation

Deployment Options

Vespa Cloud (Recommended)

# Install Vespa CLI
brew install vespa-cli

# Deploy application
vespa auth login
vespa deploy --wait 300

Self-hosted (Docker)

# Start Vespa container
docker run --detach --name vespa --hostname vespa-container \
  --publish 8080:8080 --publish 19071:19071 \
  vespaengine/vespa

# Deploy application
vespa deploy --target local

Code Examples

Basic Vector Search Schema

# schema/document.sd
schema document {
    document document {
        field title type string {
            indexing: summary | index
        }
        field embedding type tensor<float>(x[512]) {
            indexing: summary | attribute
            attribute {
                distance-metric: euclidean
            }
        }
    }
    
    fieldset default {
        fields: title
    }
    
    rank-profile default {
        inputs {
            query(q) tensor<float>(x[512])
        }
        first-phase {
            expression: closeness(field, embedding)
        }
    }
}
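
A schema alone is not deployable: the application package also needs a services.xml declaring the container and content clusters. A minimal single-node sketch for the schema above (the cluster ids are arbitrary):

```
<!-- services.xml -->
<services version="1.0">
    <!-- Stateless container serving the query and document APIs -->
    <container id="default" version="1.0">
        <document-api/>
        <search/>
        <nodes count="1"/>
    </container>
    <!-- Content cluster storing and indexing the "document" schema -->
    <content id="content" version="1.0">
        <redundancy>1</redundancy>
        <documents>
            <document type="document" mode="index"/>
        </documents>
        <nodes count="1"/>
    </content>
</services>
```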

Python Client Data Operations

import requests
import json

# Add document
document = {
    "fields": {
        "title": "How to use Vespa",
        "embedding": [0.1] * 512  # 512-dimensional vector
    }
}

response = requests.post(
    "http://localhost:8080/document/v1/mynamespace/document/docid/1",
    json=document
)

# Perform vector search: closeness() in the rank profile requires
# the nearestNeighbor operator in the query
query = {
    "yql": "select * from sources * where {targetHits:10}nearestNeighbor(embedding,q)",
    "ranking.features.query(q)": [0.2] * 512,
    "ranking.profile": "default"
}

search_response = requests.post(
    "http://localhost:8080/search/",
    json=query
)

results = search_response.json()
for hit in results["root"].get("children", []):
    print(f"Title: {hit['fields']['title']}")
    print(f"Relevance: {hit['relevance']}")

Hybrid Search Example

# Combine text and vector search
hybrid_query = {
    "yql": "select * from sources * where userQuery() or ({targetHits:100}nearestNeighbor(embedding,q))",
    "query": "Vespa vector search",
    "ranking.features.query(q)": [0.15] * 512,
    "ranking.profile": "hybrid"
}

response = requests.post(
    "http://localhost:8080/search/",
    json=hybrid_query
)
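
The query above refers to a rank-profile named hybrid, which the earlier schema does not define. A sketch of what such a profile could look like, combining a text score with vector closeness (nativeRank is used here because it needs no extra index configuration; the equal weighting is arbitrary):

```
rank-profile hybrid {
    inputs {
        query(q) tensor<float>(x[512])
    }
    first-phase {
        # Arbitrary equal weighting of text relevance and vector closeness
        expression: nativeRank(title) + closeness(field, embedding)
    }
}
```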

Machine Learning Model Integration

# Use an ML model in a ranking profile (rank-profile fragment)
rank-profile ml_ranking {
    inputs {
        query(user_embedding) tensor<float>(x[128])
    }
    
    first-phase {
        expression: onnx(recommendation_model).score
    }
    
    onnx-model recommendation_model {
        file: models/recommendation.onnx
        input "user_features": query(user_embedding)
        input "item_features": attribute(item_embedding)
        output "score": score
    }
}

Major Use Cases

  • Search Applications: Large-scale services like Yahoo! Mail and News
  • Recommendation Systems: Personalization systems like Spotify
  • E-commerce Search: Product search and recommendation systems
  • Content Moderation: Large-scale content filtering
  • Real-time Analytics: Continuous analysis and aggregation of large data volumes

Summary

Vespa is a mature, production-proven search platform, backed by years of large-scale operation at Yahoo. More than a vector database, it provides a comprehensive solution that integrates AI, data, and search. For organizations planning large-scale enterprise deployments, its track record and reliability make it a highly attractive choice.