GitHub Overview

vespa-engine/vespa

AI + Data, online. https://vespa.ai

Stars6,261
Watchers159
Forks653
Created:June 3, 2016
Language:Java
License:Apache License 2.0

Topics

ai, big-data, cpp, java, machine-learning, search-engine, server, serving, serving-recommendation, tensorflow, vector-search, vespa

Star History

Data as of: 7/30/2025, 03:04 AM

What is Vespa

Vespa is a leading open-source text search engine and one of the most capable vector databases available, originally developed at Yahoo. It integrates the features of a database, a search engine, and a machine learning framework into a single platform, enabling scalable applications that need fast data processing and real-time search.

Key Features

AI + Data Unified Platform

  • Multi-Data Type Support: Unified processing of vectors, tensors, text, and structured data
  • Real-time Data Processing: Continuously update data corpus while serving queries in less than 100ms
  • Machine Learning Model Integration: Evaluate ML models over selected data at serving time
  • Hybrid Search: Combines vector similarity, relevance models, and multi-vector representations
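
The real-time update path goes through Vespa's document/v1 HTTP API, where a partial update changes only the listed fields. A minimal sketch of building such an update body (the endpoint path and field name below are illustrative assumptions, mirroring the feed example later in this document):

```python
import json

# Sketch of a Vespa partial update (document/v1 API): only the fields
# listed under "assign" change; the rest of the document is untouched.
# The namespace/doctype/id path segments are illustrative.
update = {
    "fields": {
        "title": {"assign": "How to use Vespa (updated)"}
    }
}

# Would be sent as:
#   PUT http://localhost:8080/document/v1/mynamespace/document/docid/1
body = json.dumps(update)
```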

Distributed Architecture

  • Horizontal Scalability: Handle billions of documents and hundreds of thousands of queries per second
  • Automatic Sharding: Automatically distribute data across multiple nodes with load balancing
  • Fault Tolerance: Automatic data replication and failover capabilities
  • Parallel Processing: Evaluate large datasets across multiple nodes in parallel

High-Performance Vector Search

  • HNSW Algorithm: Implementation of HNSW algorithm for approximate vector search
  • Multi-threaded Search: Use multiple search threads per query to reduce latency
  • High Throughput: Achieve 80,000 vector writes/second without HNSW enabled
  • Scale-out: Horizontally expand throughput by increasing content cluster nodes
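
At query time, the choice between the HNSW path and exact brute-force search is made per query via annotations on the nearestNeighbor operator. A sketch of building such a query (the annotation names follow Vespa's query language; the field and profile names are assumptions):

```python
# Build a nearestNeighbor query, toggling between approximate (HNSW)
# and exact search via the "approximate" annotation.
def knn_query(field, vector, target_hits=10, approximate=True):
    ann = "{targetHits:%d,approximate:%s}" % (
        target_hits, "true" if approximate else "false")
    return {
        "yql": f"select * from sources * where {ann}nearestNeighbor({field},q)",
        "ranking.features.query(q)": vector,
        "ranking.profile": "default",  # assumed profile name
    }

# Exact (brute-force) search over 100 target hits
q = knn_query("embedding", [0.2] * 512, target_hits=100, approximate=False)
```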

ML Framework Integration

  • Major Framework Support: Integration with TensorFlow, PyTorch, XGBoost, and LightGBM
  • ONNX Runtime: Accelerated inference for large deep neural network models
  • Data-to-Vector Transformation: Run embedding models inside Vespa to turn text into vectors at feed and query time

Architecture

Vespa consists of three main subsystems:

1. Stateless Container (Java)

  • jDisc Core: Application execution model and protocol-independent request-response handling
  • Search Middleware: Query/result APIs and query execution logic
  • Document Operations: Document model and asynchronous message passing APIs

2. Content Nodes (C++)

  • searchcore: Core functionality for indexes, matching, data storage, and content node server
  • searchlib: Libraries for ranking, index implementations, and attributes (forward indexes)
  • storage: Elastic and auto-recovering data storage system across clusters
  • eval: Library for efficient evaluation of ranking expressions and tensor API

3. Configuration and Administration (Java)

  • configserver: Server for application deployment and configuration requests
  • config-model: System model based on deployed applications
  • config: Client-side library for subscribing to and reading configurations

Pros and Cons

Pros

  • Unified platform combining search, vector search, and real-time ML inference
  • Enterprise-grade track record and reliability (used by Yahoo! Mail, News, etc.)
  • Advanced scalability and fault tolerance
  • Real-time data updates and ML model updates without downtime
  • SQL-like query language integrating structured and unstructured data

Cons

  • Steep learning curve with complex configuration requirements
  • Overkill for small-scale projects
  • High resource consumption and infrastructure costs
  • Limited Japanese documentation

Deployment Options

Vespa Cloud (Recommended)

# Install Vespa CLI
brew install vespa-cli

# Deploy application
vespa auth login
vespa deploy --wait 300

Self-hosted (Docker)

# Start Vespa container
docker run --detach --name vespa --hostname vespa-container \
  --publish 8080:8080 --publish 19071:19071 \
  vespaengine/vespa

# Deploy application
vespa deploy --target local

Code Examples

Basic Vector Search Schema

# schema/document.sd
schema document {
    document document {
        field title type string {
            indexing: summary | index
        }
        field embedding type tensor<float>(x[512]) {
            indexing: summary | attribute
            attribute {
                distance-metric: euclidean
            }
        }
    }
    
    fieldset default {
        fields: title
    }
    
    rank-profile default {
        inputs {
            query(q) tensor<float>(x[512])
        }
        first-phase {
            expression: closeness(field, embedding)
        }
    }
}
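
A schema alone is not deployable: the application package also needs a services.xml declaring the container and content clusters. A minimal single-node sketch for the schema above (the cluster ids are arbitrary):

```
<!-- services.xml -->
<services version="1.0">
    <!-- Stateless container serving the query and document APIs -->
    <container id="default" version="1.0">
        <document-api/>
        <search/>
        <nodes count="1"/>
    </container>
    <!-- Content cluster storing and indexing the "document" schema -->
    <content id="content" version="1.0">
        <redundancy>1</redundancy>
        <documents>
            <document type="document" mode="index"/>
        </documents>
        <nodes count="1"/>
    </content>
</services>
```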

Python Client Data Operations

import requests
import json

# Add document
document = {
    "fields": {
        "title": "How to use Vespa",
        "embedding": [0.1] * 512  # 512-dimensional vector
    }
}

response = requests.post(
    "http://localhost:8080/document/v1/mynamespace/document/docid/1",
    json=document
)

# Perform vector search: closeness() in the rank profile requires
# the nearestNeighbor operator in the query
query = {
    "yql": "select * from sources * where {targetHits:10}nearestNeighbor(embedding,q)",
    "ranking.features.query(q)": [0.2] * 512,
    "ranking.profile": "default"
}

search_response = requests.post(
    "http://localhost:8080/search/",
    json=query
)

results = search_response.json()
for hit in results["root"].get("children", []):
    print(f"Title: {hit['fields']['title']}")
    print(f"Relevance: {hit['relevance']}")

Hybrid Search Example

# Combine text and vector search
hybrid_query = {
    "yql": "select * from sources * where userQuery() or ({targetHits:100}nearestNeighbor(embedding,q))",
    "query": "Vespa vector search",
    "ranking.features.query(q)": [0.15] * 512,
    "ranking.profile": "hybrid"
}

response = requests.post(
    "http://localhost:8080/search/",
    json=hybrid_query
)
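
The query above refers to a rank-profile named hybrid, which the earlier schema does not define. A sketch of what such a profile could look like, combining a text score with vector closeness (nativeRank is used here because it needs no extra index configuration; the equal weighting is arbitrary):

```
rank-profile hybrid {
    inputs {
        query(q) tensor<float>(x[512])
    }
    first-phase {
        # Arbitrary equal weighting of text relevance and vector closeness
        expression: nativeRank(title) + closeness(field, embedding)
    }
}
```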

Machine Learning Model Integration

# Use an ML model in a ranking profile (rank-profile fragment)
rank-profile ml_ranking {
    inputs {
        query(user_embedding) tensor<float>(x[128])
    }
    
    first-phase {
        expression: onnx(recommendation_model).score
    }
    
    onnx-model recommendation_model {
        file: models/recommendation.onnx
        input "user_features": query(user_embedding)
        input "item_features": attribute(item_embedding)
        output "score": score
    }
}

Major Use Cases

  • Search Applications: Large-scale services like Yahoo! Mail and News
  • Recommendation Systems: Personalization systems like Spotify
  • E-commerce Search: Product search and recommendation systems
  • Content Moderation: Large-scale content filtering
  • Real-time Analytics: Continuous analysis and aggregation of large data volumes

Summary

Vespa is a mature, production-proven search platform, backed by years of large-scale operation at Yahoo. More than a vector database, it provides a comprehensive solution that integrates AI, data, and search. For organizations planning large-scale enterprise deployments, its track record and reliability make it a highly attractive choice.