GitHub Overview
vespa-engine/vespa
AI + Data, online. https://vespa.ai
What is Vespa
Vespa is a leading open-source search engine and one of the most capable vector databases available, originally developed at Yahoo. It combines the features of a database, a search engine, and a machine-learning serving framework in a single platform, enabling scalable applications that require fast data processing and real-time search.
Key Features
AI + Data Unified Platform
- Multi-Data Type Support: Unified processing of vectors, tensors, text, and structured data
- Real-time Data Processing: Continuously update data corpus while serving queries in less than 100ms
- Machine Learning Model Integration: Evaluate ML models over selected data at serving time
- Hybrid Search: Combines vector similarity, relevance models, and multi-vector representations
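Real-time updates are exposed through Vespa's document/v1 HTTP API: a partial update changes only the listed fields without re-feeding the whole document. A minimal sketch (the namespace `mynamespace`, document type `document`, and doc id are illustrative placeholders):

```python
import json

# Sketch of a real-time partial update via Vespa's document/v1 API.
# Namespace, document type, and id below are illustrative.
doc_id = "1"
url = f"http://localhost:8080/document/v1/mynamespace/document/docid/{doc_id}"

# An "update" payload with "assign" changes only the listed fields;
# all other fields of the stored document are left intact.
update_payload = {
    "fields": {
        "title": {"assign": "Updated title"}
    }
}

# Sending requires a running Vespa instance, e.g. with the requests library:
#   requests.put(url, json=update_payload)
print(json.dumps(update_payload))
```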
Distributed Architecture
- Horizontal Scalability: Handle billions of documents and hundreds of thousands of queries per second
- Automatic Sharding: Data is distributed across nodes automatically, with load balancing
- Fault Tolerance: Automatic data replication and failover capabilities
- Parallel Processing: Evaluate large datasets across multiple nodes in parallel
High-Performance Vector Search
- HNSW Algorithm: Implementation of HNSW algorithm for approximate vector search
- Multi-threaded Search: Use multiple search threads per query to reduce latency
- High Throughput: Sustain on the order of 80,000 vector writes per second when HNSW indexing is disabled
- Scale-out: Horizontally expand throughput by increasing content cluster nodes
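Approximate nearest-neighbor search with HNSW is enabled per tensor field in the schema by adding `index` to the indexing statement. A minimal sketch; the field name and parameter values are illustrative, see the Vespa schema reference for defaults:

```
field embedding type tensor<float>(x[512]) {
    indexing: summary | attribute | index
    attribute {
        distance-metric: angular
    }
    index {
        hnsw {
            max-links-per-node: 16
            neighbors-to-explore-at-insert: 200
        }
    }
}
```

Without the `index` block, `nearestNeighbor` queries fall back to exact (brute-force) search over the attribute.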
ML Framework Integration
- Major Framework Support: Integration with TensorFlow, PyTorch, XGBoost, and LightGBM
- ONNX Runtime: Accelerated inference for large deep neural network models
- Data-to-Vector Transformation: Run embedding models inside Vespa to turn text and other data into vectors at feed and query time
Architecture
Vespa consists of three main subsystems:
1. Stateless Container (Java)
- JDisc Core: Application execution model and protocol-independent request-response handling
- Search Middleware: Query/result APIs and query execution logic
- Document Operations: Document model and asynchronous message passing APIs
2. Content Nodes (C++)
- searchcore: Core functionality for indexes, matching, data storage, and content node server
- searchlib: Libraries for ranking, index implementations, and attributes (forward indexes)
- storage: Elastic and auto-recovering data storage system across clusters
- eval: Library for efficient evaluation of ranking expressions and tensor API
3. Configuration and Administration (Java)
- configserver: Server for application deployment and configuration requests
- config-model: System model based on deployed applications
- config: Client-side library for subscribing to and reading configurations
Pros and Cons
Pros
- Unified platform combining search, vector search, and real-time ML inference
- Enterprise-grade track record and reliability (used by Yahoo! Mail, News, etc.)
- Advanced scalability and fault tolerance
- Real-time data updates and ML model updates without downtime
- YQL, an SQL-like query language that integrates structured and unstructured data in one statement
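To illustrate the last point, a single YQL statement can mix a structured filter with full-text and vector conditions. The field names (`price`, `embedding`) are hypothetical:

```python
# Build a YQL statement combining a structured filter (price range),
# a full-text condition (userQuery), and an approximate nearest-neighbor
# clause. Field names here are illustrative.
yql = (
    "select * from sources * where "
    "price < 100 and "
    "(userQuery() or ({targetHits:100}nearestNeighbor(embedding, q)))"
)

# The statement is sent as the "yql" request parameter together with the
# user's text query and the query tensor (abbreviated here).
request_body = {
    "yql": yql,
    "query": "wireless headphones",
    "input.query(q)": "[0.1, 0.2, 0.3]",
}
print(request_body["yql"])
```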
Cons
- Steep learning curve with complex configuration requirements
- Overkill for small-scale projects
- High resource consumption and infrastructure costs
- Limited Japanese documentation
Deployment Options
Vespa Cloud (Recommended)
# Install Vespa CLI
brew install vespa-cli
# Point the CLI at Vespa Cloud and deploy
vespa config set target cloud
vespa auth login
vespa deploy --wait 300
Self-hosted (Docker)
# Start Vespa container (official image is vespaengine/vespa)
docker run --detach --name vespa --hostname vespa-container \
  --publish 8080:8080 --publish 19071:19071 \
  vespaengine/vespa
# Deploy application
vespa deploy --target local
Code Examples
Basic Vector Search Schema
# schemas/document.sd
schema document {
    document document {
        field title type string {
            indexing: summary | index
        }
        field embedding type tensor<float>(x[512]) {
            indexing: summary | attribute
            attribute {
                distance-metric: euclidean
            }
        }
    }
    fieldset default {
        fields: title
    }
    rank-profile default {
        inputs {
            query(q) tensor<float>(x[512])
        }
        first-phase {
            expression: closeness(field, embedding)
        }
    }
}
Python Client Data Operations
import requests

# Add a document (document/v1 API; "mynamespace" must match your application)
document = {
    "fields": {
        "title": "How to use Vespa",
        "embedding": [0.1] * 512  # 512-dimensional vector
    }
}
response = requests.post(
    "http://localhost:8080/document/v1/mynamespace/document/docid/1",
    json=document
)
response.raise_for_status()

# Perform vector search: closeness() in the rank profile only scores hits
# retrieved with the nearestNeighbor operator, so it must appear in the YQL.
query_vector = [0.2] * 512
query = {
    "yql": "select * from sources * where {targetHits:10}nearestNeighbor(embedding, q)",
    "input.query(q)": str(query_vector),
    "ranking.profile": "default"
}
search_response = requests.post("http://localhost:8080/search/", json=query)
results = search_response.json()

for hit in results["root"].get("children", []):
    print(f"Title: {hit['fields']['title']}")
    print(f"Relevance: {hit['relevance']}")
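Query tensors can be passed as a string in Vespa's dense-tensor literal form. A small helper for this (hypothetical, not part of any Vespa client library):

```python
def to_tensor_literal(values):
    """Render a list of floats as a Vespa dense-tensor literal string,
    e.g. [0.1, 0.25, 0.5] -> "[0.1, 0.25, 0.5]"."""
    return "[" + ", ".join(str(v) for v in values) + "]"

# Usage: pass the literal as the query-tensor input parameter.
params = {"input.query(q)": to_tensor_literal([0.1, 0.25, 0.5])}
print(params["input.query(q)"])  # → [0.1, 0.25, 0.5]
```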
Hybrid Search Example
import requests

# Combine text and vector search. The "hybrid" rank profile must be
# defined in the schema (e.g. combining bm25 with closeness).
hybrid_query = {
    "yql": "select * from sources * where userQuery() or "
           "({targetHits:100}nearestNeighbor(embedding, q))",
    "query": "Vespa vector search",
    "input.query(q)": str([0.15] * 512),
    "ranking.profile": "hybrid"
}
response = requests.post("http://localhost:8080/search/", json=hybrid_query)
Machine Learning Model Integration
# Use an ONNX model in a ranking profile (in the schema .sd file)
rank-profile ml_ranking {
    inputs {
        query(user_embedding) tensor<float>(x[128])
    }
    onnx-model recommendation_model {
        file: models/recommendation.onnx
        input "user_features": query(user_embedding)
        input "item_features": attribute(item_embedding)
        output "score": score
    }
    first-phase {
        expression: onnx(recommendation_model).score
    }
}
Major Use Cases
- Search Applications: Large-scale services like Yahoo! Mail and News
- Recommendation Systems: Personalization systems like Spotify
- E-commerce Search: Product search and recommendation systems
- Content Moderation: Large-scale content filtering
- Real-time Analytics: Continuous analysis and aggregation of large data volumes
Summary
Vespa is a world-class search platform backed by years of production experience at Yahoo. More than just a vector database, it provides a comprehensive solution that integrates AI, data, and search. For organizations considering large-scale enterprise deployments, its proven track record and reliability make it a highly attractive choice.