GitHub Overview
typesense/typesense
Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences
What is Typesense
Typesense is an open-source alternative to Algolia + Pinecone and an easier-to-use alternative to ElasticSearch. It's a fast, typo-tolerant, in-memory fuzzy search engine designed for building delightful search experiences, and it is particularly optimized for real-time, search-as-you-type use cases.
Key Features
High-Speed Search Engine
- Sub-50ms Response Times: Speed-engineered design providing smooth and responsive search experiences
- Memory-First Architecture: Optimized C++ data structures with asynchronous disk persistence model
- High Query Throughput: Designed to sustain heavy concurrent query loads, even on large datasets
- Single Binary: Lightweight, self-contained native binary for simple setup and operation
Typo Tolerance and Flexibility
- Built-in Typo Tolerance: Automatically handles misspellings and typos, improving search accuracy and user satisfaction
- Adaptive Radix Tree (ART) Index: Fuzzy matching calculates edit distances between query tokens and candidates
- Tunable Ranking: Configure ranking rules to tailor search results
- Real-time Settings: Flexibly change settings at search time via query parameters
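To make the typo-tolerance mechanism concrete: the edit distance mentioned above is the classic Levenshtein distance. The sketch below is purely illustrative — Typesense's real index is an Adaptive Radix Tree implemented in C++ — but it shows the kind of query-token-to-candidate comparison that fuzzy matching performs.

```javascript
// Illustrative sketch only: computes the Levenshtein edit distance that
// underlies typo tolerance (Typesense's actual index is a C++ ART).
function editDistance(a, b) {
  // dp[i][j] = edits needed to turn a[0..i) into b[0..j)
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,     // deletion
        dp[i][j - 1] + 1,     // insertion
        dp[i - 1][j - 1] + cost // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// Tokens with one or two typos still fall within a typical tolerance of 2 edits.
console.log(editDistance('machne', 'machine'));   // 1
console.log(editDistance('learnign', 'learning')); // 2
```

A candidate token is accepted as a match when this distance is at or below the configured number of typos (the `num_typos` search parameter).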
Vector Search Capabilities
- HNSW Algorithm: Efficient approximate nearest neighbor search in high-dimensional vector spaces
- Vector Query Syntax: Queries use the `field_name:([vector_values], k: number_of_results)` format
- Hybrid Search: Combine text search and vector search for more relevant results with semantic similarity and keyword matching
- Distance Threshold Filtering: Limit results by cosine distance in the range 0.0 to 2.0
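The 0.0-to-2.0 range of the distance threshold follows directly from the definition of cosine distance: 1 minus cosine similarity. A minimal sketch of that computation:

```javascript
// Cosine distance as used by a distance-threshold filter: 1 - cosine similarity.
// Identical directions give 0, orthogonal vectors give 1, opposite directions give 2.
function cosineDistance(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineDistance([1, 0], [1, 0]));  // 0 (same direction)
console.log(cosineDistance([1, 0], [0, 1]));  // 1 (orthogonal)
console.log(cosineDistance([1, 0], [-1, 0])); // 2 (opposite)
```

Setting a `distance_threshold` of, say, 0.4 therefore keeps only results whose embeddings point in roughly the same direction as the query vector.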
Advanced Search Features
- Faceting and Filtering: Drill down and refine results based on field values and aggregations
- Geosearch: Location-based search capabilities
- Synonyms: Define words as equivalents to improve search precision
- Curation: Boost specific records to fixed positions for featured content
- Voice Search: Capture voice queries and transcribe them with a Whisper model to produce search results
Recently Added Features (2025)
- JOIN Functionality: Connect multiple collections via common reference fields for SQL-like relationship modeling
- Scoped API Keys: Generate API keys allowing access only to certain records for multi-tenant applications
- Raft-based Clustering: Set up highly available distributed clusters
- InstantSearch.js Compatibility: Direct compatibility with Algolia's JavaScript library
Pros and Cons
Pros
- Self-hostable open-source solution
- Single index can handle sorting (Algolia requires separate indices per sort order)
- Cost-efficient with fixed cluster pricing instead of per-record or per-operation charges
- Dynamic configuration of fields, facets, and ranking settings at search time
- No runtime dependencies for simple setup and operation
- InstantSearch.js compatibility allows reuse of existing frontend code
Cons
- Missing Algolia's personalization and server-based search analytics features
- Limited advanced ML/AI features compared to Algolia or Elasticsearch
- Relatively new project with limited enterprise adoption cases
- GPL-3.0 license restrictions require attention for commercial use
Key Links
- Official Website
- GitHub Repository
- Documentation
- Typesense Cloud
- Downloads
- InstantSearch Adapter
- Service Comparisons
Installation
Docker Quick Start
# Start Typesense server
docker run -p 8108:8108 -v /tmp/data:/data typesense/typesense:28.0 \
  --data-dir /data --api-key=Hu52dwsas2AdxdE
Binary Package Installation
# Linux (x86_64)
wget https://dl.typesense.org/releases/28.0/typesense-server-28.0-linux-amd64.tar.gz
tar -xzf typesense-server-28.0-linux-amd64.tar.gz
# Start server
./typesense-server --data-dir=/tmp/data --api-key=your-api-key
Typesense Cloud (Managed Service)
# Create cloud cluster
# Visit https://cloud.typesense.org/ to create a cluster
Code Examples
Basic Text Search
const Typesense = require('typesense');

// Initialize client
const client = new Typesense.Client({
  'nodes': [{
    'host': 'localhost',
    'port': '8108',
    'protocol': 'http'
  }],
  'apiKey': 'Hu52dwsas2AdxdE',
  'connectionTimeoutSeconds': 2
});

// Create collection
const booksSchema = {
  'name': 'books',
  'fields': [
    {'name': 'title', 'type': 'string'},
    {'name': 'authors', 'type': 'string[]', 'facet': true},
    {'name': 'publication_year', 'type': 'int32', 'facet': true},
    {'name': 'rating', 'type': 'float'}
  ],
  'default_sorting_field': 'rating'
};
await client.collections().create(booksSchema);

// Add documents
const documents = [
  {
    'title': 'Introduction to Machine Learning',
    'authors': ['John Smith'],
    'publication_year': 2023,
    'rating': 4.5
  },
  {
    'title': 'Deep Learning Fundamentals',
    'authors': ['Jane Doe'],
    'publication_year': 2024,
    'rating': 4.8
  }
];
await client.collections('books').documents().import(documents);

// Perform search
const searchResults = await client.collections('books').documents().search({
  'q': 'machine learning',
  'query_by': 'title',
  'facet_by': 'authors,publication_year'
});
console.log(searchResults);
Vector Search Example
// Vector field collection schema
const vectorSchema = {
  'name': 'products',
  'fields': [
    {'name': 'name', 'type': 'string'},
    {'name': 'description', 'type': 'string'},
    {'name': 'embedding', 'type': 'float[]', 'num_dim': 512}
  ]
};
await client.collections().create(vectorSchema);

// Add documents with vector data
const vectorDocuments = [
  {
    'name': 'MacBook Pro',
    'description': 'High-performance laptop',
    'embedding': Array(512).fill(0).map(() => Math.random())
  },
  {
    'name': 'iPad Air',
    'description': 'Lightweight tablet',
    'embedding': Array(512).fill(0).map(() => Math.random())
  }
];
await client.collections('products').documents().import(vectorDocuments);

// Perform vector search
const queryVector = Array(512).fill(0).map(() => Math.random());
const vectorResults = await client.collections('products').documents().search({
  'q': '*',
  'vector_query': `embedding:([${queryVector.join(',')}], k: 5)`,
  'query_by': 'name'
});
console.log('Vector search results:', vectorResults);
Hybrid Search Example
// Combine text and vector search for hybrid results.
// Note: 'category' and 'price' are not part of the products schema above;
// add them to the collection before using these facet/filter parameters.
const hybridResults = await client.collections('products').documents().search({
  'q': 'MacBook',
  'query_by': 'name,description',
  'vector_query': `embedding:([${queryVector.join(',')}], k: 10, alpha: 0.7)`,
  'facet_by': 'category',
  'filter_by': 'price:>1000'
});
console.log('Hybrid search results:', hybridResults);
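How does `alpha` shape the hybrid result order? Typesense's documentation describes hybrid ranking as a rank fusion of the keyword and vector result lists, with `alpha` weighting the vector side. The exact server-side formula is not reproduced here; the sketch below uses weighted reciprocal ranks as a plausible approximation to build intuition.

```javascript
// Simplified rank-fusion sketch: blends two ranked ID lists, weighting the
// vector ranking by `alpha` and the keyword ranking by (1 - alpha).
// This is an illustrative approximation, not Typesense's server code.
function fuseRanks(keywordRanked, vectorRanked, alpha) {
  const scores = new Map();
  keywordRanked.forEach((id, i) => {
    scores.set(id, (scores.get(id) || 0) + (1 - alpha) * (1 / (i + 1)));
  });
  vectorRanked.forEach((id, i) => {
    scores.set(id, (scores.get(id) || 0) + alpha * (1 / (i + 1)));
  });
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}

// With alpha = 0.7, the vector ranking dominates the fused order.
console.log(fuseRanks(['ipad', 'macbook'], ['macbook', 'ipad'], 0.7));
// -> ['macbook', 'ipad']
```

Lower `alpha` values shift the blend toward exact keyword matches; higher values favor semantic similarity.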
Python Usage Example
import typesense

# Initialize client
client = typesense.Client({
    'nodes': [{
        'host': 'localhost',
        'port': '8108',
        'protocol': 'http'
    }],
    'api_key': 'Hu52dwsas2AdxdE',
    'connection_timeout_seconds': 2
})

# Create collection
schema = {
    "name": "articles",
    "fields": [
        {"name": "title", "type": "string"},
        {"name": "content", "type": "string"},
        {"name": "vector", "type": "float[]", "num_dim": 128}
    ]
}
client.collections.create(schema)

# Add documents
documents = [
    {
        "title": "The Future of AI",
        "content": "Development and future prospects of artificial intelligence technology",
        "vector": [0.1] * 128
    },
    {
        "title": "Machine Learning Algorithms",
        "content": "Mathematical models that learn patterns from data",
        "vector": [0.2] * 128
    }
]
client.collections['articles'].documents.import_(documents)

# Hybrid search
query_vector = [0.15] * 128
results = client.collections['articles'].documents.search({
    'q': 'AI machine learning',
    'query_by': 'title,content',
    'vector_query': f'vector:([{",".join(map(str, query_vector))}], k: 5, alpha: 0.6)'
})

for hit in results['hits']:
    print(f"Title: {hit['document']['title']}")
    print(f"Score: {hit['text_match']}")
    print("---")
Major Use Cases
- E-commerce Sites: Product search and filtering
- Document Search: Internal knowledge bases and help centers
- Content Management Systems: Blog and news site search
- Media Libraries: Image and video metadata search
- Enterprise Search: Unified search across databases and file systems
- Semantic Search: Chatbots and Q&A systems
Summary
Typesense is an excellent open-source alternative to Algolia and Pinecone. It integrates high-speed performance, typo tolerance, and vector search capabilities while being self-hostable and cost-efficient. Particularly attractive for small to medium-scale projects seeking to easily implement high-quality search experiences without the complexity and cost of enterprise solutions.