GitHub Overview
facebookresearch/faiss
A library for efficient similarity search and clustering of dense vectors.
What is FAISS
FAISS (Facebook AI Similarity Search) is a library for efficient similarity search and clustering of dense vectors developed by Meta's Fundamental AI Research group. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It features diverse algorithms and GPU acceleration capabilities.
Key Features
High-Performance GPU Acceleration
- Speed: A single GPU is typically 5-10x faster than the corresponding CPU implementation
- Multi-GPU Support: Near linear speedup with multiple GPUs (6-7x speedup with 8 GPUs)
- CUDA/ROCm Support: Compatible with both NVIDIA CUDA and AMD ROCm
- Memory Efficient: GPU resources reserve scratch memory up front, so searches avoid repeated device allocations
2025 Latest Features: NVIDIA cuVS Integration
- cuVS Integration: NVIDIA cuVS integrated into FAISS v1.10 for further acceleration
- IVF Index Performance: Up to 4.7x faster build times and up to 8.1x lower search latency
- CAGRA Graph Index: Up to 12.3x faster build times and up to 4.7x lower search latency compared to CPU HNSW
- Flexible Backend Selection: Conda packages allow choosing between classic FAISS GPU and NVIDIA cuVS algorithms
Comprehensive Algorithm Support
- Flat Index: Linear search with raw vectors
- IVF (Inverted File): Clustering-based fast search
- PQ (Product Quantization): Vector compression for memory efficiency
- HNSW (Hierarchical Navigable Small World): Graph-based fast approximate nearest neighbor search
- LSH (Locality Sensitive Hashing): Hash-based approximate search
- Scalar Quantization: Stores each vector component in fewer bits (for example 8 instead of 32), trading some precision for memory and speed
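To make the precision-versus-size trade-off of scalar quantization concrete, here is a minimal pure-Python sketch of uniform 8-bit quantization. It is illustrative only: the helper names `sq8_encode` and `sq8_decode` are hypothetical, not FAISS APIs, and the sketch makes no claim about how FAISS implements its scalar quantizer internally.

```python
def sq8_encode(values, lo, hi):
    """Map floats in [lo, hi] to integer codes 0..255 (one byte each)."""
    scale = (hi - lo) / 255.0 or 1.0  # avoid division by zero when hi == lo
    return [min(255, max(0, round((v - lo) / scale))) for v in values]


def sq8_decode(codes, lo, hi):
    """Map codes back to approximate floats."""
    scale = (hi - lo) / 255.0 or 1.0
    return [lo + c * scale for c in codes]


vector = [0.0, 0.12, 0.31, 0.57, 0.99, 1.0]
lo, hi = min(vector), max(vector)

codes = sq8_encode(vector, lo, hi)
restored = sq8_decode(codes, lo, hi)

# Rounding to the nearest level bounds the error by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(vector, restored))
half_step = (hi - lo) / 255.0 / 2
print(codes)
print(f"max reconstruction error: {max_error:.5f} (bound: {half_step:.5f})")
```

Each component shrinks from 4 bytes (float32) to 1 byte, at the cost of a small, bounded reconstruction error.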
GPU Index Types
- GpuIndexFlat: GPU version of flat index
- GpuIndexIVFFlat: GPU version of IVF flat index
- GpuIndexIVFPQ: GPU version of IVF + PQ index
- GpuIndexIVFScalarQuantizer: GPU version of IVF + scalar quantization
- GpuIndexCagra: NVIDIA-contributed GPU-specific graph-based index
Multi-Language Support
- C++ Core: High-performance native implementation
- Python Wrapper: Complete integration with NumPy
- Evaluation Tools: Supporting code for parameter tuning and evaluation
Pros and Cons
Pros
- Industry-leading search speed and GPU optimization
- High flexibility with diverse algorithms and indexing methods
- Can handle large datasets exceeding RAM capacity
- Backed by continuous development and practical experience at Meta (Facebook)
- Rich documentation and community support
- Optimal for specialized use cases requiring algorithm flexibility and maximum control
Cons
- A vector search library rather than a full database; storage, serving, and metadata handling must be built around it
- Index selection and parameter tuning involve a steep learning curve
- GPU acceleration depends on compatible hardware and adds infrastructure cost
Key Links
- Official Website
- GitHub Repository
- Documentation
- Meta AI Tools Page
- GPU Support Wiki
- LangChain Integration
- Pinecone Tutorial
Installation
Conda Installation (Recommended)
# CPU version
conda install -c pytorch faiss-cpu
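# GPU version (classic FAISS GPU; the nvidia channel supplies the CUDA dependencies)
conda install -c pytorch -c nvidia faiss-gpu
# GPU version (with NVIDIA cuVS integration; also needs the rapidsai and conda-forge channels)
# See INSTALL.md in the repository for the exact command for your CUDA version
conda install -c pytorch -c nvidia -c rapidsai -c conda-forge faiss-gpu-cuvs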
pip Installation
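# CPU version (community-maintained wheel; the FAISS project itself ships conda packages)
pip install faiss-cpu
# GPU version (the faiss-gpu wheel on PyPI may lag behind current releases; check PyPI before relying on it)
pip install faiss-gpu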
Build from Source
# Clone repository
git clone https://github.com/facebookresearch/faiss.git
cd faiss
# Configure with CMake
cmake -B build -DFAISS_ENABLE_PYTHON=ON -DFAISS_ENABLE_GPU=ON
# Build the C++ library, then the Python bindings
make -C build -j faiss
make -C build -j swigfaiss
# Install Python wrapper
cd build/faiss/python
python setup.py install
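After any of the installation routes above, a quick sanity check can confirm what was actually installed. The following is a minimal sketch: the helper name `describe_faiss_install` is hypothetical, and it relies on the assumption that CPU-only builds do not expose GPU classes such as `StandardGpuResources`.

```python
import importlib.util


def describe_faiss_install():
    """Report whether faiss is importable and whether GPU classes are present."""
    if importlib.util.find_spec("faiss") is None:
        return {"installed": False, "version": None, "gpu_build": False}

    import faiss

    return {
        "installed": True,
        "version": getattr(faiss, "__version__", None),
        # Assumption: CPU-only builds do not expose the GPU classes.
        "gpu_build": hasattr(faiss, "StandardGpuResources"),
    }


info = describe_faiss_install()
print(info)
```

The check only inspects the Python module; it does not verify that a GPU is present or usable at runtime.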
Code Examples
Basic Vector Search
import numpy as np
import faiss
# Generate random data
d = 64 # vector dimension
nb = 100000 # database size
nq = 10000 # number of queries
np.random.seed(1234) # for reproducibility
xb = np.random.random((nb, d)).astype('float32')
xb[:, 0] += np.arange(nb) / 1000.
xq = np.random.random((nq, d)).astype('float32')
xq[:, 0] += np.arange(nq) / 1000.
# Create index and add data
index = faiss.IndexFlatL2(d) # L2 distance flat index
print(f"Index is trained: {index.is_trained}")
index.add(xb) # add vectors to index
print(f"Index contains {index.ntotal} vectors")
# Perform search
k = 4 # search for 4 nearest neighbors
D, I = index.search(xq, k) # perform search
print(f"Neighbors of the first 5 queries:\n{I[:5]}")
print(f"Corresponding distances:\n{D[:5]}")
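To show what a flat L2 index computes, here is a NumPy-only sketch of exhaustive search. It returns squared Euclidean distances, which is also what `IndexFlatL2` reports. The function name `brute_force_l2` is hypothetical, and the sketch ignores the batching and SIMD optimizations a real implementation would use.

```python
import numpy as np


def brute_force_l2(xb, xq, k):
    """Exhaustive k-nearest-neighbor search using squared L2 distance."""
    # ||q - b||^2 = ||q||^2 - 2 q.b + ||b||^2, computed for all pairs at once.
    sq_dists = (
        (xq ** 2).sum(axis=1, keepdims=True)
        - 2.0 * xq @ xb.T
        + (xb ** 2).sum(axis=1)
    )
    # Indices of the k smallest distances per query, sorted ascending.
    ids = np.argsort(sq_dists, axis=1)[:, :k]
    dists = np.take_along_axis(sq_dists, ids, axis=1)
    return dists, ids


rng = np.random.default_rng(1234)
xb = rng.random((1000, 16), dtype=np.float32)
xq = xb[:5].copy()  # query with vectors that are already in the database

D, I = brute_force_l2(xb, xq, k=4)
print(I[:, 0])  # each query's nearest neighbor should be itself
```

Because every query is compared against every database vector, the cost grows linearly with the database size, which is the motivation for the approximate indexes that follow.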
GPU Acceleration Example
import faiss
import numpy as np
# Setup GPU resources
res = faiss.StandardGpuResources() # initialize GPU resources
# Prepare data
d = 64
nb = 100000
nq = 1000
xb = np.random.random((nb, d)).astype('float32')
xq = np.random.random((nq, d)).astype('float32')
# Move CPU index to GPU
index_flat = faiss.IndexFlatL2(d)
gpu_index = faiss.index_cpu_to_gpu(res, 0, index_flat)
# Add data and search
gpu_index.add(xb)
D, I = gpu_index.search(xq, 5)
print(f"GPU search completed. First result: {I[0]}")
IVF Index for Fast Search
import faiss
import numpy as np
# Prepare data
d = 64
nb = 100000
nlist = 100 # number of clusters
k = 4
np.random.seed(1234)
xb = np.random.random((nb, d)).astype('float32')
xq = np.random.random((10000, d)).astype('float32')
# Create IVF index
quantizer = faiss.IndexFlatL2(d) # quantizer for clustering
index = faiss.IndexIVFFlat(quantizer, d, nlist)
# Train the index
assert not index.is_trained
index.train(xb)
assert index.is_trained
# Add data
index.add(xb)
print(f"Added {index.ntotal} vectors to index")
# Adjust search parameters
index.nprobe = 10 # number of clusters to search
D, I = index.search(xq, k)
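# Compare against exact results from a flat index
flat = faiss.IndexFlatL2(d)
flat.add(xb)
_, I_exact = flat.search(xq, k)
recall_at_1 = (I[:, 0] == I_exact[:, 0]).mean()
print(f"IVF search completed. Recall@1 vs exact: {recall_at_1:.3f}")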
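To illustrate what `nlist` and `nprobe` control, here is a toy inverted-file search in NumPy. It is a sketch under simplifying assumptions: randomly chosen database vectors stand in for trained k-means centroids, and the helper names `build_ivf` and `ivf_search` are hypothetical rather than FAISS APIs.

```python
import numpy as np


def build_ivf(xb, centroids):
    """Assign each database vector to its nearest centroid (one list per centroid)."""
    d2 = ((xb[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    assign = d2.argmin(axis=1)
    return [np.flatnonzero(assign == c) for c in range(len(centroids))]


def ivf_search(xb, centroids, lists, q, k, nprobe):
    """Scan only the nprobe inverted lists whose centroids are closest to q."""
    cd2 = ((centroids - q) ** 2).sum(axis=1)
    probed = np.argsort(cd2)[:nprobe]
    candidates = np.concatenate([lists[c] for c in probed])
    d2 = ((xb[candidates] - q) ** 2).sum(axis=1)
    order = np.argsort(d2)[:k]
    return candidates[order], len(candidates)


rng = np.random.default_rng(1234)
xb = rng.random((2000, 8), dtype=np.float32)
nlist = 20
# Assumption: random database vectors stand in for k-means centroids.
centroids = xb[rng.choice(len(xb), nlist, replace=False)]
lists = build_ivf(xb, centroids)

q = rng.random(8, dtype=np.float32)
for nprobe in (1, 5, nlist):
    ids, scanned = ivf_search(xb, centroids, lists, q, k=4, nprobe=nprobe)
    print(f"nprobe={nprobe:2d}: scanned {scanned} of {len(xb)} vectors, ids={ids}")
```

A larger `nprobe` scans more vectors and moves the result closer to the exact answer; with `nprobe` equal to `nlist` the search becomes exhaustive.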
Product Quantization for Memory Savings
import faiss
import numpy as np
# Prepare data
d = 64
nb = 100000
nlist = 100
m = 8 # number of PQ sub-vectors (d must be divisible by m)
k = 4 # number of nearest neighbors to return
np.random.seed(1234)
xb = np.random.random((nb, d)).astype('float32')
xq = np.random.random((10000, d)).astype('float32')
# Create IVF + PQ index
quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFPQ(quantizer, d, nlist, m, 8)
# 8 specifies that each sub-vector is encoded as 8 bits
# Train and add data
index.train(xb)
index.add(xb)
# Search with parameter adjustment
index.nprobe = 10
D, I = index.search(xq, k)
print(f"PQ search completed. Each vector is stored in {index.pq.code_size} bytes instead of {d * 4}.")
print(f"Index size: {index.ntotal} vectors")
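The memory saving can be estimated with simple arithmetic. The sketch below uses the same `d`, `nb`, and `m` as the example and counts only the stored codes; it leaves out vector IDs, inverted-list overhead, and the PQ codebooks, so real index sizes will be somewhat larger. The helper name `code_bytes_per_vector` is hypothetical.

```python
def code_bytes_per_vector(d, m=None, nbits=8):
    """Bytes needed to store one vector: raw float32 if m is None, else PQ codes."""
    if m is None:
        return d * 4  # float32 = 4 bytes per component
    return (m * nbits + 7) // 8  # m sub-quantizer codes of nbits each, rounded up


d, nb, m, nbits = 64, 100_000, 8, 8

flat_total = nb * code_bytes_per_vector(d)
pq_total = nb * code_bytes_per_vector(d, m, nbits)

print(f"float32 vectors: {flat_total / 1e6:.1f} MB")
print(f"PQ codes (m={m}, nbits={nbits}): {pq_total / 1e6:.1f} MB")
print(f"compression ratio: {flat_total // pq_total}x")
```

With these settings each 256-byte vector is reduced to an 8-byte code, a 32x reduction in code storage.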
Multi-GPU Search
import faiss
import numpy as np
# Check available GPUs
ngpus = faiss.get_num_gpus()
print(f"Number of GPUs: {ngpus}")
if ngpus > 1:
    # Prepare data
    d = 64
    nb = 100000
    xb = np.random.random((nb, d)).astype('float32')
    xq = np.random.random((1000, d)).astype('float32')
    # Create multi-GPU index
    cpu_index = faiss.IndexFlatL2(d)
    gpu_index = faiss.index_cpu_to_all_gpus(cpu_index)
    # Add data and search
    gpu_index.add(xb)
    D, I = gpu_index.search(xq, 5)
    print(f"Multi-GPU search completed on {ngpus} GPUs")
else:
    print("Multi-GPU example requires more than 1 GPU")
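By default, `index_cpu_to_all_gpus` replicates the index on every GPU; a sharded layout, where each GPU holds part of the database, can be requested through `GpuMultipleClonerOptions` with `shard = True`. In the sharded case the per-GPU results have to be merged into one global top-k list. FAISS does this internally; the pure-Python sketch below only illustrates the idea, with a hypothetical helper name and made-up distances and IDs.

```python
import heapq


def merge_shard_results(shard_results, k):
    """Merge per-shard (distance, id) lists into one global top-k list."""
    # Each shard already returns its own best candidates; the global top-k
    # is the k smallest distances across all of them.
    all_candidates = (pair for shard in shard_results for pair in shard)
    return heapq.nsmallest(k, all_candidates)


# Hypothetical results from two shards, each holding half of the database.
shard_0 = [(0.10, 17), (0.42, 3), (0.77, 25)]
shard_1 = [(0.05, 1042), (0.50, 1007), (0.90, 1999)]

top3 = merge_shard_results([shard_0, shard_1], k=3)
print(top3)
```

Replication raises query throughput, while sharding lets the combined GPU memory hold a larger database.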
HNSW Index Example
import faiss
import numpy as np
# HNSW parameters
d = 64
M = 16 # number of connections
ef_construction = 200 # search width during construction
# Prepare data
nb = 10000
nq = 100
np.random.seed(1234)
xb = np.random.random((nb, d)).astype('float32')
xq = np.random.random((nq, d)).astype('float32')
# Create HNSW index
index = faiss.IndexHNSWFlat(d, M)
index.hnsw.efConstruction = ef_construction
# Add data
index.add(xb)
# Adjust search parameters
index.hnsw.efSearch = 50
D, I = index.search(xq, 5)
print(f"HNSW search completed. Results: {I[:3]}")
Major Use Cases
- Recommendation Systems: Similar product and content recommendations
- Image/Video Search: Similarity search for multimedia content
- Natural Language Processing: Similarity search for document embeddings
- Bioinformatics: Similarity analysis of DNA and protein sequences
- Finance: Risk management and similar trading pattern search
- Anomaly Detection: Outlier detection in high-dimensional data
- Clustering: Grouping and classification of large-scale data
Summary
FAISS is a widely used vector similarity search library, backed by years of research and practical experience at Meta. With GPU acceleration and a broad set of index types, it suits applications that need fast search over large-scale data. The NVIDIA cuVS integration introduced in FAISS v1.10 adds further GPU performance gains. For specialized use cases that need algorithmic flexibility and fine-grained control, FAISS is a strong choice.