What is Marqo
Marqo is more than a vector database: it is an end-to-end vector search engine for both text and images. Vector generation, storage, and retrieval are handled out of the box through a single API, so you do not need to bring your own embeddings. That makes it developer-friendly by design.
Key Features
Unified Architecture
- All-in-One Solution: Handles everything from vector generation to search in one integrated system
- Multimodal Support: Works with both text and images
- Automatic Embedding Generation: No pre-embedding preparation required
- Document-Level Abstraction: Treats data as documents rather than pure vectors
Advanced Search Capabilities
- Complex Semantic Queries: Build queries with weighted search terms
- Filtering: Filter results using Marqo's query DSL
- Searchable Attributes: Limit search to specific fields
- Hybrid Search: Supports tensor, lexical, and hybrid search types
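The options above correspond to keyword arguments of the Python client's `search` method (`q`, `filter_string`, `searchable_attributes`, `search_method`). The sketch below collects them in a plain helper so the request can be inspected before it is sent. The helper names `build_search_kwargs` and `run_search`, the filter field `Genre`, and the query terms are illustrative assumptions, not part of Marqo.

```python
from typing import Any, Dict, List, Optional, Union

# A query is either plain text or a mapping of search terms to weights.
Query = Union[str, Dict[str, float]]

SEARCH_METHODS = {"TENSOR", "LEXICAL", "HYBRID"}


def build_search_kwargs(
    q: Query,
    filter_string: Optional[str] = None,
    searchable_attributes: Optional[List[str]] = None,
    search_method: str = "TENSOR",
    limit: int = 10,
) -> Dict[str, Any]:
    """Collect search options into keyword arguments (hypothetical helper)."""
    method = search_method.upper()
    if method not in SEARCH_METHODS:
        raise ValueError(f"search_method must be one of {sorted(SEARCH_METHODS)}")
    kwargs: Dict[str, Any] = {"q": q, "search_method": method, "limit": limit}
    if filter_string:
        kwargs["filter_string"] = filter_string
    if searchable_attributes:
        kwargs["searchable_attributes"] = list(searchable_attributes)
    return kwargs


# Weighted query: positive weights pull results toward a term,
# negative weights push results away from it.
weighted = build_search_kwargs(
    q={"spacesuit for the moon": 1.0, "medieval travel": -0.5},
    filter_string="Genre:science",  # assumes documents carry a Genre field
    searchable_attributes=["Description"],
)
print(weighted)


def run_search(index_name: str, kwargs: Dict[str, Any], url: str = "http://localhost:8882"):
    """Send the request; needs `pip install marqo` and a running Marqo server."""
    import marqo  # imported lazily so the helper above works without the client

    return marqo.Client(url=url).index(index_name).search(**kwargs)
```

Keeping the argument assembly separate from the network call makes it easy to log or unit-test the exact request, for example `run_search("movies-index", weighted)` once a server is available.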
Latest Features (2024-2025)
- Stella Embedding Model Support: Supports high-performance models like stella_en_400M_v5
- FFmpeg-CUDA Integration: GPU acceleration for video processing (up to 5x faster)
- Video/Audio File Size Limits: Configurable file size restrictions
- Python 3.9 Support: Enhanced security and compatibility
Pros and Cons
Pros
- No need to manage embeddings manually
- Simple API design with low learning curve
- High flexibility with multimodal support
- Available on both cloud and on-premises
- Active development and community support
Cons
- May have limited features compared to pure vector databases for specific use cases
- Can be overkill for simple vector search needs
- Custom embedding model flexibility may be restricted in some cases
Key Links
- Official Website
- GitHub Repository
- Documentation
- Python Client
- Kubernetes Support
- Getting Started Tutorial
Installation
Using Docker
# Pull the Docker image
docker pull marqoai/marqo:latest
# Remove existing container (if needed)
docker rm -f marqo
# Run Marqo container
docker run --name marqo -it -p 8882:8882 marqoai/marqo:latest
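The container may need some time to load its models before it accepts requests. Below is a minimal readiness check using only the standard library, assuming the server answers a plain HTTP GET at the root of the published address (`http://localhost:8882`). The function names `http_probe` and `wait_until_ready` and the timeout values are illustrative choices.

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str) -> bool:
    """Return True if the URL answers with an HTTP 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return 200 <= response.status < 300
    except OSError:  # covers URLError, HTTPError, refused connections, timeouts
        return False


def wait_until_ready(
    url: str = "http://localhost:8882",
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    probe: Callable[[str], bool] = http_probe,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the server until it responds or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Calling `wait_until_ready()` before creating a client avoids connection errors in scripts that start the container and index documents in one go. The `probe` and `sleep` parameters exist only so the loop can be exercised without a server.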
Python Client Installation
pip install marqo
Code Examples
Basic Usage
import marqo
# Create a Marqo client
mq = marqo.Client(url='http://localhost:8882')
# Create an index
mq.create_index("movies-index", model="hf/e5-base-v2")
# Add documents
mq.index("movies-index").add_documents([
    {
        "Title": "The Travels of Marco Polo",
        "Description": "A 13th-century travelogue describing Polo's travels"
    },
    {
        "Title": "Extravehicular Mobility Unit (EMU)",
        "Description": "The EMU is a spacesuit that provides environmental protection",
        "_id": "article_591"
    }
], tensor_fields=["Description"])
# Search the index
results = mq.index("movies-index").search(
    q="What is the best outfit to wear on the moon?",
    searchable_attributes=["Description"]
)
# Print results
for result in results['hits']:
    print(f"Title: {result['Title']}")
    print(f"Description: {result['Description']}")
    print(f"Score: {result['_score']}")
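Each hit is a dictionary holding the document's fields plus metadata such as `_id` and `_score`, which is what the loop above relies on. Here is a small sketch of post-processing such a response without a server. `top_hits` is a hypothetical helper, and the sample response is hand-written for illustration with made-up scores, not real Marqo output.

```python
from typing import Any, Dict, List, Sequence


def top_hits(
    results: Dict[str, Any],
    min_score: float = 0.0,
    fields: Sequence[str] = ("Title",),
) -> List[Dict[str, Any]]:
    """Keep hits at or above min_score and project the requested fields."""
    selected = []
    for hit in results.get("hits", []):
        score = hit.get("_score", 0.0)
        if score < min_score:
            continue
        row = {"_id": hit.get("_id"), "_score": score}
        for name in fields:
            row[name] = hit.get(name)  # None when a document lacks the field
        selected.append(row)
    return selected


# Hand-written sample shaped like a search response (scores are invented).
sample = {
    "hits": [
        {"_id": "article_591", "Title": "Extravehicular Mobility Unit (EMU)", "_score": 0.82},
        {"_id": "doc_2", "Title": "The Travels of Marco Polo", "_score": 0.41},
    ]
}

for row in top_hits(sample, min_score=0.5):
    print(row)
```

Using `.get` instead of direct indexing keeps the loop from raising `KeyError` when documents in the same index do not share every field.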
Image Search Example
# Reuses the mq client created in Basic Usage
# Create an image index; without treat_urls_and_pointers_as_images=True,
# image URLs are embedded as plain text strings rather than as images
mq.create_index(
    "image-index",
    model="open_clip/ViT-B-32/openai",
    treat_urls_and_pointers_as_images=True
)
# Add images
mq.index("image-index").add_documents([
    {
        "image_url": "https://example.com/image1.jpg",
        "caption": "Beautiful sunset landscape"
    },
    {
        "image_url": "https://example.com/image2.jpg",
        "caption": "City nightscape"
    }
], tensor_fields=["image_url", "caption"])
# Search images with text
results = mq.index("image-index").search(
    q="sunset scenery"
)
Integrations and Ecosystem
LangChain Integration
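import marqo
from langchain_community.vectorstores import Marqo
# The LangChain wrapper takes a marqo.Client instance rather than a URL
client = marqo.Client(
    url="http://localhost:8882",
    api_key=""  # Optional for a local instance
)
# Use Marqo with LangChain (the index is assumed to exist already)
vectorstore = Marqo(client, index_name="langchain-demo")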
Haystack Integration
Marqo can be used as a Document Store for Haystack pipelines including retrieval-augmented generation (RAG), question answering, and document search.
Summary
Marqo simplifies end-to-end vector search. Because it generates embeddings itself and supports both text and images, developers can build search features without maintaining a separate embedding pipeline. It is available both as a cloud service and as a self-hosted solution, so it can accommodate projects of various scales.