OpenSearch
Open-source fork of Elasticsearch. Distributed search and analytics engine supporting log analysis, real-time monitoring, and security analytics. Originally developed under AWS leadership and governed by the Linux Foundation since 2024.
Overview
OpenSearch is an open-source distributed search and analytics engine that began as an AWS-led fork of Elasticsearch. In 2024 it moved to Linux Foundation governance and has grown into a large community project with more than 1,400 contributors and over 100 GitHub repositories. It offers vector search, hybrid search, and AI-assisted search, and integrates closely with AWS services.
Details
As of 2024, OpenSearch has an identity of its own rather than being only an Elasticsearch-compatible alternative. It includes Faiss integration (the similarity-search library from Facebook AI Research), SIMD hardware acceleration and vector quantization for high-performance semantic search, cross-cluster replication, trace analytics, data streams, transforms, and a new observability UI, along with significant improvements to k-NN, anomaly detection, PPL, SQL, and alerting. Native integration with AWS IAM, KMS, and CloudWatch makes it well suited to operation in AWS environments.
Key Features
- Vector & Hybrid Search: Next-generation search combining semantic and keyword search capabilities
- Distributed Architecture: Horizontal scaling and high availability through distributed design
- AWS Integration: Native integration with IAM, KMS, and CloudWatch
- Observability: Comprehensive trace analytics and monitoring capabilities
- AI & Machine Learning: Built-in anomaly detection and neural search features
- Real-time Analytics: Immediate analysis and alerting on streaming data
Pros and Cons
Pros
- Open governance and long-term stability under Linux Foundation management
- Excellent AWS environment integration with managed service (Amazon OpenSearch Service)
- Support for next-generation workloads through vector search and AI features
- Freedom from Elasticsearch licensing issues and true open-source nature
- Active community (1,400+ contributors) providing continuous development
- Rich library of AWS-authored plugins and features
Cons
- May underperform Elasticsearch in enterprise-scale or complex query scenarios
- Limited plugin ecosystem compared to Elasticsearch
- More mature toolchain available for Elasticsearch outside AWS environments
- Potential compatibility challenges when migrating from Elasticsearch
- Some advanced Elastic Stack features are not available
- Commercial support options are not as extensive as Elasticsearch
Reference Pages
- OpenSearch Official Website
- OpenSearch Documentation
- OpenSearch GitHub Repository
- Amazon OpenSearch Service
Code Examples
Setup and Installation
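# Run a single node with Docker (images from 2.12 onward require an initial admin password)
docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" -e "OPENSEARCH_INITIAL_ADMIN_PASSWORD=<custom-admin-password>" opensearchproject/opensearch:latest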
# Docker Compose cluster configuration
cat > docker-compose.yml << 'EOF'
version: '3'
services:
  opensearch-node1:
    image: opensearchproject/opensearch:latest
    container_name: opensearch-node1
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node1
      - discovery.seed_hosts=opensearch-node1,opensearch-node2
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
      # Required by images from 2.12 onward; read from the shell environment or a .env file
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_INITIAL_ADMIN_PASSWORD}
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - opensearch-data1:/usr/share/opensearch/data
    ports:
      - 9200:9200
      - 9600:9600
    networks:
      - opensearch-net
  opensearch-node2:
    image: opensearchproject/opensearch:latest
    container_name: opensearch-node2
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node2
      - discovery.seed_hosts=opensearch-node1,opensearch-node2
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_INITIAL_ADMIN_PASSWORD}
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - opensearch-data2:/usr/share/opensearch/data
    networks:
      - opensearch-net
  opensearch-dashboards:
    image: opensearchproject/opensearch-dashboards:latest
    container_name: opensearch-dashboards
    ports:
      - 5601:5601
    expose:
      - "5601"
    environment:
      OPENSEARCH_HOSTS: '["https://opensearch-node1:9200","https://opensearch-node2:9200"]'
    networks:
      - opensearch-net
volumes:
  opensearch-data1:
  opensearch-data2:
networks:
  opensearch-net:
EOF
docker-compose up -d
# Binary installation on Linux
wget https://artifacts.opensearch.org/releases/bundle/opensearch/2.11.1/opensearch-2.11.1-linux-x64.tar.gz
tar -xzf opensearch-2.11.1-linux-x64.tar.gz
cd opensearch-2.11.1
# Edit configuration file
vi config/opensearch.yml
# Start OpenSearch
./bin/opensearch
Index Creation and Document Management
# Create index
curl -X PUT "localhost:9200/movies" -H 'Content-Type: application/json' -d'
{
"settings": {
"index": {
"number_of_shards": 3,
"number_of_replicas": 1,
"knn": true
}
},
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "standard"
},
"overview": {
"type": "text",
"analyzer": "standard"
},
"genre": {
"type": "keyword"
},
"release_date": {
"type": "date"
},
"rating": {
"type": "float"
},
"location": {
"type": "geo_point"
},
"embedding": {
"type": "knn_vector",
"dimension": 512,
"method": {
"name": "hnsw",
"space_type": "l2",
"engine": "nmslib"
}
}
}
}
}'
# Add single document
curl -X POST "localhost:9200/movies/_doc/1" -H 'Content-Type: application/json' -d'
{
"title": "Avengers: Endgame",
"overview": "The epic conclusion to the Marvel Cinematic Universe",
"genre": ["Action", "Adventure", "Sci-Fi"],
"release_date": "2019-04-26",
"rating": 8.4,
"director": "Russo Brothers",
"studio": "Marvel Studios",
"location": {
"lat": 40.7589,
"lon": -73.9851
}
}'
# Bulk document addition
curl -X POST "localhost:9200/_bulk" -H 'Content-Type: application/json' -d'
{"index":{"_index":"movies","_id":"2"}}
{"title":"Your Name","overview":"A time-transcending youth love story","genre":["Animation","Romance","Drama"],"release_date":"2016-08-26","rating":8.4,"director":"Makoto Shinkai"}
{"index":{"_index":"movies","_id":"3"}}
{"title":"Parasite","overview":"Korean film depicting class disparity","genre":["Thriller","Drama","Comedy"],"release_date":"2019-05-30","rating":8.6,"director":"Bong Joon-ho"}
{"index":{"_index":"movies","_id":"4"}}
{"title":"Top Gun: Maverick","overview":"Tom Cruise sequel","genre":["Action","Drama"],"release_date":"2022-05-27","rating":8.3,"director":"Joseph Kosinski"}
'
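The bulk request body above is newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end. As a minimal sketch of how such a payload could be assembled programmatically, the helper below builds the same shape with the standard library only. The function name `build_bulk_payload` and the shortened sample documents are illustrative assumptions, not part of any OpenSearch client.

```python
import json


def build_bulk_payload(index, docs):
    """Build an NDJSON body for the _bulk API from a {doc_id: source} mapping."""
    lines = []
    for doc_id, source in docs.items():
        # Action line: tells OpenSearch which index and ID the next line belongs to.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(source))
    # The _bulk endpoint expects the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical, shortened versions of the sample movies.
movies = {
    "2": {"title": "Your Name", "rating": 8.4},
    "3": {"title": "Parasite", "rating": 8.6},
}
payload = build_bulk_payload("movies", movies)
print(payload)
```

The resulting string can be sent as the body of a POST to `_bulk`, for example with `curl --data-binary` so that the newlines are preserved.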
# Update document
curl -X POST "localhost:9200/movies/_update/1" -H 'Content-Type: application/json' -d'
{
"doc": {
"rating": 8.5,
"updated_at": "2024-01-15"
}
}'
# Delete document
curl -X DELETE "localhost:9200/movies/_doc/1"
Search Query Implementation
# Basic search
curl -X GET "localhost:9200/movies/_search?q=Avengers"
# Structured search
curl -X GET "localhost:9200/movies/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"title": "Avengers"
}
},
"size": 10,
"from": 0
}'
# Complex search (Bool Query)
curl -X GET "localhost:9200/movies/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must": [
{"match": {"overview": "Marvel"}}
],
"filter": [
{"range": {"rating": {"gte": 8.0}}},
{"term": {"genre": "Action"}}
],
"must_not": [
{"term": {"genre": "Horror"}}
],
"should": [
{"match": {"director": "Russo Brothers"}}
]
}
},
"sort": [
{"rating": {"order": "desc"}},
{"release_date": {"order": "desc"}}
]
}'
# Faceted search (Aggregations)
curl -X GET "localhost:9200/movies/_search" -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"genres": {
"terms": {
"field": "genre",
"size": 10
}
},
"avg_rating": {
"avg": {
"field": "rating"
}
},
"rating_histogram": {
"histogram": {
"field": "rating",
"interval": 1
}
},
"release_years": {
"date_histogram": {
"field": "release_date",
"calendar_interval": "year"
}
}
}
}'
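To make the four aggregations in the request above more concrete, the following sketch computes rough in-memory equivalents over the four sample movies as they were first indexed. It only illustrates what each aggregation type counts or averages; it is not how OpenSearch executes aggregations, and the variable names are chosen for this example.

```python
from collections import Counter
from statistics import mean

# Genre, rating, and release date of the four sample movies.
docs = [
    {"genre": ["Action", "Adventure", "Sci-Fi"], "rating": 8.4, "release_date": "2019-04-26"},
    {"genre": ["Animation", "Romance", "Drama"], "rating": 8.4, "release_date": "2016-08-26"},
    {"genre": ["Thriller", "Drama", "Comedy"], "rating": 8.6, "release_date": "2019-05-30"},
    {"genre": ["Action", "Drama"], "rating": 8.3, "release_date": "2022-05-27"},
]

# terms aggregation: a document with several genres is counted once per genre.
genres = Counter(g for d in docs for g in d["genre"])

# avg aggregation over the rating field.
avg_rating = mean(d["rating"] for d in docs)

# histogram with interval 1: each value falls into the bucket floor(value / 1) * 1.
rating_histogram = Counter(int(d["rating"] // 1) for d in docs)

# date_histogram with calendar_interval "year": bucket by the year of release_date.
release_years = Counter(d["release_date"][:4] for d in docs)

print(genres.most_common(3))
print(round(avg_rating, 3))
print(dict(rating_histogram))
print(dict(release_years))
```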
# Geographic search
curl -X GET "localhost:9200/movies/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": {
"geo_distance": {
"distance": "10km",
"location": {
"lat": 40.7589,
"lon": -73.9851
}
}
}
}
},
"sort": [
{
"_geo_distance": {
"location": {
"lat": 40.7589,
"lon": -73.9851
},
"order": "asc",
"unit": "km"
}
}
]
}'
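The geo_distance filter above keeps documents whose `location` lies within 10 km of the given point. A rough way to reason about which points qualify is the great-circle (haversine) distance, sketched below. The Earth radius constant and the function names are assumptions of this example, and OpenSearch's own distance calculation may differ slightly.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within(center, point, distance_km):
    """Mimic the geo_distance filter: True if point is no farther than distance_km."""
    return haversine_km(center["lat"], center["lon"], point["lat"], point["lon"]) <= distance_km


center = {"lat": 40.7589, "lon": -73.9851}   # query point from the request above
nearby = {"lat": 40.7484, "lon": -73.9857}   # hypothetical point a little over 1 km away
far_away = {"lat": 34.05, "lon": -118.24}    # hypothetical point on the US west coast

print(within(center, nearby, 10))
print(within(center, far_away, 10))
```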
Vector Search and AI Features
# k-NN vector search configuration
curl -X PUT "localhost:9200/documents" -H 'Content-Type: application/json' -d'
{
"settings": {
"index": {
"knn": true,
"knn.algo_param.ef_search": 100
}
},
"mappings": {
"properties": {
"title": {
"type": "text"
},
"content": {
"type": "text"
},
"embedding": {
"type": "knn_vector",
"dimension": 768,
"method": {
"name": "hnsw",
"space_type": "l2",
"engine": "faiss",
"parameters": {
"ef_construction": 128,
"m": 24
}
}
}
}
}
}'
# Add vector document
curl -X POST "localhost:9200/documents/_doc" -H 'Content-Type: application/json' -d'
{
"title": "AI Technology Advancement",
"content": "Artificial intelligence technology is rapidly developing, with machine learning and deep learning being utilized across various fields.",
"embedding": [0.1, 0.2, 0.3, ...]
}'
# Execute k-NN search
curl -X GET "localhost:9200/documents/_search" -H 'Content-Type: application/json' -d'
{
"size": 5,
"query": {
"knn": {
"embedding": {
"vector": [0.15, 0.25, 0.35, ...],
"k": 10
}
}
}
}'
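The k-NN query above asks for the nearest stored vectors to the query vector under the `l2` space type configured in the mapping. HNSW answers this approximately; the exact result it approximates can be sketched with a brute-force scan, as below. The three-dimensional vectors and document IDs are made up for illustration (the index above uses 768 dimensions).

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(query, vectors, k):
    """Return the k (doc_id, distance) pairs closest to query, nearest first."""
    scored = ((doc_id, l2_distance(query, vec)) for doc_id, vec in vectors.items())
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


# Hypothetical toy vectors keyed by document ID.
vectors = {
    "a": [0.1, 0.2, 0.3],
    "b": [0.9, 0.8, 0.7],
    "c": [0.3, 0.2, 0.4],
}
neighbors = exact_knn([0.15, 0.25, 0.35], vectors, k=2)
for doc_id, distance in neighbors:
    print(doc_id, round(distance, 4))
```

A brute-force scan costs time proportional to the number of vectors, which is why approximate graph indexes such as HNSW are used at scale.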
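# Hybrid search (Keyword + Vector); sub-query scores are normalized and combined only when the request runs through a search pipeline that contains a normalization-processor (for example via the search_pipeline query parameter)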
curl -X GET "localhost:9200/documents/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"hybrid": {
"queries": [
{
"match": {
"content": "artificial intelligence"
}
},
{
"knn": {
"embedding": {
"vector": [0.15, 0.25, 0.35, ...],
"k": 10
}
}
}
]
}
}
}'
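Keyword (BM25) scores and vector similarity scores live on different scales, so a hybrid query relies on a search pipeline to normalize and combine them. The sketch below shows a pipeline body using the `normalization-processor` with `min_max` normalization and a weighted `arithmetic_mean`, followed by a simplified local model of that arithmetic. The weights, scores, and document IDs are invented for illustration, and the real processor may treat documents that are missing from one sub-query differently.

```python
# Body for PUT /_search/pipeline/<pipeline-name>; the weights are example values.
search_pipeline = {
    "description": "Post-processor for hybrid search",
    "phase_results_processors": [
        {
            "normalization-processor": {
                "normalization": {"technique": "min_max"},
                "combination": {
                    "technique": "arithmetic_mean",
                    "parameters": {"weights": [0.3, 0.7]},
                },
            }
        }
    ],
}


def min_max(scores):
    """Scale scores into [0, 1]; identical scores all map to 1.0 (a choice made for this sketch)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def combine(keyword_scores, vector_scores, weights=(0.3, 0.7)):
    """Weighted mean of normalized scores; a missing score counts as 0.0 in this sketch."""
    kw, vec = min_max(keyword_scores), min_max(vector_scores)
    combined = {
        doc_id: weights[0] * kw.get(doc_id, 0.0) + weights[1] * vec.get(doc_id, 0.0)
        for doc_id in set(kw) | set(vec)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw scores from the match and knn sub-queries.
keyword_scores = {"doc1": 12.0, "doc2": 4.0, "doc3": 8.0}
vector_scores = {"doc2": 0.92, "doc3": 0.80, "doc4": 0.50}
ranking = combine(keyword_scores, vector_scores)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

With a higher weight on the vector sub-query, a document that matches poorly on keywords can still rank first if it is the closest vector.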
# Neural search (semantic search); model_id must be the ID of a deployed embedding model, so the value below is a placeholder
curl -X GET "localhost:9200/documents/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"neural": {
"embedding": {
"query_text": "machine learning algorithms",
"model_id": "huggingface_embeddings",
"k": 10
}
}
}
}'
Advanced Configuration and Performance Optimization
# Cluster settings
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
"persistent": {
"cluster.routing.allocation.disk.watermark.low": "85%",
"cluster.routing.allocation.disk.watermark.high": "90%",
"cluster.routing.allocation.disk.watermark.flood_stage": "95%",
"cluster.max_shards_per_node": 3000,
"search.max_buckets": 65536
}
}'
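The three disk watermarks in the cluster settings above act as escalating thresholds on each node's disk usage. A small sketch of that decision logic, using the same percentages as defaults, is shown below. The function name and the returned labels are assumptions of this example, not OpenSearch identifiers.

```python
def disk_watermark_state(used_percent, low=85.0, high=90.0, flood_stage=95.0):
    """Classify a node's disk usage against the three watermarks configured above."""
    if used_percent > flood_stage:
        return "flood_stage"  # indexes with a shard on this node get a read-only block
    if used_percent > high:
        return "high"  # the cluster tries to relocate shards away from this node
    if used_percent > low:
        return "low"  # no new shards are allocated to this node
    return "ok"


for used in (80, 87, 92, 96):
    print(used, disk_watermark_state(used))
```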
# Create index template
curl -X PUT "localhost:9200/_index_template/logs_template" -H 'Content-Type: application/json' -d'
{
"index_patterns": ["logs-*"],
"template": {
"settings": {
"number_of_shards": 2,
"number_of_replicas": 1,
"plugins.index_state_management.rollover_alias": "logs"
},
"mappings": {
"properties": {
"@timestamp": {
"type": "date"
},
"level": {
"type": "keyword"
},
"message": {
"type": "text",
"analyzer": "standard"
},
"service": {
"type": "keyword"
},
"host": {
"type": "keyword"
}
}
}
}
}'
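# Index State Management (ISM) policy configuration; ism_template attaches the policy to new indexes matching logs-*
curl -X PUT "localhost:9200/_plugins/_ism/policies/log_policy" -H 'Content-Type: application/json' -d'
{
"policy": {
"description": "Log retention policy",
"default_state": "hot",
"ism_template": {
"index_patterns": ["logs-*"],
"priority": 100
},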
"states": [
{
"name": "hot",
"actions": [
{
"rollover": {
"min_size": "5gb",
"min_doc_count": 1000000,
"min_index_age": "1d"
}
}
],
"transitions": [
{
"state_name": "warm",
"conditions": {
"min_index_age": "7d"
}
}
]
},
{
"name": "warm",
"actions": [
{
"replica_count": {
"number_of_replicas": 0
}
}
],
"transitions": [
{
"state_name": "delete",
"conditions": {
"min_index_age": "30d"
}
}
]
},
{
"name": "delete",
"actions": [
{
"delete": {}
}
]
}
]
}
}'
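The policy above moves an index from `hot` to `warm` after 7 days and from `warm` to `delete` after 30 days. The sketch below models only those age-based transitions to show which state an index of a given age would end up in. It ignores the rollover and replica actions and the periodic nature of ISM job runs, and the names `TRANSITIONS` and `state_for_age` are hypothetical.

```python
# Transitions copied from the policy: state -> (next_state, min_index_age in days).
TRANSITIONS = {
    "hot": ("warm", 7),
    "warm": ("delete", 30),
    "delete": None,  # terminal state, no further transitions
}


def state_for_age(age_days, start="hot"):
    """Follow transitions whose min_index_age condition is met and return the final state."""
    state = start
    while TRANSITIONS[state] is not None:
        next_state, min_age = TRANSITIONS[state]
        if age_days < min_age:
            break  # condition not met yet, so the index stays in the current state
        state = next_state
    return state


for age in (0, 7, 29, 30):
    print(age, state_for_age(age))
```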
# Performance monitoring
curl -X GET "localhost:9200/_cluster/health?pretty"
curl -X GET "localhost:9200/_nodes/stats?pretty"
curl -X GET "localhost:9200/_cat/indices?v&s=store.size:desc"
curl -X GET "localhost:9200/_cat/shards?v&s=store:desc"
Security and Access Control
# Security plugin configuration (opensearch.yml)
cat >> config/opensearch.yml << 'EOF'
plugins.security.ssl.transport.pemcert_filepath: certs/opensearch.pem
plugins.security.ssl.transport.pemkey_filepath: certs/opensearch-key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: certs/root-ca.pem
plugins.security.ssl.transport.enforce_hostname_verification: false
plugins.security.ssl.http.enabled: true
plugins.security.ssl.http.pemcert_filepath: certs/opensearch.pem
plugins.security.ssl.http.pemkey_filepath: certs/opensearch-key.pem
plugins.security.ssl.http.pemtrustedcas_filepath: certs/root-ca.pem
plugins.security.allow_unsafe_democertificates: true
plugins.security.allow_default_init_securityindex: true
plugins.security.authcz.admin_dn:
- CN=opensearch-admin,OU=IT,O=Example,L=Tokyo,ST=Tokyo,C=JP
plugins.security.nodes_dn:
- CN=opensearch-node,OU=IT,O=Example,L=Tokyo,ST=Tokyo,C=JP
plugins.security.audit.type: internal_opensearch
plugins.security.enable_snapshot_restore_privilege: true
plugins.security.check_snapshot_restore_write_privileges: true
plugins.security.restapi.roles_enabled: ["all_access", "security_rest_api_access"]
plugins.security.system_indices.enabled: true
plugins.security.system_indices.indices:
  [
    ".opendistro-alerting-config",
    ".opendistro-alerting-alert*",
    ".opendistro-anomaly-results*",
    ".opendistro-anomaly-detector*",
    ".opendistro-anomaly-checkpoints",
    ".opendistro-anomaly-detection-state",
    ".opendistro-reports-*",
    ".opensearch-notifications-*",
    ".opensearch-notebooks",
    ".opensearch-observability",
    ".opendistro-asynchronous-search-response*",
    ".replication-metadata-store"
  ]
EOF
# Create user
curl -X PUT "https://localhost:9200/_plugins/_security/api/internalusers/analyst" \
-u admin:admin -k -H 'Content-Type: application/json' -d'
{
"password": "analyst@123",
"opendistro_security_roles": ["readall"],
"backend_roles": ["analytics_team"],
"attributes": {
"department": "analytics"
}
}'
# Create role
curl -X PUT "https://localhost:9200/_plugins/_security/api/roles/movie_reader" \
-u admin:admin -k -H 'Content-Type: application/json' -d'
{
"cluster_permissions": ["cluster_monitor"],
"index_permissions": [
{
"index_patterns": ["movies*"],
"allowed_actions": ["read", "indices:data/read/*"]
}
]
}'
# Change the current user's password (account API)
curl -X PUT "https://localhost:9200/_plugins/_security/api/account" \
-u admin:admin -k -H 'Content-Type: application/json' -d'
{
"current_password": "admin",
"password": "new_password_123"
}'
AWS Integration and Managed Services
# AWS CloudFormation Template (Amazon OpenSearch Service)
# VPC, PrivateSubnet1, PrivateSubnet2, and ApplicationSecurityGroup are assumed to be defined elsewhere in the template
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  OpenSearchDomain:
    Type: AWS::OpenSearchService::Domain
    Properties:
      DomainName: my-opensearch-domain
      EngineVersion: OpenSearch_2.11
      ClusterConfig:
        InstanceType: t3.medium.search
        InstanceCount: 3
        DedicatedMasterEnabled: true
        DedicatedMasterType: t3.small.search
        DedicatedMasterCount: 3
      EBSOptions:
        EBSEnabled: true
        VolumeType: gp3
        VolumeSize: 100
      VPCOptions:
        SecurityGroupIds:
          - !Ref OpenSearchSecurityGroup
        SubnetIds:
          - !Ref PrivateSubnet1
          - !Ref PrivateSubnet2
      EncryptionAtRestOptions:
        Enabled: true
      NodeToNodeEncryptionOptions:
        Enabled: true
      DomainEndpointOptions:
        EnforceHTTPS: true
      AccessPolicies:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              AWS: !Sub 'arn:aws:iam::${AWS::AccountId}:root'
            Action: 'es:*'
            Resource: !Sub 'arn:aws:es:${AWS::Region}:${AWS::AccountId}:domain/my-opensearch-domain/*'
  OpenSearchSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Security group for OpenSearch domain
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          SourceSecurityGroupId: !Ref ApplicationSecurityGroup
# Python: Amazon OpenSearch Service operations with boto3 and opensearch-py
import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

# AWS credentials configuration
session = boto3.Session()
credentials = session.get_credentials()
region = 'us-east-1'
service = 'es'
host = 'search-my-domain-xxx.us-east-1.es.amazonaws.com'
# AWSV4SignerAuth ships with opensearch-py and signs each request with SigV4
awsauth = AWSV4SignerAuth(credentials, region, service)

# Create OpenSearch client
client = OpenSearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection
)

# Create index
response = client.indices.create(
    index='logs',
    body={
        'settings': {
            'number_of_shards': 2,
            'number_of_replicas': 1
        },
        'mappings': {
            'properties': {
                'timestamp': {'type': 'date'},
                'message': {'type': 'text'},
                'level': {'type': 'keyword'},
                'service': {'type': 'keyword'}
            }
        }
    }
)

# Add document
response = client.index(
    index='logs',
    body={
        'timestamp': '2024-01-15T10:00:00',
        'message': 'Application started successfully',
        'level': 'INFO',
        'service': 'web-app'
    }
)

# Execute search
response = client.search(
    index='logs',
    body={
        'query': {
            'bool': {
                'must': [
                    {'match': {'message': 'error'}}
                ],
                'filter': [
                    {'range': {'timestamp': {'gte': 'now-1d'}}}
                ]
            }
        },
        'sort': [
            {'timestamp': {'order': 'desc'}}
        ]
    }
)
print(f"Found {response['hits']['total']['value']} documents")
Advanced Features and Observability
# Anomaly detection setup
curl -X POST "localhost:9200/_plugins/_anomaly_detection/detectors" -H 'Content-Type: application/json' -d'
{
"name": "cpu-anomaly-detector",
"description": "Detect CPU usage anomalies",
"time_field": "@timestamp",
"indices": ["system-metrics-*"],
"feature_attributes": [
{
"feature_name": "cpu_usage",
"feature_enabled": true,
"aggregation_query": {
"avg_cpu": {
"avg": {
"field": "cpu.percentage"
}
}
}
}
],
"filter_query": {
"bool": {
"filter": [
{
"range": {
"@timestamp": {
"gte": "now-1h"
}
}
}
]
}
},
"detection_interval": {
"period": {
"interval": 10,
"unit": "Minutes"
}
},
"window_delay": {
"period": {
"interval": 1,
"unit": "Minutes"
}
}
}'
# SQL queries on OpenSearch
curl -X POST "localhost:9200/_plugins/_sql" -H 'Content-Type: application/json' -d'
{
"query": "SELECT genre, AVG(rating) as avg_rating FROM movies GROUP BY genre ORDER BY avg_rating DESC"
}'
# Piped Processing Language (PPL) queries; the stats result needs an alias before it can be sorted by name
curl -X POST "localhost:9200/_plugins/_ppl" -H 'Content-Type: application/json' -d'
{
"query": "source=movies | where rating > 8.0 | stats avg(rating) as avg_rating by genre | sort - avg_rating"
}'
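# Trace analytics configuration (unverified: this endpoint and these setting names do not appear in the documented OpenSearch REST API; trace data is normally ingested through Data Prepper or Jaeger and explored in OpenSearch Dashboards)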
curl -X PUT "localhost:9200/_plugins/_trace/settings" -H 'Content-Type: application/json' -d'
{
"cluster.trace.enable": true,
"cluster.trace.indices": ["jaeger-span-*"],
"cluster.trace.service_map.enabled": true
}'
# Alerting configuration
curl -X POST "localhost:9200/_plugins/_alerting/monitors" -H 'Content-Type: application/json' -d'
{
"type": "monitor",
"name": "High Error Rate Monitor",
"enabled": true,
"schedule": {
"period": {
"interval": 1,
"unit": "MINUTES"
}
},
"inputs": [
{
"search": {
"indices": ["application-logs-*"],
"query": {
"size": 0,
"query": {
"bool": {
"filter": [
{
"range": {
"@timestamp": {
"gte": "now-5m"
}
}
},
{
"term": {
"level": "ERROR"
}
}
]
}
},
"aggs": {
"error_count": {
"value_count": {
"field": "_id"
}
}
}
}
}
}
],
"triggers": [
{
"name": "High Error Count",
"severity": "1",
"condition": {
"script": {
"source": "ctx.results[0].aggregations.error_count.value > 100"
}
},
"actions": [
{
"name": "Send Email Alert",
"destination_id": "email-destination-id",
"message_template": {
"source": "High error rate detected: {{ctx.results.0.aggregations.error_count.value}} errors in the last 5 minutes"
},
"throttle_enabled": true,
"throttle": {
"value": 60,
"unit": "MINUTES"
}
}
]
}
]
}'
OpenSearch began as an Elasticsearch fork and has since developed into a modern search and analytics platform in its own right. It is a strong choice for applications that run on AWS or rely on AI and vector search, offering commercial-grade functionality while keeping the benefits of open-source software.