Database
OpenTSDB
Overview
OpenTSDB is a distributed, scalable time-series database built on top of Apache HBase. Originally developed at StumbleUpon around 2010, it can efficiently store and query billions of data points, and it is optimized for systems that produce large volumes of time-series data, such as IoT devices, server monitoring, and application metrics.
Details
Key Features
- Unlimited Scalability: Leverages HBase's distributed architecture to scale linearly by simply adding nodes
- High Cardinality: Flexible data identification through tags (key-value pairs)
- Raw Data Retention: Stores original data without aggregation by default
- Powerful Query Capabilities: Complex aggregation and filtering via HTTP API or command line
- Plugin System: Supports custom extensions for authentication, search, real-time distribution, and more
- Command Line Tools: Comprehensive toolset for data import, querying, UID management, and more
Architecture
OpenTSDB consists of three main components:
- tcollector: An agent deployed on each server that periodically collects metrics and forwards them to a TSD
- TSD (Time Series Daemon): Receives data, stores it in HBase, and handles query processing
- HBase: Acts as the backend storage system
Data Model
Time-series data is identified by the following elements:
- Metric Name: The measurement target (e.g., cpu.usage, memory.free)
- Tags: Key=value pairs for identification (e.g., host=web01, region=us-east)
- Timestamp: The time of the data point
- Value: The measured value
For efficient storage, string names are mapped to unique binary IDs (UIDs).
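The UID mapping can be sketched in a few lines of Python. This is an illustrative, in-memory model only (a hypothetical `UidRegistry` class); the real daemon persists assignments in the tsdb-uid HBase table and uses 3-byte IDs by default.

```python
# Illustrative sketch of OpenTSDB's string-to-UID mapping (in-memory only;
# the real daemon persists assignments in the tsdb-uid HBase table).
class UidRegistry:
    def __init__(self, width=3):          # default UID width is 3 bytes
        self.width = width
        self._ids = {}

    def get_or_assign(self, name):
        # Assign the next sequential ID on first sight, then reuse it.
        if name not in self._ids:
            self._ids[name] = len(self._ids) + 1
        return self._ids[name].to_bytes(self.width, "big")

registry = UidRegistry()
uid = registry.get_or_assign("cpu.usage")
assert len(uid) == 3                                # fixed-width binary ID
assert registry.get_or_assign("cpu.usage") == uid   # stable on repeat lookups
```

Row keys in the data table are then composed from these fixed-width UIDs plus a base timestamp, which keeps keys compact and uniformly sized.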
Pros and Cons
Pros
- Exceptional Scalability: A well-sized cluster can handle millions of writes per second
- Mature HBase Ecosystem: Rich operational knowledge and tooling
- Flexible Tag System: Supports high cardinality
- Long-term Storage: No limits on data retention period
- Open Source: Free to use with an active community
- RESTful API: Easy integration and querying
- Plugin Support: High customizability
Cons
- Complex Setup: Requires HBase cluster construction and operation
- High Hardware Requirements: Minimum 4GB RAM recommended, production requires even more resources
- Steep Learning Curve: Knowledge of both HBase and OpenTSDB required
- Real-time Constraints: Not designed for sub-second query latency or stream-processing use cases
- No SQL Support: Need to learn a proprietary query syntax
- Aging Design: Feels dated next to newer time-series databases such as InfluxDB or TimescaleDB
Code Examples
Installation and Setup
Prerequisites
# Java 8 or higher required
java -version
# HBase 0.94 or higher required
# ZooKeeper cluster must be set up beforehand
Simple Setup Using Docker
# Start OpenTSDB container
docker run -d \
--name opentsdb \
-p 4242:4242 \
petergrace/opentsdb-docker
# Access web interface
# http://localhost:4242
Configuration File (/etc/opentsdb/opentsdb.conf)
# HBase connection settings
tsd.storage.hbase.zk_quorum = localhost:2181
tsd.storage.hbase.zk_basedir = /hbase
# Port settings
tsd.network.port = 4242
# Automatic creation of new metrics
tsd.core.auto_create_metrics = true
# Enable real-time timeseries metadata tracking
tsd.core.meta.enable_realtime_ts = true
Basic Operations (Data Insertion and Querying)
Data Insertion (HTTP API)
# Insert single data point
curl -X POST http://localhost:4242/api/put \
-H "Content-Type: application/json" \
-d '{
"metric": "cpu.usage",
"timestamp": 1609459200,
"value": 45.2,
"tags": {
"host": "web01",
"region": "us-east"
}
}'
# Bulk insertion of multiple data points
curl -X POST http://localhost:4242/api/put \
-H "Content-Type: application/json" \
-d '[
{
"metric": "memory.usage",
"timestamp": 1609459200,
"value": 78.5,
"tags": {"host": "web01", "type": "physical"}
},
{
"metric": "memory.usage",
"timestamp": 1609459260,
"value": 79.1,
"tags": {"host": "web01", "type": "physical"}
}
]'
Insertion via Telnet Interface
# Connect with telnet client
telnet localhost 4242
# Data format: put <metric> <timestamp> <value> <tag1=value1> [<tag2=value2>...]
put cpu.usage 1609459200 45.2 host=web01 region=us-east
put memory.free 1609459260 2048 host=web01 type=available
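The same put lines can be produced programmatically. A minimal Python sketch, assuming a TSD listening on the default port; `format_put` and `send_put` are illustrative helpers, not part of any official client:

```python
import socket

def format_put(metric, timestamp, value, tags):
    # Build one line in the telnet format: put <metric> <ts> <value> <k=v>...
    tag_str = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"put {metric} {timestamp} {value} {tag_str}\n"

def send_put(line, host="localhost", port=4242):
    # Requires a running TSD; the line protocol shares port 4242 with HTTP.
    with socket.create_connection((host, port)) as sock:
        sock.sendall(line.encode())

line = format_put("cpu.usage", 1609459200, 45.2,
                  {"host": "web01", "region": "us-east"})
# line == "put cpu.usage 1609459200 45.2 host=web01 region=us-east\n"
```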
Data Querying
# Basic query
curl "http://localhost:4242/api/query?start=1h-ago&m=avg:cpu.usage{host=web01}"
# Multiple metrics query
curl "http://localhost:4242/api/query?start=1d-ago&m=avg:cpu.usage{host=*}&m=avg:memory.usage{host=*}"
# Query with a rate conversion applied
curl "http://localhost:4242/api/query?start=6h-ago&m=avg:rate:network.bytes.in{interface=eth0}"
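The /api/query endpoint also accepts a JSON body via POST, which is easier to assemble programmatically than the URL form. A sketch; `build_query` is a hypothetical helper:

```python
# Build the JSON-body equivalent of ?start=1h-ago&m=avg:cpu.usage{host=web01}.
def build_query(start, metric, aggregator="avg", tags=None):
    return {
        "start": start,
        "queries": [{
            "aggregator": aggregator,
            "metric": metric,
            "tags": tags or {},
        }],
    }

body = build_query("1h-ago", "cpu.usage", tags={"host": "web01"})
# POST it with e.g.: requests.post("http://localhost:4242/api/query", json=body)
```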
Data Modeling
Effective Tag Design
// Good example: Appropriate tag granularity
{
"metric": "http.requests",
"tags": {
"method": "GET",
"status": "200",
"endpoint": "api",
"datacenter": "us-east-1"
}
}
// Example to avoid: Too high cardinality
{
"metric": "http.requests",
"tags": {
"user_id": "12345", // User IDs could reach millions
"request_id": "abc123" // Request IDs grow infinitely
}
}
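The impact of tag choices is easy to quantify: each unique combination of tag values creates its own time series, so the expected series count is roughly the product of the per-tag distinct-value counts. A back-of-the-envelope helper (illustrative only):

```python
from math import prod

def estimated_series(tag_value_counts):
    # One time series per unique combination of tag values.
    return prod(tag_value_counts.values())

# Bounded tags stay manageable:
good = estimated_series({"method": 7, "status": 8, "endpoint": 50, "datacenter": 4})
# good == 11200

# Unbounded IDs explode the series (and HBase row) count:
bad = estimated_series({"user_id": 1_000_000, "request_id": 10_000_000})
# bad == 10_000_000_000_000
```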
Metric Naming Conventions
# Hierarchical naming (dot-separated)
system.cpu.usage
system.memory.free
app.response_time
database.connections.active
cache.hits_per_second
# Category prefixes
prod.web.cpu.usage
dev.api.response_time
monitoring.alerts.count
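A convention is only useful if it is enforced. A small lint sketch for the dot-separated scheme above; the regex is deliberately stricter than the character set OpenTSDB itself accepts (roughly alphanumerics plus `-`, `_`, `.`, `/`):

```python
import re

# Lowercase, dot-separated, at least two segments (enforces hierarchy).
METRIC_RE = re.compile(r"^[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)+$")

def is_valid_metric(name):
    return bool(METRIC_RE.match(name))

assert is_valid_metric("system.cpu.usage")
assert is_valid_metric("app.response_time")
assert not is_valid_metric("CPU Usage")      # capitals and spaces rejected
```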
Metrics Collection
Application Integration Example (Java)
// Note: net.opentsdb.client is a hypothetical client library used for
// illustration; OpenTSDB ships no official Java client, so adapt this
// to the HTTP API or the client you actually use.
import net.opentsdb.client.*;
import java.util.HashMap;
import java.util.Map;
public class MetricsCollector {
private OpenTSDBClient client;
public MetricsCollector() {
this.client = new OpenTSDBClient("localhost", 4242);
}
public void recordMetric(String metric, double value, Map<String, String> tags) {
DataPoint point = new DataPoint(metric, System.currentTimeMillis() / 1000, value, tags);
client.put(point);
}
// Example: Send CPU usage
public void sendCpuUsage() {
Map<String, String> tags = new HashMap<>();
tags.put("host", "app-server-01");
tags.put("environment", "production");
double cpuUsage = getCpuUsage(); // Get CPU usage from system
recordMetric("system.cpu.usage", cpuUsage, tags);
}
}
Python Client Example
import requests
import time
import json
class OpenTSDBClient:
def __init__(self, host='localhost', port=4242):
self.base_url = f"http://{host}:{port}"
def put(self, metric, value, tags, timestamp=None):
if timestamp is None:
timestamp = int(time.time())
data = {
"metric": metric,
"timestamp": timestamp,
"value": value,
"tags": tags
}
response = requests.post(
f"{self.base_url}/api/put",
json=data,
headers={'Content-Type': 'application/json'}
)
return response.status_code == 204
# Usage example
client = OpenTSDBClient()
client.put("temperature", 23.5, {"sensor": "room1", "building": "office"})
client.put("humidity", 65.2, {"sensor": "room1", "building": "office"})
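Since /api/put accepts a JSON array (as the bulk curl example earlier shows), a buffered variant of this client cuts HTTP overhead. A sketch; `BufferedClient` is illustrative and omits error handling:

```python
import time
import requests

class BufferedClient:
    """Collects data points and sends them to /api/put in batches."""

    def __init__(self, base_url="http://localhost:4242", batch_size=50):
        self.base_url = base_url
        self.batch_size = batch_size
        self.buffer = []

    def put(self, metric, value, tags, timestamp=None):
        self.buffer.append({
            "metric": metric,
            "timestamp": timestamp or int(time.time()),
            "value": value,
            "tags": tags,
        })
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # One HTTP request for the whole batch; /api/put takes a JSON array.
        if self.buffer:
            requests.post(f"{self.base_url}/api/put", json=self.buffer)
            self.buffer = []
```

Remember to call flush() on shutdown so the tail of the buffer is not lost.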
Practical Examples
Building System Monitoring Dashboard
# Grafana integration settings
# Configure OpenTSDB data source
curl -X POST http://admin:admin@localhost:3000/api/datasources \
-H "Content-Type: application/json" \
-d '{
"name": "OpenTSDB",
"type": "opentsdb",
"url": "http://localhost:4242",
"access": "proxy"
}'
Alert Configuration
# Alerts using Nagios plugin
./check_tsd -H localhost -p 4242 -m cpu.usage -t host=web01 -w 80 -c 90
# Custom alert script
curl -s "http://localhost:4242/api/query?start=5m-ago&m=avg:cpu.usage{host=web01}" | \
  jq '.[0].dps | to_entries | .[-1].value | if . > 90 then "CRITICAL: CPU > 90%" else "OK" end'
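The same check is more robust in Python than in a jq one-liner. A sketch operating on the JSON /api/query returns (a list of series, each with a "dps" map of timestamp to value); the thresholds mirror the -w/-c flags of the Nagios plugin above:

```python
def check_threshold(query_result, warn=80.0, crit=90.0):
    # dps maps string timestamps to values; pick the newest data point.
    dps = query_result[0]["dps"]
    latest = dps[max(dps, key=int)]
    if latest > crit:
        return f"CRITICAL: value {latest}"
    if latest > warn:
        return f"WARNING: value {latest}"
    return "OK"

sample = [{"metric": "cpu.usage", "dps": {"1609459200": 72.0, "1609459260": 93.5}}]
print(check_threshold(sample))  # CRITICAL: value 93.5
```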
Best Practices
Performance Optimization
# Configuration file optimization
# Adjust batch size
tsd.storage.flush_interval = 1000
# Enable compression
tsd.storage.enable_compaction = true
# Adjust cache size
tsd.core.meta.cache.enable = true
tsd.core.meta.cache.size = 1000000
Operations Monitoring
# Monitor OpenTSDB itself
curl "http://localhost:4242/api/stats"
# Check HBase status
echo "status" | hbase shell
# Log monitoring
tail -f /var/log/opentsdb/opentsdb.log
Data Retention Policy
# Delete old data (HBase TTL settings; the tsdb data table's column family is 't')
echo "alter 'tsdb', {NAME => 't', TTL => 31536000}" | hbase shell  # Retain for 1 year
# Run data compaction
tsdb fsck --fix-duplicates --compact
Scaling Strategy
# Load balancing configuration for multiple TSD instances
# HAProxy configuration example
backend opentsdb_cluster
balance roundrobin
server tsd1 192.168.1.10:4242 check
server tsd2 192.168.1.11:4242 check
server tsd3 192.168.1.12:4242 check
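If a load balancer is not available, the same round-robin policy can live in the client. A minimal sketch with hypothetical instance addresses:

```python
from itertools import cycle

class RoundRobinEndpoints:
    """Rotate writes across several TSD instances in round-robin order."""

    def __init__(self, hosts):
        self._cycle = cycle(hosts)

    def next_url(self):
        return f"http://{next(self._cycle)}/api/put"

endpoints = RoundRobinEndpoints(
    ["192.168.1.10:4242", "192.168.1.11:4242", "192.168.1.12:4242"]
)
print(endpoints.next_url())  # http://192.168.1.10:4242/api/put
```

Unlike HAProxy, this does no health checking; real code should skip or retry failed hosts.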