Database
InfluxDB
Overview
InfluxDB is an open-source NoSQL database specialized for time series data. Designed as a scalable datastore for metrics, events, and real-time analytics, it efficiently stores and queries time series data from IoT sensors, application monitoring, business metrics, and more.
Details
InfluxDB was first released by InfluxData in 2013. Optimized for the characteristics of time series data (time-ordered, write-heavy, range query-centric), it serves as the core component of the TICK stack (Telegraf, InfluxDB, Chronograf, Kapacitor). It provides its own functional query language, Flux, alongside the SQL-like InfluxQL, plus HTTP APIs, and achieves high write and query performance.
Key features of InfluxDB:
- Purpose-built for time series data
- High-speed write and read performance
- Purpose-built query languages (Flux and the SQL-like InfluxQL)
- Schema-less design
- Automatic data retention management
- High-precision timestamps
- Tag and field data model (see the line protocol sketch after this list)
- Horizontal scaling (commercial Enterprise edition)
- Real-time aggregation and downsampling
- RESTful HTTP API
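These features come together in line protocol, the plain-text write format: tags are indexed key=value metadata, fields carry the measured values, and the trailing timestamp defaults to nanosecond precision. A minimal annotated sketch (values are illustrative):
# measurement,tag_set field_set timestamp
# tags: indexed key=value pairs (always strings)
# fields: the actual values (float, integer, string, or boolean)
# timestamp: nanoseconds since the Unix epoch by default
temperature,location=room1,sensor=DHT22 value=23.5 1640995200000000000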
Advantages and Disadvantages
Advantages
- High performance: Optimized high-speed writes and reads for time series data
- Ease of use: the SQL-like InfluxQL keeps the learning curve low for simple queries
- Automatic management: Automated data retention and downsampling
- Rich functionality: Statistical functions, time window aggregation, forecasting
- Ecosystem: Complete solution with TICK stack
- APIs: Language-agnostic access via HTTP API
- Visualization: High compatibility with visualization tools like Grafana
Disadvantages
- Specialized use: Not suitable for non-time series data
- Memory usage: memory consumption grows with series cardinality and can become large
- Complexity: Complex configuration for advanced features
- Learning curve: Need to learn Flux query language
- Licensing: Enterprise features are commercial
Key Links
- Official site: https://www.influxdata.com/
- Documentation: https://docs.influxdata.com/influxdb/
- Source code: https://github.com/influxdata/influxdb
Code Examples
Installation & Setup
# Run with Docker (recommended)
docker run -d --name influxdb \
-p 8086:8086 \
-v influxdb-storage:/var/lib/influxdb2 \
-e DOCKER_INFLUXDB_INIT_MODE=setup \
-e DOCKER_INFLUXDB_INIT_USERNAME=admin \
-e DOCKER_INFLUXDB_INIT_PASSWORD=password123 \
-e DOCKER_INFLUXDB_INIT_ORG=myorg \
-e DOCKER_INFLUXDB_INIT_BUCKET=mybucket \
influxdb:2.7
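Once the container is running, the 2.x /health endpoint offers a quick readiness check:
# Verify the instance is up (returns JSON with "status": "pass")
curl http://localhost:8086/health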
# Ubuntu/Debian
wget -qO- https://repos.influxdata.com/influxdata-archive_compat.key | sudo apt-key add -
echo "deb https://repos.influxdata.com/ubuntu stable main" | sudo tee /etc/apt/sources.list.d/influxdb.list
sudo apt update && sudo apt install influxdb2
# Red Hat/CentOS
sudo tee /etc/yum.repos.d/influxdb.repo << EOF
[influxdb]
name = InfluxDB Repository - RHEL \$releasever
baseurl = https://repos.influxdata.com/rhel/\$releasever/\$basearch/stable
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdata-archive_compat.key
EOF
sudo yum install influxdb2
# macOS (Homebrew)
brew install influxdb
# Start service
sudo systemctl start influxdb
sudo systemctl enable influxdb
# Initial setup
influx setup
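The YOUR_TOKEN placeholder used throughout the examples below must be an API token; one way to create an all-access token with the 2.x CLI:
# Create an all-access token for the organization (the token is printed once)
influx auth create --org myorg --all-access
# List existing tokens
influx auth list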
Basic Operations (HTTP API)
# Create organization and bucket
curl -X POST "http://localhost:8086/api/v2/orgs" \
-H "Authorization: Token YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "myorg"
}'
curl -X POST "http://localhost:8086/api/v2/buckets" \
-H "Authorization: Token YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "sensors",
"orgID": "YOUR_ORG_ID",
"retentionRules": [
{
"type": "expire",
"everySeconds": 2592000
}
]
}'
# Write data (Line Protocol)
curl -X POST "http://localhost:8086/api/v2/write?org=myorg&bucket=sensors" \
-H "Authorization: Token YOUR_TOKEN" \
-H "Content-Type: text/plain" \
-d 'temperature,location=room1,sensor=DHT22 value=23.5 1640995200000000000
humidity,location=room1,sensor=DHT22 value=65.2 1640995200000000000
cpu_usage,host=server1,region=us-east value=85.3 1640995260000000000'
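The write endpoint interprets timestamps as nanoseconds unless told otherwise; the precision query parameter (s, ms, us, or ns) lets clients send shorter second-resolution timestamps. A sketch with illustrative values:
# Write with second-precision timestamps
curl -X POST "http://localhost:8086/api/v2/write?org=myorg&bucket=sensors&precision=s" \
-H "Authorization: Token YOUR_TOKEN" \
-H "Content-Type: text/plain" \
-d 'temperature,location=room1 value=23.7 1640995500'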
# Read data (Flux query)
curl -X POST "http://localhost:8086/api/v2/query?org=myorg" \
-H "Authorization: Token YOUR_TOKEN" \
-H "Content-Type: application/vnd.flux" \
-d 'from(bucket: "sensors")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> filter(fn: (r) => r.location == "room1")'
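Query results come back as annotated CSV; an explicit Accept header requests CSV form, matching the documented curl usage:
# Request annotated CSV output explicitly
curl -X POST "http://localhost:8086/api/v2/query?org=myorg" \
-H "Authorization: Token YOUR_TOKEN" \
-H "Content-Type: application/vnd.flux" \
-H "Accept: application/csv" \
-d 'from(bucket: "sensors") |> range(start: -1h) |> last()'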
CLI Operations
# Configure InfluxDB CLI
influx config create \
--config-name myconfig \
--host-url http://localhost:8086 \
--org myorg \
--token YOUR_TOKEN \
--active
# Write data
influx write \
--bucket sensors \
--precision s \
'temperature,location=room2 value=24.1 1640995320'
# Write data from file
cat > data.txt << EOF
temperature,location=room1 value=23.5 1640995200
temperature,location=room2 value=24.1 1640995260
humidity,location=room1 value=65.2 1640995200
humidity,location=room2 value=67.8 1640995260
EOF
influx write --bucket sensors --precision s --file data.txt
# Execute Flux query
influx query 'from(bucket: "sensors")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> mean()'
# List buckets
influx bucket list
# List organizations
influx org list
Flux Query Language
// Basic range query
from(bucket: "sensors")
|> range(start: -24h)
|> filter(fn: (r) => r._measurement == "temperature")
|> filter(fn: (r) => r.location == "room1")
// Aggregation and grouping
from(bucket: "sensors")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "cpu_usage")
|> group(columns: ["host"])
|> mean()
// Time window aggregation
from(bucket: "sensors")
|> range(start: -6h)
|> filter(fn: (r) => r._measurement == "temperature")
|> aggregateWindow(every: 10m, fn: mean)
// Join multiple measurements
temp = from(bucket: "sensors")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
humidity = from(bucket: "sensors")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "humidity")
join(tables: {temp: temp, humidity: humidity}, on: ["_time", "location"])
// Statistical functions
from(bucket: "sensors")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "cpu_usage")
|> group(columns: ["host"])
|> aggregateWindow(every: 5m, fn: mean)
|> quantile(q: 0.95)
// Data transformation
from(bucket: "sensors")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> map(fn: (r) => ({
r with
_value: (r._value * 9.0 / 5.0) + 32.0,
unit: "°F"
}))
Data Retention and Downsampling
# Task for automatic downsampling
cat > downsample.flux << EOF
option task = {name: "downsample-5m", every: 1h}
from(bucket: "sensors")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> aggregateWindow(every: 5m, fn: mean)
|> to(bucket: "sensors_5m")
EOF
influx task create --file downsample.flux
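After creating the task, the task subcommands confirm it is registered and show its run history (YOUR_TASK_ID is a placeholder for the ID printed by influx task list):
# Verify the task and inspect its runs
influx task list
influx task run list --task-id YOUR_TASK_ID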
# Set data retention period
influx bucket update \
--id YOUR_BUCKET_ID \
--retention 720h # 30 days
# Delete old data
influx delete \
--bucket sensors \
--start 2023-01-01T00:00:00Z \
--stop 2023-01-31T23:59:59Z \
--predicate '_measurement="old_data"'
Practical Examples
// IoT sensor monitoring
from(bucket: "iot")
|> range(start: -15m)
|> filter(fn: (r) => r._measurement == "sensor_data")
|> filter(fn: (r) => r._field == "temperature")
|> group(columns: ["device_id"])
|> aggregateWindow(every: 1m, fn: last)
|> map(fn: (r) => ({
r with
alert: if r._value > 30.0 then "HIGH" else "NORMAL"
}))
// Application performance monitoring
from(bucket: "metrics")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "http_requests")
|> filter(fn: (r) => r._field == "response_time")
|> group(columns: ["endpoint", "status_code"])
|> aggregateWindow(every: 5m, fn: mean)
|> filter(fn: (r) => r._value > 1000.0) // Response time > 1 second
// Business metrics analysis
sales = from(bucket: "business")
|> range(start: -30d)
|> filter(fn: (r) => r._measurement == "sales")
|> filter(fn: (r) => r._field == "amount")
|> aggregateWindow(every: 1d, fn: sum)
revenue = from(bucket: "business")
|> range(start: -30d)
|> filter(fn: (r) => r._measurement == "revenue")
|> filter(fn: (r) => r._field == "total")
|> aggregateWindow(every: 1d, fn: sum)
join(tables: {sales: sales, revenue: revenue}, on: ["_time"])
|> map(fn: (r) => ({
_time: r._time,
avg_order_value: r._value_revenue / r._value_sales // join() suffixes shared columns with the table names
}))
// Forecasting
from(bucket: "sensors")
|> range(start: -7d)
|> filter(fn: (r) => r._measurement == "energy_consumption")
|> aggregateWindow(every: 1h, fn: mean)
|> holtWinters(n: 24, seasonality: 24, interval: 1h) // 24-hour forecast
Python Client
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS
import datetime
# Client connection
client = InfluxDBClient(
url="http://localhost:8086",
token="YOUR_TOKEN",
org="myorg"
)
# Write data
write_api = client.write_api(write_options=SYNCHRONOUS)
# Create and write point
point = Point("temperature") \
.tag("location", "room1") \
.tag("sensor", "DHT22") \
.field("value", 23.5) \
.time(datetime.datetime.utcnow())
write_api.write(bucket="sensors", record=point)
# Bulk write
points = []
for i in range(100):
    point = Point("cpu_usage") \
        .tag("host", f"server{i%5}") \
        .field("value", 50 + i % 40) \
        .time(datetime.datetime.utcnow() - datetime.timedelta(minutes=i))
    points.append(point)
write_api.write(bucket="metrics", record=points)
# Read data
query_api = client.query_api()
query = '''
from(bucket: "sensors")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> filter(fn: (r) => r.location == "room1")
'''
result = query_api.query(query)
for table in result:
    for record in table.records:
        print(f"Time: {record.get_time()}, Value: {record.get_value()}")
# Convert to pandas DataFrame
df = query_api.query_data_frame(query)
print(df.head())
# Close client
client.close()
Configuration & Optimization
# influxdb.conf key settings (InfluxDB 1.x TOML format)
[http]
bind-address = ":8086"
auth-enabled = true
[meta]
dir = "/var/lib/influxdb/meta"
retention-autocreate = true
[data]
dir = "/var/lib/influxdb/data"
wal-dir = "/var/lib/influxdb/wal"
series-id-set-cache-size = 100
[coordinator]
write-timeout = "10s"
max-concurrent-queries = 0
[retention]
enabled = true
check-interval = "30m"
[shard-precreation]
enabled = true
check-interval = "10m"
advance-period = "30m"
[monitor]
store-enabled = true
store-database = "_internal"
[subscriber]
enabled = true
http-timeout = "30s"
[continuous_queries]
enabled = true
log-enabled = true
run-interval = "1s"
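Note that the TOML above applies to InfluxDB 1.x. InfluxDB 2.x is configured through influxd flags, a config file, or INFLUXD_-prefixed environment variables; a minimal sketch using documented 2.x options (paths shown are the typical Linux package locations):
# InfluxDB 2.x equivalents as environment variables (one per influxd option)
export INFLUXD_HTTP_BIND_ADDRESS=":8086"
export INFLUXD_BOLT_PATH="/var/lib/influxdb2/influxd.bolt"
export INFLUXD_ENGINE_PATH="/var/lib/influxdb2/engine"
export INFLUXD_QUERY_CONCURRENCY=10
influxd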
Monitoring and Maintenance
# System statistics
influx query 'from(bucket: "_monitoring")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "influxdb_database")
|> last()'
# Performance monitoring
curl "http://localhost:8086/metrics"
# Backup
influx backup /path/to/backup
# Restore
influx restore /path/to/backup
# Data integrity check (storage inspection moved to the influxd binary in 2.x)
influxd inspect verify-seriesfile
# TSM file information (path under the 2.x engine directory; BUCKET_ID and SHARD_ID are placeholders)
influxd inspect dump-tsm /var/lib/influxdb2/engine/data/BUCKET_ID/autogen/SHARD_ID/000000001-000000001.tsm