Database
QuestDB
Overview
QuestDB is a high-performance open-source SQL database specialized for time-series data. It provides compatibility with PostgreSQL Wire Protocol, allowing access through existing PostgreSQL client libraries. With its pure columnar architecture, QuestDB delivers exceptional performance in financial, IoT, and monitoring applications.
Details
Key Features
- Ultra-fast Ingestion: Supports millions of records per second via the InfluxDB Line Protocol (see the Python ILP sketch after this list)
- PostgreSQL Compatibility: High compatibility with existing tools through PostgreSQL Wire Protocol support
- Pure Columnar: Columnar storage engine optimized for time-series analytics
- SQL Extensions: Time-series specific SQL functions (SAMPLE BY, LATEST ON)
- Zero-copy: Efficient data access through memory-mapped files
- ZFS Compression: Added in v8.0 (2024), providing 6x storage efficiency and improved query performance
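As a minimal sketch of ILP ingestion, assuming the official questdb Python client (pip install questdb) and a local instance with the HTTP endpoint on the default port 9000; the table and column names mirror the cpu_metrics example further down.
# Minimal ILP ingestion sketch (assumes: pip install questdb, server on localhost:9000)
from questdb.ingress import Sender, TimestampNanos

with Sender.from_conf("http::addr=localhost:9000;") as sender:
    # The target table is created automatically on first write if it does not exist
    sender.row(
        "cpu_metrics",
        symbols={"cpu_id": "cpu-001"},
        columns={"usage": 45.2, "temperature": 68.5},
        at=TimestampNanos.now(),
    )
    sender.flush()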
Architecture
- Schema-less Ingestion: Dynamic table creation and column addition
- Partitioning: Automatic time-based partitioning (see the inspection sketch after this list)
- Indexing: High-efficiency indexing with SYMBOL data type
- Parallel Processing: Parallel query execution leveraging multi-core CPUs
- WAL (Write-Ahead Log): Data durability and replication capabilities
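One way to observe the automatic partitioning described above from a client is to query QuestDB's meta-functions over the PostgreSQL wire protocol. This is a hedged sketch: it assumes default credentials, port 8812, the tables() and table_partitions() meta-functions available in recent QuestDB versions, and the cpu_metrics table created in the examples below.
# Inspect tables and per-partition metadata over the PostgreSQL wire protocol (port 8812)
import psycopg

with psycopg.connect(host="localhost", port=8812, user="admin",
                     password="quest", dbname="qdb") as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT * FROM tables()")  # all tables and their settings
        for row in cur.fetchall():
            print(row)
        cur.execute("SELECT * FROM table_partitions('cpu_metrics')")  # one table's partitions
        for row in cur.fetchall():
            print(row)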
Advantages and Disadvantages
Advantages
- Exceptional Performance: According to QuestDB's published benchmarks, 10-150x faster ingestion than TimescaleDB and about 6.5x faster than InfluxDB
- PostgreSQL Compatibility: Easy migration from existing PostgreSQL applications
- Low Learning Curve: Intuitive operations with standard SQL plus time-series specific functions
- Lightweight Operations: High performance without complex configuration
- Rich Client Support: Support for Java, Python, Node.js, C#, Go, Rust, and more
- Real-time Analytics: Ultra-fast queries enable real-time dashboard construction
Disadvantages
- Limited PostgreSQL Features: Some PostgreSQL-specific features are not supported
- Transaction Constraints: Limitations on complex transaction patterns
- Data Type Limitations: Not all PostgreSQL data types are supported
- Metadata Queries: Some metadata queries for DB tools may not work properly
Key Links
- Official Website
- Official Documentation
- GitHub
- Web Console - Online Demo
Code Examples
Installation & Setup
# Start with Docker (9000: Web Console/REST API, 9009: InfluxDB Line Protocol, 8812: PostgreSQL wire)
docker run -p 9000:9000 -p 9009:9009 -p 8812:8812 questdb/questdb
# macOS with Homebrew
brew install questdb
questdb start
# Linux (.tar.gz)
wget https://github.com/questdb/questdb/releases/download/7.4.2/questdb-7.4.2-rt-linux-amd64.tar.gz
tar -xzf questdb-*.tar.gz
cd questdb-*
./questdb.sh start
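Whichever installation method is used, a quick smoke test is to send a query to the HTTP /exec endpoint on port 9000. A small sketch with the requests library (the endpoint and port are QuestDB defaults; the query is illustrative):
# Smoke test: run a trivial query against the HTTP /exec endpoint (port 9000)
import requests

resp = requests.get("http://localhost:9000/exec",
                    params={"query": "SELECT now()"},
                    timeout=5)
resp.raise_for_status()
print(resp.json())  # JSON with "columns" and "dataset"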
Basic Operations (Data Insertion & Queries)
-- Create table
CREATE TABLE cpu_metrics (
timestamp TIMESTAMP,
cpu_id SYMBOL,
usage DOUBLE,
temperature DOUBLE
) timestamp(timestamp) PARTITION BY DAY;
-- Insert data (PostgreSQL Wire Protocol)
INSERT INTO cpu_metrics VALUES
(now(), 'cpu-001', 45.2, 68.5),
(now(), 'cpu-002', 52.1, 71.2);
-- Time-series query (SAMPLE BY)
SELECT timestamp, cpu_id, avg(usage) as avg_usage
FROM cpu_metrics
WHERE timestamp >= dateadd('h', -24, now())
SAMPLE BY 1h;
-- Get latest values (LATEST ON)
SELECT * FROM cpu_metrics
LATEST ON timestamp PARTITION BY cpu_id;
Data Modeling
-- IoT sensor data table
CREATE TABLE sensor_readings (
ts TIMESTAMP,
sensor_id SYMBOL CAPACITY 10000 CACHE,
location SYMBOL,
temperature DOUBLE,
humidity DOUBLE,
pressure DOUBLE
) timestamp(ts) PARTITION BY DAY;
-- Financial data model
CREATE TABLE trades (
timestamp TIMESTAMP,
symbol SYMBOL,
price DOUBLE,
volume LONG,
side SYMBOL
) timestamp(timestamp) PARTITION BY DAY;
-- Add an index (QuestDB indexes SYMBOL columns only; composite indexes are not supported)
ALTER TABLE trades ALTER COLUMN symbol ADD INDEX;
Performance Optimization
-- SYMBOL type optimization
CREATE TABLE optimized_table (
ts TIMESTAMP,
device SYMBOL CAPACITY 1000000 NOCACHE, -- High cardinality (many distinct values)
status SYMBOL CACHE, -- Low cardinality (few distinct values)
value DOUBLE
) timestamp(ts) PARTITION BY DAY;
-- Batch insertion optimization
INSERT BATCH 10000 INTO metrics
SELECT * FROM source_table;
-- O3 (Out of Order) configuration
ALTER TABLE metrics SET PARAM o3MaxLag = 10s;
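Batching also matters on the client side: committing row by row over the PostgreSQL wire protocol wastes round trips. A hedged sketch using psycopg's executemany with a single commit per batch (table and column names follow the cpu_metrics example above):
# Client-side batching over PG wire: one executemany() and one commit per batch
import datetime as dt
import random
import psycopg

rows = [(dt.datetime.now(dt.timezone.utc), f"cpu-{i % 8:03d}",
         random.uniform(0, 100), random.uniform(60, 80))
        for i in range(10_000)]

with psycopg.connect(host="localhost", port=8812, user="admin",
                     password="quest", dbname="qdb") as conn:
    with conn.cursor() as cur:
        cur.executemany(
            "INSERT INTO cpu_metrics (timestamp, cpu_id, usage, temperature) "
            "VALUES (%s, %s, %s, %s)",
            rows)
    conn.commit()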
Practical Examples
// Java (PostgreSQL JDBC)
import java.sql.*;
import java.util.Properties;
public class QuestDBExample {
public static void main(String[] args) throws SQLException {
String url = "jdbc:postgresql://localhost:8812/qdb";
Properties props = new Properties();
props.setProperty("user", "admin");
props.setProperty("password", "quest");
try (Connection conn = DriverManager.getConnection(url, props)) {
// Time-series data query
String query = """
SELECT timestamp, symbol, avg(price) as avg_price
FROM trades
WHERE timestamp >= dateadd('d', -7, now())
SAMPLE BY 1h
""";
try (PreparedStatement ps = conn.prepareStatement(query);
ResultSet rs = ps.executeQuery()) {
while (rs.next()) {
System.out.printf("Time: %s, Symbol: %s, Price: %.2f%n",
rs.getTimestamp("timestamp"),
rs.getString("symbol"),
rs.getDouble("avg_price"));
}
}
}
}
}
# Python (psycopg3)
import asyncio
import psycopg

async def fetch_metrics():
    async with await psycopg.AsyncConnection.connect(
        host='localhost', port=8812, user='admin',
        password='quest', dbname='qdb'
    ) as conn:
        async with conn.cursor() as cur:
            await cur.execute("""
                SELECT ts, sensor_id, avg(temperature) as avg_temp
                FROM sensor_readings
                WHERE ts >= dateadd('d', -1, now())
                SAMPLE BY 10m
            """)
            async for row in cur:
                print(f"Sensor: {row[1]}, Avg Temp: {row[2]:.1f}°C")

asyncio.run(fetch_metrics())
Best Practices
-- 1. Partitioning strategy
-- High-frequency data: daily, low-frequency: monthly
CREATE TABLE high_freq_data (...) timestamp(ts) PARTITION BY DAY;
CREATE TABLE low_freq_data (...) timestamp(ts) PARTITION BY MONTH;
-- 2. SYMBOL optimization
-- Set expected unique value count in advance
CREATE TABLE events (
ts TIMESTAMP,
event_type SYMBOL CAPACITY 100 CACHE, -- Few types
user_id SYMBOL CAPACITY 1000000 NOCACHE, -- Many users
data VARCHAR
) timestamp(ts);
-- 3. Query performance
-- Always specify time range in WHERE clause
SELECT * FROM metrics
WHERE ts BETWEEN '2024-01-01' AND '2024-01-02'
AND sensor_id = 'temp-001';
-- 4. Leverage bulk operations
-- Fast CSV import with COPY (requires cairo.sql.copy.root to be configured on the server)
COPY trades FROM '/path/to/trades.csv' WITH HEADER true;
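As an alternative to COPY for ad-hoc loads, the HTTP /imp endpoint accepts a CSV upload and can create the target table on the fly. A hedged sketch with the requests library (the file path is a placeholder; the name parameter sets the target table):
# Bulk CSV import via the HTTP /imp endpoint (port 9000)
import requests

with open("/path/to/trades.csv", "rb") as f:
    resp = requests.post("http://localhost:9000/imp",
                         params={"name": "trades"},
                         files={"data": f})
resp.raise_for_status()
print(resp.text)  # textual import summary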