CockroachDB
Distributed SQL database. Supports global scale while maintaining strong consistency and ACID properties. Provides PostgreSQL-compatible SQL API.
Database Server
Overview
CockroachDB is a distributed SQL database designed for the cloud. True to its namesake, it is built to survive failures, providing a PostgreSQL-compatible SQL API while guaranteeing ACID transactions even across geographically distributed clusters. Automatic sharding, replication, and repair sharply reduce manual operational overhead, combining the reliability of a traditional RDBMS with the scalability of NoSQL. Developed by Cockroach Labs and inspired by Google's Spanner architecture, it provides a distributed database capable of supporting global-scale applications.
Details
As of 2025, CockroachDB has matured into an enterprise-grade distributed database through years of continuous improvement. Recent versions add Multi-Region SQL for optimizing geographic distribution, automatic scaling through CockroachDB Serverless, and vector indexing for machine-learning workloads. It transparently handles the classic challenges of distributed systems: the Raft consensus protocol guarantees consistency, Range-based data distribution provides automatic sharding, a Gossip protocol handles inter-node communication, and hybrid logical clocks (serving a role similar to Spanner's TrueTime) balance strong consistency with geographic distribution. PostgreSQL wire-protocol compatibility lets existing PostgreSQL applications migrate with minimal changes.
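A small illustration of what the hybrid-logical-clock design enables in practice: time-travel queries and follower reads via AS OF SYSTEM TIME (this sketch uses the users table defined later in the examples; follower reads historically required an enterprise license):
-- Read data as of 10 seconds ago, without blocking current writes
SELECT id, username FROM users AS OF SYSTEM TIME '-10s';
-- Let a nearby replica serve a consistent, slightly stale read
SELECT id, username FROM users AS OF SYSTEM TIME follower_read_timestamp();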
Key Features
- Strong Consistency: Complete ACID transaction guarantees even in geographically distributed environments
- Auto Scaling: Automatic data rebalancing as nodes are added or removed; Serverless scales compute with load
- PostgreSQL Compatible: High compatibility with existing PostgreSQL applications
- Automated Operations: Automatic sharding, replication, and failure recovery
- Multi-Region Support: Global datacenter distribution deployment
- Zero Downtime: Rolling upgrades and automatic failover during failures
Pros and Cons
Pros
- Horizontal scaling possible while maintaining strong consistency in distributed environments
- Easy migration of existing applications due to PostgreSQL compatibility
- Automated sharding, replication, and repair sharply reduce operational overhead and routine DBA work
- Automatic recovery from failures and zero-downtime upgrades
- First-class Kubernetes integration through cloud-native design
- Flexible usage-based billing through Serverless mode
Cons
- Steep learning curve; understanding the underlying distributed system takes time
- Geographically distributed transactions can incur added latency
- Not fully PostgreSQL compatible; some features are missing and some behaviors differ
- Can underperform single-node PostgreSQL due to distributed-coordination overhead
- Enterprise features require expensive commercial licensing
- Ecosystem and operational know-how are still maturing compared with established databases
Code Examples
Installation and Basic Setup
# Download CockroachDB binary (Linux)
curl https://binaries.cockroachdb.com/cockroach-v24.2.0.linux-amd64.tgz | tar -xz
sudo cp -i cockroach-v24.2.0.linux-amd64/cockroach /usr/local/bin/
# macOS installation via Homebrew
brew install cockroachdb/tap/cockroach
# Docker execution
docker pull cockroachdb/cockroach:v24.2.0
docker run -d \
  --name=cockroach-node1 \
  --hostname=cockroach-node1 \
  -p 26257:26257 -p 8080:8080 \
  -v cockroach-data:/cockroach/cockroach-data \
  cockroachdb/cockroach:v24.2.0 start-single-node \
  --insecure
# 3-node cluster with Docker Compose
cat > docker-compose.yml << 'EOF'
version: '3.8'
services:
  cockroach-node1:
    image: cockroachdb/cockroach:v24.2.0
    command: start --insecure --join=cockroach-node1,cockroach-node2,cockroach-node3
    ports:
      - "26257:26257"
      - "8080:8080"
    volumes:
      - cockroach-data1:/cockroach/cockroach-data
  cockroach-node2:
    image: cockroachdb/cockroach:v24.2.0
    command: start --insecure --join=cockroach-node1,cockroach-node2,cockroach-node3
    volumes:
      - cockroach-data2:/cockroach/cockroach-data
    depends_on:
      - cockroach-node1
  cockroach-node3:
    image: cockroachdb/cockroach:v24.2.0
    command: start --insecure --join=cockroach-node1,cockroach-node2,cockroach-node3
    volumes:
      - cockroach-data3:/cockroach/cockroach-data
    depends_on:
      - cockroach-node1
volumes:
  cockroach-data1:
  cockroach-data2:
  cockroach-data3:
EOF
docker-compose up -d
# Cluster initialization
cockroach init --insecure --host=localhost:26257
# Access admin UI
# Navigate to http://localhost:8080
# CLI connection
cockroach sql --insecure --host=localhost:26257
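-- Once connected, a few statements confirm the cluster is healthy
-- (a minimal sketch; crdb_internal columns vary by version)
SELECT version();
SHOW DATABASES;
SELECT node_id, address, is_live FROM crdb_internal.gossip_nodes;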
Basic SQL Operations and Database Management
-- Create database
CREATE DATABASE company;
USE company;
-- Create users table
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    created_at TIMESTAMP DEFAULT now(),
    updated_at TIMESTAMP DEFAULT now(),
    is_active BOOLEAN DEFAULT true
);
-- Create departments table
CREATE TABLE departments (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(100) UNIQUE NOT NULL,
    description TEXT,
    manager_id UUID REFERENCES users(id),
    created_at TIMESTAMP DEFAULT now()
);
-- Create employees table (with foreign key constraints)
CREATE TABLE employees (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID REFERENCES users(id) ON DELETE CASCADE,
    department_id UUID REFERENCES departments(id),
    employee_id VARCHAR(20) UNIQUE NOT NULL,
    position VARCHAR(100),
    salary DECIMAL(12,2),
    hire_date DATE,
    status VARCHAR(20) DEFAULT 'active',
    created_at TIMESTAMP DEFAULT now()
);
-- Create indexes
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_employees_dept ON employees(department_id);
CREATE INDEX idx_employees_status ON employees(status);
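-- For indexes on monotonically increasing values such as created_at, CockroachDB
-- also offers hash-sharded indexes to spread write load across ranges.
-- A sketch, assuming v22.1+ syntax with the default bucket count:
CREATE INDEX idx_users_created ON users (created_at) USING HASH;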
-- Insert data
INSERT INTO users (username, email, password_hash, first_name, last_name) VALUES
('john_doe', '[email protected]', 'hashed_password_1', 'John', 'Doe'),
('jane_smith', '[email protected]', 'hashed_password_2', 'Jane', 'Smith'),
('bob_wilson', '[email protected]', 'hashed_password_3', 'Bob', 'Wilson');
INSERT INTO departments (name, description) VALUES
('Engineering', 'Software development and technical operations'),
('Sales', 'Customer acquisition and revenue generation'),
('Marketing', 'Brand promotion and customer engagement');
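-- CockroachDB also supports idempotent writes: UPSERT operates on the primary key,
-- while PostgreSQL-style ON CONFLICT can target any unique constraint. Since these
-- tables use random UUID primary keys, this sketch targets the UNIQUE name column:
INSERT INTO departments (name, description)
VALUES ('Sales', 'Customer acquisition and revenue generation')
ON CONFLICT (name) DO UPDATE SET description = excluded.description;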
-- Complex query examples
-- Employee count and average salary by department
SELECT
    d.name AS department_name,
    COUNT(e.id) AS employee_count,
    AVG(e.salary) AS avg_salary,
    MIN(e.hire_date) AS earliest_hire,
    MAX(e.hire_date) AS latest_hire
FROM departments d
LEFT JOIN employees e ON d.id = e.department_id
GROUP BY d.id, d.name
ORDER BY employee_count DESC;
-- Salary ranking using window functions
SELECT
    u.first_name || ' ' || u.last_name AS full_name,
    d.name AS department,
    e.salary,
    RANK() OVER (PARTITION BY d.name ORDER BY e.salary DESC) AS salary_rank,
    DENSE_RANK() OVER (ORDER BY e.salary DESC) AS overall_rank
FROM employees e
JOIN users u ON e.user_id = u.id
JOIN departments d ON e.department_id = d.id
WHERE e.status = 'active'
ORDER BY e.salary DESC;
-- JSON operations example
CREATE TABLE user_profiles (
    user_id UUID PRIMARY KEY REFERENCES users(id),
    profile_data JSONB,
    preferences JSONB DEFAULT '{}',
    created_at TIMESTAMP DEFAULT now()
);
INSERT INTO user_profiles (user_id, profile_data, preferences) VALUES
(
    (SELECT id FROM users WHERE username = 'john_doe'),
    '{"skills": ["Python", "Go", "SQL"], "experience_years": 5, "certifications": ["AWS", "Kubernetes"]}',
    '{"theme": "dark", "notifications": true, "language": "en"}'
);
-- JSONB operations
SELECT
    u.username,
    up.profile_data->'skills' AS skills,
    up.profile_data->>'experience_years' AS experience,
    up.preferences->>'theme' AS preferred_theme
FROM users u
JOIN user_profiles up ON u.id = up.user_id
WHERE up.profile_data ? 'skills'
    AND up.profile_data->'skills' ? 'Python';
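-- The containment filters above can be accelerated with an inverted index
-- on the JSONB column; a minimal sketch:
CREATE INVERTED INDEX idx_profiles_data ON user_profiles (profile_data);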
Clustering and Replication Configuration
-- Check cluster information
SHOW CLUSTER SETTING cluster.organization;
SHOW CLUSTER SETTING version;
-- Replication configuration
-- Change from default 3 replicas to 5 replicas
ALTER RANGE default CONFIGURE ZONE USING num_replicas = 5;
-- Specific database replication settings
ALTER DATABASE company CONFIGURE ZONE USING num_replicas = 3, gc.ttlseconds = 86400;
-- Specific table replication settings
ALTER TABLE employees CONFIGURE ZONE USING
num_replicas = 5,
constraints = '[+region=us-east-1]',
lease_preferences = '[[+region=us-east-1]]';
-- Multi-region configuration
-- Set the cluster organization (part of enterprise license setup, needed for multi-region features)
SET CLUSTER SETTING cluster.organization = 'Company Inc.';
-- Configure database for multi-region
ALTER DATABASE company SET PRIMARY REGION "us-east-1";
ALTER DATABASE company ADD REGION "us-west-1";
ALTER DATABASE company ADD REGION "europe-west1";
-- Table geographic distribution settings
ALTER TABLE users SET LOCALITY GLOBAL;
ALTER TABLE departments SET LOCALITY REGIONAL BY TABLE IN "us-east-1";
-- REGIONAL BY ROW AS requires a region column typed as the database's region enum
ALTER TABLE employees SET LOCALITY REGIONAL BY ROW AS region;
-- Partitioning example (enterprise feature; partition columns must be a prefix of the primary key)
CREATE TABLE orders (
    id UUID DEFAULT gen_random_uuid(),
    customer_id UUID,
    region VARCHAR(20),
    order_date DATE,
    total_amount DECIMAL(12,2),
    status VARCHAR(20),
    PRIMARY KEY (region, id)
) PARTITION BY LIST (region) (
    PARTITION us_east VALUES IN ('us-east-1'),
    PARTITION us_west VALUES IN ('us-west-1'),
    PARTITION europe VALUES IN ('europe-west1')
);
-- Partition-specific constraint configuration
ALTER PARTITION us_east OF TABLE orders
CONFIGURE ZONE USING constraints = '[+region=us-east-1]';
ALTER PARTITION us_west OF TABLE orders
CONFIGURE ZONE USING constraints = '[+region=us-west-1]';
ALTER PARTITION europe OF TABLE orders
CONFIGURE ZONE USING constraints = '[+region=europe-west1]';
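-- To confirm that zone configurations, partitions, and regions took effect,
-- CockroachDB provides SHOW statements (output columns vary by version):
SHOW ZONE CONFIGURATION FROM TABLE employees;
SHOW PARTITIONS FROM TABLE orders;
SHOW REGIONS FROM DATABASE company;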
Performance Optimization and Monitoring
-- Cluster setting optimization
SET CLUSTER SETTING sql.stats.automatic_collection.enabled = true;
SET CLUSTER SETTING sql.stats.histogram_collection.enabled = true;
SET CLUSTER SETTING kv.range_merge.queue_enabled = true;
SET CLUSTER SETTING kv.raft.command.max_size = '64MiB';
-- Query execution plan analysis
EXPLAIN ANALYZE (DISTSQL)
SELECT d.name, COUNT(*) AS emp_count, AVG(e.salary) AS avg_salary
FROM departments d
JOIN employees e ON d.id = e.department_id
GROUP BY d.id, d.name;
-- Slow query analysis (internal schema varies by version; shown for crdb_internal.node_statement_statistics)
SELECT
    application_name,
    key AS query,
    count AS exec_count,
    rows_avg,
    service_lat_avg,
    overhead_lat_avg
FROM crdb_internal.node_statement_statistics
WHERE service_lat_avg > 0.1  -- average service latency in seconds
ORDER BY service_lat_avg DESC
LIMIT 10;
-- Index usage analysis (join with table_indexes to resolve names; schema varies by version)
SELECT
    ti.descriptor_name AS table_name,
    ti.index_name,
    us.total_reads,
    us.last_read
FROM crdb_internal.index_usage_statistics AS us
JOIN crdb_internal.table_indexes AS ti
    ON us.table_id = ti.descriptor_id AND us.index_id = ti.index_id
ORDER BY us.total_reads DESC;
-- Transaction contention analysis
SELECT * FROM crdb_internal.cluster_contended_tables;
SELECT * FROM crdb_internal.cluster_contended_indexes;
-- Range distribution analysis
SELECT
    range_id,
    start_key,
    end_key,
    replicas,
    lease_holder
FROM crdb_internal.ranges
WHERE database_name = 'company'
ORDER BY start_key;
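-- crdb_internal.ranges is an internal, unversioned interface and can be expensive
-- on large clusters; the supported statement for the same information is SHOW RANGES:
SHOW RANGES FROM TABLE employees;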
-- Node-specific load analysis
SELECT
    node_id,
    store_id,
    capacity,
    available,
    used,
    logical_bytes,
    range_count
FROM crdb_internal.kv_store_status;
-- Memory usage optimization
SET CLUSTER SETTING sql.distsql.temp_storage.workmem = '64MiB';
SET CLUSTER SETTING kv.bulk_io_write.concurrent_addsstable_requests = 5;
SET CLUSTER SETTING kv.bulk_io_write.max_rate = '500MB';
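-- When experimenting with cluster settings, record the current value first;
-- RESET restores the default:
SHOW CLUSTER SETTING kv.raft.command.max_size;
RESET CLUSTER SETTING kv.raft.command.max_size;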
Transaction Management and Concurrency Control
-- Explicit transactions
BEGIN;
-- Set isolation level
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- Multi-table updates
UPDATE employees SET salary = salary * 1.1
WHERE department_id = (SELECT id FROM departments WHERE name = 'Engineering');
UPDATE departments SET description = 'Advanced software development'
WHERE name = 'Engineering';
-- Using savepoints
SAVEPOINT engineering_update;
INSERT INTO user_profiles (user_id, profile_data)
VALUES ((SELECT user_id FROM employees WHERE employee_id = 'ENG001'),
'{"role": "senior", "updated": true}');
-- Conditional rollback
-- ROLLBACK TO SAVEPOINT engineering_update; -- On error
COMMIT;
-- Distributed transaction example
BEGIN;
-- Multi-region updates
UPDATE users SET updated_at = now()
WHERE id IN (
    SELECT user_id FROM employees
    WHERE department_id IN (
        SELECT id FROM departments WHERE name IN ('Engineering', 'Sales')
    )
);
-- Conditional insert
INSERT INTO departments (name, description)
SELECT 'DevOps', 'Infrastructure and deployment'
WHERE NOT EXISTS (SELECT 1 FROM departments WHERE name = 'DevOps');
COMMIT;
-- Retryable transaction example
-- Application-side implementation example
-- RETRY_LOOP:
-- BEGIN;
--
-- SELECT account_balance FROM accounts WHERE id = $1 FOR UPDATE;
-- -- Balance check
-- UPDATE accounts SET balance = balance - $2 WHERE id = $1;
-- UPDATE accounts SET balance = balance + $2 WHERE id = $3;
--
-- COMMIT;
-- -- Return to RETRY_LOOP on retry error (40001)
-- Deadlock priority setting
SET TRANSACTION PRIORITY HIGH;
-- or
-- SET TRANSACTION PRIORITY LOW;
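-- The commented retry loop above corresponds to CockroachDB's documented
-- client-side retry protocol, built around the special savepoint cockroach_restart
-- (account IDs here are placeholders):
BEGIN;
SAVEPOINT cockroach_restart;
UPDATE accounts SET balance = balance - 100 WHERE id = 'account1';
UPDATE accounts SET balance = balance + 100 WHERE id = 'account2';
RELEASE SAVEPOINT cockroach_restart;
COMMIT;
-- On SQLSTATE 40001: ROLLBACK TO SAVEPOINT cockroach_restart; then re-issue the statements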
Backup, Restore and Maintenance
-- Database backup
BACKUP DATABASE company TO 's3://backup-bucket/company-backup?AWS_ACCESS_KEY_ID=xxx&AWS_SECRET_ACCESS_KEY=yyy';
-- Specific table backup
BACKUP TABLE company.employees TO 'nodelocal://1/backups/employees';
-- Incremental backup
BACKUP DATABASE company TO 's3://backup-bucket/company-incremental'
AS OF SYSTEM TIME '-1m'
WITH revision_history;
-- Restore
RESTORE DATABASE company FROM 's3://backup-bucket/company-backup';
-- Point-in-time restore
RESTORE DATABASE company FROM 's3://backup-bucket/company-backup'
AS OF SYSTEM TIME '2024-01-15 10:00:00';
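-- Before restoring, a backup collection can be inspected in place
-- (collection syntax shown for v22.1+; older releases used SHOW BACKUP '<uri>'):
SHOW BACKUPS IN 's3://backup-bucket/company-backup';
SHOW BACKUP FROM LATEST IN 's3://backup-bucket/company-backup';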
-- Cluster maintenance
-- Prepare for node shutdown
ALTER RANGE default CONFIGURE ZONE USING num_replicas = 5;
-- Node decommission
-- cockroach node decommission 4 --insecure --host=localhost:26257
-- Update statistics
ANALYZE employees;
-- Cleanup unnecessary data
DELETE FROM user_profiles WHERE created_at < '2023-01-01';
-- Vacuum is unnecessary: CockroachDB garbage-collects old MVCC data automatically
-- (retention is controlled per zone via gc.ttlseconds)
-- Cluster settings backup
SHOW ALL CLUSTER SETTINGS;
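-- Recurring backups can also be scheduled inside the database itself;
-- a sketch, where the label, bucket, and cadence are placeholders:
CREATE SCHEDULE company_daily
    FOR BACKUP DATABASE company INTO 's3://backup-bucket/scheduled?AUTH=implicit'
    RECURRING '@daily'
    FULL BACKUP '@weekly'
    WITH SCHEDULE OPTIONS first_run = 'now';
-- List existing schedules
SHOW SCHEDULES;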
Application Integration Examples
# Python psycopg2 connection example
import psycopg2
from psycopg2.extras import RealDictCursor
from datetime import datetime


class CockroachDBConnection:
    def __init__(self, connection_params):
        self.conn_params = connection_params
        self.conn = None

    def connect(self):
        """Connect to CockroachDB"""
        try:
            self.conn = psycopg2.connect(**self.conn_params)
            self.conn.set_session(autocommit=False)
            print("Connected to CockroachDB")
        except Exception as e:
            print(f"Connection error: {e}")
            raise

    def execute_with_retry(self, query, params=None, max_retries=3):
        """Execute a statement, retrying on serialization failures (SQLSTATE 40001)"""
        for attempt in range(max_retries):
            try:
                with self.conn.cursor(cursor_factory=RealDictCursor) as cur:
                    cur.execute(query, params)
                    # Statements with a result set (SELECT, RETURNING) have a description
                    result = cur.fetchall() if cur.description else cur.rowcount
                self.conn.commit()
                return result
            except psycopg2.errors.SerializationFailure as e:
                print(f"Retry {attempt + 1}/{max_retries}: {e}")
                self.conn.rollback()
                if attempt == max_retries - 1:
                    raise
            except Exception:
                self.conn.rollback()
                raise

    def create_user(self, username, email, first_name, last_name):
        """Create user"""
        query = """
            INSERT INTO users (username, email, password_hash, first_name, last_name)
            VALUES (%s, %s, %s, %s, %s)
            RETURNING id
        """
        password_hash = f"hashed_{username}"  # In practice, use proper hashing
        return self.execute_with_retry(
            query,
            (username, email, password_hash, first_name, last_name)
        )

    def get_user_with_department(self, user_id):
        """Get user with department information"""
        query = """
            SELECT
                u.id, u.username, u.email, u.first_name, u.last_name,
                d.name AS department_name,
                e.position, e.salary, e.hire_date
            FROM users u
            LEFT JOIN employees e ON u.id = e.user_id
            LEFT JOIN departments d ON e.department_id = d.id
            WHERE u.id = %s
        """
        return self.execute_with_retry(query, (user_id,))

    def transfer_between_accounts(self, from_account, to_account, amount):
        """Account transfer: all statements run in a single atomic transaction"""
        try:
            with self.conn.cursor() as cur:
                cur.execute("""
                    UPDATE accounts SET balance = balance - %s
                    WHERE id = %s AND balance >= %s
                """, (amount, from_account, amount))
                if cur.rowcount == 0:
                    raise ValueError("Insufficient balance or unknown account")
                cur.execute("""
                    UPDATE accounts SET balance = balance + %s
                    WHERE id = %s
                """, (amount, to_account))
                # Record transaction history
                cur.execute("""
                    INSERT INTO transaction_logs (from_account, to_account, amount, timestamp)
                    VALUES (%s, %s, %s, %s)
                """, (from_account, to_account, amount, datetime.now()))
            self.conn.commit()
            print(f"Transfer completed: {from_account} -> {to_account}, amount: {amount}")
        except Exception as e:
            print(f"Transfer error: {e}")
            self.conn.rollback()
            raise

    def close(self):
        """Close connection"""
        if self.conn:
            self.conn.close()


# Usage example
if __name__ == "__main__":
    # Connection parameters
    conn_params = {
        'host': 'localhost',
        'port': 26257,
        'database': 'company',
        'user': 'root',
        'sslmode': 'disable'  # Use proper SSL settings in production
    }
    db = CockroachDBConnection(conn_params)
    try:
        db.connect()
        # Create user
        result = db.create_user(
            'alice_cooper', '[email protected]',
            'Alice', 'Cooper'
        )
        print(f"User creation result: {result}")
        # Get user information
        user_info = db.get_user_with_department('some-uuid')
        print(f"User information: {user_info}")
    except Exception as e:
        print(f"Error: {e}")
    finally:
        db.close()
# Go language connection example (reference)
"""
package main

import (
    "context"
    "database/sql"
    "log"

    "github.com/cockroachdb/cockroach-go/v2/crdb"
    _ "github.com/lib/pq"
)

func main() {
    db, err := sql.Open("postgres",
        "postgresql://root@localhost:26257/company?sslmode=disable")
    if err != nil {
        log.Fatal("Database connection error:", err)
    }
    defer db.Close()

    // crdb.ExecuteTx automatically retries the closure on retryable errors (SQLSTATE 40001)
    err = crdb.ExecuteTx(context.Background(), db, nil,
        func(tx *sql.Tx) error {
            _, err := tx.Exec(
                "UPDATE accounts SET balance = balance - $1 WHERE id = $2",
                100, "account1")
            return err
        })
    if err != nil {
        log.Fatal("Transaction error:", err)
    }
}
"""