CockroachDB
Distributed SQL database. Supports global scale while maintaining strong consistency and ACID properties. Provides PostgreSQL-compatible SQL API.
Database Server
Overview
CockroachDB is a distributed SQL database designed for the cloud. True to its namesake, it is built to survive failures, providing a PostgreSQL-compatible SQL API while guaranteeing ACID transactions even across geographically distributed clusters. Automatic sharding, replication, and repair sharply reduce manual operational overhead, combining the reliability of a traditional RDBMS with the scalability of NoSQL. Developed by Cockroach Labs and inspired by Google's Spanner architecture, it provides a distributed database capable of supporting global-scale applications.
Details
As of 2025, CockroachDB has matured into an enterprise-grade distributed database through years of continuous improvement. Recent versions add Multi-Region SQL for optimizing geographic distribution, automatic scaling through CockroachDB Serverless, and vector indexing for machine-learning workloads. It transparently handles the classic challenges of distributed systems: the Raft consensus protocol guarantees consistency, Range-based data distribution provides automatic sharding, a Gossip protocol handles inter-node communication, and hybrid logical clocks (serving a role similar to Spanner's TrueTime) balance strong consistency with geographic distribution. PostgreSQL wire-protocol compatibility lets existing PostgreSQL applications migrate with minimal changes.
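A small illustration of what the hybrid-logical-clock design enables in practice: time-travel queries and follower reads via AS OF SYSTEM TIME (this sketch uses the users table defined later in the examples; follower reads historically required an enterprise license):
-- Read data as of 10 seconds ago, without blocking current writes
SELECT id, username FROM users AS OF SYSTEM TIME '-10s';
-- Let a nearby replica serve a consistent, slightly stale read
SELECT id, username FROM users AS OF SYSTEM TIME follower_read_timestamp();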
Key Features
- Strong Consistency: Complete ACID transaction guarantees even in geographically distributed environments
- Auto Scaling: Automatic data rebalancing as nodes are added or removed; Serverless scales compute with load
- PostgreSQL Compatible: High compatibility with existing PostgreSQL applications
- Automated Operations: Automatic sharding, replication, and failure recovery
- Multi-Region Support: Global datacenter distribution deployment
- Zero Downtime: Rolling upgrades and automatic failover during failures
Pros and Cons
Pros
- Horizontal scaling possible while maintaining strong consistency in distributed environments
- Easy migration of existing applications due to PostgreSQL compatibility
- Automated sharding, replication, and repair sharply reduce operational overhead and routine DBA work
- Automatic recovery from failures and zero-downtime upgrades
- First-class Kubernetes integration through cloud-native design
- Flexible usage-based billing through Serverless mode
Cons
- Steep learning curve; understanding the underlying distributed system takes time
- Geographically distributed transactions can incur added latency
- Not fully PostgreSQL compatible; some features are missing and some behaviors differ
- Can underperform single-node PostgreSQL due to distributed-coordination overhead
- Enterprise features require expensive commercial licensing
- Ecosystem and operational know-how are still maturing compared with established databases
Code Examples
Installation and Basic Setup
# Download CockroachDB binary (Linux)
curl https://binaries.cockroachdb.com/cockroach-v24.2.0.linux-amd64.tgz | tar -xz
sudo cp -i cockroach-v24.2.0.linux-amd64/cockroach /usr/local/bin/
# macOS installation via Homebrew
brew install cockroachdb/tap/cockroach
# Docker execution
docker pull cockroachdb/cockroach:v24.2.0
docker run -d \
  --name=cockroach-node1 \
  --hostname=cockroach-node1 \
  -p 26257:26257 -p 8080:8080 \
  -v cockroach-data:/cockroach/cockroach-data \
  cockroachdb/cockroach:v24.2.0 start-single-node \
  --insecure
# 3-node cluster with Docker Compose
cat > docker-compose.yml << 'EOF'
version: '3.8'
services:
  cockroach-node1:
    image: cockroachdb/cockroach:v24.2.0
    command: start --insecure --join=cockroach-node1,cockroach-node2,cockroach-node3
    ports:
      - "26257:26257"
      - "8080:8080"
    volumes:
      - cockroach-data1:/cockroach/cockroach-data
  cockroach-node2:
    image: cockroachdb/cockroach:v24.2.0
    command: start --insecure --join=cockroach-node1,cockroach-node2,cockroach-node3
    volumes:
      - cockroach-data2:/cockroach/cockroach-data
    depends_on:
      - cockroach-node1
  cockroach-node3:
    image: cockroachdb/cockroach:v24.2.0
    command: start --insecure --join=cockroach-node1,cockroach-node2,cockroach-node3
    volumes:
      - cockroach-data3:/cockroach/cockroach-data
    depends_on:
      - cockroach-node1
volumes:
  cockroach-data1:
  cockroach-data2:
  cockroach-data3:
EOF
docker-compose up -d
# Cluster initialization
cockroach init --insecure --host=localhost:26257
# Access admin UI
# Navigate to http://localhost:8080
# CLI connection
cockroach sql --insecure --host=localhost:26257
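-- Once connected, a few statements confirm the cluster is healthy
-- (a minimal sketch; crdb_internal columns vary by version)
SELECT version();
SHOW DATABASES;
SELECT node_id, address, is_live FROM crdb_internal.gossip_nodes;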
Basic SQL Operations and Database Management
-- Create database
CREATE DATABASE company;
USE company;
-- Create users table
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    created_at TIMESTAMP DEFAULT now(),
    updated_at TIMESTAMP DEFAULT now(),
    is_active BOOLEAN DEFAULT true
);
-- Create departments table
CREATE TABLE departments (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(100) UNIQUE NOT NULL,
    description TEXT,
    manager_id UUID REFERENCES users(id),
    created_at TIMESTAMP DEFAULT now()
);
-- Create employees table (with foreign key constraints)
CREATE TABLE employees (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID REFERENCES users(id) ON DELETE CASCADE,
    department_id UUID REFERENCES departments(id),
    employee_id VARCHAR(20) UNIQUE NOT NULL,
    position VARCHAR(100),
    salary DECIMAL(12,2),
    hire_date DATE,
    status VARCHAR(20) DEFAULT 'active',
    created_at TIMESTAMP DEFAULT now()
);
-- Create indexes
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_employees_dept ON employees(department_id);
CREATE INDEX idx_employees_status ON employees(status);
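-- For indexes on monotonically increasing values such as created_at, CockroachDB
-- also offers hash-sharded indexes to spread write load across ranges.
-- A sketch, assuming v22.1+ syntax with the default bucket count:
CREATE INDEX idx_users_created ON users (created_at) USING HASH;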
-- Insert data
INSERT INTO users (username, email, password_hash, first_name, last_name) VALUES
('john_doe', '[email protected]', 'hashed_password_1', 'John', 'Doe'),
('jane_smith', '[email protected]', 'hashed_password_2', 'Jane', 'Smith'),
('bob_wilson', '[email protected]', 'hashed_password_3', 'Bob', 'Wilson');
INSERT INTO departments (name, description) VALUES
('Engineering', 'Software development and technical operations'),
('Sales', 'Customer acquisition and revenue generation'),
('Marketing', 'Brand promotion and customer engagement');
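-- CockroachDB also supports idempotent writes: UPSERT operates on the primary key,
-- while PostgreSQL-style ON CONFLICT can target any unique constraint. Since these
-- tables use random UUID primary keys, this sketch targets the UNIQUE name column:
INSERT INTO departments (name, description)
VALUES ('Sales', 'Customer acquisition and revenue generation')
ON CONFLICT (name) DO UPDATE SET description = excluded.description;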
-- Complex query examples
-- Employee count and average salary by department
SELECT
    d.name AS department_name,
    COUNT(e.id) AS employee_count,
    AVG(e.salary) AS avg_salary,
    MIN(e.hire_date) AS earliest_hire,
    MAX(e.hire_date) AS latest_hire
FROM departments d
LEFT JOIN employees e ON d.id = e.department_id
GROUP BY d.id, d.name
ORDER BY employee_count DESC;
-- Salary ranking using window functions
SELECT
    u.first_name || ' ' || u.last_name AS full_name,
    d.name AS department,
    e.salary,
    RANK() OVER (PARTITION BY d.name ORDER BY e.salary DESC) AS salary_rank,
    DENSE_RANK() OVER (ORDER BY e.salary DESC) AS overall_rank
FROM employees e
JOIN users u ON e.user_id = u.id
JOIN departments d ON e.department_id = d.id
WHERE e.status = 'active'
ORDER BY e.salary DESC;
-- JSON operations example
CREATE TABLE user_profiles (
    user_id UUID PRIMARY KEY REFERENCES users(id),
    profile_data JSONB,
    preferences JSONB DEFAULT '{}',
    created_at TIMESTAMP DEFAULT now()
);
INSERT INTO user_profiles (user_id, profile_data, preferences) VALUES
(
    (SELECT id FROM users WHERE username = 'john_doe'),
    '{"skills": ["Python", "Go", "SQL"], "experience_years": 5, "certifications": ["AWS", "Kubernetes"]}',
    '{"theme": "dark", "notifications": true, "language": "en"}'
);
-- JSONB operations
SELECT
    u.username,
    up.profile_data->'skills' AS skills,
    up.profile_data->>'experience_years' AS experience,
    up.preferences->>'theme' AS preferred_theme
FROM users u
JOIN user_profiles up ON u.id = up.user_id
WHERE up.profile_data ? 'skills'
    AND up.profile_data->'skills' ? 'Python';
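-- The containment filters above can be accelerated with an inverted index
-- on the JSONB column; a minimal sketch:
CREATE INVERTED INDEX idx_profiles_data ON user_profiles (profile_data);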
Clustering and Replication Configuration
-- Check cluster information
SHOW CLUSTER SETTING cluster.organization;
SHOW CLUSTER SETTING version;
-- Replication configuration
-- Change from default 3 replicas to 5 replicas
ALTER RANGE default CONFIGURE ZONE USING num_replicas = 5;
-- Specific database replication settings
ALTER DATABASE company CONFIGURE ZONE USING num_replicas = 3, gc.ttlseconds = 86400;
-- Specific table replication settings
ALTER TABLE employees CONFIGURE ZONE USING
num_replicas = 5,
constraints = '[+region=us-east-1]',
lease_preferences = '[[+region=us-east-1]]';
-- Multi-region configuration
-- Set the cluster organization (part of enterprise license setup, needed for multi-region features)
SET CLUSTER SETTING cluster.organization = 'Company Inc.';
-- Configure database for multi-region
ALTER DATABASE company SET PRIMARY REGION "us-east-1";
ALTER DATABASE company ADD REGION "us-west-1";
ALTER DATABASE company ADD REGION "europe-west1";
-- Table geographic distribution settings
ALTER TABLE users SET LOCALITY GLOBAL;
ALTER TABLE departments SET LOCALITY REGIONAL BY TABLE IN "us-east-1";
-- REGIONAL BY ROW AS requires a region column typed as the database's region enum
ALTER TABLE employees SET LOCALITY REGIONAL BY ROW AS region;
-- Partitioning example (enterprise feature; partition columns must be a prefix of the primary key)
CREATE TABLE orders (
    id UUID DEFAULT gen_random_uuid(),
    customer_id UUID,
    region VARCHAR(20),
    order_date DATE,
    total_amount DECIMAL(12,2),
    status VARCHAR(20),
    PRIMARY KEY (region, id)
) PARTITION BY LIST (region) (
    PARTITION us_east VALUES IN ('us-east-1'),
    PARTITION us_west VALUES IN ('us-west-1'),
    PARTITION europe VALUES IN ('europe-west1')
);
-- Partition-specific constraint configuration
ALTER PARTITION us_east OF TABLE orders
CONFIGURE ZONE USING constraints = '[+region=us-east-1]';
ALTER PARTITION us_west OF TABLE orders
CONFIGURE ZONE USING constraints = '[+region=us-west-1]';
ALTER PARTITION europe OF TABLE orders
CONFIGURE ZONE USING constraints = '[+region=europe-west1]';
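-- To confirm that zone configurations, partitions, and regions took effect,
-- CockroachDB provides SHOW statements (output columns vary by version):
SHOW ZONE CONFIGURATION FROM TABLE employees;
SHOW PARTITIONS FROM TABLE orders;
SHOW REGIONS FROM DATABASE company;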
Performance Optimization and Monitoring
-- Cluster setting optimization
SET CLUSTER SETTING sql.stats.automatic_collection.enabled = true;
SET CLUSTER SETTING sql.stats.histogram_collection.enabled = true;
SET CLUSTER SETTING kv.range_merge.queue_enabled = true;
SET CLUSTER SETTING kv.raft.command.max_size = '64MiB';
-- Query execution plan analysis
EXPLAIN ANALYZE (DISTSQL)
SELECT d.name, COUNT(*) AS emp_count, AVG(e.salary) AS avg_salary
FROM departments d
JOIN employees e ON d.id = e.department_id
GROUP BY d.id, d.name;
-- Slow query analysis (internal schema varies by version; shown for crdb_internal.node_statement_statistics)
SELECT
    application_name,
    key AS query,
    count AS exec_count,
    rows_avg,
    service_lat_avg,
    overhead_lat_avg
FROM crdb_internal.node_statement_statistics
WHERE service_lat_avg > 0.1  -- average service latency in seconds
ORDER BY service_lat_avg DESC
LIMIT 10;
-- Index usage analysis (join with table_indexes to resolve names; schema varies by version)
SELECT
    ti.descriptor_name AS table_name,
    ti.index_name,
    us.total_reads,
    us.last_read
FROM crdb_internal.index_usage_statistics AS us
JOIN crdb_internal.table_indexes AS ti
    ON us.table_id = ti.descriptor_id AND us.index_id = ti.index_id
ORDER BY us.total_reads DESC;
-- Transaction contention analysis
SELECT * FROM crdb_internal.cluster_contended_tables;
SELECT * FROM crdb_internal.cluster_contended_indexes;
-- Range distribution analysis
SELECT
    range_id,
    start_key,
    end_key,
    replicas,
    lease_holder
FROM crdb_internal.ranges
WHERE database_name = 'company'
ORDER BY start_key;
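-- crdb_internal.ranges is an internal, unversioned interface and can be expensive
-- on large clusters; the supported statement for the same information is SHOW RANGES:
SHOW RANGES FROM TABLE employees;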
-- Node-specific load analysis
SELECT
    node_id,
    store_id,
    capacity,
    available,
    used,
    logical_bytes,
    range_count
FROM crdb_internal.kv_store_status;
-- Memory usage optimization
SET CLUSTER SETTING sql.distsql.temp_storage.workmem = '64MiB';
SET CLUSTER SETTING kv.bulk_io_write.concurrent_addsstable_requests = 5;
SET CLUSTER SETTING kv.bulk_io_write.max_rate = '500MB';
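-- When experimenting with cluster settings, record the current value first;
-- RESET restores the default:
SHOW CLUSTER SETTING kv.raft.command.max_size;
RESET CLUSTER SETTING kv.raft.command.max_size;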
Transaction Management and Concurrency Control
-- Explicit transactions
BEGIN;
-- Set isolation level
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- Multi-table updates
UPDATE employees SET salary = salary * 1.1
WHERE department_id = (SELECT id FROM departments WHERE name = 'Engineering');
UPDATE departments SET description = 'Advanced software development'
WHERE name = 'Engineering';
-- Using savepoints
SAVEPOINT engineering_update;
INSERT INTO user_profiles (user_id, profile_data)
VALUES ((SELECT user_id FROM employees WHERE employee_id = 'ENG001'),
'{"role": "senior", "updated": true}');
-- Conditional rollback
-- ROLLBACK TO SAVEPOINT engineering_update; -- On error
COMMIT;
-- Distributed transaction example
BEGIN;
-- Multi-region updates
UPDATE users SET updated_at = now()
WHERE id IN (
    SELECT user_id FROM employees
    WHERE department_id IN (
        SELECT id FROM departments WHERE name IN ('Engineering', 'Sales')
    )
);
-- Conditional insert
INSERT INTO departments (name, description)
SELECT 'DevOps', 'Infrastructure and deployment'
WHERE NOT EXISTS (SELECT 1 FROM departments WHERE name = 'DevOps');
COMMIT;
-- Retryable transaction example
-- Application-side implementation example
-- RETRY_LOOP:
-- BEGIN;
--
-- SELECT account_balance FROM accounts WHERE id = $1 FOR UPDATE;
-- -- Balance check
-- UPDATE accounts SET balance = balance - $2 WHERE id = $1;
-- UPDATE accounts SET balance = balance + $2 WHERE id = $3;
--
-- COMMIT;
-- -- Return to RETRY_LOOP on retry error (40001)
-- Deadlock priority setting
SET TRANSACTION PRIORITY HIGH;
-- or
-- SET TRANSACTION PRIORITY LOW;
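-- The commented retry loop above corresponds to CockroachDB's documented
-- client-side retry protocol, built around the special savepoint cockroach_restart
-- (account IDs here are placeholders):
BEGIN;
SAVEPOINT cockroach_restart;
UPDATE accounts SET balance = balance - 100 WHERE id = 'account1';
UPDATE accounts SET balance = balance + 100 WHERE id = 'account2';
RELEASE SAVEPOINT cockroach_restart;
COMMIT;
-- On SQLSTATE 40001: ROLLBACK TO SAVEPOINT cockroach_restart; then re-issue the statements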
Backup, Restore and Maintenance
-- Database backup
BACKUP DATABASE company TO 's3://backup-bucket/company-backup?AWS_ACCESS_KEY_ID=xxx&AWS_SECRET_ACCESS_KEY=yyy';
-- Specific table backup
BACKUP TABLE company.employees TO 'nodelocal://1/backups/employees';
-- Incremental backup
BACKUP DATABASE company TO 's3://backup-bucket/company-incremental'
AS OF SYSTEM TIME '-1m'
WITH revision_history;
-- Restore
RESTORE DATABASE company FROM 's3://backup-bucket/company-backup';
-- Point-in-time restore
RESTORE DATABASE company FROM 's3://backup-bucket/company-backup'
AS OF SYSTEM TIME '2024-01-15 10:00:00';
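-- Before restoring, a backup collection can be inspected in place
-- (collection syntax shown for v22.1+; older releases used SHOW BACKUP '<uri>'):
SHOW BACKUPS IN 's3://backup-bucket/company-backup';
SHOW BACKUP FROM LATEST IN 's3://backup-bucket/company-backup';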
-- Cluster maintenance
-- Prepare for node shutdown
ALTER RANGE default CONFIGURE ZONE USING num_replicas = 5;
-- Node decommission
-- cockroach node decommission 4 --insecure --host=localhost:26257
-- Update statistics
ANALYZE employees;
-- Cleanup unnecessary data
DELETE FROM user_profiles WHERE created_at < '2023-01-01';
-- Vacuum is unnecessary: CockroachDB garbage-collects old MVCC data automatically
-- (retention is controlled per zone via gc.ttlseconds)
-- Cluster settings backup
SHOW ALL CLUSTER SETTINGS;
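-- Recurring backups can also be scheduled inside the database itself;
-- a sketch, where the label, bucket, and cadence are placeholders:
CREATE SCHEDULE company_daily
    FOR BACKUP DATABASE company INTO 's3://backup-bucket/scheduled?AUTH=implicit'
    RECURRING '@daily'
    FULL BACKUP '@weekly'
    WITH SCHEDULE OPTIONS first_run = 'now';
-- List existing schedules
SHOW SCHEDULES;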
Application Integration Examples
# Python psycopg2 connection example
import psycopg2
from psycopg2.extras import RealDictCursor
from datetime import datetime


class CockroachDBConnection:
    def __init__(self, connection_params):
        self.conn_params = connection_params
        self.conn = None

    def connect(self):
        """Connect to CockroachDB"""
        try:
            self.conn = psycopg2.connect(**self.conn_params)
            self.conn.set_session(autocommit=False)
            print("Connected to CockroachDB")
        except Exception as e:
            print(f"Connection error: {e}")
            raise

    def execute_with_retry(self, query, params=None, max_retries=3):
        """Execute a statement, retrying on serialization failures (SQLSTATE 40001)"""
        for attempt in range(max_retries):
            try:
                with self.conn.cursor(cursor_factory=RealDictCursor) as cur:
                    cur.execute(query, params)
                    # Statements with a result set (SELECT, RETURNING) have a description
                    result = cur.fetchall() if cur.description else cur.rowcount
                self.conn.commit()
                return result
            except psycopg2.errors.SerializationFailure as e:
                print(f"Retry {attempt + 1}/{max_retries}: {e}")
                self.conn.rollback()
                if attempt == max_retries - 1:
                    raise
            except Exception:
                self.conn.rollback()
                raise

    def create_user(self, username, email, first_name, last_name):
        """Create user"""
        query = """
            INSERT INTO users (username, email, password_hash, first_name, last_name)
            VALUES (%s, %s, %s, %s, %s)
            RETURNING id
        """
        password_hash = f"hashed_{username}"  # In practice, use proper hashing
        return self.execute_with_retry(
            query,
            (username, email, password_hash, first_name, last_name)
        )

    def get_user_with_department(self, user_id):
        """Get user with department information"""
        query = """
            SELECT
                u.id, u.username, u.email, u.first_name, u.last_name,
                d.name AS department_name,
                e.position, e.salary, e.hire_date
            FROM users u
            LEFT JOIN employees e ON u.id = e.user_id
            LEFT JOIN departments d ON e.department_id = d.id
            WHERE u.id = %s
        """
        return self.execute_with_retry(query, (user_id,))

    def transfer_between_accounts(self, from_account, to_account, amount):
        """Account transfer: all statements run in a single atomic transaction"""
        try:
            with self.conn.cursor() as cur:
                cur.execute("""
                    UPDATE accounts SET balance = balance - %s
                    WHERE id = %s AND balance >= %s
                """, (amount, from_account, amount))
                if cur.rowcount == 0:
                    raise ValueError("Insufficient balance or unknown account")
                cur.execute("""
                    UPDATE accounts SET balance = balance + %s
                    WHERE id = %s
                """, (amount, to_account))
                # Record transaction history
                cur.execute("""
                    INSERT INTO transaction_logs (from_account, to_account, amount, timestamp)
                    VALUES (%s, %s, %s, %s)
                """, (from_account, to_account, amount, datetime.now()))
            self.conn.commit()
            print(f"Transfer completed: {from_account} -> {to_account}, amount: {amount}")
        except Exception as e:
            print(f"Transfer error: {e}")
            self.conn.rollback()
            raise

    def close(self):
        """Close connection"""
        if self.conn:
            self.conn.close()


# Usage example
if __name__ == "__main__":
    # Connection parameters
    conn_params = {
        'host': 'localhost',
        'port': 26257,
        'database': 'company',
        'user': 'root',
        'sslmode': 'disable'  # Use proper SSL settings in production
    }
    db = CockroachDBConnection(conn_params)
    try:
        db.connect()
        # Create user
        result = db.create_user(
            'alice_cooper', '[email protected]',
            'Alice', 'Cooper'
        )
        print(f"User creation result: {result}")
        # Get user information
        user_info = db.get_user_with_department('some-uuid')
        print(f"User information: {user_info}")
    except Exception as e:
        print(f"Error: {e}")
    finally:
        db.close()
# Go language connection example (reference)
"""
package main

import (
    "context"
    "database/sql"
    "log"

    "github.com/cockroachdb/cockroach-go/v2/crdb"
    _ "github.com/lib/pq"
)

func main() {
    db, err := sql.Open("postgres",
        "postgresql://root@localhost:26257/company?sslmode=disable")
    if err != nil {
        log.Fatal("Database connection error:", err)
    }
    defer db.Close()

    // crdb.ExecuteTx automatically retries the closure on retryable errors (SQLSTATE 40001)
    err = crdb.ExecuteTx(context.Background(), db, nil,
        func(tx *sql.Tx) error {
            _, err := tx.Exec(
                "UPDATE accounts SET balance = balance - $1 WHERE id = $2",
                100, "account1")
            return err
        })
    if err != nil {
        log.Fatal("Transaction error:", err)
    }
}
"""