Distributed Cache
Overview
A distributed cache is a caching system that stores data across multiple machines or nodes, pooling their RAM into a single logical in-memory data store so applications get fast access to frequently used data.
Details
Distributed cache systems are essential infrastructure for modern large-scale web applications and microservices architectures. Major solutions include Redis (Remote Dictionary Server), Memcached, and Hazelcast. Redis is an in-memory data structure store known for high performance, supporting rich data types such as strings, lists, sets, sorted sets, hashes, bitmaps, and geospatial indexes. Memcached is a simple yet powerful general-purpose distributed memory caching system designed to speed up dynamic web applications. Hazelcast is a distributed in-memory data grid that provides distributed computing capabilities beyond caching alone.
The main benefits of distributed caching are application acceleration, scalability, fault tolerance, and performance improvement. Common architectural patterns include the Embedded Cache, Client/Server, and Sidecar patterns. The choice between systems depends on specific requirements: Memcached excels at straightforward key-value caching, Redis adds rich data structures on top of fast key-value access, and Hazelcast offers comprehensive distributed computing capabilities.
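As a quick illustration of the rich data types mentioned above, here is a minimal redis-py sketch that caches a string, a hash, and a sorted set (it assumes a local Redis server on the default port 6379; the key names are invented for the example):
import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

# String: simple key-value entry with a 60-second TTL
r.set('page:home', '<html>...</html>', ex=60)

# Hash: cache an object's fields individually
r.hset('user:42', mapping={'name': 'Alice', 'age': '30'})
print(r.hgetall('user:42'))

# Sorted set: e.g. a cached leaderboard ordered by score
r.zadd('leaderboard', {'alice': 120, 'bob': 95})
print(r.zrange('leaderboard', 0, -1, withscores=True))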
Pros and Cons
Pros
- Scalability: Additional cache servers can be added without disrupting existing operations as traffic grows
- High Availability: If one cache server fails, requests can be rerouted to other servers ensuring continuous availability
- Performance Improvement: Frequently accessed data is served from memory rather than a slower backing store, reducing fetch times and improving response times
- Fault Tolerance: Data replication eliminates single points of failure
- Load Distribution: Data and requests are distributed across multiple nodes, typically via consistent hashing (see the sketch after the Cons list), reducing overall system load
- Flexible Architecture: Supports various caching patterns (cache-aside, read-through, write-through, etc.)
- Memory Efficiency: Pools and manages large amounts of data across multiple machines' RAM
Cons
- Complexity: Increased system complexity compared to single-node cache
- Network Dependency: Network latency affects performance
- Data Consistency: Consistency management can be challenging in distributed environments
- Configuration and Management Costs: Requires setup, monitoring, and maintenance of multiple nodes
- Complex Failure Scenarios: More complex failure patterns than single systems (network partitions, node failures)
- Security: Need to ensure communication security between multiple nodes
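To make the load-distribution point concrete, below is a minimal sketch of the consistent hashing idea that distributed cache clients use to spread keys across nodes (the class and node names are invented for the example; production rings use more virtual nodes and handle node failures):
import bisect
import hashlib

class HashRing:
    # Maps keys to nodes; adding or removing a node only remaps nearby keys
    def __init__(self, nodes, vnodes=100):
        self.ring = {}        # hash position -> node
        self.sorted_keys = []
        for node in nodes:
            for i in range(vnodes):
                h = self._hash(f"{node}:{i}")
                self.ring[h] = node
                bisect.insort(self.sorted_keys, h)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, key):
        # Walk clockwise around the ring to the first node at or after the key
        h = self._hash(key)
        idx = bisect.bisect(self.sorted_keys, h) % len(self.sorted_keys)
        return self.ring[self.sorted_keys[idx]]

ring = HashRing(['cache-a:11211', 'cache-b:11211', 'cache-c:11211'])
print(ring.get_node('user:1001'))  # the same key always maps to the same node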
Key Links
- Redis Distributed Caching Overview
- Hazelcast Distributed Cache Foundations
- Caching Patterns for Microservices
- Major Distributed Caching Solutions Comparison
- Redis vs Hazelcast Comparison
- Memcached Official Site
Code Examples
Redis Distributed Cache Setup
from redis.sentinel import Sentinel
# High availability setup using Redis Sentinel
sentinel = Sentinel([
('localhost', 26379),
('localhost', 26380),
('localhost', 26381)
])
# Discover master and slave
master = sentinel.master_for('mymaster', socket_timeout=0.1)
slave = sentinel.slave_for('mymaster', socket_timeout=0.1)
# Write to master
master.set('key', 'value')
# Read from slave
value = slave.get('key')
# Redis Cluster usage example (redis-py 4.1+; the separate
# redis-py-cluster package is deprecated)
from redis.cluster import RedisCluster, ClusterNode
startup_nodes = [
    ClusterNode("127.0.0.1", 7000),
    ClusterNode("127.0.0.1", 7001),
    ClusterNode("127.0.0.1", 7002)
]
rc = RedisCluster(startup_nodes=startup_nodes, decode_responses=True)
rc.set("key", "value")
print(rc.get("key"))
Memcached Distributed Cache (Python)
from pymemcache import serde
from pymemcache.client.base import Client
from pymemcache.client.hash import HashClient
# Single server connection
client = Client(('localhost', 11211))
client.set('some_key', 'some_value')
result = client.get('some_key')
# Distributed cache with multiple servers
servers = [
('127.0.0.1', 11211),
('127.0.0.1', 11212),
('127.0.0.1', 11213)
]
# Distribution using consistent hashing; the pickle serde lets us
# store Python objects such as dicts
hash_client = HashClient(servers, serde=serde.pickle_serde)
# Set data (automatically distributed to appropriate servers)
hash_client.set('user:1001', {'name': 'Alice', 'age': 30})
hash_client.set('user:1002', {'name': 'Bob', 'age': 25})
# Get data
user_data = hash_client.get('user:1001')
print(user_data)
# Bulk operations
hash_client.set_many({
'product:1': {'name': 'Laptop', 'price': 999},
'product:2': {'name': 'Mouse', 'price': 29}
})
products = hash_client.get_many(['product:1', 'product:2'])
Hazelcast Distributed Cache (Java)
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;
import java.io.Serializable;
import java.util.concurrent.TimeUnit;
// Hazelcast cluster configuration
Config config = new Config();
config.setClusterName("dev-cluster");
// Network configuration
config.getNetworkConfig().getJoin().getMulticastConfig().setEnabled(false);
config.getNetworkConfig().getJoin().getTcpIpConfig()
.setEnabled(true)
.addMember("192.168.1.100")
.addMember("192.168.1.101")
.addMember("192.168.1.102");
// Create Hazelcast instance
HazelcastInstance hazelcast = Hazelcast.newHazelcastInstance(config);
// Use distributed map (cache)
IMap<String, String> cache = hazelcast.getMap("my-cache");
// Store data (automatically distributed across cluster)
cache.put("key1", "value1");
cache.put("key2", "value2");
// Retrieve data
String value = cache.get("key1");
System.out.println("Retrieved: " + value);
// Store data with TTL (Time To Live)
cache.put("temp-key", "temp-value", 30, TimeUnit.SECONDS);
// Distributed execution example: the task runs on every member, so it must
// be serializable to be sent across the network (hence the cast)
hazelcast.getExecutorService("default").executeOnAllMembers(
    (Runnable & Serializable) () ->
        System.out.println("Executed on member: "
            + Hazelcast.getAllHazelcastInstances().iterator().next()
                .getCluster().getLocalMember()));
Node.js Redis Distributed Cache
const Redis = require('ioredis');
// Redis Cluster configuration
const cluster = new Redis.Cluster([
{
host: '127.0.0.1',
port: 7000,
},
{
host: '127.0.0.1',
port: 7001,
},
{
host: '127.0.0.1',
port: 7002,
}
], {
redisOptions: {
password: 'your-password'
}
});
// Cache helper class
class DistributedCache {
constructor(redisCluster) {
this.redis = redisCluster;
}
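  // Fetch a value and JSON-decode it; returns null on a miss or on error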
async get(key) {
try {
const value = await this.redis.get(key);
return value ? JSON.parse(value) : null;
} catch (error) {
console.error('Cache get error:', error);
return null;
}
}
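  // JSON-encode and store a value with a TTL in seconds (default one hour)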
async set(key, value, ttl = 3600) {
try {
await this.redis.setex(key, ttl, JSON.stringify(value));
return true;
} catch (error) {
console.error('Cache set error:', error);
return false;
}
}
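  // Remove a key from the cache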
async del(key) {
try {
await this.redis.del(key);
return true;
} catch (error) {
console.error('Cache delete error:', error);
return false;
}
}
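  // Check whether a key currently exists in the cache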
async exists(key) {
try {
const result = await this.redis.exists(key);
return result === 1;
} catch (error) {
console.error('Cache exists error:', error);
return false;
}
}
}
// Usage example
const cache = new DistributedCache(cluster);
async function cacheExample() {
// Cache data
await cache.set('user:1001', {
name: 'Alice',
email: '[email protected]',
lastLogin: new Date()
}, 1800); // 30 minutes TTL
// Retrieve data
const userData = await cache.get('user:1001');
console.log('Cached user data:', userData);
  // Pattern-matching deletion. On a cluster, KEYS only queries a single
  // node and multi-key DEL across slots fails with CROSSSLOT, so scan
  // each master and delete keys individually (prefer scanStream over
  // KEYS on large datasets, since KEYS blocks the node)
  let deleted = 0;
  for (const node of cluster.nodes('master')) {
    const keys = await node.keys('user:*');
    for (const key of keys) {
      await cluster.del(key);
      deleted++;
    }
  }
  console.log(`Deleted ${deleted} keys`);
}
cacheExample();
Distributed Cache Architecture Patterns
# Cache-Aside Pattern (Lazy Loading)
class CacheAsidePattern:
def __init__(self, cache_client, database):
self.cache = cache_client
self.db = database
def get_user(self, user_id):
# 1. Check cache first
cached_user = self.cache.get(f"user:{user_id}")
if cached_user:
return cached_user
# 2. On cache miss, fetch from database
user = self.db.get_user(user_id)
if user:
# 3. Store in cache
self.cache.set(f"user:{user_id}", user, ttl=3600)
return user
# Write-Through Pattern
class WriteThroughPattern:
def __init__(self, cache_client, database):
self.cache = cache_client
self.db = database
def update_user(self, user_id, user_data):
# 1. Write to database
self.db.update_user(user_id, user_data)
# 2. Update cache simultaneously
self.cache.set(f"user:{user_id}", user_data, ttl=3600)
return user_data
# Write-Behind Pattern (Write-Back)
import asyncio
from collections import deque
class WriteBehindPattern:
def __init__(self, cache_client, database):
self.cache = cache_client
self.db = database
self.write_queue = deque()
self.start_background_writer()
async def update_user(self, user_id, user_data):
# 1. Update cache immediately
await self.cache.set(f"user:{user_id}", user_data, ttl=3600)
# 2. Add to write queue (async DB write)
self.write_queue.append(('update_user', user_id, user_data))
return user_data
async def background_writer(self):
while True:
if self.write_queue:
operation, user_id, user_data = self.write_queue.popleft()
try:
await self.db.update_user(user_id, user_data)
except Exception as e:
# On error, put back in queue
self.write_queue.appendleft((operation, user_id, user_data))
print(f"DB write error: {e}")
await asyncio.sleep(1) # Check every second
    def start_background_writer(self):
        # asyncio.create_task needs a running event loop, so this class
        # must be instantiated from within async code
        asyncio.create_task(self.background_writer())
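The Pros section also mentions the read-through pattern; for completeness, here is a minimal sketch of it (the class and loader names are hypothetical), in which the cache wrapper owns the database loader so callers never query the database directly:
# Read-Through Pattern: the cache loads missing entries itself
class ReadThroughCache:
    def __init__(self, cache_client, loader, ttl=3600):
        self.cache = cache_client
        self.loader = loader  # e.g. lambda user_id: db.get_user(user_id)
        self.ttl = ttl

    def get(self, key):
        value = self.cache.get(key)
        if value is None:
            # Cache miss: fetch from the backing store and populate the cache
            value = self.loader(key)
            if value is not None:
                self.cache.set(key, value, ttl=self.ttl)
        return value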