software-design|March 19, 2026|11 min read

System Design Patterns for Real-Time Updates at High Traffic

TL;DR

Use WebSockets for bidirectional real-time (chat, gaming). Use SSE for server-push only (notifications, feeds). Scale horizontally with a pub/sub backbone (Redis or NATS) so any server node can push to any connected client. Fan-out is the hard part — resolve subscribers once, push via channels, and always handle backpressure.

System Design Patterns for Real-Time Updates at High Traffic

The previous articles in this series covered scaling reads and scaling writes. Both deal with request-response patterns — client asks, server answers. Real-time systems flip that model. The server pushes data to clients the moment something changes, without waiting for a request.

This is the hardest scaling problem in web architecture. A single chat message in a 10,000-member group means 10,000 pushes. A stock price tick means pushing to every connected trader. A live sports score means millions of simultaneous updates.

This article covers the patterns I’ve used to build and scale real-time systems from thousands to millions of concurrent connections.

The Architecture at a Glance

Real-Time Architecture

The architecture has four concerns: transport (how clients connect), routing (which server has which client), fan-out (who gets each message), and reliability (what happens when connections drop).


Choosing a Transport

The first decision: how do clients receive updates?

Transport Comparison

WebSocket — The Default Choice

WebSocket provides a persistent, bidirectional TCP connection. After an HTTP upgrade handshake, both client and server can send frames at any time. This is the right choice for most real-time applications.

// --- Server: Node.js with ws library ---
import { WebSocketServer, WebSocket } from 'ws';
import { createServer } from 'http';

const server = createServer();
const wss = new WebSocketServer({ server });

// Track connections by user
const connections = new Map<string, Set<WebSocket>>();

wss.on('connection', (ws, req) => {
  const userId = authenticateFromHeaders(req.headers);
  if (!userId) {
    ws.close(4001, 'Unauthorized');
    return;
  }

  // Register connection
  if (!connections.has(userId)) {
    connections.set(userId, new Set());
  }
  connections.get(userId)!.add(ws);

  ws.on('message', (data) => {
    const message = JSON.parse(data.toString());
    handleMessage(userId, message);
  });

  ws.on('close', () => {
    connections.get(userId)?.delete(ws);
    if (connections.get(userId)?.size === 0) {
      connections.delete(userId);
    }
  });

  // Heartbeat to detect dead connections
  ws.isAlive = true;
  ws.on('pong', () => { ws.isAlive = true; });
});

// Ping all clients every 30s — kill dead ones
setInterval(() => {
  wss.clients.forEach((ws) => {
    if (!ws.isAlive) return ws.terminate();
    ws.isAlive = false;
    ws.ping();
  });
}, 30_000);

// Push to a specific user (all their connections)
function pushToUser(userId: string, event: any): void {
  const sockets = connections.get(userId);
  if (!sockets) return;

  const payload = JSON.stringify(event);
  for (const ws of sockets) {
    if (ws.readyState === WebSocket.OPEN) {
      ws.send(payload);
    }
  }
}

server.listen(8080);
// --- Client: browser ---
class RealtimeClient {
  private ws: WebSocket | null = null;
  private url: string;
  private reconnectDelay = 1000;
  private maxDelay = 30000;
  private handlers = new Map<string, Function[]>();

  constructor(url: string) {
    this.url = url;
    this.connect();
  }

  private connect(): void {
    this.ws = new WebSocket(this.url);

    this.ws.onopen = () => {
      console.log('Connected');
      this.reconnectDelay = 1000; // Reset backoff
    };

    this.ws.onmessage = (event) => {
      const data = JSON.parse(event.data);
      const listeners = this.handlers.get(data.type) || [];
      listeners.forEach((fn) => fn(data.payload));
    };

    this.ws.onclose = () => {
      // Exponential backoff with jitter
      const jitter = Math.random() * 1000;
      const delay = Math.min(
        this.reconnectDelay + jitter,
        this.maxDelay
      );
      setTimeout(() => this.connect(), delay);
      this.reconnectDelay *= 2;
    };
  }

  on(eventType: string, handler: Function): void {
    if (!this.handlers.has(eventType)) {
      this.handlers.set(eventType, []);
    }
    this.handlers.get(eventType)!.push(handler);
  }

  send(data: any): void {
    if (this.ws?.readyState === WebSocket.OPEN) {
      this.ws.send(JSON.stringify(data));
    }
  }
}

// Usage
const client = new RealtimeClient('wss://api.example.com/ws');
client.on('message', (msg) => renderMessage(msg));
client.on('typing', (data) => showTypingIndicator(data));

Server-Sent Events — When You Only Need Push

If the client never sends data over the real-time channel (notifications, live feeds, dashboards), SSE is simpler and scales better. It uses plain HTTP, works through every proxy, and has built-in reconnection with event IDs.

// --- Server: Express SSE endpoint ---
import express from 'express';

const app = express();
const clients = new Map<string, express.Response[]>();

app.get('/events/:userId', (req, res) => {
  const { userId } = req.params;

  // SSE headers
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive',
    'X-Accel-Buffering': 'no', // Disable nginx buffering
  });

  // Send initial connection event
  res.write(`event: connected\ndata: {"status":"ok"}\n\n`);

  // Register
  if (!clients.has(userId)) clients.set(userId, []);
  clients.get(userId)!.push(res);

  // Keepalive comment every 15s (prevents proxy timeouts)
  const keepalive = setInterval(() => {
    res.write(': keepalive\n\n');
  }, 15_000);

  req.on('close', () => {
    clearInterval(keepalive);
    const userClients = clients.get(userId);
    if (userClients) {
      const idx = userClients.indexOf(res);
      if (idx > -1) userClients.splice(idx, 1);
      if (userClients.length === 0) clients.delete(userId);
    }
  });
});

// Push to user — with event ID for resume
let eventId = 0;
function pushSSE(userId: string, eventType: string, data: any): void {
  const userClients = clients.get(userId);
  if (!userClients) return;

  eventId++;
  const payload = [
    `id: ${eventId}`,
    `event: ${eventType}`,
    `data: ${JSON.stringify(data)}`,
    '',
    '',
  ].join('\n');

  for (const res of userClients) {
    res.write(payload);
  }
}
// --- Client: native EventSource ---
const events = new EventSource('/events/user123');

// Auto-reconnects with Last-Event-ID header
events.addEventListener('message', (e) => {
  const msg = JSON.parse(e.data);
  renderMessage(msg);
});

events.addEventListener('notification', (e) => {
  showNotification(JSON.parse(e.data));
});

events.onerror = () => {
  // Browser handles reconnection automatically
  console.log('SSE reconnecting...');
};

Scaling Beyond One Server: The Pub/Sub Backbone

A single server can handle ~50,000-100,000 WebSocket connections. For more, you need multiple servers. The problem: Client A is connected to Server 1, but the event is produced on Server 3. How does the message reach Client A?

The answer: a pub/sub backbone that all servers subscribe to.

graph LR
    subgraph "Server 1"
        A[Client A]
        B[Client B]
    end
    subgraph "Server 2"
        C[Client C]
        D[Client D]
    end
    subgraph "Pub/Sub"
        R[(Redis)]
    end

    S[Event Source] -->|publish| R
    R -->|subscribe| A
    R -->|subscribe| C

Redis Pub/Sub Implementation

import Redis from 'ioredis';

class PubSubBridge {
  private pub: Redis;
  private sub: Redis;
  private localConnections: Map<string, Set<WebSocket>>;

  constructor(redisUrl: string, localConnections: Map<string, Set<WebSocket>>) {
    this.pub = new Redis(redisUrl);
    this.sub = new Redis(redisUrl);
    this.localConnections = localConnections;
  }

  async start(): Promise<void> {
    // Subscribe to user-specific and broadcast channels
    await this.sub.psubscribe('user:*', 'channel:*', 'broadcast');

    this.sub.on('pmessage', (_pattern, channel, message) => {
      const event = JSON.parse(message);

      if (channel === 'broadcast') {
        // Push to ALL local connections
        this.pushToAll(event);
      } else if (channel.startsWith('user:')) {
        // Push to specific user if connected to THIS server
        const userId = channel.replace('user:', '');
        this.pushToLocal(userId, event);
      } else if (channel.startsWith('channel:')) {
        // Push to all local members of a channel
        const channelId = channel.replace('channel:', '');
        this.pushToChannelMembers(channelId, event);
      }
    });
  }

  // Publish event — reaches ALL servers via Redis
  async publish(channel: string, event: any): Promise<void> {
    await this.pub.publish(channel, JSON.stringify(event));
  }

  // Deliver to local connections only
  private pushToLocal(userId: string, event: any): void {
    const sockets = this.localConnections.get(userId);
    if (!sockets) return;

    const payload = JSON.stringify(event);
    for (const ws of sockets) {
      if (ws.readyState === WebSocket.OPEN) {
        ws.send(payload);
      }
    }
  }

  private pushToAll(event: any): void {
    const payload = JSON.stringify(event);
    for (const [, sockets] of this.localConnections) {
      for (const ws of sockets) {
        if (ws.readyState === WebSocket.OPEN) {
          ws.send(payload);
        }
      }
    }
  }
}

// Usage
const bridge = new PubSubBridge('redis://redis:6379', connections);
await bridge.start();

// When a chat message is sent
await bridge.publish('channel:general', {
  type: 'message',
  payload: { sender: 'alice', text: 'Hello!', ts: Date.now() },
});

// When a notification targets a specific user
await bridge.publish('user:bob', {
  type: 'notification',
  payload: { title: 'New follower', body: 'Alice followed you' },
});

Redis Pub/Sub Limitations

Redis pub/sub is fire-and-forget — no message persistence. If a server is briefly disconnected from Redis, it misses messages. For production systems with millions of users, consider:

Backbone Throughput Persistence Best For
Redis Pub/Sub ~500K msg/s None Simple real-time, < 100K connections
NATS ~10M msg/s Optional (JetStream) High-throughput microservices
Kafka ~1M msg/s Full Ordered, durable event streams
Redis Streams ~500K msg/s Yes (with consumer groups) Best of both worlds

For most applications, Redis Streams is the sweet spot — pub/sub semantics with message persistence and consumer groups for reliability.


Fan-Out: The Hard Part

Fan-out is where real-time systems break. A user posts in a group with 50,000 members. That’s 50,000 push operations triggered by a single write.

Fan-Out on Write vs Fan-Out on Read

graph TD
    subgraph "Fan-Out on Write"
        A1[Message arrives] --> B1[Resolve 50K members]
        B1 --> C1[Push to each member's channel]
        C1 --> D1["Latency: high write, instant read"]
    end

    subgraph "Fan-Out on Read"
        A2[Message arrives] --> B2[Store in channel]
        B2 --> C2[Client polls/subscribes to channel]
        C2 --> D2["Latency: instant write, read fetches"]
    end

Fan-out on write (push model): when a message arrives, immediately resolve all recipients and push to each one. This is what Twitter originally did — and it broke at Bieber-scale.

Fan-out on read (pull model): store the message once, and each client fetches from channels they’re subscribed to. Less write amplification, but higher read latency.

The hybrid approach (what actually works): fan-out on write for small groups (< 1,000 members), fan-out on read for large channels.

const FANOUT_THRESHOLD = 1000;

async function deliverMessage(channelId: string, message: any): Promise<void> {
  const memberCount = await redis.scard(`channel:${channelId}:members`);

  if (memberCount <= FANOUT_THRESHOLD) {
    // Small channel: push to each member immediately
    await fanOutOnWrite(channelId, message);
  } else {
    // Large channel: store and let clients pull
    await fanOutOnRead(channelId, message);
  }
}

async function fanOutOnWrite(channelId: string, message: any): Promise<void> {
  // Store message
  await redis.xadd(
    `messages:${channelId}`,
    '*',
    'data', JSON.stringify(message)
  );

  // Get all member IDs
  const members = await redis.smembers(`channel:${channelId}:members`);

  // Push to each member's personal channel
  // Batch into chunks to avoid overwhelming Redis
  const BATCH = 500;
  for (let i = 0; i < members.length; i += BATCH) {
    const batch = members.slice(i, i + BATCH);
    const pipeline = redis.pipeline();

    for (const userId of batch) {
      pipeline.publish(`user:${userId}`, JSON.stringify({
        type: 'message',
        channel: channelId,
        payload: message,
      }));
    }

    await pipeline.exec();
  }
}

async function fanOutOnRead(channelId: string, message: any): Promise<void> {
  // Just store the message — clients subscribe to the channel directly
  await redis.xadd(
    `messages:${channelId}`,
    '*',
    'data', JSON.stringify(message)
  );

  // Publish a lightweight notification to the channel
  await pubsub.publish(`channel:${channelId}`, JSON.stringify({
    type: 'new_message',
    channel: channelId,
    messageId: message.id,
    // Don't include full payload — client fetches it
  }));
}

Presence: Who’s Online?

Presence tracking answers “who’s online right now?” — essential for chat, collaboration, and gaming.

Redis Sorted Set for Presence

class PresenceService {
  private redis: Redis;
  private readonly TTL = 60; // seconds

  constructor(redis: Redis) {
    this.redis = redis;
  }

  // Called on connect and every heartbeat interval
  async setOnline(userId: string): Promise<void> {
    const now = Date.now();
    await this.redis.zadd('presence:online', now, userId);
  }

  // Called on disconnect
  async setOffline(userId: string): Promise<void> {
    await this.redis.zrem('presence:online', userId);

    // Notify subscribers
    await this.redis.publish('presence', JSON.stringify({
      userId,
      status: 'offline',
      timestamp: Date.now(),
    }));
  }

  // Get all online users (with stale cleanup)
  async getOnlineUsers(): Promise<string[]> {
    const cutoff = Date.now() - this.TTL * 1000;

    // Remove stale entries (no heartbeat for TTL seconds)
    await this.redis.zremrangebyscore('presence:online', 0, cutoff);

    // Return remaining
    return this.redis.zrange('presence:online', 0, -1);
  }

  // Check specific users
  async areOnline(userIds: string[]): Promise<Map<string, boolean>> {
    const cutoff = Date.now() - this.TTL * 1000;
    const result = new Map<string, boolean>();

    const pipeline = this.redis.pipeline();
    for (const id of userIds) {
      pipeline.zscore('presence:online', id);
    }
    const scores = await pipeline.exec();

    userIds.forEach((id, i) => {
      const score = scores![i][1] as number | null;
      result.set(id, score !== null && score > cutoff);
    });

    return result;
  }

  // Get online count for a channel
  async channelOnlineCount(channelId: string): Promise<number> {
    const members = await this.redis.smembers(
      `channel:${channelId}:members`
    );
    const statuses = await this.areOnline(members);
    return [...statuses.values()].filter(Boolean).length;
  }
}

// Heartbeat on the WebSocket server
setInterval(async () => {
  for (const [userId] of connections) {
    await presence.setOnline(userId);
  }
}, 30_000);

Backpressure: When Clients Can’t Keep Up

A slow client on a 3G connection can’t consume messages as fast as they arrive. Without backpressure, the server buffers messages in memory until it OOMs.

Per-Connection Buffer with Overflow

class BackpressureManager {
  private buffers = new Map<WebSocket, any[]>();
  private readonly MAX_BUFFER = 100;
  private readonly MAX_BEHIND_MS = 30_000;

  send(ws: WebSocket, event: any): void {
    // Check if socket buffer is draining
    if (ws.bufferedAmount > 65536) {
      // Socket is backed up — buffer in application
      this.bufferMessage(ws, event);
      return;
    }

    // Flush any buffered messages first
    this.flushBuffer(ws);

    // Send current message
    ws.send(JSON.stringify(event));
  }

  private bufferMessage(ws: WebSocket, event: any): void {
    if (!this.buffers.has(ws)) {
      this.buffers.set(ws, []);
    }

    const buffer = this.buffers.get(ws)!;

    if (buffer.length >= this.MAX_BUFFER) {
      // Drop oldest messages — client is too far behind
      buffer.shift();
      // Mark that messages were dropped
      event._dropped = true;
    }

    buffer.push(event);
  }

  private flushBuffer(ws: WebSocket): void {
    const buffer = this.buffers.get(ws);
    if (!buffer?.length) return;

    while (buffer.length > 0 && ws.bufferedAmount < 65536) {
      const msg = buffer.shift()!;
      ws.send(JSON.stringify(msg));
    }

    if (buffer.length === 0) {
      this.buffers.delete(ws);
    }
  }
}

Client-Side: Detecting Gaps and Catching Up

When the server drops messages due to backpressure, the client needs to know and recover.

// Client-side catch-up on reconnect or gap detection
class RealtimeClientWithCatchup {
  private lastEventId: string | null = null;

  onMessage(event: any): void {
    if (event._dropped) {
      // Server dropped messages — fetch the gap
      this.catchUp();
      return;
    }

    this.lastEventId = event.id;
    this.render(event);
  }

  async catchUp(): Promise<void> {
    // Fetch missed messages from REST API
    const response = await fetch(
      `/api/messages?after=${this.lastEventId}&limit=100`
    );
    const missed = await response.json();

    for (const msg of missed) {
      this.render(msg);
      this.lastEventId = msg.id;
    }
  }
}

Message Ordering and Delivery Guarantees

Real-time systems can deliver messages out of order across multiple servers. A chat where replies arrive before the original message is unusable.

Logical Clocks for Ordering

// Server-side: assign monotonic IDs per channel
class MessageOrderer {
  private redis: Redis;

  async assignId(channelId: string): Promise<string> {
    // Atomic increment — guaranteed unique and ordered per channel
    const seq = await this.redis.incr(`seq:${channelId}`);
    // Combine with timestamp for global rough ordering
    const ts = Date.now();
    return `${ts}-${seq}`;
  }
}

// Client-side: reorder buffer
class OrderedMessageBuffer {
  private buffer: Map<string, any> = new Map();
  private lastSeq = 0;

  receive(message: any): any[] {
    const seq = parseInt(message.id.split('-')[1]);
    this.buffer.set(message.id, message);

    // Flush consecutive messages
    const flushed: any[] = [];
    while (this.buffer.has(this.nextExpectedId())) {
      const id = this.nextExpectedId();
      flushed.push(this.buffer.get(id));
      this.buffer.delete(id);
      this.lastSeq++;
    }

    return flushed;
  }

  private nextExpectedId(): string {
    // Simplified — in practice, match the full ID format
    return `*-${this.lastSeq + 1}`;
  }
}

Scaling Milestones: What Breaks When

Here’s what I’ve seen break at each order of magnitude:

10K connections
├── Single server, no pub/sub needed
├── In-memory connection map
└── Problem: nothing yet

100K connections
├── Multiple server nodes required
├── Need pub/sub backbone (Redis)
├── Need sticky sessions or connection registry
└── Problem: Redis pub/sub memory under fan-out spikes

500K connections
├── Redis Streams > Redis Pub/Sub
├── Fan-out batching critical
├── Need presence service (separate from main DB)
└── Problem: Thundering herd on reconnect after deploy

1M+ connections
├── Multiple pub/sub shards
├── Hybrid fan-out (write for small, read for large)
├── Connection-level backpressure mandatory
├── Need message catch-up service
└── Problem: Operating system file descriptor limits,
    TCP port exhaustion, GC pauses

OS-Level Tuning for High Connection Counts

# /etc/sysctl.conf — for servers handling 100K+ connections

# Increase max file descriptors
fs.file-max = 2097152

# Increase socket buffer sizes
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535

# Increase ephemeral port range
net.ipv4.ip_local_port_range = 1024 65535

# Reuse TIME_WAIT sockets
net.ipv4.tcp_tw_reuse = 1

# Faster TCP keepalive detection
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 6
# Per-process limits — /etc/security/limits.conf
*    soft    nofile    1048576
*    hard    nofile    1048576

Putting It All Together: The Decision Framework

Question Answer Pattern
Client sends data too? Yes WebSocket
Server push only? Yes SSE
< 50K connections? Single server In-memory connection map
50K-500K connections? Multi-server Redis Pub/Sub backbone
> 500K connections? Large scale Redis Streams + sharded pub/sub
Small groups (< 1K)? Fan-out on write Push to each member
Large channels (> 1K)? Fan-out on read Store + notify, client fetches
Need “who’s online”? Presence Redis Sorted Set + heartbeat
Slow clients? Backpressure Per-connection buffer + drop policy
Messages must be ordered? Ordering Channel-level sequence numbers
Clients reconnect often? Catch-up Message store + Last-Event-ID

The Golden Rule

Real-time at scale is not about picking the fastest transport — it’s about handling the fan-out multiplication gracefully. One event becoming 100,000 pushes is where systems buckle. Batch your fan-out, apply backpressure at every layer, and always have a catch-up path for clients that miss messages. The WebSocket connection is the easy part. The pub/sub fan-out is where you’ll spend 80% of your engineering time.


Further Reading

Related Posts

System Design Patterns for Scaling Reads

System Design Patterns for Scaling Reads

Most production systems are read-heavy. A typical web application sees 90-9…

Principles of Software Designing

Principles of Software Designing

It is very easy to build a software or app. But, it is trickier to have a good…

System Design Patterns for Managing Long-Running Tasks

System Design Patterns for Managing Long-Running Tasks

Introduction Some operations simply can’t finish in the time a user is willing…

System Design Patterns for Handling Large Blobs

System Design Patterns for Handling Large Blobs

Introduction Every non-trivial application eventually needs to handle large…

Serverless vs Containers — The Decision I Keep Revisiting

Serverless vs Containers — The Decision I Keep Revisiting

Every time I start a new service, I have the same argument with myself. Lambda…

System Design Patterns for Scaling Writes

System Design Patterns for Scaling Writes

In the companion article on scaling reads, we covered caching, replicas, and…

Latest Posts

System Design Patterns for Managing Long-Running Tasks

System Design Patterns for Managing Long-Running Tasks

Introduction Some operations simply can’t finish in the time a user is willing…

System Design Patterns for Handling Large Blobs

System Design Patterns for Handling Large Blobs

Introduction Every non-trivial application eventually needs to handle large…

Explaining SAGA Patterns with Examples

Explaining SAGA Patterns with Examples

In a monolith, placing an order is a single database transaction — deduct…

System Design Patterns for Scaling Writes

System Design Patterns for Scaling Writes

In the companion article on scaling reads, we covered caching, replicas, and…

Serverless vs Containers — The Decision I Keep Revisiting

Serverless vs Containers — The Decision I Keep Revisiting

Every time I start a new service, I have the same argument with myself. Lambda…

System Design Patterns for Scaling Reads

System Design Patterns for Scaling Reads

Most production systems are read-heavy. A typical web application sees 90-9…