# Pub/Sub: Broadcasting Events to Multiple Consumers

> **Series:** System Design · Architecture Patterns — Pillar 7 of 8

## Systems Design

| # | Post | What it covers |
|---|------|----------------|
| 00 | [Architecture Patterns: How Systems Are Structured](/architecture-patterns-how-systems-are-structured) | Twenty patterns covering monoliths, microservices, events, resilience, deployment, and data processing. How to structure systems that survive growth. |
| 01 | [Monolithic Architecture: The Default That Gets Abandoned Too Early](/monolithic-architecture-the-default-that-gets-abandoned-too-early) | Monoliths are fast to build and easy to operate. Learn when they're the right choice, when they break down, and how to know the difference. |
| 02 | [Microservices: The Architecture You Earn, Not Choose](/microservices-the-architecture-you-earn-not-choose) | Microservices enable independent scaling and team autonomy — but at significant cost. Learn what you actually get, what you pay, and when it's worth it. |
| 03 | [Serverless: Pay for What You Use, Not What You Provision](/serverless-pay-for-what-you-use-not-what-you-provision) | Serverless scales to zero and charges per invocation. Learn where it shines, where it fails, and how to design around cold starts and vendor lock-in. |
| 04 | [Event-Driven Architecture: Decoupling Through Events](/event-driven-architecture-decoupling-through-events) | Event-driven systems communicate via events rather than direct calls. Learn how producers, consumers, and event brokers work — and the consistency tradeoffs involved. |
| 05 | [Message Queues: Decoupling Produce from Consume](/message-queues-decoupling-produce-from-consume) | Message queues decouple producers and consumers, enable load levelling, and provide durability. Learn how they work and when to use Kafka vs SQS vs RabbitMQ. |
| 06 | **Pub/Sub: Broadcasting Events to Multiple Consumers** ← you are here | Pub/sub decouples publishers from subscribers through topics. Learn how it differs from message queues and when to use Kafka, SNS, or Google Pub/Sub. |
| 07 | [CQRS: When Reads and Writes Need Different Models](/cqrs-when-reads-and-writes-need-different-models) | CQRS separates writes from reads so each can be optimised independently. Learn how it works, when it's worth the complexity, and when it isn't. |
| 08 | [Event Sourcing: The Ledger, Not the Balance](/event-sourcing-the-ledger-not-the-balance) | Event sourcing stores state as a sequence of events. Learn how it works, what you get (audit log, time travel), and what it costs (complexity, schema evolution). |
| 09 | [The Saga Pattern: Distributed Transactions Without Locks](/the-saga-pattern-distributed-transactions-without-locks) | The Saga pattern manages distributed transactions across services using compensating transactions. Learn choreography vs orchestration and when to use each. |
| 10 | [The Outbox Pattern: Atomic Writes and Event Publishing](/the-outbox-pattern-atomic-writes-and-event-publishing) | The Outbox pattern solves the dual-write problem — publishing an event and writing to a database atomically. Learn how it works using CDC or polling. |
| 11 | [The Circuit Breaker: Stopping Cascading Failures](/the-circuit-breaker-stopping-cascading-failures) | Circuit breakers prevent cascading failures by fast-failing calls to unhealthy dependencies. Learn the three states, how to configure them, and where to apply them. |
| 12 | [The Bulkhead Pattern: Containing Failures Through Resource Isolation](/the-bulkhead-pattern-containing-failures-through-resource-isolation) | Bulkheads isolate thread pools and connections per dependency so one failure can't exhaust resources needed by others. Learn how to apply them in practice. |
| 13 | [The Sidecar Pattern: Cross-Cutting Concerns Without Code Changes](/the-sidecar-pattern-cross-cutting-concerns-without-code-changes) | The sidecar pattern deploys a helper process alongside each service for logging, metrics, TLS, and service discovery — without modifying the service itself. |
| 14 | [Service Mesh: A Programmable Network for Microservices](/service-mesh-a-programmable-network-for-microservices) | A service mesh handles service-to-service traffic, mTLS, circuit breaking, and observability via a fleet of sidecar proxies. Learn how it works and when to use it. |
| 15 | [Service Discovery: Finding Services in a Dynamic Environment](/service-discovery-finding-services-in-a-dynamic-environment) | Service discovery lets services find each other in dynamic environments. Learn client-side vs server-side discovery, health checks, and DNS vs registry approaches. |
| 16 | [The Strangler Fig: Replacing a Legacy System Without Burning It Down](/the-strangler-fig-replacing-a-legacy-system-without-burning-it-down) | The Strangler Fig replaces a legacy system incrementally by routing specific functionality to new implementations while the old system keeps running. |
| 17 | [Backend for Frontend: One API Per Client Type](/backend-for-frontend-one-api-per-client-type) | BFF creates dedicated API backends per client type. Learn why one general API struggles to serve mobile and web well, and how BFF solves it. |
| 18 | [ETL Pipelines: Moving Data from Operations to Analytics](/etl-pipelines-moving-data-from-operations-to-analytics) | ETL moves data from operational systems into analytical stores. Learn how pipelines work, what ELT is, and how to design reliable data movement at scale. |
| 19 | [Batch vs Stream Processing: How Fresh Do Your Answers Need to Be?](/batch-vs-stream-processing-how-fresh-do-your-answers-need-to-be) | Batch processing accumulates data and then processes it in bulk; streaming processes each event as it arrives. Learn the tradeoffs and when each is right. |
| 20 | [MapReduce: Processing Petabytes in Parallel](/mapreduce-processing-petabytes-in-parallel) | MapReduce processes massive datasets in parallel by splitting work into map and reduce phases. Learn how it works and why Spark has largely replaced it. |
| 21 | [Architecture Patterns: Wrap-Up](/architecture-patterns-wrap-up) | A recap of all 20 architecture patterns across decomposition, async communication, data patterns, resilience, and data processing. How they connect. |

---

## The problem

Your URL shortener's Link Service creates a new link. Multiple services need to react: Analytics must initialise a stats record, QR must generate a code, Billing must increment the usage counter, Webhooks must fire registered endpoints.

A message queue sends each message to one consumer. You'd need four separate queues — one per consumer service — and the Link Service would need to know about and publish to all four. Every time you add a new consumer, you modify the Link Service.

Pub/sub inverts this. The Link Service publishes to one topic. Every service that cares subscribes to that topic. Adding a fifth consumer — say, a search indexer — means creating a new subscription on the existing topic. The Link Service doesn't change.

---

## The core idea

In the publish-subscribe pattern, publishers emit messages to a named topic rather than to a specific recipient. Each topic can have multiple subscribers; every subscriber receives a copy of every message. Subscribers register their interest independently of publishers — publishers don't know who's listening.
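
To make the idea concrete, here is a minimal in-memory sketch. The `Broker` class and its `subscribe` and `publish` methods are hypothetical names invented for illustration, not the API of any real broker, and delivery here is a plain synchronous function call.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Message = Dict[str, str]
Handler = Callable[[Message], None]


class Broker:
    """Toy in-memory broker: topic name -> list of subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Subscribers register themselves; the publisher is not involved.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Message) -> int:
        # Every current subscriber gets its own copy of the message.
        handlers = self._subscribers[topic]
        for handler in handlers:
            handler(dict(message))
        return len(handlers)


broker = Broker()
received: Dict[str, List[Message]] = {"analytics": [], "qr": [], "search": []}

broker.subscribe("link.created", received["analytics"].append)
broker.subscribe("link.created", received["qr"].append)

# The publisher only names the topic, never a recipient.
broker.publish("link.created", {"link_id": "abc123"})

# A new consumer subscribes later; the publishing code is unchanged.
broker.subscribe("link.created", received["search"].append)
broker.publish("link.created", {"link_id": "def456"})
```

In this toy version a late subscriber only sees messages published after it subscribed. Whether older messages can be read back depends on the broker.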

---

## The analogy: a radio broadcast

A radio station (publisher) broadcasts on a frequency (topic). Anyone with a receiver (subscriber) tuned to that frequency receives the signal. The station doesn't maintain a list of listeners and doesn't know how many people are tuned in. Adding a new listener doesn't require the station to do anything. Each listener processes the signal independently.

Message queues are like point-to-point phone calls: one caller, one recipient, and once that recipient takes the call nobody else hears it. Pub/sub is the broadcast: one transmission, and every receiver gets it.

---

## How pub/sub works

### Topics and subscriptions

```
Publisher: Link Service
  publishes LinkCreated to topic: link.created

Subscribers (each has its own subscription on link.created):
  analytics-subscription   → Analytics Service
  qr-subscription          → QR Service
  billing-subscription     → Billing Service
  webhook-subscription     → Webhook Service
```

Each subscriber maintains its own message cursor or acknowledgement state. If the Analytics Service processes events slowly, that lag doesn't affect QR Service or Billing Service — each subscription is independent.

### Push vs pull delivery

**Push (the broker delivers to the subscriber):** the broker sends messages to a subscriber's endpoint (HTTP webhook, Lambda, queue). The subscriber doesn't need to poll. Simple for low-latency fan-out. The subscriber must be available to receive.

**Pull (the subscriber fetches from the broker):** the subscriber polls the broker for messages. This gives more control over the consumption rate, because the subscriber processes at its own pace. Google Pub/Sub supports both push and pull subscriptions; Kafka is pull-only, with the consumers in each consumer group fetching from the brokers.
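
The difference between the two modes can be sketched with two toy subscription types. `PushSubscription`, `PullSubscription`, and `max_messages` are hypothetical names chosen for this example; real brokers add retries, acknowledgements, and network transport on top.

```python
from collections import deque
from typing import Callable, Deque, List


class PushSubscription:
    """Broker-driven: the broker invokes the subscriber's endpoint per message."""

    def __init__(self, endpoint: Callable[[str], None]) -> None:
        self._endpoint = endpoint

    def deliver(self, message: str) -> None:
        # The subscriber must be reachable now, or the broker has to retry.
        self._endpoint(message)


class PullSubscription:
    """Subscriber-driven: messages wait in a buffer until fetched."""

    def __init__(self) -> None:
        self._buffer: Deque[str] = deque()

    def deliver(self, message: str) -> None:
        self._buffer.append(message)

    def pull(self, max_messages: int) -> List[str]:
        # The subscriber chooses when to fetch and how many to take.
        batch: List[str] = []
        while self._buffer and len(batch) < max_messages:
            batch.append(self._buffer.popleft())
        return batch


pushed: List[str] = []
push_sub = PushSubscription(pushed.append)
pull_sub = PullSubscription()

for event in ["link-1", "link-2", "link-3"]:
    push_sub.deliver(event)  # handled immediately by the endpoint
    pull_sub.deliver(event)  # buffered until the subscriber asks

first_batch = pull_sub.pull(max_messages=2)
second_batch = pull_sub.pull(max_messages=2)
```

The pull side drains the same three events in two batches of its own choosing, which is the rate control the push side gives up.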

### Fan-out architectures in practice

**SNS + SQS (AWS):** SNS is a push-based pub/sub broker. Each subscriber is typically an SQS queue — the fan-out writes to multiple queues, each consumed by a different service. The SQS queue provides buffering so the subscriber can process at its own rate.

```
Link Service → SNS topic: link.created
  SNS → SQS queue: analytics-queue → Analytics Service
  SNS → SQS queue: qr-queue → QR Service
  SNS → SQS queue: billing-queue → Billing Service
```

**Kafka consumer groups:** multiple consumer groups each read the same Kafka topic. Each consumer group tracks its own committed offsets (one per partition), so the groups are completely independent of one another.

```
Kafka topic: link.created
  Consumer group: analytics → reads at offset 5,230,000
  Consumer group: qr-service → reads at offset 5,230,001
  Consumer group: billing → reads at offset 5,229,800 (slightly behind)
```

Kafka's model is ideal for high-throughput fan-out where consumers have different processing speeds and you need event replay capability.
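
The shared-log idea can be sketched without a Kafka client. The `TopicLog` class below is a hypothetical single-partition stand-in, not Kafka's API: real Kafka tracks offsets per partition and commits them separately from reads.

```python
from typing import Dict, List


class TopicLog:
    """Toy append-only log; each consumer group tracks its own offset."""

    def __init__(self) -> None:
        self._records: List[str] = []
        self._offsets: Dict[str, int] = {}

    def append(self, record: str) -> None:
        self._records.append(record)

    def poll(self, group: str, max_records: int = 10) -> List[str]:
        # Read from the group's current offset, then advance it.
        start = self._offsets.get(group, 0)
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch

    def seek(self, group: str, offset: int) -> None:
        # Replay: move one group's offset without touching the others.
        self._offsets[group] = offset

    def lag(self, group: str) -> int:
        # Number of records the group has not read yet.
        return len(self._records) - self._offsets.get(group, 0)


log = TopicLog()
for i in range(5):
    log.append(f"LinkCreated-{i}")

log.poll("analytics")               # analytics reads all five records
log.poll("billing", max_records=2)  # billing is slower and reads two

analytics_lag = log.lag("analytics")
billing_lag = log.lag("billing")

log.seek("analytics", 0)            # analytics replays from the start
replayed = log.poll("analytics")
```

Records are stored once, and reading does not remove them. That is why one group can fall behind or replay without affecting any other group.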

---

## Pub/Sub vs Message Queues

| | Message Queue | Pub/Sub |
|---|---|---|
| **Delivery** | One consumer per message | Every subscriber gets every message |
| **Consumers** | Competing (load-balanced) | Independent (each gets all messages) |
| **Producer knowledge** | Producer targets the queue | Producer targets a topic only |
| **Adding consumers** | Producer sends to new queue | New subscriber on existing topic |
| **Ordering** | Within a queue (if the broker offers FIFO) | Within a partition or ordered subscription (broker-dependent) |
| **Best for** | Load-balanced task distribution | Fan-out, event broadcast |

A common pattern combines both: pub/sub for fan-out from producer to subscriber queues; message queues for load-levelled consumption within each subscriber service.
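
A rough sketch of that combination, using in-memory deques in place of real queues. `FanOutTopic` and `drain` are hypothetical helpers, and the round-robin loop is only a deterministic stand-in for workers competing over a shared queue.

```python
from collections import deque
from typing import Deque, Dict, List


class FanOutTopic:
    """Toy topic that copies each message into every subscribed queue."""

    def __init__(self) -> None:
        self._queues: List[Deque[str]] = []

    def subscribe(self, queue: Deque[str]) -> None:
        self._queues.append(queue)

    def publish(self, message: str) -> None:
        for queue in self._queues:
            queue.append(message)


def drain(queue: Deque[str], workers: List[str]) -> Dict[str, List[str]]:
    """Competing consumers: each queued message goes to exactly one worker."""
    handled: Dict[str, List[str]] = {worker: [] for worker in workers}
    turn = 0
    while queue:
        worker = workers[turn % len(workers)]
        handled[worker].append(queue.popleft())
        turn += 1
    return handled


analytics_queue: Deque[str] = deque()
qr_queue: Deque[str] = deque()

topic = FanOutTopic()
topic.subscribe(analytics_queue)
topic.subscribe(qr_queue)

for i in range(4):
    topic.publish(f"LinkCreated-{i}")

# Analytics runs two workers that share its queue; QR runs one.
analytics_work = drain(analytics_queue, ["analytics-1", "analytics-2"])
qr_work = drain(qr_queue, ["qr-1"])
```

Both services see all four events, but inside Analytics each event is handled by only one of the two workers.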

---

## Tradeoffs

**Subscriber independence vs backpressure.** Because each subscriber is independent, a slow subscriber doesn't slow down other subscribers — but it also can't apply backpressure to the publisher. If a subscriber falls significantly behind, messages accumulate in its subscription. Monitor subscriber lag and alert on it.
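
A lag check might look like the sketch below. It reuses the offsets from the consumer-group example earlier. The head-of-log offset, the `max_lag` threshold, and the `lagging_subscriptions` helper are assumptions made for illustration. In practice these numbers would come from the broker's metrics.

```python
from typing import Dict, List


def lagging_subscriptions(
    latest_offset: int,
    committed: Dict[str, int],
    max_lag: int,
) -> List[str]:
    """Return subscriptions whose backlog exceeds max_lag, worst first."""
    lag = {name: latest_offset - offset for name, offset in committed.items()}
    over = [name for name, value in lag.items() if value > max_lag]
    return sorted(over, key=lambda name: lag[name], reverse=True)


committed_offsets = {
    "analytics": 5_230_000,
    "qr-service": 5_230_001,
    "billing": 5_229_800,
}

alerts = lagging_subscriptions(
    latest_offset=5_230_001,  # assumed head of the log
    committed=committed_offsets,
    max_lag=100,              # hypothetical alert threshold
)
```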

**Schema coupling.** Subscribers depend on the message schema that the publisher emits. Additive changes, such as a new optional field, are usually safe. Breaking changes, such as renaming or removing a field, need a versioning strategy: every subscriber must be able to read the new format before the publisher starts emitting it.
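
One possible versioning strategy is a version field plus a tolerant reader on the subscriber side. In this sketch the `schema_version`, `url`, `target_url`, and `campaign` fields are all hypothetical, as is the rename between v1 and v2.

```python
import json
from typing import Any, Dict


def parse_link_created(raw: str) -> Dict[str, Any]:
    """Normalise v1 and v2 payloads into one internal shape."""
    event = json.loads(raw)
    # Payloads without a version field are treated as v1.
    version = event.get("schema_version", 1)

    if version == 1:
        # v1 carried the destination in a field named "url".
        target = event["url"]
    elif version == 2:
        # v2 renamed it to "target_url" (hypothetical breaking change).
        target = event["target_url"]
    else:
        raise ValueError(f"unsupported schema_version: {version}")

    # Unknown extra fields are ignored, so additive changes stay compatible.
    return {"link_id": event["link_id"], "target_url": target}


old = parse_link_created('{"link_id": "abc123", "url": "https://example.com"}')
new = parse_link_created(
    '{"schema_version": 2, "link_id": "abc123", '
    '"target_url": "https://example.com", "campaign": "spring"}'
)
```

Once every subscriber ships a reader like this, the publisher can switch to v2 without a coordinated cutover.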

**At-least-once and idempotency.** Like message queues, pub/sub systems typically provide at-least-once delivery. Subscribers must handle duplicate messages. Design subscriber logic to be idempotent.
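
A common way to get idempotency is to remember which message IDs have already been processed. The sketch below assumes each message carries a unique `message_id`; that field, `account_id`, and the `BillingSubscriber` class are hypothetical. A real service would keep the seen-ID set in durable storage, ideally updated in the same transaction as the counter.

```python
from typing import Dict, Set


class BillingSubscriber:
    """Counts link creations per account, ignoring redelivered messages."""

    def __init__(self) -> None:
        self.usage: Dict[str, int] = {}
        self._seen: Set[str] = set()  # in-memory only; persist this in practice

    def handle(self, message: Dict[str, str]) -> bool:
        message_id = message["message_id"]
        if message_id in self._seen:
            return False  # duplicate delivery: change nothing
        account = message["account_id"]
        self.usage[account] = self.usage.get(account, 0) + 1
        self._seen.add(message_id)
        return True


billing = BillingSubscriber()
event = {"message_id": "m-1", "account_id": "acct-42"}

first = billing.handle(event)
second = billing.handle(event)  # the broker redelivers the same message
```

The second delivery is acknowledged but leaves the usage counter untouched, so a redelivery never double-bills the account.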

**Fan-out cost.** In queue-per-subscriber designs such as SNS + SQS, every message is stored once per subscriber: a topic with 10 subscribers and a million messages holds 10 million message copies. Kafka avoids this because all consumer groups read from the same log, so adding a consumer group adds offsets to track, not another copy of the data.

---

## The one thing to remember

> **Pub/sub decouples publishers from subscribers through a topic: the publisher doesn't know who's listening, and subscribers register independently.** Every subscriber gets every message — making it the right model for fan-out (one event, many reactions). It's not a replacement for message queues — it's the broadcast layer. Combine it with queues (SNS → SQS) or use Kafka consumer groups for durable, high-throughput, multi-consumer fan-out.

---

*← Previous: **[Message Queues](/message-queues-decoupling-produce-from-consume)** — the concrete infrastructure behind event-driven systems: how queues work, what durability guarantees they provide, and how producers and consumers decouple their processing rates.*

*→ Next: **[CQRS](/cqrs-when-reads-and-writes-need-different-models)** — separating the write model from the read model so each can be optimised independently.*

