Series: System Design · Architecture Patterns — Pillar 7 of 8

Systems Design

#	Post	What it covers
00	Architecture Patterns: How Systems Are Structured	Twenty patterns covering monoliths, microservices, events, resilience, deployment, and data processing. How to structure systems that survive growth.
01	Monolithic Architecture: The Default That Gets Abandoned Too Early	Monoliths are fast to build and easy to operate. Learn when they're the right choice, when they break down, and how to know the difference.
02	Microservices: The Architecture You Earn, Not Choose ← you are here	Microservices enable independent scaling and team autonomy — but at significant cost. Learn what you actually get, what you pay, and when it's worth it.
03	Serverless: Pay for What You Use, Not What You Provision	Serverless scales to zero and charges per invocation. Learn where it shines, where it fails, and how to design around cold starts and vendor lock-in.
04	Event-Driven Architecture: Decoupling Through Events	Event-driven systems communicate via events rather than direct calls. Learn how producers, consumers, and event brokers work — and the consistency tradeoffs involved.
05	Message Queues: Decoupling Produce from Consume	Message queues decouple producers and consumers, enable load levelling, and provide durability. Learn how they work and when to use Kafka vs SQS vs RabbitMQ.
06	Pub/Sub: Broadcasting Events to Multiple Consumers	Pub/sub decouples publishers from subscribers through topics. Learn how it differs from message queues and when to use Kafka, SNS, or Google Pub/Sub.
07	CQRS: When Reads and Writes Need Different Models	CQRS separates writes from reads so each can be optimised independently. Learn how it works, when it's worth the complexity, and when it isn't.
08	Event Sourcing: The Ledger, Not the Balance	Event sourcing stores state as a sequence of events. Learn how it works, what you get (audit log, time travel), and what it costs (complexity, schema evolution).
09	The Saga Pattern: Distributed Transactions Without Locks	The Saga pattern manages distributed transactions across services using compensating transactions. Learn choreography vs orchestration and when to use each.
10	The Outbox Pattern: Atomic Writes and Event Publishing	The Outbox pattern solves the dual-write problem — publishing an event and writing to a database atomically. Learn how it works using CDC or polling.
11	The Circuit Breaker: Stopping Cascading Failures	Circuit breakers prevent cascading failures by fast-failing calls to unhealthy dependencies. Learn the three states, how to configure them, and where to apply them.
12	The Bulkhead Pattern: Containing Failures Through Resource Isolation	Bulkheads isolate thread pools and connections per dependency so one failure can't exhaust resources needed by others. Learn how to apply them in practice.
13	The Sidecar Pattern: Cross-Cutting Concerns Without Code Changes	The sidecar pattern deploys a helper process alongside each service for logging, metrics, TLS, and service discovery — without modifying the service itself.
14	Service Mesh: A Programmable Network for Microservices	A service mesh handles service-to-service traffic, mTLS, circuit breaking, and observability via a fleet of sidecar proxies. Learn how it works and when to use it.
15	Service Discovery: Finding Services in a Dynamic Environment	Service discovery lets services find each other in dynamic environments. Learn client-side vs server-side discovery, health checks, and DNS vs registry approaches.
16	The Strangler Fig: Replacing a Legacy System Without Burning It Down	The Strangler Fig replaces a legacy system incrementally by routing specific functionality to new implementations while the old system keeps running.
17	Backend for Frontend: One API Per Client Type	BFF creates dedicated API backends per client type. Learn why one general API struggles to serve mobile and web well, and how BFF solves it.
18	ETL Pipelines: Moving Data from Operations to Analytics	ETL moves data from operational systems into analytical stores. Learn how pipelines work, what ELT is, and how to design reliable data movement at scale.
19	Batch vs Stream Processing: How Fresh Do Your Answers Need to Be?	Batch processing accumulates data and then processes it in bulk; streaming processes each event as it arrives. Learn the tradeoffs and when each is right.
20	MapReduce: Processing Petabytes in Parallel	MapReduce processes massive datasets in parallel by splitting work into map and reduce phases. Learn how it works and why Spark has largely replaced it.
21	Architecture Patterns: Wrap-Up	A recap of all 20 architecture patterns across decomposition, async communication, data patterns, resilience, and data processing. How they connect.

Microservices: The Architecture You Earn, Not Choose

The problem

Three squads are trying to deploy on the same Tuesday afternoon. Squad A is shipping a new link analytics dashboard. Squad B is deploying a billing feature. Squad C is patching a redirect performance regression. All three are merging to the same repository, running the same test suite, and deploying the same binary.

Squad C's deployment is critical: it fixes a live performance issue. But it's blocked behind Squad B's feature, which has a failing integration test. Squads A and C are waiting. The incident drags on.

This is the coordination tax that microservices eliminate: each squad owns a service, deploys independently, and can't be blocked by another team's broken build. But that coordination saving doesn't come free.

The core idea

A microservices architecture decomposes a system into independently deployable services, each owned by a single team, each managing its own data, and each communicating with others over a network (HTTP, gRPC, message queues). Services are designed around business capabilities, not technical layers.

The analogy: a city of specialist workshops

A monolith is a department store — one building, one entrance, every product under one roof, one management team. Fast to find things, easy to coordinate promotions, one P&L.

Microservices are a high street of specialist workshops — a cobbler, a tailor, a bakery, a hardware store. Each is independently owned and operated. The cobbler doesn't need to know when the bakery changes its hours. The bakery can renovate without the tailor closing. Each shop can be busy or quiet on its own schedule.

Coordination comes only from outside (the shops share a street and the same customers); inside, each shop is autonomous. The cobbler and the tailor can't share a cash register or a storeroom: each has its own.

How microservices work

Service boundaries

The hardest problem in microservices isn't the technology — it's drawing the right boundaries. A service boundary is a contract: everything inside a service is the service's concern; everything outside it is reached via its public API.

Domain-Driven Design (DDD) provides the conceptual framework: identify "bounded contexts" — areas of the business with their own distinct model and language — and make each a service candidate.

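For the URL shortener, the service candidates used in this post are the Link Service, the Redirect Service, the Analytics Service, and the Billing Service.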

Each service owns its data. The Analytics Service has its own Cassandra cluster. The Link Service has its own PostgreSQL schema. They do not query each other's databases directly — they call each other's APIs.

Data ownership

The "database per service" pattern is non-negotiable. If two services share a database, they're coupled at the data layer — schema changes in one affect the other, deployments must be coordinated, and you've lost independence.
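
A minimal sketch of what data ownership looks like in code, under stated assumptions: the class names, method names, and in-memory dictionaries are hypothetical stand-ins (the dictionaries play the role of the PostgreSQL schema and Cassandra cluster mentioned above). The point it illustrates is that the Analytics Service holds a reference to the Link Service's public API only, never to its store.

```python
class LinkService:
    """Owns link data. The dict stands in for its private PostgreSQL schema."""

    def __init__(self):
        self._links = {}  # private store: no other service touches this

    def create_link(self, code, target_url):
        self._links[code] = {"code": code, "target_url": target_url}
        return dict(self._links[code])

    def get_link(self, code):
        """Public API: returns a copy so callers never hold internal state."""
        link = self._links.get(code)
        return dict(link) if link is not None else None


class AnalyticsService:
    """Owns click data. The dict stands in for its own Cassandra cluster."""

    def __init__(self, link_api):
        self._clicks = {}          # private store
        self._link_api = link_api  # only the public API, never the database

    def record_click(self, code):
        self._clicks[code] = self._clicks.get(code, 0) + 1

    def report(self, code):
        # An API call, not a cross-database query.
        link = self._link_api.get_link(code)
        if link is None:
            return None
        return {"target_url": link["target_url"],
                "clicks": self._clicks.get(code, 0)}


links = LinkService()
analytics = AnalyticsService(link_api=links)
links.create_link("abc123", "https://example.com/landing")
analytics.record_click("abc123")
analytics.record_click("abc123")
summary = analytics.report("abc123")
print(summary)
```

In a real deployment `link_api` would be an HTTP or gRPC client rather than an in-process object, but the dependency direction is the same: the Link Service can change its schema freely as long as `get_link` keeps its contract.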

Inter-service communication

Services don't share memory — they communicate over the network. Two patterns:

Synchronous (HTTP/gRPC): Service A calls Service B and waits for a response. Simple to reason about but creates temporal coupling — if B is down, A's request fails.

Asynchronous (message queues/events): Service A publishes an event to a queue; Service B processes it when ready. B's availability doesn't affect A's ability to publish. The cost is eventual consistency: A doesn't learn the result of B's processing immediately.

Most microservices systems use both: synchronous for reads that need immediate data, asynchronous for writes that can tolerate eventual consistency.
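
The difference between the two styles can be sketched in a few lines. This is an illustration under assumptions, not a production design: `queue.Queue` is an in-process stand-in for a real broker, and the function and class names are hypothetical.

```python
import queue


class ServiceUnavailable(Exception):
    """Raised when the callee cannot be reached."""


class AnalyticsService:
    def __init__(self):
        self.up = True
        self.events = []

    def record(self, event):
        if not self.up:
            raise ServiceUnavailable("analytics is down")
        self.events.append(event)


def redirect_sync(analytics, code):
    """Synchronous: the caller waits, so the callee's outage becomes its own."""
    analytics.record({"type": "click", "code": code})
    return "redirected"


def redirect_async(event_queue, code):
    """Asynchronous: publish and return; the consumer catches up later."""
    event_queue.put({"type": "click", "code": code})
    return "redirected"


def drain(event_queue, analytics):
    """Consumer side: process whatever has accumulated in the queue."""
    processed = 0
    while not event_queue.empty():
        analytics.record(event_queue.get())
        processed += 1
    return processed


analytics = AnalyticsService()
events = queue.Queue()  # in-process stand-in for a real broker

analytics.up = False
try:
    sync_result = redirect_sync(analytics, "abc123")
except ServiceUnavailable:
    sync_result = "failed"  # temporal coupling: B is down, so A's request fails

async_result = redirect_async(events, "abc123")  # succeeds while B is down
seen_before = len(analytics.events)  # nothing recorded yet: not consistent

analytics.up = True
processed = drain(events, analytics)
seen_after = len(analytics.events)   # the click arrives once B recovers
```

The gap between `seen_before` and `seen_after` is the eventual consistency window: the redirect succeeded, but the analytics view lagged until the consumer ran.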

What you actually get

Independent deployment. Each service deploys on its own schedule. Squad C can ship the redirect fix without waiting for Squad B's billing feature. Deployment frequency increases.

Independent scaling. Run 200 instances of the Redirect Service and 3 of the Billing Service. Each scales to its workload without paying for the other.

Technology diversity. The Redirect Service can be written in Rust for performance. The Analytics Service can use Python for its rich data science ecosystem. The Link Service can use the team's preferred language.

Fault isolation. A bug in the Billing Service that causes it to crash doesn't crash the Redirect Service. Blast radius is reduced to one service.

What you actually pay

Network failure. In a monolith, function calls don't fail with ConnectionRefused. In microservices, every service call can fail: the network drops, the target service is overloaded, a pod is restarting. Every caller must handle these failures — with retries, timeouts, circuit breakers (post 11). This is not optional. A microservices system without retry and circuit breaker logic is fragile by design.
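
As a rough sketch of the minimum a caller needs, here is a retry wrapper with exponential backoff. Everything named here (`call_with_retry`, `TransientError`, `FlakyLinkService`, the delay values) is hypothetical and chosen for illustration; a real client would also set a per-request timeout and, as post 11 describes, sit behind a circuit breaker.

```python
import time


class TransientError(Exception):
    """Stands in for a refused connection, a timeout, or an overloaded callee."""


def call_with_retry(operation, attempts=3, base_delay=0.01, sleep=time.sleep):
    """Call operation(); on TransientError, retry with exponential backoff.

    Re-raises the last error once attempts are exhausted, so the caller
    still has to decide what to do when the dependency stays down.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))  # delay doubles on each retry


class FlakyLinkService:
    """Hypothetical callee that fails a fixed number of times, then recovers."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def get_link(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise TransientError("connection refused")
        return {"code": "abc123", "target_url": "https://example.com/landing"}


delays = []  # record the requested delays instead of actually sleeping

recovering = FlakyLinkService(failures=2)
link = call_with_retry(recovering.get_link, sleep=delays.append)

still_down = FlakyLinkService(failures=10)
try:
    call_with_retry(still_down.get_link, sleep=delays.append)
    gave_up = False
except TransientError:
    gave_up = True
```

Retrying is only safe when the operation is idempotent; repeating a read such as `get_link` is harmless, while repeating a non-idempotent write may apply it twice.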

Distributed tracing. A request that touches five services produces five separate log streams. Without distributed tracing (Jaeger, Zipkin, Honeycomb), debugging a slow or failed request means correlating logs across five systems manually. Observability infrastructure is not an afterthought — it's a prerequisite.
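
The core mechanism behind distributed tracing is small: generate an ID at the edge and pass it along with every downstream call. The sketch below assumes a hypothetical `X-Trace-Id` header and a shared in-memory `LOG` list standing in for each service's separate log stream; tools such as Jaeger or Zipkin add spans, timing, and sampling on top of the same idea.

```python
import uuid

LOG = []  # stand-in for the separate log streams of each service


def log(service, trace_id, message):
    LOG.append({"service": service, "trace_id": trace_id, "message": message})


def analytics_service(headers):
    log("analytics", headers["X-Trace-Id"], "click recorded")


def link_service(headers):
    log("link", headers["X-Trace-Id"], "link looked up")
    return "https://example.com/landing"


def redirect_service(code, headers=None):
    # Reuse the incoming trace ID, or start a new trace at the edge.
    headers = dict(headers or {})
    headers.setdefault("X-Trace-Id", uuid.uuid4().hex)
    trace_id = headers["X-Trace-Id"]
    log("redirect", trace_id, f"resolving {code}")
    target = link_service(headers)  # the ID travels with every downstream call
    analytics_service(headers)
    return trace_id, target


first_trace, _ = redirect_service("abc123")
second_trace, _ = redirect_service("xyz789")

# One filter reconstructs a single request's path across all three services.
first_path = [entry["service"] for entry in LOG
              if entry["trace_id"] == first_trace]
```

Without the shared ID, the six log entries from the two requests would be indistinguishable once interleaved.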

Eventual consistency. Cross-service transactions can't use database ACID guarantees. A "create link and record analytics event" operation that touches two services requires either synchronous two-phase commit (slow, fragile) or asynchronous patterns (Saga, Outbox) that accept eventual consistency. Posts 09 and 10 cover this.
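
To make the asynchronous option concrete, here is a minimal polling-outbox sketch for the "create link and record analytics event" example. It is an illustration under assumptions, not the implementation from post 10: the table layout, function names, and the in-memory SQLite database (standing in for the Link Service's own database) are all hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the Link Service's database
conn.execute(
    "CREATE TABLE links (code TEXT PRIMARY KEY, target_url TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " payload TEXT NOT NULL, published INTEGER NOT NULL DEFAULT 0)"
)


def create_link(code, target_url):
    """Write the event and the link in ONE local transaction."""
    event = json.dumps({"type": "link_created", "code": code})
    with conn:  # commits on success, rolls back both inserts on an exception
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (event,))
        conn.execute("INSERT INTO links VALUES (?, ?)", (code, target_url))


def relay_once(publish):
    """Polling relay: hand unpublished events to the broker, then mark them."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for event_id, payload in rows:
        publish(json.loads(payload))
        with conn:
            conn.execute(
                "UPDATE outbox SET published = 1 WHERE id = ?", (event_id,)
            )
    return len(rows)


create_link("abc123", "https://example.com/landing")
try:
    # Duplicate key: the link insert fails, so the event insert is rolled back.
    create_link("abc123", "https://example.com/other")
except sqlite3.IntegrityError:
    pass

analytics_inbox = []  # stand-in for the Analytics Service's consumer
first_pass = relay_once(analytics_inbox.append)
second_pass = relay_once(analytics_inbox.append)
```

The link and its event either both exist or neither does, with no cross-service transaction. The trade is that the Analytics Service sees the event only after the relay runs, and a crash between publishing and marking would redeliver it, so consumers should tolerate duplicates.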

Deployment complexity. Coordinating dozens of services requires container orchestration (Kubernetes), service registries, health checks, rollout strategies, and secrets management. The operational surface area grows with every service added.

Latency. A monolith operation that calls 10 functions (nanoseconds each) becomes 10 network hops (milliseconds each) in microservices. At roughly 1–5 ms per hop, a deeply nested, sequential call graph can add 10–50 ms of network overhead on top of the business logic itself.
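
The arithmetic behind that range is worth making explicit. The per-call figures below are assumptions chosen to match the orders of magnitude in the paragraph above (about 100 ns for an in-process call, 1 to 5 ms for a network round trip); measure your own system before relying on them.

```python
HOPS = 10
IN_PROCESS_CALL_NS = 100      # assumed cost of one in-process function call
NETWORK_HOP_MS = (1.0, 5.0)   # assumed best and worst per-hop round trip


def chain_latency_ms(hops, per_hop_ms):
    """Sequential call chain: each hop waits for the previous one to return."""
    return hops * per_hop_ms


# Ten in-process calls, converted from nanoseconds to milliseconds.
monolith_ms = HOPS * IN_PROCESS_CALL_NS / 1_000_000

# Ten sequential network hops at the best and worst assumed round trip.
best_ms = chain_latency_ms(HOPS, NETWORK_HOP_MS[0])
worst_ms = chain_latency_ms(HOPS, NETWORK_HOP_MS[1])

print(f"monolith: {monolith_ms} ms, microservices: {best_ms}-{worst_ms} ms")
```

Under these assumptions the same logical operation goes from about a microsecond of call overhead to tens of milliseconds, which is why call depth matters more than service count.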

Service decomposition anti-patterns

Too fine-grained (nanoservices). A service that does one trivial thing and exists only to be "microservices-ish." The coordination overhead of three services that each do thirty lines of logic often exceeds the value of their independence.

Distributed monolith. Services that are independently deployed but tightly coupled through shared databases, synchronous call chains, or coordinated deployments. This is worse than a monolith — you have all the operational complexity of distributed systems with none of the team autonomy benefits.

Big bang decomposition. Rewriting the entire monolith as microservices at once. Almost always fails. Use the Strangler Fig pattern (post 16) to migrate incrementally.

Tradeoffs

Team autonomy vs distributed systems complexity. You buy team autonomy and independent scalability by paying in distributed systems complexity. The exchange rate is high: retry logic, circuit breakers, distributed tracing, eventual consistency, service discovery, deployment pipelines per service. It only makes sense when the autonomy benefit exceeds this cost, which means the organisation is large enough and the system complex enough that coordination is the actual bottleneck.

The microservices premium. Martin Fowler describes this cost as the "microservice premium": the upfront cost in infrastructure, tooling, and engineering overhead. Startups and small teams often pay this premium without ever extracting the benefit.

The one thing to remember

Microservices are the answer to specific organisational and scaling problems, not a default architectural choice. They eliminate coordination tax between teams and enable independent scaling, at the cost of distributed systems complexity that must be handled explicitly. Draw boundaries around business capabilities, enforce data ownership, invest in observability before you need it, and handle network failures everywhere. If your team isn't large enough to feel coordination pain, and your system isn't large enough to need component-level scaling, a well-structured monolith ships faster and is simpler to operate.

← Previous: Monolithic Architecture — the right default, and when it stops being right

→ Next: Serverless — a third deployment model where you neither manage servers nor run persistent services; pay per invocation, scale to zero, focus purely on function logic.
