# Client-Server Architecture: The Model Everything Else Builds On

> **Series:** System Design · Scalability & Infrastructure — Pillar 6 of 8

## Systems Design

| # | Post | What it covers |
|---|------|----------------|
| 00 | [Scalability & Infrastructure: The Layer Between Your Code and the Internet](/scalability-infrastructure-the-layer-between-your-code-and-the-internet) | Nine concepts covering load balancing, rate limiting, proxies, compression, and probabilistic data structures that keep large systems fast and reliable. |
| 01 | **Client-Server Architecture: The Model Everything Else Builds On** ← you are here | Client-server is the foundational model for distributed systems. Learn what clients and servers know, where state lives, and how the model scales. |
| 02 | [Load Balancing: Distributing Traffic Across Servers](/load-balancing-distributing-traffic-across-servers) | Load balancers distribute traffic across servers for scale and availability. Learn how they work, what types exist, and what they require of backend servers. |
| 03 | [Load Balancing Algorithms: How Traffic Is Distributed](/load-balancing-algorithms-how-traffic-is-distributed) | Round robin, least connections, IP hash, weighted — each algorithm makes different tradeoffs. Learn how to choose the right one for your workload. |
| 04 | [Rate Limiting: Protecting Services from Overload](/rate-limiting-protecting-services-from-overload) | Rate limiting protects services from overload and abuse. Learn how token bucket, leaky bucket, and sliding window algorithms work and when to use each. |
| 05 | [Proxy vs Reverse Proxy: Which Way Does It Face?](/proxy-vs-reverse-proxy-which-way-does-it-face) | Forward proxies protect clients; reverse proxies protect servers. Learn how each works, what Nginx and Cloudflare do, and when you need which. |
| 06 | [Data Compression: Smaller, Faster, Cheaper](/data-compression-smaller-faster-cheaper) | Compression reduces bandwidth and storage costs. Learn how Gzip, Brotli, LZ4, and zstd work, where to apply them, and the CPU tradeoffs involved. |
| 07 | [Checksums: Detecting Corruption Before It Becomes a Catastrophe](/checksums-detecting-corruption-before-it-becomes-a-catastrophe) | Checksums detect silent data corruption in transit and storage. Learn how CRC32, MD5, and SHA-256 work and where to apply them in distributed systems. |
| 08 | [Bloom Filters: Answering "Have I Seen This?" Without Storing Everything](/bloom-filters-answering-have-i-seen-this-without-storing-everything) | A Bloom filter answers "have I seen this?" in constant memory. Learn how they work, why false positives are acceptable, and where they're used in production. |
| 09 | [HyperLogLog: Counting Distinct Items Without Storing Them](/hyperloglog-counting-distinct-items-without-storing-them) | HyperLogLog counts distinct values in ~1.5 KB of memory with <2% error. Learn how it works and why Redis, BigQuery, and Postgres use it. |
| 10 | [Scalability & Infrastructure: Wrap-Up](/scalability-infrastructure-wrap-up) | A recap of all 9 scalability concepts: load balancing, rate limiting, proxies, compression, checksums, Bloom filters, and HyperLogLog. How they fit together. |

---

## The problem

Before load balancers, microservices, CDNs, and distributed caches, there is a simpler question: who is talking to whom, and who holds what?

Every distributed system, regardless of complexity, is built on the same basic pattern: some processes make requests (clients), others fulfill them (servers). The complexity of modern systems comes from how this simple pattern scales, what happens when there are millions of clients and hundreds of servers, and how the boundaries between client and server shift as systems evolve.

Understanding the model deeply — not just the labels but the underlying constraints — is what makes the rest of this pillar legible.

---

## The core idea

In a client-server architecture, clients initiate requests and servers respond to them. Clients and servers have an asymmetric relationship: clients know about servers (they hold a hostname or IP address to connect to), but servers generally don't know about individual clients in advance. Communication is typically over a network using a request-response protocol.

---

## The analogy: a restaurant

A client is a diner. A server is the kitchen. A diner initiates the interaction — they sit down, call the waiter, place an order. The kitchen responds — it prepares the meal and sends it back. The kitchen doesn't walk around looking for people to feed; it waits for orders.

The kitchen doesn't know who the diner is before they walk in. The diner knows where the restaurant is (the address / hostname). The diner makes multiple requests in a meal (order, refill, dessert) — each a discrete round-trip.

As the restaurant gets busier, you add more kitchen stations (horizontal scaling). Each station is equivalent — any order can go to any station. The waiter (load balancer) decides which station handles which order.

---

## How it works

### The basic model

```
Client                      Server
  │                            │
  │──── Request (HTTP GET) ───▶│
  │                            │ process
  │◀─── Response (200 OK) ─────│
  │                            │
```

The client sends a request. The server processes it and returns a response. The client initiates; the server reacts.

In HTTP (the dominant protocol for web systems), requests include:
- **Method:** GET (read), POST (create), PUT (replace), PATCH (update), DELETE (remove)
- **Path:** `/links/x7Kp2`
- **Headers:** `Authorization`, `Content-Type`, `Accept-Encoding`
- **Body:** request payload (for POST/PUT/PATCH)

Responses include a status code (200 OK, 404 Not Found, 500 Internal Server Error), headers, and a body.
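The request and response parts listed above can be seen in a single round-trip on one machine. This is a minimal sketch using only Python's standard library; the `LINKS` table, the `LinkHandler` class, and the JSON response shape are assumptions made for illustration, not part of any real service.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory data; the short code mirrors the example path above.
LINKS = {"x7Kp2": "https://example.com/some/long/url"}


class LinkHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The server only learns what the request carries: method, path, headers.
        code = self.path.rsplit("/", 1)[-1]
        if self.path.startswith("/links/") and code in LINKS:
            self._reply(200, {"code": code, "target": LINKS[code]})
        else:
            self._reply(404, {"error": "not found"})

    def _reply(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)                      # status line
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)                          # response body

    def log_message(self, format, *args):
        pass  # silence per-request logging for the demo


def fetch(port, path):
    """Client side: open a connection, send one GET, read one response."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path, headers={"Accept": "application/json"})
        resp = conn.getresponse()
        return resp.status, json.loads(resp.read())
    finally:
        conn.close()


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), LinkHandler)  # port 0: the OS picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return fetch(port, "/links/x7Kp2"), fetch(port, "/links/missing")
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_demo()` returns one successful lookup and one 404. The server never contacts the client on its own; it only answers the two requests the client chose to send.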

### What clients know vs what servers know

This asymmetry is fundamental:

**Clients know:**
- The server's address (hostname or IP, resolved via DNS)
- The protocol and port
- Their own state (what they've seen, their auth tokens, their UI state)

**Servers know:**
- The contents of this request
- Whatever is in the request's headers (auth token, cookies, client metadata)
- The server's own state (database, cache, configuration)

**Servers don't know by default:**
- Who the client is (unless the request carries identity)
- What other requests the client has made (unless stored in a session or database)
- Whether this is the client's first request or ten thousandth

This is what it means for HTTP to be **stateless by design**: each request carries all the information needed to fulfill it, and the protocol itself keeps no per-client context between requests. Sessions, authentication tokens, and cookies are all mechanisms for carrying state (or a reference to it) within requests. They are applications of the model, not deviations from it.
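One way to picture "the request carries identity" is a signed token in the `Authorization` header. The sketch below is an illustration under stated assumptions: the `user_id.signature` token format, `SECRET_KEY`, and the function names are all hypothetical, and the format is deliberately simpler than a real JWT.

```python
import base64
import hashlib
import hmac
from typing import Optional

SECRET_KEY = b"demo-only-secret"  # assumption: a real key would come from secure config


def _sign(user_id: str) -> str:
    digest = hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def issue_token(user_id: str) -> str:
    """Issued once at login; the client then sends it with every request."""
    return f"{user_id}.{_sign(user_id)}"


def identify(headers: dict) -> Optional[str]:
    """Recover the caller's identity from the request alone, with no server-side session."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return None
    user_id, sep, signature = auth[len("Bearer "):].rpartition(".")
    if not sep or not user_id:
        return None
    expected = _sign(user_id)
    # Constant-time comparison avoids leaking how much of the signature matched.
    if hmac.compare_digest(signature.encode("utf-8"), expected.encode("ascii")):
        return user_id
    return None
```

Because `identify` needs only the headers and the shared key, any server holding the key can answer the request. A request without the header, or with a tampered token, is simply anonymous.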

### State: where does it live?

The most consequential design decision in a client-server system is where state lives.

**Stateless servers:** servers hold no client-specific state between requests. All state is in the database (durable) or in the request itself (e.g., a JWT carrying identity). Any server can handle any request, which is what makes horizontal scaling straightforward.

```
Client 1 → Load Balancer → Server A (stateless) → Database
Client 1 → Load Balancer → Server B (stateless) → Database
# Both requests work equally well regardless of which server handles them
```

**Stateful servers:** servers hold per-client state (open connections, in-flight sessions, WebSocket state). This makes horizontal scaling more complex — a client must consistently reach the same server (session stickiness), or state must be externalised to a shared store.

For most web applications, the right architecture is **stateless application servers backed by a shared state layer** (database, cache). The app servers can scale freely; state lives in a small number of shared, durable systems.
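The difference between the two designs shows up as soon as requests from one client alternate between servers. The following sketch assumes a toy per-client request counter as the "state"; `SharedStore` stands in for a database or cache, and every class and function name here is hypothetical.

```python
import itertools


class SharedStore:
    """Stands in for a database or cache that all app servers can reach."""

    def __init__(self):
        self._counts = {}

    def increment(self, key):
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]


class StatelessServer:
    """Keeps nothing between requests; all state lives in the shared store."""

    def __init__(self, store):
        self.store = store

    def handle(self, client_id):
        return self.store.increment(client_id)


class StatefulServer:
    """Keeps per-client state in its own memory, invisible to other servers."""

    def __init__(self):
        self._local = {}

    def handle(self, client_id):
        self._local[client_id] = self._local.get(client_id, 0) + 1
        return self._local[client_id]


def round_robin(servers, client_id, n_requests):
    """Send n_requests from one client, alternating across the server pool."""
    pool = itertools.cycle(servers)
    return [next(pool).handle(client_id) for _ in range(n_requests)]


store = SharedStore()
stateless_counts = round_robin([StatelessServer(store), StatelessServer(store)], "client-1", 4)
stateful_counts = round_robin([StatefulServer(), StatefulServer()], "client-1", 4)
```

With the shared store, the four requests are counted as 1, 2, 3, 4 regardless of which server handled them. With local state, each server only sees half the traffic and the counts come back as 1, 1, 2, 2, which is the inconsistency that session stickiness or an externalised store is meant to prevent.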

### The two-tier model and beyond

**Two-tier:** client talks directly to the server.
```
Web Browser ←→ Web Server (handles both app logic and database)
```

**Three-tier (the standard):** client talks to an application server; application server talks to a data layer.
```
Client ←→ App Server ←→ Database
```

**N-tier / microservices:** the app server is itself a client to other services.
```
Client ←→ API Gateway ←→ Link Service ←→ Database
                      ←→ Analytics Service ←→ Cassandra
                      ←→ Auth Service ←→ Redis
```

In a microservices architecture, almost every service is simultaneously a server (to its callers) and a client (to the services it calls). The client-server model doesn't disappear — it recurses.
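The recursion can be sketched with a single class that plays both roles. This is an illustrative model rather than a real service framework: the `Service` class and its `handle` method are assumptions, and the topology simply mirrors the diagram above.

```python
class Service:
    def __init__(self, name, downstream=()):
        self.name = name
        self.downstream = list(downstream)

    def handle(self, request, depth=0, trace=None):
        """Act as a server to the caller, and as a client to each downstream service."""
        trace = [] if trace is None else trace
        trace.append("  " * depth + self.name)          # server role: request received
        for dependency in self.downstream:
            dependency.handle(request, depth + 1, trace)  # client role: call downstream
        return trace


gateway = Service("API Gateway", [
    Service("Link Service", [Service("Database")]),
    Service("Analytics Service", [Service("Cassandra")]),
    Service("Auth Service", [Service("Redis")]),
])

call_tree = gateway.handle("GET /links/x7Kp2")
```

Printing `"\n".join(call_tree)` shows an indented call tree. Every node except the leaves appears once as a server (it was called) and once as a client (it called something else).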

### The evolution at scale

As client volume grows, the server side of the model changes:

**One server → many servers (horizontal scaling):** a load balancer distributes clients across a server pool. This works best when servers are stateless, so that any server can handle any request; stateful servers need sticky sessions or an externalised state store. This is the subject of the next two posts.

**Synchronous → asynchronous:** at scale, making every client wait for its request to fully complete is expensive. Message queues allow servers to acknowledge receipt and process asynchronously — the client is no longer blocked.
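The acknowledge-then-process pattern can be sketched with an in-process queue. This is a minimal sketch: `queue.Queue` stands in for a real message broker, the `results` dict stands in for wherever finished work is stored, and the function names and the use of `202 Accepted` as the acknowledgement are illustrative choices.

```python
import queue
import threading
import uuid

jobs = queue.Queue()  # stands in for a message broker
results = {}          # stands in for wherever finished work is stored


def handle_request(payload):
    """Server handler: enqueue the work and acknowledge immediately."""
    job_id = uuid.uuid4().hex
    jobs.put((job_id, payload))
    return 202, job_id  # 202 Accepted: received, not yet processed


def worker():
    """Background consumer: does the slow part after the client has been answered."""
    while True:
        item = jobs.get()
        try:
            if item is None:  # sentinel: stop the worker
                return
            job_id, payload = item
            results[job_id] = payload.upper()  # placeholder for real processing
        finally:
            jobs.task_done()


def run_demo(payloads):
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    acks = [handle_request(p) for p in payloads]  # each call returns without waiting
    jobs.join()     # demo only: wait so the results can be inspected
    jobs.put(None)  # ask the worker to stop
    thread.join()
    return acks
```

The client receives a job ID right away and can check on the result later. The cost is that a `202` only promises the work was accepted, not that it succeeded.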

**Monolith → services:** as server logic grows complex, the "server" becomes a collection of services, each acting as a server to its callers. Service discovery, inter-service auth, and circuit breakers become necessary. Covered in Pillar 7.

---

## The tradeoffs

**Stateless servers are simpler to scale but require externalised state.** Moving state to a database or cache adds latency (requests that need state must reach external storage) and operational complexity (the database becomes critical infrastructure). The tradeoff is usually worth it: stateless servers are the foundation of most large-scale web systems.

**Request-response is simple but one-directional.** The client always initiates. A server cannot send an update on its own initiative: either the client keeps a persistent connection open for the server to write to (WebSockets, SSE), or the client polls by asking repeatedly. For real-time features (live notifications, collaborative editing), the standard request-response model requires augmentation. Covered in Pillar 3.
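Polling is the simplest of these augmentations, because it stays entirely within request-response. The sketch below models it in memory; the `EventServer` and `PollingClient` classes and the integer cursor are assumptions made for illustration.

```python
class EventServer:
    def __init__(self):
        self._events = []

    def publish(self, event):
        # New data arrives on the server, but it has no way to reach the client.
        self._events.append(event)

    def handle_poll(self, since):
        """Answer one poll: everything after the client's cursor, plus the new cursor."""
        return self._events[since:], len(self._events)


class PollingClient:
    def __init__(self, server):
        self.server = server
        self.cursor = 0  # the client, not the server, remembers what it has seen

    def poll(self):
        new_events, self.cursor = self.server.handle_poll(self.cursor)
        return new_events


server = EventServer()
client = PollingClient(server)

server.publish("link created")
first = client.poll()   # picks up the one pending event
second = client.poll()  # nothing new: a wasted round-trip
server.publish("link clicked")
server.publish("link expired")
third = client.poll()   # picks up both events at once
```

The empty second poll is the cost of this approach: the client pays for a round-trip just to learn that nothing changed, and updates are delayed by up to one polling interval.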

**Every hop adds latency.** Each additional tier (API gateway, service proxy, middleware) adds a network round-trip. In a microservices system with five sequential service hops, each adding 2 ms, you've added 10 ms before any business logic runs. Every layer must justify its existence in the latency budget.
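The arithmetic is easy to keep honest in code. In this sketch the hop names and the 50 ms budget are hypothetical; only the 2 ms per hop and the five hops come from the example above, and the hops are assumed to run one after another.

```python
# Hypothetical hop names; each adds 2 ms, as in the example above.
HOP_LATENCY_MS = {
    "api_gateway": 2.0,
    "auth_service": 2.0,
    "link_service": 2.0,
    "cache": 2.0,
    "database": 2.0,
}


def added_latency_ms(hops):
    """Sequential hops: each one waits for the next, so the costs add up."""
    return sum(hops.values())


def remaining_budget_ms(budget_ms, hops):
    """What is left for actual business logic after the network overhead."""
    return budget_ms - added_latency_ms(hops)


overhead = added_latency_ms(HOP_LATENCY_MS)           # 10.0
remaining = remaining_budget_ms(50.0, HOP_LATENCY_MS)  # 40.0 of an assumed 50 ms budget
```

Keeping a table like this next to the architecture diagram makes the cost of each new layer visible before it is added.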

---

## The one thing to remember

> **Clients initiate, servers respond, and the boundary between them is defined by who knows what.** Stateless servers — where no per-client state is held between requests — are the foundation of horizontal scalability. All client-specific state lives in the request itself or in an externalised store. Every load balancer, proxy, rate limiter, and gateway in this pillar exists to manage the client-server relationship at scale without violating this principle.

---

*← Previous: **[Pillar 6 Overview](/scalability-infrastructure-the-layer-between-your-code-and-the-internet)** — introducing scalability & infrastructure*

*→ Next: **[Load Balancing](/load-balancing-distributing-traffic-across-servers)** — when a single server can't handle all the clients, a load balancer distributes them across a pool — here's how that works and what it requires of the servers behind it.*