Sync vs Async Communication: The Architectural Fork

Systems Design
| # | Post | What it covers |
|---|---|---|
| 00 | APIs & Communication: How Services Talk to Each Other | How services talk to each other shapes everything about a system. Nine concepts covering REST, WebSockets, async patterns, and API gateways. |
| 01 | API Design: Building Contracts That Last | A great API is a contract that outlasts your code. Here are the principles that make APIs intuitive to consume, safe to evolve, and cheap to maintain. |
| 02 | REST APIs: Constraints That Create Benefits | REST isn't just HTTP with JSON. It's an architectural style with specific constraints, and understanding them explains why REST APIs are designed the way they are. |
| 03 | Authentication vs Authorisation: Two Questions, Two Checks | Authentication is who you are. Authorisation is what you're allowed to do. Confusing them is one of the most common security mistakes in system design. |
| 04 | Session vs Token Authentication: Stateful vs Stateless Identity | Session auth stores identity on the server. Token auth encodes it in the token. Here's how each works, where each breaks, and how to choose. |
| 05 | OAuth 2.0 & OpenID Connect: Delegated Access and Federated Identity | OAuth 2.0 lets users grant apps access without sharing passwords. OpenID Connect adds identity on top. Here's how both actually work. |
| 06 | JWT: What's Actually Inside the Token | JWTs are everywhere in modern auth, and frequently misused. Here's exactly what a JWT contains, how the signature works, and what it doesn't protect. |
| 07 | WebSockets: Real-Time Bidirectional Communication | HTTP is request-response. WebSockets are a persistent two-way channel. Here's how they work, when to use them, and what to watch out for at scale. |
| 08 | Long Polling, SSE & Webhooks: The Server-Push Spectrum | Three patterns for server-push communication: long polling, server-sent events, and webhooks. Here's how each works and when to reach for each. |
| 09 | Sync vs Async Communication: The Architectural Fork ← you are here | Synchronous services couple tightly. Asynchronous services decouple, but add complexity. Here's how to reason about which your system needs. |
| 10 | API Gateways: One Entry Point, Every Cross-Cutting Concern | An API gateway centralises auth, rate limiting, routing, and observability for all your services. Here's what it does, how it works, and when you need one. |
| 11 | APIs & Communication: Wrap-Up | A complete recap of all ten API and communication concepts (REST, auth, JWT, WebSockets, webhooks, async patterns, and API gateways) and how they connect. |
The problem
Your URL shortener's link creation flow has grown. When a user creates a short link, your system now needs to: save the link record to the database, generate a QR code, notify the analytics service, send a confirmation email, invalidate any relevant CDN cache entries, and deliver webhooks to subscribed partners.
You implement this synchronously: create link → generate QR code → notify analytics → send email → invalidate cache → deliver webhooks → return response to user.
In testing this works fine. In production, the email service has a 200ms response time, webhook delivery occasionally takes 800ms, and when the analytics service is slow, the whole chain blocks. Your link creation endpoint, which should take 50ms, sometimes takes 2 seconds — because a user creating a short link is waiting for their confirmation email to send.
Worse: when the webhook service has an outage, link creation fails entirely, even though webhooks are completely unrelated to whether the link was actually created successfully.
You've coupled unrelated operations into a single synchronous chain. The fix — making post-creation side effects asynchronous — is the architectural shift that unlocks scale, resilience, and better user experience simultaneously.
The core idea
Synchronous communication means the caller sends a request and waits for a response before proceeding. The caller and callee are temporally coupled — the caller's progress is blocked on the callee's response.
Asynchronous communication means the caller sends a message and continues without waiting. The message is processed by the receiver at its own pace. The caller and callee are temporally decoupled — they run independently.
This is one of the most consequential architectural decisions in system design. The choice determines how tightly services are coupled, how failures propagate, how the system scales under load, and how complex the eventual consistency logic needs to be.
The analogy: waiting at the desk vs sending an email
Synchronous communication is walking to a colleague's desk and standing there until they answer your question. You get an immediate answer. You also can't do anything else while you're standing there. If they're in a meeting, you wait. If they're sick, you're blocked.
Asynchronous communication is sending them an email. You write it, send it, and go back to work. The answer arrives when it arrives. You handle it when you see it. If they're in a meeting or out sick, the email waits in their inbox. Your work continues regardless.
Both patterns exist in healthy organisations. Standing at someone's desk is right when you need an immediate answer and the conversation is the work. Email is right when the answer doesn't block your progress and the other person's availability shouldn't be your problem.
How synchronous communication works
In synchronous service communication, the calling service makes an HTTP (or gRPC, or equivalent) request and blocks until it receives a response:
User Request
     │
     ▼
Link Service ──HTTP POST /links──────────────────────► Database
             ◄──201 Created, {link_id}────────────────
             ──HTTP POST /qr-codes──────────────────► QR Service
             ◄──200 OK, {qr_url}────────────────────
             ──HTTP POST /analytics/events──────────► Analytics Service
             ◄──200 OK──────────────────────────────
             ──HTTP POST /emails/send──────────────► Email Service
             ◄──200 OK──────────────────────────────
     ▼
User Response: 201 Created (after all upstream calls complete)
What synchronous gives you: simplicity. The calling service knows immediately whether each step succeeded. Error handling is straightforward — if any step fails, you can roll back or return an error. The request-response model is easy to reason about and debug.
What synchronous costs you: coupling and latency. The calling service's response time is the sum of all upstream response times. Any slow or unavailable upstream service slows or breaks the calling service. Load spikes in the calling service immediately create load spikes in every upstream service. Services that should be independent become a chain of dependencies.
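The cost described above can be made concrete with a rough sketch. The `Step` type, the `run_sync_chain` helper, and most of the latencies are invented for illustration; only the 200 ms and 800 ms figures echo the problem statement. It assumes the calls run strictly one after another and that a failing call still consumes its latency before the error comes back.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    latency_ms: int
    healthy: bool = True


def run_sync_chain(steps):
    """Run steps one after another; return (succeeded, total_latency_ms, failed_step)."""
    total_ms = 0
    for step in steps:
        total_ms += step.latency_ms  # the caller blocks for each call in turn
        if not step.healthy:
            return False, total_ms, step.name  # one failing step fails the whole request
    return True, total_ms, None


# Illustrative latencies only (assumptions, not measurements).
chain = [
    Step("database", 50),
    Step("qr", 120),
    Step("analytics", 80),
    Step("email", 200),
    Step("webhooks", 800),
]

healthy_result = run_sync_chain(chain)

chain[-1].healthy = False  # simulate a webhook outage
outage_result = run_sync_chain(chain)

print(healthy_result)
print(outage_result)
```

In this toy model the response time is the sum of every step, and the webhook outage turns a successful database write into a failed request.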
How asynchronous communication works
In asynchronous communication, the calling service publishes a message to a broker (message queue, event bus) and returns immediately. Downstream services consume messages from the broker at their own pace.
User Request
     │
     ▼
Link Service ──INSERT INTO links──────────────────────► Database
             ◄──201 Created, {link_id}────────────────
             ──PUBLISH link.created event──────────────► Message Broker
                                                          (Kafka / SQS / RabbitMQ)
     ▼
User Response: 201 Created (immediate, after DB write only)
[Async, decoupled, later:]
Message Broker ──link.created──► QR Service (generates QR, stores URL)
Message Broker ──link.created──► Analytics Service (updates counters)
Message Broker ──link.created──► Email Service (sends confirmation)
Message Broker ──link.created──► Webhook Dispatcher (delivers to partners)
Message Broker ──link.created──► CDN Invalidation Service (purges cache)
What asynchronous gives you:
Temporal decoupling — the link service doesn't know or care whether the email service is up. The event sits in the queue until the email service processes it.
Resilience — if the QR service is down, its queue builds up. When it recovers, it processes the backlog. No link creation failures due to QR service outages.
Independent scaling — the email service can scale based on its own throughput requirements, independently of the link service.
Traffic smoothing — a spike in link creations (a viral campaign) creates a spike of events in the queue. Downstream services process the queue at their own rate. The spike is absorbed rather than propagated.
What asynchronous costs you:
Eventual consistency — the QR code won't exist the moment the link is created. The confirmation email won't send instantly. If the user's response includes a QR code URL, you need to either wait for QR generation (defeating the purpose) or handle the not-yet-generated case gracefully.
Complexity — distributed message delivery has its own failure modes: duplicate delivery, out-of-order messages, consumer failures partway through processing. These require careful design.
Observability — a synchronous call that fails produces an immediate error. An asynchronous failure might not surface for minutes, and tracing "why didn't the email send?" through an event broker requires distributed tracing infrastructure.
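To make the publish-and-return flow tangible, here is a minimal in-memory sketch. `InMemoryBroker` is a hypothetical toy, not the API of Kafka, SQS, or RabbitMQ; it keeps one queue per subscriber and lets each consumer drain its backlog whenever it is ready. The "QR service is down" scenario is simulated simply by not draining its queue for a while.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Toy stand-in for a real broker: one pending queue per subscriber (fan-out)."""

    def __init__(self):
        self.queues = defaultdict(deque)  # subscriber name -> pending events
        self.handlers = {}                # subscriber name -> handler function

    def subscribe(self, name, handler):
        self.handlers[name] = handler
        self.queues[name]  # touching the key creates an empty queue

    def publish(self, event):
        # Every subscriber gets its own copy of the event to process later.
        for name in self.handlers:
            self.queues[name].append(event)

    def drain(self, name):
        """Process one subscriber's backlog; return how many events were handled."""
        pending = self.queues[name]
        handled = 0
        while pending:
            self.handlers[name](pending.popleft())
            handled += 1
        return handled


broker = InMemoryBroker()
sent_emails, qr_codes = [], []
broker.subscribe("email", lambda e: sent_emails.append(e["link_id"]))
broker.subscribe("qr", lambda e: qr_codes.append(e["link_id"]))


def create_link(link_id):
    # The synchronous database write would happen here.
    broker.publish({"type": "link.created", "link_id": link_id})
    return 201  # returned without waiting for any consumer


statuses = [create_link(link_id) for link_id in ("a1", "b2", "c3")]

broker.drain("email")                      # the email service keeps up
qr_backlog = len(broker.queues["qr"])      # the QR service is "down": backlog grows
broker.drain("qr")                         # the QR service recovers and catches up
```

All three link creations succeed while the QR consumer is idle; its events wait in the queue and are processed once it drains them.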
Delivery semantics
Message brokers make different guarantees about how they deliver messages:
At-most-once: the message is delivered zero or one time. If delivery fails, the message is dropped. No duplicates, but potential message loss. Used where loss is acceptable and duplicates are worse (metrics aggregation where a duplicate would double-count).
At-least-once: the message is delivered one or more times. The broker retries until the consumer acknowledges. No loss, but duplicates are possible if the consumer processes but fails before acknowledging. This is the most common guarantee.
Exactly-once: the message is delivered exactly one time. No loss, no duplicates. Achieved through distributed transactions or idempotent consumers with deduplication. The hardest to implement correctly; used for financial transactions and operations where duplicate processing causes real-world harm.
The practical approach: use at-least-once delivery and build idempotent consumers. Processing the same link.created event twice should produce the same result as processing it once: the QR service checks whether a QR code already exists for this link before generating one. This is simpler than exactly-once delivery and, for any operation that can be made idempotent, produces the same end result.
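A minimal sketch of an idempotent consumer, assuming the event carries a `link_id` that can serve as the deduplication key. The `QrConsumer` class, its in-memory dictionary, and the `qr.example` URL are placeholders for a real service and datastore.

```python
class QrConsumer:
    """Hypothetical QR service consumer that tolerates duplicate deliveries."""

    def __init__(self):
        self.qr_by_link = {}  # stands in for the QR service's datastore
        self.generated = 0    # counts how often the expensive work actually ran

    def handle(self, event):
        link_id = event["link_id"]
        if link_id in self.qr_by_link:
            # Duplicate delivery: return the existing result, do no new work.
            return self.qr_by_link[link_id]
        self.generated += 1
        qr_url = f"https://qr.example/{link_id}.png"
        self.qr_by_link[link_id] = qr_url
        return qr_url


consumer = QrConsumer()
event = {"type": "link.created", "link_id": "x7Kp2"}

first = consumer.handle(event)
second = consumer.handle(event)  # the broker redelivers the same event
```

Both calls return the same URL and the QR code is generated once. In a real datastore the check and the write would need to be atomic, for example through a unique constraint on the link ID, so that two concurrent deliveries cannot both pass the check.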
Request-reply over async (async with correlation)
Sometimes you need the result of an asynchronous operation before proceeding. The request-reply pattern over async messaging bridges this:
Client sends request with correlation ID:
Link Service ──{job_id: "abc123", link_id: "x7Kp2"}──► QR Request Queue
QR Service processes and publishes result:
QR Service ──{job_id: "abc123", qr_url: "..."}──► QR Response Queue
Link Service polls response queue for job_id: "abc123":
◄──{job_id: "abc123", qr_url: "https://qr.sho.rt/x7Kp2.png"}──
This pattern is used for long-running operations that can't be completed synchronously but whose result is needed by the caller. The correlation ID links request and response without coupling the services directly.
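The exchange above can be sketched with two in-process queues standing in for the request and response queues. The function names are hypothetical, and the worker is stepped manually to keep the example deterministic; in a real deployment it would run as a separate service.

```python
import queue
import uuid

request_q = queue.Queue()   # stands in for the QR request queue
response_q = queue.Queue()  # stands in for the QR response queue


def request_qr(link_id):
    """Link service side: enqueue a request and return its correlation ID."""
    job_id = uuid.uuid4().hex
    request_q.put({"job_id": job_id, "link_id": link_id})
    return job_id


def qr_worker_step():
    """QR service side: handle one request and reply with the same job_id."""
    req = request_q.get_nowait()
    qr_url = f"https://qr.sho.rt/{req['link_id']}.png"
    response_q.put({"job_id": req["job_id"], "qr_url": qr_url})


def collect_replies(pending):
    """Link service side: drain available replies and match them to pending job IDs."""
    results = {}
    while True:
        try:
            reply = response_q.get_nowait()
        except queue.Empty:
            return results
        if reply["job_id"] in pending:
            results[reply["job_id"]] = reply["qr_url"]


job_a = request_qr("x7Kp2")
job_b = request_qr("q9Zt4")
qr_worker_step()
qr_worker_step()
replies = collect_replies({job_a, job_b})
```

Because each reply carries the `job_id` of its request, the link service can match results to callers regardless of the order in which replies arrive.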
Combining synchronous and asynchronous
Real systems use both patterns for different operations:
Synchronous (caller needs immediate response):
├── Read operations (GET /links/{id})
├── Operations that must succeed before responding (creating the link record)
└── Real-time checks (authentication, authorisation, rate limiting)
Asynchronous (caller doesn't need to wait):
├── Side effects (emails, notifications, webhooks)
├── Analytics event processing
├── Cache invalidation
├── Search index updates
└── Long-running jobs (QR generation, image processing, report generation)
In the URL shortener: link creation is synchronous for the DB write (the user needs confirmation the link exists) and asynchronous for all side effects. The user gets a 201 response the moment the link is saved. QR generation, analytics, emails, and webhooks process in the background.
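As a rough sketch of that split, the handler below performs the database write synchronously and only then queues an event for the side effects. It uses an in-memory SQLite database and a `queue.Queue` as stand-ins for the real database and broker; the table layout, status codes for the duplicate case, and the `create_link` signature are assumptions.

```python
import queue
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE links (code TEXT PRIMARY KEY, target TEXT NOT NULL)")
events = queue.Queue()  # stands in for the message broker


def create_link(code, target):
    """Synchronous part: the write must succeed before the user gets a 201."""
    try:
        with db:  # commits on success, rolls back if the INSERT raises
            db.execute(
                "INSERT INTO links (code, target) VALUES (?, ?)", (code, target)
            )
    except sqlite3.IntegrityError:
        return 409, None  # duplicate code: fail immediately, publish nothing
    # Asynchronous part: side effects are queued, not awaited.
    events.put({"type": "link.created", "code": code})
    return 201, code


created = create_link("x7Kp2", "https://example.com/landing")
duplicate = create_link("x7Kp2", "https://example.com/other")
```

The caller learns right away whether the link exists, while the QR code, email, and webhooks depend only on the queued event.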
The tradeoffs
Consistency. Synchronous operations are immediately consistent — the downstream action either happened or it didn't. Asynchronous operations are eventually consistent — the side effects catch up over time. For anything where downstream state must be consistent before responding (a payment that must deduct balance before confirming a purchase), synchronous is required.
Error handling. Synchronous failures are explicit — the caller knows immediately. Asynchronous failures are invisible to the original caller and require monitoring, alerting on dead-letter queues, and operational processes for replaying or discarding failed messages.
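One common shape for this is bounded retries followed by a dead-letter queue. The sketch below is an in-memory illustration; `consume`, `MAX_ATTEMPTS`, and the `send_email` handler are hypothetical names, and real brokers typically provide retry limits and dead-lettering as configuration rather than application code.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry budget per message


def consume(messages, handler, max_attempts=MAX_ATTEMPTS):
    """Retry failing messages up to max_attempts, then park them as dead letters."""
    pending = deque((msg, 1) for msg in messages)
    dead_letters = []
    while pending:
        msg, attempt = pending.popleft()
        try:
            handler(msg)
        except Exception as exc:
            if attempt >= max_attempts:
                # Out of retries: keep the message and the reason for later inspection.
                dead_letters.append(
                    {"message": msg, "error": str(exc), "attempts": attempt}
                )
            else:
                pending.append((msg, attempt + 1))  # requeue for another try
    return dead_letters


calls = {}


def send_email(msg):
    link_id = msg["link_id"]
    calls[link_id] = calls.get(link_id, 0) + 1
    if link_id == "bad":
        raise ValueError("no recipient address")       # permanent failure
    if link_id == "flaky" and calls[link_id] == 1:
        raise TimeoutError("email provider timed out")  # transient failure


dead = consume(
    [{"link_id": "ok"}, {"link_id": "flaky"}, {"link_id": "bad"}], send_email
)
```

The transient failure succeeds on its second attempt, while the permanently failing message ends up in `dead` after three attempts, which is the signal an alert or replay process would act on.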
Testing. Synchronous flows are tested with standard unit and integration tests — call the service, verify the response. Asynchronous flows require testing that messages are published correctly, that consumers process them correctly, and that the system behaves correctly under delayed, duplicated, or out-of-order delivery. More test surface area.
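Latency vs throughput. When the called services are fast, synchronous gives the lowest end-to-end latency for the whole operation: every step has finished by the time the response returns, with no queueing delay. Asynchronous gives the caller a faster response and the system higher throughput, because a request no longer has to complete all the follow-on work before responding. The price is that each side effect completes later, after some time in the queue.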
The one thing to remember
Synchronous communication couples the caller's response time and reliability to every service in the chain. Asynchronous communication decouples them — at the cost of eventual consistency and operational complexity. Use synchronous for operations the caller needs to know succeeded before proceeding. Use asynchronous for side effects, analytics, notifications, and anything the user doesn't need confirmed in the immediate response. The distinction is usually clear: if the user would wait for it, make it synchronous; if they wouldn't notice the lag, make it asynchronous.