Series: System Design · Data & Storage — Pillar 4 of 8

Systems Design

#	Post	What it covers
00	Data & Storage: Where Everything Lives	Where data lives shapes everything about a system. Nineteen concepts covering databases, indexing, sharding, replication, and the data structures underneath. (161 chars)
01	SQL vs NoSQL: Choosing the Right Database	SQL vs NoSQL isn't a simple choice. Learn what each type optimises for, when to use relational databases, and when NoSQL is the right call.
02	Database Indexing: The Highest-Leverage Performance Tool	Indexes are the highest-leverage database performance tool. Learn how they work, what they cost, and how to decide when to add one.
03	B-Trees & B+ Trees: The Data Structure Behind Database Indexes	Almost every database index is built on a B-tree or B+ tree. Learn how they work, why they're fast, and what this means for your queries.
04	LSM Trees: Why Some Databases Are Built for Writes	LSM trees power Cassandra, RocksDB, and LevelDB. Learn how they achieve massive write throughput and what they trade off to get it.
05	Denormalisation: Trading Storage for Speed	Denormalisation trades storage for read speed by pre-computing joins. Learn when it helps, when it hurts, and how to do it safely.
06	Database Sharding: Scaling Beyond a Single Node	Sharding splits a database across multiple nodes. Learn how it works, the strategies available, and the significant tradeoffs it introduces.
07	Data Partitioning: Choosing How to Divide Your Data	Range, hash, and list partitioning each make different tradeoffs. Learn how to divide data effectively for queries, maintenance, and scale.
08	Consistent Hashing: Minimising Resharding Pain	Consistent hashing minimises data movement when nodes are added or removed. Learn how it works and why it's fundamental to distributed systems.
09	Replication & Read Replicas: Scaling Reads and Surviving Failures	Replication copies data across nodes for fault tolerance and read scaling. Learn how primary-replica setups work and when to use them.
10	Object Storage: Unlimited Scale for Large Binary Data	Object storage handles large binary files at unlimited scale. Learn how it works, why it replaced file servers, and when to use it.
11	Block vs File vs Object Storage: Three Models, Three Use Cases	Three storage models, three different use cases. Learn what block, file, and object storage optimise for and how to choose between them.
12	Distributed File Systems: File Storage Across Many Machines	Distributed file systems spread file storage across many machines. Learn how HDFS, Ceph, and GlusterFS work and when to use them.
13	Time Series Databases: Built for Metrics and Events ← you are here	Time series databases handle append-heavy metric data far better than SQL. Learn how they work and when to use InfluxDB, Prometheus, or TimescaleDB.
14	Vector Databases: Semantic Search and AI Memory	Vector databases power semantic search, recommendations, and LLM memory. Learn how embeddings work, what ANN search is, and when to use one.
15	Full-Text Search Engines: Beyond SQL LIKE	Full-text search needs more than SQL LIKE. Learn how inverted indexes, relevance ranking, and Elasticsearch make text search fast and powerful.
16	Materialized Views: Pre-Computing Expensive Queries	Materialized views cache expensive query results as physical tables. Learn how they work, when to refresh them, and when to use them vs other approaches.
17	Query Optimisation: From Slow to Fast	Slow queries aren't always fixed by adding indexes. Learn how to read EXPLAIN output, understand query plans, and systematically make queries fast.
18	Connection Pooling: Managing the Hidden Bottleneck	Opening a database connection per request doesn't scale. Learn how connection pooling works, what PgBouncer does, and how to size your pool correctly.
19	Data & Storage: Wrap-Up	A recap of all 19 data storage concepts: SQL, NoSQL, indexing, sharding, replication, specialised databases, and how they connect in a real system.

Time Series Databases: Built for Metrics and Events

The problem

Your URL shortener now collects click data at 50,000 events per second. The operations team collects infrastructure metrics: CPU usage, memory, request latency, error rates — every server emitting 200 metrics every 10 seconds.

You start storing this in PostgreSQL. Three months later:

The clicks table has 12 billion rows. Queries that used to take 200ms now take 40 seconds, even with indexes.
The metrics table is 2TB. Running "what was the p99 latency for the redirect endpoint between 2pm and 3pm last Tuesday?" takes 90 seconds and pins the database CPU.
VACUUM runs constantly and can't keep up with the delete rate for old data being rotated out.
Storage costs are 10x what the data's information density justifies — old, compressed metric data stored in variable-length PostgreSQL rows is inefficient.

The fundamental problem: PostgreSQL is a general-purpose database optimised for relational data with complex queries, transactions, and flexible access patterns. Time series data — metrics and events ordered by timestamp — has completely different characteristics: append-only, time-ordered, queried by time range, aggregated in predictable ways, and usually retained for a fixed window before deletion.

Using a general-purpose database for time series data is technically possible and persistently painful.

The core idea

A time series database (TSDB) is purpose-built for data points indexed by time. It assumes writes are always new (append-only), data is almost always queried by time range, aggregations over time windows are the dominant query pattern, and old data has lower value and can be compressed or deleted automatically.

These assumptions enable storage formats, compression schemes, query engines, and retention policies that outperform general-purpose databases for this workload by orders of magnitude.

The analogy: a rainfall gauge vs a general ledger

A general-purpose database is like a general ledger — it can record anything, in any order, with relationships to any other record. Powerful, flexible, appropriate for business transactions.

A time series database is like a rainfall gauge logging data sheet — designed specifically for the pattern of "value at this moment, value at the next moment, value at the moment after." The form doesn't accommodate arbitrary notes. It's formatted for fast recording of one kind of entry, and efficient retrieval of averages, totals, and trends over time ranges.

You wouldn't use a general ledger to record continuous sensor readings, and you wouldn't use a rainfall gauge to manage accounts payable. Different tools for different jobs.

How it works

What makes time series data special

Time series data has four properties that TSDBs exploit:

Append-only writes. A temperature reading at 10:00:00 doesn't change. New data is always the next timestamp — there are no updates to past entries. TSDBs optimise for append-only write paths (often using LSM-tree-based storage).

Time is always in the query. Queries like "give me all CPU metrics" are rare. Queries like "give me CPU metrics for server-A between 2pm and 3pm" are universal. TSDBs store data ordered by time and partition by time window, enabling efficient range scans.

High-cardinality tag filters. Data points carry metadata: {server="web-01", region="us-east", env="prod"}. Queries filter by these tags alongside time range. TSDBs index tag combinations efficiently; general-purpose databases struggle with high-cardinality tag columns.

Retention and downsampling. Raw second-level metrics from three years ago are rarely needed. Keeping raw data forever is expensive. TSDBs support automatic retention policies (delete data older than 90 days) and downsampling (convert second-level data to 1-minute averages after 7 days, 1-hour averages after 30 days) — dramatically reducing storage costs.

Storage format: columnar + delta encoding

TSDBs use columnar storage rather than row-based storage. For time series data, this is a significant advantage:

Row-based (PostgreSQL):
(web-01, 2025-06-01 14:00:00, cpu=78.2, mem=45.1)
(web-01, 2025-06-01 14:00:10, cpu=79.1, mem=45.3)
(web-01, 2025-06-01 14:00:20, cpu=77.8, mem=45.0)
Each row stored together: timestamp + tags + values

Columnar (TSDB):
Timestamps column: [14:00:00, 14:00:10, 14:00:20, ...]
CPU column:        [78.2, 79.1, 77.8, ...]
Memory column:     [45.1, 45.3, 45.0, ...]
Tags stored once per chunk, not repeated per row

When a query asks "what was the average CPU for web-01 between 14:00 and 15:00?", the database reads only the timestamp and CPU columns — memory and other metrics are not read at all. This is columnar pruning: only the relevant columns are touched.

Delta encoding: rather than storing absolute values, TSDBs store deltas between consecutive values. CPU usage changing from 78.2 → 79.1 → 77.8 is stored as [78.2, +0.9, -1.3]. Small deltas compress extremely well — typical compression ratios are 10:1 to 100:1 compared to storing raw values.

Key TSDBs

Prometheus:

Pull-based: scrapes metrics endpoints every 15 seconds rather than receiving pushed metrics
Local storage only (no native clustering in core Prometheus; use Thanos/Cortex for scale)
PromQL: powerful time series query language with functions for rates, averages, percentiles
Designed for operational monitoring and alerting, not long-term storage

InfluxDB:

Push-based: applications write metrics to InfluxDB
InfluxQL and Flux query languages
Native clustering, retention policies, and continuous queries (downsampling)
Good fit for application-level metrics and IoT data

TimescaleDB:

PostgreSQL extension that adds time series optimisations: automatic time-based partitioning (hypertables), columnar compression for historical data, time-specific SQL functions
Full SQL support — all existing PostgreSQL queries work
Best choice when you're already using PostgreSQL and need time series performance without adding another system

ClickHouse:

Column-oriented OLAP database, not strictly a TSDB but frequently used for time series analytics at very large scale
Extremely fast for aggregation queries over large datasets
Used by Cloudflare, ByteDance, Uber for analytics workloads

The URL shortener's TSDB use

Click events — (link_id, timestamp, country, device, referer) — are the natural workload for a TSDB:

-- InfluxDB Flux: click rate per link for the last hour
from(bucket: "clicks")
  |> range(start: -1h)
  |> filter(fn: (r) => r.link_id == "x7Kp2")
  |> aggregateWindow(every: 1m, fn: count)

Storing this in PostgreSQL required careful partitioning, manual retention management, and aggregate queries that scaled poorly. InfluxDB handles all of it natively.

The tradeoffs

Limited query flexibility. TSDBs are optimised for time range + tag filter + aggregate. Ad-hoc queries across non-time dimensions ("which link was clicked most by Australian users in Q3?") are often slower or impossible without an additional data store or OLAP system.

No joins. TSDBs don't join time series data with relational data. Correlation between click events and user account data (stored in PostgreSQL) must happen in the application layer.

Learning curve. PromQL, Flux, and InfluxQL are powerful but distinct from SQL. Teams with deep SQL expertise face a learning investment.

Operational overhead. Running a TSDB alongside PostgreSQL is another system to monitor, back up, and upgrade. TimescaleDB avoids this by extending PostgreSQL, at the cost of some performance relative to purpose-built TSDBs.

Data loss tolerance. Many TSDBs are designed with the assumption that losing a few seconds of metrics in a crash is acceptable. For applications where every event must be recorded durably (financial transactions, billing events), a TSDB's durability guarantees may not be sufficient.

When to use a time series database

Use a TSDB when:

Write volume exceeds what PostgreSQL can absorb without degrading read performance
Your primary queries are time range + aggregation (sum, count, average, percentile over time windows)
You need automatic retention policies and downsampling
You're storing operational metrics (infrastructure, application performance)
You're recording IoT sensor data, financial tick data, or any other high-frequency measurement stream

Don't use a TSDB when:

Your time series data is low volume (a few thousand events per day) — PostgreSQL with time-based partitioning handles this fine
You need to join time series data with relational data frequently in the database layer
You need full ACID transactions on time series writes

The one thing to remember

A time series database is not just a fast place to put timestamps. It's a system built around the assumption that data is append-only, queries are always time-ranged, and compression ratios from delta encoding make storage 10–100x more efficient than row-based formats. The moment your write volume starts stressing a relational database with time-ordered data, and your queries look like "average X between time A and time B", you've outgrown the general-purpose database for this workload.

← Previous: Distributed File Systems: File Storage Across Many Machines — Distributed file systems spread file storage across many machines. Learn how HDFS, Ceph, and GlusterFS work and when...

→ Next: Vector Databases: Semantic Search and AI Memory — Vector databases power semantic search, recommendations, and LLM memory. Learn how embeddings work, what ANN search i...

Time Series Databases: Built for Metrics and Events

Systems Design

Time Series Databases: Built for Metrics and Events

The problem

The core idea

The analogy: a rainfall gauge vs a general ledger