Database Sharding — Data Partitioning Strategies When Your System Hits the Ceiling

Posted on: 4/18/2026 1:11:32 AM

1. Why do we need Database Sharding?

As an app grows from thousands to millions of users, a single-node database hits hard limits in storage, throughput, and latency. Vertical scaling (bigger hardware) only takes you so far — you can't buy infinite CPU for a single machine.

Database Sharding (a.k.a. horizontal partitioning) splits one large database into multiple smaller pieces called shards, each running on its own server. Every shard holds a subset of the full dataset but uses the same schema.

  • ~100× write throughput gain with the right shards
  • <10 ms query latency on a small shard
  • Theoretically unlimited storage scalability
  • ~70% of large systems use sharding (DB-Engines 2026)
graph TB
    Client[👤 Client Application] --> Router[🔀 Shard Router / Proxy]
    Router --> S1[📦 Shard 1<br/>User ID 1-1M]
    Router --> S2[📦 Shard 2<br/>User ID 1M-2M]
    Router --> S3[📦 Shard 3<br/>User ID 2M-3M]
    Router --> SN[📦 Shard N<br/>User ID ...]
    S1 --> R1[🔄 Replica 1a]
    S2 --> R2[🔄 Replica 2a]
    S3 --> R3[🔄 Replica 3a]
    style Client fill:#e94560,stroke:#fff,color:#fff
    style Router fill:#2c3e50,stroke:#fff,color:#fff
    style S1 fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style S2 fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style S3 fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style SN fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style R1 fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
    style R2 fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
    style R3 fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50

A basic Database Sharding architecture with a shard router and read replicas

2. Sharding strategies

2.1 Hash-based Sharding

Apply a hash function to the shard key to pick the destination shard. This is the most common strategy because it spreads data evenly.

shard_id = hash(shard_key) % number_of_shards

-- Example: user_id = 12345
-- hash(12345) = 7829431
-- shard_id = 7829431 % 4 = 3 → Shard 3

✅ Strengths

Even data distribution, no hotspots. PlanetScale recommends hash-based as the default strategy for most use cases because it's simple and effective.

⚠️ Weaknesses

Range queries are inefficient (they scan every shard). Adding/removing shards forces data to be rehashed — which is exactly why Consistent Hashing exists.
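The rehashing cost is easy to measure. Here is a minimal Python sketch (using MD5 purely as a stand-in for whatever hash your router uses) that counts how many keys change shards when the shard count grows from 4 to 5:

```python
import hashlib

def shard_for(key: int, num_shards: int) -> int:
    """Naive hash-based routing: hash the key, mod by shard count."""
    h = int(hashlib.md5(str(key).encode()).hexdigest(), 16)
    return h % num_shards

keys = range(100_000)
moved = sum(1 for k in keys if shard_for(k, 4) != shard_for(k, 5))
print(f"{moved / len(keys):.0%} of keys change shards going from 4 to 5")
# ≈ 80% of keys land on a different shard
```

Roughly 80% of keys move, which is exactly the resharding pain that consistent hashing (section 3) avoids.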

2.2 Range-based Sharding

Split data into contiguous ranges of the shard key. A natural fit for data with inherent ordering (timestamps, sequential IDs, geography).

  • Shard A: orders 2024-01 → 2024-06
  • Shard B: orders 2024-07 → 2024-12
  • Shard C: orders 2025-01 → 2025-06
  • Shard D: orders 2025-07 → 2026-04

Pros: very fast range queries (only 1-2 shards touched). Easy to reason about and operate.

Cons: prone to hotspots — the shard holding the newest data is hammered the most. For example, the "current month" order shard may absorb 90% of traffic.
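Routing for the ranges above can be sketched as a simple lookup. The RANGES table is a hypothetical in-memory copy of the shard map, matching the four shards listed above:

```python
from datetime import date

# Hypothetical shard map mirroring the ranges above: (start, end, shard)
RANGES = [
    (date(2024, 1, 1), date(2024, 6, 30), "shard_a"),
    (date(2024, 7, 1), date(2024, 12, 31), "shard_b"),
    (date(2025, 1, 1), date(2025, 6, 30), "shard_c"),
    (date(2025, 7, 1), date(2026, 4, 30), "shard_d"),
]

def shard_for_order(order_date: date) -> str:
    """Route an order to the single shard owning its date range."""
    for start, end, shard in RANGES:
        if start <= order_date <= end:
            return shard
    raise ValueError(f"no shard covers {order_date}")

# A query for March 2025 touches exactly one shard:
print(shard_for_order(date(2025, 3, 15)))  # shard_c
```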

2.3 Directory-based Sharding

Use a separate lookup table (directory) to map each shard key → shard location. The most flexible option, but the directory becomes a single point of failure.

graph LR
    App[Application] --> Dir[(📋 Directory<br/>Lookup Table)]
    Dir --> |"user_id 1-500K"| S1[Shard 1]
    Dir --> |"user_id 500K-800K"| S2[Shard 2]
    Dir --> |"user_id 800K-2M"| S3[Shard 3]
    Dir --> |"VIP users"| S4[Shard VIP]
    style App fill:#e94560,stroke:#fff,color:#fff
    style Dir fill:#2c3e50,stroke:#fff,color:#fff
    style S1 fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style S2 fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style S3 fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style S4 fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50

Directory-based sharding enables flexible mapping — even special logic (a VIP shard)

Directory-based is often paired with a cache (Redis, Memcached) to keep lookup latency low. Moving data between shards simply updates the directory — no rehashing required.
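A minimal sketch of directory-based routing, with a plain dict standing in for the Redis/Memcached cache (the function names here are illustrative, not any library's API):

```python
directory = {}   # authoritative mapping: user_id -> shard name
cache = {}       # stand-in for Redis/Memcached

def assign(user_id: int, shard: str) -> None:
    """Move a user to a shard: update the directory, invalidate the cache."""
    directory[user_id] = shard
    cache.pop(user_id, None)

def shard_for(user_id: int) -> str:
    if user_id in cache:          # fast path: cached lookup
        return cache[user_id]
    shard = directory[user_id]    # slow path: directory table
    cache[user_id] = shard
    return shard

assign(42, "shard_1")
print(shard_for(42))       # shard_1
assign(42, "shard_vip")    # moving a user is just a directory update
print(shard_for(42))       # shard_vip
```

Note that "moving" user 42 never rehashes anything: only the directory row changes.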

2.4 Geographic Sharding

Distribute data by the user's geographic location — keep it near where it's accessed. Especially important for global apps and data-residency regulations (GDPR, PDPA).

| Strategy | Even distribution | Range query | Resharding | Typical use case |
|---|---|---|---|---|
| Hash-based | ⭐⭐⭐ | ⭐ | ⭐ | User data, session store |
| Range-based | ⭐ | ⭐⭐⭐ | ⭐⭐ | Time-series, log data |
| Directory-based | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | Multi-tenant SaaS |
| Geographic | ⭐⭐ | ⭐⭐ | ⭐⭐ | Global apps, data residency |

3. Consistent Hashing & Virtual Nodes

The biggest issue with naïve hash-based sharding (hash(key) % N) is that adding or removing a shard forces most keys to move (rehash). Going from N = 4 to N = 5, about 80% of data must migrate — unacceptable in production.

Consistent Hashing addresses this by arranging the hash space into a circle (the hash ring). Each shard occupies a position on the ring, and each key is mapped to the nearest shard clockwise.

graph TB
    subgraph Ring["🔵 Hash Ring (0 → 2³²)"]
        direction TB
        N1["🟢 Node A<br/>position: 0°"]
        N2["🔴 Node B<br/>position: 90°"]
        N3["🟡 Node C<br/>position: 180°"]
        N4["🟣 Node D<br/>position: 270°"]
    end
    K1["Key X → hash: 45°"] -.->|"nearest clockwise"| N2
    K2["Key Y → hash: 200°"] -.->|"nearest clockwise"| N4
    K3["Key Z → hash: 350°"] -.->|"nearest clockwise"| N1
    style N1 fill:#4CAF50,stroke:#fff,color:#fff
    style N2 fill:#e94560,stroke:#fff,color:#fff
    style N3 fill:#ff9800,stroke:#fff,color:#fff
    style N4 fill:#9c27b0,stroke:#fff,color:#fff
    style K1 fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style K2 fill:#f8f9fa,stroke:#9c27b0,color:#2c3e50
    style K3 fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50

The Consistent Hashing Ring — adding/removing a node only affects keys on adjacent nodes

When Node E is added at position 135°, only keys in the (90°, 135°] range migrate from Node C to Node E: half of Node C's data, or about 12.5% of the whole system, not 80% of everything.

Virtual Nodes (VNodes)

In practice, a ring with only 4-5 physical nodes distributes data unevenly. Virtual Nodes fix this by having each physical node occupy multiple positions on the ring:

-- Node A holds 3 virtual nodes at:   0°, 120°, 240°
-- Node B holds 3 virtual nodes at:  40°, 160°, 280°
-- → distribution is far more uniform

-- When Node A is removed:
-- Data at   0° → moves to the nearest vnode (Node B at 40°)
-- Data at 120° → moves to Node C at 160°
-- Data at 240° → moves to Node B at 280°
-- → Load is evenly redistributed across the remaining nodes

💡 In practice

Cassandra uses 256 virtual nodes per physical node by default. Amazon's DynamoDB also uses consistent hashing with virtual nodes for partition management. For new designs, start with 128-256 VNodes per node to get an even distribution from day one.
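The ring with virtual nodes can be sketched in a few lines of Python. This is an illustrative implementation, not any particular library's API; MD5 stands in for the ring hash, and 128 vnodes per node follows the guidance above:

```python
import bisect
import hashlib

def _h(s: str) -> int:
    # MD5 as a stand-in ring hash (any uniform hash works)
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class HashRing:
    """Illustrative consistent-hash ring; each node owns `vnodes` positions."""

    def __init__(self, nodes, vnodes=128):
        self._ring = sorted((_h(f"{n}#{i}"), n)
                            for n in nodes for i in range(vnodes))
        self._positions = [pos for pos, _ in self._ring]

    def node_for(self, key: str) -> str:
        # First vnode clockwise from the key's hash (wrapping at the top)
        idx = bisect.bisect_right(self._positions, _h(key)) % len(self._ring)
        return self._ring[idx][1]

ring4 = HashRing(["node_a", "node_b", "node_c", "node_d"])
before = {k: ring4.node_for(k) for k in map(str, range(10_000))}

ring5 = HashRing(["node_a", "node_b", "node_c", "node_d", "node_e"])
moved = sum(1 for k, n in before.items() if ring5.node_for(k) != n)
print(f"{moved / len(before):.0%} of keys moved after adding a 5th node")
```

Adding a fifth node moves only about 20% of keys (the new node's fair share), versus roughly 80% with naive modulo hashing.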

4. The art of picking a Shard Key

Picking the shard key is the single most important decision when designing sharding — a bad choice creates hotspots, spreads cross-shard queries everywhere, and is extremely expensive to fix. The core criteria:

4.1 High Cardinality

The shard key must have many distinct values. user_id (millions of values) is far better than country_code (~200 values). Low cardinality means data concentrates in a handful of shards.

4.2 Even Distribution

Data must spread evenly across shards. For instance, created_date is a poor shard key for range-based sharding because new data always piles onto the last shard.

4.3 Query Pattern Alignment

The shard key should match your most common WHERE clause. If 90% of queries filter by tenant_id, that's your shard key — each query only hits a single shard.

4.4 Stability (Immutable)

The shard key should never change. If a user updates their email and email is the shard key, that row must migrate to another shard — expensive and error-prone.

| Shard Key | Cardinality | Distribution | Stability | Assessment |
|---|---|---|---|---|
| user_id | Very high | Even (hash) | Immutable | ⭐ Best for multi-user apps |
| tenant_id | Medium | Can be skewed | Immutable | ⭐ Good for SaaS, monitor size |
| order_id | Very high | Even | Immutable | ⭐ Good for e-commerce |
| created_at | High | Skewed (new = hot) | Immutable | ⚠️ Only good for archive data |
| country | Low (~200) | Very skewed | Can change | ❌ Avoid using alone |
| email | High | Even | Can change | ❌ Not stable |

⚠️ Compound shard keys

When no single key is perfect, use a compound key. Example: (tenant_id, user_id) keeps a tenant's data on the same shard (good for isolation) while hashing on user_id distributes load evenly within a large tenant.
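One way to sketch this idea (an illustrative two-level scheme with assumed sizing, not any product's API): the tenant hash picks a shard group, giving tenant isolation, while the user hash picks a shard within that group, spreading a large tenant's load:

```python
import hashlib

GROUPS = 4             # shard groups, one per tenant bucket (assumed sizing)
SHARDS_PER_GROUP = 4   # shards inside each group (assumed sizing)

def _h(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def shard_for(tenant_id: int, user_id: int) -> tuple:
    """Tenant picks the group (isolation); user hash spreads load within it."""
    group = _h(f"t:{tenant_id}") % GROUPS
    member = _h(f"u:{user_id}") % SHARDS_PER_GROUP
    return (group, member)

# All of tenant 7's rows stay inside one group, spread across its shards:
g1, _ = shard_for(7, 123)
g2, _ = shard_for(7, 456)
assert g1 == g2
```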

5. Challenges of sharding

5.1 Cross-Shard Queries

Queries that need data across multiple shards are the biggest challenge. E.g. SELECT * FROM orders WHERE product_id = 42 when the shard key is user_id — you must scatter the query to EVERY shard and gather the results.

sequenceDiagram
    participant App as Application
    participant Router as Shard Router
    participant S1 as Shard 1
    participant S2 as Shard 2
    participant S3 as Shard 3

    App->>Router: SELECT * FROM orders WHERE product_id = 42
    Router->>S1: Forward query
    Router->>S2: Forward query
    Router->>S3: Forward query
    S1-->>Router: 15 rows
    S2-->>Router: 8 rows
    S3-->>Router: 22 rows
    Router->>Router: Merge + Sort + Limit
    Router-->>App: Final result (45 rows)

The scatter-gather pattern — cross-shard queries fan out to every shard

Solutions: design the shard key around your primary query pattern. If you need to query along multiple dimensions, use denormalization (duplicate data across shards) or a separate secondary-index shard.
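The scatter-gather flow in the diagram above can be sketched in Python. SHARDS and query_shard are in-memory stand-ins for real shard connections; the router fans out in parallel, then merges, sorts, and limits:

```python
from concurrent.futures import ThreadPoolExecutor

# In-memory stand-ins for three shards of the orders table
SHARDS = {
    "shard_1": [{"order_id": 1, "product_id": 42, "total": 90}],
    "shard_2": [{"order_id": 7, "product_id": 13, "total": 20}],
    "shard_3": [{"order_id": 9, "product_id": 42, "total": 55}],
}

def query_shard(shard: str, product_id: int) -> list:
    """Stand-in for running the WHERE product_id = ? query on one shard."""
    return [row for row in SHARDS[shard] if row["product_id"] == product_id]

def scatter_gather(product_id: int, limit: int = 10) -> list:
    # Scatter: query every shard in parallel
    with ThreadPoolExecutor() as pool:
        parts = pool.map(lambda s: query_shard(s, product_id), SHARDS)
    # Gather: merge, sort, limit on the router
    merged = [row for part in parts for row in part]
    return sorted(merged, key=lambda r: r["total"], reverse=True)[:limit]

print([r["order_id"] for r in scatter_gather(42)])  # [1, 9]
```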

5.2 Distributed Transactions

Transactions spanning multiple shards require distributed protocols like 2-Phase Commit (2PC) — slow and complex. Prefer designs where every transaction fits within a single shard.

-- ❌ Cross-shard transaction (slow, complex)
BEGIN DISTRIBUTED TRANSACTION
  UPDATE shard_1.accounts SET balance = balance - 100 WHERE user_id = 1;
  UPDATE shard_3.accounts SET balance = balance + 100 WHERE user_id = 999;
COMMIT

-- ✅ Better design: use a Saga pattern
-- Step 1: Debit user 1 (shard 1) → emit an event
-- Step 2: Credit user 999 (shard 3) → consume the event
-- Step 3: If step 2 fails → compensating transaction on shard 1
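The saga steps above can be sketched as follows; debit, credit, and the in-memory balances map are hypothetical stand-ins for single-shard local transactions:

```python
# Each (shard, user) pair stands in for a row committed locally on one shard
balances = {("shard_1", 1): 500, ("shard_3", 999): 100}

def debit(shard, user, amount):
    if balances[(shard, user)] < amount:
        raise ValueError("insufficient funds")
    balances[(shard, user)] -= amount

def credit(shard, user, amount):
    balances[(shard, user)] += amount

def transfer(amount):
    debit("shard_1", 1, amount)           # step 1: local txn on shard 1
    try:
        credit("shard_3", 999, amount)    # step 2: local txn on shard 3
    except Exception:
        credit("shard_1", 1, amount)      # compensating txn: refund shard 1
        raise

transfer(100)
print(balances)  # {('shard_1', 1): 400, ('shard_3', 999): 200}
```

No lock is held across shards at any point; consistency is restored by the compensating step instead of a coordinator.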

5.3 Resharding (Rebalancing)

As data grows and you need more shards, moving data between shards (resharding) is extremely expensive. The usual steps:

  1. Double-write: write to both old and new shards in parallel
  2. Backfill: copy historical data from the old shard to the new one
  3. Verify: compare checksums between the two sides
  4. Cutover: switch traffic to the new shard schema
  5. Cleanup: delete the now-redundant data from the old shard

✅ Pre-split

If you need 3 shards today, start with 4 (a power of 2): doubling a power-of-two shard count lets each shard split cleanly in half under hash % N, so data only splits and never reshuffles across unrelated shards. Many teams start with 8-16 logical shards even when only 2-3 physical servers are needed — each server hosts multiple logical shards, and scaling just means moving a logical shard to a new machine.
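A sketch of that logical-shard layout, assuming 16 logical shards and hypothetical server names: keys always hash to the same logical shard, and scaling out only edits the logical-to-physical mapping:

```python
import hashlib

LOGICAL_SHARDS = 16  # fixed forever; key routing never changes

# Two physical servers initially host 8 logical shards each
logical_to_physical = {ls: f"server_{ls % 2}" for ls in range(LOGICAL_SHARDS)}

def route(key: str) -> str:
    """key -> logical shard is stable; only the physical mapping moves."""
    ls = int(hashlib.md5(key.encode()).hexdigest(), 16) % LOGICAL_SHARDS
    return logical_to_physical[ls]

# Scale out: hand three logical shards to a new machine, copy their data,
# then flip the mapping. No key is ever rehashed.
for ls in (0, 3, 6):
    logical_to_physical[ls] = "server_2"
```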

5.4 Unique Constraints Across Shards

Enforcing a global UNIQUE constraint (e.g. unique email across all shards) can't use a simple database constraint. Options:

  • Separate uniqueness table: a small centralized table mapping (email → shard_id)
  • Globally unique IDs: Snowflake IDs, UUIDs, or ULIDs that encode shard info into the ID
  • Application-level check: query before insert, accept the rare race condition
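As one sketch of the globally-unique-ID option, here is a Snowflake-style generator that packs a shard id into the ID itself. The 41/10/12-bit layout and the epoch are assumptions for illustration, not any vendor's exact format:

```python
import threading
import time

EPOCH_MS = 1_700_000_000_000  # assumed custom epoch

class IdGenerator:
    """Sketch: 41 bits of ms timestamp | 10 bits shard id | 12 bits sequence."""

    def __init__(self, shard_id: int):
        assert 0 <= shard_id < 1024
        self.shard_id = shard_id
        self.last_ms = -1
        self.seq = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000) - EPOCH_MS
            if now == self.last_ms:
                self.seq = (self.seq + 1) & 0xFFF  # same ms: bump sequence
            else:
                self.last_ms, self.seq = now, 0
            return (now << 22) | (self.shard_id << 12) | self.seq

def shard_of(snowflake_id: int) -> int:
    """Recover the shard id from the middle 10 bits."""
    return (snowflake_id >> 12) & 0x3FF

gen = IdGenerator(shard_id=7)
assert shard_of(gen.next_id()) == 7
```

Because the shard id rides inside the ID, any service can route a lookup without consulting a directory.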

6. Comparing solutions: Vitess, Citus, CockroachDB, TiDB

As of 2026, four sharding solutions stand out — each serving a different use case:

| Criterion | Vitess | Citus | CockroachDB | TiDB |
|---|---|---|---|---|
| Foundation | MySQL middleware | PostgreSQL extension | Custom (Postgres-compatible) | Custom (MySQL-compatible) |
| Sharding model | Explicit (manual key) | Distributed tables | Automatic range-based | Automatic (region-based) |
| Resharding | Manual, online | Rebalancer tool | Automatic | Automatic (split/merge) |
| Cross-shard queries | Limited | Good | Very good | Very good |
| Consistency | Per-shard (MySQL) | Per-node (Postgres) | Strong (Raft) | Strong (Raft) |
| HTAP | No | Columnar addon | No | Yes (TiFlash) |
| Deployment | Kubernetes-native | Managed + self-host | Managed + self-host | Managed + self-host |
| Origin | YouTube/Google | Microsoft (Azure) | Cockroach Labs | PingCAP |
| Best for | Controlled MySQL scale-out | Postgres + analytics | Multi-region, Postgres-first | MySQL-first, HTAP workloads |
graph TB
    subgraph Decision["🤔 Which solution to pick?"]
        Start{Current database?}
        Start -->|MySQL| MySQL_Q{Need auto-sharding?}
        Start -->|PostgreSQL| PG_Q{Need multi-region?}
        Start -->|Greenfield| New_Q{Top priority?}

        MySQL_Q -->|Yes| TiDB_R[✅ TiDB]
        MySQL_Q -->|No, want control| Vitess_R[✅ Vitess]

        PG_Q -->|Yes| CRDB_R[✅ CockroachDB]
        PG_Q -->|No| Citus_R[✅ Citus]

        New_Q -->|HTAP| TiDB_R2[✅ TiDB]
        New_Q -->|Multi-region| CRDB_R2[✅ CockroachDB]
        New_Q -->|Postgres ecosystem| Citus_R2[✅ Citus]
    end

    style Start fill:#e94560,stroke:#fff,color:#fff
    style MySQL_Q fill:#2c3e50,stroke:#fff,color:#fff
    style PG_Q fill:#2c3e50,stroke:#fff,color:#fff
    style New_Q fill:#2c3e50,stroke:#fff,color:#fff
    style TiDB_R fill:#4CAF50,stroke:#fff,color:#fff
    style TiDB_R2 fill:#4CAF50,stroke:#fff,color:#fff
    style Vitess_R fill:#4CAF50,stroke:#fff,color:#fff
    style CRDB_R fill:#4CAF50,stroke:#fff,color:#fff
    style CRDB_R2 fill:#4CAF50,stroke:#fff,color:#fff
    style Citus_R fill:#4CAF50,stroke:#fff,color:#fff
    style Citus_R2 fill:#4CAF50,stroke:#fff,color:#fff

Decision tree for choosing the right sharding solution

Case study: Ninja Van picked TiDB over Vitess

Ninja Van — a Southeast Asian logistics platform handling millions of orders daily — migrated from MySQL to TiDB instead of choosing Vitess or CockroachDB. Key reasons:

  • TiDB is MySQL-protocol-compatible → low migration cost
  • Auto-sharding without having to define shard keys by hand
  • TiFlash lets them run analytics directly on operational data (HTAP)
  • Horizontal scaling on Kubernetes without manual resharding

7. When you should NOT shard

Sharding adds significant complexity. Before going there, make sure you've exhausted simpler options:

Step 1: Query optimization
Proper indexes, rewrite slow queries, use EXPLAIN ANALYZE. Most DB bottlenecks come from bad queries, not missing hardware.
Step 2: Read replicas
80% of apps are read-heavy. Adding a read replica can boost throughput 3-5× without sharding. Every managed database (RDS, Cloud SQL) supports it out of the box.
Step 3: Caching layer
Redis/Memcached in front of the DB cuts 70-90% of read traffic. Pair with a CDN for static content. Low cost, fast to ship.
Step 4: Vertical scaling
Upgrade to 64 vCPU, 256 GB RAM, NVMe SSD. Cloud providers let you scale up in minutes. PostgreSQL on good hardware handles millions of rows per second.
Step 5: Table partitioning
Partitioning WITHIN a single database (not sharding). PostgreSQL native partitioning or MySQL partition by range/hash — much simpler than sharding.
Step 6: Sharding
Only when ALL the above aren't enough: a single-node DB genuinely can't sustain the write throughput, storage exceeds the TB range, or you need multi-region data locality.

⚠️ Premature sharding is an anti-pattern

Many teams shard too early "just in case". In reality, a single-node PostgreSQL on modern (2026) hardware comfortably handles several TB of data and tens of thousands of transactions per second. Sharding before you need it creates operational overhead without adding value.

8. Conclusion

Database Sharding is a powerful tool — but comes with a major complexity tradeoff. Remember:

  • Hash-based is the best default strategy for most use cases
  • Consistent Hashing with Virtual Nodes solves the resharding problem
  • The Shard Key must have high cardinality, be stable, and align with the primary query pattern
  • Cross-shard queries are always expensive — design the schema to minimize them
  • Modern solutions like TiDB and CockroachDB provide auto-sharding, reducing operational burden
  • Always try query optimization → read replicas → caching → vertical scaling before reaching for sharding

In the distributed-systems era of 2026, understanding sharding isn't just about applying it — it's about knowing when not to. That's the mark of a mature systems architect.
