Multi-Tier Caching Strategy: From Browser to Database for High-Performance Applications
Posted on: 4/25/2026 2:17:49 PM
Table of contents
- 1. Why caching is the backbone of every high-performance system
- 2. Multi-tier caching architecture — 4 layers of defense
- 3. Tier 1: Browser Cache — zero network, zero cost
- 4. Tier 2: CDN Edge Cache — reducing latency by geography
- 5. Tier 3: Application Cache — the heart of caching strategy
- 6. Tier 4: Database Cache — optimizing at the source
- 7. Cache Invalidation — "only two hard things in CS"
- 8. Cache Stampede & Thundering Herd — proactive defense
- 9. Monitoring & Observability for Cache
- 10. Production Checklist
- 11. Conclusion
- 12. References
1. Why caching is the backbone of every high-performance system
In any System Design interview, the answer to "how to make the system faster?" almost always starts with caching. But in practice, caching is not simply "add Redis." A truly high-performance system implements caching at multiple tiers, each addressing a different layer of latency. Misunderstanding which tier should cache what leads to wasted resources, stale data, or worse — cache stampedes that bring down the entire system.
This article analyzes multi-tier caching architecture from a practical perspective: 4 cache layers from browser to database, 5 invalidation strategies, and how HybridCache (introduced in .NET 9, stable in .NET 10) largely solves the L1/L2 caching problem.
2. Multi-tier caching architecture — 4 layers of defense
Each cache tier acts as a protective layer, intercepting requests before they fall through to a more expensive layer below. The closer to the user, the lower the latency — but the more limited the capacity and the harder the invalidation.
flowchart TB
User["👤 User / Browser"]
BCache["🖥️ Browser Cache
Cache-Control, ETag, Service Worker
~0ms latency"]
CDN["🌐 CDN Edge Cache
Cloudflare, CloudFront
~10-30ms latency"]
AppCache["⚡ Application Cache
IMemoryCache / HybridCache / Redis
~1-50ms latency"]
DB["🗄️ Database
Query Cache, Materialized View
~100-500ms latency"]
User -->|"Request"| BCache
BCache -->|"Cache Miss"| CDN
CDN -->|"Cache Miss"| AppCache
AppCache -->|"Cache Miss"| DB
DB -->|"Response"| AppCache
AppCache -->|"Response + Cache"| CDN
CDN -->|"Response + Cache"| BCache
BCache -->|"Response"| User
style User fill:#e94560,stroke:#fff,color:#fff
style BCache fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style CDN fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
style AppCache fill:#f8f9fa,stroke:#2196F3,color:#2c3e50
style DB fill:#2c3e50,stroke:#fff,color:#fff
Figure 1: Multi-tier caching architecture — requests flow top-down, each tier intercepts before hitting the more expensive layer below
| Tier | Latency | Capacity | Scope | Invalidation |
|---|---|---|---|---|
| Browser Cache | ~0ms | ~50-300MB | Single user, single device | Difficult (depends on TTL, user action) |
| CDN Edge | 10-30ms | TB-level | All users in region | API purge, TTL, tag-based |
| Application | 1-50ms | GB-level | All requests to cluster | Proactive (event, TTL, tag) |
| Database | 100-500ms | Disk-level | Query layer | Automatic on data change |
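To see how the tiers compound, here is a back-of-envelope expected-latency calculation in C#. The per-tier hit rates and latencies are illustrative assumptions, not measurements:

```csharp
// Expected latency = sum over tiers of P(served at tier i) * latency of tier i
// Hit rates and latencies below are illustrative assumptions
double[] servedAtTier = { 0.60, 0.25, 0.10, 0.05 };  // browser, CDN, app cache, DB
double[] latencyMs    = { 0,    20,   5,    200  };  // representative per-tier latency

double expectedMs = 0;
for (int i = 0; i < servedAtTier.Length; i++)
    expectedMs += servedAtTier[i] * latencyMs[i];

Console.WriteLine($"Expected latency: {expectedMs} ms");  // 0 + 5 + 0.5 + 10 = 15.5 ms
```

Even with only a 60% browser hit rate, the expected latency stays far below the ~200ms that every request would pay if it went straight to the database.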
3. Tier 1: Browser Cache — zero network, zero cost
Browser cache is the fastest tier because it requires absolutely no network request. The browser stores responses on disk/memory based on HTTP headers returned by the server.
3.1. Cache-Control — the brain of browser caching
The Cache-Control header determines whether the browser can cache, for how long, and whether revalidation with the server is required:
// Static assets — long cache, immutable
Cache-Control: public, max-age=31536000, immutable
// API response — short cache, must revalidate
Cache-Control: private, max-age=0, must-revalidate
// HTML page — conditional caching
Cache-Control: public, max-age=300, stale-while-revalidate=60
stale-while-revalidate — "serve stale, update silently"
stale-while-revalidate=60 allows the browser to immediately return the stale cached version to the user, while simultaneously sending a background request to the server for the fresh version. The user sees an instant page load, and the fresh version will be ready for next time. This is the ideal pattern for blog pages, product catalogs — where data changes infrequently but still needs updating.
3.2. ETag — smart validation without full reload
When max-age expires, the browser sends If-None-Match with the ETag. The server compares: if content hasn't changed, it returns 304 Not Modified (a few bytes) instead of resending the entire response (potentially hundreds of KB). Significant bandwidth savings for APIs with large payloads.
// First response
HTTP/1.1 200 OK
ETag: "a1b2c3d4e5"
Cache-Control: max-age=300
// Next request after cache expires
GET /api/products HTTP/1.1
If-None-Match: "a1b2c3d4e5"
// Server: data unchanged
HTTP/1.1 304 Not Modified
3.3. Service Worker — programmable cache layer
Service Workers allow fully customizable cache logic in JavaScript. Common in PWAs, Service Workers intercept all fetch requests and decide: serve from cache, from network, or combine both (stale-while-revalidate pattern at the code level).
// sw.js — Cache-first for static, Network-first for API
self.addEventListener('fetch', (event) => {
if (event.request.url.includes('/api/')) {
// Network-first for API calls
event.respondWith(
fetch(event.request)
.then(response => {
const clone = response.clone();
caches.open('api-v1').then(cache =>
cache.put(event.request, clone));
return response;
})
.catch(() => caches.match(event.request))
);
} else {
// Cache-first for static assets
event.respondWith(
caches.match(event.request)
.then(cached => cached || fetch(event.request))
);
}
});
4. Tier 2: CDN Edge Cache — reducing latency by geography
CDNs place content copies at hundreds of edge servers worldwide. When a user in Vietnam makes a request, the CDN serves from the Singapore node instead of the US origin server — reducing latency from 200ms to 20ms.
4.1. Cloudflare — smart cache rules for free
Cloudflare provides powerful cache rules even on the Free plan. You can cache by path, query string, header, and set different TTLs for each content type:
# Example Cloudflare Cache Rules (via Dashboard or API)
# Rule 1: Cache static assets for 1 year
URI Path matches "/assets/*"
→ Cache eligible, Edge TTL: 365 days, Browser TTL: 365 days
# Rule 2: Cache API responses for 5 minutes
URI Path matches "/api/products*"
→ Cache eligible, Edge TTL: 300s, Browser TTL: 0
→ Cache Key: include query string
# Rule 3: Bypass cache for authenticated requests
Cookie contains "auth_token"
→ Bypass cache
Cache Key — determining "same" vs "different"
The cache key determines whether two requests receive the same cached response. By default, CDNs use URL + query string as the cache key. But if you serve different content based on headers (e.g., Accept-Language for multilingual support), you must include that header in the cache key — otherwise, a Vietnamese user might receive an English cached version.
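The standard HTTP mechanism for this is the Vary response header; any compliant cache, CDN or browser, will then include the listed request headers in the cache key:

```
HTTP/1.1 200 OK
Content-Language: vi
Vary: Accept-Language
Cache-Control: public, max-age=300
```

Be sparing with Vary: every additional header multiplies the number of cache variants and lowers the hit rate.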
4.2. Cache Purge and Tag-based Invalidation
CDNs allow cache purging by specific URL, prefix (purge all /api/products/*), or by cache tag. Cloudflare Enterprise and AWS CloudFront both support tag-based purge — when a product changes, purging tag product-123 clears all related responses across all edge servers within seconds.
sequenceDiagram
participant App as Application
participant CDN as CDN Edge
participant User as Users
App->>CDN: Response + Cache-Tag: product-123
CDN->>User: Serve cached response
Note over App: Product 123 updated
App->>CDN: Purge cache tag "product-123"
CDN->>CDN: Invalidate all entries with tag
User->>CDN: Next request
CDN->>App: Cache MISS → fetch fresh
App->>CDN: New response + Cache-Tag: product-123
CDN->>User: Serve fresh response
Figure 2: Tag-based cache invalidation flow — purge by tag instead of individual URLs
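As a concrete sketch, a tag purge on Cloudflare is a single API call (purge-by-tag requires an Enterprise plan; ZONE_ID and API_TOKEN are placeholders for your zone and credentials):

```bash
# Purge every cached response tagged "product-123" across all edge servers
curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/purge_cache" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"tags": ["product-123"]}'
```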
5. Tier 3: Application Cache — the heart of caching strategy
This is the cache tier you have full control over, residing either in-process or in a distributed store. On .NET, the caching ecosystem has evolved through three generations:
5.1. IMemoryCache — simple but with traps
IMemoryCache stores objects directly in process memory, with nanosecond-level retrieval. Suitable for single-instance deployments or data that only needs local caching (config, small lookup tables).
// Register in Program.cs
builder.Services.AddMemoryCache();
// Usage
public class ProductService
{
private readonly IMemoryCache _cache;
private readonly IProductRepository _repo;
public ProductService(IMemoryCache cache, IProductRepository repo)
{
_cache = cache;
_repo = repo;
}
public async Task<Product?> GetByIdAsync(int id)
{
var key = $"product:{id}";
if (_cache.TryGetValue(key, out Product? cached))
return cached;
var product = await _repo.GetByIdAsync(id);
if (product is not null)
{
_cache.Set(key, product, new MemoryCacheEntryOptions
{
AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(30),
SlidingExpiration = TimeSpan.FromMinutes(10),
Size = 1 // if using SizeLimit
});
}
return product;
}
}
The Cache Stampede trap with IMemoryCache
When a cache entry expires and 100 concurrent requests arrive, all 100 see a cache miss and simultaneously hit the database. With IMemoryCache, you must handle this yourself using SemaphoreSlim or Lazy<Task>. This is exactly why HybridCache was created — stampede protection is built in.
5.2. IDistributedCache — shared but raw
When running multiple instances (Kubernetes, load balancer), IMemoryCache on each pod is independent. A user sets cache on Pod 1, but the next request hits Pod 2 → cache miss. IDistributedCache with Redis solves this:
// Program.cs
builder.Services.AddStackExchangeRedisCache(options =>
{
options.Configuration = "localhost:6379";
options.InstanceName = "myapp:";
});
// Usage — raw API, manual serialization
// (_cache is an injected IDistributedCache, _repo the repository)
public async Task<Product?> GetByIdAsync(int id)
{
var key = $"product:{id}";
var bytes = await _cache.GetAsync(key);
if (bytes is not null)
return JsonSerializer.Deserialize<Product>(bytes);
var product = await _repo.GetByIdAsync(id);
if (product is not null)
{
var json = JsonSerializer.SerializeToUtf8Bytes(product);
await _cache.SetAsync(key, json, new DistributedCacheEntryOptions
{
AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(30)
});
}
return product;
}
The downsides are clear: the API works with byte[], requiring manual serialization/deserialization, no stampede protection, and every cache read goes through the network (even on hit).
5.3. HybridCache — the future of caching on .NET
HybridCache (GA since .NET 9, stable in .NET 10) combines the best of both: L1 in-memory for speed, L2 distributed for consistency, plus stampede protection and tag-based invalidation:
flowchart LR
Request["Request"] --> HC["HybridCache
GetOrCreateAsync"]
HC --> L1{"L1 Memory
Cache Hit?"}
L1 -->|"Hit"| Return["Return
~nanoseconds"]
L1 -->|"Miss"| L2{"L2 Redis
Cache Hit?"}
L2 -->|"Hit"| PopL1["Populate L1
Return ~ms"]
L2 -->|"Miss"| Factory["Factory Method
(DB Query)"]
Factory --> PopBoth["Populate L1 + L2
Return"]
style Request fill:#e94560,stroke:#fff,color:#fff
style HC fill:#2c3e50,stroke:#fff,color:#fff
style L1 fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
style L2 fill:#f8f9fa,stroke:#2196F3,color:#2c3e50
style Factory fill:#f8f9fa,stroke:#ff9800,color:#2c3e50
style Return fill:#4CAF50,stroke:#fff,color:#fff
style PopL1 fill:#4CAF50,stroke:#fff,color:#fff
style PopBoth fill:#4CAF50,stroke:#fff,color:#fff
Figure 3: HybridCache flow — L1 memory → L2 Redis → Factory, with stampede protection at each tier
// Program.cs — setup HybridCache
builder.Services.AddHybridCache(options =>
{
options.DefaultEntryOptions = new HybridCacheEntryOptions
{
Expiration = TimeSpan.FromMinutes(30),
LocalCacheExpiration = TimeSpan.FromMinutes(5)
};
options.MaximumPayloadBytes = 1024 * 1024; // 1MB limit
});
// Add Redis as L2 backend
builder.Services.AddStackExchangeRedisCache(options =>
{
options.Configuration = "localhost:6379";
});
// Usage — clean API, auto serialize, stampede protection
public class ProductService(HybridCache cache, IProductRepository repo)
{
public async Task<Product?> GetByIdAsync(int id,
CancellationToken ct = default)
{
return await cache.GetOrCreateAsync(
$"product:{id}",
async cancel => await repo.GetByIdAsync(id, cancel),
new HybridCacheEntryOptions
{
Expiration = TimeSpan.FromMinutes(30),
LocalCacheExpiration = TimeSpan.FromMinutes(5)
},
tags: ["products", $"product:{id}"],
cancellationToken: ct
);
}
public async Task UpdateAsync(Product product,
CancellationToken ct = default)
{
await repo.UpdateAsync(product, ct);
// Invalidate by tag — removes all related entries
await cache.RemoveByTagAsync($"product:{product.Id}", ct);
}
}
| Feature | IMemoryCache | IDistributedCache | HybridCache |
|---|---|---|---|
| L1 (in-process) | ✅ | ❌ | ✅ |
| L2 (distributed) | ❌ | ✅ | ✅ |
| Stampede protection | ❌ | ❌ | ✅ |
| Tag-based invalidation | ❌ | ❌ | ✅ |
| Auto serialization | N/A (object ref) | ❌ (byte[]) | ✅ |
| Multi-instance safe | ❌ | ✅ | ✅ |
| Latency (hit) | ~ns | ~ms | ~ns (L1) / ~ms (L2) |
6. Tier 4: Database Cache — optimizing at the source
Even with application caching intercepting most requests, some always reach the database. Optimizing this tier reduces latency on cache misses and eases pressure during mass cache invalidations.
6.1. Query Plan Cache
Both SQL Server and PostgreSQL cache execution plans for prepared statements. Using parameterized queries instead of string concatenation not only prevents SQL injection but also allows the database to reuse compiled plans:
// ❌ Each id value creates a new plan
var sql = $"SELECT * FROM Products WHERE Id = {id}";
// ✅ Plan is cached and reused
var sql = "SELECT * FROM Products WHERE Id = @Id";
cmd.Parameters.AddWithValue("@Id", id);
6.2. Materialized View — pre-computed data
Instead of running complex aggregate queries each time, Materialized Views store pre-computed results. PostgreSQL supports REFRESH MATERIALIZED VIEW CONCURRENTLY for non-blocking refreshes. SQL Server uses Indexed Views for similar purposes.
-- PostgreSQL: Create materialized view for dashboard stats
CREATE MATERIALIZED VIEW product_stats AS
SELECT
category_id,
COUNT(*) as total_products,
AVG(price) as avg_price,
MAX(updated_at) as last_updated
FROM products
GROUP BY category_id;
-- CONCURRENTLY requires a unique index on the materialized view
CREATE UNIQUE INDEX product_stats_category_idx ON product_stats (category_id);
-- Concurrent refresh, non-blocking reads
REFRESH MATERIALIZED VIEW CONCURRENTLY product_stats;
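The refresh can also be scheduled inside the database. A sketch using the pg_cron extension (an assumption; if pg_cron is not installed, any external scheduler works just as well):

```sql
-- Refresh the stats every 5 minutes (requires the pg_cron extension)
SELECT cron.schedule(
  'refresh-product-stats',          -- job name
  '*/5 * * * *',                    -- standard cron syntax
  'REFRESH MATERIALIZED VIEW CONCURRENTLY product_stats'
);
```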
7. Cache Invalidation — "only two hard things in CS"
Phil Karlton once said: "There are only two hard things in Computer Science: cache invalidation and naming things." Understanding the five caching patterns below, each with its own invalidation behavior, helps you choose the right one for each use case:
flowchart TB
subgraph CacheAside["Cache-Aside (Lazy Loading)"]
CA1["App checks cache"] --> CA2{"Hit?"}
CA2 -->|"Yes"| CA3["Return"]
CA2 -->|"No"| CA4["Query DB"]
CA4 --> CA5["Write to cache"]
CA5 --> CA3
end
subgraph WriteThrough["Write-Through"]
WT1["App writes data"] --> WT2["Write to cache"]
WT2 --> WT3["Cache writes to DB"]
WT3 --> WT4["Confirm"]
end
subgraph WriteBehind["Write-Behind (Write-Back)"]
WB1["App writes data"] --> WB2["Write to cache"]
WB2 --> WB3["Confirm immediately"]
WB2 -.->|"Async"| WB4["Cache writes DB later"]
end
style CacheAside fill:#f8f9fa,stroke:#e94560
style WriteThrough fill:#f8f9fa,stroke:#4CAF50
style WriteBehind fill:#f8f9fa,stroke:#2196F3
Figure 4: Three main cache write strategy patterns
| Pattern | Read perf | Write perf | Consistency | Best for |
|---|---|---|---|---|
| Cache-Aside | Slow first time (cold start) | Fast (DB only) | Eventual | 95% of use cases — the default choice |
| Read-Through | Like Cache-Aside but cache self-fetches | N/A | Eventual | When you want cache layer to self-manage |
| Write-Through | Always fast (data available) | Slower (sync write to both) | Strong | Critical data requiring high consistency |
| Write-Behind | Always fast | Very fast (async) | Eventual (data loss risk) | High-write throughput, accepting risk |
| Write-Around | Slow first time | Fast (bypass cache) | Eventual | Data rarely read after write |
Quick pattern selection guide
Default: Cache-Aside + TTL safety net. 95% of web applications work well with this.
Need strong consistency: Write-Through (e.g., inventory count, account balance).
Write throughput is priority: Write-Behind (e.g., analytics events, view counts).
Data rarely read back: Write-Around (e.g., audit logs, notification history).
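In application code, without a cache proxy that writes to the database itself, Write-Through is usually approximated by updating the database and the cache synchronously in the same request. A minimal sketch (InventoryService and its repository are illustrative names, not part of any framework):

```csharp
// Write-Through (app-managed approximation): persist first, then update
// the cache before acknowledging the write, so reads always find fresh data.
public class InventoryService(IDistributedCache cache, IInventoryRepository repo)
{
    public async Task SetStockAsync(int productId, int quantity,
        CancellationToken ct = default)
    {
        // 1. The database is the source of truth, so write it first
        await repo.SetStockAsync(productId, quantity, ct);

        // 2. Synchronous cache update: the caller is not acknowledged
        //    until both writes succeed (this is the consistency cost)
        await cache.SetAsync(
            $"stock:{productId}",
            JsonSerializer.SerializeToUtf8Bytes(quantity),
            new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(30)
            },
            ct);
    }
}
```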
8. Cache Stampede & Thundering Herd — proactive defense
Cache stampede occurs when a popular cache entry expires and hundreds of concurrent requests discover the miss simultaneously, all querying the database at once. This is the #1 cause of caching-related system crashes.
8.1. Prevention techniques
// Technique 1: Mutex/Lock — only 1 request fetches, others wait
// Note: a single static semaphore serializes misses for ALL keys;
// in production, use one lock per key (e.g. ConcurrentDictionary<string, SemaphoreSlim>)
private static readonly SemaphoreSlim _lock = new(1, 1);
public async Task<Product?> GetWithLockAsync(int id)
{
var key = $"product:{id}";
if (_cache.TryGetValue(key, out Product? cached))
return cached;
await _lock.WaitAsync();
try
{
// Double-check after acquiring lock
if (_cache.TryGetValue(key, out cached))
return cached;
var product = await _repo.GetByIdAsync(id);
_cache.Set(key, product, TimeSpan.FromMinutes(30));
return product;
}
finally { _lock.Release(); }
}
// Technique 2: Probabilistic Early Expiration
// Cache entry "voluntarily" refreshes before actual expiry
public async Task<Product?> GetWithEarlyRefreshAsync(int id)
{
var key = $"product:{id}";
// CacheEntry<T> is a custom wrapper holding Value, ExpiresAt and the original Ttl
var entry = _cache.Get<CacheEntry<Product>>(key);
if (entry is not null)
{
var timeToExpiry = entry.ExpiresAt - DateTime.UtcNow;
var totalTtl = entry.Ttl;
// Refresh probability increases as expiry approaches
var probability = Math.Exp(-timeToExpiry / totalTtl * 10);
if (Random.Shared.NextDouble() > probability)
return entry.Value;
}
// Fetch and re-cache
var product = await _repo.GetByIdAsync(id);
// ... set cache
return product;
}
HybridCache solves stampede automatically
With HybridCache.GetOrCreateAsync, when 100 concurrent requests call the same key, exactly 1 request runs the factory method (DB query), and the other 99 wait for the result. No SemaphoreSlim, no double-check pattern needed. This is the primary reason to migrate from IMemoryCache to HybridCache.
9. Monitoring & Observability for Cache
Cache without monitoring is like driving blindfolded — you don't know whether you're saving or wasting. The three metrics that matter most, tracked per cache layer, are hits, misses (together they give the hit rate), and operation duration:
// Expose cache metrics via OpenTelemetry
builder.Services.AddOpenTelemetry()
.WithMetrics(metrics =>
{
metrics.AddMeter("MyApp.Cache");
});
// Custom meter for cache
public class CacheMetrics
{
private static readonly Meter _meter = new("MyApp.Cache");
private static readonly Counter<long> _hits =
_meter.CreateCounter<long>("cache.hits");
private static readonly Counter<long> _misses =
_meter.CreateCounter<long>("cache.misses");
private static readonly Histogram<double> _duration =
_meter.CreateHistogram<double>("cache.duration.ms");
public void RecordHit(string cacheLayer) =>
_hits.Add(1, new("layer", cacheLayer));
public void RecordMiss(string cacheLayer) =>
_misses.Add(1, new("layer", cacheLayer));
public void RecordDuration(string cacheLayer, double ms) =>
_duration.Record(ms, new("layer", cacheLayer));
}
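Once these counters are exported (with the Prometheus exporter, OTel counters named cache.hits / cache.misses typically surface as cache_hits_total / cache_misses_total; the exact names depend on your exporter configuration), per-layer hit rate is a simple PromQL ratio:

```
# Hit rate per cache layer over the last 5 minutes
sum by (layer) (rate(cache_hits_total[5m]))
  /
(sum by (layer) (rate(cache_hits_total[5m])) + sum by (layer) (rate(cache_misses_total[5m])))
```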
10. Production Checklist
Before deploying a system with caching, check the following:
| # | Item | Notes |
|---|---|---|
| 1 | TTL always set | Never cache forever — even config. Set both absolute and sliding expiration. |
| 2 | Cache key has namespace | Prefix with service name + version: myapp:v2:product:123. Prevents collision when multiple services share Redis. |
| 3 | Stampede protection | Use HybridCache or implement mutex. Don't let cache misses fan-out to database. |
| 4 | Consistent serialization format | Choose JSON or MessagePack and stick with it. Changing format = invalidating entire cache. |
| 5 | Circuit breaker for cache layer | When Redis is down, fall back to direct database — don't let cache timeout block requests. |
| 6 | Full monitoring | Hit rate, eviction rate, memory usage, latency. Alert when hit rate < 80%. |
| 7 | Don't cache PII/sensitive data | Or if you must, encrypt + short TTL. Audit trail for cache access. |
| 8 | Warm-up strategy | After fresh deploy, cache is cold. Consider pre-warming hot keys or canary deploy. |
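Checklist item 5 deserves a sketch. The version below uses a short timeout plus a catch-all fallback; a real implementation would wrap the cache call in a proper circuit breaker (e.g. Polly) so a down Redis is skipped entirely rather than timed out on every request. _cache and _repo are the same injected dependencies as in section 5.2:

```csharp
// Fallback read: never let a sick cache block the request path.
public async Task<Product?> GetWithFallbackAsync(int id, CancellationToken ct = default)
{
    try
    {
        // Fail fast: give the cache a tight budget instead of the default timeout
        using var timeout = CancellationTokenSource.CreateLinkedTokenSource(ct);
        timeout.CancelAfter(TimeSpan.FromMilliseconds(100));

        var bytes = await _cache.GetAsync($"product:{id}", timeout.Token);
        if (bytes is not null)
            return JsonSerializer.Deserialize<Product>(bytes);
    }
    catch (Exception)
    {
        // Cache unavailable or too slow: swallow and fall through to the DB
    }

    return await _repo.GetByIdAsync(id, ct);
}
```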
The golden rule of caching
Cache data that is read far more often than it is written. If an entry is written nearly as often as it is read, caching may cause more overhead than benefit: you'll spend more time on invalidation than you save from cache hits. In that case, optimize the database directly (indexing, query optimization, connection pooling) instead of adding a cache layer.
11. Conclusion
Multi-tier caching is not simply "add Redis between app and DB." It's designing a multi-layered defense system where each layer has its own role, trade-offs, and invalidation strategy. Browser cache eliminates network requests, CDN reduces geographic latency, application cache reduces database load, and database cache optimizes at the data source.
With HybridCache in .NET 10, Microsoft has solved most of the application caching complexity: automatic L1/L2, built-in stampede protection, tag-based invalidation, and a clean API. If you're using IMemoryCache or IDistributedCache separately, now is the time to migrate.
Remember: the best cache is the one you don't need. Before adding cache, ask: is this data actually read-heavy? Can the query be optimized directly? Caching should only be the solution after you've optimized at the data layer.
12. References
- HybridCache library in ASP.NET Core — Microsoft Learn
- Hello HybridCache! Streamlining Cache Management — .NET Blog
- Caching in ASP.NET Core: Improving Application Performance — Milan Jovanović
- Distributed Caching in ASP.NET Core with Redis — CodeWithMukesh
- Caching Layers Explained: Browser, CDN, and App Caching — Technori
- Ultimate Guide to Caching 2026: Strategies and Best Practices — DragonflyDB
- Caching Strategies and How to Choose the Right One — CodeAhoy
- What is Caching and How it Works — AWS
Disclaimer: The opinions expressed in this blog are solely my own and do not reflect the views or opinions of my employer or any affiliated organizations. The content provided is for informational and educational purposes only and should not be taken as professional advice. While I strive to provide accurate and up-to-date information, I make no warranties or guarantees about the completeness, reliability, or accuracy of the content. Readers are encouraged to verify the information and seek independent advice as needed. I disclaim any liability for decisions or actions taken based on the content of this blog.