Multi-Tier Caching Strategy: From Browser to Database for High-Performance Applications

Posted on: 4/25/2026 2:17:49 PM

1. Why caching is the backbone of every high-performance system

In any System Design interview, the answer to "how to make the system faster?" almost always starts with caching. But in practice, caching is not simply "add Redis." A truly high-performance system implements caching at multiple tiers, each addressing a different layer of latency. Misunderstanding which tier should cache what leads to wasted resources, stale data, or worse — cache stampedes that bring down the entire system.

This article analyzes multi-tier caching architecture from a practical perspective: 4 cache layers from browser to database, 5 invalidation strategies, and how .NET 10's new HybridCache solves the L1/L2 caching problem once and for all.

~5ms: Memory cache lookup (L1)
~50ms: Distributed cache lookup (L2)
~200ms: Average database query
90%+: Database load reduction with proper caching

2. Multi-tier caching architecture — 4 layers of defense

Each cache tier acts as a protective layer, intercepting requests before they fall through to a more expensive layer below. The closer to the user, the lower the latency — but also the smaller the capacity and the harder the invalidation.

flowchart TB
    User["👤 User / Browser"]
    BCache["🖥️ Browser Cache<br/>Cache-Control, ETag, Service Worker<br/>~0ms latency"]
    CDN["🌐 CDN Edge Cache<br/>Cloudflare, CloudFront<br/>~10-30ms latency"]
    AppCache["⚡ Application Cache<br/>IMemoryCache / HybridCache / Redis<br/>~1-50ms latency"]
    DB["🗄️ Database<br/>Query Cache, Materialized View<br/>~100-500ms latency"]

    User -->|"Request"| BCache
    BCache -->|"Cache Miss"| CDN
    CDN -->|"Cache Miss"| AppCache
    AppCache -->|"Cache Miss"| DB
    DB -->|"Response"| AppCache
    AppCache -->|"Response + Cache"| CDN
    CDN -->|"Response + Cache"| BCache
    BCache -->|"Response"| User

    style User fill:#e94560,stroke:#fff,color:#fff
    style BCache fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style CDN fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
    style AppCache fill:#f8f9fa,stroke:#2196F3,color:#2c3e50
    style DB fill:#2c3e50,stroke:#fff,color:#fff

Figure 1: Multi-tier caching architecture — requests flow top-down, each tier intercepts before hitting the more expensive layer below

| Tier | Latency | Capacity | Scope | Invalidation |
|---|---|---|---|---|
| Browser Cache | ~0ms | ~50-300MB | Single user, single device | Difficult (depends on TTL, user action) |
| CDN Edge | 10-30ms | TB-level | All users in region | API purge, TTL, tag-based |
| Application | 1-50ms | GB-level | All requests to cluster | Proactive (event, TTL, tag) |
| Database | 100-500ms | Disk-level | Query layer | Automatic on data change |

3. Tier 1: Browser Cache — zero network, zero cost

Browser cache is the fastest tier because it requires absolutely no network request. The browser stores responses on disk/memory based on HTTP headers returned by the server.

3.1. Cache-Control — the brain of browser caching

The Cache-Control header determines whether the browser can cache, for how long, and whether revalidation with the server is required:

// Static assets — long cache, immutable
Cache-Control: public, max-age=31536000, immutable

// API response — short cache, must revalidate
Cache-Control: private, max-age=0, must-revalidate

// HTML page — conditional caching
Cache-Control: public, max-age=300, stale-while-revalidate=60

stale-while-revalidate — "serve stale, update silently"

stale-while-revalidate=60 allows the browser to immediately return the stale cached version to the user, while simultaneously sending a background request to the server for the fresh version. The user sees an instant page load, and the fresh version will be ready for next time. This is the ideal pattern for blog pages, product catalogs — where data changes infrequently but still needs updating.
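As a minimal sketch, the pattern can be emitted from an ASP.NET Core endpoint. The helper below only formats the Cache-Control value; the endpoint path and TTL values are illustrative assumptions:

```csharp
public static class CacheControlHeader
{
    // "Serve stale, update silently": fresh for maxAgeSeconds, then
    // stale-but-servable for another staleSeconds while revalidating.
    public static string StaleWhileRevalidate(int maxAgeSeconds, int staleSeconds) =>
        $"public, max-age={maxAgeSeconds}, stale-while-revalidate={staleSeconds}";
}

// In an endpoint (hypothetical route):
// app.MapGet("/blog/{slug}", (HttpContext ctx) =>
// {
//     ctx.Response.Headers.CacheControl = CacheControlHeader.StaleWhileRevalidate(300, 60);
//     return Results.Ok(/* page content */);
// });
```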

3.2. ETag — smart validation without full reload

When max-age expires, the browser sends If-None-Match with the ETag. The server compares: if content hasn't changed, it returns 304 Not Modified (a few bytes) instead of resending the entire response (potentially hundreds of KB). Significant bandwidth savings for APIs with large payloads.

// First response
HTTP/1.1 200 OK
ETag: "a1b2c3d4e5"
Cache-Control: max-age=300

// Next request after cache expires
GET /api/products HTTP/1.1
If-None-Match: "a1b2c3d4e5"

// Server: data unchanged
HTTP/1.1 304 Not Modified

3.3. Service Worker — programmable cache layer

Service Workers allow fully customizable cache logic in JavaScript. Common in PWAs, Service Workers intercept all fetch requests and decide: serve from cache, from network, or combine both (stale-while-revalidate pattern at the code level).

// sw.js — Cache-first for static, Network-first for API
self.addEventListener('fetch', (event) => {
  if (event.request.url.includes('/api/')) {
    // Network-first for API calls
    event.respondWith(
      fetch(event.request)
        .then(response => {
          const clone = response.clone();
          caches.open('api-v1').then(cache =>
            cache.put(event.request, clone));
          return response;
        })
        .catch(() => caches.match(event.request))
    );
  } else {
    // Cache-first for static assets
    event.respondWith(
      caches.match(event.request)
        .then(cached => cached || fetch(event.request))
    );
  }
});

4. Tier 2: CDN Edge Cache — reducing latency by geography

CDNs place content copies at hundreds of edge servers worldwide. When a user in Vietnam makes a request, the CDN serves from the Singapore node instead of the US origin server — reducing latency from 200ms to 20ms.

4.1. Cloudflare — smart cache rules for free

Cloudflare provides powerful cache rules even on the Free plan. You can cache by path, query string, header, and set different TTLs for each content type:

# Example Cloudflare Cache Rules (via Dashboard or API)

# Rule 1: Cache static assets for 1 year
URI Path matches "/assets/*"
→ Cache eligible, Edge TTL: 365 days, Browser TTL: 365 days

# Rule 2: Cache API responses for 5 minutes
URI Path matches "/api/products*"
→ Cache eligible, Edge TTL: 300s, Browser TTL: 0
→ Cache Key: include query string

# Rule 3: Bypass cache for authenticated requests
Cookie contains "auth_token"
→ Bypass cache

Cache Key — determining "same" vs "different"

The cache key determines whether two requests receive the same cached response. By default, CDNs use URL + query string as the cache key. But if you serve different content based on headers (e.g., Accept-Language for multilingual support), you must include that header in the cache key — otherwise, a Vietnamese user might receive an English cached version.
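The idea can be made concrete with a small key-builder that folds the Accept-Language header into the key, mirroring what a CDN "custom cache key" rule does. The key format here is an illustrative assumption, not any CDN's actual API:

```csharp
public static class CdnCacheKey
{
    // Two requests share a cached response only if their keys match.
    // Including Accept-Language keeps a Vietnamese user from receiving
    // the English cached version of the same URL.
    public static string For(string path, string query, string? acceptLanguage) =>
        $"{path}{query}|lang={acceptLanguage ?? "default"}";
}
```

The same reasoning applies on the origin side: responses that vary by a header should also send `Vary: Accept-Language` so browsers and CDNs partition their caches correctly.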

4.2. Cache Purge and Tag-based Invalidation

CDNs allow cache purging by specific URL, prefix (purge all /api/products/*), or by cache tag. Cloudflare Enterprise and AWS CloudFront both support tag-based purge — when a product changes, purging tag product-123 clears all related responses across all edge servers within seconds.

sequenceDiagram
    participant App as Application
    participant CDN as CDN Edge
    participant User as Users

    App->>CDN: Response + Cache-Tag: product-123
    CDN->>User: Serve cached response
    Note over App: Product 123 updated
    App->>CDN: Purge cache tag "product-123"
    CDN->>CDN: Invalidate all entries with tag
    User->>CDN: Next request
    CDN->>App: Cache MISS → fetch fresh
    App->>CDN: New response + Cache-Tag: product-123
    CDN->>User: Serve fresh response

Figure 2: Tag-based cache invalidation flow — purge by tag instead of individual URLs
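For Cloudflare, the purge itself is a single POST to the zone's purge_cache endpoint. A hedged sketch: the zone id and API token are placeholders, and purging by cache-tag requires an Enterprise plan:

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class CdnPurge
{
    // Builds the JSON body the purge_cache endpoint expects for tag purge
    public static string BuildPurgeBody(params string[] tags) =>
        JsonSerializer.Serialize(new { tags });

    public static async Task PurgeTagsAsync(
        HttpClient http, string zoneId, string apiToken, params string[] tags)
    {
        var url = $"https://api.cloudflare.com/client/v4/zones/{zoneId}/purge_cache";
        using var request = new HttpRequestMessage(HttpMethod.Post, url)
        {
            Content = new StringContent(BuildPurgeBody(tags), Encoding.UTF8, "application/json")
        };
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", apiToken);
        using var response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode(); // purge is propagated to edges within seconds
    }
}
```

Calling `PurgeTagsAsync(http, zoneId, token, "product-123")` from the update path implements the "Purge cache tag" step in Figure 2.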

5. Tier 3: Application Cache — the heart of caching strategy

This is the cache tier you have full control over, residing either in-process or in a distributed store. On .NET, the caching ecosystem has evolved through three generations:

- .NET Core 1.0+ — IMemoryCache: in-process and fast, but not shared between instances; the cache is lost on app restart.
- .NET Core 2.0+ — IDistributedCache: one interface over Redis/SQL Server/NCache. Shared between instances, but a raw byte[] API with no stampede protection.
- .NET 9 → 10 (GA) — HybridCache: combines L1 (memory) + L2 (distributed), with built-in stampede protection, tag-based invalidation, and automatic serialization. This is the future.

5.1. IMemoryCache — simple but with traps

IMemoryCache stores objects directly in process memory, with nanosecond-level retrieval. Suitable for single-instance deployments or data that only needs local caching (config, small lookup tables).

// Register in Program.cs
builder.Services.AddMemoryCache();

// Usage
public class ProductService
{
    private readonly IMemoryCache _cache;
    private readonly IProductRepository _repo;

    public async Task<Product?> GetByIdAsync(int id)
    {
        var key = $"product:{id}";

        if (_cache.TryGetValue(key, out Product? cached))
            return cached;

        var product = await _repo.GetByIdAsync(id);
        if (product is not null)
        {
            _cache.Set(key, product, new MemoryCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(30),
                SlidingExpiration = TimeSpan.FromMinutes(10),
                Size = 1 // if using SizeLimit
            });
        }
        return product;
    }
}

The Cache Stampede trap with IMemoryCache

When a cache entry expires and 100 concurrent requests arrive, all 100 see a cache miss and simultaneously hit the database. With IMemoryCache, you must handle this yourself using SemaphoreSlim or Lazy<Task>. This is exactly why HybridCache was created — stampede protection is built in.

5.2. IDistributedCache — shared but raw

When running multiple instances (Kubernetes, load balancer), IMemoryCache on each pod is independent. A user sets cache on Pod 1, but the next request hits Pod 2 → cache miss. IDistributedCache with Redis solves this:

// Program.cs
builder.Services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = "localhost:6379";
    options.InstanceName = "myapp:";
});

// Usage — raw API, manual serialization
public async Task<Product?> GetByIdAsync(int id)
{
    var key = $"product:{id}";
    var bytes = await _cache.GetAsync(key);

    if (bytes is not null)
        return JsonSerializer.Deserialize<Product>(bytes);

    var product = await _repo.GetByIdAsync(id);
    if (product is not null)
    {
        var json = JsonSerializer.SerializeToUtf8Bytes(product);
        await _cache.SetAsync(key, json, new DistributedCacheEntryOptions
        {
            AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(30)
        });
    }
    return product;
}

The downsides are clear: the API works with byte[], requiring manual serialization/deserialization, no stampede protection, and every cache read goes through the network (even on hit).

5.3. HybridCache — the future of caching on .NET

HybridCache (GA since .NET 9, stable in .NET 10) combines the best of both: L1 in-memory for speed, L2 distributed for consistency, plus stampede protection and tag-based invalidation:

flowchart LR
    Request["Request"] --> HC["HybridCache<br/>GetOrCreateAsync"]
    HC --> L1{"L1 Memory<br/>Cache Hit?"}
    L1 -->|"Hit"| Return["Return<br/>~nanoseconds"]
    L1 -->|"Miss"| L2{"L2 Redis<br/>Cache Hit?"}
    L2 -->|"Hit"| PopL1["Populate L1<br/>Return ~ms"]
    L2 -->|"Miss"| Factory["Factory Method<br/>(DB Query)"]
    Factory --> PopBoth["Populate L1 + L2<br/>Return"]

    style Request fill:#e94560,stroke:#fff,color:#fff
    style HC fill:#2c3e50,stroke:#fff,color:#fff
    style L1 fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
    style L2 fill:#f8f9fa,stroke:#2196F3,color:#2c3e50
    style Factory fill:#f8f9fa,stroke:#ff9800,color:#2c3e50
    style Return fill:#4CAF50,stroke:#fff,color:#fff
    style PopL1 fill:#4CAF50,stroke:#fff,color:#fff
    style PopBoth fill:#4CAF50,stroke:#fff,color:#fff

Figure 3: HybridCache flow — L1 memory → L2 Redis → Factory, with stampede protection at each tier

// Program.cs — setup HybridCache
builder.Services.AddHybridCache(options =>
{
    options.DefaultEntryOptions = new HybridCacheEntryOptions
    {
        Expiration = TimeSpan.FromMinutes(30),
        LocalCacheExpiration = TimeSpan.FromMinutes(5)
    };
    options.MaximumPayloadBytes = 1024 * 1024; // 1MB limit
});

// Add Redis as L2 backend
builder.Services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = "localhost:6379";
});

// Usage — clean API, auto serialize, stampede protection
public class ProductService(HybridCache cache, IProductRepository repo)
{
    public async Task<Product?> GetByIdAsync(int id,
        CancellationToken ct = default)
    {
        return await cache.GetOrCreateAsync(
            $"product:{id}",
            async cancel => await repo.GetByIdAsync(id, cancel),
            new HybridCacheEntryOptions
            {
                Expiration = TimeSpan.FromMinutes(30),
                LocalCacheExpiration = TimeSpan.FromMinutes(5)
            },
            tags: ["products", $"product:{id}"],
            cancellationToken: ct
        );
    }

    public async Task UpdateAsync(Product product,
        CancellationToken ct = default)
    {
        await repo.UpdateAsync(product, ct);
        // Invalidate by tag — removes all related entries
        await cache.RemoveByTagAsync($"product:{product.Id}", ct);
    }
}

| Feature | IMemoryCache | IDistributedCache | HybridCache |
|---|---|---|---|
| L1 (in-process) | ✅ | ❌ | ✅ |
| L2 (distributed) | ❌ | ✅ | ✅ |
| Stampede protection | ❌ | ❌ | ✅ |
| Tag-based invalidation | ❌ | ❌ | ✅ |
| Auto serialization | N/A (object ref) | ❌ (byte[]) | ✅ |
| Multi-instance safe | ❌ | ✅ | ✅ |
| Latency (hit) | ~ns | ~ms | ~ns (L1) / ~ms (L2) |

6. Tier 4: Database Cache — optimizing at the source

Even with application caching intercepting most requests, some always reach the database. Optimizing this tier reduces latency on cache misses and eases pressure during mass cache invalidations.

6.1. Query Plan Cache

Both SQL Server and PostgreSQL cache execution plans for prepared statements. Using parameterized queries instead of string concatenation not only prevents SQL injection but also allows the database to reuse compiled plans:

// ❌ Ad-hoc SQL: every distinct id value compiles a new plan
var adHocSql = $"SELECT * FROM Products WHERE Id = {id}";

// ✅ Parameterized: one plan, cached and reused for every id
var sql = "SELECT * FROM Products WHERE Id = @Id";
using var cmd = new SqlCommand(sql, connection);
cmd.Parameters.AddWithValue("@Id", id);

6.2. Materialized View — pre-computed data

Instead of running complex aggregate queries each time, Materialized Views store pre-computed results. PostgreSQL supports REFRESH MATERIALIZED VIEW CONCURRENTLY for non-blocking refreshes. SQL Server uses Indexed Views for similar purposes.

-- PostgreSQL: Create materialized view for dashboard stats
CREATE MATERIALIZED VIEW product_stats AS
SELECT
    category_id,
    COUNT(*) as total_products,
    AVG(price) as avg_price,
    MAX(updated_at) as last_updated
FROM products
GROUP BY category_id;

-- Concurrent refresh, non-blocking reads
REFRESH MATERIALIZED VIEW CONCURRENTLY product_stats;
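Triggering that refresh from .NET can be as simple as executing the statement over any ADO.NET connection (e.g., Npgsql for PostgreSQL). A sketch, with scheduling (say, a BackgroundService with a PeriodicTimer) left to the host:

```csharp
using System.Data.Common;
using System.Threading;
using System.Threading.Tasks;

public static class ProductStatsRefresher
{
    // CONCURRENTLY keeps the view readable while the refresh runs
    public const string RefreshSql =
        "REFRESH MATERIALIZED VIEW CONCURRENTLY product_stats";

    public static async Task RefreshAsync(DbConnection conn, CancellationToken ct = default)
    {
        await using var cmd = conn.CreateCommand();
        cmd.CommandText = RefreshSql;
        await cmd.ExecuteNonQueryAsync(ct);
    }
}
```

Note that CONCURRENTLY requires a unique index on the view (here, e.g., on category_id) and trades refresh speed for non-blocking reads.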

7. Cache Invalidation — "only two hard things in CS"

Phil Karlton once said: "There are only two hard things in Computer Science: cache invalidation and naming things." Understanding these 5 invalidation strategies helps you choose the right pattern for each use case:

flowchart TB
    subgraph CacheAside["Cache-Aside (Lazy Loading)"]
        CA1["App checks cache"] --> CA2{"Hit?"}
        CA2 -->|"Yes"| CA3["Return"]
        CA2 -->|"No"| CA4["Query DB"]
        CA4 --> CA5["Write to cache"]
        CA5 --> CA3
    end

    subgraph WriteThrough["Write-Through"]
        WT1["App writes data"] --> WT2["Write to cache"]
        WT2 --> WT3["Cache writes to DB"]
        WT3 --> WT4["Confirm"]
    end

    subgraph WriteBehind["Write-Behind (Write-Back)"]
        WB1["App writes data"] --> WB2["Write to cache"]
        WB2 --> WB3["Confirm immediately"]
        WB2 -.->|"Async"| WB4["Cache writes DB later"]
    end

    style CacheAside fill:#f8f9fa,stroke:#e94560
    style WriteThrough fill:#f8f9fa,stroke:#4CAF50
    style WriteBehind fill:#f8f9fa,stroke:#2196F3

Figure 4: Three main cache write strategy patterns

| Pattern | Read perf | Write perf | Consistency | Best for |
|---|---|---|---|---|
| Cache-Aside | Slow first time (cold start) | Fast (DB only) | Eventual | 95% of use cases — the default choice |
| Read-Through | Like Cache-Aside but cache self-fetches | N/A | Eventual | When you want the cache layer to self-manage |
| Write-Through | Always fast (data available) | Slower (sync write to both) | Strong | Critical data requiring high consistency |
| Write-Behind | Always fast | Very fast (async) | Eventual (data loss risk) | High write throughput, accepting risk |
| Write-Around | Slow first time | Fast (bypass cache) | Eventual | Data rarely read after write |

Quick pattern selection guide

- Default: Cache-Aside + TTL safety net. 95% of web applications work well with this.
- Need strong consistency: Write-Through (e.g., inventory counts, account balances).
- Write throughput is the priority: Write-Behind (e.g., analytics events, view counts).
- Data rarely read back: Write-Around (e.g., audit logs, notification history).
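To make Write-Behind concrete, here is a minimal in-memory sketch using System.Threading.Channels: writes are acknowledged immediately, and a background flusher would periodically drain the buffer into one batched database write. The type names are illustrative, not a library API:

```csharp
using System.Collections.Generic;
using System.Threading.Channels;

public class ViewCountBuffer
{
    private readonly Channel<int> _pending = Channel.CreateUnbounded<int>();

    // Fast path: confirm immediately, persist later (eventual consistency)
    public bool RecordView(int productId) => _pending.Writer.TryWrite(productId);

    // Called by the background flusher: collapse pending writes into one
    // batch (productId -> view count) for a single bulk UPDATE
    public Dictionary<int, int> Drain()
    {
        var batch = new Dictionary<int, int>();
        while (_pending.Reader.TryRead(out var productId))
            batch[productId] = batch.GetValueOrDefault(productId) + 1;
        return batch;
    }
}
```

The trade-off from the table applies directly: if the process dies before a flush, buffered counts are lost; that is the "data loss risk" of Write-Behind.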

8. Cache Stampede & Thundering Herd — proactive defense

Cache stampede occurs when a popular cache entry expires and hundreds of concurrent requests discover the miss simultaneously, all querying the database at once. This is the #1 cause of caching-related system crashes.

8.1. Prevention techniques

// Technique 1: Mutex/Lock — only 1 request fetches, others wait
private static readonly SemaphoreSlim _lock = new(1, 1);

public async Task<Product?> GetWithLockAsync(int id)
{
    var key = $"product:{id}";
    if (_cache.TryGetValue(key, out Product? cached))
        return cached;

    await _lock.WaitAsync();
    try
    {
        // Double-check after acquiring lock
        if (_cache.TryGetValue(key, out cached))
            return cached;

        var product = await _repo.GetByIdAsync(id);
        _cache.Set(key, product, TimeSpan.FromMinutes(30));
        return product;
    }
    finally { _lock.Release(); }
}

// Technique 2: Probabilistic Early Expiration
// Cache entry "voluntarily" refreshes before actual expiry
public sealed record CacheEntry<T>(T Value, DateTime ExpiresAt, TimeSpan Ttl);

public async Task<Product?> GetWithEarlyRefreshAsync(int id)
{
    var key = $"product:{id}";
    var entry = _cache.Get<CacheEntry<Product>>(key);

    if (entry is not null)
    {
        var timeToExpiry = entry.ExpiresAt - DateTime.UtcNow;
        // Refresh probability rises exponentially as expiry approaches
        var probability = Math.Exp(-timeToExpiry / entry.Ttl * 10);
        if (Random.Shared.NextDouble() > probability)
            return entry.Value; // still fresh enough — serve from cache
    }

    // Fetch and re-cache with a new expiry window
    var ttl = TimeSpan.FromMinutes(30);
    var product = await _repo.GetByIdAsync(id);
    if (product is not null)
        _cache.Set(key, new CacheEntry<Product>(product, DateTime.UtcNow.Add(ttl), ttl), ttl);
    return product;
}

HybridCache solves stampede automatically

With HybridCache.GetOrCreateAsync, when 100 concurrent requests call the same key, exactly 1 request runs the factory method (DB query), and the other 99 wait for the result. No SemaphoreSlim, no double-check pattern needed. This is the primary reason to migrate from IMemoryCache to HybridCache.

9. Monitoring & Observability for Cache

Cache without monitoring is like driving blindfolded — you don't know whether you're saving or wasting. The three most important metrics:

- Hit Rate: cache hits / total requests. Target: > 90%
- Eviction Rate: entries evicted per second. High = need more capacity
- Latency P99: 99th-percentile read time. L1 < 1ms, L2 < 10ms
- Memory Usage: % of memory used for cache. Prevent OOM with SizeLimit

// Expose cache metrics via OpenTelemetry
builder.Services.AddOpenTelemetry()
    .WithMetrics(metrics =>
    {
        metrics.AddMeter("MyApp.Cache");
    });

// Custom meter for cache
public class CacheMetrics
{
    private static readonly Meter _meter = new("MyApp.Cache");
    private static readonly Counter<long> _hits =
        _meter.CreateCounter<long>("cache.hits");
    private static readonly Counter<long> _misses =
        _meter.CreateCounter<long>("cache.misses");
    private static readonly Histogram<double> _duration =
        _meter.CreateHistogram<double>("cache.duration.ms");

    public void RecordHit(string cacheLayer) =>
        _hits.Add(1, new("layer", cacheLayer));
    public void RecordMiss(string cacheLayer) =>
        _misses.Add(1, new("layer", cacheLayer));
    public void RecordDuration(string cacheLayer, double ms) =>
        _duration.Record(ms, new("layer", cacheLayer));
}

10. Production Checklist

Before deploying a system with caching, check the following:

| # | Item | Notes |
|---|---|---|
| 1 | TTL always set | Never cache forever — even config. Set both absolute and sliding expiration. |
| 2 | Cache key has namespace | Prefix with service name + version: myapp:v2:product:123. Prevents collisions when multiple services share Redis. |
| 3 | Stampede protection | Use HybridCache or implement a mutex. Don't let cache misses fan out to the database. |
| 4 | Consistent serialization format | Choose JSON or MessagePack and stick with it. Changing format = invalidating the entire cache. |
| 5 | Circuit breaker for cache layer | When Redis is down, fall back to the database directly — don't let cache timeouts block requests. |
| 6 | Full monitoring | Hit rate, eviction rate, memory usage, latency. Alert when hit rate < 80%. |
| 7 | Don't cache PII/sensitive data | Or if you must, encrypt + short TTL. Keep an audit trail for cache access. |
| 8 | Warm-up strategy | After a fresh deploy, the cache is cold. Consider pre-warming hot keys or a canary deploy. |
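Checklist item 2 is easy to enforce with a tiny key-builder, so namespacing is never left to per-call-site convention. The service name and version constants here are illustrative:

```csharp
public static class CacheKey
{
    private const string Service = "myapp";
    private const string Version = "v2"; // bump to invalidate all keys after a schema change

    // Produces keys like "myapp:v2:product:123", safe to mix with
    // other services in a shared Redis instance.
    public static string For(string entity, object id) =>
        $"{Service}:{Version}:{entity}:{id}";
}
```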

The golden rule of caching

Cache data that is read more than written. If write/read ratio > 50%, caching may cause more overhead than benefit — you'll spend more time on invalidation than you save from cache hits. In this case, optimize the database directly (indexing, query optimization, connection pooling) instead of adding a cache layer.
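The rule of thumb reduces to a one-liner; the 50% threshold is this article's heuristic, not a universal constant:

```csharp
public static class CacheHeuristics
{
    // write/read ratio above 0.5 suggests invalidation overhead will
    // outweigh the benefit of cache hits
    public static bool CachingLikelyWorthIt(long reads, long writes) =>
        reads > 0 && (double)writes / reads <= 0.5;
}
```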

11. Conclusion

Multi-tier caching is not simply "add Redis between app and DB." It's designing a multi-layered defense system where each layer has its own role, trade-offs, and invalidation strategy. Browser cache eliminates network requests, CDN reduces geographic latency, application cache reduces database load, and database cache optimizes at the data source.

With HybridCache in .NET 10, Microsoft has solved most of the application caching complexity: automatic L1/L2, built-in stampede protection, tag-based invalidation, and a clean API. If you're using IMemoryCache or IDistributedCache separately, now is the time to migrate.

Remember: the best cache is the one you don't need. Before adding cache, ask: is this data actually read-heavy? Can the query be optimized directly? Caching should only be the solution after you've optimized at the data layer.
