Multi-Tier Caching Strategy: From Browser to Database for High-Performance Applications
Posted on: 4/25/2026 2:17:49 PM
Table of contents
- 1. Why caching is the backbone of every high-performance system
- 2. Multi-tier caching architecture — 4 layers of defense
- 3. Tier 1: Browser Cache — zero network, zero cost
- 4. Tier 2: CDN Edge Cache — reducing latency by geography
- 5. Tier 3: Application Cache — the heart of caching strategy
- 6. Tier 4: Database Cache — optimizing at the source
- 7. Cache Invalidation — "only two hard things in CS"
- 8. Cache Stampede & Thundering Herd — proactive defense
- 9. Monitoring & Observability for Cache
- 10. Production Checklist
- 11. Conclusion
- 12. References
1. Why caching is the backbone of every high-performance system
In any System Design interview, the answer to "how to make the system faster?" almost always starts with caching. But in practice, caching is not simply "add Redis." A truly high-performance system implements caching at multiple tiers, each addressing a different layer of latency. Misunderstanding which tier should cache what leads to wasted resources, stale data, or worse — cache stampedes that bring down the entire system.
This article analyzes multi-tier caching architecture from a practical perspective: 4 cache layers from browser to database, 5 invalidation strategies, and how HybridCache (introduced in .NET 9, stable in .NET 10) largely solves the L1/L2 caching problem.
2. Multi-tier caching architecture — 4 layers of defense
Each cache tier acts as a protective layer, intercepting requests before they fall through to a more expensive layer below. The closer to the user, the lower the latency — but the more limited the capacity and the harder the invalidation.
flowchart TB
User["👤 User / Browser"]
BCache["🖥️ Browser Cache
Cache-Control, ETag, Service Worker
~0ms latency"]
CDN["🌐 CDN Edge Cache
Cloudflare, CloudFront
~10-30ms latency"]
AppCache["⚡ Application Cache
IMemoryCache / HybridCache / Redis
~1-50ms latency"]
DB["🗄️ Database
Query Cache, Materialized View
~100-500ms latency"]
User -->|"Request"| BCache
BCache -->|"Cache Miss"| CDN
CDN -->|"Cache Miss"| AppCache
AppCache -->|"Cache Miss"| DB
DB -->|"Response"| AppCache
AppCache -->|"Response + Cache"| CDN
CDN -->|"Response + Cache"| BCache
BCache -->|"Response"| User
style User fill:#e94560,stroke:#fff,color:#fff
style BCache fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style CDN fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
style AppCache fill:#f8f9fa,stroke:#2196F3,color:#2c3e50
style DB fill:#2c3e50,stroke:#fff,color:#fff
Figure 1: Multi-tier caching architecture — requests flow top-down, each tier intercepts before hitting the more expensive layer below
| Tier | Latency | Capacity | Scope | Invalidation |
|---|---|---|---|---|
| Browser Cache | ~0ms | ~50-300MB | Single user, single device | Difficult (depends on TTL, user action) |
| CDN Edge | 10-30ms | TB-level | All users in region | API purge, TTL, tag-based |
| Application | 1-50ms | GB-level | All requests to cluster | Proactive (event, TTL, tag) |
| Database | 100-500ms | Disk-level | Query layer | Automatic on data change |
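To see how the tiers compound, here is a back-of-envelope expected-latency calculation in C#. The per-tier hit rates and latencies are illustrative assumptions, not measurements:

```csharp
// Expected latency = sum over tiers of P(served at tier i) * latency of tier i
// Hit rates and latencies below are illustrative assumptions
double[] servedAtTier = { 0.60, 0.25, 0.10, 0.05 };  // browser, CDN, app cache, DB
double[] latencyMs    = { 0,    20,   5,    200  };  // representative per-tier latency

double expectedMs = 0;
for (int i = 0; i < servedAtTier.Length; i++)
    expectedMs += servedAtTier[i] * latencyMs[i];

Console.WriteLine($"Expected latency: {expectedMs} ms");  // 0 + 5 + 0.5 + 10 = 15.5 ms
```

Even with only a 60% browser hit rate, the expected latency stays far below the ~200ms that every request would pay if it went straight to the database.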
3. Tier 1: Browser Cache — zero network, zero cost
Browser cache is the fastest tier because it requires absolutely no network request. The browser stores responses on disk/memory based on HTTP headers returned by the server.
3.1. Cache-Control — the brain of browser caching
The Cache-Control header determines whether the browser can cache, for how long, and whether revalidation with the server is required:
// Static assets — long cache, immutable
Cache-Control: public, max-age=31536000, immutable
// API response — short cache, must revalidate
Cache-Control: private, max-age=0, must-revalidate
// HTML page — conditional caching
Cache-Control: public, max-age=300, stale-while-revalidate=60
stale-while-revalidate — "serve stale, update silently"
stale-while-revalidate=60 allows the browser to immediately return the stale cached version to the user, while simultaneously sending a background request to the server for the fresh version. The user sees an instant page load, and the fresh version will be ready for next time. This is the ideal pattern for blog pages, product catalogs — where data changes infrequently but still needs updating.
3.2. ETag — smart validation without full reload
When max-age expires, the browser sends If-None-Match with the ETag. The server compares: if content hasn't changed, it returns 304 Not Modified (a few bytes) instead of resending the entire response (potentially hundreds of KB). Significant bandwidth savings for APIs with large payloads.
// First response
HTTP/1.1 200 OK
ETag: "a1b2c3d4e5"
Cache-Control: max-age=300
// Next request after cache expires
GET /api/products HTTP/1.1
If-None-Match: "a1b2c3d4e5"
// Server: data unchanged
HTTP/1.1 304 Not Modified
3.3. Service Worker — programmable cache layer
Service Workers allow fully customizable cache logic in JavaScript. Common in PWAs, Service Workers intercept all fetch requests and decide: serve from cache, from network, or combine both (stale-while-revalidate pattern at the code level).
// sw.js — Cache-first for static, Network-first for API
self.addEventListener('fetch', (event) => {
if (event.request.url.includes('/api/')) {
// Network-first for API calls
event.respondWith(
fetch(event.request)
.then(response => {
const clone = response.clone();
caches.open('api-v1').then(cache =>
cache.put(event.request, clone));
return response;
})
.catch(() => caches.match(event.request))
);
} else {
// Cache-first for static assets
event.respondWith(
caches.match(event.request)
.then(cached => cached || fetch(event.request))
);
}
});
4. Tier 2: CDN Edge Cache — reducing latency by geography
CDNs place content copies at hundreds of edge servers worldwide. When a user in Vietnam makes a request, the CDN serves from the Singapore node instead of the US origin server — reducing latency from 200ms to 20ms.
4.1. Cloudflare — smart cache rules for free
Cloudflare provides powerful cache rules even on the Free plan. You can cache by path, query string, header, and set different TTLs for each content type:
# Example Cloudflare Cache Rules (via Dashboard or API)
# Rule 1: Cache static assets for 1 year
URI Path matches "/assets/*"
→ Cache eligible, Edge TTL: 365 days, Browser TTL: 365 days
# Rule 2: Cache API responses for 5 minutes
URI Path matches "/api/products*"
→ Cache eligible, Edge TTL: 300s, Browser TTL: 0
→ Cache Key: include query string
# Rule 3: Bypass cache for authenticated requests
Cookie contains "auth_token"
→ Bypass cache
Cache Key — determining "same" vs "different"
The cache key determines whether two requests receive the same cached response. By default, CDNs use URL + query string as the cache key. But if you serve different content based on headers (e.g., Accept-Language for multilingual support), you must include that header in the cache key — otherwise, a Vietnamese user might receive an English cached version.
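The standard HTTP mechanism for this is the Vary response header; any compliant cache, CDN or browser, will then include the listed request headers in the cache key:

```
HTTP/1.1 200 OK
Content-Language: vi
Vary: Accept-Language
Cache-Control: public, max-age=300
```

Be sparing with Vary: every additional header multiplies the number of cache variants and lowers the hit rate.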
4.2. Cache Purge and Tag-based Invalidation
CDNs allow cache purging by specific URL, prefix (purge all /api/products/*), or by cache tag. Cloudflare Enterprise and AWS CloudFront both support tag-based purge — when a product changes, purging tag product-123 clears all related responses across all edge servers within seconds.
sequenceDiagram
participant App as Application
participant CDN as CDN Edge
participant User as Users
App->>CDN: Response + Cache-Tag: product-123
CDN->>User: Serve cached response
Note over App: Product 123 updated
App->>CDN: Purge cache tag "product-123"
CDN->>CDN: Invalidate all entries with tag
User->>CDN: Next request
CDN->>App: Cache MISS → fetch fresh
App->>CDN: New response + Cache-Tag: product-123
CDN->>User: Serve fresh response
Figure 2: Tag-based cache invalidation flow — purge by tag instead of individual URLs
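As a concrete sketch, a tag purge on Cloudflare is a single API call (purge-by-tag requires an Enterprise plan; ZONE_ID and API_TOKEN are placeholders for your zone and credentials):

```bash
# Purge every cached response tagged "product-123" across all edge servers
curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/purge_cache" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"tags": ["product-123"]}'
```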
5. Tier 3: Application Cache — the heart of caching strategy
This is the cache tier you have full control over, residing either in-process or in a distributed store. On .NET, the caching ecosystem has evolved through three generations:
5.1. IMemoryCache — simple but with traps
IMemoryCache stores objects directly in process memory, with nanosecond-level retrieval. Suitable for single-instance deployments or data that only needs local caching (config, small lookup tables).
// Register in Program.cs
builder.Services.AddMemoryCache();
// Usage
public class ProductService
{
private readonly IMemoryCache _cache;
private readonly IProductRepository _repo;
public ProductService(IMemoryCache cache, IProductRepository repo)
{
_cache = cache;
_repo = repo;
}
public async Task<Product?> GetByIdAsync(int id)
{
var key = $"product:{id}";
if (_cache.TryGetValue(key, out Product? cached))
return cached;
var product = await _repo.GetByIdAsync(id);
if (product is not null)
{
_cache.Set(key, product, new MemoryCacheEntryOptions
{
AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(30),
SlidingExpiration = TimeSpan.FromMinutes(10),
Size = 1 // if using SizeLimit
});
}
return product;
}
}
The Cache Stampede trap with IMemoryCache
When a cache entry expires and 100 concurrent requests arrive, all 100 see a cache miss and simultaneously hit the database. With IMemoryCache, you must handle this yourself using SemaphoreSlim or Lazy<Task>. This is exactly why HybridCache was created — stampede protection is built in.
5.2. IDistributedCache — shared but raw
When running multiple instances (Kubernetes, load balancer), IMemoryCache on each pod is independent. A user sets cache on Pod 1, but the next request hits Pod 2 → cache miss. IDistributedCache with Redis solves this:
// Program.cs
builder.Services.AddStackExchangeRedisCache(options =>
{
options.Configuration = "localhost:6379";
options.InstanceName = "myapp:";
});
// Usage — raw API, manual serialization
// (_cache is an injected IDistributedCache, _repo the repository)
public async Task<Product?> GetByIdAsync(int id)
{
var key = $"product:{id}";
var bytes = await _cache.GetAsync(key);
if (bytes is not null)
return JsonSerializer.Deserialize<Product>(bytes);
var product = await _repo.GetByIdAsync(id);
if (product is not null)
{
var json = JsonSerializer.SerializeToUtf8Bytes(product);
await _cache.SetAsync(key, json, new DistributedCacheEntryOptions
{
AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(30)
});
}
return product;
}
The downsides are clear: the API works with byte[], requiring manual serialization/deserialization, no stampede protection, and every cache read goes through the network (even on hit).
5.3. HybridCache — the future of caching on .NET
HybridCache (GA since .NET 9, stable in .NET 10) combines the best of both: L1 in-memory for speed, L2 distributed for consistency, plus stampede protection and tag-based invalidation:
flowchart LR
Request["Request"] --> HC["HybridCache
GetOrCreateAsync"]
HC --> L1{"L1 Memory
Cache Hit?"}
L1 -->|"Hit"| Return["Return
~nanoseconds"]
L1 -->|"Miss"| L2{"L2 Redis
Cache Hit?"}
L2 -->|"Hit"| PopL1["Populate L1
Return ~ms"]
L2 -->|"Miss"| Factory["Factory Method
(DB Query)"]
Factory --> PopBoth["Populate L1 + L2
Return"]
style Request fill:#e94560,stroke:#fff,color:#fff
style HC fill:#2c3e50,stroke:#fff,color:#fff
style L1 fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
style L2 fill:#f8f9fa,stroke:#2196F3,color:#2c3e50
style Factory fill:#f8f9fa,stroke:#ff9800,color:#2c3e50
style Return fill:#4CAF50,stroke:#fff,color:#fff
style PopL1 fill:#4CAF50,stroke:#fff,color:#fff
style PopBoth fill:#4CAF50,stroke:#fff,color:#fff
Figure 3: HybridCache flow — L1 memory → L2 Redis → Factory, with stampede protection at each tier
// Program.cs — setup HybridCache
builder.Services.AddHybridCache(options =>
{
options.DefaultEntryOptions = new HybridCacheEntryOptions
{
Expiration = TimeSpan.FromMinutes(30),
LocalCacheExpiration = TimeSpan.FromMinutes(5)
};
options.MaximumPayloadBytes = 1024 * 1024; // 1MB limit
});
// Add Redis as L2 backend
builder.Services.AddStackExchangeRedisCache(options =>
{
options.Configuration = "localhost:6379";
});
// Usage — clean API, auto serialize, stampede protection
public class ProductService(HybridCache cache, IProductRepository repo)
{
public async Task<Product?> GetByIdAsync(int id,
CancellationToken ct = default)
{
return await cache.GetOrCreateAsync(
$"product:{id}",
async cancel => await repo.GetByIdAsync(id, cancel),
new HybridCacheEntryOptions
{
Expiration = TimeSpan.FromMinutes(30),
LocalCacheExpiration = TimeSpan.FromMinutes(5)
},
tags: ["products", $"product:{id}"],
cancellationToken: ct
);
}
public async Task UpdateAsync(Product product,
CancellationToken ct = default)
{
await repo.UpdateAsync(product, ct);
// Invalidate by tag — removes all related entries
await cache.RemoveByTagAsync($"product:{product.Id}", ct);
}
}
| Feature | IMemoryCache | IDistributedCache | HybridCache |
|---|---|---|---|
| L1 (in-process) | ✅ | ❌ | ✅ |
| L2 (distributed) | ❌ | ✅ | ✅ |
| Stampede protection | ❌ | ❌ | ✅ |
| Tag-based invalidation | ❌ | ❌ | ✅ |
| Auto serialization | N/A (object ref) | ❌ (byte[]) | ✅ |
| Multi-instance safe | ❌ | ✅ | ✅ |
| Latency (hit) | ~ns | ~ms | ~ns (L1) / ~ms (L2) |
6. Tier 4: Database Cache — optimizing at the source
Even with application caching intercepting most requests, some always reach the database. Optimizing this tier reduces latency on cache misses and eases pressure during mass cache invalidations.
6.1. Query Plan Cache
Both SQL Server and PostgreSQL cache execution plans for prepared statements. Using parameterized queries instead of string concatenation not only prevents SQL injection but also allows the database to reuse compiled plans:
// ❌ Each id value creates a new plan
var sql = $"SELECT * FROM Products WHERE Id = {id}";
// ✅ Plan is cached and reused
var sql = "SELECT * FROM Products WHERE Id = @Id";
cmd.Parameters.AddWithValue("@Id", id);
6.2. Materialized View — pre-computed data
Instead of running complex aggregate queries each time, Materialized Views store pre-computed results. PostgreSQL supports REFRESH MATERIALIZED VIEW CONCURRENTLY for non-blocking refreshes. SQL Server uses Indexed Views for similar purposes.
-- PostgreSQL: Create materialized view for dashboard stats
CREATE MATERIALIZED VIEW product_stats AS
SELECT
category_id,
COUNT(*) as total_products,
AVG(price) as avg_price,
MAX(updated_at) as last_updated
FROM products
GROUP BY category_id;
-- CONCURRENTLY requires a unique index on the materialized view
CREATE UNIQUE INDEX product_stats_category_idx ON product_stats (category_id);
-- Concurrent refresh, non-blocking reads
REFRESH MATERIALIZED VIEW CONCURRENTLY product_stats;
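The refresh can also be scheduled inside the database. A sketch using the pg_cron extension (an assumption; if pg_cron is not installed, any external scheduler works just as well):

```sql
-- Refresh the stats every 5 minutes (requires the pg_cron extension)
SELECT cron.schedule(
  'refresh-product-stats',          -- job name
  '*/5 * * * *',                    -- standard cron syntax
  'REFRESH MATERIALIZED VIEW CONCURRENTLY product_stats'
);
```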
7. Cache Invalidation — "only two hard things in CS"
Phil Karlton once said: "There are only two hard things in Computer Science: cache invalidation and naming things." Understanding the five caching patterns below, each with its own invalidation behavior, helps you choose the right one for each use case:
flowchart TB
subgraph CacheAside["Cache-Aside (Lazy Loading)"]
CA1["App checks cache"] --> CA2{"Hit?"}
CA2 -->|"Yes"| CA3["Return"]
CA2 -->|"No"| CA4["Query DB"]
CA4 --> CA5["Write to cache"]
CA5 --> CA3
end
subgraph WriteThrough["Write-Through"]
WT1["App writes data"] --> WT2["Write to cache"]
WT2 --> WT3["Cache writes to DB"]
WT3 --> WT4["Confirm"]
end
subgraph WriteBehind["Write-Behind (Write-Back)"]
WB1["App writes data"] --> WB2["Write to cache"]
WB2 --> WB3["Confirm immediately"]
WB2 -.->|"Async"| WB4["Cache writes DB later"]
end
style CacheAside fill:#f8f9fa,stroke:#e94560
style WriteThrough fill:#f8f9fa,stroke:#4CAF50
style WriteBehind fill:#f8f9fa,stroke:#2196F3
Figure 4: Three main cache write strategy patterns
| Pattern | Read perf | Write perf | Consistency | Best for |
|---|---|---|---|---|
| Cache-Aside | Slow first time (cold start) | Fast (DB only) | Eventual | 95% of use cases — the default choice |
| Read-Through | Like Cache-Aside but cache self-fetches | N/A | Eventual | When you want cache layer to self-manage |
| Write-Through | Always fast (data available) | Slower (sync write to both) | Strong | Critical data requiring high consistency |
| Write-Behind | Always fast | Very fast (async) | Eventual (data loss risk) | High-write throughput, accepting risk |
| Write-Around | Slow first time | Fast (bypass cache) | Eventual | Data rarely read after write |
Quick pattern selection guide
Default: Cache-Aside + TTL safety net. 95% of web applications work well with this.
Need strong consistency: Write-Through (e.g., inventory count, account balance).
Write throughput is priority: Write-Behind (e.g., analytics events, view counts).
Data rarely read back: Write-Around (e.g., audit logs, notification history).
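In application code, without a cache proxy that writes to the database itself, Write-Through is usually approximated by updating the database and the cache synchronously in the same request. A minimal sketch (InventoryService and its repository are illustrative names, not part of any framework):

```csharp
// Write-Through (app-managed approximation): persist first, then update
// the cache before acknowledging the write, so reads always find fresh data.
public class InventoryService(IDistributedCache cache, IInventoryRepository repo)
{
    public async Task SetStockAsync(int productId, int quantity,
        CancellationToken ct = default)
    {
        // 1. The database is the source of truth, so write it first
        await repo.SetStockAsync(productId, quantity, ct);

        // 2. Synchronous cache update: the caller is not acknowledged
        //    until both writes succeed (this is the consistency cost)
        await cache.SetAsync(
            $"stock:{productId}",
            JsonSerializer.SerializeToUtf8Bytes(quantity),
            new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(30)
            },
            ct);
    }
}
```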
8. Cache Stampede & Thundering Herd — proactive defense
Cache stampede occurs when a popular cache entry expires and hundreds of concurrent requests discover the miss simultaneously, all querying the database at once. This is the #1 cause of caching-related system crashes.
8.1. Prevention techniques
// Technique 1: Mutex/Lock — only 1 request fetches, others wait
// Note: a single static semaphore serializes misses for ALL keys;
// in production, use one lock per key (e.g. ConcurrentDictionary<string, SemaphoreSlim>)
private static readonly SemaphoreSlim _lock = new(1, 1);
public async Task<Product?> GetWithLockAsync(int id)
{
var key = $"product:{id}";
if (_cache.TryGetValue(key, out Product? cached))
return cached;
await _lock.WaitAsync();
try
{
// Double-check after acquiring lock
if (_cache.TryGetValue(key, out cached))
return cached;
var product = await _repo.GetByIdAsync(id);
_cache.Set(key, product, TimeSpan.FromMinutes(30));
return product;
}
finally { _lock.Release(); }
}
// Technique 2: Probabilistic Early Expiration
// Cache entry "voluntarily" refreshes before actual expiry
public async Task<Product?> GetWithEarlyRefreshAsync(int id)
{
var key = $"product:{id}";
// CacheEntry<T> is a custom wrapper holding Value, ExpiresAt and the original Ttl
var entry = _cache.Get<CacheEntry<Product>>(key);
if (entry is not null)
{
var timeToExpiry = entry.ExpiresAt - DateTime.UtcNow;
var totalTtl = entry.Ttl;
// Refresh probability increases as expiry approaches
var probability = Math.Exp(-timeToExpiry / totalTtl * 10);
if (Random.Shared.NextDouble() > probability)
return entry.Value;
}
// Fetch and re-cache
var product = await _repo.GetByIdAsync(id);
// ... set cache
return product;
}
HybridCache solves stampede automatically
With HybridCache.GetOrCreateAsync, when 100 concurrent requests call the same key, exactly 1 request runs the factory method (DB query), and the other 99 wait for the result. No SemaphoreSlim, no double-check pattern needed. This is the primary reason to migrate from IMemoryCache to HybridCache.
9. Monitoring & Observability for Cache
Cache without monitoring is like driving blindfolded — you don't know whether you're saving or wasting. The three metrics that matter most, tracked per cache layer, are hits, misses (together they give the hit rate), and operation duration:
// Expose cache metrics via OpenTelemetry
builder.Services.AddOpenTelemetry()
.WithMetrics(metrics =>
{
metrics.AddMeter("MyApp.Cache");
});
// Custom meter for cache
public class CacheMetrics
{
private static readonly Meter _meter = new("MyApp.Cache");
private static readonly Counter<long> _hits =
_meter.CreateCounter<long>("cache.hits");
private static readonly Counter<long> _misses =
_meter.CreateCounter<long>("cache.misses");
private static readonly Histogram<double> _duration =
_meter.CreateHistogram<double>("cache.duration.ms");
public void RecordHit(string cacheLayer) =>
_hits.Add(1, new("layer", cacheLayer));
public void RecordMiss(string cacheLayer) =>
_misses.Add(1, new("layer", cacheLayer));
public void RecordDuration(string cacheLayer, double ms) =>
_duration.Record(ms, new("layer", cacheLayer));
}
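Once these counters are exported (with the Prometheus exporter, OTel counters named cache.hits / cache.misses typically surface as cache_hits_total / cache_misses_total; the exact names depend on your exporter configuration), per-layer hit rate is a simple PromQL ratio:

```
# Hit rate per cache layer over the last 5 minutes
sum by (layer) (rate(cache_hits_total[5m]))
  /
(sum by (layer) (rate(cache_hits_total[5m])) + sum by (layer) (rate(cache_misses_total[5m])))
```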
10. Production Checklist
Before deploying a system with caching, check the following:
| # | Item | Notes |
|---|---|---|
| 1 | TTL always set | Never cache forever — even config. Set both absolute and sliding expiration. |
| 2 | Cache key has namespace | Prefix with service name + version: myapp:v2:product:123. Prevents collision when multiple services share Redis. |
| 3 | Stampede protection | Use HybridCache or implement mutex. Don't let cache misses fan-out to database. |
| 4 | Consistent serialization format | Choose JSON or MessagePack and stick with it. Changing format = invalidating entire cache. |
| 5 | Circuit breaker for cache layer | When Redis is down, fall back to direct database — don't let cache timeout block requests. |
| 6 | Full monitoring | Hit rate, eviction rate, memory usage, latency. Alert when hit rate < 80%. |
| 7 | Don't cache PII/sensitive data | Or if you must, encrypt + short TTL. Audit trail for cache access. |
| 8 | Warm-up strategy | After fresh deploy, cache is cold. Consider pre-warming hot keys or canary deploy. |
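Checklist item 5 deserves a sketch. The version below uses a short timeout plus a catch-all fallback; a real implementation would wrap the cache call in a proper circuit breaker (e.g. Polly) so a down Redis is skipped entirely rather than timed out on every request. _cache and _repo are the same injected dependencies as in section 5.2:

```csharp
// Fallback read: never let a sick cache block the request path.
public async Task<Product?> GetWithFallbackAsync(int id, CancellationToken ct = default)
{
    try
    {
        // Fail fast: give the cache a tight budget instead of the default timeout
        using var timeout = CancellationTokenSource.CreateLinkedTokenSource(ct);
        timeout.CancelAfter(TimeSpan.FromMilliseconds(100));

        var bytes = await _cache.GetAsync($"product:{id}", timeout.Token);
        if (bytes is not null)
            return JsonSerializer.Deserialize<Product>(bytes);
    }
    catch (Exception)
    {
        // Cache unavailable or too slow: swallow and fall through to the DB
    }

    return await _repo.GetByIdAsync(id, ct);
}
```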
The golden rule of caching
Cache data that is read far more often than it is written. If an entry is written nearly as often as it is read, caching may cause more overhead than benefit: you'll spend more time on invalidation than you save from cache hits. In that case, optimize the database directly (indexing, query optimization, connection pooling) instead of adding a cache layer.
11. Conclusion
Multi-tier caching is not simply "add Redis between app and DB." It's designing a multi-layered defense system where each layer has its own role, trade-offs, and invalidation strategy. Browser cache eliminates network requests, CDN reduces geographic latency, application cache reduces database load, and database cache optimizes at the data source.
With HybridCache in .NET 10, Microsoft has solved most of the application caching complexity: automatic L1/L2, built-in stampede protection, tag-based invalidation, and a clean API. If you're using IMemoryCache or IDistributedCache separately, now is the time to migrate.
Remember: the best cache is the one you don't need. Before adding cache, ask: is this data actually read-heavy? Can the query be optimized directly? Caching should only be the solution after you've optimized at the data layer.
12. References
- HybridCache library in ASP.NET Core — Microsoft Learn
- Hello HybridCache! Streamlining Cache Management — .NET Blog
- Caching in ASP.NET Core: Improving Application Performance — Milan Jovanović
- Distributed Caching in ASP.NET Core with Redis — CodeWithMukesh
- Caching Layers Explained: Browser, CDN, and App Caching — Technori
- Ultimate Guide to Caching 2026: Strategies and Best Practices — DragonflyDB
- Caching Strategies and How to Choose the Right One — CodeAhoy
- What is Caching and How it Works — AWS
Disclaimer: The opinions expressed in this blog are solely my own and do not reflect the views or opinions of my employer or any affiliated organizations. The content provided is for informational and educational purposes only and should not be taken as professional advice. While I strive to provide accurate and up-to-date information, I make no warranties or guarantees about the completeness, reliability, or accuracy of the content. Readers are encouraged to verify the information and seek independent advice as needed. I disclaim any liability for decisions or actions taken based on the content of this blog.