Redis 8 and Caching Patterns 2026 — I/O Threading, Vector Set, and High-Performance Cache Strategies
Posted on: 4/17/2026 9:14:01 AM
Table of contents
- 1. Redis 8 — The biggest update in Redis history
- 2. Redis 8.6 (March 2026) — Deeper optimizations
- 3. Caching Patterns — Pick the right strategy
- 4. Comparing the caching patterns
- 5. Redis vs Valkey 2026 — The license-fork landscape
- 6. Designing a production cache layer
- 7. Cache anti-patterns to avoid
- 8. Conclusion
In the backend world, Redis is no longer just a simple key-value store. With Redis 8 (GA in 2025, updated through 8.6 in March 2026), Redis has become a general-purpose data platform: JSON, Search, TimeSeries, Vector Set, and Bloom Filter are integrated directly into the core, alongside a new I/O Threading architecture that delivers up to 5× the throughput of Redis 7.2 on pipelined workloads. This article dives into the Redis 8 architecture, modern caching patterns, and how to pick the right cache strategy for production.
1. Redis 8 — The biggest update in Redis history
Redis 8 isn't a routine upgrade — it's a full ecosystem consolidation. Previously, modules like RediSearch, RedisJSON, RedisTimeSeries, and RedisBloom had to be installed separately via MODULE LOAD. Since Redis 8, they're all integrated into a single binary called Redis Open Source.
1.1. Eight new built-in data types
Redis 8 adds 8 data types directly into the core — no separate module install required:
| Data type | Predecessor (module) | Primary use |
|---|---|---|
| JSON | RedisJSON | Store and query JSON documents with JSONPath |
| Time Series | RedisTimeSeries | Metrics, IoT sensor data, monitoring |
| Vector Set (preview) | Brand new | Vector similarity search for AI/ML |
| Bloom Filter | RedisBloom | Probabilistic set membership checks |
| Cuckoo Filter | RedisBloom | Similar to Bloom but supports delete |
| Top-K | RedisBloom | Tracking the top-K most frequent elements |
| Count-Min Sketch | RedisBloom | Frequency estimation |
| T-Digest | RedisBloom | Percentile estimation (p99, p95) |
# Redis 8 - JSON is a first-class data type
127.0.0.1:6379> JSON.SET user:1001 $ '{"name":"Anh Tu","role":"engineer","skills":["redis","dotnet"]}'
OK
# JSONPath queries directly
127.0.0.1:6379> JSON.GET user:1001 $.skills[0]
"[\"redis\"]"
# Vector Set for AI/Semantic Search (preview in 8.x)
127.0.0.1:6379> VADD products REDUCE 2 VALUES 3 0.12 0.87 0.34 "laptop-gaming"
(integer) 1
# Bloom Filter checking if an email is already registered
127.0.0.1:6379> BF.ADD emails:registered "user@example.com"
(integer) 1
127.0.0.1:6379> BF.EXISTS emails:registered "user@example.com"
(integer) 1
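To build intuition for why BF.EXISTS can only answer "definitely not" or "probably yes", here is a toy Bloom filter in Python (illustrative only: the bit-array size, hash count, and SHA-256-based hashing are arbitrary choices here, and Redis's native implementation is far more optimized):

```python
import hashlib

class ToyBloomFilter:
    """A minimal Bloom filter: k hash functions set k bits per item."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive k bit positions from salted SHA-256 digests
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # Any clear bit proves absence; all bits set means "probably present"
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

bf = ToyBloomFilter()
bf.add("user@example.com")
print(bf.might_contain("user@example.com"))   # True
print(bf.might_contain("other@example.com"))  # almost certainly False
```

The asymmetry is the whole trick: a clear bit proves absence, while set bits can collide, so presence is only probable. In Redis you control the target false-positive rate up front with BF.RESERVE.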
1.2. I/O Threading — A performance leap
One of Redis 8's most important architectural changes is the reworked I/O threading. Redis is famous for its single-threaded model, where all command processing runs on one CPU core (Redis 6 added optional I/O threads, but they were limited and off by default). Since Redis 8, network reads, writes, and protocol parsing are spread across multiple cores, while data manipulation remains single-threaded to preserve consistency.
graph LR
C1["Client 1"] --> IO1["I/O Thread 1"]
C2["Client 2"] --> IO2["I/O Thread 2"]
C3["Client 3"] --> IO3["I/O Thread 3"]
C4["Client N"] --> IO4["I/O Thread N"]
IO1 --> MAIN["Main Thread<br/>(Data Processing)"]
IO2 --> MAIN
IO3 --> MAIN
IO4 --> MAIN
MAIN --> IO1
MAIN --> IO2
MAIN --> IO3
MAIN --> IO4
style MAIN fill:#e94560,stroke:#fff,color:#fff
style IO1 fill:#16213e,stroke:#e94560,color:#fff
style IO2 fill:#16213e,stroke:#e94560,color:#fff
style IO3 fill:#16213e,stroke:#e94560,color:#fff
style IO4 fill:#16213e,stroke:#e94560,color:#fff
style C1 fill:#0f3460,stroke:#e94560,color:#fff
style C2 fill:#0f3460,stroke:#e94560,color:#fff
style C3 fill:#0f3460,stroke:#e94560,color:#fff
style C4 fill:#0f3460,stroke:#e94560,color:#fff
Figure 1: Redis 8's I/O Threading — multi-threaded network I/O with single-threaded data processing
# redis.conf — enable I/O Threading
io-threads 8 # I/O thread count (keep at or below the number of physical cores)
io-threads-do-reads yes # Let I/O threads handle reads as well as writes
# Benchmark results on an 8-core CPU:
# Redis 7.2: ~650K ops/sec
# Redis 8.0: ~1.38M ops/sec (+112%)
# Redis 8.6: ~3.5M ops/sec (with pipelining)
When should you enable I/O Threading?
I/O Threading shines when the system has many concurrent connections (thousands of clients) and the workload is mostly simple commands (GET/SET). For complex workloads (long Lua scripts, SORT on huge datasets), the bottleneck is on the main thread so I/O threading matters less.
1.3. Dual-Stream Replication
Redis 8 improves replication with dual-stream — two replication streams running concurrently: one for the snapshot and one for changes happening during the transfer. Result: replication time drops 18%, peak buffer size drops 35%.
sequenceDiagram
participant P as Primary
participant R as Replica
P->>R: Stream 1: RDB Snapshot
P->>R: Stream 2: Write Buffer (in parallel)
Note over P,R: Both streams run concurrently; the buffer doesn't wait for the snapshot to finish
R->>R: Apply snapshot
R->>R: Apply buffered writes
R-->>P: Replication complete
Figure 2: Dual-Stream Replication — parallelizing primary-replica sync
2. Redis 8.6 (March 2026) — Deeper optimizations
The latest Redis 8.6 release focuses on raw performance and introduces new eviction policies suited to semantic caching:
2.1. LRM Eviction — A new policy for AI caching
Redis 8.6 introduces two new eviction policies: volatile-lrm and allkeys-lrm (Least Recently Modified). Unlike LRU (Least Recently Used) where every read refreshes the timestamp, LRM only considers the last write time.
Why does LRM matter for AI/Semantic Caching?
In AI systems, embedding vectors and cached inference results are usually read extremely often but rarely updated. Under LRU, such keys would never be evicted (they keep getting read). LRM solves this by evicting based on write time — keys that haven't been updated for a while get evicted first, regardless of read frequency.
# redis.conf — use LRM for a mixed workload
maxmemory 8gb
maxmemory-policy allkeys-lrm
# Works well when you have:
# - Short-lived cache keys (sessions, API responses)
# → frequent writes → LRM keeps them
# - Long-lived semantic cache (embeddings, ML outputs)
# → few writes, many reads → LRM will evict when needed
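The behavioral difference is easy to see in a toy simulation (Python; the TinyCache class and exact-ordering eviction are illustrative simplifications, since real Redis approximates eviction by sampling keys):

```python
class TinyCache:
    """Toy cache tracking last-read and last-write times per key."""

    def __init__(self):
        self.data = {}  # key -> (value, last_read, last_write)

    def set(self, key, value, now):
        self.data[key] = (value, now, now)

    def get(self, key, now):
        value, _, written = self.data[key]
        self.data[key] = (value, now, written)  # a read refreshes read-time only
        return value

def eviction_victim(cache, policy):
    # LRU evicts the least recently READ key; LRM the least recently WRITTEN one
    idx = 1 if policy == "lru" else 2
    return min(cache.data, key=lambda k: cache.data[k][idx])

cache = TinyCache()
cache.set("embedding:42", "<vector>", now=0)  # written once at t=0
cache.set("session:abc", "<state>", now=1)    # written more recently, at t=1
for t in range(2, 10):
    cache.get("embedding:42", now=t)          # hot reads keep the LRU clock fresh

print(eviction_victim(cache, "lru"))  # session:abc (reads protected the embedding)
print(eviction_victim(cache, "lrm"))  # embedding:42 (stale write time loses)
```

Under LRU the constantly-read embedding can never become the victim; under LRM its old write timestamp makes it the first to go, which is exactly the behavior the semantic-caching scenario above calls for.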
2.2. Redis 8.6 benchmarks
| Metric | Redis 8.4 | Redis 8.6 vs 8.4 | Optimization |
|---|---|---|---|
| Vector Set insertion | baseline | +43% | Vectorized quantized distance |
| Vector Set querying | baseline | +58% | Binary quantization (Intel/AMD AVX) |
| Sorted Set latency | baseline | -35% | Prefetching optimization |
| GET (short strings) | baseline | -15% latency | ACL verification optimization |
| Hash memory usage | baseline | -17% | Encoding optimization |
| Sorted Set memory | baseline | -31% | Compact representation |
3. Caching Patterns — Pick the right strategy
Using Redis effectively isn't just about server configuration — it's about picking the right caching pattern. Each one has its own trade-offs and best-fit workloads.
3.1. Cache-Aside (Lazy Loading)
The most common pattern. The application proactively checks the cache first; on miss, it reads from the database and populates the cache.
graph LR
APP["Application"] -->|"1. GET key"| REDIS["Redis Cache"]
REDIS -->|"2a. Cache HIT"| APP
REDIS -->|"2b. Cache MISS"| APP
APP -->|"3. Query DB"| DB["Database"]
DB -->|"4. Return data"| APP
APP -->|"5. SET key (TTL)"| REDIS
style REDIS fill:#e94560,stroke:#fff,color:#fff
style DB fill:#0f3460,stroke:#e94560,color:#fff
style APP fill:#16213e,stroke:#e94560,color:#fff
Figure 3: Cache-Aside pattern — the application manages both the cache and the database
// C# (.NET) — Cache-Aside with StackExchange.Redis
public async Task<Product?> GetProductAsync(int productId)
{
var cacheKey = $"product:{productId}";
var db = _redis.GetDatabase();
// 1. Check the cache
var cached = await db.StringGetAsync(cacheKey);
if (cached.HasValue)
return JsonSerializer.Deserialize<Product>(cached!);
// 2. Cache miss → read from the database
var product = await _dbContext.Products.FindAsync(productId);
if (product is null) return null;
// 3. Populate the cache with a TTL
await db.StringSetAsync(
cacheKey,
JsonSerializer.Serialize(product),
expiry: TimeSpan.FromMinutes(15)
);
return product;
}
Cache-Aside strengths
Only caches data that's actually requested (demand-driven). If Redis is down, the app still works by reading directly from the DB. This pattern fits read-heavy workloads where the read/write ratio is > 10:1.
3.2. Write-Through
On every DB write, also write to the cache. Keeps cache in sync with the database but pays for it with higher write latency.
// Write-Through Pattern
public async Task UpdateProductAsync(Product product)
{
// Write to the database first, then the cache
_dbContext.Products.Update(product);
await _dbContext.SaveChangesAsync();
// Update the cache immediately
var cacheKey = $"product:{product.Id}";
var db = _redis.GetDatabase();
await db.StringSetAsync(
cacheKey,
JsonSerializer.Serialize(product),
expiry: TimeSpan.FromHours(1)
);
}
3.3. Write-Behind (Write-Back)
The application writes only to Redis; Redis asynchronously flushes to the database. Extremely high write throughput but with a risk of data loss if Redis crashes before flushing.
graph LR
APP["Application"] -->|"1. Write"| REDIS["Redis Cache"]
REDIS -->|"2. Immediate ACK"| APP
REDIS -->|"3. Async flush"| DB["Database"]
WORKER["Background Worker"] -->|"4. Batch write"| DB
REDIS -->|"Queue"| WORKER
style REDIS fill:#e94560,stroke:#fff,color:#fff
style DB fill:#0f3460,stroke:#e94560,color:#fff
style APP fill:#16213e,stroke:#e94560,color:#fff
style WORKER fill:#533483,stroke:#e94560,color:#fff
Figure 4: Write-Behind pattern — fast writes to the cache, async to the database
// Write-Behind using Redis Stream as a buffer
public async Task RecordPageViewAsync(string pageId, string userId)
{
var db = _redis.GetDatabase();
// Write to the Redis Stream (very fast, ~0.1 ms)
await db.StreamAddAsync(
"pageviews:buffer",
new NameValueEntry[]
{
new("pageId", pageId),
new("userId", userId),
new("timestamp", DateTimeOffset.UtcNow.ToUnixTimeMilliseconds().ToString())
},
maxLength: 100_000 // Cap buffer size
);
}
// Background worker flushes to the database in batches
public async Task FlushPageViewsAsync()
{
var db = _redis.GetDatabase();
var entries = await db.StreamReadAsync("pageviews:buffer", "0-0", count: 500);
if (entries.Length == 0) return;
// Batch-insert into the database
var records = entries.Select(e => new PageView
{
PageId = e["pageId"],
UserId = e["userId"],
Timestamp = DateTimeOffset.FromUnixTimeMilliseconds(long.Parse(e["timestamp"]))
});
await _dbContext.PageViews.AddRangeAsync(records);
await _dbContext.SaveChangesAsync();
// Remove all processed entries in a single batch call
await db.StreamDeleteAsync("pageviews:buffer", entries.Select(e => e.Id).ToArray());
}
3.4. Read-Through
The application only talks to the cache. The cache automatically reads from the database on miss. Simplifies application code but requires a "smarter" cache layer.
3.5. Proactive Cache Refresh — The 2026 trend
Instead of waiting for a user request to populate the cache (reactive), the system proactively refreshes entries before they expire. Hot keys almost never miss, and the stale-data window shrinks to the refresh interval.
// Proactive Refresh via a background timer
public class CacheRefreshService : BackgroundService
{
protected override async Task ExecuteAsync(CancellationToken ct)
{
while (!ct.IsCancellationRequested)
{
var db = _redis.GetDatabase();
// Scan for keys about to expire (TTL < 2 minutes).
// NOTE: SCAN-based enumeration is O(keyspace); fine for modest key counts,
// but track hot keys in a dedicated set for large ones
var server = _redis.GetServer(_redis.GetEndPoints()[0]);
foreach (var key in server.Keys(pattern: "product:*"))
{
var ttl = await db.KeyTimeToLiveAsync(key);
if (ttl.HasValue && ttl.Value < TimeSpan.FromMinutes(2))
{
// Proactively refresh before expiration
var productId = int.Parse(key.ToString().Split(':')[1]);
var product = await _dbContext.Products.FindAsync(productId);
if (product != null)
{
await db.StringSetAsync(
key,
JsonSerializer.Serialize(product),
expiry: TimeSpan.FromMinutes(15)
);
}
}
}
await Task.Delay(TimeSpan.FromSeconds(30), ct);
}
}
}
4. Comparing the caching patterns
| Pattern | Consistency | Read latency | Write latency | Best for |
|---|---|---|---|---|
| Cache-Aside | Eventual | Hit: very low / miss: high | Normal | Read-heavy, tolerant of stale data |
| Write-Through | Strong | Always low | High (writes DB + cache) | Consistency-critical workloads |
| Write-Behind | Eventual | Always low | Very low | Write-heavy (analytics, logging) |
| Read-Through | Eventual | Similar to Cache-Aside | Normal | Simplifying application code |
| Proactive Refresh | Near real-time | Always low (no miss) | Normal + background | Hot data, UX-critical paths |
5. Redis vs Valkey 2026 — The license-fork landscape
In March 2024, Redis Ltd. changed the license from BSD-3-Clause to dual RSALv2/SSPLv1. Major cloud providers immediately forked Redis 7.2 into Valkey under the Linux Foundation, keeping the BSD-3-Clause license (Redis 8.0 later added AGPLv3 as a third licensing option). By 2026 the two projects have diverged noticeably:
| Criterion | Redis 8.x | Valkey 8.x |
|---|---|---|
| License | RSALv2 / SSPLv1 / AGPLv3 (tri-licensed) | BSD-3-Clause (fully open) |
| Module integration | JSON, Search, TimeSeries, Bloom, Vector Set in core | None — focused on core data types |
| I/O Threading | Yes (io-threads config) | Yes (enhanced by AWS engineers) |
| Performance (SET) | ~729K RPS (8.0) | ~1M RPS (8.1) |
| Direction | All-in-one data platform (AI, search, analytics) | Lean core, clustering, Rust integration |
| Cloud support | AWS ElastiCache, GCP Memorystore, Azure Cache | AWS ElastiCache / MemoryDB (Valkey engine), GCP Memorystore |
| Pick when | You need integrated JSON/Search/TimeSeries/Vector | You need a BSD license (SaaS, managed service) |
License notes
If you're building a SaaS or managed service that offers Redis-as-a-service, Redis 8's licensing (source-available RSALv2/SSPLv1, or copyleft AGPLv3) may pose legal concerns. In that case, Valkey (BSD-3-Clause) is the safer choice. If Redis is only an internal component, the license is unlikely to affect you.
6. Designing a production cache layer
Below is a multi-tier cache architecture combining several patterns — fit for production systems handling millions of requests:
graph TB
CLIENT["Client Request"] --> LB["Load Balancer"]
LB --> APP["Application Server"]
APP --> L1["L1: In-Memory Cache<br/>(IMemoryCache, 30s TTL)"]
L1 -->|"Miss"| L2["L2: Redis Cache<br/>(Distributed, 15 min TTL)"]
L2 -->|"Miss"| DB["Database<br/>(SQL Server / PostgreSQL)"]
DB -->|"Populate"| L2
L2 -->|"Populate"| L1
WORKER["Background Worker"] -->|"Proactive Refresh"| L2
WORKER -->|"Invalidate"| L1
CDC["CDC / Event Bus"] -->|"Data Changed"| WORKER
DB -->|"Change Event"| CDC
style L1 fill:#533483,stroke:#e94560,color:#fff
style L2 fill:#e94560,stroke:#fff,color:#fff
style DB fill:#0f3460,stroke:#e94560,color:#fff
style WORKER fill:#16213e,stroke:#e94560,color:#fff
style CDC fill:#16213e,stroke:#e94560,color:#fff
style APP fill:#16213e,stroke:#e94560,color:#fff
Figure 5: Multi-tier cache architecture with L1 in-memory, L2 Redis, and proactive refresh via CDC
// Multi-Level Cache Service in .NET
public class MultiLevelCacheService
{
private readonly IMemoryCache _l1;
private readonly IDatabase _l2; // Redis
public async Task<T?> GetAsync<T>(string key, Func<Task<T?>> factory)
{
// L1: In-Memory (ultra-fast, ~0.001 ms)
if (_l1.TryGetValue(key, out T? l1Value))
return l1Value;
// L2: Redis (fast, ~0.5 ms)
var l2Value = await _l2.StringGetAsync(key);
if (l2Value.HasValue)
{
var result = JsonSerializer.Deserialize<T>(l2Value!);
_l1.Set(key, result, TimeSpan.FromSeconds(30));
return result;
}
// L3: Database (slow, ~5-50 ms)
var dbValue = await factory();
if (dbValue is not null)
{
var json = JsonSerializer.Serialize(dbValue);
await _l2.StringSetAsync(key, json, TimeSpan.FromMinutes(15));
_l1.Set(key, dbValue, TimeSpan.FromSeconds(30));
}
return dbValue;
}
}
7. Cache anti-patterns to avoid
7.1. Thundering Herd (Cache Stampede)
When a hot cache key expires, thousands of concurrent requests miss simultaneously and pile on the database. Fix with a distributed lock or probabilistic early expiration.
// Distributed lock to prevent Thundering Herd
public async Task<T?> GetWithLockAsync<T>(string key, Func<Task<T?>> factory)
{
var db = _redis.GetDatabase();
var cached = await db.StringGetAsync(key);
if (cached.HasValue)
return JsonSerializer.Deserialize<T>(cached!);
var lockKey = $"lock:{key}";
var lockToken = Guid.NewGuid().ToString(); // unique token proves lock ownership
var lockAcquired = await db.StringSetAsync(
lockKey, lockToken, TimeSpan.FromSeconds(10), When.NotExists);
if (lockAcquired)
{
try
{
var value = await factory();
if (value is not null)
{
await db.StringSetAsync(key,
JsonSerializer.Serialize(value),
TimeSpan.FromMinutes(15));
}
return value;
}
finally
{
// Release only if we still hold the lock; a plain DEL could delete
// another caller's lock if ours expired mid-factory
await db.LockReleaseAsync(lockKey, lockToken);
}
}
// Other requests wait and retry
await Task.Delay(100);
return await GetWithLockAsync<T>(key, factory);
}
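The other stampede fix mentioned above, probabilistic early expiration, avoids locking entirely: each reader refreshes the value slightly before expiry, with a probability that rises as the deadline approaches (the XFetch algorithm). A sketch in Python; the function name and parameters are illustrative:

```python
import math
import random
import time

def should_refresh_early(expiry_ts, recompute_cost, beta=1.0, now=None):
    """XFetch-style check: refresh when now - cost * beta * ln(u) >= expiry.

    expiry_ts      - when the cached value expires (same clock as `now`)
    recompute_cost - how long the last recompute took, in seconds
    beta           - > 1 refreshes earlier, < 1 later
    """
    now = time.monotonic() if now is None else now
    # -ln(u) is exponentially distributed, so early refreshes spread out
    u = 1.0 - random.random()  # in (0, 1], avoids log(0)
    return now - recompute_cost * beta * math.log(u) >= expiry_ts

# Far from expiry, early refreshes are rare; near expiry, very likely.
random.seed(42)
far = sum(should_refresh_early(expiry_ts=100, recompute_cost=1, now=10)
          for _ in range(1000))
near = sum(should_refresh_early(expiry_ts=100, recompute_cost=1, now=99.5)
           for _ in range(1000))
print(far, near)  # far stays at 0, near is in the hundreds
```

Because the exponential term almost never fires far from expiry but usually fires just before it, only a handful of readers recompute early, and the hard-expiry stampede never happens.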
7.2. Cache Penetration
Requests repeatedly query a key that exists in neither cache nor database (often from attackers). Fix by caching null values or using a Bloom Filter (now built into Redis 8).
// Use a Bloom Filter (Redis 8 native) to stop Cache Penetration
public async Task<Product?> GetProductSafeAsync(int productId)
{
var db = _redis.GetDatabase();
// Check the Bloom Filter first — O(1), extremely fast
var exists = (bool)await db.ExecuteAsync("BF.EXISTS", "products:bf", productId.ToString());
if (!exists)
return null; // Definitely doesn't exist, skip the DB query
// Bloom Filter says "probably exists" → check cache + DB
return await GetProductAsync(productId);
}
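The other mitigation, caching the null result, deserves a sketch too; the key detail is giving misses a much shorter TTL than real values, so a later legitimate insert becomes visible quickly (Python; the sentinel value and TTL numbers are illustrative choices):

```python
import time

NULL_SENTINEL = "__NULL__"  # distinguishes "cached miss" from "not cached"

class NegativeCache:
    def __init__(self, hit_ttl=900, miss_ttl=30):
        self._store = {}  # key -> (value, expiry)
        self._hit_ttl, self._miss_ttl = hit_ttl, miss_ttl

    def get(self, key, loader):
        entry = self._store.get(key)
        if entry and entry[1] > time.monotonic():
            value = entry[0]
            return None if value == NULL_SENTINEL else value
        value = loader(key)
        if value is None:
            # Cache the miss briefly so repeated bogus keys skip the DB
            self._store[key] = (NULL_SENTINEL, time.monotonic() + self._miss_ttl)
        else:
            self._store[key] = (value, time.monotonic() + self._hit_ttl)
        return value

db_queries = []

def loader(key):
    db_queries.append(key)
    return None  # key doesn't exist in the database

cache = NegativeCache()
for _ in range(1000):
    cache.get("product:999999", loader)  # attacker hammers a bogus key
print(len(db_queries))  # 1: only the first request reached the database
```

A Bloom filter and negative caching combine well: the filter blocks keys that were never inserted, while the short-TTL null entry absorbs keys the filter can't rule out.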
7.3. The Hot Key problem
One key is accessed so often it bottlenecks a single Redis node. Fix by replicating the hot key across nodes or using an L1 in-memory cache on the application side.
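A sketch of the hot-key replication fix: write N copies under suffixed keys and let each reader pick one at random, so in a Redis Cluster the suffixes hash to different slots and read load spreads across nodes (Python against a plain dict as a stand-in for Redis; the key scheme and N=8 are illustrative):

```python
import random

NUM_REPLICAS = 8  # tune to cluster size / observed load

def replica_keys(hot_key: str) -> list[str]:
    return [f"{hot_key}:rep:{i}" for i in range(NUM_REPLICAS)]

def write_hot_key(store: dict, hot_key: str, value) -> None:
    # The writer fans out to every replica (writes are rare for hot keys)
    for key in replica_keys(hot_key):
        store[key] = value

def read_hot_key(store: dict, hot_key: str):
    # Each reader picks one replica, so no single node takes all the traffic
    return store[random.choice(replica_keys(hot_key))]

store = {}  # stand-in for Redis; in a cluster each key hashes to a node
write_hot_key(store, "product:top-seller", {"name": "Keyboard"})
print(read_hot_key(store, "product:top-seller"))
```

The trade-off: writes fan out N times, and all replicas must be updated (or expire) together to stay consistent, so this suits hot keys that are read-mostly.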
8. Conclusion
Redis 8 marks the evolution from a key-value store into a unified data platform with JSON, Search, TimeSeries, Vector Set, and Bloom Filter baked in. The new I/O Threading architecture delivers a throughput leap, while dual-stream replication and the LRM eviction policy open new possibilities for AI/semantic caching.
On caching patterns, there's no "silver bullet" — each pattern fits a specific workload. Cache-Aside remains the safe default for read-heavy systems, Write-Behind fits write-heavy analytics, and Proactive Refresh is ideal for hot paths that need smooth UX. Combining multi-tier caching (L1 in-memory + L2 Redis) with stampede-prevention techniques is the proven recipe for million-request systems.
Finally, the Redis/Valkey split forces teams to make explicit licensing decisions. Evaluate your need for module integration (JSON, Search, Vector) and licensing requirements before picking a direction.