Redis 8 and Caching Patterns 2026 — I/O Threading, Vector Set, and High-Performance Cache Strategies
Posted on: 4/17/2026 9:14:01 AM
Table of contents
- 1. Redis 8 — The biggest update in Redis history
- 2. Redis 8.6 (March 2026) — Deeper optimizations
- 3. Caching Patterns — Pick the right strategy
- 4. Comparing the caching patterns
- 5. Redis vs Valkey 2026 — The license-fork landscape
- 6. Designing a production cache layer
- 7. Cache anti-patterns to avoid
- 8. Conclusion
In the backend world, Redis is no longer just a simple key-value store. With Redis 8 (GA in 2025, updated through 8.6 in March 2026), Redis has become a general-purpose data platform: JSON, Search, TimeSeries, Vector Set, and Bloom Filter are integrated directly into the core, alongside a new I/O Threading architecture that delivers up to 5× the throughput of Redis 7.2 on pipelined workloads. This article dives into the Redis 8 architecture, modern caching patterns, and how to pick the right cache strategy for production.
1. Redis 8 — The biggest update in Redis history
Redis 8 isn't a routine upgrade — it's a full ecosystem consolidation. Previously, modules like RediSearch, RedisJSON, RedisTimeSeries, and RedisBloom had to be installed separately via MODULE LOAD. Since Redis 8, they're all integrated into a single binary called Redis Open Source.
1.1. Eight new built-in data types
Redis 8 adds 8 data types directly into the core — no separate module install required:
| Data type | Predecessor (module) | Primary use |
|---|---|---|
| JSON | RedisJSON | Store and query JSON documents with JSONPath |
| Time Series | RedisTimeSeries | Metrics, IoT sensor data, monitoring |
| Vector Set (preview) | Brand new | Vector similarity search for AI/ML |
| Bloom Filter | RedisBloom | Probabilistic set membership checks |
| Cuckoo Filter | RedisBloom | Similar to Bloom but supports delete |
| Top-K | RedisBloom | Tracking the top-K most frequent elements |
| Count-Min Sketch | RedisBloom | Frequency estimation |
| T-Digest | RedisBloom | Percentile estimation (p99, p95) |
# Redis 8 - JSON is a first-class data type
127.0.0.1:6379> JSON.SET user:1001 $ '{"name":"Anh Tu","role":"engineer","skills":["redis","dotnet"]}'
OK
# JSONPath queries directly
127.0.0.1:6379> JSON.GET user:1001 $.skills[0]
"[\"redis\"]"
# Vector Set for AI/Semantic Search (preview in 8.x)
127.0.0.1:6379> VADD products REDUCE 2 VALUES 3 0.12 0.87 0.34 "laptop-gaming"
(integer) 1
# Bloom Filter checking if an email is already registered
127.0.0.1:6379> BF.ADD emails:registered "user@example.com"
(integer) 1
127.0.0.1:6379> BF.EXISTS emails:registered "user@example.com"
(integer) 1
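To build intuition for why BF.EXISTS can only answer "definitely not" or "probably yes", here is a toy Bloom filter in Python (illustrative only: the bit-array size, hash count, and SHA-256-based hashing are arbitrary choices here, and Redis's native implementation is far more optimized):

```python
import hashlib

class ToyBloomFilter:
    """A minimal Bloom filter: k hash functions set k bits per item."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive k bit positions from salted SHA-256 digests
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # Any clear bit proves absence; all bits set means "probably present"
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

bf = ToyBloomFilter()
bf.add("user@example.com")
print(bf.might_contain("user@example.com"))   # True
print(bf.might_contain("other@example.com"))  # almost certainly False
```

The asymmetry is the whole trick: a clear bit proves absence, while set bits can collide, so presence is only probable. In Redis you control the target false-positive rate up front with BF.RESERVE.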
1.2. I/O Threading — A performance leap
One of Redis 8's most important architectural changes is the reworked I/O threading. Redis is famous for its single-threaded model, where all command processing runs on one CPU core (Redis 6 added optional I/O threads, but they were limited and off by default). Since Redis 8, network reads, writes, and protocol parsing are spread across multiple cores, while data manipulation remains single-threaded to preserve consistency.
graph LR
C1["Client 1"] --> IO1["I/O Thread 1"]
C2["Client 2"] --> IO2["I/O Thread 2"]
C3["Client 3"] --> IO3["I/O Thread 3"]
C4["Client N"] --> IO4["I/O Thread N"]
IO1 --> MAIN["Main Thread<br/>(Data Processing)"]
IO2 --> MAIN
IO3 --> MAIN
IO4 --> MAIN
MAIN --> IO1
MAIN --> IO2
MAIN --> IO3
MAIN --> IO4
style MAIN fill:#e94560,stroke:#fff,color:#fff
style IO1 fill:#16213e,stroke:#e94560,color:#fff
style IO2 fill:#16213e,stroke:#e94560,color:#fff
style IO3 fill:#16213e,stroke:#e94560,color:#fff
style IO4 fill:#16213e,stroke:#e94560,color:#fff
style C1 fill:#0f3460,stroke:#e94560,color:#fff
style C2 fill:#0f3460,stroke:#e94560,color:#fff
style C3 fill:#0f3460,stroke:#e94560,color:#fff
style C4 fill:#0f3460,stroke:#e94560,color:#fff
Figure 1: Redis 8's I/O Threading — multi-threaded network I/O with single-threaded data processing
# redis.conf — enable I/O Threading
io-threads 8 # I/O thread count (keep at or below the number of physical cores)
io-threads-do-reads yes # Let I/O threads handle reads as well as writes
# Benchmark results on an 8-core CPU:
# Redis 7.2: ~650K ops/sec
# Redis 8.0: ~1.38M ops/sec (+112%)
# Redis 8.6: ~3.5M ops/sec (with pipelining)
When should you enable I/O Threading?
I/O Threading shines when the system has many concurrent connections (thousands of clients) and the workload is mostly simple commands (GET/SET). For complex workloads (long Lua scripts, SORT on huge datasets), the bottleneck is on the main thread so I/O threading matters less.
1.3. Dual-Stream Replication
Redis 8 improves replication with dual-stream — two replication streams running concurrently: one for the snapshot and one for changes happening during the transfer. Result: replication time drops 18%, peak buffer size drops 35%.
sequenceDiagram
participant P as Primary
participant R as Replica
P->>R: Stream 1: RDB Snapshot
P->>R: Stream 2: Write Buffer (in parallel)
Note over P,R: Both streams run concurrently; the buffer doesn't wait for the snapshot to finish
R->>R: Apply snapshot
R->>R: Apply buffered writes
R-->>P: Replication complete
Figure 2: Dual-Stream Replication — parallelizing primary-replica sync
2. Redis 8.6 (March 2026) — Deeper optimizations
The latest Redis 8.6 release focuses on raw performance and introduces new eviction policies suited to semantic caching:
2.1. LRM Eviction — A new policy for AI caching
Redis 8.6 introduces two new eviction policies: volatile-lrm and allkeys-lrm (Least Recently Modified). Unlike LRU (Least Recently Used) where every read refreshes the timestamp, LRM only considers the last write time.
Why does LRM matter for AI/Semantic Caching?
In AI systems, embedding vectors and cached inference results are usually read extremely often but rarely updated. Under LRU, such keys would never be evicted (they keep getting read). LRM solves this by evicting based on write time — keys that haven't been updated for a while get evicted first, regardless of read frequency.
# redis.conf — use LRM for a mixed workload
maxmemory 8gb
maxmemory-policy allkeys-lrm
# Works well when you have:
# - Short-lived cache keys (sessions, API responses)
# → frequent writes → LRM keeps them
# - Long-lived semantic cache (embeddings, ML outputs)
# → few writes, many reads → LRM will evict when needed
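The behavioral difference is easy to see in a toy simulation (Python; the TinyCache class and exact-ordering eviction are illustrative simplifications, since real Redis approximates eviction by sampling keys):

```python
class TinyCache:
    """Toy cache tracking last-read and last-write times per key."""

    def __init__(self):
        self.data = {}  # key -> (value, last_read, last_write)

    def set(self, key, value, now):
        self.data[key] = (value, now, now)

    def get(self, key, now):
        value, _, written = self.data[key]
        self.data[key] = (value, now, written)  # a read refreshes read-time only
        return value

def eviction_victim(cache, policy):
    # LRU evicts the least recently READ key; LRM the least recently WRITTEN one
    idx = 1 if policy == "lru" else 2
    return min(cache.data, key=lambda k: cache.data[k][idx])

cache = TinyCache()
cache.set("embedding:42", "<vector>", now=0)  # written once at t=0
cache.set("session:abc", "<state>", now=1)    # written more recently, at t=1
for t in range(2, 10):
    cache.get("embedding:42", now=t)          # hot reads keep the LRU clock fresh

print(eviction_victim(cache, "lru"))  # session:abc (reads protected the embedding)
print(eviction_victim(cache, "lrm"))  # embedding:42 (stale write time loses)
```

Under LRU the constantly-read embedding can never become the victim; under LRM its old write timestamp makes it the first to go, which is exactly the behavior the semantic-caching scenario above calls for.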
2.2. Redis 8.6 benchmarks
| Metric | Redis 8.4 | Redis 8.6 vs 8.4 | Optimization |
|---|---|---|---|
| Vector Set insertion | baseline | +43% | Vectorized quantized distance |
| Vector Set querying | baseline | +58% | Binary quantization (Intel/AMD AVX) |
| Sorted Set latency | baseline | -35% | Prefetching optimization |
| GET (short strings) | baseline | -15% latency | ACL verification optimization |
| Hash memory usage | baseline | -17% | Encoding optimization |
| Sorted Set memory | baseline | -31% | Compact representation |
3. Caching Patterns — Pick the right strategy
Using Redis effectively isn't just about server configuration — it's about picking the right caching pattern. Each one has its own trade-offs and best-fit workloads.
3.1. Cache-Aside (Lazy Loading)
The most common pattern. The application proactively checks the cache first; on miss, it reads from the database and populates the cache.
graph LR
APP["Application"] -->|"1. GET key"| REDIS["Redis Cache"]
REDIS -->|"2a. Cache HIT"| APP
REDIS -->|"2b. Cache MISS"| APP
APP -->|"3. Query DB"| DB["Database"]
DB -->|"4. Return data"| APP
APP -->|"5. SET key (TTL)"| REDIS
style REDIS fill:#e94560,stroke:#fff,color:#fff
style DB fill:#0f3460,stroke:#e94560,color:#fff
style APP fill:#16213e,stroke:#e94560,color:#fff
Figure 3: Cache-Aside pattern — the application manages both the cache and the database
// C# (.NET) — Cache-Aside with StackExchange.Redis
public async Task<Product?> GetProductAsync(int productId)
{
var cacheKey = $"product:{productId}";
var db = _redis.GetDatabase();
// 1. Check the cache
var cached = await db.StringGetAsync(cacheKey);
if (cached.HasValue)
return JsonSerializer.Deserialize<Product>(cached!);
// 2. Cache miss → read from the database
var product = await _dbContext.Products.FindAsync(productId);
if (product is null) return null;
// 3. Populate the cache with a TTL
await db.StringSetAsync(
cacheKey,
JsonSerializer.Serialize(product),
expiry: TimeSpan.FromMinutes(15)
);
return product;
}
Cache-Aside strengths
Only caches data that's actually requested (demand-driven). If Redis is down, the app still works by reading directly from the DB. This pattern fits read-heavy workloads where the read/write ratio is > 10:1.
3.2. Write-Through
On every DB write, also write to the cache. Keeps cache in sync with the database but pays for it with higher write latency.
// Write-Through Pattern
public async Task UpdateProductAsync(Product product)
{
// Write to the database first, then the cache
_dbContext.Products.Update(product);
await _dbContext.SaveChangesAsync();
// Update the cache immediately
var cacheKey = $"product:{product.Id}";
var db = _redis.GetDatabase();
await db.StringSetAsync(
cacheKey,
JsonSerializer.Serialize(product),
expiry: TimeSpan.FromHours(1)
);
}
3.3. Write-Behind (Write-Back)
The application writes only to Redis; Redis asynchronously flushes to the database. Extremely high write throughput but with a risk of data loss if Redis crashes before flushing.
graph LR
APP["Application"] -->|"1. Write"| REDIS["Redis Cache"]
REDIS -->|"2. Immediate ACK"| APP
REDIS -->|"3. Async flush"| DB["Database"]
WORKER["Background Worker"] -->|"4. Batch write"| DB
REDIS -->|"Queue"| WORKER
style REDIS fill:#e94560,stroke:#fff,color:#fff
style DB fill:#0f3460,stroke:#e94560,color:#fff
style APP fill:#16213e,stroke:#e94560,color:#fff
style WORKER fill:#533483,stroke:#e94560,color:#fff
Figure 4: Write-Behind pattern — fast writes to the cache, async to the database
// Write-Behind using Redis Stream as a buffer
public async Task RecordPageViewAsync(string pageId, string userId)
{
var db = _redis.GetDatabase();
// Write to the Redis Stream (very fast, ~0.1 ms)
await db.StreamAddAsync(
"pageviews:buffer",
new NameValueEntry[]
{
new("pageId", pageId),
new("userId", userId),
new("timestamp", DateTimeOffset.UtcNow.ToUnixTimeMilliseconds().ToString())
},
maxLength: 100_000 // Cap buffer size
);
}
// Background worker flushes to the database in batches
public async Task FlushPageViewsAsync()
{
var db = _redis.GetDatabase();
var entries = await db.StreamReadAsync("pageviews:buffer", "0-0", count: 500);
if (entries.Length == 0) return;
// Batch-insert into the database
var records = entries.Select(e => new PageView
{
PageId = e["pageId"],
UserId = e["userId"],
Timestamp = DateTimeOffset.FromUnixTimeMilliseconds(long.Parse(e["timestamp"]))
});
await _dbContext.PageViews.AddRangeAsync(records);
await _dbContext.SaveChangesAsync();
// Remove all processed entries in a single batch call
await db.StreamDeleteAsync("pageviews:buffer", entries.Select(e => e.Id).ToArray());
}
3.4. Read-Through
The application only talks to the cache. The cache automatically reads from the database on miss. Simplifies application code but requires a "smarter" cache layer.
3.5. Proactive Cache Refresh — The 2026 trend
Instead of waiting for a user request to populate the cache (reactive), the system proactively refreshes entries before they expire. Hot keys almost never miss, and the stale-data window shrinks to the refresh interval.
// Proactive Refresh via a background timer
public class CacheRefreshService : BackgroundService
{
protected override async Task ExecuteAsync(CancellationToken ct)
{
while (!ct.IsCancellationRequested)
{
var db = _redis.GetDatabase();
// Scan for keys about to expire (TTL < 2 minutes).
// NOTE: SCAN-based enumeration is O(keyspace); fine for modest key counts,
// but track hot keys in a dedicated set for large ones
var server = _redis.GetServer(_redis.GetEndPoints()[0]);
foreach (var key in server.Keys(pattern: "product:*"))
{
var ttl = await db.KeyTimeToLiveAsync(key);
if (ttl.HasValue && ttl.Value < TimeSpan.FromMinutes(2))
{
// Proactively refresh before expiration
var productId = int.Parse(key.ToString().Split(':')[1]);
var product = await _dbContext.Products.FindAsync(productId);
if (product != null)
{
await db.StringSetAsync(
key,
JsonSerializer.Serialize(product),
expiry: TimeSpan.FromMinutes(15)
);
}
}
}
await Task.Delay(TimeSpan.FromSeconds(30), ct);
}
}
}
4. Comparing the caching patterns
| Pattern | Consistency | Read latency | Write latency | Best for |
|---|---|---|---|---|
| Cache-Aside | Eventual | Hit: very low / miss: high | Normal | Read-heavy, tolerant of stale data |
| Write-Through | Strong | Always low | High (writes DB + cache) | Consistency-critical workloads |
| Write-Behind | Eventual | Always low | Very low | Write-heavy (analytics, logging) |
| Read-Through | Eventual | Similar to Cache-Aside | Normal | Simplifying application code |
| Proactive Refresh | Near real-time | Always low (no miss) | Normal + background | Hot data, UX-critical paths |
5. Redis vs Valkey 2026 — The license-fork landscape
In March 2024, Redis Ltd. changed the license from BSD-3-Clause to dual RSALv2/SSPLv1. Major cloud providers immediately forked Redis 7.2 into Valkey under the Linux Foundation, keeping the BSD-3-Clause license (Redis 8.0 later added AGPLv3 as a third licensing option). By 2026 the two projects have diverged noticeably:
| Criterion | Redis 8.x | Valkey 8.x |
|---|---|---|
| License | RSALv2 / SSPLv1 / AGPLv3 (tri-licensed) | BSD-3-Clause (fully open) |
| Module integration | JSON, Search, TimeSeries, Bloom, Vector Set in core | None — focused on core data types |
| I/O Threading | Yes (io-threads config) | Yes (enhanced by AWS engineers) |
| Performance (SET) | ~729K RPS (8.0) | ~1M RPS (8.1) |
| Direction | All-in-one data platform (AI, search, analytics) | Lean core, clustering, Rust integration |
| Cloud support | AWS ElastiCache, GCP Memorystore, Azure Cache | AWS ElastiCache / MemoryDB (Valkey engine), GCP Memorystore |
| Pick when | You need integrated JSON/Search/TimeSeries/Vector | You need a BSD license (SaaS, managed service) |
License notes
If you're building a SaaS or managed service that offers Redis-as-a-service, Redis 8's licensing (source-available RSALv2/SSPLv1, or copyleft AGPLv3) may pose legal concerns. In that case, Valkey (BSD-3-Clause) is the safer choice. If Redis is only an internal component, the license is unlikely to affect you.
6. Designing a production cache layer
Below is a multi-tier cache architecture combining several patterns — fit for production systems handling millions of requests:
graph TB
CLIENT["Client Request"] --> LB["Load Balancer"]
LB --> APP["Application Server"]
APP --> L1["L1: In-Memory Cache<br/>(IMemoryCache, 30s TTL)"]
L1 -->|"Miss"| L2["L2: Redis Cache<br/>(Distributed, 15 min TTL)"]
L2 -->|"Miss"| DB["Database<br/>(SQL Server / PostgreSQL)"]
DB -->|"Populate"| L2
L2 -->|"Populate"| L1
WORKER["Background Worker"] -->|"Proactive Refresh"| L2
WORKER -->|"Invalidate"| L1
CDC["CDC / Event Bus"] -->|"Data Changed"| WORKER
DB -->|"Change Event"| CDC
style L1 fill:#533483,stroke:#e94560,color:#fff
style L2 fill:#e94560,stroke:#fff,color:#fff
style DB fill:#0f3460,stroke:#e94560,color:#fff
style WORKER fill:#16213e,stroke:#e94560,color:#fff
style CDC fill:#16213e,stroke:#e94560,color:#fff
style APP fill:#16213e,stroke:#e94560,color:#fff
Figure 5: Multi-tier cache architecture with L1 in-memory, L2 Redis, and proactive refresh via CDC
// Multi-Level Cache Service in .NET
public class MultiLevelCacheService
{
private readonly IMemoryCache _l1;
private readonly IDatabase _l2; // Redis
public async Task<T?> GetAsync<T>(string key, Func<Task<T?>> factory)
{
// L1: In-Memory (ultra-fast, ~0.001 ms)
if (_l1.TryGetValue(key, out T? l1Value))
return l1Value;
// L2: Redis (fast, ~0.5 ms)
var l2Value = await _l2.StringGetAsync(key);
if (l2Value.HasValue)
{
var result = JsonSerializer.Deserialize<T>(l2Value!);
_l1.Set(key, result, TimeSpan.FromSeconds(30));
return result;
}
// L3: Database (slow, ~5-50 ms)
var dbValue = await factory();
if (dbValue is not null)
{
var json = JsonSerializer.Serialize(dbValue);
await _l2.StringSetAsync(key, json, TimeSpan.FromMinutes(15));
_l1.Set(key, dbValue, TimeSpan.FromSeconds(30));
}
return dbValue;
}
}
7. Cache anti-patterns to avoid
7.1. Thundering Herd (Cache Stampede)
When a hot cache key expires, thousands of concurrent requests miss simultaneously and pile on the database. Fix with a distributed lock or probabilistic early expiration.
// Distributed lock to prevent Thundering Herd
public async Task<T?> GetWithLockAsync<T>(string key, Func<Task<T?>> factory)
{
var db = _redis.GetDatabase();
var cached = await db.StringGetAsync(key);
if (cached.HasValue)
return JsonSerializer.Deserialize<T>(cached!);
var lockKey = $"lock:{key}";
var lockToken = Guid.NewGuid().ToString(); // unique token proves lock ownership
var lockAcquired = await db.StringSetAsync(
lockKey, lockToken, TimeSpan.FromSeconds(10), When.NotExists);
if (lockAcquired)
{
try
{
var value = await factory();
if (value is not null)
{
await db.StringSetAsync(key,
JsonSerializer.Serialize(value),
TimeSpan.FromMinutes(15));
}
return value;
}
finally
{
// Release only if we still hold the lock; a plain DEL could delete
// another caller's lock if ours expired mid-factory
await db.LockReleaseAsync(lockKey, lockToken);
}
}
// Other requests wait and retry
await Task.Delay(100);
return await GetWithLockAsync<T>(key, factory);
}
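The other stampede fix mentioned above, probabilistic early expiration, avoids locking entirely: each reader refreshes the value slightly before expiry, with a probability that rises as the deadline approaches (the XFetch algorithm). A sketch in Python; the function name and parameters are illustrative:

```python
import math
import random
import time

def should_refresh_early(expiry_ts, recompute_cost, beta=1.0, now=None):
    """XFetch-style check: refresh when now - cost * beta * ln(u) >= expiry.

    expiry_ts      - when the cached value expires (same clock as `now`)
    recompute_cost - how long the last recompute took, in seconds
    beta           - > 1 refreshes earlier, < 1 later
    """
    now = time.monotonic() if now is None else now
    # -ln(u) is exponentially distributed, so early refreshes spread out
    u = 1.0 - random.random()  # in (0, 1], avoids log(0)
    return now - recompute_cost * beta * math.log(u) >= expiry_ts

# Far from expiry, early refreshes are rare; near expiry, very likely.
random.seed(42)
far = sum(should_refresh_early(expiry_ts=100, recompute_cost=1, now=10)
          for _ in range(1000))
near = sum(should_refresh_early(expiry_ts=100, recompute_cost=1, now=99.5)
           for _ in range(1000))
print(far, near)  # far stays at 0, near is in the hundreds
```

Because the exponential term almost never fires far from expiry but usually fires just before it, only a handful of readers recompute early, and the hard-expiry stampede never happens.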
7.2. Cache Penetration
Requests repeatedly query a key that exists in neither cache nor database (often from attackers). Fix by caching null values or using a Bloom Filter (now built into Redis 8).
// Use a Bloom Filter (Redis 8 native) to stop Cache Penetration
public async Task<Product?> GetProductSafeAsync(int productId)
{
var db = _redis.GetDatabase();
// Check the Bloom Filter first — O(1), extremely fast
var exists = (bool)await db.ExecuteAsync("BF.EXISTS", "products:bf", productId.ToString());
if (!exists)
return null; // Definitely doesn't exist, skip the DB query
// Bloom Filter says "probably exists" → check cache + DB
return await GetProductAsync(productId);
}
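The other mitigation, caching the null result, deserves a sketch too; the key detail is giving misses a much shorter TTL than real values, so a later legitimate insert becomes visible quickly (Python; the sentinel value and TTL numbers are illustrative choices):

```python
import time

NULL_SENTINEL = "__NULL__"  # distinguishes "cached miss" from "not cached"

class NegativeCache:
    def __init__(self, hit_ttl=900, miss_ttl=30):
        self._store = {}  # key -> (value, expiry)
        self._hit_ttl, self._miss_ttl = hit_ttl, miss_ttl

    def get(self, key, loader):
        entry = self._store.get(key)
        if entry and entry[1] > time.monotonic():
            value = entry[0]
            return None if value == NULL_SENTINEL else value
        value = loader(key)
        if value is None:
            # Cache the miss briefly so repeated bogus keys skip the DB
            self._store[key] = (NULL_SENTINEL, time.monotonic() + self._miss_ttl)
        else:
            self._store[key] = (value, time.monotonic() + self._hit_ttl)
        return value

db_queries = []

def loader(key):
    db_queries.append(key)
    return None  # key doesn't exist in the database

cache = NegativeCache()
for _ in range(1000):
    cache.get("product:999999", loader)  # attacker hammers a bogus key
print(len(db_queries))  # 1: only the first request reached the database
```

A Bloom filter and negative caching combine well: the filter blocks keys that were never inserted, while the short-TTL null entry absorbs keys the filter can't rule out.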
7.3. The Hot Key problem
One key is accessed so often it bottlenecks a single Redis node. Fix by replicating the hot key across nodes or using an L1 in-memory cache on the application side.
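A sketch of the hot-key replication fix: write N copies under suffixed keys and let each reader pick one at random, so in a Redis Cluster the suffixes hash to different slots and read load spreads across nodes (Python against a plain dict as a stand-in for Redis; the key scheme and N=8 are illustrative):

```python
import random

NUM_REPLICAS = 8  # tune to cluster size / observed load

def replica_keys(hot_key: str) -> list[str]:
    return [f"{hot_key}:rep:{i}" for i in range(NUM_REPLICAS)]

def write_hot_key(store: dict, hot_key: str, value) -> None:
    # The writer fans out to every replica (writes are rare for hot keys)
    for key in replica_keys(hot_key):
        store[key] = value

def read_hot_key(store: dict, hot_key: str):
    # Each reader picks one replica, so no single node takes all the traffic
    return store[random.choice(replica_keys(hot_key))]

store = {}  # stand-in for Redis; in a cluster each key hashes to a node
write_hot_key(store, "product:top-seller", {"name": "Keyboard"})
print(read_hot_key(store, "product:top-seller"))
```

The trade-off: writes fan out N times, and all replicas must be updated (or expire) together to stay consistent, so this suits hot keys that are read-mostly.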
8. Conclusion
Redis 8 marks the evolution from a key-value store into a unified data platform with JSON, Search, TimeSeries, Vector Set, and Bloom Filter baked in. The new I/O Threading architecture delivers a throughput leap, while dual-stream replication and the LRM eviction policy open new possibilities for AI/semantic caching.
On caching patterns, there's no "silver bullet" — each pattern fits a specific workload. Cache-Aside remains the safe default for read-heavy systems, Write-Behind fits write-heavy analytics, and Proactive Refresh is ideal for hot paths that need smooth UX. Combining multi-tier caching (L1 in-memory + L2 Redis) with stampede-prevention techniques is the proven recipe for million-request systems.
Finally, the Redis/Valkey split forces teams to make explicit licensing decisions. Evaluate your need for module integration (JSON, Search, Vector) and licensing requirements before picking a direction.