Design a News Feed (Twitter Timeline) in .NET
Fan-out on write vs fan-out on read, the celebrity problem, and how to build a Twitter-style timeline service in .NET with Postgres + Redis + Kafka.
Table of contents
- When does the news feed problem show up?
- What back-of-envelope numbers shape the design?
- What does the architecture look like?
- What is the .NET 10 wiring for fan-out-on-write?
- What scale-out path does this support?
- What failure modes does this introduce?
- When is a simpler model enough?
- Where should you go from here?
The news feed is the canonical hard system design problem because it forces the trade-off between write cost and read cost into the open. This chapter designs a Twitter-style timeline in .NET, with the hybrid fan-out model that production systems converge on, and the .NET 10 wiring that makes it concrete.
When does the news feed problem show up?
Three contexts. Twitter / Mastodon clones, where the timeline is the product. Activity feeds inside apps - "people you follow posted X", "your team's recent updates". Notification feeds - the same fan-out shape, different rendering.
In all three, the question is "given a graph of follow relationships, how do I show user X the most recent N posts from people they follow, fast?". The answer is one of three models, plus the celebrity escape hatch.
What back-of-envelope numbers shape the design?
DAU                    100M
Avg follows            200 per user
Posts / day            50M (50% of DAU post once)
Timeline reads / day   5B (50 timeline opens per DAU)
Peak timeline reads    5B / ~100K seconds per day * 5 (peak factor) = 250K req/s
Avg timeline size      1000 IDs * 8 bytes = 8 KB
Total timeline mem     100M * 8 KB = 800 GB across cluster
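The 800 GB of timeline memory is feasible for a sharded Redis cluster, though it assumes compact 8-byte IDs; string-encoded Guid members and per-entry ZSET overhead push real usage several times higher. The 250K req/s peak read rate must be served from cache - the DB cannot keep up. Post writes are 50M/day = ~600 writes/s on average, 3K at peak; trivial for Postgres. The bottleneck is fan-out amplification: 50M posts * 200 followers is about 10B timeline inserts per day, roughly 115K ZSET writes/s on average.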
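The arithmetic behind these estimates can be reproduced in a few lines. This is a minimal sketch: the inputs are the figures from the estimate, while the 100K-seconds-per-day rounding, the 5x peak factor, and treating "average follows" as "average followers" are assumptions made explicit here.

```csharp
using System;

// Inputs from the estimate; rounding choices are assumptions.
const double Dau = 100e6;               // daily active users
const double AvgFollowers = 200;        // assumed equal to "avg follows per user"
const double PostsPerDay = 50e6;
const double OpensPerUser = 50;         // timeline opens per DAU
const double RoundedSecondsPerDay = 100_000; // 86,400 rounded up for mental math
const double PeakFactor = 5;            // assumed peak-to-average ratio

double readsPerDay = Dau * OpensPerUser;
double peakReadsPerSec = readsPerDay / RoundedSecondsPerDay * PeakFactor;
double avgPostWritesPerSec = PostsPerDay / 86_400;       // exact seconds per day
double fanOutInsertsPerDay = PostsPerDay * AvgFollowers; // one ZSET insert per follower
double avgFanOutPerSec = fanOutInsertsPerDay / 86_400;
double totalTimelineGb = Dau * (1000 * 8) / 1e9;         // 1000 IDs * 8 bytes per user

Console.WriteLine($"Peak timeline reads : {peakReadsPerSec:N0} req/s");
Console.WriteLine($"Avg post writes     : {avgPostWritesPerSec:N0} per second");
Console.WriteLine($"Fan-out inserts     : {fanOutInsertsPerDay:N0} per day, {avgFanOutPerSec:N0} per second");
Console.WriteLine($"Timeline memory     : {totalTimelineGb:N0} GB");
```

The fan-out line is the one that matters: post writes stay in the hundreds per second, while timeline inserts land two orders of magnitude higher.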
What does the architecture look like?
Hybrid fan-out:
flowchart LR
User[User posts] -->|write| App[ASP.NET Core]
App --> PG[(Postgres<br/>posts)]
App --> OB[(Outbox)]
OB --> Worker[Fan-out worker]
Worker --> FollowSvc[(Follow graph)]
Worker -->|fan-out small| Redis[(Redis ZSET<br/>per timeline)]
Worker -.skip celebrity.-> Redis
Reader[Reader fetch] --> App2[ASP.NET Core]
App2 -->|merge personal ZSET| Redis
App2 -->|merge celebrity posts| PG
Two writes for a normal user post: the post row plus an outbox entry, committed in one transaction. The worker reads the outbox, looks up the author's followers, and pushes the post ID onto each follower's ZSET (skipping celebrity authors; their posts get merged at read time). On read, fetch the personal ZSET and merge it with recent posts from the celebrities the user follows.
What is the .NET 10 wiring for fan-out-on-write?
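// Post handler: the post row and the outbox row are saved together.
// `db` (assumed to be an EF Core DbContext) and `userId` (the authenticated author)
// are members of the enclosing class.
public async Task<Guid> PostAsync(string text, CancellationToken ct)
{
    var post = new Post { Id = Guid.NewGuid(), AuthorId = userId, Text = text, CreatedAt = DateTime.UtcNow };
    db.Posts.Add(post);
    db.OutboxMessages.Add(new OutboxMessage
    {
        MessageType = nameof(PostCreated),
        Payload = JsonSerializer.Serialize(new PostCreated(post.Id, post.AuthorId, post.CreatedAt))
    });
    await db.SaveChangesAsync(ct); // one SaveChanges call, so both rows commit or neither does
    return post.Id;
}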
// Fan-out consumer
public class PostCreatedConsumer(IConnectionMultiplexer redis, IFollowService follows, IUserService users)
    : IConsumer<PostCreated>
{
    private const int CelebrityThreshold = 100_000;

    public async Task Consume(ConsumeContext<PostCreated> ctx)
    {
        var msg = ctx.Message;
        var author = await users.GetAsync(msg.AuthorId);
        if (author.FollowerCount > CelebrityThreshold) return; // celebrity, no push

        var followers = await follows.GetFollowersAsync(msg.AuthorId);
        var db = redis.GetDatabase();
        var batch = db.CreateBatch();
        var pending = new List<Task>();
        foreach (var f in followers)
        {
            // Commands only queue on the batch; nothing is sent until Execute().
            pending.Add(batch.SortedSetAddAsync($"tl:{f}", msg.PostId.ToString(), msg.CreatedAt.Ticks));
            pending.Add(batch.SortedSetRemoveRangeByRankAsync($"tl:{f}", 0, -1001)); // keep top 1000
        }
        batch.Execute();
        // Await the queued commands so a Redis failure fails the message instead of being silently dropped.
        await Task.WhenAll(pending);
    }
}
// Read endpoint - merge personal feed with celebrity posts
public async Task<List<Post>> GetTimelineAsync(Guid userId, int take = 50, CancellationToken ct = default)
{
    var personalIds = (await redis.GetDatabase().SortedSetRangeByRankAsync(
            $"tl:{userId}", 0, take - 1, Order.Descending))
        .Select(v => Guid.Parse(v!)).ToArray();

    var celebrityFollows = await follows.GetCelebrityFolloweesAsync(userId);
    var since = DateTime.UtcNow.AddHours(-24); // computed once, passed to the query as a parameter
    var celebrityPosts = await db.Posts
        .Where(p => celebrityFollows.Contains(p.AuthorId) && p.CreatedAt > since)
        .OrderByDescending(p => p.CreatedAt)
        .Take(take)
        .ToListAsync(ct);

    var personal = await db.Posts.Where(p => personalIds.Contains(p.Id)).ToListAsync(ct);

    return personal.Concat(celebrityPosts)
        .DistinctBy(p => p.Id) // a post can be in both sets if its author crossed the threshold after fan-out
        .OrderByDescending(p => p.CreatedAt)
        .Take(take).ToList();
}
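Three details. The outbox gives at-least-once delivery: if the worker crashes mid-fan-out, the event is processed again, and because adding the same member with the same score to a ZSET is idempotent, the retry is harmless. The Redis batch pipelines the writes, so a fan-out to thousands of followers fits in a few network round-trips. The celebrity merge at read time keeps the system from collapsing under viral users.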
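The merge step of the read path can be isolated as a pure function, which makes it easy to test without Redis or Postgres. The sketch below is illustrative only: `MergeTimeline` and the tuple shape are hypothetical names, and posts are reduced to an ID and a timestamp.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Merge two candidate lists into one newest-first page, dropping duplicates.
static List<(Guid Id, DateTime CreatedAt)> MergeTimeline(
    IEnumerable<(Guid Id, DateTime CreatedAt)> personal,
    IEnumerable<(Guid Id, DateTime CreatedAt)> celebrity,
    int take)
{
    return personal.Concat(celebrity)
        .DistinctBy(post => post.Id)               // a post can arrive via both paths
        .OrderByDescending(post => post.CreatedAt) // newest first
        .ThenBy(post => post.Id)                   // deterministic order on equal timestamps
        .Take(take)
        .ToList();
}

var t0 = new DateTime(2025, 1, 1, 12, 0, 0, DateTimeKind.Utc);
var a = (Id: Guid.NewGuid(), CreatedAt: t0);
var b = (Id: Guid.NewGuid(), CreatedAt: t0.AddMinutes(5));
var c = (Id: Guid.NewGuid(), CreatedAt: t0.AddMinutes(10));

// "b" appears in both lists, as if its author crossed the celebrity threshold after fan-out.
var page = MergeTimeline(new[] { b, a }, new[] { c, b }, take: 2);
Console.WriteLine($"page size {page.Count}, newest is c: {page[0].Id == c.Id}");
```

Taking the top `take` entries from each source before merging is sufficient, because the top `take` of the union can only contain items from the top `take` of each list.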
What scale-out path does this support?
- Posts table: partition by month, archive old months to cold storage.
- Timeline cache: Redis cluster, shard by user ID; one ZSET per user is partitioned naturally.
- Follow graph: dedicated service with its own database, often graph-shaped (Neo4j, Postgres recursive CTEs at small scale).
- Fan-out workers: scale horizontally with SKIP LOCKED outbox draining; partition by author ID hash.
The architecture is sized for the 100M DAU estimate above; Twitter's reported daily active users are in the same order of magnitude.
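Partitioning by author ID hash needs a function that every worker process computes identically. The following is one possible sketch, not a prescribed scheme: `PartitionFor` is a hypothetical helper that derives the hash from the Guid's own bytes rather than relying on `Guid.GetHashCode`, whose value is an implementation detail. The SKIP LOCKED query itself is not shown.

```csharp
using System;
using System.Buffers.Binary;

// Map an author to one of N worker partitions from the raw Guid bytes.
static int PartitionFor(Guid authorId, int partitionCount)
{
    if (partitionCount <= 0) throw new ArgumentOutOfRangeException(nameof(partitionCount));

    Span<byte> bytes = stackalloc byte[16];
    authorId.TryWriteBytes(bytes);

    // XOR the first and last 4 bytes; an arbitrary but stable mixing choice.
    uint hash = BinaryPrimitives.ReadUInt32LittleEndian(bytes)
              ^ BinaryPrimitives.ReadUInt32LittleEndian(bytes[12..]);
    return (int)(hash % (uint)partitionCount);
}

var author = Guid.Parse("3f2504e0-4f89-11d3-9a0c-0305e82c3301");
int partition = PartitionFor(author, 8);
Console.WriteLine($"author {author} -> partition {partition}");
```

Keeping all of one author's posts on one partition preserves per-author ordering; the trade-off is that a very active author can make one partition hotter than the rest.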
What failure modes does this introduce?
- Fan-out lag - new posts take seconds to appear in followers' timelines. Mitigation: alert on outbox_age_p99; scale workers.
- Hot author detection lag - a user becomes celebrity overnight but is still being fanned out. Mitigation: real-time follower count refresh in the consumer; cap fan-out batch size.
- Timeline drift - cache and DB disagree on what a user's feed contains. Mitigation: periodic full rebuild from posts table on demand or by scheduled job.
- Cold cache on new user - first feed open has empty ZSET. Mitigation: lazy population - on miss, run fan-out-on-read once and populate the ZSET.
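Capping the fan-out batch size, one of the mitigations listed above, can be sketched with `Enumerable.Chunk` (.NET 6+). This is an assumption-laden illustration: `FanOutInChunksAsync` is a hypothetical helper, and the `writeChunk` delegate stands in for one pipelined Redis batch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Push one post to followers in bounded chunks instead of one unbounded batch.
static async Task<int> FanOutInChunksAsync(
    IEnumerable<Guid> followerIds, int chunkSize, Func<Guid[], Task> writeChunk)
{
    int chunksSent = 0;
    foreach (Guid[] chunk in followerIds.Chunk(chunkSize))
    {
        await writeChunk(chunk); // finish this chunk before queuing the next
        chunksSent++;
    }
    return chunksSent;
}

var allFollowers = Enumerable.Range(0, 1200).Select(_ => Guid.NewGuid()).ToList();
var sizes = new List<int>();
int sent = await FanOutInChunksAsync(allFollowers, 500, ids =>
{
    sizes.Add(ids.Length); // a real worker would execute a Redis batch here
    return Task.CompletedTask;
});
Console.WriteLine($"{sent} chunks: {string.Join(", ", sizes)}"); // prints "3 chunks: 500, 500, 200"
```

Bounded chunks limit how much memory a single message can pin and how long one author can monopolize a worker, at the cost of a few extra round-trips.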
When is a simpler model enough?
Three cases.
Pull-only (fan-out-on-read) is fine up to ~1M users with low posting rates - the read query "posts from followees in last 7 days" is one indexed SQL with a JOIN to follows.
Push-only (fan-out-on-write) is fine up to ~10K followers-per-user - the write amplification is bounded.
Hybrid is the answer when neither of the simpler models holds. At Twitter scale, only hybrid works.
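To make the pull-only shape concrete, here is the same query expressed over in-memory collections. It is a sketch, not the production query: in Postgres this would be a JOIN of posts to follows backed by an index, and `PullTimeline` with its tuple shapes is a hypothetical stand-in.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Fan-out-on-read: join the reader's follow edges to posts from the last 7 days.
static List<(Guid PostId, Guid AuthorId, DateTime CreatedAt)> PullTimeline(
    Guid reader,
    IEnumerable<(Guid Follower, Guid Followee)> follows,
    IEnumerable<(Guid PostId, Guid AuthorId, DateTime CreatedAt)> posts,
    DateTime nowUtc,
    int take)
{
    DateTime cutoff = nowUtc.AddDays(-7);
    return (from f in follows
            where f.Follower == reader
            join p in posts on f.Followee equals p.AuthorId
            where p.CreatedAt > cutoff
            orderby p.CreatedAt descending
            select p).Take(take).ToList();
}

var now = new DateTime(2025, 1, 8, 0, 0, 0, DateTimeKind.Utc);
Guid alice = Guid.NewGuid(), bob = Guid.NewGuid(), carol = Guid.NewGuid();
var followEdges = new[] { (Follower: alice, Followee: bob) };
var allPosts = new[]
{
    (PostId: Guid.NewGuid(), AuthorId: bob,   CreatedAt: now.AddDays(-1)),  // followed, recent
    (PostId: Guid.NewGuid(), AuthorId: bob,   CreatedAt: now.AddDays(-10)), // followed, too old
    (PostId: Guid.NewGuid(), AuthorId: carol, CreatedAt: now.AddHours(-1)), // not followed
};

var feed = PullTimeline(alice, followEdges, allPosts, now, take: 50);
Console.WriteLine($"{feed.Count} post(s) in the pulled timeline");
```

Nothing is precomputed here, which is why this model is cheap on write and only holds while the join stays fast.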
Where should you go from here?
Next case study: realtime chat with SignalR - WebSockets, presence, and message ordering. Different shape but the fan-out vocabulary transfers directly to broadcast messages.