News Feed System Design — Fan-out, Caching & Ranking at Scale for Millions of Users

Posted on: 4/26/2026 8:17:18 PM

  • 500M+ — Feed queries/day (100M DAU)
  • ~6,000 — Average QPS for the Feed API
  • <200ms — Target latency for feed load
  • 10K+ — Follower threshold → switch to Pull model

You open the app, swipe your thumb — and hundreds of posts from friends, followees, and groups appear seamlessly. That experience seems simple, but behind the scenes lies a complex distributed system that must solve countless problems: who posted what, who receives it, how to rank it, where to cache it, and how to ensure millions of concurrent users don't feel any latency.

This article dives deep into the architecture of a News Feed system — from data model, fan-out strategy, caching layers to ranking algorithms — following the patterns that companies like Facebook, Twitter (X), and Instagram have deployed at production scale.

1. System Requirements Analysis

Functional Requirements

  • Publish: Users can create posts (text, images, videos)
  • Feed Generation: Display a personalized feed from followed users
  • Interaction: Like, comment, share on each post
  • Pagination: Feed supports infinite scroll with cursor-based pagination

Non-Functional Requirements

  • Latency: Feed load < 200ms (P99)
  • Availability: 99.99% uptime — feed is the core experience
  • Scalability: Support 100M+ DAU, each user follows 200–500 people on average
  • Consistency: Eventual consistency is acceptable — users see new posts within seconds, no need for absolute real-time

Scale Estimation

| Metric | Estimate |
|---|---|
| DAU (Daily Active Users) | 100 million |
| New posts / day | ~50 million (0.5 posts/user) |
| Feed loads / day / user | ~5 times |
| Total feed queries / day | 500 million (~6,000 QPS) |
| Average followees / user | ~300 |
| Feed cache size / user | ~100 posts × 1KB = 100KB |
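The estimates above are easy to sanity-check with back-of-envelope arithmetic; a quick sketch (all figures rough, taken from the table):

```python
# Back-of-envelope check of the scale estimates above.
DAU = 100_000_000
FEED_LOADS_PER_USER = 5
POSTS_PER_USER = 0.5
SECONDS_PER_DAY = 86_400

feed_queries_per_day = DAU * FEED_LOADS_PER_USER        # 500 million
avg_feed_qps = feed_queries_per_day / SECONDS_PER_DAY   # ~5,787 → rounds to ~6,000
new_posts_per_day = DAU * POSTS_PER_USER                # 50 million
feed_cache_total_bytes = DAU * 100 * 1024               # 100 posts × 1KB per user ≈ 10TB

print(f"~{avg_feed_qps:.0f} QPS average feed load")
```

Note this is the average; peak QPS is typically 2–3× higher, which is what capacity planning should target.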

2. Data Model & API Design

Core Entities

-- Users
CREATE TABLE Users (
    user_id     BIGINT PRIMARY KEY,
    username    VARCHAR(50) UNIQUE,
    display_name NVARCHAR(100),
    avatar_url  VARCHAR(500),
    is_celebrity BOOLEAN DEFAULT FALSE,
    follower_count INT DEFAULT 0,
    created_at  TIMESTAMP DEFAULT NOW()
);

-- Posts
CREATE TABLE Posts (
    post_id     BIGINT PRIMARY KEY,  -- Snowflake ID
    author_id   BIGINT REFERENCES Users(user_id),
    content     TEXT,
    media_url   VARCHAR(500),
    media_type  VARCHAR(20),  -- 'image', 'video', 'none'
    like_count  INT DEFAULT 0,
    comment_count INT DEFAULT 0,
    share_count INT DEFAULT 0,
    created_at  TIMESTAMP DEFAULT NOW(),
    INDEX idx_author_time (author_id, created_at DESC)
);

-- Follow Graph
CREATE TABLE Follows (
    follower_id BIGINT,
    followee_id BIGINT,
    created_at  TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (follower_id, followee_id),
    INDEX idx_followee (followee_id)
);

-- Precomputed Feed (Fan-out on Write)
CREATE TABLE FeedItems (
    user_id     BIGINT,
    post_id     BIGINT,
    author_id   BIGINT,
    score       FLOAT,
    created_at  TIMESTAMP,
    PRIMARY KEY (user_id, post_id),
    INDEX idx_user_score (user_id, score DESC)
);

API Endpoints

POST   /v1/posts                    -- Create new post
GET    /v1/feed?cursor=xxx&limit=20  -- Get feed (cursor-based pagination)
POST   /v1/posts/{id}/like           -- Like / unlike
POST   /v1/posts/{id}/comment        -- Add comment
GET    /v1/posts/{id}/comments?cursor=xxx  -- Get comments
POST   /v1/users/{id}/follow         -- Follow user

Why Cursor-based Pagination?

Offset-based pagination (OFFSET 100 LIMIT 20) breaks when new items are inserted: users will see duplicate posts or miss items entirely. Cursor-based pagination uses created_at or score of the last item as an anchor, ensuring stable pagination even when data changes continuously.
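A minimal sketch of how such a cursor can be built, assuming a hypothetical `db`-style parameterized query (the `(created_at, post_id)` tuple comparison is valid keyset-pagination SQL in PostgreSQL and MySQL; `post_id` breaks ties between posts sharing a timestamp):

```python
import base64
import json

def encode_cursor(created_at: int, post_id: int) -> str:
    """Pack the anchor row's (created_at, post_id) into an opaque token."""
    return base64.urlsafe_b64encode(
        json.dumps([created_at, post_id]).encode()).decode()

def decode_cursor(cursor: str):
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))

def feed_page_query(cursor=None, limit=20):
    """Build the keyset-pagination query for one feed page."""
    if cursor is None:  # first page: just the newest rows
        return ("SELECT * FROM Posts "
                "ORDER BY created_at DESC, post_id DESC LIMIT ?", (limit,))
    created_at, post_id = decode_cursor(cursor)
    # Keyset predicate: rows strictly "older" than the anchor row
    sql = ("SELECT * FROM Posts "
           "WHERE (created_at, post_id) < (?, ?) "
           "ORDER BY created_at DESC, post_id DESC LIMIT ?")
    return sql, (created_at, post_id, limit)
```

The cursor is opaque to the client, so the server can later change its contents (e.g. switch from timestamp to ranking score) without breaking API consumers.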

3. Fan-out Strategy: Push vs Pull vs Hybrid

This is the most critical architectural decision for a News Feed system. There are 3 main strategies:

3.1. Fan-out on Write (Push Model)

graph LR
    A["User A creates post"] --> B["Post Service"]
    B --> C["Message Queue"]
    C --> D["Fan-out Workers"]
    D --> E["Feed Cache User X"]
    D --> F["Feed Cache User Y"]
    D --> G["Feed Cache User Z"]
    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#2c3e50,stroke:#fff,color:#fff
    style C fill:#f39c12,stroke:#fff,color:#fff
    style D fill:#16213e,stroke:#fff,color:#fff
    style E fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style F fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style G fill:#f8f9fa,stroke:#e94560,color:#2c3e50

Fan-out on Write: posts are pushed into every follower's feed cache at publish time

How it works: When User A creates a post, the system retrieves all of A's followers, then inserts the post into each follower's feed cache (Redis Sorted Set).

| Pros | Cons |
|---|---|
| Extremely fast reads — feed is precomputed | Slow writes if user has many followers |
| Simple read logic | Celebrity problem: 10M followers = 10M writes |
| Works well for majority of regular users | Wastes cache memory for inactive users |
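The write path can be sketched in a few lines. This assumes a redis-py-style client `r` and a hypothetical `get_followers()` lookup (in production this logic runs inside a queue consumer, not inline with the request):

```python
FEED_MAX = 500  # keep only the newest 500 items per follower's feed

def fan_out_post(r, get_followers, author_id, post_id, created_at):
    """Push a new post into every follower's feed Sorted Set."""
    for follower_id in get_followers(author_id):
        key = f"feed:{follower_id}"
        pipe = r.pipeline()
        pipe.zadd(key, {f"post:{post_id}": created_at})  # score = timestamp
        pipe.zremrangebyrank(key, 0, -(FEED_MAX + 1))    # trim beyond 500
        pipe.execute()
```

Pipelining the ZADD + trim pair keeps each follower update to a single round trip, which matters when one post triggers thousands of writes.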

3.2. Fan-out on Read (Pull Model)

graph LR
    A["User X opens feed"] --> B["Feed Service"]
    B --> C["Get followees list"]
    C --> D["Query Posts from each followee"]
    D --> E["Merge + Rank + Return"]
    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#2c3e50,stroke:#fff,color:#fff
    style C fill:#16213e,stroke:#fff,color:#fff
    style D fill:#f39c12,stroke:#fff,color:#fff
    style E fill:#4CAF50,stroke:#fff,color:#fff

Fan-out on Read: feed is assembled at the time of user request

How it works: When User X opens their feed, the system queries recent posts from everyone X follows, merges them, ranks, and returns the result.

| Pros | Cons |
|---|---|
| Fast writes — only one copy of the post stored | Slow reads — must query multiple sources |
| No wasted cache for inactive users | Latency scales linearly with number of followees |
| No celebrity problem on write path | Not suitable for < 200ms feed experience |
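The read path is essentially a k-way merge of per-author timelines. A sketch, assuming a hypothetical `fetch_recent(author_id)` that returns that author's posts already sorted newest-first:

```python
import heapq
from itertools import islice

def pull_feed(followee_ids, fetch_recent, limit=20):
    """Merge each followee's recent posts, newest-first, and take one page."""
    # Each source must already be sorted descending by created_at
    sources = (fetch_recent(fid) for fid in followee_ids)
    merged = heapq.merge(*sources,
                         key=lambda p: p["created_at"], reverse=True)
    return list(islice(merged, limit))
```

With ~300 followees this is 300 lookups per feed load, which is exactly why the pure pull model struggles to stay under 200ms without heavy caching.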

3.3. Hybrid Approach (Production Standard)

This is how Facebook, Twitter, and Instagram do it

Combine both Push and Pull: Push for regular users (under 10K followers) and Pull for celebrities (over 10K followers). When a user loads their feed, the system merges the precomputed feed with real-time posts from celebrities the user follows.

graph TD
    A["User creates post"] --> B{"Followers > 10K?"}
    B -- No --> C["Fan-out on Write: push to all followers' feed cache"]
    B -- Yes --> D["Only save to Posts table"]
    E["User X loads feed"] --> F["Read precomputed feed from cache"]
    E --> G["Query celebrity posts that X follows"]
    F --> H["Merge + Rank"]
    G --> H
    H --> I["Return feed to client"]
    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#f39c12,stroke:#fff,color:#2c3e50
    style C fill:#4CAF50,stroke:#fff,color:#fff
    style D fill:#2c3e50,stroke:#fff,color:#fff
    style E fill:#e94560,stroke:#fff,color:#fff
    style H fill:#16213e,stroke:#fff,color:#fff
    style I fill:#4CAF50,stroke:#fff,color:#fff

Hybrid: Push for regular users, Pull for celebrities, merge at read time

# Pseudo-code: Hybrid Feed Generation
async def get_feed(user_id, cursor, limit=20):
    # 1. Read precomputed feed (fan-out on write results)
    precomputed = await redis.zrevrangebyscore(
        f"feed:{user_id}",
        max=cursor or "+inf",
        min="-inf",
        start=0, num=limit
    )

    # 2. Get the celebrity IDs that the user follows
    celebrity_ids = await db.query("""
        SELECT followee_id FROM Follows f
        JOIN Users u ON u.user_id = f.followee_id
        WHERE f.follower_id = ? AND u.is_celebrity = TRUE
    """, user_id)

    # 3. Fetch recent posts from those celebrities (fan-out on read)
    celebrity_posts = []
    if celebrity_ids:
        celebrity_posts = await db.query("""
            SELECT * FROM Posts
            WHERE author_id IN (?) AND created_at > NOW() - INTERVAL '24 HOURS'
            ORDER BY created_at DESC LIMIT ?
        """, celebrity_ids, limit)

    # 4. Merge and rank
    all_posts = merge(precomputed, celebrity_posts)
    ranked = ranking_service.rank(all_posts, user_id)

    return ranked[:limit]

4. Overall Architecture

graph TD
    Client["Mobile / Web Client"] --> LB["Load Balancer"]
    LB --> API["API Gateway"]
    API --> PS["Post Service"]
    API --> FS["Feed Service"]
    API --> US["User Service"]

    PS --> DB_Post["Posts DB (Sharded)"]
    PS --> MQ["Message Queue (Kafka)"]
    PS --> S3["Object Storage (S3 / R2)"]
    MQ --> FW["Fan-out Workers"]
    FW --> RC["Feed Cache (Redis Cluster)"]
    FW --> NS["Notification Service"]
    FS --> RC
    FS --> DB_Post
    FS --> RS["Ranking Service"]
    US --> DB_User["Users DB"]
    US --> GC["Graph Cache (Follow relations)"]
    S3 --> CDN["CDN (CloudFront / Cloudflare)"]
    style Client fill:#e94560,stroke:#fff,color:#fff
    style LB fill:#2c3e50,stroke:#fff,color:#fff
    style API fill:#2c3e50,stroke:#fff,color:#fff
    style MQ fill:#f39c12,stroke:#fff,color:#fff
    style RC fill:#e94560,stroke:#e94560,color:#fff
    style FW fill:#16213e,stroke:#fff,color:#fff
    style CDN fill:#4CAF50,stroke:#fff,color:#fff

Overall architecture of a large-scale News Feed system

Core Components

| Service | Role | Common Tech Stack |
|---|---|---|
| Post Service | CRUD posts, publish events to queue | .NET / Go + PostgreSQL (sharded) |
| Feed Service | Generate feed, merge precomputed + celebrity posts | .NET / Go + Redis |
| Fan-out Workers | Consume from queue, push posts into feed cache | Background workers, horizontally scaled |
| Ranking Service | Score and sort feed items | Python / .NET + ML model |
| User Service | Manage profiles, follow graph | .NET / Go + PostgreSQL + Redis (graph cache) |
| Notification Service | Push notifications on interactions | Firebase / APNs + Message Queue |

5. Multi-layer Caching Strategy

Cache is the backbone of a News Feed. Without cache, each feed request would need to query hundreds of data sources — impossible at scale.

graph LR
    A["Client Cache (In-memory)"] --> B["CDN Cache (Static assets)"]
    B --> C["Feed Cache (Redis Sorted Set)"]
    C --> D["Post Cache (Redis Hash)"]
    D --> E["User Cache (Redis Hash)"]
    E --> F["Database (Source of Truth)"]
    style A fill:#4CAF50,stroke:#fff,color:#fff
    style B fill:#4CAF50,stroke:#fff,color:#fff
    style C fill:#e94560,stroke:#fff,color:#fff
    style D fill:#e94560,stroke:#fff,color:#fff
    style E fill:#e94560,stroke:#fff,color:#fff
    style F fill:#2c3e50,stroke:#fff,color:#fff

Multi-layer caching: from client to database

5.1. Feed Cache — Redis Sorted Set

Each user has a Sorted Set in Redis, scored by timestamp (or ranking score). Loading a feed only requires ZREVRANGEBYSCORE — O(log N + M) where M is the number of items to fetch.

# Structure: feed:{user_id} → Sorted Set
# Member: post_id, Score: timestamp (or ranking score)

# Push new post into feed
ZADD feed:12345 1714200000 "post:98765"

# Keep only the 500 most recent items (trim)
ZREMRANGEBYRANK feed:12345 0 -501

# Read feed: get 20 newest items
ZREVRANGEBYSCORE feed:12345 +inf -inf LIMIT 0 20

5.2. Post Cache — Redis Hash

# Cache post content to avoid DB queries
# Key: post:{post_id}
HSET post:98765 author_id 111 content "Hello world"
     like_count 42 comment_count 7 media_url "..."
     created_at 1714200000

# TTL: 24-48 hours for regular posts, longer for viral posts
EXPIRE post:98765 172800
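On the read side this is classic cache-aside: try the hash first, fall back to the database and backfill with the same TTL. A sketch, where `r` is a redis-py-style client and `db` a hypothetical data-access object (both placeholders):

```python
POST_TTL = 172_800  # 48 hours, matching the EXPIRE above

def get_post(r, db, post_id: int) -> dict:
    """Cache-aside read: Redis hash first, DB on miss, then backfill."""
    key = f"post:{post_id}"
    cached = r.hgetall(key)
    if cached:
        return cached
    post = db.fetch_post(post_id)   # source of truth on cache miss
    if post:
        r.hset(key, mapping=post)   # backfill so the next read is a hit
        r.expire(key, POST_TTL)
    return post
```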

5.3. Social Graph Cache

# Cache followees list for fast fan-out
# Key: followers:{user_id} → Set of follower IDs
SMEMBERS followers:111
# → {12345, 67890, 11111, ...}

# Cache celebrity list that user follows
SMEMBERS celeb_followees:12345
# → {celebrity_1, celebrity_2}

Cache Invalidation Strategy

Partial invalidation is the best approach for feed cache: when a new post arrives, simply ZADD to the sorted set and ZREMRANGEBYRANK to trim — no need to flush the entire cache. For post cache, use write-through: update cache immediately when like_count/comment_count changes, with TTL as a safety net.
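The write-through path for counters can be sketched as below; `r` is a redis-py-style client and `db` a hypothetical DAO, both placeholders. Note the cache is only updated if the post is already cached (otherwise the next read will backfill it anyway):

```python
POST_TTL = 172_800  # 48h safety-net TTL, matching the post cache above

def on_like(r, db, post_id: int) -> None:
    """Write-through counter update: DB first, then the cached copy."""
    key = f"post:{post_id}"
    db.increment_like_count(post_id)    # source of truth first
    if r.exists(key):
        r.hincrby(key, "like_count", 1) # keep the cached copy in sync
        r.expire(key, POST_TTL)         # refresh the safety-net TTL
```

Updating the DB before the cache means a crash between the two steps leaves the cache stale rather than the database wrong; the TTL bounds how long that staleness can last.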

6. Ranking Algorithm

A feed is not just a chronological list of recent posts — it must be personalized based on user behavior.

Scoring Formula (Simplified)

Score = w1 × affinity_score
      + w2 × interaction_score
      + w3 × recency_score
      + w4 × content_type_boost

Where:
- affinity_score: Level of interaction between user and author
  (likes, comments, shares between them in the last 30 days)
- interaction_score: Total engagement of the post
  (likes + 2×comments + 3×shares) / time_since_post
- recency_score: Time decay function
  = 1 / (1 + hours_since_post)^1.5
- content_type_boost: Video > Image > Text (platform-dependent)
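The formula translates directly into code. The weights and boost values below are illustrative placeholders (in production they would come from offline tuning or an ML model), but the term structure matches the definitions above:

```python
# Illustrative weights -- w1..w4 from the formula above, values are assumptions
W_AFFINITY, W_INTERACTION, W_RECENCY, W_TYPE = 0.4, 0.3, 0.2, 0.1
TYPE_BOOST = {"video": 1.0, "image": 0.7, "none": 0.4}

def recency_score(hours_since_post: float) -> float:
    """Time decay: 1 / (1 + hours)^1.5, as defined above."""
    return 1.0 / (1.0 + hours_since_post) ** 1.5

def interaction_score(likes, comments, shares, hours_since_post):
    """(likes + 2*comments + 3*shares) / time, floored at 1h to avoid blow-up."""
    return (likes + 2 * comments + 3 * shares) / max(hours_since_post, 1.0)

def score(post: dict, affinity: float) -> float:
    return (W_AFFINITY * affinity
            + W_INTERACTION * interaction_score(
                post["likes"], post["comments"], post["shares"], post["age_h"])
            + W_RECENCY * recency_score(post["age_h"])
            + W_TYPE * TYPE_BOOST.get(post["media_type"], 0.4))
```

With identical engagement and affinity, a one-hour-old post always outscores a ten-hour-old one, which is the behavior the decay term exists to guarantee.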

Facebook EdgeRank (Simplified Version)

graph LR
    A["Affinity (User-Author relationship)"] --> D["EdgeRank Score"]
    B["Weight (Interaction type)"] --> D
    C["Time Decay (Freshness)"] --> D
    D --> E["Sorted Feed"]
    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#f39c12,stroke:#fff,color:#fff
    style C fill:#4CAF50,stroke:#fff,color:#fff
    style D fill:#16213e,stroke:#fff,color:#fff
    style E fill:#2c3e50,stroke:#fff,color:#fff

Three key factors in EdgeRank: Affinity × Weight × Decay

| Signal | Description | Weight |
|---|---|---|
| Direct message | User has messaged the author before | Very High |
| Comment on post | User has commented on author's posts | High |
| Like | User has liked author's posts | Medium |
| Profile visit | User has visited author's profile | Medium |
| Shared group/page | In same group or following same page | Low |
| Video > Image > Link > Text | Content type | Descending |

Ranking Service in Production

At large scale, ranking services run ML models (typically gradient-boosted trees or deep learning) trained on billions of interactions. The model predicts P(user will interact with this post) and uses that probability as the score. However, the heuristic-based scoring above remains a solid baseline for small to medium-scale systems.

7. Media Handling & CDN

Posts with media (images, videos) consume the majority of bandwidth. Poor handling will kill performance.

Upload Flow — Presigned URL

sequenceDiagram
    participant C as Client
    participant API as API Gateway
    participant S3 as Object Storage
    participant CDN as CDN

    C->>API: POST /v1/media/upload-url
    API->>S3: Generate presigned URL
    S3-->>API: Presigned URL (5 min expiry)
    API-->>C: { uploadUrl, mediaKey }
    C->>S3: PUT file directly
    S3-->>C: 200 OK
    C->>API: POST /v1/posts { content, mediaKey }
    Note over API: Save post with media_url
    C->>CDN: GET media (via CDN edge)
    CDN->>S3: Cache miss → fetch from origin
    CDN-->>C: Serve from edge (cache hit next time)

Upload flow: client uploads directly to Object Storage, served via CDN

Benefits: The API server doesn't handle file uploads — reducing CPU and bandwidth usage. Clients upload directly to S3/R2, media is served from the nearest CDN edge.
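A sketch of the `/v1/media/upload-url` handler. `sign_put_url` stands in for the object-store SDK call (e.g. boto3's `generate_presigned_url`); it is a hypothetical placeholder here so the flow stays self-contained:

```python
import secrets
import time

UPLOAD_URL_TTL = 300  # 5 minutes, matching the diagram above

def create_upload_url(user_id: int, content_type: str, sign_put_url):
    """Issue a short-lived direct-upload URL plus the key to reference later."""
    # Key the object by user + timestamp + random token so clients
    # can neither guess nor collide with each other's uploads
    media_key = f"uploads/{user_id}/{int(time.time())}-{secrets.token_hex(8)}"
    upload_url = sign_put_url(media_key, content_type,
                              expires_in=UPLOAD_URL_TTL)
    return {"uploadUrl": upload_url, "mediaKey": media_key}
```

The client PUTs the file to `uploadUrl` and then submits `mediaKey` with the post, exactly as in the sequence diagram; the API server never touches the file bytes.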

Media Processing Pipeline

  • Image: Resize to multiple dimensions (thumbnail 150px, medium 600px, full 1200px), convert to WebP/AVIF, strip metadata
  • Video: Transcode to adaptive bitrate (HLS/DASH), generate thumbnails, extract duration
  • Processing: Runs async via message queue — never blocks post creation

8. Scaling & Partitioning

Database Sharding

| Table | Shard Key | Reason |
|---|---|---|
| Posts | author_id | All posts by one user on the same shard → fast queries |
| FeedItems | user_id | One user's feed on the same shard → fast reads |
| Follows | follower_id | "Who does user X follow?" query hits single shard |
| Users | user_id | Consistent hashing, easy to add more shards |

Redis Cluster Partitioning

Feed cache: Hash slot = CRC16("feed:{user_id}") mod 16384
→ Each Redis node handles a range of slots
→ Scale by adding nodes + rebalancing slots

Memory estimation:
- 100M users × 100KB feed cache = ~10TB
- Redis Cluster 20 nodes × 512GB RAM = 10TB ✓
- Replication factor 2 → 40 nodes total

Fan-out Worker Scaling

Handling Celebrity Posts

When a celebrity (10M followers) creates a post, fan-out on write would require 10M writes — taking minutes if done sequentially. Solutions: (1) Don't fan-out for celebrities (hybrid model), (2) If fan-out is required, partition followers into batches of 10K, send to queue, multiple workers process in parallel → complete within seconds.
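The batching step in solution (2) is a simple chunking loop. A sketch, where `enqueue` is a hypothetical queue-producer callback (e.g. a Kafka produce call):

```python
BATCH_SIZE = 10_000  # followers per fan-out message, as described above

def fan_out_in_batches(follower_ids, post_id, enqueue):
    """Split a huge follower list into chunks and enqueue one message each,
    so many workers can fan out the same post in parallel."""
    for start in range(0, len(follower_ids), BATCH_SIZE):
        batch = follower_ids[start:start + BATCH_SIZE]
        enqueue({"post_id": post_id, "follower_ids": batch})
    # number of batches sent (ceiling division)
    return (len(follower_ids) + BATCH_SIZE - 1) // BATCH_SIZE
```

For 10M followers this produces 1,000 messages; with a few hundred workers each handling 10K ZADDs, the whole fan-out completes in seconds instead of minutes.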

9. Real-time Updates & Notifications

WebSocket for Feed Updates

// Client-side: WebSocket connection for real-time feed updates
const ws = new WebSocket('wss://feed.example.com/ws');

ws.onmessage = (event) => {
  const update = JSON.parse(event.data);

  switch (update.type) {
    case 'new_post':
      // Prepend to feed UI
      feedStore.prepend(update.post);
      break;
    case 'engagement_update':
      // Update like/comment counts
      feedStore.updateCounts(update.post_id, update.counts);
      break;
    case 'new_notification':
      notificationStore.add(update.notification);
      break;
  }
};

Notification Flow

graph LR
    A["User likes post"] --> B["Interaction Service"]
    B --> C["Update Counts (Post Cache)"]
    B --> D["Notification Queue"]
    D --> E["Notification Worker"]
    E --> F{"Push enabled?"}
    F -- Yes --> G["FCM / APNs Push Notification"]
    F -- No --> H["In-app badge update"]
    style A fill:#e94560,stroke:#fff,color:#fff
    style D fill:#f39c12,stroke:#fff,color:#fff
    style G fill:#4CAF50,stroke:#fff,color:#fff

Notification flow: from interaction to push notification

10. Monitoring & Observability

| Metric | Alert Threshold | Monitoring Tool |
|---|---|---|
| Feed API P99 latency | > 500ms | Prometheus + Grafana |
| Fan-out queue lag | > 100K messages | Kafka Consumer Lag monitor |
| Redis cache hit rate | < 95% | Redis Exporter + Grafana |
| Post creation error rate | > 0.1% | Application metrics |
| CDN cache hit ratio | < 85% | CloudFront / Cloudflare Analytics |
| Database query P95 | > 100ms | Slow query log + APM |

Conclusion

Designing a News Feed system is a composite challenge spanning multiple techniques: from fan-out strategy determining write/read trade-offs, multi-layer caching to reduce latency, ranking algorithms to personalize the experience, to sharding and partitioning for horizontal scaling. The key takeaway isn't choosing Push or Pull, but knowing when to use each — and the Hybrid approach is the answer that all large-scale systems converge on.

Whether you're preparing for a system design interview or actually building a product, understanding the flow from user creates post → fan-out → cache → rank → serve will give you the confidence to design any feed/timeline system.
