News Feed System Design — Fan-out, Caching & Ranking at Scale for Millions of Users

Posted on: 4/26/2026 8:17:18 PM

  • 500M+ — Feed queries/day (100M DAU)
  • ~6,000 — Average QPS for the Feed API
  • <200ms — Target latency for feed load
  • 10K+ — Follower threshold → switch to Pull model

You open the app, swipe your thumb — and hundreds of posts from friends, followees, and groups appear seamlessly. That experience seems simple, but behind the scenes lies a complex distributed system that must solve countless problems: who posted what, who receives it, how to rank it, where to cache it, and how to ensure millions of concurrent users don't feel any latency.

This article dives deep into the architecture of a News Feed system — from data model, fan-out strategy, caching layers to ranking algorithms — following the patterns that companies like Facebook, Twitter (X), and Instagram have deployed at production scale.

1. System Requirements Analysis

Functional Requirements

  • Publish: Users can create posts (text, images, videos)
  • Feed Generation: Display a personalized feed from followed users
  • Interaction: Like, comment, share on each post
  • Pagination: Feed supports infinite scroll with cursor-based pagination

Non-Functional Requirements

  • Latency: Feed load < 200ms (P99)
  • Availability: 99.99% uptime — feed is the core experience
  • Scalability: Support 100M+ DAU, each user follows 200–500 people on average
  • Consistency: Eventual consistency is acceptable — users see new posts within seconds, no need for absolute real-time

Scale Estimation

| Metric | Estimate |
|---|---|
| DAU (Daily Active Users) | 100 million |
| New posts / day | ~50 million (0.5 posts/user) |
| Feed loads / day / user | ~5 times |
| Total feed queries / day | 500 million (~6,000 QPS) |
| Average followees / user | ~300 |
| Feed cache size / user | ~100 posts × 1KB = 100KB |
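The estimates above are easy to sanity-check with back-of-envelope arithmetic; a quick sketch (all figures rough, taken from the table):

```python
# Back-of-envelope check of the scale estimates above.
DAU = 100_000_000
FEED_LOADS_PER_USER = 5
POSTS_PER_USER = 0.5
SECONDS_PER_DAY = 86_400

feed_queries_per_day = DAU * FEED_LOADS_PER_USER        # 500 million
avg_feed_qps = feed_queries_per_day / SECONDS_PER_DAY   # ~5,787 → rounds to ~6,000
new_posts_per_day = DAU * POSTS_PER_USER                # 50 million
feed_cache_total_bytes = DAU * 100 * 1024               # 100 posts × 1KB per user ≈ 10TB

print(f"~{avg_feed_qps:.0f} QPS average feed load")
```

Note this is the average; peak QPS is typically 2–3× higher, which is what capacity planning should target.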

2. Data Model & API Design

Core Entities

-- Users
CREATE TABLE Users (
    user_id     BIGINT PRIMARY KEY,
    username    VARCHAR(50) UNIQUE,
    display_name NVARCHAR(100),
    avatar_url  VARCHAR(500),
    is_celebrity BOOLEAN DEFAULT FALSE,
    follower_count INT DEFAULT 0,
    created_at  TIMESTAMP DEFAULT NOW()
);

-- Posts
CREATE TABLE Posts (
    post_id     BIGINT PRIMARY KEY,  -- Snowflake ID
    author_id   BIGINT REFERENCES Users(user_id),
    content     TEXT,
    media_url   VARCHAR(500),
    media_type  VARCHAR(20),  -- 'image', 'video', 'none'
    like_count  INT DEFAULT 0,
    comment_count INT DEFAULT 0,
    share_count INT DEFAULT 0,
    created_at  TIMESTAMP DEFAULT NOW(),
    INDEX idx_author_time (author_id, created_at DESC)
);

-- Follow Graph
CREATE TABLE Follows (
    follower_id BIGINT,
    followee_id BIGINT,
    created_at  TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (follower_id, followee_id),
    INDEX idx_followee (followee_id)
);

-- Precomputed Feed (Fan-out on Write)
CREATE TABLE FeedItems (
    user_id     BIGINT,
    post_id     BIGINT,
    author_id   BIGINT,
    score       FLOAT,
    created_at  TIMESTAMP,
    PRIMARY KEY (user_id, post_id),
    INDEX idx_user_score (user_id, score DESC)
);

API Endpoints

POST   /v1/posts                    -- Create new post
GET    /v1/feed?cursor=xxx&limit=20  -- Get feed (cursor-based pagination)
POST   /v1/posts/{id}/like           -- Like / unlike
POST   /v1/posts/{id}/comment        -- Add comment
GET    /v1/posts/{id}/comments?cursor=xxx  -- Get comments
POST   /v1/users/{id}/follow         -- Follow user

Why Cursor-based Pagination?

Offset-based pagination (OFFSET 100 LIMIT 20) breaks when new items are inserted: users will see duplicate posts or miss items entirely. Cursor-based pagination uses created_at or score of the last item as an anchor, ensuring stable pagination even when data changes continuously.
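A minimal sketch of how such a cursor can be built, assuming a hypothetical `db`-style parameterized query (the `(created_at, post_id)` tuple comparison is valid keyset-pagination SQL in PostgreSQL and MySQL; `post_id` breaks ties between posts sharing a timestamp):

```python
import base64
import json

def encode_cursor(created_at: int, post_id: int) -> str:
    """Pack the anchor row's (created_at, post_id) into an opaque token."""
    return base64.urlsafe_b64encode(
        json.dumps([created_at, post_id]).encode()).decode()

def decode_cursor(cursor: str):
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))

def feed_page_query(cursor=None, limit=20):
    """Build the keyset-pagination query for one feed page."""
    if cursor is None:  # first page: just the newest rows
        return ("SELECT * FROM Posts "
                "ORDER BY created_at DESC, post_id DESC LIMIT ?", (limit,))
    created_at, post_id = decode_cursor(cursor)
    # Keyset predicate: rows strictly "older" than the anchor row
    sql = ("SELECT * FROM Posts "
           "WHERE (created_at, post_id) < (?, ?) "
           "ORDER BY created_at DESC, post_id DESC LIMIT ?")
    return sql, (created_at, post_id, limit)
```

The cursor is opaque to the client, so the server can later change its contents (e.g. switch from timestamp to ranking score) without breaking API consumers.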

3. Fan-out Strategy: Push vs Pull vs Hybrid

This is the most critical architectural decision for a News Feed system. There are 3 main strategies:

3.1. Fan-out on Write (Push Model)

graph LR
    A["User A creates post"] --> B["Post Service"]
    B --> C["Message Queue"]
    C --> D["Fan-out Workers"]
    D --> E["Feed Cache User X"]
    D --> F["Feed Cache User Y"]
    D --> G["Feed Cache User Z"]
    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#2c3e50,stroke:#fff,color:#fff
    style C fill:#f39c12,stroke:#fff,color:#fff
    style D fill:#16213e,stroke:#fff,color:#fff
    style E fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style F fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style G fill:#f8f9fa,stroke:#e94560,color:#2c3e50

Fan-out on Write: posts are pushed into every follower's feed cache at publish time

How it works: When User A creates a post, the system retrieves all of A's followers, then inserts the post into each follower's feed cache (Redis Sorted Set).

| Pros | Cons |
|---|---|
| Extremely fast reads — feed is precomputed | Slow writes if user has many followers |
| Simple read logic | Celebrity problem: 10M followers = 10M writes |
| Works well for majority of regular users | Wastes cache memory for inactive users |
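The write path can be sketched in a few lines. This assumes a redis-py-style client `r` and a hypothetical `get_followers()` lookup (in production this logic runs inside a queue consumer, not inline with the request):

```python
FEED_MAX = 500  # keep only the newest 500 items per follower's feed

def fan_out_post(r, get_followers, author_id, post_id, created_at):
    """Push a new post into every follower's feed Sorted Set."""
    for follower_id in get_followers(author_id):
        key = f"feed:{follower_id}"
        pipe = r.pipeline()
        pipe.zadd(key, {f"post:{post_id}": created_at})  # score = timestamp
        pipe.zremrangebyrank(key, 0, -(FEED_MAX + 1))    # trim beyond 500
        pipe.execute()
```

Pipelining the ZADD + trim pair keeps each follower update to a single round trip, which matters when one post triggers thousands of writes.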

3.2. Fan-out on Read (Pull Model)

graph LR
    A["User X opens feed"] --> B["Feed Service"]
    B --> C["Get followees list"]
    C --> D["Query Posts from each followee"]
    D --> E["Merge + Rank + Return"]
    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#2c3e50,stroke:#fff,color:#fff
    style C fill:#16213e,stroke:#fff,color:#fff
    style D fill:#f39c12,stroke:#fff,color:#fff
    style E fill:#4CAF50,stroke:#fff,color:#fff

Fan-out on Read: feed is assembled at the time of user request

How it works: When User X opens their feed, the system queries recent posts from everyone X follows, merges them, ranks, and returns the result.

| Pros | Cons |
|---|---|
| Fast writes — only one copy of the post stored | Slow reads — must query multiple sources |
| No wasted cache for inactive users | Latency scales linearly with number of followees |
| No celebrity problem on write path | Not suitable for < 200ms feed experience |
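The read path is essentially a k-way merge of per-author timelines. A sketch, assuming a hypothetical `fetch_recent(author_id)` that returns that author's posts already sorted newest-first:

```python
import heapq
from itertools import islice

def pull_feed(followee_ids, fetch_recent, limit=20):
    """Merge each followee's recent posts, newest-first, and take one page."""
    # Each source must already be sorted descending by created_at
    sources = (fetch_recent(fid) for fid in followee_ids)
    merged = heapq.merge(*sources,
                         key=lambda p: p["created_at"], reverse=True)
    return list(islice(merged, limit))
```

With ~300 followees this is 300 lookups per feed load, which is exactly why the pure pull model struggles to stay under 200ms without heavy caching.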

3.3. Hybrid Approach (Production Standard)

This is how Facebook, Twitter, and Instagram do it

Combine both Push and Pull: Push for regular users (under 10K followers) and Pull for celebrities (over 10K followers). When a user loads their feed, the system merges the precomputed feed with real-time posts from celebrities the user follows.

graph TD
    A["User creates post"] --> B{"Followers > 10K?"}
    B -- No --> C["Fan-out on Write: push to all followers' feed cache"]
    B -- Yes --> D["Only save to Posts table"]
    E["User X loads feed"] --> F["Read precomputed feed from cache"]
    E --> G["Query celebrity posts that X follows"]
    F --> H["Merge + Rank"]
    G --> H
    H --> I["Return feed to client"]
    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#f39c12,stroke:#fff,color:#2c3e50
    style C fill:#4CAF50,stroke:#fff,color:#fff
    style D fill:#2c3e50,stroke:#fff,color:#fff
    style E fill:#e94560,stroke:#fff,color:#fff
    style H fill:#16213e,stroke:#fff,color:#fff
    style I fill:#4CAF50,stroke:#fff,color:#fff

Hybrid: Push for regular users, Pull for celebrities, merge at read time

# Pseudo-code: Hybrid Feed Generation
async def get_feed(user_id, cursor, limit=20):
    # 1. Read precomputed feed (fan-out on write results)
    precomputed = await redis.zrevrangebyscore(
        f"feed:{user_id}",
        max=cursor or "+inf",
        min="-inf",
        start=0, num=limit
    )

    # 2. Get the celebrity IDs that the user follows
    celebrity_ids = await db.query("""
        SELECT followee_id FROM Follows f
        JOIN Users u ON u.user_id = f.followee_id
        WHERE f.follower_id = ? AND u.is_celebrity = TRUE
    """, user_id)

    # 3. Fetch recent posts from those celebrities (fan-out on read)
    celebrity_posts = []
    if celebrity_ids:
        celebrity_posts = await db.query("""
            SELECT * FROM Posts
            WHERE author_id IN (?) AND created_at > NOW() - INTERVAL '24 HOURS'
            ORDER BY created_at DESC LIMIT ?
        """, celebrity_ids, limit)

    # 4. Merge and rank
    all_posts = merge(precomputed, celebrity_posts)
    ranked = ranking_service.rank(all_posts, user_id)

    return ranked[:limit]

4. Overall Architecture

graph TD
    Client["Mobile / Web Client"] --> LB["Load Balancer"]
    LB --> API["API Gateway"]
    API --> PS["Post Service"]
    API --> FS["Feed Service"]
    API --> US["User Service"]

    PS --> DB_Post["Posts DB (Sharded)"]
    PS --> MQ["Message Queue (Kafka)"]
    PS --> S3["Object Storage (S3 / R2)"]
    MQ --> FW["Fan-out Workers"]
    FW --> RC["Feed Cache (Redis Cluster)"]
    FW --> NS["Notification Service"]
    FS --> RC
    FS --> DB_Post
    FS --> RS["Ranking Service"]
    US --> DB_User["Users DB"]
    US --> GC["Graph Cache (Follow relations)"]
    S3 --> CDN["CDN (CloudFront / Cloudflare)"]
    style Client fill:#e94560,stroke:#fff,color:#fff
    style LB fill:#2c3e50,stroke:#fff,color:#fff
    style API fill:#2c3e50,stroke:#fff,color:#fff
    style MQ fill:#f39c12,stroke:#fff,color:#fff
    style RC fill:#e94560,stroke:#e94560,color:#fff
    style FW fill:#16213e,stroke:#fff,color:#fff
    style CDN fill:#4CAF50,stroke:#fff,color:#fff

Overall architecture of a large-scale News Feed system

Core Components

| Service | Role | Common Tech Stack |
|---|---|---|
| Post Service | CRUD posts, publish events to queue | .NET / Go + PostgreSQL (sharded) |
| Feed Service | Generate feed, merge precomputed + celebrity posts | .NET / Go + Redis |
| Fan-out Workers | Consume from queue, push posts into feed cache | Background workers, horizontally scaled |
| Ranking Service | Score and sort feed items | Python / .NET + ML model |
| User Service | Manage profiles, follow graph | .NET / Go + PostgreSQL + Redis (graph cache) |
| Notification Service | Push notifications on interactions | Firebase / APNs + Message Queue |

5. Multi-layer Caching Strategy

Cache is the backbone of a News Feed. Without cache, each feed request would need to query hundreds of data sources — impossible at scale.

graph LR
    A["Client Cache (In-memory)"] --> B["CDN Cache (Static assets)"]
    B --> C["Feed Cache (Redis Sorted Set)"]
    C --> D["Post Cache (Redis Hash)"]
    D --> E["User Cache (Redis Hash)"]
    E --> F["Database (Source of Truth)"]
    style A fill:#4CAF50,stroke:#fff,color:#fff
    style B fill:#4CAF50,stroke:#fff,color:#fff
    style C fill:#e94560,stroke:#fff,color:#fff
    style D fill:#e94560,stroke:#fff,color:#fff
    style E fill:#e94560,stroke:#fff,color:#fff
    style F fill:#2c3e50,stroke:#fff,color:#fff

Multi-layer caching: from client to database

5.1. Feed Cache — Redis Sorted Set

Each user has a Sorted Set in Redis, scored by timestamp (or ranking score). Loading a feed only requires ZREVRANGEBYSCORE — O(log N + M) where M is the number of items to fetch.

# Structure: feed:{user_id} → Sorted Set
# Member: post_id, Score: timestamp (or ranking score)

# Push new post into feed
ZADD feed:12345 1714200000 "post:98765"

# Keep only the 500 most recent items (trim)
ZREMRANGEBYRANK feed:12345 0 -501

# Read feed: get 20 newest items
ZREVRANGEBYSCORE feed:12345 +inf -inf LIMIT 0 20

5.2. Post Cache — Redis Hash

# Cache post content to avoid DB queries
# Key: post:{post_id}
HSET post:98765 author_id 111 content "Hello world"
     like_count 42 comment_count 7 media_url "..."
     created_at 1714200000

# TTL: 24-48 hours for regular posts, longer for viral posts
EXPIRE post:98765 172800
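On the read side this is classic cache-aside: try the hash first, fall back to the database and backfill with the same TTL. A sketch, where `r` is a redis-py-style client and `db` a hypothetical data-access object (both placeholders):

```python
POST_TTL = 172_800  # 48 hours, matching the EXPIRE above

def get_post(r, db, post_id: int) -> dict:
    """Cache-aside read: Redis hash first, DB on miss, then backfill."""
    key = f"post:{post_id}"
    cached = r.hgetall(key)
    if cached:
        return cached
    post = db.fetch_post(post_id)   # source of truth on cache miss
    if post:
        r.hset(key, mapping=post)   # backfill so the next read is a hit
        r.expire(key, POST_TTL)
    return post
```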

5.3. Social Graph Cache

# Cache followees list for fast fan-out
# Key: followers:{user_id} → Set of follower IDs
SMEMBERS followers:111
# → {12345, 67890, 11111, ...}

# Cache celebrity list that user follows
SMEMBERS celeb_followees:12345
# → {celebrity_1, celebrity_2}

Cache Invalidation Strategy

Partial invalidation is the best approach for feed cache: when a new post arrives, simply ZADD to the sorted set and ZREMRANGEBYRANK to trim — no need to flush the entire cache. For post cache, use write-through: update cache immediately when like_count/comment_count changes, with TTL as a safety net.
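The write-through path for counters can be sketched as below; `r` is a redis-py-style client and `db` a hypothetical DAO, both placeholders. Note the cache is only updated if the post is already cached (otherwise the next read will backfill it anyway):

```python
POST_TTL = 172_800  # 48h safety-net TTL, matching the post cache above

def on_like(r, db, post_id: int) -> None:
    """Write-through counter update: DB first, then the cached copy."""
    key = f"post:{post_id}"
    db.increment_like_count(post_id)    # source of truth first
    if r.exists(key):
        r.hincrby(key, "like_count", 1) # keep the cached copy in sync
        r.expire(key, POST_TTL)         # refresh the safety-net TTL
```

Updating the DB before the cache means a crash between the two steps leaves the cache stale rather than the database wrong; the TTL bounds how long that staleness can last.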

6. Ranking Algorithm

A feed is not just a chronological list of recent posts — it must be personalized based on user behavior.

Scoring Formula (Simplified)

Score = w1 × affinity_score
      + w2 × interaction_score
      + w3 × recency_score
      + w4 × content_type_boost

Where:
- affinity_score: Level of interaction between user and author
  (likes, comments, shares between them in the last 30 days)
- interaction_score: Total engagement of the post
  (likes + 2×comments + 3×shares) / time_since_post
- recency_score: Time decay function
  = 1 / (1 + hours_since_post)^1.5
- content_type_boost: Video > Image > Text (platform-dependent)
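The formula translates directly into code. The weights and boost values below are illustrative placeholders (in production they would come from offline tuning or an ML model), but the term structure matches the definitions above:

```python
# Illustrative weights -- w1..w4 from the formula above, values are assumptions
W_AFFINITY, W_INTERACTION, W_RECENCY, W_TYPE = 0.4, 0.3, 0.2, 0.1
TYPE_BOOST = {"video": 1.0, "image": 0.7, "none": 0.4}

def recency_score(hours_since_post: float) -> float:
    """Time decay: 1 / (1 + hours)^1.5, as defined above."""
    return 1.0 / (1.0 + hours_since_post) ** 1.5

def interaction_score(likes, comments, shares, hours_since_post):
    """(likes + 2*comments + 3*shares) / time, floored at 1h to avoid blow-up."""
    return (likes + 2 * comments + 3 * shares) / max(hours_since_post, 1.0)

def score(post: dict, affinity: float) -> float:
    return (W_AFFINITY * affinity
            + W_INTERACTION * interaction_score(
                post["likes"], post["comments"], post["shares"], post["age_h"])
            + W_RECENCY * recency_score(post["age_h"])
            + W_TYPE * TYPE_BOOST.get(post["media_type"], 0.4))
```

With identical engagement and affinity, a one-hour-old post always outscores a ten-hour-old one, which is the behavior the decay term exists to guarantee.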

Facebook EdgeRank (Simplified Version)

graph LR
    A["Affinity (User-Author relationship)"] --> D["EdgeRank Score"]
    B["Weight (Interaction type)"] --> D
    C["Time Decay (Freshness)"] --> D
    D --> E["Sorted Feed"]
    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#f39c12,stroke:#fff,color:#fff
    style C fill:#4CAF50,stroke:#fff,color:#fff
    style D fill:#16213e,stroke:#fff,color:#fff
    style E fill:#2c3e50,stroke:#fff,color:#fff

Three key factors in EdgeRank: Affinity × Weight × Decay

| Signal | Description | Weight |
|---|---|---|
| Direct message | User has messaged the author before | Very High |
| Comment on post | User has commented on author's posts | High |
| Like | User has liked author's posts | Medium |
| Profile visit | User has visited author's profile | Medium |
| Shared group/page | In same group or following same page | Low |
| Video > Image > Link > Text | Content type | Descending |

Ranking Service in Production

At large scale, ranking services run ML models (typically gradient-boosted trees or deep learning) trained on billions of interactions. The model predicts P(user will interact with this post) and uses that probability as the score. However, the heuristic-based scoring above remains a solid baseline for small to medium-scale systems.

7. Media Handling & CDN

Posts with media (images, videos) consume the majority of bandwidth. Poor handling will kill performance.

Upload Flow — Presigned URL

sequenceDiagram
    participant C as Client
    participant API as API Gateway
    participant S3 as Object Storage
    participant CDN as CDN

    C->>API: POST /v1/media/upload-url
    API->>S3: Generate presigned URL
    S3-->>API: Presigned URL (5 min expiry)
    API-->>C: { uploadUrl, mediaKey }
    C->>S3: PUT file directly
    S3-->>C: 200 OK
    C->>API: POST /v1/posts { content, mediaKey }
    Note over API: Save post with media_url
    C->>CDN: GET media (via CDN edge)
    CDN->>S3: Cache miss → fetch from origin
    CDN-->>C: Serve from edge (cache hit next time)

Upload flow: client uploads directly to Object Storage, served via CDN

Benefits: The API server doesn't handle file uploads — reducing CPU and bandwidth usage. Clients upload directly to S3/R2, media is served from the nearest CDN edge.
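A sketch of the `/v1/media/upload-url` handler. `sign_put_url` stands in for the object-store SDK call (e.g. boto3's `generate_presigned_url`); it is a hypothetical placeholder here so the flow stays self-contained:

```python
import secrets
import time

UPLOAD_URL_TTL = 300  # 5 minutes, matching the diagram above

def create_upload_url(user_id: int, content_type: str, sign_put_url):
    """Issue a short-lived direct-upload URL plus the key to reference later."""
    # Key the object by user + timestamp + random token so clients
    # can neither guess nor collide with each other's uploads
    media_key = f"uploads/{user_id}/{int(time.time())}-{secrets.token_hex(8)}"
    upload_url = sign_put_url(media_key, content_type,
                              expires_in=UPLOAD_URL_TTL)
    return {"uploadUrl": upload_url, "mediaKey": media_key}
```

The client PUTs the file to `uploadUrl` and then submits `mediaKey` with the post, exactly as in the sequence diagram; the API server never touches the file bytes.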

Media Processing Pipeline

  • Image: Resize to multiple dimensions (thumbnail 150px, medium 600px, full 1200px), convert to WebP/AVIF, strip metadata
  • Video: Transcode to adaptive bitrate (HLS/DASH), generate thumbnails, extract duration
  • Processing: Runs async via message queue — never blocks post creation

8. Scaling & Partitioning

Database Sharding

| Table | Shard Key | Reason |
|---|---|---|
| Posts | author_id | All posts by one user on the same shard → fast queries |
| FeedItems | user_id | One user's feed on the same shard → fast reads |
| Follows | follower_id | "Who does user X follow?" query hits single shard |
| Users | user_id | Consistent hashing, easy to add more shards |

Redis Cluster Partitioning

Feed cache: Hash slot = CRC16("feed:{user_id}") mod 16384
→ Each Redis node handles a range of slots
→ Scale by adding nodes + rebalancing slots

Memory estimation:
- 100M users × 100KB feed cache = ~10TB
- Redis Cluster 20 nodes × 512GB RAM = 10TB ✓
- Replication factor 2 → 40 nodes total

Fan-out Worker Scaling

Handling Celebrity Posts

When a celebrity (10M followers) creates a post, fan-out on write would require 10M writes — taking minutes if done sequentially. Solutions: (1) Don't fan-out for celebrities (hybrid model), (2) If fan-out is required, partition followers into batches of 10K, send to queue, multiple workers process in parallel → complete within seconds.
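The batching step in solution (2) is a simple chunking loop. A sketch, where `enqueue` is a hypothetical queue-producer callback (e.g. a Kafka produce call):

```python
BATCH_SIZE = 10_000  # followers per fan-out message, as described above

def fan_out_in_batches(follower_ids, post_id, enqueue):
    """Split a huge follower list into chunks and enqueue one message each,
    so many workers can fan out the same post in parallel."""
    for start in range(0, len(follower_ids), BATCH_SIZE):
        batch = follower_ids[start:start + BATCH_SIZE]
        enqueue({"post_id": post_id, "follower_ids": batch})
    # number of batches sent (ceiling division)
    return (len(follower_ids) + BATCH_SIZE - 1) // BATCH_SIZE
```

For 10M followers this produces 1,000 messages; with a few hundred workers each handling 10K ZADDs, the whole fan-out completes in seconds instead of minutes.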

9. Real-time Updates & Notifications

WebSocket for Feed Updates

// Client-side: WebSocket connection for real-time feed updates
const ws = new WebSocket('wss://feed.example.com/ws');

ws.onmessage = (event) => {
  const update = JSON.parse(event.data);

  switch (update.type) {
    case 'new_post':
      // Prepend to feed UI
      feedStore.prepend(update.post);
      break;
    case 'engagement_update':
      // Update like/comment counts
      feedStore.updateCounts(update.post_id, update.counts);
      break;
    case 'new_notification':
      notificationStore.add(update.notification);
      break;
  }
};

Notification Flow

graph LR
    A["User likes post"] --> B["Interaction Service"]
    B --> C["Update Counts (Post Cache)"]
    B --> D["Notification Queue"]
    D --> E["Notification Worker"]
    E --> F{"Push enabled?"}
    F -- Yes --> G["FCM / APNs Push Notification"]
    F -- No --> H["In-app badge update"]
    style A fill:#e94560,stroke:#fff,color:#fff
    style D fill:#f39c12,stroke:#fff,color:#fff
    style G fill:#4CAF50,stroke:#fff,color:#fff

Notification flow: from interaction to push notification

10. Monitoring & Observability

| Metric | Alert Threshold | Monitoring Tool |
|---|---|---|
| Feed API P99 latency | > 500ms | Prometheus + Grafana |
| Fan-out queue lag | > 100K messages | Kafka Consumer Lag monitor |
| Redis cache hit rate | < 95% | Redis Exporter + Grafana |
| Post creation error rate | > 0.1% | Application metrics |
| CDN cache hit ratio | < 85% | CloudFront / Cloudflare Analytics |
| Database query P95 | > 100ms | Slow query log + APM |

Conclusion

Designing a News Feed system is a composite challenge spanning multiple techniques: from fan-out strategy determining write/read trade-offs, multi-layer caching to reduce latency, ranking algorithms to personalize the experience, to sharding and partitioning for horizontal scaling. The key takeaway isn't choosing Push or Pull, but knowing when to use each — and the Hybrid approach is the answer that all large-scale systems converge on.

Whether you're preparing for a system design interview or actually building a product, understanding the flow from user creates post → fan-out → cache → rank → serve will give you the confidence to design any feed/timeline system.
