News Feed System Design — Fan-out, Caching & Ranking at Scale for Millions of Users
Posted on: 4/26/2026 8:17:18 PM
Table of contents
- 1. System Requirements Analysis
- 2. Data Model & API Design
- 3. Fan-out Strategy: Push vs Pull vs Hybrid
- 4. Overall Architecture
- 5. Multi-layer Caching Strategy
- 6. Ranking Algorithm
- 7. Media Handling & CDN
- 8. Scaling & Partitioning
- 9. Real-time Updates & Notifications
- 10. Monitoring & Observability
- Conclusion
You open the app, swipe your thumb — and hundreds of posts from friends, followees, and groups appear seamlessly. That experience seems simple, but behind the scenes lies a complex distributed system that must solve countless problems: who posted what, who receives it, how to rank it, where to cache it, and how to ensure millions of concurrent users don't feel any latency.
This article dives deep into the architecture of a News Feed system — from data model, fan-out strategy, caching layers to ranking algorithms — following the patterns that companies like Facebook, Twitter (X), and Instagram have deployed at production scale.
1. System Requirements Analysis
Functional Requirements
- Publish: Users can create posts (text, images, videos)
- Feed Generation: Display a personalized feed from followed users
- Interaction: Like, comment, share on each post
- Pagination: Feed supports infinite scroll with cursor-based pagination
Non-Functional Requirements
- Latency: Feed load < 200ms (P99)
- Availability: 99.99% uptime — feed is the core experience
- Scalability: Support 100M+ DAU, each user follows 200–500 people on average
- Consistency: Eventual consistency is acceptable — users see new posts within seconds, no need for absolute real-time
Scale Estimation
| Metric | Estimate |
|---|---|
| DAU (Daily Active Users) | 100 million |
| New posts / day | ~50 million (0.5 posts/user) |
| Feed loads / day / user | ~5 times |
| Total feed queries / day | 500 million (~6,000 QPS) |
| Average followees / user | ~300 |
| Feed cache size / user | ~100 posts × 1KB = 100KB |
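The table above can be sanity-checked with quick back-of-envelope arithmetic (the inputs below simply mirror the estimates, they are assumptions, not measurements):

```python
# Back-of-envelope sanity check for the scale estimates above
DAU = 100_000_000            # daily active users
POSTS_PER_USER = 0.5         # average posts per user per day
FEED_LOADS_PER_USER = 5      # feed opens per user per day
FEED_CACHE_PER_USER_KB = 100 # ~100 posts x 1KB

posts_per_day = int(DAU * POSTS_PER_USER)          # 50M new posts/day
feed_queries_per_day = DAU * FEED_LOADS_PER_USER   # 500M feed loads/day
avg_feed_qps = feed_queries_per_day / 86_400       # 86,400 seconds/day

print(posts_per_day)        # 50000000
print(round(avg_feed_qps))  # ~5787 average; peak traffic is typically 2-3x higher
```

Note that ~6,000 QPS is the daily average; capacity planning should target peak QPS, which is usually a small multiple of this.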
2. Data Model & API Design
Core Entities
-- Users
CREATE TABLE Users (
  user_id        BIGINT PRIMARY KEY,
  username       VARCHAR(50) UNIQUE,
  display_name   VARCHAR(100),
  avatar_url     VARCHAR(500),
  is_celebrity   BOOLEAN DEFAULT FALSE,
  follower_count INT DEFAULT 0,
  created_at     TIMESTAMP DEFAULT NOW()
);
-- Posts
CREATE TABLE Posts (
  post_id       BIGINT PRIMARY KEY,  -- Snowflake ID
  author_id     BIGINT REFERENCES Users(user_id),
  content       TEXT,
  media_url     VARCHAR(500),
  media_type    VARCHAR(20),         -- 'image', 'video', 'none'
  like_count    INT DEFAULT 0,
  comment_count INT DEFAULT 0,
  share_count   INT DEFAULT 0,
  created_at    TIMESTAMP DEFAULT NOW()
);
CREATE INDEX idx_author_time ON Posts (author_id, created_at DESC);
-- Follow Graph
CREATE TABLE Follows (
  follower_id BIGINT,
  followee_id BIGINT,
  created_at  TIMESTAMP DEFAULT NOW(),
  PRIMARY KEY (follower_id, followee_id)
);
CREATE INDEX idx_followee ON Follows (followee_id);
-- Precomputed Feed (Fan-out on Write)
CREATE TABLE FeedItems (
  user_id    BIGINT,
  post_id    BIGINT,
  author_id  BIGINT,
  score      FLOAT,
  created_at TIMESTAMP,
  PRIMARY KEY (user_id, post_id)
);
CREATE INDEX idx_user_score ON FeedItems (user_id, score DESC);
API Endpoints
POST /v1/posts -- Create new post
GET /v1/feed?cursor=xxx&limit=20 -- Get feed (cursor-based pagination)
POST /v1/posts/{id}/like -- Like / unlike
POST /v1/posts/{id}/comment -- Add comment
GET /v1/posts/{id}/comments?cursor=xxx -- Get comments
POST /v1/users/{id}/follow -- Follow user
Why Cursor-based Pagination?
Offset-based pagination (OFFSET 100 LIMIT 20) breaks when new items are inserted: users will see duplicate posts or miss items entirely. Cursor-based pagination uses created_at or score of the last item as an anchor, ensuring stable pagination even when data changes continuously.
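A minimal sketch of how such a cursor can work, assuming a `(created_at, post_id)` composite key encoded as an opaque token (the helper names and query shape are illustrative, not from a specific implementation):

```python
import base64
import json

def encode_cursor(created_at: int, post_id: int) -> str:
    """Pack the last returned item's sort keys into an opaque token."""
    raw = json.dumps([created_at, post_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(cursor: str) -> tuple:
    created_at, post_id = json.loads(base64.urlsafe_b64decode(cursor))
    return created_at, post_id

# Keyset predicate: strictly "older than the last item we returned".
# The post_id tie-breaker keeps ordering stable even when many posts
# share the same created_at timestamp.
QUERY = """
SELECT * FROM Posts
WHERE (created_at, post_id) < (%s, %s)
ORDER BY created_at DESC, post_id DESC
LIMIT %s
"""

token = encode_cursor(1714200000, 98765)
print(decode_cursor(token))  # (1714200000, 98765)
```

New inserts only ever appear "above" the cursor, so the next page is unaffected by writes that happen between requests.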
3. Fan-out Strategy: Push vs Pull vs Hybrid
This is the most critical architectural decision for a News Feed system. There are 3 main strategies:
3.1. Fan-out on Write (Push Model)
graph LR
A["User A creates post"] --> B["Post Service"]
B --> C["Message Queue"]
C --> D["Fan-out Workers"]
D --> E["Feed Cache User X"]
D --> F["Feed Cache User Y"]
D --> G["Feed Cache User Z"]
style A fill:#e94560,stroke:#fff,color:#fff
style B fill:#2c3e50,stroke:#fff,color:#fff
style C fill:#f39c12,stroke:#fff,color:#fff
style D fill:#16213e,stroke:#fff,color:#fff
style E fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style F fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style G fill:#f8f9fa,stroke:#e94560,color:#2c3e50
Fan-out on Write: posts are pushed into every follower's feed cache at publish time
How it works: When User A creates a post, the system retrieves all of A's followers, then inserts the post into each follower's feed cache (Redis Sorted Set).
| Pros | Cons |
|---|---|
| Extremely fast reads — feed is precomputed | Slow writes if user has many followers |
| Simple read logic | Celebrity problem: 10M followers = 10M writes |
| Works well for majority of regular users | Wastes cache memory for inactive users |
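The write path above can be sketched as a worker that turns one post into a batch of sorted-set writes. The key names mirror the Redis commands shown later in this article; the separation into a pure "build ops" step and a pipeline "apply" step is an illustrative choice, not a required design:

```python
FEED_MAX_LEN = 500  # keep at most 500 items per feed

def build_fanout_ops(post_id, created_at, follower_ids):
    """Compute the Redis writes for one post: for each follower,
    a ZADD into their feed sorted set plus a trim to FEED_MAX_LEN."""
    ops = []
    for follower_id in follower_ids:
        key = f"feed:{follower_id}"
        ops.append(("ZADD", key, created_at, f"post:{post_id}"))
        ops.append(("ZREMRANGEBYRANK", key, 0, -(FEED_MAX_LEN + 1)))
    return ops

def fan_out(redis_client, post_id, created_at, follower_ids):
    """Apply the ops through a pipeline so network round trips are
    batched (assumes a redis-py-compatible client)."""
    pipe = redis_client.pipeline(transaction=False)
    for op in build_fanout_ops(post_id, created_at, follower_ids):
        if op[0] == "ZADD":
            pipe.zadd(op[1], {op[3]: op[2]})
        else:
            pipe.zremrangebyrank(op[1], op[2], op[3])
    pipe.execute()
```

For a user with 300 followers this is 600 pipelined commands, which Redis absorbs easily; the celebrity case is where this model breaks down, as discussed below.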
3.2. Fan-out on Read (Pull Model)
graph LR
A["User X opens feed"] --> B["Feed Service"]
B --> C["Get followees list"]
C --> D["Query Posts from each followee"]
D --> E["Merge + Rank + Return"]
style A fill:#e94560,stroke:#fff,color:#fff
style B fill:#2c3e50,stroke:#fff,color:#fff
style C fill:#16213e,stroke:#fff,color:#fff
style D fill:#f39c12,stroke:#fff,color:#fff
style E fill:#4CAF50,stroke:#fff,color:#fff
Fan-out on Read: feed is assembled at the time of user request
How it works: When User X opens their feed, the system queries recent posts from everyone X follows, merges them, ranks, and returns the result.
| Pros | Cons |
|---|---|
| Fast writes — only one copy of the post stored | Slow reads — must query multiple sources |
| No wasted cache for inactive users | Latency scales linearly with number of followees |
| No celebrity problem on write path | Not suitable for < 200ms feed experience |
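The merge step of the pull path is essentially a k-way merge of per-followee timelines. A minimal sketch with `heapq.merge` (the post dict shape is illustrative):

```python
import heapq

def pull_feed(timelines, limit=20):
    """Merge per-followee post lists (each already sorted newest-first)
    into one newest-first feed. heapq.merge performs a k-way merge
    lazily, so we never concatenate or fully sort all posts."""
    merged = heapq.merge(
        *timelines,
        key=lambda post: post["created_at"],
        reverse=True,  # newest first
    )
    return [post for _, post in zip(range(limit), merged)]

alice = [{"id": 3, "created_at": 300}, {"id": 1, "created_at": 100}]
bob   = [{"id": 4, "created_at": 400}, {"id": 2, "created_at": 200}]
print([p["id"] for p in pull_feed([alice, bob], limit=3)])  # [4, 3, 2]
```

The merge itself is cheap; the latency problem is fetching the hundreds of input timelines in the first place, which is why pull alone cannot meet a 200ms budget.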
3.3. Hybrid Approach (Production Standard)
This is how Facebook, Twitter, and Instagram do it
Combine both Push and Pull: Push for regular users (under 10K followers) and Pull for celebrities (over 10K followers). When a user loads their feed, the system merges the precomputed feed with real-time posts from celebrities the user follows.
graph TD
A["User creates post"] --> B{"Followers > 10K?"}
B -- No --> C["Fan-out on Write
Push to all followers'
feed cache"]
B -- Yes --> D["Only save to
Posts table"]
E["User X loads feed"] --> F["Read precomputed
feed from cache"]
E --> G["Query celebrity posts
that X follows"]
F --> H["Merge + Rank"]
G --> H
H --> I["Return feed
to client"]
style A fill:#e94560,stroke:#fff,color:#fff
style B fill:#f39c12,stroke:#fff,color:#2c3e50
style C fill:#4CAF50,stroke:#fff,color:#fff
style D fill:#2c3e50,stroke:#fff,color:#fff
style E fill:#e94560,stroke:#fff,color:#fff
style H fill:#16213e,stroke:#fff,color:#fff
style I fill:#4CAF50,stroke:#fff,color:#fff
Hybrid: Push for regular users, Pull for celebrities, merge at read time
# Pseudo-code: Hybrid Feed Generation
async def get_feed(user_id, cursor, limit=20):
    # 1. Read precomputed feed (fan-out on write results)
    precomputed = await redis.zrevrangebyscore(
        f"feed:{user_id}",
        max=cursor or "+inf",
        min="-inf",
        start=0, num=limit
    )

    # 2. Get the list of celebrities this user follows
    celebrity_ids = await db.query("""
        SELECT followee_id FROM Follows f
        JOIN Users u ON u.user_id = f.followee_id
        WHERE f.follower_id = ? AND u.is_celebrity = TRUE
    """, user_id)

    # 3. Fetch recent posts from celebrities (fan-out on read)
    celebrity_posts = await db.query("""
        SELECT * FROM Posts
        WHERE author_id IN (?) AND created_at > NOW() - INTERVAL '24 HOURS'
        ORDER BY created_at DESC LIMIT ?
    """, celebrity_ids, limit)

    # 4. Merge and rank
    all_posts = merge(precomputed, celebrity_posts)
    ranked = ranking_service.rank(all_posts, user_id)
    return ranked[:limit]
4. Overall Architecture
graph TD
Client["Mobile / Web Client"] --> LB["Load Balancer"]
LB --> API["API Gateway"]
API --> PS["Post Service"]
API --> FS["Feed Service"]
API --> US["User Service"]
PS --> DB_Post["Posts DB
(Sharded)"]
PS --> MQ["Message Queue
(Kafka)"]
PS --> S3["Object Storage
(S3 / R2)"]
MQ --> FW["Fan-out Workers"]
FW --> RC["Feed Cache
(Redis Cluster)"]
FW --> NS["Notification Service"]
FS --> RC
FS --> DB_Post
FS --> RS["Ranking Service"]
US --> DB_User["Users DB"]
US --> GC["Graph Cache
(Follow relations)"]
S3 --> CDN["CDN
(CloudFront / Cloudflare)"]
style Client fill:#e94560,stroke:#fff,color:#fff
style LB fill:#2c3e50,stroke:#fff,color:#fff
style API fill:#2c3e50,stroke:#fff,color:#fff
style MQ fill:#f39c12,stroke:#fff,color:#fff
style RC fill:#e94560,stroke:#e94560,color:#fff
style FW fill:#16213e,stroke:#fff,color:#fff
style CDN fill:#4CAF50,stroke:#fff,color:#fff
Overall architecture of a large-scale News Feed system
Core Components
| Service | Role | Common Tech Stack |
|---|---|---|
| Post Service | CRUD posts, publish events to queue | .NET / Go + PostgreSQL (sharded) |
| Feed Service | Generate feed, merge precomputed + celebrity posts | .NET / Go + Redis |
| Fan-out Workers | Consume from queue, push posts into feed cache | Background workers, horizontally scaled |
| Ranking Service | Score and sort feed items | Python / .NET + ML model |
| User Service | Manage profiles, follow graph | .NET / Go + PostgreSQL + Redis (graph cache) |
| Notification Service | Push notifications on interactions | Firebase / APNs + Message Queue |
5. Multi-layer Caching Strategy
Cache is the backbone of a News Feed. Without cache, each feed request would need to query hundreds of data sources — impossible at scale.
graph LR
A["Client Cache
(In-memory)"] --> B["CDN Cache
(Static assets)"]
B --> C["Feed Cache
(Redis Sorted Set)"]
C --> D["Post Cache
(Redis Hash)"]
D --> E["User Cache
(Redis Hash)"]
E --> F["Database
(Source of Truth)"]
style A fill:#4CAF50,stroke:#fff,color:#fff
style B fill:#4CAF50,stroke:#fff,color:#fff
style C fill:#e94560,stroke:#fff,color:#fff
style D fill:#e94560,stroke:#fff,color:#fff
style E fill:#e94560,stroke:#fff,color:#fff
style F fill:#2c3e50,stroke:#fff,color:#fff
Multi-layer caching: from client to database
5.1. Feed Cache — Redis Sorted Set
Each user has a Sorted Set in Redis, scored by timestamp (or ranking score). Loading a feed only requires ZREVRANGEBYSCORE — O(log N + M) where M is the number of items to fetch.
# Structure: feed:{user_id} → Sorted Set
# Member: post_id, Score: timestamp (or ranking score)
# Push new post into feed
ZADD feed:12345 1714200000 "post:98765"
# Keep only the 500 most recent items (trim)
ZREMRANGEBYRANK feed:12345 0 -501
# Read feed: get 20 newest items
ZREVRANGEBYSCORE feed:12345 +inf -inf LIMIT 0 20
5.2. Post Cache — Redis Hash
# Cache post content to avoid DB queries
# Key: post:{post_id}
HSET post:98765 author_id 111 content "Hello world" like_count 42 comment_count 7 media_url "..." created_at 1714200000
# TTL: 24-48 hours for regular posts, longer for viral posts
EXPIRE post:98765 172800
5.3. Social Graph Cache
# Cache followers list for fast fan-out
# Key: followers:{user_id} → Set of follower IDs
SMEMBERS followers:111
# → {12345, 67890, 11111, ...}
# Cache celebrity list that user follows
SMEMBERS celeb_followees:12345
# → {celebrity_1, celebrity_2}
Cache Invalidation Strategy
Partial invalidation is the best approach for feed cache: when a new post arrives, simply ZADD to the sorted set and ZREMRANGEBYRANK to trim — no need to flush the entire cache. For post cache, use write-through: update cache immediately when like_count/comment_count changes, with TTL as a safety net.
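The write-through counter update can be sketched as follows; `FakeCache` is a stand-in for a Redis hash used purely for illustration, and the DB write (which commits first) is elided:

```python
POST_TTL_SECONDS = 172_800  # 48h safety net, matching the EXPIRE above

class FakeCache:
    """In-memory stand-in for a Redis hash, for illustration only."""
    def __init__(self):
        self.hashes, self.ttls = {}, {}
    def hincrby(self, key, field, delta):
        h = self.hashes.setdefault(key, {})
        h[field] = h.get(field, 0) + delta
        return h[field]
    def expire(self, key, ttl):
        self.ttls[key] = ttl

def on_like(cache, post_id, delta=1):
    """Write-through: after the authoritative DB counter is updated
    (not shown), mirror the change into the post cache immediately
    so readers never see a stale count; TTL is the safety net."""
    key = f"post:{post_id}"
    cache.hincrby(key, "like_count", delta)   # HINCRBY post:{id} like_count 1
    cache.expire(key, POST_TTL_SECONDS)       # refresh the safety-net TTL

cache = FakeCache()
on_like(cache, 98765)
on_like(cache, 98765)
print(cache.hashes["post:98765"]["like_count"])  # 2
```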
6. Ranking Algorithm
A feed is not just a chronological list of recent posts — it must be personalized based on user behavior.
Scoring Formula (Simplified)
Score = w1 × affinity_score
+ w2 × interaction_score
+ w3 × recency_score
+ w4 × content_type_boost
Where:
- affinity_score: Level of interaction between user and author
(likes, comments, shares between them in the last 30 days)
- interaction_score: Total engagement of the post
(likes + 2×comments + 3×shares) / time_since_post
- recency_score: Time decay function
= 1 / (1 + hours_since_post)^1.5
- content_type_boost: Video > Image > Text (platform-dependent)
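The formula translates directly into code. The weights and boost values below are illustrative placeholders; real systems tune them through experiments:

```python
# Illustrative weights -- real systems tune these empirically
W_AFFINITY, W_INTERACTION, W_RECENCY, W_CONTENT = 0.4, 0.3, 0.2, 0.1

CONTENT_BOOST = {"video": 1.0, "image": 0.7, "none": 0.4}

def recency_score(hours_since_post):
    # Time-decay function from the formula above
    return 1 / (1 + hours_since_post) ** 1.5

def interaction_score(likes, comments, shares, hours_since_post):
    # Engagement weighted by type, normalized by post age
    return (likes + 2 * comments + 3 * shares) / max(hours_since_post, 1.0)

def score(post, affinity):
    return (W_AFFINITY * affinity
            + W_INTERACTION * interaction_score(
                post["likes"], post["comments"], post["shares"], post["age_h"])
            + W_RECENCY * recency_score(post["age_h"])
            + W_CONTENT * CONTENT_BOOST.get(post["media_type"], 0.4))

p = {"likes": 10, "comments": 2, "shares": 1, "age_h": 2.0, "media_type": "image"}
print(round(score(p, affinity=0.5), 3))  # 2.858
```

Note the raw interaction term is unbounded here; production systems typically normalize or log-scale it so a viral post cannot drown out everything else.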
Facebook EdgeRank (Simplified Version)
graph LR
A["Affinity
(User-Author relationship)"] --> D["EdgeRank Score"]
B["Weight
(Interaction type)"] --> D
C["Time Decay
(Freshness)"] --> D
D --> E["Sorted Feed"]
style A fill:#e94560,stroke:#fff,color:#fff
style B fill:#f39c12,stroke:#fff,color:#fff
style C fill:#4CAF50,stroke:#fff,color:#fff
style D fill:#16213e,stroke:#fff,color:#fff
style E fill:#2c3e50,stroke:#fff,color:#fff
Three key factors in EdgeRank: Affinity × Weight × Decay
| Signal | Description | Weight |
|---|---|---|
| Direct message | User has messaged the author before | Very High |
| Comment on post | User has commented on author's posts | High |
| Like | User has liked author's posts | Medium |
| Profile visit | User has visited author's profile | Medium |
| Shared group/page | In same group or following same page | Low |
| Video > Image > Link > Text | Content type | Descending |
Ranking Service in Production
At large scale, ranking services run ML models (typically gradient-boosted trees or deep learning) trained on billions of interactions. The model predicts P(user will interact with this post) and uses that probability as the score. However, the heuristic-based scoring above remains a solid baseline for small to medium-scale systems.
7. Media Handling & CDN
Posts with media (images, videos) consume the majority of bandwidth. Poor handling will kill performance.
Upload Flow — Presigned URL
sequenceDiagram
participant C as Client
participant API as API Gateway
participant S3 as Object Storage
participant CDN as CDN
C->>API: POST /v1/media/upload-url
API->>S3: Generate presigned URL
S3-->>API: Presigned URL (5 min expiry)
API-->>C: { uploadUrl, mediaKey }
C->>S3: PUT file directly
S3-->>C: 200 OK
C->>API: POST /v1/posts { content, mediaKey }
Note over API: Save post with media_url
C->>CDN: GET media (via CDN edge)
CDN->>S3: Cache miss → fetch from origin
CDN-->>C: Serve from edge (cache hit next time)
Upload flow: client uploads directly to Object Storage, served via CDN
Benefits: The API server doesn't handle file uploads — reducing CPU and bandwidth usage. Clients upload directly to S3/R2, media is served from the nearest CDN edge.
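The upload-url endpoint logic can be sketched as below. The `presign` callable is injected so the handler stays storage-agnostic; the key format, allowed content types, and the boto3 wiring in the comment are all assumptions for illustration:

```python
import secrets

def create_upload_url(presign, user_id, content_type):
    """Handler logic for POST /v1/media/upload-url. `presign` is an
    injected function (e.g. wrapping boto3's generate_presigned_url),
    which keeps this handler easy to test without touching S3."""
    if content_type not in ("image/jpeg", "image/png", "video/mp4"):
        raise ValueError(f"unsupported content type: {content_type}")
    # Random key prevents guessable URLs and collisions per user
    media_key = f"uploads/{user_id}/{secrets.token_hex(16)}"
    upload_url = presign(media_key, content_type, expires_in=300)  # 5 min
    return {"uploadUrl": upload_url, "mediaKey": media_key}

# With boto3 this might be wired up like (assumption, not from the article):
#   s3 = boto3.client("s3")
#   presign = lambda key, ct, expires_in: s3.generate_presigned_url(
#       "put_object",
#       Params={"Bucket": "media-bucket", "Key": key, "ContentType": ct},
#       ExpiresIn=expires_in)
```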
Media Processing Pipeline
- Image: Resize to multiple dimensions (thumbnail 150px, medium 600px, full 1200px), convert to WebP/AVIF, strip metadata
- Video: Transcode to adaptive bitrate (HLS/DASH), generate thumbnails, extract duration
- Processing: Runs async via message queue — never blocks post creation
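The dispatch shape of such a pipeline worker might look like this; the message format, size variants, and output naming are hypothetical, and the handler bodies stand in for real resize/transcode steps:

```python
# Queue-consumer sketch: post creation only enqueues a message;
# workers pick it up and dispatch by media type.
IMAGE_SIZES = {"thumbnail": 150, "medium": 600, "full": 1200}

def process_image(media_key):
    # Placeholder for resize + WebP/AVIF conversion + metadata strip
    return [f"{media_key}@{name}_{px}px" for name, px in IMAGE_SIZES.items()]

def process_video(media_key):
    # Placeholder for ABR transcode + thumbnail extraction
    return [f"{media_key}/hls/master.m3u8", f"{media_key}/thumb.jpg"]

HANDLERS = {"image": process_image, "video": process_video}

def handle_message(msg):
    """Entry point for one queue message; msg shape is illustrative."""
    handler = HANDLERS.get(msg["media_type"])
    if handler is None:
        return []  # text-only post: nothing to process
    return handler(msg["media_key"])

print(handle_message({"media_type": "image", "media_key": "uploads/1/abc"}))
```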
8. Scaling & Partitioning
Database Sharding
| Table | Shard Key | Reason |
|---|---|---|
| Posts | author_id | All posts by one user on the same shard → fast queries |
| FeedItems | user_id | One user's feed on the same shard → fast reads |
| Follows | follower_id | "Who does user X follow?" query hits single shard |
| Users | user_id | Consistent hashing, easy to add more shards |
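Routing a row to a shard by its shard key can be as simple as hash-then-mod; the sketch below uses CRC32 as a stand-in, while real deployments usually add a slot or directory layer (as Redis Cluster does) so shards can be rebalanced without rehashing every key:

```python
import zlib

NUM_SHARDS = 64  # illustrative shard count

def shard_for(shard_key):
    """Map a shard key (e.g. author_id for Posts, user_id for
    FeedItems) to a shard number. Deterministic, so all rows with
    the same key always land on the same shard."""
    return zlib.crc32(str(shard_key).encode()) % NUM_SHARDS

# All posts by author 111 land on one shard, so
# "recent posts by author 111" is a single-shard query:
print(shard_for(111), shard_for(12345))
```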
Redis Cluster Partitioning
Feed cache: Hash slot = CRC16("feed:{user_id}") mod 16384
→ Each Redis node handles a range of slots
→ Scale by adding nodes + rebalancing slots
Memory estimation:
- 100M users × 100KB feed cache = ~10TB
- Redis Cluster 20 nodes × 512GB RAM = 10TB ✓
- Replication factor 2 → 40 nodes total
Fan-out Worker Scaling: Handling Celebrity Posts
When a celebrity (10M followers) creates a post, fan-out on write would require 10M writes — taking minutes if done sequentially. Solutions: (1) Don't fan-out for celebrities (hybrid model), (2) If fan-out is required, partition followers into batches of 10K, send to queue, multiple workers process in parallel → complete within seconds.
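The batching step of solution (2) is straightforward; the message shape is illustrative:

```python
BATCH_SIZE = 10_000  # followers per queue message

def make_fanout_batches(post_id, follower_ids):
    """Split a large follower list into fixed-size batches, one queue
    message each, so many workers can fan out in parallel."""
    return [
        {"post_id": post_id, "followers": follower_ids[i:i + BATCH_SIZE]}
        for i in range(0, len(follower_ids), BATCH_SIZE)
    ]

batches = make_fanout_batches(98765, list(range(25_000)))
print(len(batches))                   # 3 batches: 10K + 10K + 5K
print(len(batches[-1]["followers"]))  # 5000
```

A 10M-follower post becomes ~1,000 messages; with a few hundred workers consuming in parallel, the fan-out completes in seconds instead of minutes.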
9. Real-time Updates & Notifications
WebSocket for Feed Updates
// Client-side: WebSocket connection for real-time feed updates
const ws = new WebSocket('wss://feed.example.com/ws');
ws.onmessage = (event) => {
const update = JSON.parse(event.data);
switch (update.type) {
case 'new_post':
// Prepend to feed UI
feedStore.prepend(update.post);
break;
case 'engagement_update':
// Update like/comment counts
feedStore.updateCounts(update.post_id, update.counts);
break;
case 'new_notification':
notificationStore.add(update.notification);
break;
}
};
Notification Flow
graph LR
A["User likes post"] --> B["Interaction Service"]
B --> C["Update Counts
(Post Cache)"]
B --> D["Notification Queue"]
D --> E["Notification Worker"]
E --> F{"Push enabled?"}
F -- Yes --> G["FCM / APNs
Push Notification"]
F -- No --> H["In-app badge
update"]
style A fill:#e94560,stroke:#fff,color:#fff
style D fill:#f39c12,stroke:#fff,color:#fff
style G fill:#4CAF50,stroke:#fff,color:#fff
Notification flow: from interaction to push notification
10. Monitoring & Observability
| Metric | Alert Threshold | Monitoring Tool |
|---|---|---|
| Feed API P99 latency | > 500ms | Prometheus + Grafana |
| Fan-out queue lag | > 100K messages | Kafka Consumer Lag monitor |
| Redis cache hit rate | < 95% | Redis Exporter + Grafana |
| Post creation error rate | > 0.1% | Application metrics |
| CDN cache hit ratio | < 85% | CloudFront / Cloudflare Analytics |
| Database query P95 | > 100ms | Slow query log + APM |
Conclusion
Designing a News Feed system is a composite challenge spanning multiple techniques: from fan-out strategy determining write/read trade-offs, multi-layer caching to reduce latency, ranking algorithms to personalize the experience, to sharding and partitioning for horizontal scaling. The key takeaway isn't choosing Push or Pull, but knowing when to use each — and the Hybrid approach is the answer that all large-scale systems converge on.
Whether you're preparing for a system design interview or actually building a product, understanding the flow from user creates post → fan-out → cache → rank → serve will give you the confidence to design any feed/timeline system.
Disclaimer: The opinions expressed in this blog are solely my own and do not reflect the views or opinions of my employer or any affiliated organizations. The content provided is for informational and educational purposes only and should not be taken as professional advice. While I strive to provide accurate and up-to-date information, I make no warranties or guarantees about the completeness, reliability, or accuracy of the content. Readers are encouraged to verify the information and seek independent advice as needed. I disclaim any liability for decisions or actions taken based on the content of this blog.