Rate Limiting — Controlling API Traffic in Distributed Systems

Posted on: 4/23/2026 5:11:11 AM

  • ~90% of API attacks are brute-force or DDoS
  • 4 most common rate limiting algorithms
  • <1 ms overhead for an optimized rate limiter
  • Built-in rate limiting middleware in .NET 10

Why Rate Limiting?

Imagine you're running a public API serving millions of requests per day. One morning, traffic spikes 50x — not because your product went viral, but because a scraping bot or a competitor launched a DDoS attack. Without traffic control, your entire system collapses, impacting every legitimate user.

Rate Limiting is the technique of controlling the number of requests a client can send to a server within a given time period. It is one of the essential components in modern distributed system architecture.

What does Rate Limiting solve?

  • Resource protection: Prevents CPU, memory, and database connection exhaustion
  • Fairness: Ensures every client gets a fair share of API access
  • Security: Mitigates brute-force attacks, credential stuffing, and scraping
  • Cost control: Prevents cloud bill shock from abnormal traffic spikes
  • SLA compliance: Guarantees service quality for paying customers
flowchart LR
    A[Client Request] --> B{Rate Limiter}
    B -->|Allowed| C[API Server]
    B -->|Rejected| D[429 Too Many Requests]
    C --> E[Database / Service]
    D --> F[Retry-After Header]

    style A fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style B fill:#e94560,stroke:#fff,color:#fff
    style C fill:#4CAF50,stroke:#fff,color:#fff
    style D fill:#ff9800,stroke:#fff,color:#fff
    style E fill:#2c3e50,stroke:#fff,color:#fff
    style F fill:#f8f9fa,stroke:#ff9800,color:#2c3e50
  
Basic Rate Limiter request flow

4 Core Rate Limiting Algorithms

1. Fixed Window Counter

The simplest algorithm: divide time into fixed windows (e.g., every minute), count requests in each window. When the counter exceeds the threshold, reject requests until a new window begins.

gantt
    title Fixed Window Counter — 100 req/min
    dateFormat X
    axisFormat %s

    section Window 1 (00-60s)
    80 requests (OK)        :done, 0, 60
    section Window 2 (60-120s)
    120 requests (20 rejected) :crit, 60, 120
    section Window 3 (120-180s)
    45 requests (OK)        :done, 120, 180
  
Fixed Window divides time into fixed intervals
// Fixed Window Counter — simple illustration
public class FixedWindowLimiter
{
    private int _counter;
    private DateTime _windowStart;
    private readonly int _maxRequests;
    private readonly TimeSpan _windowSize;
    private readonly object _lock = new();

    public FixedWindowLimiter(int maxRequests, TimeSpan windowSize)
    {
        _maxRequests = maxRequests;
        _windowSize = windowSize;
        _windowStart = DateTime.UtcNow;
    }

    public bool TryAcquire()
    {
        lock (_lock)
        {
            var now = DateTime.UtcNow;
            if (now - _windowStart >= _windowSize)
            {
                _counter = 0;
                _windowStart = now;
            }
            if (_counter < _maxRequests)
            {
                _counter++;
                return true;
            }
            return false;
        }
    }
}

Boundary burst problem

The biggest weakness of Fixed Window: if a client sends 100 requests at the end of window 1 and 100 requests at the start of window 2, the system receives 200 requests in a very short time — double the limit. This is called the boundary burst problem.
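A quick simulation makes the burst concrete (a language-agnostic sketch in Python for brevity; the counter logic mirrors the C# class above): under a 100 req/min fixed window, 100 requests at second 59 and 100 more at second 61 are all admitted.

```python
# Fixed-window admission: one counter per window index, reset at each boundary.
def allowed_in_fixed_window(events, limit=100, window=60):
    """events: request timestamps in seconds. Returns how many are admitted."""
    counts = {}  # window index -> requests counted in that window
    admitted = 0
    for t in sorted(events):
        w = int(t // window)
        if counts.get(w, 0) < limit:
            counts[w] = counts.get(w, 0) + 1
            admitted += 1
    return admitted

# 100 requests at t=59s (end of window 1) + 100 at t=61s (start of window 2):
# all 200 pass within ~2 seconds, double the nominal limit.
print(allowed_in_fixed_window([59.0] * 100 + [61.0] * 100))  # 200
```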

2. Sliding Window Log

Fixes the boundary burst by storing each request's timestamp. When a new request arrives, remove all timestamps older than the window size, then count the remaining entries.

// Sliding Window Log
public class SlidingWindowLog
{
    private readonly Queue<DateTime> _timestamps = new();
    private readonly int _maxRequests;
    private readonly TimeSpan _windowSize;
    private readonly object _lock = new();

    public SlidingWindowLog(int maxRequests, TimeSpan windowSize)
    {
        _maxRequests = maxRequests;
        _windowSize = windowSize;
    }

    public bool TryAcquire()
    {
        lock (_lock)
        {
            var now = DateTime.UtcNow;
            var windowStart = now - _windowSize;

            while (_timestamps.Count > 0 && _timestamps.Peek() < windowStart)
                _timestamps.Dequeue();

            if (_timestamps.Count < _maxRequests)
            {
                _timestamps.Enqueue(now);
                return true;
            }
            return false;
        }
    }
}

Pros: perfectly accurate, no boundary burst. Cons: memory-intensive, since every timestamp inside the window must be stored — at 10,000 req/s with a one-minute window, that is 600,000 timestamps per client.

3. Sliding Window Counter

Combines the best of both worlds: it keeps only two counters (memory-efficient) while approximating a true sliding window (no boundary bursts). The current request count is estimated as a weighted sum: the previous window's counter, scaled by how much of it still overlaps the sliding window, plus the current window's counter.

// Sliding Window Counter — hybrid approach
public class SlidingWindowCounter
{
    private int _prevCount;
    private int _currCount;
    private DateTime _windowStart;
    private readonly int _maxRequests;
    private readonly TimeSpan _windowSize;
    private readonly object _lock = new();

    public SlidingWindowCounter(int maxRequests, TimeSpan windowSize)
    {
        _maxRequests = maxRequests;
        _windowSize = windowSize;
        _windowStart = DateTime.UtcNow;
    }

    public bool TryAcquire()
    {
        lock (_lock)
        {
            var now = DateTime.UtcNow;
            var elapsed = now - _windowStart;

            if (elapsed >= _windowSize * 2)
            {
                _prevCount = 0;
                _currCount = 0;
                _windowStart = now;
            }
            else if (elapsed >= _windowSize)
            {
                _prevCount = _currCount;
                _currCount = 0;
                _windowStart += _windowSize;
                elapsed = now - _windowStart;
            }

            // Weighted count: remaining portion of previous window + current window
            double weight = 1.0 - (elapsed.TotalMilliseconds / _windowSize.TotalMilliseconds);
            double estimatedCount = (_prevCount * weight) + _currCount;

            if (estimatedCount < _maxRequests)
            {
                _currCount++;
                return true;
            }
            return false;
        }
    }
}
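A worked example of the weighted estimate (plain arithmetic, shown in Python for brevity): limit 100/min, 80 requests in the previous window, 30 in the current one, and 15 s (25%) into the current window.

```python
# 75% of the previous window still overlaps the sliding 60s window,
# so 75% of its count is carried over into the estimate.
prev_count, curr_count = 80, 30
elapsed, window = 15.0, 60.0

weight = 1.0 - elapsed / window        # 0.75
estimate = prev_count * weight + curr_count
print(estimate)  # 90.0 — under the limit of 100, so the request is allowed
```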

4. Token Bucket

The most intuitive model: a "bucket" holds tokens that are refilled at a constant rate. Each request consumes one token. When the bucket is empty, requests are rejected. The bucket has a maximum capacity, allowing controlled bursts.

flowchart TB
    subgraph TB["Token Bucket"]
        direction TB
        R[Refill: 10 tokens/sec] --> B[Bucket
capacity: 100 tokens]
        B --> C{Request arrives}
        C -->|Token available| D[Consume token, allow]
        C -->|No tokens| E[Reject 429]
    end

    style R fill:#4CAF50,stroke:#fff,color:#fff
    style B fill:#e94560,stroke:#fff,color:#fff
    style D fill:#4CAF50,stroke:#fff,color:#fff
    style E fill:#ff9800,stroke:#fff,color:#fff
    style C fill:#f8f9fa,stroke:#e94560,color:#2c3e50
Token Bucket allows bursts while controlling average rate
// Token Bucket
public class TokenBucket
{
    private double _tokens;
    private DateTime _lastRefill;
    private readonly double _maxTokens;
    private readonly double _refillRate; // tokens per second
    private readonly object _lock = new();

    public TokenBucket(double maxTokens, double refillRate)
    {
        _maxTokens = maxTokens;
        _refillRate = refillRate;
        _tokens = maxTokens;
        _lastRefill = DateTime.UtcNow;
    }

    public bool TryAcquire(int tokens = 1)
    {
        lock (_lock)
        {
            Refill();
            if (_tokens >= tokens)
            {
                _tokens -= tokens;
                return true;
            }
            return false;
        }
    }

    private void Refill()
    {
        var now = DateTime.UtcNow;
        var elapsed = (now - _lastRefill).TotalSeconds;
        _tokens = Math.Min(_maxTokens, _tokens + elapsed * _refillRate);
        _lastRefill = now;
    }
}
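To see the burst-then-throttle behaviour, here is a small simulation (Python for brevity; same refill logic as the C# Refill method above, assuming a bucket of 100 tokens refilled at 10 tokens/s):

```python
# Token bucket: a full bucket absorbs a burst up to its capacity,
# after which admission is capped at the refill rate.
def simulate(requests, bucket_size=100, refill_rate=10.0):
    """requests: list of (timestamp_seconds, tokens_needed). Returns allow/deny list."""
    tokens, last = float(bucket_size), 0.0
    results = []
    for t, n in requests:
        tokens = min(bucket_size, tokens + (t - last) * refill_rate)  # refill
        last = t
        if tokens >= n:
            tokens -= n
            results.append(True)
        else:
            results.append(False)
    return results

# 101 requests at t=0: the first 100 drain the bucket, the 101st is rejected.
# One second later 10 tokens have been refilled, so the next request passes.
r = simulate([(0.0, 1)] * 101 + [(1.0, 1)])
print(r[:100].count(True), r[100], r[101])  # 100 False True
```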

Algorithm Comparison

| Algorithm | Memory | Accuracy | Burst Control | Complexity | Use Case |
| --- | --- | --- | --- | --- | --- |
| Fixed Window | O(1) | Moderate | Poor (boundary burst) | Low | Internal APIs, MVP |
| Sliding Window Log | O(n) | Highest | Good | Medium | Billing, audit-critical |
| Sliding Window Counter | O(1) | High | Good | Medium | Public APIs (recommended) |
| Token Bucket | O(1) | High | Best (configurable) | Low | API Gateway, microservices |

Implementation in ASP.NET Core 10

Since .NET 7, the framework has shipped rate-limiting primitives in the System.Threading.RateLimiting namespace, with ASP.NET Core middleware built on top of them. By .NET 10, this middleware has matured with multi-tenant partitioning, chained limiters, and deeper integration with minimal APIs.

Basic Configuration with Fixed Window

// Program.cs — ASP.NET Core 10
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.AddFixedWindowLimiter("fixed", opt =>
    {
        opt.PermitLimit = 100;
        opt.Window = TimeSpan.FromMinutes(1);
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 10;
    });
});

var app = builder.Build();
app.UseRateLimiter();

app.MapGet("/api/products", () => Results.Ok(new { products = new[] { "A", "B" } }))
   .RequireRateLimiting("fixed");

app.Run();

Token Bucket for Public APIs

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = 429;

    options.AddTokenBucketLimiter("api-public", opt =>
    {
        opt.TokenLimit = 100;           // Burst capacity
        opt.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
        opt.TokensPerPeriod = 20;       // 20 tokens every 10s = 2 req/s sustained
        opt.AutoReplenishment = true;
        opt.QueueLimit = 5;
    });
});

Sliding Window with Per-User Partition

builder.Services.AddRateLimiter(options =>
{
    options.AddPolicy("per-user", httpContext =>
    {
        var userId = httpContext.User.FindFirst("sub")?.Value
                     ?? httpContext.Connection.RemoteIpAddress?.ToString()
                     ?? "anonymous";

        return RateLimitPartition.GetSlidingWindowLimiter(userId, _ => new SlidingWindowRateLimiterOptions
        {
            PermitLimit = 60,
            Window = TimeSpan.FromMinutes(1),
            SegmentsPerWindow = 6,      // 6 segments × 10 seconds each
            QueueLimit = 0
        });
    });

    options.OnRejected = async (context, ct) =>
    {
        context.HttpContext.Response.Headers["Retry-After"] = "60";
        await context.HttpContext.Response.WriteAsJsonAsync(new
        {
            error = "rate_limit_exceeded",
            message = "Too many requests. Please try again later.",
            retryAfter = 60
        }, ct);
    };
});

Multi-tenant Rate Limiting (.NET 10)

.NET 10 improves RateLimitPartition to support combining multiple partition keys simultaneously — for example, limiting by user + endpoint + plan tier. This is particularly useful for SaaS APIs where each pricing plan has different rate limits.

Chained Rate Limiters — multi-layer protection

builder.Services.AddRateLimiter(options =>
{
    // Layer 1: Global — protects the entire server
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
    {
        return RateLimitPartition.GetFixedWindowLimiter("global", _ =>
            new FixedWindowRateLimiterOptions
            {
                PermitLimit = 10_000,
                Window = TimeSpan.FromMinutes(1)
            });
    });

    // Layer 2: Per-IP — prevents abuse from a single source
    options.AddPolicy("per-ip", context =>
    {
        var ip = context.Connection.RemoteIpAddress?.ToString() ?? "unknown";
        return RateLimitPartition.GetTokenBucketLimiter(ip, _ =>
            new TokenBucketRateLimiterOptions
            {
                TokenLimit = 50,
                ReplenishmentPeriod = TimeSpan.FromSeconds(10),
                TokensPerPeriod = 10,
                AutoReplenishment = true
            });
    });

    // Layer 3: Per-User-Per-Endpoint — fine-grained control
    options.AddPolicy("user-endpoint", context =>
    {
        var user = context.User.FindFirst("sub")?.Value ?? "anon";
        var endpoint = context.GetEndpoint()?.DisplayName ?? "default";
        var key = $"{user}:{endpoint}";

        return RateLimitPartition.GetSlidingWindowLimiter(key, _ =>
            new SlidingWindowRateLimiterOptions
            {
                PermitLimit = 30,
                Window = TimeSpan.FromMinutes(1),
                SegmentsPerWindow = 6
            });
    });
});

Distributed Rate Limiting

The algorithms above work well on a single instance. But in microservices architecture with multiple replicas behind a load balancer, each instance maintains its own counter — a client can bypass the limit by sending requests to different instances.

flowchart TB
    C[Client
Limit: 100 req/min] --> LB[Load Balancer]
    LB --> S1[Server 1
Counter: 40]
    LB --> S2[Server 2
Counter: 35]
    LB --> S3[Server 3
Counter: 38]
    S1 -.->|Actual total: 113 req| NOTE[Over limit!
But no instance knows]
    S2 -.-> NOTE
    S3 -.-> NOTE

    style C fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style LB fill:#2c3e50,stroke:#fff,color:#fff
    style S1 fill:#4CAF50,stroke:#fff,color:#fff
    style S2 fill:#4CAF50,stroke:#fff,color:#fff
    style S3 fill:#4CAF50,stroke:#fff,color:#fff
    style NOTE fill:#ff9800,stroke:#fff,color:#fff
Rate limiting problem with multiple server instances

Solution: Centralized Counter with Redis

Use Redis as a shared store for counters — all instances read/write to the same counter. Redis Lua scripting ensures atomicity for increment + check operations.

// Distributed Sliding Window Counter with Redis
public class RedisRateLimiter
{
    private readonly IConnectionMultiplexer _redis;
    private readonly string _luaScript = @"
        local key = KEYS[1]
        local now = tonumber(ARGV[1])
        local window = tonumber(ARGV[2])
        local limit = tonumber(ARGV[3])

        -- Remove entries older than window
        redis.call('ZREMRANGEBYSCORE', key, 0, now - window)

        -- Count entries within window
        local count = redis.call('ZCARD', key)

        if count < limit then
            -- Add new request with score = timestamp
            redis.call('ZADD', key, now, now .. '-' .. math.random(1000000))
            redis.call('EXPIRE', key, math.ceil(window / 1000))
            return 1
        end
        return 0
    ";

    public RedisRateLimiter(IConnectionMultiplexer redis)
    {
        _redis = redis;
    }

    public async Task<bool> TryAcquireAsync(string clientId, int limit, TimeSpan window)
    {
        var db = _redis.GetDatabase();
        var now = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds();

        var result = await db.ScriptEvaluateAsync(
            _luaScript,
            new RedisKey[] { $"rate:{clientId}" },
            new RedisValue[] { now, (long)window.TotalMilliseconds, limit });

        return (int)result == 1;
    }
}

Why use Lua scripts?

Redis executes Lua scripts atomically — there's no race condition between ZREMRANGEBYSCORE, ZCARD, and ZADD. If separated into individual Redis commands, two concurrent requests might both read count = 99 (under the limit of 100) and both get allowed, exceeding the limit.
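The lost-update race is easy to see in miniature (Python, illustrative): two clients interleave a read-check-write sequence against the same counter.

```python
# Non-atomic check-then-increment: both clients observe count=99 (< 100),
# both decide "allowed", and the shared counter ends up over the limit.
count, limit = 99, 100

read_a = count                  # client A reads 99
read_b = count                  # client B reads 99, before A writes back
if read_a < limit:
    count += 1                  # A admitted -> 100
if read_b < limit:
    count += 1                  # B admitted too -> 101, limit breached
print(count)  # 101
```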

Distributed Token Bucket with Redis

// Distributed Token Bucket — Redis Lua
public class RedisTokenBucket
{
    private readonly IConnectionMultiplexer _redis;
    private readonly string _luaScript = @"
        local key = KEYS[1]
        local max_tokens = tonumber(ARGV[1])
        local refill_rate = tonumber(ARGV[2])
        local now = tonumber(ARGV[3])
        local requested = tonumber(ARGV[4])

        local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
        local tokens = tonumber(bucket[1]) or max_tokens
        local last_refill = tonumber(bucket[2]) or now

        -- Calculate refilled tokens
        local elapsed = (now - last_refill) / 1000.0
        tokens = math.min(max_tokens, tokens + elapsed * refill_rate)

        if tokens >= requested then
            tokens = tokens - requested
            redis.call('HMSET', key, 'tokens', tokens, 'last_refill', now)
            redis.call('EXPIRE', key, math.ceil(max_tokens / refill_rate) + 10)
            return 1
        end

        redis.call('HMSET', key, 'tokens', tokens, 'last_refill', now)
        redis.call('EXPIRE', key, math.ceil(max_tokens / refill_rate) + 10)
        return 0
    ";

    public async Task<bool> TryAcquireAsync(
        string clientId, int maxTokens, double refillRate, int tokens = 1)
    {
        var db = _redis.GetDatabase();
        var now = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds();

        var result = await db.ScriptEvaluateAsync(
            _luaScript,
            new RedisKey[] { $"bucket:{clientId}" },
            new RedisValue[] { maxTokens, refillRate, now, tokens });

        return (int)result == 1;
    }
}

Integrating Redis Rate Limiter into ASP.NET Core

// Register services
builder.Services.AddSingleton<IConnectionMultiplexer>(
    ConnectionMultiplexer.Connect("localhost:6379"));
builder.Services.AddSingleton<RedisRateLimiter>();

// Custom Rate Limiting Policy
builder.Services.AddRateLimiter(options =>
{
    options.AddPolicy("redis-distributed", context =>
    {
        var ip = context.Connection.RemoteIpAddress?.ToString() ?? "unknown";

        return RateLimitPartition.Get(ip, key =>
        {
            var redis = context.RequestServices.GetRequiredService<RedisRateLimiter>();
            // RedisPartitionedLimiter is a user-defined RateLimiter subclass
            // (not shown) that delegates acquisition to RedisRateLimiter
            return new RedisPartitionedLimiter(redis, key, limit: 100,
                window: TimeSpan.FromMinutes(1));
        });
    });
});

Rate Limiting at the API Gateway Layer

In practice, rate limiting is typically implemented at multiple layers (defense in depth). The outermost layer — API Gateway or Edge — blocks traffic as early as possible, conserving resources for the application layer.

flowchart LR
    C[Client] --> CF[Cloudflare
Edge Rate Limit]
    CF --> GW[API Gateway
YARP / Kong]
    GW --> SVC[Service
App-level Limit]
    SVC --> DB[(Database)]
    CF -.->|"Layer 1: IP-based
1000 req/min"| N1[ ]
    GW -.->|"Layer 2: API Key
200 req/min"| N2[ ]
    SVC -.->|"Layer 3: User+Endpoint
30 req/min"| N3[ ]

    style C fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style CF fill:#e94560,stroke:#fff,color:#fff
    style GW fill:#2c3e50,stroke:#fff,color:#fff
    style SVC fill:#4CAF50,stroke:#fff,color:#fff
    style DB fill:#16213e,stroke:#fff,color:#fff
    style N1 fill:transparent,stroke:transparent
    style N2 fill:transparent,stroke:transparent
    style N3 fill:transparent,stroke:transparent
Defense in depth: multi-layer rate limiting

Cloudflare Rate Limiting (Free Tier)

Cloudflare provides Rate Limiting rules even on the Free plan — allowing you to block abnormal traffic before it reaches your origin server. With the Enterprise plan, you get Advanced Rate Limiting based on request characteristics (headers, cookies, body fields).

// Cloudflare Rate Limiting Rule — example configuration
{
  "description": "API rate limit — 100 req/min per IP",
  "expression": "(http.request.uri.path matches \"^/api/\")",
  "action": "block",
  "ratelimit": {
    "characteristics": ["ip.src"],
    "period": 60,
    "requests_per_period": 100,
    "mitigation_timeout": 120,
    "counting_expression": ""
  }
}

YARP Reverse Proxy with Rate Limiting

// YARP + Rate Limiting — ASP.NET Core 10
builder.Services.AddReverseProxy()
    .LoadFromConfig(builder.Configuration.GetSection("ReverseProxy"));

builder.Services.AddRateLimiter(options =>
{
    options.AddPolicy("gateway-limit", context =>
    {
        // Rate limit by API Key from header
        var apiKey = context.Request.Headers["X-API-Key"].FirstOrDefault() ?? "no-key";

        return RateLimitPartition.GetTokenBucketLimiter(apiKey, _ =>
            new TokenBucketRateLimiterOptions
            {
                TokenLimit = 200,
                ReplenishmentPeriod = TimeSpan.FromSeconds(10),
                TokensPerPeriod = 40,
                AutoReplenishment = true
            });
    });
});

app.UseRateLimiter();
app.MapReverseProxy();

Best Practices & Anti-Patterns

Standard Response Headers

Always return rate limit headers so clients know their quota status. The 429 status code comes from RFC 6585 and the Retry-After header from RFC 7231; the X-RateLimit-* headers themselves are a widely adopted de facto convention (an IETF draft standardizes equivalent RateLimit header fields):

// Middleware to add Rate Limit Headers
app.Use(async (context, next) =>
{
    await next();

    // NOTE: IRateLimiterStatisticsFeature is illustrative — ASP.NET Core does
    // not ship a feature with this name; expose your limiter's GetStatistics()
    // behind a similar interface of your own.
    var limiterFeature = context.Features.Get<IRateLimiterStatisticsFeature>();
    if (limiterFeature is not null)
    {
        var stats = limiterFeature.GetStatistics();
        context.Response.Headers["X-RateLimit-Limit"] = "100";
        context.Response.Headers["X-RateLimit-Remaining"] =
            stats?.CurrentAvailablePermits.ToString() ?? "0";
        context.Response.Headers["X-RateLimit-Reset"] =
            DateTimeOffset.UtcNow.AddMinutes(1).ToUnixTimeSeconds().ToString();
    }
});

Anti-patterns to Avoid

| Anti-Pattern | Problem | Solution |
| --- | --- | --- |
| Rate limit only by IP | Shared IPs (NAT, corporate proxy) affect multiple users | Combine IP + API Key + User ID |
| No Retry-After header | Clients retry immediately → thundering herd | Always return Retry-After + exponential backoff |
| Hard limit with no bypass | Health checks, admin, webhooks get blocked | Whitelist trusted IPs/service accounts |
| Limit too low on deploy | False positives → user frustration | Start permissive, monitor, then tighten |
| Not logging rejections | No visibility into who's getting rejected and why | Log clientId, endpoint, reject reason → dashboard |
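The Retry-After plus exponential backoff advice can be sketched from the client side (Python, illustrative; the function and parameter names are assumptions, not a real SDK): honour the server's hint when present, otherwise double the delay with jitter.

```python
import random

def backoff_delay(attempt, retry_after=None, base=1.0, max_delay=60.0):
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        return float(retry_after)             # server's hint wins
    delay = min(max_delay, base * (2 ** attempt))
    return delay * random.uniform(0.5, 1.0)   # jitter avoids thundering herd

print(backoff_delay(3, retry_after=15))  # 15.0 — Retry-After takes precedence
```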

Graceful degradation instead of hard rejection

Instead of immediately returning 429, consider softer strategies: return cached responses, reduce quality (smaller images, less data), or queue requests for later processing. This maintains a better user experience during peak traffic.
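One way to sketch the cached-response fallback (Python, illustrative; the handler shape and the X-Served-From header are assumptions): on rejection, serve the last known payload with a marker header instead of a bare 429.

```python
# Serve stale-but-usable data when the limiter rejects; return 429 only
# when nothing cached is available.
cache = {"/api/products": {"products": ["A", "B"]}}

def handle(path, limiter_allows, fetch_fresh):
    if limiter_allows:
        body = cache[path] = fetch_fresh(path)        # fresh response, refresh cache
        return 200, {}, body
    if path in cache:                                 # degraded: stale data
        return 200, {"X-Served-From": "cache"}, cache[path]
    return 429, {"Retry-After": "60"}, {"error": "rate_limit_exceeded"}

status, headers, _ = handle("/api/products", False, lambda p: None)
print(status, headers)  # 200 {'X-Served-From': 'cache'}
```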

Monitoring & Alerting

// Integration with OpenTelemetry Metrics
builder.Services.AddRateLimiter(options =>
{
    options.OnRejected = async (context, ct) =>
    {
        // Emit metric
        var meter = context.HttpContext.RequestServices
            .GetRequiredService<IMeterFactory>()
            .Create("RateLimiting");

        var counter = meter.CreateCounter<long>("rate_limit.rejections");
        counter.Add(1, new KeyValuePair<string, object?>("endpoint",
            context.HttpContext.GetEndpoint()?.DisplayName));

        context.HttpContext.Response.StatusCode = 429;
        context.HttpContext.Response.Headers["Retry-After"] = "60";

        await context.HttpContext.Response.WriteAsJsonAsync(new
        {
            error = "rate_limit_exceeded",
            retryAfter = 60
        }, ct);
    };
});

Conclusion

Rate Limiting is not just about "blocking bots" — it's a critical architectural component that determines your system's load capacity, security posture, and user experience. Choosing the right algorithm and deployment layer depends on your specific requirements:

  • MVP / Internal API: Fixed Window on ASP.NET Core is sufficient
  • Public API: Sliding Window Counter or Token Bucket, partitioned per user
  • Microservices: Distributed Token Bucket with Redis + Edge rate limiting (Cloudflare)
  • SaaS multi-tenant: Chained limiters — Global → Per-Tenant → Per-User → Per-Endpoint

The key takeaway: start simple with .NET 10's built-in middleware, monitor carefully, then scale to distributed solutions when your architecture demands it. Don't over-engineer from the start, but never forget that every public API without rate limiting is a ticking time bomb.