# DynamoDB Single-Table Design: The Art of NoSQL Design for Large-Scale Systems

Posted on: 4/18/2026 11:10:58 AM


- **< 10 ms**: consistent latency at any scale
- **10 GB**: maximum size of a single partition
- **25 GB**: storage included in the always-free tier
- **∞**: horizontal scaling with no fixed upper limit

## 1. DynamoDB and the large-scale NoSQL problem

Amazon DynamoDB is AWS's fully managed NoSQL service, known for near-unlimited scalability and consistent sub-10-millisecond latency regardless of data volume or traffic. Its greatest strength is also its greatest challenge: you have to **design the data model around access patterns**, the exact opposite of traditional relational thinking.

With a relational database (SQL Server, PostgreSQL, ...), you normalize the data first and rely on JOINs to query it flexibly later. With DynamoDB, you need to know exactly how the application will read and write data **before you design the schema**. This is why many teams struggle: they carry relational thinking over to NoSQL.

#### 💡 The golden rule

**"Items that are accessed together should be stored together."** This is the principle behind every design decision in DynamoDB.

### DynamoDB's underlying architecture

To understand Single-Table Design, you first need to understand two core mechanisms:

```
graph TD
    A["Client Request"] --> B["DynamoDB Router"]
    B --> C["Partition 1<br/>PK hash → slot"]
    B --> D["Partition 2<br/>PK hash → slot"]
    B --> E["Partition N<br/>PK hash → slot"]
    C --> F["B-Tree<br/>Sort Key ordering"]
    D --> G["B-Tree<br/>Sort Key ordering"]
    E --> H["B-Tree<br/>Sort Key ordering"]

    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#2c3e50,stroke:#fff,color:#fff
    style C fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style D fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style E fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style F fill:#f8f9fa,stroke:#e0e0e0,color:#2c3e50
    style G fill:#f8f9fa,stroke:#e0e0e0,color:#2c3e50
    style H fill:#f8f9fa,stroke:#e0e0e0,color:#2c3e50
```

DynamoDB distributes data by Partition Key and orders it within each partition by Sort Key

**Partitioning:** Data is sharded into partitions of at most 10 GB each. Each item is routed to a partition based on a hash of its Partition Key (PK). This is what lets DynamoDB scale horizontally to petabytes.

**B-Tree inside each partition:** Items that share a partition key are kept sorted by Sort Key (SK) in a B-Tree. Finding the start of a sort-key range therefore costs O(log n), after which the matching items are read sequentially. That is why range queries are so fast.
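The two mechanisms can be imitated in a few lines. This is only a sketch under stated assumptions: DynamoDB's real hash function and partition map are internal, so the FNV-1a hash and the names `partitionFor`, `lowerBound`, and `rangeByPrefix` are illustrative, and a sorted array stands in for the B-Tree.

```typescript
// Illustrative hash (FNV-1a, 32-bit). DynamoDB's actual hash is not public.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Route a partition key to one of `partitionCount` partitions.
function partitionFor(pk: string, partitionCount: number): number {
  return fnv1a(pk) % partitionCount;
}

// Index of the first key >= target in a sorted array (binary search, O(log n)).
function lowerBound(sortedKeys: string[], target: string): number {
  let lo = 0;
  let hi = sortedKeys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sortedKeys[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// All sort keys starting with `prefix`: seek once, then read sequentially.
// Plain string comparison matches DynamoDB's byte ordering for ASCII keys.
function rangeByPrefix(sortedKeys: string[], prefix: string): string[] {
  const result: string[] = [];
  for (let i = lowerBound(sortedKeys, prefix); i < sortedKeys.length; i++) {
    if (!sortedKeys[i].startsWith(prefix)) break;
    result.push(sortedKeys[i]);
  }
  return result;
}
```

The same partition key always maps to the same partition, and a prefix lookup touches only the matching slice of the sorted keys rather than the whole partition.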

## 2. What is Single-Table Design?

Single-Table Design (STD) is the technique of storing **several different entity types in one DynamoDB table**. Instead of creating separate tables for Users, Orders, Products, and so on, you design the Partition Key and Sort Key so that related data shares the same partition key value and can be fetched with **a single Query**.

### Example: an e-commerce system

Instead of three separate tables (Customers, Orders, OrderItems), Single-Table Design puts everything in one table:

| PK | SK | Attributes |
| --- | --- | --- |
| CUSTOMER#C001 | PROFILE | Name: "Nguyễn Văn A", Email: "a@mail.com" |
| CUSTOMER#C001 | ORDER#2026-04-18#O100 | Total: 2500000, Status: "processing" |
| CUSTOMER#C001 | ORDER#2026-04-18#O100#ITEM#1 | Product: "Laptop", Qty: 1, Price: 2500000 |
| CUSTOMER#C001 | ORDER#2026-04-15#O099 | Total: 350000, Status: "delivered" |
| CUSTOMER#C001 | ORDER#2026-04-15#O099#ITEM#1 | Product: "Sách", Qty: 2, Price: 175000 |
| CUSTOMER#C002 | PROFILE | Name: "Trần Thị B", Email: "b@mail.com" |

#### ✅ Why this design is powerful

With **a single Query** on `PK = "CUSTOMER#C001"`, you get the profile, all orders, and all order items for that customer (paginated if the result exceeds 1 MB). No JOIN and no extra round-trips. Adding `begins_with(SK, "ORDER#2026-04")` narrows the result to the orders, and their items, from April 2026.
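Because one Query returns mixed entity types, the application has to sort them out by the shape of the sort key. A minimal sketch, assuming the key layout from the table above; `Item`, `OrderGroup`, and `assembleCustomer` are hypothetical names, and the items are plain objects rather than SDK attribute maps.

```typescript
type Item = { PK: string; SK: string; [attr: string]: unknown };
type OrderGroup = { header: Item | undefined; lines: Item[] };

interface CustomerView {
  profile: Item | undefined;
  orders: Map<string, OrderGroup>; // keyed by order id, e.g. "O100"
}

// Split one item collection (everything under a single PK) by sort-key shape.
function assembleCustomer(items: Item[]): CustomerView {
  const view: CustomerView = { profile: undefined, orders: new Map() };
  for (const item of items) {
    if (item.SK === "PROFILE") {
      view.profile = item;
      continue;
    }
    // ORDER#<date>#<orderId> is an order header; a "#ITEM#<n>" suffix marks a line item.
    const match = /^ORDER#\d{4}-\d{2}-\d{2}#([^#]+)(#ITEM#.+)?$/.exec(item.SK);
    if (match === null) continue; // unknown entity type: ignore it
    const orderId = match[1];
    const order: OrderGroup = view.orders.get(orderId) ?? { header: undefined, lines: [] };
    if (match[2] === undefined) order.header = item;
    else order.lines.push(item);
    view.orders.set(orderId, order);
  }
  return view;
}
```

The single round-trip replaces what would be a three-way JOIN in a relational schema; the price is that this grouping logic lives in application code.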

## 3. Access Pattern-First: designing backwards

This is the most important step, and the one where most developers go wrong. You must list **every access pattern** before you sketch any schema.

```
graph LR
    A["Step 1<br/>List access patterns"] --> B["Step 2<br/>Group entities<br/>by query relationship"]
    B --> C["Step 3<br/>Design PK/SK"]
    C --> D["Step 4<br/>Add GSIs<br/>for secondary patterns"]
    D --> E["Step 5<br/>Validate<br/>with sample data"]

    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#2c3e50,stroke:#fff,color:#fff
    style C fill:#e94560,stroke:#fff,color:#fff
    style D fill:#2c3e50,stroke:#fff,color:#fff
    style E fill:#e94560,stroke:#fff,color:#fff
```

The DynamoDB Single-Table design process: always start from the access patterns

For the e-commerce system, typical access patterns look like this:

| # | Access Pattern | Operation | Key Design |
| --- | --- | --- | --- |
| 1 | Get a customer's profile | GetItem | PK=CUSTOMER#id, SK=PROFILE |
| 2 | Get all orders of a customer | Query | PK=CUSTOMER#id, SK begins_with ORDER# |
| 3 | Get orders within a date range | Query | PK=CUSTOMER#id, SK BETWEEN ORDER#date1 AND ORDER#date2~ |
| 4 | Get one order with its items | Query | PK=CUSTOMER#id, SK begins_with ORDER#date#orderId |
| 5 | Find orders by status | Query GSI | GSI1PK=STATUS#processing, GSI1SK=date |
| 6 | Get an order by orderId | Query GSI | GSI2PK=ORDER#orderId |

In pattern 3 the upper bound ends with `~` so that keys beginning with `ORDER#date2` (which continue with `#orderId`) still fall inside the range.
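The date-range pattern depends on plain string ordering, which is easy to get wrong at the upper bound. A small sketch, assuming ISO `yyyy-MM-dd` dates and ASCII keys; `orderDateRange` and `isBetween` are illustrative helpers, not SDK functions.

```typescript
interface SortKeyRange {
  start: string;
  end: string;
}

// Sort-key bounds for "orders between two dates", both dates inclusive.
function orderDateRange(fromDate: string, toDate: string): SortKeyRange {
  return {
    start: `ORDER#${fromDate}`,
    // "~" (0x7E) sorts after "#" (0x23), digits, and letters, so every key that
    // begins with ORDER#<toDate> is still <= the upper bound.
    end: `ORDER#${toDate}~`,
  };
}

// Same comparison that a BETWEEN key condition applies to the sort key.
function isBetween(sortKey: string, range: SortKeyRange): boolean {
  return sortKey >= range.start && sortKey <= range.end;
}
```

Without the trailing `~`, `ORDER#2026-04-18#O100` compares greater than the bound `ORDER#2026-04-18`, and the orders of the last day would silently drop out of the result.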

## 4. Advanced design patterns

### 4.1 GSI Overloading: one GSI, many access patterns

DynamoDB allows up to 20 GSIs per table by default. Instead of creating a dedicated GSI for each pattern, **GSI Overloading** reuses one GSI across different entity types by giving its key attributes generic names (GSI1PK, GSI1SK).

```
// Entity: Customer
{
  PK: "CUSTOMER#C001",
  SK: "PROFILE",
  GSI1PK: "EMAIL#a@mail.com",     // Find a customer by email
  GSI1SK: "CUSTOMER#C001",
  Name: "Nguyễn Văn A"
}

// Entity: Order
{
  PK: "CUSTOMER#C001",
  SK: "ORDER#2026-04-18#O100",
  GSI1PK: "STATUS#processing",    // Find orders by status
  GSI1SK: "2026-04-18",
  Total: 2500000
}

// Entity: Product
{
  PK: "PRODUCT#P001",
  SK: "METADATA",
  GSI1PK: "CATEGORY#electronics", // Find products by category
  GSI1SK: "PRODUCT#P001",
  Name: "Laptop Pro 2026"
}
```

The same GSI1 serves three completely different access patterns: finding a customer by email, filtering orders by status, and browsing products by category.
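To see why one index can answer unrelated questions, it helps to emulate the lookup in memory. This is a sketch, not the SDK: `TableItem` and `queryGsi1` are hypothetical names, and a filter plus sort stands in for the real index.

```typescript
type TableItem = {
  PK: string;
  SK: string;
  GSI1PK?: string;
  GSI1SK?: string;
  [attr: string]: unknown;
};

// Emulate a Query on GSI1: exact match on GSI1PK, results ordered by GSI1SK.
function queryGsi1(table: TableItem[], gsi1pk: string): TableItem[] {
  return table
    .filter((item) => item.GSI1PK === gsi1pk)
    .sort((a, b) => {
      const left = a.GSI1SK ?? "";
      const right = b.GSI1SK ?? "";
      return left < right ? -1 : left > right ? 1 : 0;
    });
}

const sampleTable: TableItem[] = [
  { PK: "CUSTOMER#C001", SK: "PROFILE", GSI1PK: "EMAIL#a@mail.com", GSI1SK: "CUSTOMER#C001" },
  { PK: "CUSTOMER#C001", SK: "ORDER#2026-04-18#O100", GSI1PK: "STATUS#processing", GSI1SK: "2026-04-18" },
  { PK: "CUSTOMER#C002", SK: "ORDER#2026-04-16#O101", GSI1PK: "STATUS#processing", GSI1SK: "2026-04-16" },
  { PK: "PRODUCT#P001", SK: "METADATA", GSI1PK: "CATEGORY#electronics", GSI1SK: "PRODUCT#P001" },
];

const byEmail = queryGsi1(sampleTable, "EMAIL#a@mail.com");
const processing = queryGsi1(sampleTable, "STATUS#processing");
```

The prefixes (`EMAIL#`, `STATUS#`, `CATEGORY#`) keep the entity types in separate key spaces inside the shared index, so a lookup for one never returns items of another.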

### 4.2 Hierarchical Data: hierarchical sort keys

Hierarchical data (Organization → Department → Team → Member) is modeled with a delimited Sort Key:

```
PK: "ORG#FPT"
SK: "DEPT#engineering"                          → Department
SK: "DEPT#engineering#TEAM#platform"            → Team
SK: "DEPT#engineering#TEAM#platform#MEM#tu001"  → Member
SK: "DEPT#engineering#TEAM#backend"             → Team
SK: "DEPT#engineering#TEAM#backend#MEM#an002"   → Member

// Query everything in the Engineering department:
// PK = "ORG#FPT" AND begins_with(SK, "DEPT#engineering")

// Query only the Platform team:
// PK = "ORG#FPT" AND begins_with(SK, "DEPT#engineering#TEAM#platform")
```

#### 💡 Tip for designing hierarchical sort keys

Order the levels from general → specific in the Sort Key. This lets `begins_with()` filter at any level of the hierarchy. Use `#` as the delimiter because it rarely appears in real data.
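A small key builder makes the delimiter rule enforceable instead of a convention. The sketch below is an assumption-laden illustration: `Level`, `hierarchyKey`, and `descendantsPrefix` are hypothetical helpers, and the sibling department `engineering-ops` is made up to show a prefix pitfall.

```typescript
type Level = readonly [string, string]; // [kind, id], e.g. ["DEPT", "engineering"]

// Join levels from general to specific: DEPT#engineering#TEAM#platform
function hierarchyKey(levels: Level[]): string {
  for (const [kind, id] of levels) {
    if (kind.includes("#") || id.includes("#")) {
      throw new Error(`"#" is reserved as the delimiter: ${kind}/${id}`);
    }
  }
  return levels.map(([kind, id]) => `${kind}#${id}`).join("#");
}

// Prefix for everything below a node. The trailing "#" stops
// "DEPT#engineering" from also matching "DEPT#engineering-ops".
function descendantsPrefix(levels: Level[]): string {
  return hierarchyKey(levels) + "#";
}
```

Note the trade-off: `begins_with(SK, "DEPT#engineering")` returns the department item itself plus its subtree, but also any sibling whose id merely starts with the same text, while the `#`-terminated prefix returns only descendants and needs a separate GetItem for the node itself.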

### 4.3 Adjacency List: many-to-many relationships

Many-to-many relationships (Students ↔ Courses, Users ↔ Groups) are among the hardest things to model in NoSQL. The Adjacency List pattern handles them by storing the relationship in both directions:

```
graph LR
    subgraph "DynamoDB Table"
        A["PK: STUDENT#S01<br/>SK: COURSE#C01<br/>Grade: A"]
        B["PK: STUDENT#S01<br/>SK: COURSE#C02<br/>Grade: B+"]
        C["PK: COURSE#C01<br/>SK: STUDENT#S01<br/>Enrolled: 2026-01"]
        D["PK: COURSE#C01<br/>SK: STUDENT#S02<br/>Enrolled: 2026-02"]
    end

    style A fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style B fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style C fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style D fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
```

Adjacency List: store the relationship in both directions so it can be queried efficiently from either entity

| Access Pattern | Query |
| --- | --- |
| Get all courses of Student S01 | PK = "STUDENT#S01", SK begins_with "COURSE#" |
| Get all students in Course C01 | PK = "COURSE#C01", SK begins_with "STUDENT#" |
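The two queries in the table work only if every enrollment is written twice. A minimal in-memory sketch; `EdgeItem`, `enrollmentItems`, and `queryByPrefix` are hypothetical names, and in a real table the two writes would normally go into one `TransactWriteItems` call so the directions cannot drift apart.

```typescript
type EdgeItem = { PK: string; SK: string; [attr: string]: unknown };

// One enrollment becomes two items, one per direction of the relationship.
function enrollmentItems(studentId: string, courseId: string, enrolled: string): EdgeItem[] {
  return [
    { PK: `STUDENT#${studentId}`, SK: `COURSE#${courseId}`, Enrolled: enrolled },
    { PK: `COURSE#${courseId}`, SK: `STUDENT#${studentId}`, Enrolled: enrolled },
  ];
}

// Emulates Query: PK = pk AND begins_with(SK, skPrefix)
function queryByPrefix(table: EdgeItem[], pk: string, skPrefix: string): EdgeItem[] {
  return table.filter((item) => item.PK === pk && item.SK.startsWith(skPrefix));
}

const enrollments: EdgeItem[] = [
  ...enrollmentItems("S01", "C01", "2026-01"),
  ...enrollmentItems("S01", "C02", "2026-01"),
  ...enrollmentItems("S02", "C01", "2026-02"),
];

const coursesOfS01 = queryByPrefix(enrollments, "STUDENT#S01", "COURSE#");
const studentsInC01 = queryByPrefix(enrollments, "COURSE#C01", "STUDENT#");
```

The cost of the pattern is double the writes and storage per relationship, which is usually acceptable when reads in both directions are frequent.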

### 4.4 Sparse Index: efficient filtering with a GSI

DynamoDB only indexes items that **have the attributes used as the GSI's key**. This produces a "sparse index": a GSI that holds only a small subset of the base table, which is extremely useful for filtering.

```
// Only unpaid orders carry the "UnpaidGSIPK" attribute
{
  PK: "CUSTOMER#C001",
  SK: "ORDER#2026-04-18#O100",
  UnpaidGSIPK: "UNPAID",        // ← Set only while the order is unpaid
  UnpaidGSISK: "2026-04-18",
  Total: 2500000,
  Status: "pending_payment"
}

// A paid order does NOT have the UnpaidGSIPK attribute
{
  PK: "CUSTOMER#C001",
  SK: "ORDER#2026-04-15#O099",
  // No UnpaidGSIPK → the item does not appear in the GSI
  Total: 350000,
  Status: "delivered"
}

// Query the GSI: get ALL unpaid orders in the system
// GSI: UnpaidGSIPK = "UNPAID"
// → Reads only the few unpaid items instead of millions of orders
```
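The index stays sparse only if the key attributes are removed when an order is paid. A sketch of that lifecycle in memory; `OrderRecord`, `markPaid`, and `unpaidIndex` are illustrative names. Against a real table the removal would be an `UpdateItem` with a `REMOVE UnpaidGSIPK, UnpaidGSISK` clause in the update expression.

```typescript
interface OrderRecord {
  PK: string;
  SK: string;
  Total: number;
  Status: string;
  UnpaidGSIPK?: string;
  UnpaidGSISK?: string;
}

// Paying an order drops the index key attributes, which removes it from the GSI.
function markPaid(order: OrderRecord): OrderRecord {
  const paid: OrderRecord = { ...order, Status: "paid" };
  delete paid.UnpaidGSIPK;
  delete paid.UnpaidGSISK;
  return paid;
}

// Emulates the GSI contents: only items carrying both key attributes are indexed.
function unpaidIndex(table: OrderRecord[]): OrderRecord[] {
  return table.filter((o) => o.UnpaidGSIPK !== undefined && o.UnpaidGSISK !== undefined);
}

const pendingOrder: OrderRecord = {
  PK: "CUSTOMER#C001",
  SK: "ORDER#2026-04-18#O100",
  Total: 2500000,
  Status: "pending_payment",
  UnpaidGSIPK: "UNPAID",
  UnpaidGSISK: "2026-04-18",
};
const deliveredOrder: OrderRecord = {
  PK: "CUSTOMER#C001",
  SK: "ORDER#2026-04-15#O099",
  Total: 350000,
  Status: "delivered",
};
```

Setting the attribute to an empty or placeholder value instead of removing it would keep the item in the index and defeat the pattern.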

### 4.5 Write Sharding: spreading a hot partition

When a single Partition Key receives too many writes (for example a global counter or a leaderboard), its partition becomes a bottleneck. Write Sharding solves this by appending a random suffix:

```
// Instead of: PK = "GLOBAL_COUNTER" (hot partition!)
// Use:        PK = "GLOBAL_COUNTER#" + random(0, 9)

PK: "GLOBAL_COUNTER#0", SK: "COUNT" → Value: 1523
PK: "GLOBAL_COUNTER#1", SK: "COUNT" → Value: 1487
...
PK: "GLOBAL_COUNTER#9", SK: "COUNT" → Value: 1501

// Total = sum of all shards = 1523 + 1487 + ... + 1501
// Writes are spread over 10 key values → write throughput can grow up to roughly 10x
```

#### ⚠️ The trade-off of Write Sharding

Reading the total means querying every shard and aggregating in the application. The pattern suits write-heavy use cases such as counters, votes, and real-time analytics. Avoid it for data that has to be read back immediately with strong consistency.
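Both halves of the trade-off, cheap scattered writes and a fan-out read, fit in a short sketch. It assumes 10 shards as in the example above and uses a `Map` as a stand-in for the table; `shardedKey`, `increment`, and `totalCount` are hypothetical names.

```typescript
const SHARD_COUNT = 10;

function shardedKey(base: string, shard: number): string {
  return `${base}#${shard}`;
}

// Pick one of the shard keys at random; `random` is injectable for testing.
function randomShardKey(base: string, random: () => number = Math.random): string {
  return shardedKey(base, Math.floor(random() * SHARD_COUNT));
}

// Write path: bump whichever shard the random suffix selects.
function increment(store: Map<string, number>, base: string, random?: () => number): void {
  const pk = randomShardKey(base, random);
  store.set(pk, (store.get(pk) ?? 0) + 1);
}

// Read path: visit every shard and add the values up in the application.
function totalCount(store: Map<string, number>, base: string): number {
  let sum = 0;
  for (let shard = 0; shard < SHARD_COUNT; shard++) {
    sum += store.get(shardedKey(base, shard)) ?? 0;
  }
  return sum;
}
```

The shard count is a tuning knob: more shards raise the write ceiling but make every read of the total proportionally more expensive.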

## 5. When to use Single-Table and when to use Multi-Table

Single-Table Design is not always the best choice. AWS's own guidance is to weigh both approaches against the specific context:

| Criterion | Single-Table ✅ | Multi-Table ✅ |
| --- | --- | --- |
| **Cross-entity queries** | You need to fetch several entity types together (materialized joins) | Each entity is queried independently |
| **DynamoDB Streams** | At most 2 consumers, which is enough | You need more than 2 stream consumers for different entities |
| **Analytics/OLAP** | Purely OLTP workload | Each entity has to be exported separately to Redshift/S3 |
| **Team size** | One team owns the whole service | Several teams, each owning its own entities |
| **Monitoring** | One table is easier to monitor | You need separate metrics per entity type |
| **Cost** | Fewer round-trips keep RCU/WCU usage low | May cost more because of the extra requests |
| **Complexity** | Complex schema; the team needs deep DynamoDB knowledge | Simpler; easier to onboard new developers |

#### 💡 Rule of thumb

**Microservices:** Each service should have its own DynamoDB table (just as each service has its own database). Single-Table Design applies **within a single service**; it does not mean merging every service into one table.

## 6. Integrating with serverless on .NET

DynamoDB combined with AWS Lambda makes a powerful serverless architecture. For .NET, AWS provides an official SDK and the Object Persistence Model, which make it easier to work with Single-Table Design:

```
graph LR
    A["API Gateway"] --> B["Lambda .NET"]
    B --> C["DynamoDB<br/>Single Table"]
    C --> D["DynamoDB Streams"]
    D --> E["Lambda<br/>Event Handler"]
    E --> F["SQS / SNS<br/>EventBridge"]

    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#2c3e50,stroke:#fff,color:#fff
    style C fill:#e94560,stroke:#fff,color:#fff
    style D fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style E fill:#2c3e50,stroke:#fff,color:#fff
    style F fill:#f8f9fa,stroke:#e94560,color:#2c3e50
```

A typical serverless architecture: API Gateway → Lambda → DynamoDB Single Table → Streams → Event Processing

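```
// .NET: DynamoDB low-level API with Single-Table Design
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Threading.Tasks;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;

// Minimal entity types; the attribute names follow the example table above.
public record CustomerProfile(string Name, string Email);
public record Order(string SortKey, decimal Total, string Status);

public class OrderRepository
{
    private readonly IAmazonDynamoDB _client;
    private const string TableName = "ECommerceTable";

    public OrderRepository(IAmazonDynamoDB client) => _client = client;

    // Fetch the customer profile and all orders from one item collection.
    public async Task<(CustomerProfile? Profile, List<Order> Orders)> GetCustomerWithOrders(string customerId)
    {
        var items = await QueryAllPages(new QueryRequest
        {
            TableName = TableName,
            KeyConditionExpression = "PK = :pk",
            ExpressionAttributeValues = new Dictionary<string, AttributeValue>
            {
                { ":pk", new AttributeValue($"CUSTOMER#{customerId}") }
            }
        });

        CustomerProfile? profile = null;
        var orders = new List<Order>();

        foreach (var item in items)
        {
            var sk = item["SK"].S;
            if (sk == "PROFILE")
                profile = MapToCustomer(item);
            else if (IsOrderHeader(sk))
                orders.Add(MapToOrder(item));
        }

        return (profile, orders);
    }

    // Fetch orders within a date range, inclusive on both ends.
    public async Task<List<Order>> GetOrdersByDateRange(
        string customerId, DateTime from, DateTime to)
    {
        var items = await QueryAllPages(new QueryRequest
        {
            TableName = TableName,
            KeyConditionExpression = "PK = :pk AND SK BETWEEN :start AND :end",
            ExpressionAttributeValues = new Dictionary<string, AttributeValue>
            {
                { ":pk", new AttributeValue($"CUSTOMER#{customerId}") },
                { ":start", new AttributeValue($"ORDER#{from:yyyy-MM-dd}") },
                // "~" sorts after "#", so keys for the last day stay inside the range.
                { ":end", new AttributeValue($"ORDER#{to:yyyy-MM-dd}~") }
            }
        });

        return items.Where(i => IsOrderHeader(i["SK"].S)).Select(MapToOrder).ToList();
    }

    // One Query call returns at most 1 MB, so follow LastEvaluatedKey until it is empty.
    private async Task<List<Dictionary<string, AttributeValue>>> QueryAllPages(QueryRequest request)
    {
        var items = new List<Dictionary<string, AttributeValue>>();
        QueryResponse response;
        do
        {
            response = await _client.QueryAsync(request);
            if (response.Items is { Count: > 0 })
                items.AddRange(response.Items);
            request.ExclusiveStartKey = response.LastEvaluatedKey;
        } while (response.LastEvaluatedKey is { Count: > 0 });
        return items;
    }

    // Order headers look like ORDER#<date>#<id>; line items add a "#ITEM#<n>" suffix.
    private static bool IsOrderHeader(string sk) =>
        sk.StartsWith("ORDER#", StringComparison.Ordinal) && !sk.Contains("#ITEM#");

    private static CustomerProfile MapToCustomer(Dictionary<string, AttributeValue> item) =>
        new(item["Name"].S, item["Email"].S);

    private static Order MapToOrder(Dictionary<string, AttributeValue> item) =>
        new(item["SK"].S, decimal.Parse(item["Total"].N, CultureInfo.InvariantCulture), item["Status"].S);
}
```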
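On the Streams side of the architecture, a single-table stream carries every entity type, so the event handler has to dispatch on the key shape first. A hedged sketch: `StreamRecord` below is a hand-written subset of the stream record format (string keys only), and `entityTypeOf` and `countByEntity` are hypothetical helpers based on this article's key layout.

```typescript
// Minimal subset of a DynamoDB Streams record, limited to what the dispatch needs.
interface StreamRecord {
  eventName: "INSERT" | "MODIFY" | "REMOVE";
  dynamodb: { Keys: { PK: { S: string }; SK: { S: string } } };
}

type EntityType = "CustomerProfile" | "Order" | "OrderItem" | "Unknown";

// Decide which entity a change belongs to by looking only at its keys.
function entityTypeOf(record: StreamRecord): EntityType {
  const pk = record.dynamodb.Keys.PK.S;
  const sk = record.dynamodb.Keys.SK.S;
  if (!pk.startsWith("CUSTOMER#")) return "Unknown";
  if (sk === "PROFILE") return "CustomerProfile";
  if (sk.startsWith("ORDER#")) return sk.includes("#ITEM#") ? "OrderItem" : "Order";
  return "Unknown";
}

// Example consumer: tally a batch by entity type before routing it onward.
function countByEntity(records: StreamRecord[]): Record<EntityType, number> {
  const counts: Record<EntityType, number> = {
    CustomerProfile: 0,
    Order: 0,
    OrderItem: 0,
    Unknown: 0,
  };
  for (const record of records) {
    counts[entityTypeOf(record)] += 1;
  }
  return counts;
}
```

Keeping the dispatch in one function means a new entity type only needs one new branch, and unknown key shapes are counted rather than silently mishandled.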

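## 7. Optimizing DynamoDB cost

DynamoDB bills throughput in Read Capacity Units (RCU) and Write Capacity Units (WCU) in provisioned mode, and in read and write request units in on-demand mode. Single-Table Design saves money mainly by reducing the number of requests:

- **1 RCU** = one strongly consistent read per second of an item up to 4 KB
- **1 WCU** = one write per second of an item up to 1 KB
- **$1.25** per million write request units, on-demand
- **$0.25** per million read request units, on-demand

The on-demand prices are the figures quoted here at the time of writing; they vary by Region and change over time, so check the pricing page.

#### Cost optimization strategies

**1. Separate hot and cold attributes:** Attributes that change often (view count, last_login) should live in their own small item, so that each update does not rewrite a large item.

**2. GSI projections:** Project only the attributes you need into a GSI instead of ALL. A GSI is billed separately for storage and writes.

**3. On-Demand vs Provisioned:** On-Demand suits unpredictable workloads. For steady workloads, Provisioned with Auto Scaling can be several times cheaper (this article's estimate is 5-7x).

**4. Automatic TTL:** Use Time-To-Live to delete expired data automatically (sessions, logs, temporary data). DynamoDB does not charge for TTL deletes.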
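The unit definitions translate directly into arithmetic, which makes strategy 1 easy to quantify. A sketch assuming 1 KB = 1024 bytes and single-item operations; `readUnits` and `writeUnits` are illustrative helper names, and the 10 KB profile item is a made-up example.

```typescript
const KB = 1024;

// Capacity units for reading one item: size is rounded up to 4 KB blocks,
// and an eventually consistent read costs half of a strongly consistent one.
function readUnits(bytes: number, stronglyConsistent: boolean): number {
  const blocks = Math.max(1, Math.ceil(bytes / (4 * KB)));
  return stronglyConsistent ? blocks : blocks / 2;
}

// Capacity units for writing one item: size is rounded up to 1 KB blocks.
function writeUnits(bytes: number): number {
  return Math.max(1, Math.ceil(bytes / KB));
}

// Strategy 1 in numbers: bumping a view counter stored inside a 10 KB item
// rewrites the whole item, while a separate 200-byte counter item does not.
const updateInsideLargeItem = writeUnits(10 * KB); // 10 write units per update
const updateSeparateCounter = writeUnits(200); // 1 write unit per update
```

At a thousand counter updates per second, that difference is 10,000 versus 1,000 write units per second for the same logical change.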

## 8. Anti-patterns to avoid

**Anti-pattern #1: Full-table Scan**

Using `Scan` instead of `Query` reads the entire table, which is extremely expensive and slow. If you need a Scan, it is a sign that the key design is wrong. Every access pattern should be served by a Query or a GetItem.

**Anti-pattern #2: Hot partition**

The Partition Key is unevenly distributed (for example, using the date as PK sends all traffic to today's partition). The fix: add an entity ID to the PK, and use Write Sharding for global counters.

**Anti-pattern #3: Large items**

DynamoDB limits an item to 400 KB. Store blob data (images, documents) in S3 and keep only the S3 URL in DynamoDB. Likewise, avoid storing a large array in one attribute; split it into separate items.

**Anti-pattern #4: Relational thinking**

Normalizing the data and then "JOINing" in application code with many GetItem requests. This is the worst approach because it throws away DynamoDB's core advantage. Denormalize and store data redundantly instead.

**Anti-pattern #5: Too many GSIs**

Each GSI is a copy of the data and consumes storage and write capacity. Prefer GSI Overloading and Sparse Indexes before creating a new GSI. Five to seven GSIs is a sensible upper bound.

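## 9. The DynamoDB Free Tier and getting hands-on

AWS offers a DynamoDB Free Tier that is **permanent** (not limited to the first 12 months):

| Resource | Free Tier per month | Enough for |
| --- | --- | --- |
| Read capacity | 25 provisioned RCU | About 130 million eventually consistent reads per month (items up to 4 KB) |
| Write capacity | 25 provisioned WCU | About 65 million writes per month (items up to 1 KB) |
| Storage | 25 GB | Plenty for side projects and MVPs |
| DynamoDB Streams | 2.5 million read requests | Small event-driven architectures |
| Global Tables | Not free | Multi-region replication is billed |

Together, the read and write allowances come to roughly 200 million requests per month.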
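The monthly request counts follow from the capacity-unit definitions. A quick check of the arithmetic, assuming a 30-day month and capacity that is fully used every second; `monthlyRequests` is an illustrative helper.

```typescript
const SECONDS_PER_MONTH = 30 * 24 * 60 * 60; // 2,592,000

// Requests per month that `units` of provisioned capacity can serve when
// each unit handles `requestsPerUnitPerSecond` requests.
function monthlyRequests(units: number, requestsPerUnitPerSecond: number): number {
  return units * requestsPerUnitPerSecond * SECONDS_PER_MONTH;
}

const freeWrites = monthlyRequests(25, 1); // 64,800,000 writes of up to 1 KB
const freeStrongReads = monthlyRequests(25, 1); // 64,800,000 strongly consistent reads
const freeEventualReads = monthlyRequests(25, 2); // 129,600,000 eventually consistent reads
const freeTotal = freeWrites + freeEventualReads; // 194,400,000 requests
```

These are ceilings, not typical usage: unused capacity in a quiet second does not carry over to a busy one beyond DynamoDB's short-term burst allowance.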

#### ✅ Getting hands-on

**NoSQL Workbench:** AWS provides a free tool for designing and visualizing DynamoDB data models offline. You can create tables, define access patterns, and test queries before deploying to AWS. Download it at `aws.amazon.com/dynamodb/nosql-workbench`.

**DynamoDB Local:** Run DynamoDB on your own machine for development, with no AWS account required. It works well with Docker: `docker run -p 8000:8000 amazon/dynamodb-local`.

## 10. Conclusion

DynamoDB Single-Table Design is more than a technique; it is a completely different way of thinking about data modeling. Instead of asking "what does the data look like" (relational), you have to answer "how is the data used" (access-pattern-first). Once you have mastered the patterns covered here (GSI Overloading, Hierarchical Sort Keys, Adjacency List, Sparse Index, and Write Sharding), you can design NoSQL systems that scale to millions of requests per second at an optimized cost and with latency under 10 ms.

Combined with a serverless architecture (Lambda + API Gateway), the DynamoDB Free Tier can be enough to run production for an early-stage startup at no cost, a competitive advantage that few other services can match.


### References

- [AWS Blog — Single-table vs. multi-table design in Amazon DynamoDB](https://aws.amazon.com/blogs/database/single-table-vs-multi-table-design-in-amazon-dynamodb/)
- [Alex DeBrie — The What, Why, and When of Single-Table Design with DynamoDB](https://www.alexdebrie.com/posts/dynamodb-single-table/)
- [DEV Community — Advanced Single Table Design Patterns With DynamoDB](https://dev.to/urielbitton/advanced-single-table-design-patterns-with-dynamodb-4g26)
- [AWS DynamoDB Developer Guide — Official Documentation](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/)
- [AWS DynamoDB Pricing — Free Tier Details](https://aws.amazon.com/dynamodb/pricing/)

Disclaimer: The opinions expressed in this blog are solely my own and do not reflect the views or opinions of my employer or any affiliated organizations. The content provided is for informational and educational purposes only and should not be taken as professional advice. While I strive to provide accurate and up-to-date information, I make no warranties or guarantees about the completeness, reliability, or accuracy of the content. Readers are encouraged to verify the information and seek independent advice as needed. I disclaim any liability for decisions or actions taken based on the content of this blog.