AWS Bedrock AgentCore: A Serverless Platform for Deploying AI Agents to Production

Posted on: 4/25/2026 1:13:40 AM

Table of contents

1. Why do AI agents need a dedicated platform?
   1. Lessons from production
2. Amazon Bedrock AgentCore architecture overview
3. Runtime: deploying agents without infrastructure
   1. Managed Agent Harness (new in 2026)
4. Knowledge Base: fully managed RAG
   1. Chunking strategies
5. Guardrails: output control and security
   1. A note on Guardrails for code
   2. Example Guardrails configuration
6. Multi-Agent Collaboration: a team of specialized agents
   1. Two coordination modes
      1. Supervisor with Routing
      2. Supervisor with Orchestration
7. Identity: authenticating agents with external systems
   1. Best practice: principle of least privilege
8. Memory: agents that remember across sessions
9. Gateway: connecting agents to any API
   1. How Gateway works
10. Action Groups: extending what an agent can do
11. Observability: monitoring agents in production
   1. CloudWatch integration
12. Comparison with other AI agent platforms
13. Production architecture: e-commerce support agent
14. Best practices for production
15. Cost and pricing
   1. Cost optimization tips
Conclusion
   1. References

Building an AI agent that works in a test environment is one thing. Taking that agent to production, with millions of requests, strict security, context that persists across sessions, and integration with existing enterprise systems, is the real challenge. Amazon Bedrock AgentCore was built for exactly this problem: a serverless platform that provides the building blocks needed to deploy AI agents at production scale.

3 API calls to deploy an agent to production

10 collaborator agents at most per supervisor

6 safeguard policies in Guardrails

0 infrastructure to manage (serverless)

1. Why do AI agents need a dedicated platform?

When building an AI agent, the "intelligent" part (calling the LLM, parsing results, invoking tools) typically accounts for only about 20% of the effort. The remaining 80% goes to production concerns: security, authentication, memory management, monitoring, retries, rate limiting, and deployment at scale. This is why managed platforms such as Bedrock AgentCore are becoming increasingly important.

graph TB
    subgraph "Effort to build an AI agent"
        A["20%: Agent Logic<br/>LLM, Tools, Prompts"]
        B["80%: Production Infrastructure<br/>Security, Memory, Observability,<br/>Scaling, Identity, Deployment"]
    end

    style A fill:#4CAF50,stroke:#fff,color:#fff
    style B fill:#e94560,stroke:#fff,color:#fff

Effort split: agent logic vs. infrastructure

Lessons from production

According to AWS, enterprise teams spend an average of 3-6 months just building the infrastructure around an AI agent before it can handle real workloads. Bedrock AgentCore aims to cut that time to a few days.

2. Amazon Bedrock AgentCore architecture overview

AgentCore is not a single service. It is a set of modular services, each addressing a specific aspect of running AI agents in production. You can adopt the whole suite or only the parts you need.

graph TB
    DEV["Developer / Agent Framework"]

    subgraph "Amazon Bedrock AgentCore"
        RT["Runtime<br/>Serverless Execution"]
        GW["Gateway<br/>Tool Access & Auth"]
        MEM["Memory<br/>Session & Long-term"]
        ID["Identity<br/>Agent Authentication"]
        OBS["Observability<br/>Tracing & Metrics"]
        EVAL["Evaluations<br/>Quality Scoring"]
        POL["Policy<br/>Action Control"]
        BR["Browser<br/>Web Interaction"]
        CI["Code Interpreter<br/>Code Execution"]
    end

    DEV --> RT
    RT --> GW
    RT --> MEM
    RT --> ID
    RT --> OBS
    RT --> EVAL
    RT --> POL
    RT --> BR
    RT --> CI

    style DEV fill:#e94560,stroke:#fff,color:#fff
    style RT fill:#2c3e50,stroke:#fff,color:#fff
    style GW fill:#16213e,stroke:#fff,color:#fff
    style MEM fill:#16213e,stroke:#fff,color:#fff
    style ID fill:#16213e,stroke:#fff,color:#fff
    style OBS fill:#16213e,stroke:#fff,color:#fff
    style EVAL fill:#16213e,stroke:#fff,color:#fff
    style POL fill:#16213e,stroke:#fff,color:#fff
    style BR fill:#16213e,stroke:#fff,color:#fff
    style CI fill:#16213e,stroke:#fff,color:#fff

Modular architecture of Amazon Bedrock AgentCore

| Service | Function | Problem it solves |
| --- | --- | --- |
| Runtime | Serverless environment with session isolation | Deploy agents without managing servers |
| Gateway | Unified tool access via OpenAPI specs | Connect agents to APIs with automatic auth handling |
| Memory | Session state + long-term memory | Agents retain context across sessions |
| Identity | Agent authentication with IdPs (Okta, Entra ID) | Agents authenticate securely with third-party services |
| Observability | Distributed tracing, metrics, logs | Debug and monitor agent behavior |
| Evaluations | Continuous quality scoring | Continuously assess response quality |
| Policy | Fine-grained action control | Control what an agent is allowed to do |

3. Runtime: deploying agents without infrastructure

AgentCore Runtime is the center of the platform. It provides a serverless environment with session isolation: each session runs in its own isolated execution environment (AWS documents this as a dedicated microVM per session) and shares no state with other sessions.

Managed Agent Harness (new in 2026)

The newest feature lets you deploy a complete agent with just 3 API calls, with no orchestration infrastructure to write. The AgentCore CLI supports the full development lifecycle, from init and test through deploy.

# Example: deploying an agent with the AgentCore CLI
# Note: command and flag names are reproduced from the original post and may
# differ between CLI versions; check `agentcore --help` before relying on them.

# Step 1: Initialize the project
agentcore init my-support-agent --framework strands

# Step 2: Define the agent logic
# agent.py can use any framework (Strands, LangGraph, CrewAI, ...)

# Step 3: Deploy to AgentCore Runtime
agentcore deploy --name my-support-agent \
    --memory enabled \
    --guardrails my-guardrail-id \
    --identity-provider okta

The biggest differentiator of AgentCore Runtime is that it is framework-agnostic. You can use any agent framework (Strands Agents SDK, LangGraph, CrewAI, AutoGen), and Runtime handles deployment, scaling, and monitoring.

4. Knowledge Base: fully managed RAG

Amazon Bedrock Knowledge Base provides a complete RAG (Retrieval-Augmented Generation) pipeline, from data ingestion, chunking, and embedding through vector search, all fully managed.

graph LR
    S3["S3 Bucket<br/>Documents"] --> CHUNK["Auto Chunking<br/>Semantic / Fixed"]
    CHUNK --> EMB["Embedding Model<br/>Titan / Cohere"]
    EMB --> VS["Vector Store<br/>OpenSearch / Pinecone"]

    Q["User Query"] --> AGENT["Bedrock Agent"]
    AGENT --> VS
    VS --> CTX["Retrieved Context"]
    CTX --> AGENT
    AGENT --> R["Grounded Response"]

    style S3 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style AGENT fill:#e94560,stroke:#fff,color:#fff
    style VS fill:#2c3e50,stroke:#fff,color:#fff
    style R fill:#4CAF50,stroke:#fff,color:#fff
    style CHUNK fill:#16213e,stroke:#fff,color:#fff
    style EMB fill:#16213e,stroke:#fff,color:#fff
    style Q fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style CTX fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50

RAG flow in Amazon Bedrock Knowledge Base

Chunking strategies

| Strategy | Description | When to use |
| --- | --- | --- |
| Fixed-size | Split by a fixed number of tokens | Homogeneous data, FAQs |
| Semantic | Split by meaning, preserving context | Long documents, technical docs |
| Hierarchical | Multiple chunk levels (parent-child) | Documents with complex structure |
| No chunking | Each file is one chunk | Small files, one topic per file |
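
To make the fixed-size strategy concrete, here is a minimal sketch of window-based chunking with overlap. It is not how the managed service is implemented: whitespace-separated words stand in for model tokens, and the function name and the `chunk_size` / `overlap` parameters are assumptions chosen for illustration.

```python
def fixed_size_chunks(tokens, chunk_size=300, overlap=60):
    """Split a token list into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


# Assumption: words approximate tokens well enough for a demonstration.
words = "the quick brown fox jumps over the lazy dog".split()
for chunk in fixed_size_chunks(words, chunk_size=4, overlap=1):
    print(chunk)
```

The overlap keeps a little shared context between neighboring chunks, which tends to help retrieval when an answer straddles a chunk boundary.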

5. Guardrails: output control and security

Guardrails are the most important line of defense in production. Amazon Bedrock Guardrails provides 6 safeguard policies that control both the input and the output of an agent.

graph LR
    INPUT["User Input"] --> G1["Content Filter"]
    G1 --> G2["Denied Topics"]
    G2 --> G3["Word Filter"]
    G3 --> G4["PII Redaction"]
    G4 --> G5["Prompt Attack Detection"]
    G5 --> LLM["LLM Processing"]
    LLM --> G6["Contextual Grounding"]
    G6 --> OUTPUT["Safe Response"]

    style INPUT fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style LLM fill:#2c3e50,stroke:#fff,color:#fff
    style OUTPUT fill:#4CAF50,stroke:#fff,color:#fff
    style G1 fill:#e94560,stroke:#fff,color:#fff
    style G2 fill:#e94560,stroke:#fff,color:#fff
    style G3 fill:#e94560,stroke:#fff,color:#fff
    style G4 fill:#e94560,stroke:#fff,color:#fff
    style G5 fill:#e94560,stroke:#fff,color:#fff
    style G6 fill:#e94560,stroke:#fff,color:#fff

The 6-layer Guardrails processing pipeline

A note on Guardrails for code

As of 2026, Guardrails also covers code: it detects harmful content in code elements, blocks code injection, and prevents PII leaks through code structures. This matters if you are building a coding-assistant agent.

Example Guardrails configuration

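{
  "name": "production-guardrail",
  "blockedInputMessaging": "Sorry, I can't help with that request.",
  "blockedOutputsMessaging": "Sorry, I can't provide that response.",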
  "contentPolicyConfig": {
    "filtersConfig": [
      { "type": "SEXUAL", "inputStrength": "HIGH", "outputStrength": "HIGH" },
      { "type": "VIOLENCE", "inputStrength": "HIGH", "outputStrength": "HIGH" },
      { "type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH" },
      { "type": "INSULTS", "inputStrength": "MEDIUM", "outputStrength": "HIGH" }
    ]
  },
  "topicPolicyConfig": {
    "topicsConfig": [
      {
        "name": "competitor-comparison",
        "definition": "Questions that compare the product with competitors' products",
        "type": "DENY"
      }
    ]
  },
  "sensitiveInformationPolicyConfig": {
    "piiEntitiesConfig": [
      { "type": "EMAIL", "action": "ANONYMIZE" },
      { "type": "PHONE", "action": "ANONYMIZE" },
      { "type": "CREDIT_DEBIT_CARD_NUMBER", "action": "BLOCK" }
    ]
  },
  "contextualGroundingPolicyConfig": {
    "filtersConfig": [
      { "type": "GROUNDING", "threshold": 0.7 },
      { "type": "RELEVANCE", "threshold": 0.7 }
    ]
  }
}
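
The two `threshold` values in `contextualGroundingPolicyConfig` act as minimum scores: a response that scores below either threshold is blocked. The sketch below models only that decision rule, under the assumption that scores are floats in the range 0 to 1. The function and the score dictionaries are hypothetical and are not part of the Bedrock API.

```python
def grounding_decision(scores, thresholds):
    """Return ('ALLOW' or 'BLOCK', failed_checks) for one model response."""
    failed = sorted(
        name
        for name, minimum in thresholds.items()
        if scores.get(name, 0.0) < minimum  # a missing score counts as a failure
    )
    return ("BLOCK" if failed else "ALLOW", failed)


# Thresholds mirror the configuration above.
THRESHOLDS = {"GROUNDING": 0.7, "RELEVANCE": 0.7}

print(grounding_decision({"GROUNDING": 0.92, "RELEVANCE": 0.81}, THRESHOLDS))
print(grounding_decision({"GROUNDING": 0.55, "RELEVANCE": 0.81}, THRESHOLDS))
```

Raising a threshold makes the filter stricter, so it is worth tuning against real traffic rather than guessing.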

6. Multi-Agent Collaboration: a team of specialized agents

When a single agent cannot handle a complex task, Bedrock lets you build a team of agents that work together. The supervisor-collaborator architecture supports up to 10 collaborator agents per supervisor.

graph TB
    USER["User Request"] --> SUP["Supervisor Agent<br/>Coordinate & Aggregate"]

    SUP --> A1["Agent 1<br/>Customer Lookup"]
    SUP --> A2["Agent 2<br/>Order Processing"]
    SUP --> A3["Agent 3<br/>Inventory Check"]
    SUP --> A4["Agent 4<br/>Payment Processing"]

    A1 --> DB["CRM Database"]
    A2 --> OMS["Order Management"]
    A3 --> WMS["Warehouse System"]
    A4 --> PAY["Payment Gateway"]

    A1 --> SUP
    A2 --> SUP
    A3 --> SUP
    A4 --> SUP
    SUP --> RESP["Consolidated Response"]

    style USER fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style SUP fill:#e94560,stroke:#fff,color:#fff
    style A1 fill:#2c3e50,stroke:#fff,color:#fff
    style A2 fill:#2c3e50,stroke:#fff,color:#fff
    style A3 fill:#2c3e50,stroke:#fff,color:#fff
    style A4 fill:#2c3e50,stroke:#fff,color:#fff
    style RESP fill:#4CAF50,stroke:#fff,color:#fff
    style DB fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style OMS fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style WMS fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style PAY fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50

Multi-agent architecture with the supervisor pattern

Two coordination modes

Supervisor with Routing

The supervisor acts only as a router: it analyzes the query and forwards it to the right specialized agent. This fits independent tasks that do not require combining results from multiple agents.

Supervisor with Orchestration

The supervisor breaks the problem into parts, sends them to specialized agents, and then combines the results. This fits complex tasks that need to coordinate multiple data sources.

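import boto3

bedrock = boto3.client('bedrock-agent')

# Placeholder (not in the original post): alias ARN of an existing collaborator agent
CUSTOMER_AGENT_ALIAS_ARN = 'arn:aws:bedrock:us-east-1:123456789012:agent-alias/AGENT_ID/ALIAS_ID'

# Create the supervisor agent
supervisor = bedrock.create_agent(
    agentName='ecommerce-supervisor',
    instruction="""You are the supervisor coordinating a team of e-commerce agents.
    Analyze each customer request and coordinate the specialized agents
    to handle it: order lookup, inventory checks, payment processing.""",
    # Some Regions may require an inference profile ID instead of the bare model ID
    foundationModel='anthropic.claude-sonnet-4-20250514-v1:0',
    agentCollaboration='SUPERVISOR'  # use 'SUPERVISOR_ROUTER' for routing mode
)

# Associate a collaborator agent (collaborators are referenced by alias ARN)
bedrock.associate_agent_collaborator(
    agentId=supervisor['agent']['agentId'],
    agentVersion='DRAFT',
    agentDescriptor={'aliasArn': CUSTOMER_AGENT_ALIAS_ARN},
    collaboratorName='CustomerLookup',
    collaborationInstruction='Call this agent to look up customer information',
    relayConversationHistory='TO_COLLABORATOR'
)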
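
The difference between the two coordination modes is easier to see without any AWS calls. The following is a toy model, not Bedrock's implementation: the specialists are plain functions, and a keyword match stands in for the supervisor's LLM-based analysis of the request. All names here are hypothetical.

```python
# Hypothetical specialist agents, modeled as plain functions.
SPECIALISTS = {
    "order": lambda q: f"[order] status for: {q}",
    "inventory": lambda q: f"[inventory] stock for: {q}",
    "payment": lambda q: f"[payment] charge for: {q}",
}

NO_MATCH = "No specialist available for this request."


def matching_specialists(query):
    """Keyword match stands in for the supervisor's intent analysis."""
    return [name for name in SPECIALISTS if name in query.lower()]


def supervisor_routing(query):
    """Routing mode: forward the query to exactly one specialist."""
    matches = matching_specialists(query)
    return SPECIALISTS[matches[0]](query) if matches else NO_MATCH


def supervisor_orchestration(query):
    """Orchestration mode: fan out to every relevant specialist, then merge."""
    results = [SPECIALISTS[name](query) for name in matching_specialists(query)]
    return " | ".join(results) if results else NO_MATCH


print(supervisor_routing("Check order 42"))
print(supervisor_orchestration("Check order and payment for 42"))
```

Routing returns a single specialist's answer unchanged, while orchestration pays for several calls and a merge step, which is why routing is the cheaper choice for independent tasks.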

7. Identity: authenticating agents with external systems

One of the biggest challenges in taking an AI agent to production is giving it secure access to enterprise systems (Slack, GitHub, Jira, Salesforce) without hardcoding credentials.

AgentCore Identity addresses this by giving the agent its own identity and integrating with the corporate identity provider (IdP).

graph LR
    AGENT["AI Agent"] --> ACID["AgentCore Identity"]
    ACID --> IDP["Corporate IdP<br/>Okta / Entra ID / Cognito"]
    IDP --> TOKEN["OAuth Token"]
    TOKEN --> AGENT
    AGENT --> SLACK["Slack API"]
    AGENT --> GH["GitHub API"]
    AGENT --> JIRA["Jira API"]

    style AGENT fill:#e94560,stroke:#fff,color:#fff
    style ACID fill:#2c3e50,stroke:#fff,color:#fff
    style IDP fill:#16213e,stroke:#fff,color:#fff
    style TOKEN fill:#4CAF50,stroke:#fff,color:#fff
    style SLACK fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style GH fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style JIRA fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50

Agent authentication flow through AgentCore Identity

Best practice: principle of least privilege

Always grant an agent the minimum permissions it needs. Use AgentCore Policy to restrict exactly which APIs the agent may call and on which resources. Never give an agent admin access.
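
As an illustration of deny-by-default authorization, the sketch below checks an agent's requested action against an explicit allowlist. This is not AgentCore Policy syntax: the action names, resource patterns, and the `is_allowed` helper are all hypothetical and only demonstrate the principle.

```python
from fnmatch import fnmatchcase

# Hypothetical allowlist: (action, resource pattern) pairs the agent may use.
ALLOWED = [
    ("orders:Get", "orders/*"),
    ("tickets:Create", "tickets/support/*"),
]


def is_allowed(action, resource, allowlist=ALLOWED):
    """Deny by default; allow only an exact action on a matching resource."""
    return any(
        action == allowed_action and fnmatchcase(resource, pattern)
        for allowed_action, pattern in allowlist
    )


print(is_allowed("orders:Get", "orders/ORD-12345"))     # read access was granted
print(is_allowed("orders:Delete", "orders/ORD-12345"))  # delete was never granted
```

Anything not listed is refused, so adding a capability is always an explicit and reviewable change.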

8. Memory: agents that remember across sessions

AgentCore Memory automatically manages two types of memory for an agent:

| Memory type | Scope | Use case |
| --- | --- | --- |
| Session Memory | Within a single conversation session | Keep the context of the current conversation |
| Long-term Memory | Across multiple sessions | Remember user preferences and interaction history |

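import boto3

# Memory is managed automatically by AgentCore
# The agent can query memory through the API
bedrock_runtime = boto3.client('bedrock-agent-runtime')

# Example: the agent remembers a customer's preferences
response = bedrock_runtime.invoke_agent(
    agentId='support-agent',  # placeholder: use your real agent ID
    agentAliasId='ALIAS_ID',  # placeholder: invoke_agent requires an alias ID
    sessionId='user-12345',
    enableTrace=True,
    memoryId='user-12345-memory',  # Long-term memory identifier
    inputText='I want to change my service plan'
)

# The agent automatically knows:
# - The user has been on the Premium plan since 2024
# - The user has asked about downgrading twice before
# - The user prefers to communicate in Vietnamese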
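
The split between the two memory types can be sketched as a small in-process model. This is an illustration only, not the AgentCore Memory API: the `AgentMemory` class, its method names, and the TTL handling are assumptions. Session turns are dropped when the session ends, while per-user facts persist until their TTL expires.

```python
import time


class AgentMemory:
    """Toy model: per-session turns plus per-user facts that expire after a TTL."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.clock = clock      # injectable so expiry can be tested
        self.sessions = {}      # session_id -> list of turns
        self.long_term = {}     # user_id -> {key: (value, stored_at)}

    def add_turn(self, session_id, text):
        self.sessions.setdefault(session_id, []).append(text)

    def end_session(self, session_id):
        # Session memory does not outlive the session.
        self.sessions.pop(session_id, None)

    def remember(self, user_id, key, value):
        self.long_term.setdefault(user_id, {})[key] = (value, self.clock())

    def recall(self, user_id):
        """Return only the facts that have not yet passed their TTL."""
        now = self.clock()
        return {
            key: value
            for key, (value, stored_at) in self.long_term.get(user_id, {}).items()
            if now - stored_at < self.ttl_seconds
        }


memory = AgentMemory(ttl_seconds=30 * 24 * 3600)
memory.add_turn("session-1", "I want to change my service plan")
memory.remember("user-12345", "plan", "Premium")
memory.end_session("session-1")
print(memory.recall("user-12345"))
```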

9. Gateway: connecting agents to any API

AgentCore Gateway sits between the agent and external tools and APIs. Instead of every agent handling authentication, rate limiting, and error handling for each API on its own, Gateway handles all of it.

How Gateway works

You define a tool with an OpenAPI specification. Gateway then reads the API contract, handles auth, validates requests and responses, retries automatically, and returns the result to the agent as a tool response.

# Example OpenAPI spec for the "get_order_status" tool
openapi: 3.0.0
info:
  title: Order Management API
  version: 1.0.0
paths:
  /orders/{orderId}:
    get:
      operationId: getOrderStatus
      summary: Get order status
      parameters:
        - name: orderId
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: Order details
          content:
            application/json:
              schema:
                type: object
                properties:
                  status:
                    type: string
                    enum: [pending, processing, shipped, delivered]
                  estimatedDelivery:
                    type: string
                    format: date
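
To show what validating a response against this contract means in practice, here is a hand-written check for the two properties in the `200` schema. It is a sketch of the idea only: Gateway's actual validation is not shown in the source, and `validate_order_response` is a hypothetical helper.

```python
from datetime import date

# Mirrors the `enum` in the spec above.
ALLOWED_STATUSES = {"pending", "processing", "shipped", "delivered"}


def validate_order_response(payload):
    """Return a list of schema violations; an empty list means the payload is valid."""
    errors = []
    status = payload.get("status")
    if status is not None and status not in ALLOWED_STATUSES:
        errors.append(f"status: {status!r} is not an allowed value")
    delivery = payload.get("estimatedDelivery")
    if delivery is not None:
        try:
            date.fromisoformat(delivery)  # OpenAPI `format: date` is YYYY-MM-DD
        except (TypeError, ValueError):
            errors.append(f"estimatedDelivery: {delivery!r} is not a valid date")
    return errors


print(validate_order_response({"status": "shipped", "estimatedDelivery": "2026-05-01"}))
print(validate_order_response({"status": "lost"}))
```

Neither property is marked `required` in the spec, so a missing field is not reported as an error.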

10. Action Groups: extending what an agent can do

Action Groups let an agent perform specific actions by connecting to AWS Lambda functions or API endpoints. Each Action Group is a set of tools the agent can call.

graph TB
    AGENT["Bedrock Agent"] --> AG1["Action Group: CRM"]
    AGENT --> AG2["Action Group: Inventory"]
    AGENT --> AG3["Action Group: Notifications"]
    AGENT --> KB["Knowledge Base"]

    AG1 --> L1["Lambda: searchCustomer"]
    AG1 --> L2["Lambda: updateCustomer"]
    AG2 --> L3["Lambda: checkStock"]
    AG2 --> L4["Lambda: reserveItem"]
    AG3 --> L5["Lambda: sendEmail"]
    AG3 --> L6["Lambda: sendSMS"]

    style AGENT fill:#e94560,stroke:#fff,color:#fff
    style AG1 fill:#2c3e50,stroke:#fff,color:#fff
    style AG2 fill:#2c3e50,stroke:#fff,color:#fff
    style AG3 fill:#2c3e50,stroke:#fff,color:#fff
    style KB fill:#16213e,stroke:#fff,color:#fff
    style L1 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style L2 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style L3 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style L4 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style L5 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style L6 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50

Agent using Action Groups to interact with backend systems

11. Observability: monitoring agents in production

An agent is not a traditional API: its output is not deterministic, and a single request may pass through many reasoning steps and call many tools. Observability for agentic systems requires tracing at the level of each reasoning step.

import boto3

bedrock_runtime = boto3.client('bedrock-agent-runtime')

# Enable trace to follow the reasoning chain
response = bedrock_runtime.invoke_agent(
    agentId='my-agent',       # placeholder: use your real agent ID
    agentAliasId='ALIAS_ID',  # placeholder: invoke_agent requires an alias ID
    sessionId='session-001',
    enableTrace=True,
    inputText='Check order ORD-12345 and send the customer an update'
)

# The trace output includes:
# 1. PreProcessingTrace - Input parsing & classification
# 2. OrchestrationTrace - Reasoning steps, tool selection
# 3. PostProcessingTrace - Response formatting
# 4. GuardrailTrace - Safety filter results
# 5. FailureTrace - Error details, if any
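
The response from `invoke_agent` is a stream of events rather than a single payload. The sketch below walks that stream, joins the answer text, and counts how often each trace type appears. The event shape used here (a `chunk` event carrying `bytes`, and a `trace` event wrapping a dictionary keyed by trace type) is an assumption based on the boto3 response format, and `summarize_completion` is a hypothetical helper, so verify both against the SDK version you use.

```python
from collections import Counter


def summarize_completion(events):
    """Collect the answer text and count trace types from an invoke_agent event stream."""
    answer_parts = []
    trace_counts = Counter()
    for event in events:
        if "chunk" in event:
            answer_parts.append(event["chunk"]["bytes"].decode("utf-8"))
        elif "trace" in event:
            # Each trace event wraps one trace kind, e.g. {"orchestrationTrace": {...}}
            for kind in event["trace"].get("trace", {}):
                trace_counts[kind] += 1
    return "".join(answer_parts), dict(trace_counts)


# Stand-in events for local experimentation; with a live call you would pass
# response["completion"] instead.
sample_events = [
    {"trace": {"trace": {"preProcessingTrace": {}}}},
    {"trace": {"trace": {"orchestrationTrace": {}}}},
    {"trace": {"trace": {"orchestrationTrace": {}}}},
    {"chunk": {"bytes": b"Order ORD-12345 has shipped."}},
]
print(summarize_completion(sample_events))
```

A high `orchestrationTrace` count for a simple request is often the first hint that an agent is looping or calling tools it does not need.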

CloudWatch integration

AgentCore automatically sends metrics and traces to Amazon CloudWatch. You can build monitoring dashboards for average latency per step, number of tool calls, guardrail block rate, and LLM inference cost per session.

12. Comparison with other AI agent platforms

| Criterion | Bedrock AgentCore | Azure AI Foundry | LangGraph Cloud |
| --- | --- | --- | --- |
| Deployment | Fully serverless | Container-based | Managed hosting |
| Multi-model | Claude, Llama, Mistral, Titan... | GPT-4o, Phi, Llama | Any LLM |
| Built-in RAG | Knowledge Base (managed) | Azure AI Search | Self-integrated |
| Agent Identity | AgentCore Identity + IdP | Managed Identity | Not built in |
| Multi-Agent | Supervisor + 10 collaborators | Semantic Kernel orchestration | Graph-based workflows |
| Guardrails | 6 safeguard policies built-in | Content Safety API | Self-implemented |
| Protocols | MCP, A2A, HTTP | HTTP, gRPC | HTTP |
| Lock-in | Framework-agnostic | Azure ecosystem | LangChain ecosystem |

13. Production architecture: e-commerce support agent

To show how the building blocks fit together in practice, consider the architecture of a customer support agent for e-commerce.

graph TB
    CUST["Customer<br/>Web / Mobile / Chat"] --> ALB["Application Load Balancer"]
    ALB --> API["API Gateway"]
    API --> RT["AgentCore Runtime"]

    RT --> SUP["Supervisor Agent"]
    SUP --> FAQ["FAQ Agent<br/>+ Knowledge Base"]
    SUP --> ORD["Order Agent<br/>+ Action Groups"]
    SUP --> ESC["Escalation Agent<br/>+ SES / SNS"]

    RT --> MEM["AgentCore Memory"]
    RT --> GR["Guardrails"]
    RT --> OBS["CloudWatch<br/>Observability"]
    RT --> AID["AgentCore Identity<br/>→ Okta SSO"]

    FAQ --> S3["S3: Product Docs"]
    ORD --> DDB["DynamoDB: Orders"]
    ORD --> PAY["Stripe API"]
    ESC --> AGENT_DESK["Human Agent Queue"]

    style CUST fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style RT fill:#e94560,stroke:#fff,color:#fff
    style SUP fill:#2c3e50,stroke:#fff,color:#fff
    style FAQ fill:#16213e,stroke:#fff,color:#fff
    style ORD fill:#16213e,stroke:#fff,color:#fff
    style ESC fill:#16213e,stroke:#fff,color:#fff
    style GR fill:#e94560,stroke:#fff,color:#fff
    style MEM fill:#2c3e50,stroke:#fff,color:#fff
    style OBS fill:#2c3e50,stroke:#fff,color:#fff
    style AID fill:#2c3e50,stroke:#fff,color:#fff
    style ALB fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style API fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style S3 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style DDB fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style PAY fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style AGENT_DESK fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50

Production architecture: e-commerce support agent on Bedrock AgentCore

14. Best practices for production

Agent design

Each agent should own a single, clearly defined task. If an agent's instruction exceeds 500 words, split it into multiple agents. Instructions must be specific; avoid vague directives such as "help the customer".

Guardrails first

Always set up Guardrails BEFORE exposing an agent externally. Start with a restrictive policy, then relax it gradually based on real data. Enable contextual grounding with a threshold ≥ 0.7 to reduce hallucinations.

Observability from day one

Enable tracing for 100% of requests in the early phase. Create CloudWatch alarms for P99 latency > 10s, guardrail block rate > 15%, and error rate > 2%. Lower the sampling rate once the system is stable.
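
The three alarm thresholds can be prototyped locally before wiring them into CloudWatch. The following sketch evaluates them over one window of request data. It is plain Python rather than CloudWatch configuration, and the function names and the nearest-rank percentile method are assumptions made for illustration.

```python
def p99(values):
    """Nearest-rank 99th percentile, using integer math to avoid float rounding."""
    if not values:
        raise ValueError("p99 needs at least one value")
    ordered = sorted(values)
    rank = (99 * len(ordered) + 99) // 100  # ceil(0.99 * n)
    return ordered[rank - 1]


def breached_alarms(latencies_s, guardrail_blocks, errors, total_requests):
    """Return the names of the suggested thresholds that this window exceeds."""
    breached = []
    if p99(latencies_s) > 10:
        breached.append("latency_p99")
    if guardrail_blocks / total_requests > 0.15:
        breached.append("guardrail_block_rate")
    if errors / total_requests > 0.02:
        breached.append("error_rate")
    return breached


# 100 requests: two slow outliers, 10 guardrail blocks, 3 errors.
window = [1.0] * 98 + [12.0, 15.0]
print(breached_alarms(window, guardrail_blocks=10, errors=3, total_requests=100))
```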

Memory and privacy

Configure a TTL for long-term memory that matches your data retention policy. Use PII redaction in Guardrails so that memory does not store sensitive information.

Continuous testing

Use AgentCore Evaluations to run continuous quality scoring. Build a test suite that covers edge cases: prompt injection, off-topic queries, and multi-step reasoning failures.

15. Cost and pricing

Bedrock AgentCore uses a pay-per-use model, with charges based on:

LLM inference: billed per input/output token, varying by model (Claude Sonnet is cheaper than Claude Opus)
Knowledge Base: embedding cost plus vector store cost (OpenSearch Serverless)
Runtime: compute cost based on active session time
Guardrails: billed by the number of text units processed
Memory: storage cost for long-term memory

Cost optimization tips

Use Prompt Caching to cut inference cost by up to 90% for repeated system prompts. Choose the right model: not every task needs Claude Opus, and many routing or classification tasks only need Haiku. Combine this with Intelligent Prompt Routing to automatically select the best model for each request's complexity.
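
A quick calculation shows how a caching discount affects the total bill. Every number below is an assumption for illustration, not published pricing: the per-million-token prices, the 90% discount on cached input tokens, and the token counts are all hypothetical, and cache-write surcharges are ignored.

```python
def inference_cost(input_tokens, output_tokens, cached_tokens=0,
                   price_in=3.00, price_out=15.00, cache_discount=0.90):
    """Cost in USD for one request; prices are per 1M tokens and purely illustrative."""
    uncached_tokens = input_tokens - cached_tokens
    cached_price = price_in * (1 - cache_discount)
    total = (
        uncached_tokens * price_in
        + cached_tokens * cached_price
        + output_tokens * price_out
    ) / 1_000_000
    return round(total, 6)


# One request: 4,000-token system prompt + 500-token user message, 300-token answer.
without_cache = inference_cost(4500, 300)
with_cache = inference_cost(4500, 300, cached_tokens=4000)
print(without_cache, with_cache)
print(f"saving: {1 - with_cache / without_cache:.0%}")
```

In this example the total saving is smaller than the discount on the cached portion, because output tokens and uncached input tokens are still billed at full price. The "up to 90%" figure applies only to the tokens that are actually served from cache.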

Conclusion

Amazon Bedrock AgentCore reflects the maturing of the AI agent ecosystem, from the demo stage to production grade. Because the architecture is modular, you are not forced to adopt the whole platform: you can start with Runtime + Guardrails and expand to Memory, Identity, and Multi-Agent as your needs grow.

What sets AgentCore apart is not any single feature; Azure AI Foundry and LangGraph Cloud offer similar capabilities. Its strength is that everything is managed, serverless, and framework-agnostic: you focus on the agent logic and let AWS handle the production infrastructure.

References

#AWS Bedrock #AI Agent #Agentic AI #Serverless #Multi-Agent #RAG #system design

# AWS Bedrock AgentCore — Nền tảng Serverless triển khai AI Agent Production

Xây dựng một AI Agent chạy được trong môi trường thử nghiệm là một chuyện. Đưa agent đó lên production với hàng triệu request, bảo mật chặt chẽ, khả năng ghi nhớ ngữ cảnh xuyên phiên, và tích hợp với hệ thống doanh nghiệp hiện có — đó mới là bài toán thực sự. **Amazon Bedrock AgentCore** ra đời để giải quyết chính xác vấn đề này: một nền tảng serverless cung cấp mọi building block cần thiết để triển khai AI Agent ở quy mô production.

3 API calls để deploy agent lên production

10 collaborator agents tối đa / supervisor

6 safeguard policies trong Guardrails

0 infrastructure cần quản lý (serverless)

## 1. Vì sao cần một nền tảng chuyên biệt cho AI Agent?

```
graph TB
    subgraph "Effort xây dựng AI Agent"
        A["20% — Agent Logic  
LLM, Tools, Prompts"]
        B["80% — Production Infrastructure  
Security, Memory, Observability,  
Scaling, Identity, Deployment"]
    end

style A fill:#4CAF50,stroke:#fff,color:#fff
    style B fill:#e94560,stroke:#fff,color:#fff

```
Tỷ lệ effort: Agent logic vs Infrastructure

#### Thực tế từ production

Theo AWS, các team doanh nghiệp trung bình mất **3-6 tháng** chỉ để xây dựng infrastructure xung quanh AI Agent trước khi agent có thể xử lý workload thực. Bedrock AgentCore rút ngắn thời gian này xuống còn **vài ngày**.

## 2. Kiến trúc tổng quan Amazon Bedrock AgentCore

```
graph TB
    DEV["Developer / Agent Framework"]

subgraph "Amazon Bedrock AgentCore"
        RT["Runtime  
Serverless Execution"]
        GW["Gateway  
Tool Access & Auth"]
        MEM["Memory  
Session & Long-term"]
        ID["Identity  
Agent Authentication"]
        OBS["Observability  
Tracing & Metrics"]
        EVAL["Evaluations  
Quality Scoring"]
        POL["Policy  
Action Control"]
        BR["Browser  
Web Interaction"]
        CI["Code Interpreter  
Code Execution"]
    end

DEV --> RT
    RT --> GW
    RT --> MEM
    RT --> ID
    RT --> OBS
    RT --> EVAL
    RT --> POL
    RT --> BR
    RT --> CI

style DEV fill:#e94560,stroke:#fff,color:#fff
    style RT fill:#2c3e50,stroke:#fff,color:#fff
    style GW fill:#16213e,stroke:#fff,color:#fff
    style MEM fill:#16213e,stroke:#fff,color:#fff
    style ID fill:#16213e,stroke:#fff,color:#fff
    style OBS fill:#16213e,stroke:#fff,color:#fff
    style EVAL fill:#16213e,stroke:#fff,color:#fff
    style POL fill:#16213e,stroke:#fff,color:#fff
    style BR fill:#16213e,stroke:#fff,color:#fff
    style CI fill:#16213e,stroke:#fff,color:#fff

```
Kiến trúc modular của Amazon Bedrock AgentCore

| Service | Chức năng | Vấn đề giải quyết |
| --- | --- | --- |
| **Runtime** | Môi trường serverless với session isolation | Triển khai agent không cần quản lý server |
| **Gateway** | Unified tool access qua OpenAPI specs | Kết nối agent với APIs, xử lý auth tự động |
| **Memory** | Session state + long-term memory | Agent nhớ ngữ cảnh xuyên phiên làm việc |
| **Identity** | Agent authentication với IdP (Okta, Entra ID) | Agent xác thực an toàn với third-party services |
| **Observability** | Distributed tracing, metrics, logs | Debug và monitor agent behavior |
| **Evaluations** | Continuous quality scoring | Đánh giá chất lượng response liên tục |
| **Policy** | Fine-grained action control | Kiểm soát agent được phép làm gì |

## 3. Runtime — Triển khai Agent không cần hạ tầng

#### Managed Agent Harness (Mới 2026)

Tính năng mới nhất cho phép bạn deploy một agent hoàn chỉnh chỉ với **3 API calls** — không cần viết orchestration infrastructure. AgentCore CLI hỗ trợ toàn bộ development lifecycle từ init, test, đến deploy.

```python
# Ví dụ: Deploy agent với AgentCore CLI
# Bước 1: Khởi tạo project
agentcore init my-support-agent --framework strands

# Bước 2: Định nghĩa agent logic
# agent.py sử dụng bất kỳ framework nào (Strands, LangGraph, CrewAI...)

# Bước 3: Deploy lên AgentCore Runtime
agentcore deploy --name my-support-agent \
    --memory enabled \
    --guardrails my-guardrail-id \
    --identity-provider okta
```
Điểm khác biệt lớn nhất của AgentCore Runtime: nó **framework-agnostic**. Bạn có thể sử dụng bất kỳ agent framework nào — Strands Agents SDK, LangGraph, CrewAI, AutoGen — và Runtime sẽ xử lý phần deploy, scale, và monitor.

## 4. Knowledge Base — RAG được quản lý hoàn toàn

```
graph LR
    S3["S3 Bucket  
Documents"] --> CHUNK["Auto Chunking  
Semantic / Fixed"]
    CHUNK --> EMB["Embedding Model  
Titan / Cohere"]
    EMB --> VS["Vector Store  
OpenSearch / Pinecone"]

Q["User Query"] --> AGENT["Bedrock Agent"]
    AGENT --> VS
    VS --> CTX["Retrieved Context"]
    CTX --> AGENT
    AGENT --> R["Grounded Response"]

style S3 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style AGENT fill:#e94560,stroke:#fff,color:#fff
    style VS fill:#2c3e50,stroke:#fff,color:#fff
    style R fill:#4CAF50,stroke:#fff,color:#fff
    style CHUNK fill:#16213e,stroke:#fff,color:#fff
    style EMB fill:#16213e,stroke:#fff,color:#fff
    style Q fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style CTX fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50

```
Luồng RAG trong Amazon Bedrock Knowledge Base

### Chunking strategies

| Strategy | Mô tả | Khi nào dùng |
| --- | --- | --- |
| **Fixed-size** | Chia theo số tokens cố định | Dữ liệu đồng nhất, FAQ |
| **Semantic** | Chia theo ngữ nghĩa, giữ nguyên context | Tài liệu dài, technical docs |
| **Hierarchical** | Nhiều tầng chunk (parent-child) | Tài liệu có cấu trúc phức tạp |
| **No chunking** | Mỗi file là một chunk | File nhỏ, mỗi file một chủ đề |

## 5. Guardrails — Kiểm soát output và bảo mật

Guardrails là lớp phòng thủ quan trọng nhất trong production. Amazon Bedrock Guardrails cung cấp 6 safeguard policies để kiểm soát cả input lẫn output của agent.

```
graph LR
    INPUT["User Input"] --> G1["Content Filter"]
    G1 --> G2["Denied Topics"]
    G2 --> G3["Word Filter"]
    G3 --> G4["PII Redaction"]
    G4 --> G5["Prompt Attack Detection"]
    G5 --> LLM["LLM Processing"]
    LLM --> G6["Contextual Grounding"]
    G6 --> OUTPUT["Safe Response"]

style INPUT fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style LLM fill:#2c3e50,stroke:#fff,color:#fff
    style OUTPUT fill:#4CAF50,stroke:#fff,color:#fff
    style G1 fill:#e94560,stroke:#fff,color:#fff
    style G2 fill:#e94560,stroke:#fff,color:#fff
    style G3 fill:#e94560,stroke:#fff,color:#fff
    style G4 fill:#e94560,stroke:#fff,color:#fff
    style G5 fill:#e94560,stroke:#fff,color:#fff
    style G6 fill:#e94560,stroke:#fff,color:#fff

```
Pipeline xử lý 6 lớp Guardrails

#### Lưu ý về Guardrails cho Code

### Ví dụ cấu hình Guardrails

```json
{
  "name": "production-guardrail",
  "contentPolicyConfig": {
    "filtersConfig": [
      { "type": "SEXUAL", "inputStrength": "HIGH", "outputStrength": "HIGH" },
      { "type": "VIOLENCE", "inputStrength": "HIGH", "outputStrength": "HIGH" },
      { "type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH" },
      { "type": "INSULTS", "inputStrength": "MEDIUM", "outputStrength": "HIGH" }
    ]
  },
  "topicPolicyConfig": {
    "topicsConfig": [
      {
        "name": "competitor-comparison",
        "definition": "Câu hỏi so sánh sản phẩm với đối thủ cạnh tranh",
        "type": "DENY"
      }
    ]
  },
  "sensitiveInformationPolicyConfig": {
    "piiEntitiesConfig": [
      { "type": "EMAIL", "action": "ANONYMIZE" },
      { "type": "PHONE", "action": "ANONYMIZE" },
      { "type": "CREDIT_DEBIT_CARD_NUMBER", "action": "BLOCK" }
    ]
  },
  "contextualGroundingPolicyConfig": {
    "filtersConfig": [
      { "type": "GROUNDING", "threshold": 0.7 },
      { "type": "RELEVANCE", "threshold": 0.7 }
    ]
  }
}
```

## 6. Multi-Agent Collaboration — Đội hình Agent chuyên biệt

Khi một agent đơn lẻ không đủ khả năng xử lý các task phức tạp, Bedrock cho phép bạn xây dựng một đội hình agents phối hợp với nhau. Kiến trúc supervisor-collaborator hỗ trợ tối đa **10 collaborator agents** cho mỗi supervisor.

```
graph TB
    USER["User Request"] --> SUP["Supervisor Agent  
Điều phối & Tổng hợp"]

SUP --> A1["Agent 1  
Customer Lookup"]
    SUP --> A2["Agent 2  
Order Processing"]
    SUP --> A3["Agent 3  
Inventory Check"]
    SUP --> A4["Agent 4  
Payment Processing"]

A1 --> DB["CRM Database"]
    A2 --> OMS["Order Management"]
    A3 --> WMS["Warehouse System"]
    A4 --> PAY["Payment Gateway"]

A1 --> SUP
    A2 --> SUP
    A3 --> SUP
    A4 --> SUP
    SUP --> RESP["Consolidated Response"]

style USER fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style SUP fill:#e94560,stroke:#fff,color:#fff
    style A1 fill:#2c3e50,stroke:#fff,color:#fff
    style A2 fill:#2c3e50,stroke:#fff,color:#fff
    style A3 fill:#2c3e50,stroke:#fff,color:#fff
    style A4 fill:#2c3e50,stroke:#fff,color:#fff
    style RESP fill:#4CAF50,stroke:#fff,color:#fff
    style DB fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style OMS fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style WMS fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style PAY fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50

```
Kiến trúc Multi-Agent với Supervisor pattern

### Hai chế độ điều phối

#### Supervisor with Routing

#### Supervisor with Orchestration

```python
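import boto3

bedrock = boto3.client('bedrock-agent')

# Placeholders (assumptions): replace with your own IAM role and collaborator alias
AGENT_ROLE_ARN = 'arn:aws:iam::123456789012:role/BedrockAgentRole'
CUSTOMER_AGENT_ALIAS_ARN = (
    'arn:aws:bedrock:us-east-1:123456789012:agent-alias/AGENT_ID/ALIAS_ID'
)

# Create the supervisor agent
supervisor = bedrock.create_agent(
    agentName='ecommerce-supervisor',
    agentResourceRoleArn=AGENT_ROLE_ARN,
    agentCollaboration='SUPERVISOR',  # or 'SUPERVISOR_ROUTER' for routing mode
    instruction="""You are the supervisor coordinating a team of e-commerce agents.
    Analyze each customer request and coordinate the specialized agents
    to handle it: order lookup, inventory checks, and payment processing.""",
    # The model ID (or inference profile) available to you depends on your region
    foundationModel='anthropic.claude-sonnet-4-20250514-v1:0'
)

# Associate a collaborator agent; it is referenced by its alias ARN
bedrock.associate_agent_collaborator(
    agentId=supervisor['agent']['agentId'],
    agentVersion='DRAFT',
    agentDescriptor={'aliasArn': CUSTOMER_AGENT_ALIAS_ARN},
    collaborationInstruction='Call this agent when customer information needs to be looked up',
    collaboratorName='CustomerLookup',
    relayConversationHistory='TO_COLLABORATOR'
)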
```
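
The difference between the two coordination modes named above can be shown with a toy sketch that needs no AWS account. Everything here is hypothetical: the collaborators are plain functions, and keyword matching stands in for the model-driven decision a real supervisor makes. Only the limit of 10 collaborators comes from the article.

```python
from typing import Callable, Dict, List

# Hypothetical collaborator: maps a request string to a reply string.
Collaborator = Callable[[str], str]

MAX_COLLABORATORS = 10  # per-supervisor limit stated in the article


def make_team(agents: Dict[str, Collaborator]) -> Dict[str, Collaborator]:
    """Reject teams larger than the collaborator limit."""
    if len(agents) > MAX_COLLABORATORS:
        raise ValueError(f"at most {MAX_COLLABORATORS} collaborators per supervisor")
    return dict(agents)


def route(request: str, team: Dict[str, Collaborator], keywords: Dict[str, str]) -> str:
    """Routing mode: hand the whole request to the first matching collaborator."""
    lowered = request.lower()
    for keyword, agent_name in keywords.items():
        if keyword in lowered:
            return team[agent_name](request)
    raise LookupError("no collaborator matched the request")


def orchestrate(request: str, team: Dict[str, Collaborator], plan: List[str]) -> str:
    """Orchestration mode: call several collaborators in order and merge the replies."""
    replies = [f"{name}: {team[name](request)}" for name in plan]
    return " | ".join(replies)


team = make_team({
    "CustomerLookup": lambda request: "customer found",
    "InventoryCheck": lambda request: "3 in stock",
})
keywords = {"stock": "InventoryCheck", "customer": "CustomerLookup"}
```

With this setup, `route("Is item X in stock?", team, keywords)` touches one collaborator, while `orchestrate(..., plan=["CustomerLookup", "InventoryCheck"])` calls both and joins their answers. Routing is cheaper per request; orchestration handles requests that span several domains.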

## 7. Identity: Authenticating Agents with External Systems

AgentCore Identity addresses the challenge of authenticating an agent to external systems by assigning the agent its own identity and integrating with a corporate Identity Provider (IdP).

```
graph LR
    AGENT["AI Agent"] --> ACID["AgentCore Identity"]
    ACID --> IDP["Corporate IdP<br/>Okta / Entra ID / Cognito"]
    IDP --> TOKEN["OAuth Token"]
    TOKEN --> AGENT
    AGENT --> SLACK["Slack API"]
    AGENT --> GH["GitHub API"]
    AGENT --> JIRA["Jira API"]

    style AGENT fill:#e94560,stroke:#fff,color:#fff
    style ACID fill:#2c3e50,stroke:#fff,color:#fff
    style IDP fill:#16213e,stroke:#fff,color:#fff
    style TOKEN fill:#4CAF50,stroke:#fff,color:#fff
    style SLACK fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style GH fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style JIRA fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
```
Agent authentication flow through AgentCore Identity

#### Best Practice: Principle of Least Privilege
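
One way to picture least privilege for an agent is an allow-list of OAuth scopes per integration, checked before any token is requested. The sketch below is an assumption for illustration only: `ALLOWED_SCOPES`, `authorize`, and the scope strings are hypothetical and are not part of the AgentCore Identity API.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical allow-list: the only scopes each integration may request.
ALLOWED_SCOPES: Dict[str, FrozenSet[str]] = {
    "slack": frozenset({"chat:write"}),
    "github": frozenset({"repo:read"}),
    "jira": frozenset({"read:jira-work"}),
}


def authorize(tool: str, requested_scopes: Set[str]) -> FrozenSet[str]:
    """Return the scopes to request, or raise if any exceeds the allow-list."""
    allowed = ALLOWED_SCOPES.get(tool)
    if allowed is None:
        raise PermissionError(f"tool {tool!r} is not registered")
    excess = set(requested_scopes) - allowed
    if excess:
        raise PermissionError(f"scopes not permitted for {tool}: {sorted(excess)}")
    return frozenset(requested_scopes)
```

An agent that only posts messages to Slack would pass `authorize("slack", {"chat:write"})`, while a request for a write scope on GitHub would be rejected before it ever reaches the IdP.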

## 8. Memory: Agents That Remember Across Sessions

AgentCore Memory automatically manages two types of memory for an agent:

| Memory type | Scope | Use case |
| --- | --- | --- |
| **Session Memory** | Within a single conversation session | Keeps the context of the current conversation |
| **Long-term Memory** | Across multiple sessions | Remembers user preferences and interaction history |

```python
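import boto3

# Memory is managed automatically by AgentCore
# The agent can query memory through the runtime API
bedrock_runtime = boto3.client('bedrock-agent-runtime')

# Example: the agent recalls a customer's preferences
# (agentId and agentAliasId are placeholders; use the IDs of your own agent)
response = bedrock_runtime.invoke_agent(
    agentId='support-agent',
    agentAliasId='support-agent-alias',
    sessionId='user-12345',
    enableTrace=True,
    memoryId='user-12345-memory',  # Long-term memory identifier
    inputText='I would like to change my service plan'
)

# From long-term memory the agent already knows that:
# - The user has been on the Premium plan since 2024
# - The user asked about downgrading twice before
# - The user prefers to communicate in Vietnamese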
```
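
The split between session memory and long-term memory in the table above can be modelled with a small in-memory class. This is a toy illustration under stated assumptions, not how AgentCore Memory stores data: the class `ToyMemory` and its methods are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


class ToyMemory:
    """Illustrative only: session memory is per session, long-term memory per memory ID."""

    def __init__(self) -> None:
        self._sessions: Dict[str, List[str]] = defaultdict(list)
        self._long_term: Dict[str, Dict[str, str]] = defaultdict(dict)

    def add_turn(self, session_id: str, text: str) -> None:
        """Append one conversation turn to the session's context."""
        self._sessions[session_id].append(text)

    def end_session(self, session_id: str) -> None:
        """Discard the session context once the conversation is over."""
        self._sessions.pop(session_id, None)

    def remember(self, memory_id: str, key: str, value: str) -> None:
        """Store a fact that should survive across sessions."""
        self._long_term[memory_id][key] = value

    def context(self, session_id: str, memory_id: str) -> dict:
        """Return what the agent would see at the start of a turn."""
        return {
            "turns": list(self._sessions.get(session_id, [])),
            "facts": dict(self._long_term.get(memory_id, {})),
        }
```

After a session ends, a new session for the same `memory_id` starts with an empty turn list but still sees the stored facts, which is the behaviour the comments in the example above describe.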

## 9. Gateway: Connecting Agents to Any API

#### How Gateway works

```yaml
# Example OpenAPI spec for the "get_order_status" tool
openapi: 3.0.0
info:
  title: Order Management API
  version: 1.0.0
paths:
  /orders/{orderId}:
    get:
      operationId: getOrderStatus
      summary: Get the status of an order
      parameters:
        - name: orderId
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: Order details
          content:
            application/json:
              schema:
                type: object
                properties:
                  status:
                    type: string
                    enum: [pending, processing, shipped, delivered]
                  estimatedDelivery:
                    type: string
                    format: date
```
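
To show why an OpenAPI spec is enough to describe a tool, here is a sketch of how a gateway-like layer could flatten each operation into a tool definition. This is one plausible approach, not AgentCore Gateway's actual implementation: the function `openapi_to_tools` and the output shape are assumptions, and the spec is written as a Python dict so the example has no YAML dependency.

```python
from typing import Any, Dict, List

# The relevant part of the spec above, as parsed YAML would look in Python.
SPEC: Dict[str, Any] = {
    "paths": {
        "/orders/{orderId}": {
            "get": {
                "operationId": "getOrderStatus",
                "summary": "Get the status of an order",
                "parameters": [
                    {"name": "orderId", "in": "path", "required": True,
                     "schema": {"type": "string"}},
                ],
            }
        }
    }
}


def openapi_to_tools(spec: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten each OpenAPI operation into a tool description (hypothetical shape)."""
    tools = []
    for path, methods in spec.get("paths", {}).items():
        for method, op in methods.items():
            params = op.get("parameters", [])
            tools.append({
                "name": op["operationId"],
                "description": op.get("summary", ""),
                "method": method.upper(),
                "path": path,
                "input_schema": {
                    "type": "object",
                    "properties": {p["name"]: p.get("schema", {}) for p in params},
                    "required": [p["name"] for p in params if p.get("required")],
                },
            })
    return tools
```

The `operationId` becomes the tool name and the `summary` becomes the description the model reads when deciding which tool to call, so both are worth writing carefully.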

## 10. Action Groups: Extending What an Agent Can Do

```
graph TB
    AGENT["Bedrock Agent"] --> AG1["Action Group: CRM"]
    AGENT --> AG2["Action Group: Inventory"]
    AGENT --> AG3["Action Group: Notifications"]
    AGENT --> KB["Knowledge Base"]

    AG1 --> L1["Lambda: searchCustomer"]
    AG1 --> L2["Lambda: updateCustomer"]
    AG2 --> L3["Lambda: checkStock"]
    AG2 --> L4["Lambda: reserveItem"]
    AG3 --> L5["Lambda: sendEmail"]
    AG3 --> L6["Lambda: sendSMS"]

    style AGENT fill:#e94560,stroke:#fff,color:#fff
    style AG1 fill:#2c3e50,stroke:#fff,color:#fff
    style AG2 fill:#2c3e50,stroke:#fff,color:#fff
    style AG3 fill:#2c3e50,stroke:#fff,color:#fff
    style KB fill:#16213e,stroke:#fff,color:#fff
    style L1 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style L2 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style L3 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style L4 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style L5 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style L6 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50

```
An agent uses Action Groups to interact with other systems
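
Each Lambda box in the diagram receives a structured event from the agent and must answer in a fixed envelope. The handler below is a minimal sketch for a `checkStock` function behind an action group defined with an OpenAPI schema. The stock table, the `sku` parameter, and the status codes are assumptions made for illustration; the event and response envelope follow the format documented for Bedrock Agents Lambda functions.

```python
import json
from typing import Any, Dict

# Hypothetical in-memory stock table standing in for a real inventory system.
_STOCK = {"SKU-1": 12, "SKU-2": 0}


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Handle a checkStock call coming from an agent action group."""
    # Parameters arrive as a list of {"name", "type", "value"} objects.
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    sku = params.get("sku", "")
    found = sku in _STOCK
    body = {"sku": sku, "quantity": _STOCK[sku]} if found else {"error": "unknown sku"}

    # The agent expects the same actionGroup, apiPath, and httpMethod echoed back.
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event["actionGroup"],
            "apiPath": event["apiPath"],
            "httpMethod": event["httpMethod"],
            "httpStatusCode": 200 if found else 404,
            "responseBody": {"application/json": {"body": json.dumps(body)}},
        },
    }
```

Keeping one small function per action, as the diagram suggests, makes it easier to give each Lambda its own narrowly scoped IAM role.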

## 11. Observability: Monitoring Agents in Production

```python
import boto3

bedrock_runtime = boto3.client('bedrock-agent-runtime')

# Enable trace to follow the agent's reasoning chain
# (agentId and agentAliasId are placeholders; use the IDs of your own agent)
response = bedrock_runtime.invoke_agent(
    agentId='my-agent',
    agentAliasId='my-agent-alias',
    sessionId='session-001',
    enableTrace=True,
    inputText='Check order ORD-12345 and send the customer an update'
)

# Trace events in the response stream can include:
# 1. preProcessingTrace - Input parsing & classification
# 2. orchestrationTrace - Reasoning steps, tool selection
# 3. postProcessingTrace - Response formatting
# 4. guardrailTrace - Safety filter results
# 5. failureTrace - Error details, if any
```
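
The response of `invoke_agent` is an event stream, so the answer text and the traces have to be collected by iterating over it. The helper below is a minimal sketch of that loop; the name `read_agent_stream` is hypothetical, and it assumes the stream yields `chunk` events carrying UTF-8 bytes and `trace` events wrapping one typed trace each.

```python
from typing import Any, Dict, Iterable, List, Tuple


def read_agent_stream(events: Iterable[Dict[str, Any]]) -> Tuple[str, List[str]]:
    """Collect the answer text and the trace types seen in an agent event stream."""
    raw_parts: List[bytes] = []
    trace_types: List[str] = []
    for event in events:
        if "chunk" in event:
            # Answer text arrives as bytes and may be split across several chunks.
            raw_parts.append(event["chunk"]["bytes"])
        elif "trace" in event:
            # Each trace event wraps one typed trace, e.g. "orchestrationTrace".
            trace_types.extend(event["trace"].get("trace", {}).keys())
    # Decode once at the end so a multi-byte character split across chunks is safe.
    return b"".join(raw_parts).decode("utf-8"), trace_types
```

With a real call this would be used as `answer, traces = read_agent_stream(response["completion"])`. Logging the list of trace types per request is a cheap first signal for spotting requests that ended in a `failureTrace`.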

#### CloudWatch integration

## 12. Comparison with Other AI Agent Platforms

| Criterion | Bedrock AgentCore | Azure AI Foundry | LangGraph Cloud |
| --- | --- | --- | --- |
| **Deployment** | Fully serverless | Container-based | Managed hosting |
| **Multi-model** | Claude, Llama, Mistral, Titan... | GPT-4o, Phi, Llama | Any LLM |
| **Built-in RAG** | Knowledge Base (managed) | Azure AI Search | Self-integrated |
| **Agent Identity** | AgentCore Identity + IdP | Managed Identity | Not built in |
| **Multi-Agent** | Supervisor + 10 collaborators | Semantic Kernel orchestration | Graph-based workflows |
| **Guardrails** | 6 safeguard policies built-in | Content Safety API | Self-implemented |
| **Protocols** | MCP, A2A, HTTP | HTTP, gRPC | HTTP |
| **Lock-in** | Framework-agnostic | Azure ecosystem | LangChain ecosystem |

## 13. Production Architecture: E-commerce Support Agent

To illustrate how the building blocks combine in practice, consider the architecture of a Customer Support Agent system for e-commerce.

```
graph TB
    CUST["Customer<br/>Web / Mobile / Chat"] --> ALB["Application Load Balancer"]
    ALB --> API["API Gateway"]
    API --> RT["AgentCore Runtime"]

    RT --> SUP["Supervisor Agent"]
    SUP --> FAQ["FAQ Agent<br/>+ Knowledge Base"]
    SUP --> ORD["Order Agent<br/>+ Action Groups"]
    SUP --> ESC["Escalation Agent<br/>+ SES / SNS"]

    RT --> MEM["AgentCore Memory"]
    RT --> GR["Guardrails"]
    RT --> OBS["CloudWatch<br/>Observability"]
    RT --> AID["AgentCore Identity<br/>→ Okta SSO"]

    FAQ --> S3["S3: Product Docs"]
    ORD --> DDB["DynamoDB: Orders"]
    ORD --> PAY["Stripe API"]
    ESC --> AGENT_DESK["Human Agent Queue"]

    style CUST fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style RT fill:#e94560,stroke:#fff,color:#fff
    style SUP fill:#2c3e50,stroke:#fff,color:#fff
    style FAQ fill:#16213e,stroke:#fff,color:#fff
    style ORD fill:#16213e,stroke:#fff,color:#fff
    style ESC fill:#16213e,stroke:#fff,color:#fff
    style GR fill:#e94560,stroke:#fff,color:#fff
    style MEM fill:#2c3e50,stroke:#fff,color:#fff
    style OBS fill:#2c3e50,stroke:#fff,color:#fff
    style AID fill:#2c3e50,stroke:#fff,color:#fff
    style ALB fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style API fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style S3 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style DDB fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style PAY fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style AGENT_DESK fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
```
Production architecture: E-commerce Support Agent on Bedrock AgentCore

## 14. Best Practices for Production

#### Agent design

#### Guardrails first

#### Observability from day one

#### Memory and privacy

Configure a TTL for long-term memory that matches your data retention policy. Use PII redaction in Guardrails so that memory does not store sensitive information.

#### Continuous testing

Use AgentCore Evaluations to run continuous quality scoring. Build a test suite that covers edge cases: prompt injection, off-topic queries, and multi-step reasoning failures.

## 15. Cost and Pricing

Bedrock AgentCore uses a pay-per-use model, with charges based on:

- **LLM inference**: Billed per input/output token, varying by model (Claude Sonnet is cheaper than Claude Opus)
- **Knowledge Base**: Embedding fees plus the vector store (OpenSearch Serverless)
- **Runtime**: Compute fees based on active session time
- **Guardrails**: Fees based on the number of text units processed
- **Memory**: Storage fees for long-term memory

#### Cost optimization tips

Use **Prompt Caching** to cut inference cost by up to 90% for repeated system prompts. Choose the right model for each task: not every task needs Claude Opus, and many routing or classification tasks only need Haiku. Combine this with **Intelligent Prompt Routing** to automatically select the best model for the complexity of each request.
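
The effect of caching on a single request can be estimated with simple arithmetic. The function below is a rough sketch: the per-1,000-token prices are hypothetical placeholders, not AWS list prices, and it assumes cached input tokens are billed at a 90% discount while ignoring any surcharge for writing to the cache.

```python
def inference_cost(
    input_tokens: int,
    output_tokens: int,
    cached_input_tokens: int = 0,
    input_price_per_1k: float = 0.003,   # hypothetical USD price
    output_price_per_1k: float = 0.015,  # hypothetical USD price
    cache_discount: float = 0.9,         # assumed discount on cached input tokens
) -> float:
    """Estimate the cost of one request; cached input tokens are billed at a discount."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_input_tokens
    cached_cost = cached_input_tokens / 1000 * input_price_per_1k * (1 - cache_discount)
    fresh_cost = fresh_tokens / 1000 * input_price_per_1k
    output_cost = output_tokens / 1000 * output_price_per_1k
    return cached_cost + fresh_cost + output_cost
```

Under these assumed prices, a request with 10,000 input tokens and 1,000 output tokens costs about 0.045 USD without caching and about 0.0234 USD when 8,000 of the input tokens are served from cache. The discount applies only to the cached part of the input, so the overall saving depends on how much of each prompt is repeated.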

## Conclusion

What sets AgentCore apart is not any single feature, since Azure AI Foundry and LangGraph Cloud offer similar capabilities. Its strength is that everything is **managed, serverless, and framework-agnostic**: you focus on agent logic and let AWS handle the production infrastructure.

### References

- [Amazon Bedrock AgentCore — Official Page](https://aws.amazon.com/bedrock/agentcore/)
- [AgentCore Developer Guide — AWS Documentation](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/what-is-bedrock-agentcore.html)
- [Introducing Amazon Bedrock AgentCore — AWS Blog](https://aws.amazon.com/blogs/aws/introducing-amazon-bedrock-agentcore-securely-deploy-and-operate-ai-agents-at-any-scale/)
- [Multi-Agent Collaboration — AWS Bedrock Docs](https://docs.aws.amazon.com/bedrock/latest/userguide/agents-multi-agent-collaboration.html)
- [Amazon Bedrock Guardrails — AWS](https://aws.amazon.com/bedrock/guardrails/)
- [Navigating Managed AI Agent Options on Bedrock — The Scale Factory](https://scalefactory.com/blog/2026/04/09/navigating-your-managed-and-modular-ai-agent-options-using-amazon-bedrock/)
- [AI Agents Best Practices with AgentCore — AWS ML Blog](https://aws.amazon.com/blogs/machine-learning/ai-agents-in-enterprises-best-practices-with-amazon-bedrock-agentcore/)

