AWS Bedrock AgentCore — Serverless Platform for Production AI Agents
Posted on: 4/25/2026 1:13:40 AM
Table of contents
- 1. Why Do We Need a Dedicated Platform for AI Agents?
- 2. Amazon Bedrock AgentCore Architecture Overview
- 3. Runtime — Deploy Agents Without Infrastructure
- 4. Knowledge Base — Fully Managed RAG
- 5. Guardrails — Output Control and Security
- 6. Multi-Agent Collaboration — Specialized Agent Teams
- 7. Identity — Authenticating Agents with External Systems
- 8. Memory — Agents with Cross-Session Memory
- 9. Gateway — Connecting Agents to Any API
- 10. Action Groups — Extending Agent Capabilities
- 11. Observability — Monitoring Agents in Production
- 12. Comparison with Other AI Agent Platforms
- 13. Production Architecture: E-commerce Support Agent
- 14. Production Best Practices
- 15. Cost and Pricing
- Conclusion
Building an AI Agent that works in a demo is one thing. Deploying that agent to production with millions of requests, enterprise-grade security, cross-session memory, and integration with existing business systems — that's the real challenge. Amazon Bedrock AgentCore was built to solve exactly this: a serverless platform providing every building block needed to run AI Agents at production scale.
1. Why Do We Need a Dedicated Platform for AI Agents?
When building AI Agents, the "smart" part — calling LLMs, parsing results, invoking tools — typically accounts for about 20% of the effort. The remaining 80% is production concerns: security, authentication, memory management, monitoring, retry logic, rate limiting, and deployment at scale. This is why managed platforms like Bedrock AgentCore are becoming increasingly critical.
```mermaid
graph TB
    subgraph "AI Agent Development Effort"
        A["20% — Agent Logic<br/>LLM, Tools, Prompts"]
        B["80% — Production Infrastructure<br/>Security, Memory, Observability,<br/>Scaling, Identity, Deployment"]
    end
    style A fill:#4CAF50,stroke:#fff,color:#fff
    style B fill:#e94560,stroke:#fff,color:#fff
```
Effort ratio: Agent logic vs Infrastructure
Production Reality
According to AWS, enterprise teams spend an average of 3-6 months building infrastructure around AI Agents before the agent can handle real workloads. Bedrock AgentCore reduces this timeline to days.
2. Amazon Bedrock AgentCore Architecture Overview
AgentCore isn't a single service — it's a suite of modular services, each addressing a specific aspect of the production AI Agent problem. You can use the entire suite or just the parts you need.
```mermaid
graph TB
    DEV["Developer / Agent Framework"]
    subgraph "Amazon Bedrock AgentCore"
        RT["Runtime<br/>Serverless Execution"]
        GW["Gateway<br/>Tool Access & Auth"]
        MEM["Memory<br/>Session & Long-term"]
        ID["Identity<br/>Agent Authentication"]
        OBS["Observability<br/>Tracing & Metrics"]
        EVAL["Evaluations<br/>Quality Scoring"]
        POL["Policy<br/>Action Control"]
        BR["Browser<br/>Web Interaction"]
        CI["Code Interpreter<br/>Code Execution"]
    end
    DEV --> RT
    RT --> GW
    RT --> MEM
    RT --> ID
    RT --> OBS
    RT --> EVAL
    RT --> POL
    RT --> BR
    RT --> CI
    style DEV fill:#e94560,stroke:#fff,color:#fff
    style RT fill:#2c3e50,stroke:#fff,color:#fff
    style GW fill:#16213e,stroke:#fff,color:#fff
    style MEM fill:#16213e,stroke:#fff,color:#fff
    style ID fill:#16213e,stroke:#fff,color:#fff
    style OBS fill:#16213e,stroke:#fff,color:#fff
    style EVAL fill:#16213e,stroke:#fff,color:#fff
    style POL fill:#16213e,stroke:#fff,color:#fff
    style BR fill:#16213e,stroke:#fff,color:#fff
    style CI fill:#16213e,stroke:#fff,color:#fff
```
Modular architecture of Amazon Bedrock AgentCore
| Service | Function | Problem Solved |
|---|---|---|
| Runtime | Serverless environment with session isolation | Deploy agents without managing servers |
| Gateway | Unified tool access via OpenAPI specs | Connect agents to APIs with automatic auth |
| Memory | Session state + long-term memory | Agent remembers context across sessions |
| Identity | Agent authentication with IdPs (Okta, Entra ID) | Secure agent auth with third-party services |
| Observability | Distributed tracing, metrics, logs | Debug and monitor agent behavior |
| Evaluations | Continuous quality scoring | Continuously assess response quality |
| Policy | Fine-grained action control | Control what agents are allowed to do |
3. Runtime — Deploy Agents Without Infrastructure
AgentCore Runtime is the heart of the entire platform. It provides a serverless environment with session isolation — each conversation runs in a separate container, sharing no state with other sessions.
Managed Agent Harness (New in 2026)
The latest feature lets you deploy a complete agent with just three API calls — no orchestration infrastructure needed. The AgentCore CLI covers the entire development lifecycle, from init and test through deploy.
```bash
# Example: Deploy an agent with the AgentCore CLI

# Step 1: Initialize the project
agentcore init my-support-agent --framework strands

# Step 2: Define agent logic
# agent.py using any framework (Strands, LangGraph, CrewAI...)

# Step 3: Deploy to AgentCore Runtime
agentcore deploy --name my-support-agent \
  --memory enabled \
  --guardrails my-guardrail-id \
  --identity-provider okta
```
The biggest differentiator of AgentCore Runtime: it's framework-agnostic. You can use any agent framework — Strands Agents SDK, LangGraph, CrewAI, AutoGen — and Runtime handles deployment, scaling, and monitoring.
4. Knowledge Base — Fully Managed RAG
Amazon Bedrock Knowledge Base provides a complete RAG (Retrieval-Augmented Generation) pipeline: from data ingestion, chunking, embedding, to vector search — all fully managed.
```mermaid
graph LR
    S3["S3 Bucket<br/>Documents"] --> CHUNK["Auto Chunking<br/>Semantic / Fixed"]
    CHUNK --> EMB["Embedding Model<br/>Titan / Cohere"]
    EMB --> VS["Vector Store<br/>OpenSearch / Pinecone"]
    Q["User Query"] --> AGENT["Bedrock Agent"]
    AGENT --> VS
    VS --> CTX["Retrieved Context"]
    CTX --> AGENT
    AGENT --> R["Grounded Response"]
    style S3 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style AGENT fill:#e94560,stroke:#fff,color:#fff
    style VS fill:#2c3e50,stroke:#fff,color:#fff
    style R fill:#4CAF50,stroke:#fff,color:#fff
    style CHUNK fill:#16213e,stroke:#fff,color:#fff
    style EMB fill:#16213e,stroke:#fff,color:#fff
    style Q fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style CTX fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
```
RAG flow in Amazon Bedrock Knowledge Base
Chunking Strategies
| Strategy | Description | When to Use |
|---|---|---|
| Fixed-size | Split by fixed token count | Uniform data, FAQ pages |
| Semantic | Split by meaning, preserving context | Long documents, technical docs |
| Hierarchical | Multi-tier chunks (parent-child) | Complex structured documents |
| No chunking | Each file is one chunk | Small files, one topic per file |
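To make the fixed-size strategy concrete, here is a minimal, illustrative chunker — a toy sketch splitting on a word budget with overlap, not the Knowledge Base's actual implementation (which counts tokens, not words):

```python
def chunk_fixed(text: str, max_tokens: int = 50, overlap: int = 10) -> list[str]:
    """Naive fixed-size chunking: split on whitespace and emit windows of
    `max_tokens` words, each sharing `overlap` words with its predecessor.
    Assumes overlap < max_tokens."""
    words = text.split()
    if not words:
        return []
    chunks = []
    step = max_tokens - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # last window already covers the tail
    return chunks
```

The overlap is what preserves context across chunk boundaries — without it, a sentence cut in half at a boundary is unrecoverable at retrieval time.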
5. Guardrails — Output Control and Security
Guardrails are the most critical defense layer in production. Amazon Bedrock Guardrails provides 6 safeguard policies to control both agent input and output.
```mermaid
graph LR
    INPUT["User Input"] --> G1["Content Filter"]
    G1 --> G2["Denied Topics"]
    G2 --> G3["Word Filter"]
    G3 --> G4["PII Redaction"]
    G4 --> G5["Prompt Attack Detection"]
    G5 --> LLM["LLM Processing"]
    LLM --> G6["Contextual Grounding"]
    G6 --> OUTPUT["Safe Response"]
    style INPUT fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style LLM fill:#2c3e50,stroke:#fff,color:#fff
    style OUTPUT fill:#4CAF50,stroke:#fff,color:#fff
    style G1 fill:#e94560,stroke:#fff,color:#fff
    style G2 fill:#e94560,stroke:#fff,color:#fff
    style G3 fill:#e94560,stroke:#fff,color:#fff
    style G4 fill:#e94560,stroke:#fff,color:#fff
    style G5 fill:#e94560,stroke:#fff,color:#fff
    style G6 fill:#e94560,stroke:#fff,color:#fff
```
6-layer Guardrails processing pipeline
Guardrails for Code
Starting in 2026, Guardrails expanded to protect code: detecting harmful content within code elements, blocking code injection attempts, and preventing PII leakage through code structures. This is crucial if you're building coding assistant agents.
Guardrails Configuration Example
```json
{
  "name": "production-guardrail",
  "contentPolicyConfig": {
    "filtersConfig": [
      { "type": "SEXUAL", "inputStrength": "HIGH", "outputStrength": "HIGH" },
      { "type": "VIOLENCE", "inputStrength": "HIGH", "outputStrength": "HIGH" },
      { "type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH" },
      { "type": "INSULTS", "inputStrength": "MEDIUM", "outputStrength": "HIGH" }
    ]
  },
  "topicPolicyConfig": {
    "topicsConfig": [
      {
        "name": "competitor-comparison",
        "definition": "Questions comparing products with competitors",
        "type": "DENY"
      }
    ]
  },
  "sensitiveInformationPolicyConfig": {
    "piiEntitiesConfig": [
      { "type": "EMAIL", "action": "ANONYMIZE" },
      { "type": "PHONE", "action": "ANONYMIZE" },
      { "type": "CREDIT_DEBIT_CARD_NUMBER", "action": "BLOCK" }
    ]
  },
  "contextualGroundingPolicyConfig": {
    "filtersConfig": [
      { "type": "GROUNDING", "threshold": 0.7 },
      { "type": "RELEVANCE", "threshold": 0.7 }
    ]
  }
}
```
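To see what the `ANONYMIZE` vs `BLOCK` actions in that config mean in practice, here's a toy sketch of the behavior. The regex patterns are illustrative placeholders — the real Guardrails service uses managed PII detectors, not regexes:

```python
import re

# Toy patterns standing in for Guardrails' managed PII detectors.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "CREDIT_DEBIT_CARD_NUMBER": re.compile(r"\b\d{4}([ -]?\d{4}){3}\b"),
}
# Mirrors the piiEntitiesConfig in the policy above.
ACTIONS = {"EMAIL": "ANONYMIZE", "PHONE": "ANONYMIZE",
           "CREDIT_DEBIT_CARD_NUMBER": "BLOCK"}

def apply_pii_policy(text: str) -> str:
    """BLOCK entities refuse the whole message; ANONYMIZE entities
    are masked in place with their entity type."""
    for entity, pattern in PII_PATTERNS.items():
        if ACTIONS[entity] == "BLOCK" and pattern.search(text):
            return "[BLOCKED: message contains a card number]"
    for entity, pattern in PII_PATTERNS.items():
        if ACTIONS[entity] == "ANONYMIZE":
            text = pattern.sub(f"[{entity}]", text)
    return text
```

Note the ordering: `BLOCK` is checked first, because masking a card number and letting the message through would defeat the stricter action.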
6. Multi-Agent Collaboration — Specialized Agent Teams
When a single agent can't handle complex tasks, Bedrock lets you build a team of agents that coordinate with each other. The supervisor-collaborator architecture supports up to 10 collaborator agents per supervisor.
```mermaid
graph TB
    USER["User Request"] --> SUP["Supervisor Agent<br/>Orchestrate & Synthesize"]
    SUP --> A1["Agent 1<br/>Customer Lookup"]
    SUP --> A2["Agent 2<br/>Order Processing"]
    SUP --> A3["Agent 3<br/>Inventory Check"]
    SUP --> A4["Agent 4<br/>Payment Processing"]
    A1 --> DB["CRM Database"]
    A2 --> OMS["Order Management"]
    A3 --> WMS["Warehouse System"]
    A4 --> PAY["Payment Gateway"]
    A1 --> SUP
    A2 --> SUP
    A3 --> SUP
    A4 --> SUP
    SUP --> RESP["Consolidated Response"]
    style USER fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style SUP fill:#e94560,stroke:#fff,color:#fff
    style A1 fill:#2c3e50,stroke:#fff,color:#fff
    style A2 fill:#2c3e50,stroke:#fff,color:#fff
    style A3 fill:#2c3e50,stroke:#fff,color:#fff
    style A4 fill:#2c3e50,stroke:#fff,color:#fff
    style RESP fill:#4CAF50,stroke:#fff,color:#fff
    style DB fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style OMS fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style WMS fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style PAY fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
```
Multi-Agent architecture with Supervisor pattern
Two Orchestration Modes
Supervisor with Routing
The supervisor acts only as a router — analyzing the query and directing it to the right specialized agent. Best when tasks are independent and don't require synthesizing results from multiple agents.
Supervisor with Orchestration
The supervisor breaks down the problem into parts, sends them to specialized agents, then synthesizes the results. Best for complex tasks requiring coordination across multiple data sources.
```python
import boto3

bedrock = boto3.client('bedrock-agent')

# Create the Supervisor Agent
supervisor = bedrock.create_agent(
    agentName='ecommerce-supervisor',
    instruction="""You are a supervisor orchestrating an e-commerce agent team.
Analyze customer requests and coordinate specialized agents for:
order lookup, inventory checks, and payment processing.""",
    foundationModel='anthropic.claude-sonnet-4-20250514'
)

# Associate a collaborator agent, identified by its agent alias ARN
bedrock.associate_agent_collaborator(
    agentId=supervisor['agent']['agentId'],
    agentVersion='DRAFT',
    agentDescriptor={'aliasArn': customer_agent_alias_arn},
    collaboratorName='CustomerLookup',
    collaborationInstruction='Call this agent for customer information lookup',
    relayConversationHistory='TO_COLLABORATOR'
)
```
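The "Supervisor with Routing" mode described above can be reduced to a classify-and-dispatch step. A toy sketch — the collaborator names and keyword lists are hypothetical, and real supervisors classify with an LLM rather than keywords:

```python
# Hypothetical collaborator registry: agent name -> topics it handles.
COLLABORATORS = {
    "CustomerLookup": ["account", "customer", "profile"],
    "OrderProcessing": ["order", "refund", "shipping"],
    "InventoryCheck": ["stock", "inventory", "availability"],
}

def route(request: str) -> str:
    """Routing mode: forward the whole request to a single collaborator.
    (Orchestration mode would instead decompose, fan out, and synthesize.)"""
    lowered = request.lower()
    for agent, keywords in COLLABORATORS.items():
        if any(k in lowered for k in keywords):
            return agent
    return "Supervisor"  # no match: the supervisor answers directly
```

The contrast with orchestration mode is that `route` returns exactly one destination and never combines results — which is why routing is cheaper but only fits independent tasks.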
7. Identity — Authenticating Agents with External Systems
One of the biggest challenges when deploying AI Agents to production: how do you let agents access enterprise systems (Slack, GitHub, Jira, Salesforce) securely, without hardcoding credentials?
AgentCore Identity solves this by assigning agents their own identity and integrating with corporate Identity Providers (IdP).
```mermaid
graph LR
    AGENT["AI Agent"] --> ACID["AgentCore Identity"]
    ACID --> IDP["Corporate IdP<br/>Okta / Entra ID / Cognito"]
    IDP --> TOKEN["OAuth Token"]
    TOKEN --> AGENT
    AGENT --> SLACK["Slack API"]
    AGENT --> GH["GitHub API"]
    AGENT --> JIRA["Jira API"]
    style AGENT fill:#e94560,stroke:#fff,color:#fff
    style ACID fill:#2c3e50,stroke:#fff,color:#fff
    style IDP fill:#16213e,stroke:#fff,color:#fff
    style TOKEN fill:#4CAF50,stroke:#fff,color:#fff
    style SLACK fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style GH fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style JIRA fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
```
Agent authentication flow through AgentCore Identity
Best Practice: Principle of Least Privilege
Always grant agents the minimum required permissions. Use AgentCore Policy to restrict exactly which APIs the agent can call and on which resources. Never grant admin access to agents.
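The least-privilege rule becomes concrete as a deny-by-default allowlist. This is only a conceptual sketch of what a policy evaluation does — the action names and policy shape are hypothetical, not the AgentCore Policy API:

```python
# Hypothetical policy: (action, resource-prefix) pairs each agent may use.
AGENT_POLICY = {
    "support-agent": [
        ("orders:GetOrder", "orders/"),
        ("crm:SearchCustomer", "customers/"),
    ],
}

def is_allowed(agent: str, action: str, resource: str) -> bool:
    """Deny by default: only explicitly granted action/resource pairs pass."""
    for allowed_action, prefix in AGENT_POLICY.get(agent, []):
        if action == allowed_action and resource.startswith(prefix):
            return True
    return False
```

The important property is the default: an unknown agent, action, or resource is denied without needing an explicit rule.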
8. Memory — Agents with Cross-Session Memory
AgentCore Memory automatically manages two types of memory for agents:
| Memory Type | Scope | Use Case |
|---|---|---|
| Session Memory | Within a single conversation | Maintain context of the current conversation |
| Long-term Memory | Across multiple sessions | Remember user preferences and interaction history |
```python
import boto3

# Memory is managed automatically by AgentCore;
# the agent queries it through the runtime API.
bedrock_runtime = boto3.client('bedrock-agent-runtime')

# Example: the agent recalls customer preferences across sessions
response = bedrock_runtime.invoke_agent(
    agentId='support-agent',
    agentAliasId='TSTALIASID',      # alias of the deployed agent
    sessionId='user-12345',
    enableTrace=True,
    memoryId='user-12345-memory',   # long-term memory identifier
    inputText='I want to change my subscription plan'
)

# The agent automatically knows:
# - The user has been on the Premium plan since 2024
# - The user has asked about downgrading twice before
# - The user prefers Vietnamese communication
```
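To make the session vs long-term distinction from the table above concrete, here is a toy in-memory model. AgentCore manages all of this for you — the class and method names are purely illustrative:

```python
class AgentMemory:
    """Toy model: session memory dies with the conversation;
    long-term memory is keyed by user and survives across sessions."""

    def __init__(self):
        self.sessions = {}   # session_id -> list of conversation turns
        self.long_term = {}  # user_id -> dict of remembered facts

    def add_turn(self, session_id: str, text: str) -> None:
        self.sessions.setdefault(session_id, []).append(text)

    def remember(self, user_id: str, key: str, value: str) -> None:
        self.long_term.setdefault(user_id, {})[key] = value

    def end_session(self, session_id: str) -> None:
        # Only the session transcript is discarded; long_term persists.
        self.sessions.pop(session_id, None)
```

The design point: the two stores have different lifetimes and different keys (session vs user), which is exactly why conflating them in one store leads to context leaking between users.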
9. Gateway — Connecting Agents to Any API
AgentCore Gateway acts as a middleman between agents and external tools/APIs. Instead of each agent handling authentication, rate limiting, and error handling for every API — Gateway handles it all.
How Gateway Works
You define tools using OpenAPI specifications → Gateway understands the API contract → handles auth, validates requests/responses, auto-retries, and returns results to the agent as tool responses.
```yaml
# Example OpenAPI spec for the "get_order_status" tool
openapi: 3.0.0
info:
  title: Order Management API
  version: 1.0.0
paths:
  /orders/{orderId}:
    get:
      operationId: getOrderStatus
      summary: Get order status
      parameters:
        - name: orderId
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: Order information
          content:
            application/json:
              schema:
                type: object
                properties:
                  status:
                    type: string
                    enum: [pending, processing, shipped, delivered]
                  estimatedDelivery:
                    type: string
                    format: date
```
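Conceptually, Gateway maps the spec's `operationId` to a backend call and wraps it with validation and retries. A toy sketch of that dispatch loop — the handler registry and retry policy here are assumptions, since the real Gateway is configuration-driven rather than hand-coded:

```python
import time

def dispatch(operation_id: str, handlers: dict, params: dict,
             retries: int = 2, delay: float = 0.0) -> dict:
    """Look up the tool by operationId, invoke it, and retry on failure --
    roughly what Gateway does on the agent's behalf."""
    handler = handlers.get(operation_id)
    if handler is None:
        return {"error": f"unknown operation {operation_id}"}
    for attempt in range(retries + 1):
        try:
            return {"result": handler(**params)}
        except Exception as exc:
            if attempt == retries:
                return {"error": str(exc)}  # retries exhausted
            time.sleep(delay)  # back off before retrying
```

The key point is that failures surface to the agent as structured tool responses, not exceptions — the agent's reasoning loop stays simple because Gateway absorbs the transport-level mess.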
10. Action Groups — Extending Agent Capabilities
Action Groups allow agents to perform specific actions by connecting to AWS Lambda functions or API endpoints. Each Action Group is a collection of tools the agent can invoke.
```mermaid
graph TB
    AGENT["Bedrock Agent"] --> AG1["Action Group: CRM"]
    AGENT --> AG2["Action Group: Inventory"]
    AGENT --> AG3["Action Group: Notifications"]
    AGENT --> KB["Knowledge Base"]
    AG1 --> L1["Lambda: searchCustomer"]
    AG1 --> L2["Lambda: updateCustomer"]
    AG2 --> L3["Lambda: checkStock"]
    AG2 --> L4["Lambda: reserveItem"]
    AG3 --> L5["Lambda: sendEmail"]
    AG3 --> L6["Lambda: sendSMS"]
    style AGENT fill:#e94560,stroke:#fff,color:#fff
    style AG1 fill:#2c3e50,stroke:#fff,color:#fff
    style AG2 fill:#2c3e50,stroke:#fff,color:#fff
    style AG3 fill:#2c3e50,stroke:#fff,color:#fff
    style KB fill:#16213e,stroke:#fff,color:#fff
    style L1 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style L2 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style L3 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style L4 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style L5 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style L6 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
```
Agent using Action Groups to interact with systems
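A Lambda function behind one of these Action Groups might look like the sketch below. The event/response shape follows the Bedrock agent Lambda contract for OpenAPI-schema action groups as I understand it (verify against the AWS docs); the `checkStock` logic and inventory table are hypothetical:

```python
import json

# Hypothetical inventory table standing in for a real data store.
STOCK = {"SKU-1": 12, "SKU-2": 0}

def lambda_handler(event, context):
    """Handle a Bedrock Action Group invocation for a checkStock tool."""
    # Bedrock passes parameters as a list of {name, type, value} dicts.
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    sku = params.get("sku", "")
    body = {"sku": sku,
            "inStock": STOCK.get(sku, 0) > 0,
            "quantity": STOCK.get(sku, 0)}
    # The response must echo back the action group, path, and method.
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event["actionGroup"],
            "apiPath": event["apiPath"],
            "httpMethod": event["httpMethod"],
            "httpStatusCode": 200,
            "responseBody": {
                "application/json": {"body": json.dumps(body)}
            },
        },
    }
```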
11. Observability — Monitoring Agents in Production
Agents aren't traditional APIs — output is non-deterministic, each request may go through multiple reasoning steps and tool calls. Observability in the agentic world requires tracing at the individual reasoning step level.
```python
import boto3

bedrock_runtime = boto3.client('bedrock-agent-runtime')

# Enable tracing to observe the full reasoning chain
response = bedrock_runtime.invoke_agent(
    agentId='my-agent',
    agentAliasId='TSTALIASID',   # alias of the deployed agent
    sessionId='session-001',
    enableTrace=True,
    inputText='Check order ORD-12345 and send an update to the customer'
)

# Trace output includes:
# 1. PreProcessingTrace  - input parsing & classification
# 2. OrchestrationTrace  - reasoning steps, tool selection
# 3. PostProcessingTrace - response formatting
# 4. GuardrailTrace      - safety filter results
# 5. FailureTrace        - error details, if any
```
CloudWatch Integration
AgentCore automatically sends metrics and traces to Amazon CloudWatch. You can create dashboards monitoring: average latency per step, tool invocation count, guardrail block rate, and LLM inference cost per session.
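Because traces arrive as a stream of typed events, a small summarizer is handy when debugging a session. The sketch below works over a simplified event shape — real trace payloads are much richer:

```python
from collections import Counter

def summarize_trace(events: list[dict]) -> dict:
    """Count trace events by type and collect invoked tools --
    a quick health check for a single agent session."""
    counts = Counter(e["type"] for e in events)
    tools = [e["tool"] for e in events
             if e["type"] == "OrchestrationTrace" and "tool" in e]
    return {"counts": dict(counts),
            "tools": tools,
            "failed": counts.get("FailureTrace", 0) > 0}
```

In practice you would feed this from the `trace` events in the `invoke_agent` response stream and push the aggregates to a CloudWatch dashboard.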
12. Comparison with Other AI Agent Platforms
| Criteria | Bedrock AgentCore | Azure AI Foundry | LangGraph Cloud |
|---|---|---|---|
| Deployment | Fully serverless | Container-based | Managed hosting |
| Multi-model | Claude, Llama, Mistral, Titan... | GPT-4o, Phi, Llama | Any LLM |
| Built-in RAG | Knowledge Base (managed) | Azure AI Search | Self-integrated |
| Agent Identity | AgentCore Identity + IdP | Managed Identity | Not built-in |
| Multi-Agent | Supervisor + 10 collaborators | Semantic Kernel orchestration | Graph-based workflows |
| Guardrails | 6 safeguard policies built-in | Content Safety API | Self-implemented |
| Protocols | MCP, A2A, HTTP | HTTP, gRPC | HTTP |
| Lock-in | Framework-agnostic | Azure ecosystem | LangChain ecosystem |
13. Production Architecture: E-commerce Support Agent
To illustrate how these building blocks combine in practice, let's examine the architecture of a Customer Support Agent system for e-commerce.
```mermaid
graph TB
    CUST["Customer<br/>Web / Mobile / Chat"] --> ALB["Application Load Balancer"]
    ALB --> API["API Gateway"]
    API --> RT["AgentCore Runtime"]
    RT --> SUP["Supervisor Agent"]
    SUP --> FAQ["FAQ Agent<br/>+ Knowledge Base"]
    SUP --> ORD["Order Agent<br/>+ Action Groups"]
    SUP --> ESC["Escalation Agent<br/>+ SES / SNS"]
    RT --> MEM["AgentCore Memory"]
    RT --> GR["Guardrails"]
    RT --> OBS["CloudWatch<br/>Observability"]
    RT --> AID["AgentCore Identity<br/>→ Okta SSO"]
    FAQ --> S3["S3: Product Docs"]
    ORD --> DDB["DynamoDB: Orders"]
    ORD --> PAY["Stripe API"]
    ESC --> AGENT_DESK["Human Agent Queue"]
    style CUST fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style RT fill:#e94560,stroke:#fff,color:#fff
    style SUP fill:#2c3e50,stroke:#fff,color:#fff
    style FAQ fill:#16213e,stroke:#fff,color:#fff
    style ORD fill:#16213e,stroke:#fff,color:#fff
    style ESC fill:#16213e,stroke:#fff,color:#fff
    style GR fill:#e94560,stroke:#fff,color:#fff
    style MEM fill:#2c3e50,stroke:#fff,color:#fff
    style OBS fill:#2c3e50,stroke:#fff,color:#fff
    style AID fill:#2c3e50,stroke:#fff,color:#fff
    style ALB fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style API fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style S3 fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style DDB fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style PAY fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style AGENT_DESK fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
```
Production architecture: E-commerce Support Agent on Bedrock AgentCore
14. Production Best Practices
Pulling together the recommendations from the sections above:
- Least privilege: grant agents only the APIs and resources they need, enforced via AgentCore Policy — never admin access.
- Guard both directions: apply Guardrails to input (prompt attacks, denied topics) and output (PII redaction, contextual grounding) from day one.
- Enable tracing early: non-deterministic agents are effectively undebuggable without step-level traces in CloudWatch.
- Right-size models: route simple classification and routing tasks to cheaper models like Haiku, and reserve Opus-class models for complex reasoning.
- Start small: begin with Runtime + Guardrails, then add Memory, Identity, and Multi-Agent collaboration as requirements grow.
15. Cost and Pricing
Bedrock AgentCore uses a pay-per-use model, charging based on:
- LLM inference: Billed per input/output tokens, varying by model (Claude Sonnet is cheaper than Claude Opus)
- Knowledge Base: Embedding costs + vector store (OpenSearch Serverless)
- Runtime: Compute costs based on active session time
- Guardrails: Billed per text units processed
- Memory: Storage costs for long-term memory
Cost Optimization Tips
Use Prompt Caching to reduce inference costs by up to 90% for repeated system prompts. Choose the right model — not every task needs Claude Opus; many routing/classification tasks only need Haiku. Combine with Intelligent Prompt Routing to automatically select the optimal model based on complexity.
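A back-of-the-envelope calculator makes these levers tangible. The prices in the example below are placeholders, not AWS's actual rates, and the flat "up to ~90% discount on cached input tokens" is a simplification of real prompt-caching pricing:

```python
def session_cost(input_tokens: int, output_tokens: int,
                 price_in_per_1k: float, price_out_per_1k: float,
                 cached_fraction: float = 0.0,
                 cache_discount: float = 0.9) -> float:
    """Estimate LLM inference cost for one session.
    cached_fraction: share of input tokens served from the prompt cache,
    each billed at (1 - cache_discount) of the normal input price."""
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    cost_in = (fresh + cached * (1 - cache_discount)) / 1000 * price_in_per_1k
    cost_out = output_tokens / 1000 * price_out_per_1k
    return round(cost_in + cost_out, 6)
```

Running it with placeholder rates shows why caching the system prompt matters: with 80% of a 10k-token input cached, the input side of the bill drops by roughly 72%.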
Conclusion
Amazon Bedrock AgentCore represents the maturation of the AI Agent ecosystem — from demo stage to production-grade. With its modular architecture, you're not forced to use the entire platform — you can start with Runtime + Guardrails, then expand to Memory, Identity, and Multi-Agent as needs grow.
What makes AgentCore different isn't any single feature — Azure AI Foundry and LangGraph Cloud offer similar capabilities. The strength lies in everything being managed, serverless, and framework-agnostic — you focus on agent logic while AWS handles the production infrastructure.
References
- Amazon Bedrock AgentCore — Official Page
- AgentCore Developer Guide — AWS Documentation
- Introducing Amazon Bedrock AgentCore — AWS Blog
- Multi-Agent Collaboration — AWS Bedrock Docs
- Amazon Bedrock Guardrails — AWS
- Navigating Managed AI Agent Options on Bedrock — The Scale Factory
- AI Agents Best Practices with AgentCore — AWS ML Blog
Disclaimer: The opinions expressed in this blog are solely my own and do not reflect the views or opinions of my employer or any affiliated organizations. The content provided is for informational and educational purposes only and should not be taken as professional advice. While I strive to provide accurate and up-to-date information, I make no warranties or guarantees about the completeness, reliability, or accuracy of the content. Readers are encouraged to verify the information and seek independent advice as needed. I disclaim any liability for decisions or actions taken based on the content of this blog.