Microsoft Foundry Agent Service — Building Production AI Agents on Azure 2026
Posted on: 4/25/2026 3:13:00 AM
Table of contents
- 1. Foundry Agent Service Architecture Overview
- 2. Three Agent Types — Pick the Right One
- 3. Tools & MCP — Extending Agent Capabilities
- 4. Model Router — Smart Model Selection Per Request
- 5. Voice Live — Speech-to-Speech in a Single API
- 6. Agent Memory — Remembering Across Sessions
- 7. Enterprise Security — Production-Grade from Day One
- 8. Development Lifecycle — From Prototype to Production
- 9. Comparison with Other Platforms
- 10. Code Example: End-to-End Customer Support Agent
- 11. Roadmap and Future
If you've ever deployed an AI agent to production, you know the hardest part isn't writing prompts or choosing models — it's operations: scaling, networking, identity, observability, and security. Microsoft Foundry Agent Service (formerly Azure AI Foundry Agent Service) went GA in March 2026, delivering a fully managed platform that handles all of these operational concerns so developers can focus purely on agent logic.
This article dives deep into the architecture, the three agent types, the tools and MCP ecosystem, Model Router, Voice Live, and enterprise security — with practical .NET SDK code examples.
1. Foundry Agent Service Architecture Overview
Foundry Agent Service isn't just an LLM API wrapper. It's a complete agent runtime that handles the entire agent lifecycle: create, test, deploy, monitor, and publish.
graph TD
A["Developer"] --> B["Foundry Portal / SDK / REST API"]
B --> C["Agent Runtime"]
C --> D["Prompt Agent"]
C --> E["Workflow Agent"]
C --> F["Hosted Agent<br/>(Container)"]
D --> G["Tools Layer"]
E --> G
F --> G
G --> H["Code Interpreter"]
G --> I["File Search"]
G --> J["Web Search"]
G --> K["MCP Servers"]
G --> L["Custom Functions"]
G --> M["Memory Store"]
C --> N["Model Catalog<br/>GPT-5.4 / Llama / DeepSeek"]
C --> O["Observability<br/>Tracing + Metrics"]
C --> P["Identity & Security<br/>Entra + RBAC + VNet"]
style A fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style B fill:#e94560,stroke:#fff,color:#fff
style C fill:#2c3e50,stroke:#fff,color:#fff
style D fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style E fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style F fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style N fill:#16213e,stroke:#fff,color:#fff
style O fill:#16213e,stroke:#fff,color:#fff
style P fill:#16213e,stroke:#fff,color:#fff
Foundry Agent Service architecture — from developer to enterprise runtime
Every agent combines three core components:
- Model (LLM): provides reasoning capabilities — choose from the catalog: GPT-5.4, GPT-5.4 Mini, Llama, DeepSeek, Claude, xAI, and many more
- Instructions: define goals, constraints, and behavior — can be prompts, workflow YAML, or container code
- Tools: connect to data and actions — search, files, function calling, MCP servers
2. Three Agent Types — Pick the Right One
2.1 Prompt Agent — Fast and Simple
Prompt Agents are defined entirely through configuration: instructions, model selection, and tools. No code required — create them directly in the Foundry Portal or via SDK/API.
// Create a Prompt Agent with the .NET SDK
using Azure.AI.Projects;
using Azure.Identity; // DefaultAzureCredential

var client = new AIProjectClient(
    new Uri("https://your-project.services.ai.azure.com"),
    new DefaultAzureCredential());

var agentClient = client.GetAgentClient();

var agent = await agentClient.CreateAgentAsync(new CreateAgentOptions
{
    Model = "gpt-5.4",
    Name = "customer-support-agent",
    Instructions = """
        You are a customer support assistant.
        Use file search to find product information.
        Always cite sources from documentation.
        """,
    Tools = { new FileSearchToolDefinition() }
});

// Create a thread and send a message
var thread = await agentClient.CreateThreadAsync();
await agentClient.CreateMessageAsync(thread.Id, new CreateMessageOptions
{
    Role = MessageRole.User,
    Content = "What is your return policy?"
});

// Run the agent
var run = await agentClient.CreateRunAsync(thread.Id, agent.Id);
When to use Prompt Agent?
Rapid prototyping, internal tools, FAQ chatbots, or any agent that doesn't need complex orchestration logic. You can create a working agent in minutes using the Portal.
2.2 Workflow Agent — Orchestration Without Code
Workflow Agents let you connect multiple agents, branching logic, human-in-the-loop steps, and automation patterns — all through a visual builder or YAML.
graph LR
A["Incoming<br/>Request"] --> B{"Classify<br/>Intent"}
B -->|Technical| C["Tech Support<br/>Agent"]
B -->|Billing| D["Billing<br/>Agent"]
B -->|Complaint| E["Escalation<br/>Agent"]
C --> F{"Needs<br/>Human?"}
D --> F
E --> G["Human<br/>Review"]
F -->|No| H["Auto<br/>Response"]
F -->|Yes| G
G --> H
style A fill:#e94560,stroke:#fff,color:#fff
style B fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style C fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
style D fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
style E fill:#f8f9fa,stroke:#ff9800,color:#2c3e50
style G fill:#16213e,stroke:#fff,color:#fff
style H fill:#4CAF50,stroke:#fff,color:#fff
Workflow Agent — multi-agent orchestration with branching logic and human-in-the-loop
Supported patterns:
- Sequential: Agent A → Agent B → Agent C
- Branching: classify then route to the appropriate agent
- Group Chat: multiple agents discuss to reach a decision
- Human-in-the-loop: wait for human approval before proceeding
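The branching and human-in-the-loop patterns above can be sketched in workflow YAML. The schema below is illustrative only — the field names are assumptions for this article, not the official workflow spec:

```yaml
# Illustrative sketch — field names are assumed, not the official schema
name: support-triage
steps:
  - id: classify
    agent: intent-classifier
  - id: route
    branch:
      - when: classify.output == "technical"
        goto: tech-support
      - when: classify.output == "billing"
        goto: billing
      - default: escalation        # complaints and unknowns escalate
  - id: escalation
    human_in_the_loop:
      approval_required: true      # wait for a reviewer before replying
```

The point is less the exact syntax than the shape: classification, routing, and approval gates are declared as data, so the orchestration can be versioned and reviewed like any other config.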
2.3 Hosted Agent — Full Control
Hosted Agents are code-based, packaged as containers, and deployed on Foundry. You write the orchestration logic entirely in code — using Agent Framework, LangGraph, or any framework. Foundry handles runtime, scaling, and infrastructure.
// Hosted Agent with Microsoft Agent Framework + .NET
using Microsoft.Agents;

var builder = AgentApplication.CreateBuilder(args);

builder.AddAgent("research-agent", agent =>
{
    agent.WithModel("gpt-5.4")
         .WithInstructions("Research and synthesize information.")
         .WithTools(tools =>
         {
             tools.AddWebSearch();
             tools.AddFileSearch();
         });
});

builder.AddAgent("writer-agent", agent =>
{
    agent.WithModel("gpt-5.4-mini")
         .WithInstructions("Write reports from research results.");
});

// Orchestration logic: research first, then hand off to the writer
builder.AddOrchestrator(orchestrator =>
{
    orchestrator.Route("research-agent")
                .Then("writer-agent")
                .WithHandoff(context => context.LastAgentOutput);
});

var app = builder.Build();
await app.RunAsync();
Note: Hosted Agent is in Preview
Hosted Agents don't currently support private networking during preview. If you require VNet isolation, use Prompt Agent or Workflow Agent.
Comparing the Three Agent Types
| Criteria | Prompt Agent | Workflow Agent | Hosted Agent |
|---|---|---|---|
| Code required? | No | No (YAML optional) | Yes |
| Hosting | Fully managed | Fully managed | Container, managed |
| Orchestration | Single agent | Multi-agent, branching | Custom logic |
| Private networking | Yes | Yes | No (Preview) |
| Best for | Prototype, FAQ bot | Multi-step automation | Full control, custom framework |
3. Tools & MCP — Extending Agent Capabilities
Foundry Agent Service provides a rich tool ecosystem, with native support for MCP (Model Context Protocol) servers — the standard protocol for connecting agents to external data and services.
Built-in Tools
- Code Interpreter: Python for data processing, calculations, chart generation
- File Search: vector search across documents (PDF, DOCX, TXT...) with automatic chunking
- Web Search: real-time internet search (powered by Bing)
- Memory (Preview): long-term memory store across sessions
- Computer Use (Preview): UI automation via screenshots + clicks
MCP Server Integration
Foundry supports remote MCP servers with 4 authentication methods:
| Method | Description | Use Case |
|---|---|---|
| Key-based | Shared access key via custom connections | Internal APIs, third-party services |
| Entra Agent Identity | Service-to-service auth without credentials | Azure resources, Microsoft Graph |
| Project Managed Identity | Automatic per-project isolation | Multi-tenant setups |
| OAuth Identity Passthrough | User-delegated access (OBO) | OneDrive, Salesforce, SaaS APIs |
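As a sketch, attaching a remote MCP server with key-based auth might look like the following. Note that `McpToolDefinition`, `ServerUrl`, and `ConnectionName` are assumed names, patterned on the `CreateAgentOptions` style used elsewhere in this article — check the SDK reference for the actual surface:

```csharp
// Sketch only — the MCP tool type and property names below are assumptions.
var agent = await agentClient.CreateAgentAsync(new CreateAgentOptions
{
    Model = "gpt-5.4",
    Name = "mcp-enabled-agent",
    Instructions = "Use the connected MCP tools to answer questions.",
    Tools =
    {
        new McpToolDefinition
        {
            ServerUrl = "https://your-mcp-server.example.com/mcp",
            // Key-based auth: the key lives in a custom connection,
            // never in code (the first method in the table above)
            ConnectionName = "internal-api-key-connection"
        }
    }
});
```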
Toolbox — Centralized Tool Management
Toolbox (Preview) lets you define a curated set of tools once, manage them centrally in Foundry, and expose them through a single MCP-compatible endpoint. Supports versioning — create a new version, test it, then promote to default when ready. Any MCP client can consume a Toolbox, regardless of framework.
4. Model Router — Smart Model Selection Per Request
One of Foundry's most impressive features is Model Router (GA). Instead of locking in a single model for all requests, Model Router dynamically selects the best-fit model for each request based on cost, performance, and quality.
graph LR
A["User<br/>Request"] --> B["Model Router"]
B -->|"Classification,<br/>extraction"| C["GPT-5.4 Mini<br/>Low cost"]
B -->|"Complex<br/>reasoning"| D["GPT-5.4 Pro<br/>High quality"]
B -->|"General<br/>tasks"| E["GPT-5.4<br/>Balanced"]
B -->|"Open-source<br/>preference"| F["Llama / DeepSeek<br/>Full control"]
C --> G["Response"]
D --> G
E --> G
F --> G
style A fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style B fill:#e94560,stroke:#fff,color:#fff
style C fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
style D fill:#f8f9fa,stroke:#ff9800,color:#2c3e50
style E fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
style F fill:#f8f9fa,stroke:#16213e,color:#2c3e50
style G fill:#4CAF50,stroke:#fff,color:#fff
Model Router — dynamic model selection based on request characteristics
Practical benefits:
- 40-60% cost reduction: route simple tasks (classification, extraction) to cheaper models
- Higher quality: complex tasks are automatically assigned to the most capable model
- Zero code changes: same API endpoint, Model Router handles routing behind the scenes
- Ideal for multi-agent systems: each agent in a pipeline can use a different model without hardcoding
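Because routing happens behind the same endpoint, opting in can be as small as changing the model name the agent targets. In this sketch, "model-router" is an assumed deployment name used for illustration:

```csharp
// Sketch: the only change from a fixed-model agent is the Model value.
// "model-router" is an assumed deployment name, not a documented constant.
var routedAgent = await agentClient.CreateAgentAsync(new CreateAgentOptions
{
    Model = "model-router", // router selects Mini / standard / Pro per request
    Name = "cost-optimized-agent",
    Instructions = "Answer user questions concisely."
});
```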
5. Voice Live — Speech-to-Speech in a Single API
Voice Live (Preview) collapses the traditional STT → LLM → TTS pipeline into a single managed real-time API. Instead of deploying three separate services, you simply connect Voice Live to your existing agent.
graph LR
subgraph "Traditional"
A1["Audio In"] --> B1["STT Service"]
B1 --> C1["LLM"]
C1 --> D1["TTS Service"]
D1 --> E1["Audio Out"]
end
subgraph "Voice Live"
A2["Audio In"] --> B2["Voice Live<br/>Managed API"]
B2 --> C2["Foundry Agent<br/>(prompt + tools)"]
C2 --> B2
B2 --> E2["Audio Out"]
end
style B1 fill:#f8f9fa,stroke:#e0e0e0,color:#888
style C1 fill:#f8f9fa,stroke:#e0e0e0,color:#888
style D1 fill:#f8f9fa,stroke:#e0e0e0,color:#888
style B2 fill:#e94560,stroke:#fff,color:#fff
style C2 fill:#2c3e50,stroke:#fff,color:#fff
Voice Live collapses STT + LLM + TTS into one managed pipeline
Built-in capabilities:
- Semantic Voice Activity Detection: recognizes when users are speaking based on context, not just volume
- End-of-turn Detection: detects when users have finished speaking based on sentence semantics
- Noise Suppression & Echo Cancellation: server-side, no client-side processing needed
- Barge-in Support: users can interrupt the agent at any time
Key point: the agent's prompt, tools, and configuration remain unchanged — Voice Live only handles the audio pipeline. A text-based agent can enable Voice Live without modifying any logic.
6. Agent Memory — Remembering Across Sessions
Memory (Public Preview) is a managed long-term memory store with automatic extraction, consolidation, and retrieval across sessions. Instead of building your own vector DB + embedding pipeline, Foundry handles everything.
How Memory Works
- Automatic extraction: agent automatically extracts facts and preferences from conversations
- Consolidation: merges and deduplicates information across sessions
- Contextual retrieval: injects relevant memories into the prompt at runtime
- Scoped access: memory can be scoped per user, session, or tenant
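A sketch of what enabling user-scoped memory could look like — the memory tool type and scope property below are assumptions (the feature is in preview), not the documented API:

```csharp
// Sketch only — MemoryToolDefinition and MemoryScope are assumed names.
var assistant = await agentClient.CreateAgentAsync(new CreateAgentOptions
{
    Model = "gpt-5.4",
    Name = "personal-assistant",
    Instructions = "Remember user preferences across conversations.",
    // Scoped per user: memories from one user are never retrieved for another
    Tools = { new MemoryToolDefinition { Scope = MemoryScope.User } }
});
```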
7. Enterprise Security — Production-Grade from Day One
This is where Foundry Agent Service truly differentiates itself from self-hosted agents. Enterprise infrastructure is built in:
7.1 Private Networking
Agents run inside your Azure Virtual Network — zero public egress. MCP servers, Azure AI Search, and Fabric data agents all communicate over private paths. This is mandatory for many industries: finance, healthcare, government.
7.2 Agent Identity
Each agent can have its own dedicated Microsoft Entra identity — authenticating to resources and APIs without sharing credentials. Agent identities can authenticate to external MCP servers with OAuth OBO passthrough support.
7.3 Content Safety
Integrated content filters help mitigate prompt injection (including cross-prompt injection — XPIA) and prevent unsafe outputs. Partner integrations from Palo Alto Prisma AIRS + Zenity provide runtime guardrails for data leakage and tool misuse detection.
7.4 Observability & Evaluation
graph LR
A["Agent<br/>Runtime"] --> B["Tracing<br/>End-to-end"]
A --> C["Metrics<br/>Application Insights"]
A --> D["Evaluations"]
D --> E["Built-in<br/>coherence, relevance, groundedness"]
D --> F["Custom<br/>Business logic"]
D --> G["Continuous<br/>Live traffic sampling"]
G --> H["Azure Monitor<br/>Alerts"]
style A fill:#2c3e50,stroke:#fff,color:#fff
style B fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style C fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style D fill:#e94560,stroke:#fff,color:#fff
style H fill:#ff9800,stroke:#fff,color:#fff
Observability pipeline — from tracing to continuous evaluation with alerts
Three-layer evaluation system:
- Out-of-the-box evaluators: coherence, relevance, groundedness, retrieval quality, safety
- Custom evaluators: business logic and domain-specific criteria
- Continuous monitoring: live traffic sampling with Azure Monitor + alerting
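As a sketch, running the built-in evaluators against a conversation thread might look like this. The evaluation client and option names are assumptions patterned on the project-client style shown earlier, and `thread.Id` refers to an existing conversation:

```csharp
// Sketch only — evaluation API names below are assumptions.
var evalClient = projectClient.GetEvaluationClient();
var result = await evalClient.RunEvaluationAsync(new EvaluationOptions
{
    ThreadId = thread.Id, // an existing conversation thread
    Evaluators = { "coherence", "relevance", "groundedness" }
});

foreach (var score in result.Scores)
    Console.WriteLine($"{score.Evaluator}: {score.Value:F2}");
```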
8. Development Lifecycle — From Prototype to Production
9. Comparison with Other Platforms
| Criteria | Foundry Agent Service | AWS Bedrock Agents | Vertex AI Agent Builder |
|---|---|---|---|
| Model diversity | GPT, Llama, DeepSeek, Claude, xAI | Claude, Llama, Mistral, Titan | Gemini, Claude, Llama |
| No-code agents | Yes — Prompt Agent + Portal | Yes — Console wizard | Yes — Agent Builder UI |
| Multi-agent orchestration | Yes — Workflow + Hosted Agent | Preview — Multi-agent collaboration | Limited — Agent Engine |
| MCP support | Yes — Native with 4 auth methods | No | Limited |
| Voice integration | Yes — Voice Live (Preview) | No | No |
| Model Router | Yes — GA | Limited — Intelligent prompt routing | No |
| Private networking | Yes — Full VNet isolation | Yes — VPC | Yes — VPC-SC |
| Distribution | Teams, M365 Copilot, Entra Registry | Slack, API Gateway | Dialogflow, API |
10. Code Example: End-to-End Customer Support Agent
Here's a complete example that creates a customer support agent with file search and function calling:
using Azure.AI.Projects;
using Azure.Identity;

// 1. Initialize the client
var projectClient = new AIProjectClient(
    new Uri("https://your-project.services.ai.azure.com"),
    new DefaultAzureCredential());
var agentClient = projectClient.GetAgentClient();

// 2. Upload the knowledge base
var fileClient = projectClient.GetFileClient();
var file = await fileClient.UploadFileAsync(
    "product-catalog.pdf",
    FilePurpose.Agents);

var vectorStore = await agentClient.CreateVectorStoreAsync(new CreateVectorStoreOptions
{
    Name = "product-knowledge-base",
    FileIds = { file.Id }
});

// 3. Define a custom function tool
var checkOrderStatus = new FunctionToolDefinition
{
    Name = "check_order_status",
    Description = "Check order status by order ID",
    Parameters = BinaryData.FromObjectAsJson(new
    {
        type = "object",
        properties = new
        {
            order_id = new { type = "string", description = "The order ID" }
        },
        required = new[] { "order_id" }
    })
};

// 4. Create the agent
var agent = await agentClient.CreateAgentAsync(new CreateAgentOptions
{
    Model = "gpt-5.4",
    Name = "customer-support",
    Instructions = """
        You are a customer support assistant for an online store.
        - Be friendly and professional.
        - Use file search to find product information.
        - Use check_order_status to look up orders.
        - If unsure, ask the customer for clarification.
        """,
    Tools =
    {
        new FileSearchToolDefinition(),
        checkOrderStatus
    },
    ToolResources = new ToolResources
    {
        FileSearch = new FileSearchToolResources
        {
            VectorStoreIds = { vectorStore.Id }
        }
    }
});

Console.WriteLine($"Agent created: {agent.Id}");
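Creating the agent is only half the story — to serve a conversation you create a thread, run the agent, and answer any function calls it raises. The polling loop below is a sketch: the run-status and tool-output type names beyond those shown above are assumptions, so treat it as the shape of the flow rather than the exact API:

```csharp
// 5. Run a conversation and handle tool calls (sketch — status and type
// names beyond those used above are assumptions)
var thread = await agentClient.CreateThreadAsync();
await agentClient.CreateMessageAsync(thread.Id, new CreateMessageOptions
{
    Role = MessageRole.User,
    Content = "Where is order #12345?"
});

var run = await agentClient.CreateRunAsync(thread.Id, agent.Id);
while (run.Status == RunStatus.Queued
       || run.Status == RunStatus.InProgress
       || run.Status == RunStatus.RequiresAction)
{
    if (run.Status == RunStatus.RequiresAction)
    {
        // The model asked us to execute check_order_status locally
        var outputs = new List<ToolOutput>();
        foreach (var call in run.RequiredToolCalls)
        {
            if (call.Name == "check_order_status")
                outputs.Add(new ToolOutput(call.Id, """{"status":"shipped"}"""));
        }
        run = await agentClient.SubmitToolOutputsAsync(thread.Id, run.Id, outputs);
    }
    else
    {
        await Task.Delay(500); // poll until the run reaches a terminal state
        run = await agentClient.GetRunAsync(thread.Id, run.Id);
    }
}
```

The key design point survives whatever the final API looks like: function tools are executed by your code, not the service, so the run pauses in a "requires action" state until you submit the tool output.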
11. Roadmap and Future
Conclusion
Microsoft Foundry Agent Service is reshaping how enterprises build and deploy AI agents. With three agent types covering every use case, Model Router for cost optimization, Voice Live expanding into audio, and built-in enterprise security — this is the most compelling platform for any organization looking to bring AI agents to production on Azure infrastructure.
The biggest strength: its MCP-first approach and Toolbox — allowing agents to connect with the broader ecosystem through a standard protocol, avoiding vendor lock-in.
Disclaimer: The opinions expressed in this blog are solely my own and do not reflect the views or opinions of my employer or any affiliated organizations. The content provided is for informational and educational purposes only and should not be taken as professional advice. While I strive to provide accurate and up-to-date information, I make no warranties or guarantees about the completeness, reliability, or accuracy of the content. Readers are encouraged to verify the information and seek independent advice as needed. I disclaim any liability for decisions or actions taken based on the content of this blog.