Microsoft Foundry Agent Service — Building Production AI Agents on Azure 2026

Posted on: 4/25/2026 3:13:00 AM

  • 3 agent types: Prompt, Workflow, Hosted
  • Generally Available — March 2026
  • 6+ new regions for Hosted Agents
  • 4 MCP authentication methods

If you've ever deployed an AI agent to production, you know the hardest part isn't writing prompts or choosing models — it's operations: scaling, networking, identity, observability, and security. Microsoft Foundry Agent Service (formerly Azure AI Foundry Agent Service) went GA in March 2026, delivering a fully managed platform that handles all of these operational concerns so developers can focus purely on agent logic.

This article dives deep into the architecture, 3 agent types, tools & MCP ecosystem, Model Router, Voice Live, and enterprise security — with practical .NET SDK code examples.

1. Foundry Agent Service Architecture Overview

Foundry Agent Service isn't just an LLM API wrapper. It's a complete agent runtime that handles the entire agent lifecycle: create, test, deploy, monitor, and publish.

graph TD
    A["Developer"] --> B["Foundry Portal / SDK / REST API"]
    B --> C["Agent Runtime"]
    C --> D["Prompt Agent"]
    C --> E["Workflow Agent"]
    C --> F["Hosted Agent<br/>(Container)"]
    D --> G["Tools Layer"]
    E --> G
    F --> G
    G --> H["Code Interpreter"]
    G --> I["File Search"]
    G --> J["Web Search"]
    G --> K["MCP Servers"]
    G --> L["Custom Functions"]
    G --> M["Memory Store"]
    C --> N["Model Catalog<br/>GPT-5.4 / Llama / DeepSeek"]
    C --> O["Observability<br/>Tracing + Metrics"]
    C --> P["Identity & Security<br/>Entra + RBAC + VNet"]
    style A fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style B fill:#e94560,stroke:#fff,color:#fff
    style C fill:#2c3e50,stroke:#fff,color:#fff
    style D fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style E fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style F fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style N fill:#16213e,stroke:#fff,color:#fff
    style O fill:#16213e,stroke:#fff,color:#fff
    style P fill:#16213e,stroke:#fff,color:#fff

Foundry Agent Service architecture — from developer to enterprise runtime

Every agent combines 3 core components:

  • Model (LLM): provides reasoning capabilities — choose from the catalog: GPT-5.4, GPT-5.4 Mini, Llama, DeepSeek, Claude, xAI, and many more
  • Instructions: define goals, constraints, and behavior — can be prompts, workflow YAML, or container code
  • Tools: connect to data and actions — search, files, function calling, MCP servers

2. Three Agent Types — Pick the Right One

2.1 Prompt Agent — Fast and Simple

Prompt Agents are defined entirely through configuration: instructions, model selection, and tools. No code required — create them directly in the Foundry Portal or via SDK/API.

// Create a Prompt Agent with the .NET SDK
using Azure.AI.Projects;
using Azure.Identity; // for DefaultAzureCredential

var client = new AIProjectClient(
    new Uri("https://your-project.services.ai.azure.com"),
    new DefaultAzureCredential());

var agentClient = client.GetAgentClient();

var agent = await agentClient.CreateAgentAsync(new CreateAgentOptions
{
    Model = "gpt-5.4",
    Name = "customer-support-agent",
    Instructions = """
        You are a customer support assistant.
        Use file search to find product information.
        Always cite sources from documentation.
        """,
    Tools = { new FileSearchToolDefinition() }
});

// Create thread and send message
var thread = await agentClient.CreateThreadAsync();
await agentClient.CreateMessageAsync(thread.Id, new CreateMessageOptions
{
    Role = MessageRole.User,
    Content = "What is your return policy?"
});

// Run the agent (in production code, poll the run status until it completes)
var run = await agentClient.CreateRunAsync(thread.Id, agent.Id);

When should you use a Prompt Agent?

Rapid prototyping, internal tools, FAQ chatbots, or any agent that doesn't need complex orchestration logic. You can create a working agent in minutes using the Portal.

2.2 Workflow Agent — Orchestration Without Code

Workflow Agents let you connect multiple agents, branching logic, human-in-the-loop steps, and automation patterns — all through a visual builder or YAML.

graph LR
    A["Incoming<br/>Request"] --> B{"Classify<br/>Intent"}
    B -->|Technical| C["Tech Support<br/>Agent"]
    B -->|Billing| D["Billing<br/>Agent"]
    B -->|Complaint| E["Escalation<br/>Agent"]
    C --> F{"Needs<br/>Human?"}
    D --> F
    E --> G["Human<br/>Review"]
    F -->|No| H["Auto<br/>Response"]
    F -->|Yes| G
    G --> H
    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style C fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
    style D fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
    style E fill:#f8f9fa,stroke:#ff9800,color:#2c3e50
    style G fill:#16213e,stroke:#fff,color:#fff
    style H fill:#4CAF50,stroke:#fff,color:#fff

Workflow Agent — multi-agent orchestration with branching logic and human-in-the-loop

Supported patterns:

  • Sequential: Agent A → Agent B → Agent C
  • Branching: classify then route to the appropriate agent
  • Group Chat: multiple agents discuss to reach a decision
  • Human-in-the-loop: wait for human approval before proceeding
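The branching and human-in-the-loop patterns above could be expressed in a workflow definition along these lines. This is a hypothetical YAML sketch: the article doesn't show the actual Foundry workflow schema, so every key name here (steps, routes, human_approval, and so on) is an assumption made for illustration only.

```yaml
# Hypothetical workflow definition sketching the triage pattern above.
# The real Foundry workflow YAML schema may use different key names.
name: support-triage
steps:
  - id: classify
    agent: intent-classifier
    routes:                      # branching: route by classified intent
      technical: tech_support
      billing: billing
      complaint: escalation
  - id: tech_support
    agent: tech-support-agent
    next: needs_human
  - id: needs_human
    type: condition              # gate before auto-responding
    when: "output.confidence < 0.8"
    then: human_review
    else: auto_response
  - id: human_review
    type: human_approval         # human-in-the-loop: wait for sign-off
    next: auto_response
```

The visual builder in the Portal produces an equivalent definition, so teams can start visually and drop to YAML for source control.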

2.3 Hosted Agent — Full Control

Hosted Agents are code-based, packaged as containers, and deployed on Foundry. You write the orchestration logic entirely in code — using Agent Framework, LangGraph, or any framework. Foundry handles runtime, scaling, and infrastructure.

// Hosted Agent with Microsoft Agent Framework + .NET
using Microsoft.Agents;

var builder = AgentApplication.CreateBuilder(args);

builder.AddAgent("research-agent", agent =>
{
    agent.WithModel("gpt-5.4")
         .WithInstructions("Research and synthesize information.")
         .WithTools(tools =>
         {
             tools.AddWebSearch();
             tools.AddFileSearch();
         });
});

builder.AddAgent("writer-agent", agent =>
{
    agent.WithModel("gpt-5.4-mini")
         .WithInstructions("Write reports from research results.");
});

// Orchestration logic
builder.AddOrchestrator(orchestrator =>
{
    orchestrator.Route("research-agent")
                .Then("writer-agent")
                .WithHandoff(context => context.LastAgentOutput);
});

var app = builder.Build();
await app.RunAsync();

Note: Hosted Agent is in Preview

Hosted Agents don't currently support private networking during preview. If you require VNet isolation, use Prompt Agent or Workflow Agent.

Comparing the 3 Agent Types

| Criteria | Prompt Agent | Workflow Agent | Hosted Agent |
| --- | --- | --- | --- |
| Code required? | No | No (YAML optional) | Yes |
| Hosting | Fully managed | Fully managed | Container, managed |
| Orchestration | Single agent | Multi-agent, branching | Custom logic |
| Private networking | Yes | Yes | No (Preview) |
| Best for | Prototype, FAQ bot | Multi-step automation | Full control, custom framework |

3. Tools & MCP — Extending Agent Capabilities

Foundry Agent Service provides a rich tool ecosystem, with native support for MCP (Model Context Protocol) servers — the standard protocol for connecting agents to external data and services.

Built-in Tools

  • Code Interpreter: Python for data processing, calculations, chart generation
  • File Search: vector search across documents (PDF, DOCX, TXT...) with automatic chunking
  • Web Search: real-time internet search (powered by Bing)
  • Memory (Preview): long-term memory store across sessions
  • Computer Use (Preview): UI automation via screenshots + clicks

MCP Server Integration

Foundry supports remote MCP servers with 4 authentication methods:

| Method | Description | Use Case |
| --- | --- | --- |
| Key-based | Shared access key via custom connections | Internal APIs, third-party services |
| Entra Agent Identity | Service-to-service auth without credentials | Azure resources, Microsoft Graph |
| Project Managed Identity | Automatic per-project isolation | Multi-tenant setups |
| OAuth Identity Passthrough | User-delegated access (OBO) | OneDrive, Salesforce, SaaS APIs |
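Wiring a remote MCP server into an agent follows the same tool-definition pattern. The sketch below shows the key-based variant; the McpToolDefinition type, its property names, and the connection ID are all assumptions for illustration, not a confirmed SDK contract.

```csharp
// Sketch: attaching a remote MCP server with key-based auth.
// McpToolDefinition and its properties are illustrative assumptions —
// consult the SDK reference for the actual shape.
var mcpTool = new McpToolDefinition
{
    ServerLabel = "internal-crm",
    ServerUrl = new Uri("https://mcp.example.internal/crm"), // hypothetical server
    ConnectionId = "crm-connection" // custom connection holding the shared key
};

var crmAgent = await agentClient.CreateAgentAsync(new CreateAgentOptions
{
    Model = "gpt-5.4",
    Name = "crm-assistant",
    Instructions = "Answer questions using the CRM tools exposed over MCP.",
    Tools = { mcpTool }
});
```

For Azure-hosted resources, swapping the key-based connection for an Entra Agent Identity removes the shared secret entirely.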

Toolbox — Centralized Tool Management

Toolbox (Preview) lets you define a curated set of tools once, manage them centrally in Foundry, and expose them through a single MCP-compatible endpoint. Supports versioning — create a new version, test it, then promote to default when ready. Any MCP client can consume a Toolbox, regardless of framework.

4. Model Router — Smart Model Selection Per Request

One of Foundry's most impressive features is Model Router (GA). Instead of locking in a single model for all requests, Model Router dynamically selects the best-fit model based on cost, performance, and quality.

graph LR
    A["User<br/>Request"] --> B["Model Router"]
    B -->|"Classification,<br/>extraction"| C["GPT-5.4 Mini<br/>Low cost"]
    B -->|"Complex<br/>reasoning"| D["GPT-5.4 Pro<br/>High quality"]
    B -->|"General<br/>tasks"| E["GPT-5.4<br/>Balanced"]
    B -->|"Open-source<br/>preference"| F["Llama / DeepSeek<br/>Full control"]
    C --> G["Response"]
    D --> G
    E --> G
    F --> G
    style A fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style B fill:#e94560,stroke:#fff,color:#fff
    style C fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
    style D fill:#f8f9fa,stroke:#ff9800,color:#2c3e50
    style E fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style F fill:#f8f9fa,stroke:#16213e,color:#2c3e50
    style G fill:#4CAF50,stroke:#fff,color:#fff

Model Router — dynamic model selection based on request characteristics

Practical benefits:

  • 40-60% cost reduction: route simple tasks (classification, extraction) to cheaper models
  • Higher quality: complex tasks are automatically assigned to the most capable model
  • Zero code changes: same API endpoint, Model Router handles routing behind the scenes
  • Ideal for multi-agent systems: each agent in a pipeline can use a different model without hardcoding
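Because routing happens behind the same endpoint with zero code changes, adopting it can be as simple as pointing the agent's Model at a router deployment instead of a fixed model. A sketch, assuming the router deployment is named "model-router" in your project (the name is an assumption, not a fixed convention confirmed here):

```csharp
// Sketch: the agent targets a Model Router deployment rather than a fixed
// model. "model-router" is an assumed deployment name — substitute whatever
// your router deployment is called in the Foundry project.
var triageAgent = await agentClient.CreateAgentAsync(new CreateAgentOptions
{
    Model = "model-router", // router picks Mini / Pro / base gpt-5.4 per request
    Name = "triage-agent",
    Instructions = "Handle both simple classification and complex reasoning."
});
```

Everything else (threads, messages, runs) works exactly as in the earlier examples; only the model name changes.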

5. Voice Live — Speech-to-Speech in a Single API

Voice Live (Preview) collapses the traditional STT → LLM → TTS pipeline into a single managed real-time API. Instead of deploying 3 separate services, you just connect Voice Live to your existing agent.

graph LR
    subgraph "Traditional"
    A1["Audio In"] --> B1["STT Service"]
    B1 --> C1["LLM"]
    C1 --> D1["TTS Service"]
    D1 --> E1["Audio Out"]
    end

    subgraph "Voice Live"
    A2["Audio In"] --> B2["Voice Live<br/>Managed API"]
    B2 --> C2["Foundry Agent<br/>(prompt + tools)"]
    C2 --> B2
    B2 --> E2["Audio Out"]
    end
    style B1 fill:#f8f9fa,stroke:#e0e0e0,color:#888
    style C1 fill:#f8f9fa,stroke:#e0e0e0,color:#888
    style D1 fill:#f8f9fa,stroke:#e0e0e0,color:#888
    style B2 fill:#e94560,stroke:#fff,color:#fff
    style C2 fill:#2c3e50,stroke:#fff,color:#fff

Voice Live collapses STT + LLM + TTS into one managed pipeline

Built-in capabilities:

  • Semantic Voice Activity Detection: recognizes when users are speaking based on context, not just volume
  • End-of-turn Detection: detects when users have finished speaking based on sentence semantics
  • Noise Suppression & Echo Cancellation: server-side, no client-side processing needed
  • Barge-in Support: users can interrupt the agent at any time

Key point: the agent's prompt, tools, and configuration remain unchanged — Voice Live only handles the audio pipeline. A text-based agent can enable Voice Live without modifying any logic.
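Since the API is real-time, a client talks to it over a streaming connection rather than request/response. The sketch below shows the general shape with a raw WebSocket; the endpoint path, query parameter, and token handling are assumptions for illustration, and the preview API surface may differ.

```csharp
// Sketch: opening a Voice Live session against an existing agent over
// WebSocket. Endpoint path, "agent-id" parameter, and auth header are
// illustrative assumptions about the preview API.
using System.Net.WebSockets;

string token = "<entra-access-token>"; // placeholder: obtain via Azure.Identity

var ws = new ClientWebSocket();
ws.Options.SetRequestHeader("Authorization", $"Bearer {token}");
await ws.ConnectAsync(
    new Uri("wss://your-project.services.ai.azure.com/voice-live/realtime"
            + "?agent-id=" + agent.Id),
    CancellationToken.None);

// From here: stream microphone frames up, play returned audio frames back.
// The agent's prompt, tools, and configuration are untouched.
```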

6. Agent Memory — Remembering Across Sessions

Memory (Public Preview) is a managed long-term memory store with automatic extraction, consolidation, and retrieval across sessions. Instead of building your own vector DB + embedding pipeline, Foundry handles everything.

How Memory Works

  • Automatic extraction: agent automatically extracts facts and preferences from conversations
  • Consolidation: merges and deduplicates information across sessions
  • Contextual retrieval: injects relevant memories into the prompt at runtime
  • Scoped access: memory can be scoped per user, session, or tenant
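Scoped access would plausibly be expressed at agent creation, in the same tool-definition style as the rest of the SDK. The sketch below is an assumption end to end: MemoryToolDefinition, Scope, and MemoryScope are illustrative names, since the article doesn't show the preview API.

```csharp
// Sketch: enabling the managed memory store scoped per user.
// MemoryToolDefinition / MemoryScope are assumed names for illustration —
// the public preview API may differ.
var assistant = await agentClient.CreateAgentAsync(new CreateAgentOptions
{
    Model = "gpt-5.4",
    Name = "personal-assistant",
    Instructions = "Remember user preferences across sessions.",
    Tools =
    {
        new MemoryToolDefinition { Scope = MemoryScope.User } // per-user isolation
    }
});
```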

7. Enterprise Security — Production-Grade from Day One

This is where Foundry Agent Service truly differentiates from self-hosted agents. Enterprise infrastructure is built-in:

7.1 Private Networking

Agents run inside your Azure Virtual Network — zero public egress. MCP servers, Azure AI Search, and Fabric data agents all communicate over private paths. This is mandatory for many industries: finance, healthcare, government.

7.2 Agent Identity

Each agent can have its own dedicated Microsoft Entra identity — authenticating to resources and APIs without sharing credentials. Agent identities can authenticate to external MCP servers with OAuth OBO passthrough support.

7.3 Content Safety

Integrated content filters help mitigate prompt injection (including cross-prompt injection — XPIA) and prevent unsafe outputs. Partner integrations from Palo Alto Prisma AIRS + Zenity provide runtime guardrails for data leakage and tool misuse detection.

7.4 Observability & Evaluation

graph LR
    A["Agent<br/>Runtime"] --> B["Tracing<br/>End-to-end"]
    A --> C["Metrics<br/>Application Insights"]
    A --> D["Evaluations"]
    D --> E["Built-in<br/>coherence, relevance,<br/>groundedness"]
    D --> F["Custom<br/>Business logic"]
    D --> G["Continuous<br/>Live traffic sampling"]
    G --> H["Azure Monitor<br/>Alerts"]
    style A fill:#2c3e50,stroke:#fff,color:#fff
    style B fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style C fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style D fill:#e94560,stroke:#fff,color:#fff
    style H fill:#ff9800,stroke:#fff,color:#fff

Observability pipeline — from tracing to continuous evaluation with alerts

Three-layer evaluation system:

  1. Out-of-the-box evaluators: coherence, relevance, groundedness, retrieval quality, safety
  2. Custom evaluators: business logic and domain-specific criteria
  3. Continuous monitoring: live traffic sampling with Azure Monitor + alerting

8. Development Lifecycle — From Prototype to Production

Step 1 — Create
Define a Prompt Agent in the Portal or build a Hosted Agent in code. Agent Manifests provide pre-built templates for common workflows.
Step 2 — Test
Chat directly with your agent in the Agents Playground. MCP server integrations can be tested right in the playground via chat prompts.
Step 3 — Trace
Inspect every model call, tool invocation, and decision through agent tracing. See exactly what your agent "thinks" at each step.
Step 4 — Evaluate
Run evaluations to measure quality and catch regressions. Compare across versions.
Step 5 — Publish
Promote your agent to a managed resource with a stable endpoint. Automatic versioning — rollback anytime.
Step 6 — Distribute
Share agents through Microsoft 365 Copilot, Teams, and Entra Agent Registry — deliver agents where users already work.

9. Comparison with Other Platforms

| Criteria | Foundry Agent Service | AWS Bedrock Agents | Vertex AI Agent Builder |
| --- | --- | --- | --- |
| Model diversity | GPT, Llama, DeepSeek, Claude, xAI | Claude, Llama, Mistral, Titan | Gemini, Claude, Llama |
| No-code agents | Yes — Prompt Agent + Portal | Yes — Console wizard | Yes — Agent Builder UI |
| Multi-agent orchestration | Yes — Workflow + Hosted Agent | Preview — Multi-agent collaboration | Limited — Agent Engine |
| MCP support | Yes — Native with 4 auth methods | No | Limited |
| Voice integration | Yes — Voice Live (Preview) | No | No |
| Model Router | Yes — GA | Limited — Intelligent prompt routing | No |
| Private networking | Yes — Full VNet isolation | Yes — VPC | Yes — VPC-SC |
| Distribution | Teams, M365 Copilot, Entra Registry | Slack, API Gateway | Dialogflow, API |

10. Code Example: End-to-End Customer Support Agent

Here's a complete example creating a customer support agent with file search, memory, and function calling:

using Azure.AI.Projects;
using Azure.Identity;

// 1. Initialize client
var projectClient = new AIProjectClient(
    new Uri("https://your-project.services.ai.azure.com"),
    new DefaultAzureCredential());

var agentClient = projectClient.GetAgentClient();

// 2. Upload knowledge base
var fileClient = projectClient.GetFileClient();
var file = await fileClient.UploadFileAsync(
    "product-catalog.pdf",
    FilePurpose.Agents);

var vectorStore = await agentClient.CreateVectorStoreAsync(new CreateVectorStoreOptions
{
    Name = "product-knowledge-base",
    FileIds = { file.Id }
});

// 3. Define custom function tool
var checkOrderStatus = new FunctionToolDefinition
{
    Name = "check_order_status",
    Description = "Check order status by order ID",
    Parameters = BinaryData.FromObjectAsJson(new
    {
        type = "object",
        properties = new
        {
            order_id = new { type = "string", description = "The order ID" }
        },
        required = new[] { "order_id" }
    })
};

// 4. Create agent
var agent = await agentClient.CreateAgentAsync(new CreateAgentOptions
{
    Model = "gpt-5.4",
    Name = "customer-support",
    Instructions = """
        You are a customer support assistant for an online store.
        - Be friendly and professional.
        - Use file search to find product information.
        - Use check_order_status to look up orders.
        - If unsure, ask the customer for clarification.
        """,
    Tools =
    {
        new FileSearchToolDefinition(),
        checkOrderStatus
    },
    ToolResources = new ToolResources
    {
        FileSearch = new FileSearchToolResources
        {
            VectorStoreIds = { vectorStore.Id }
        }
    }
});

Console.WriteLine($"Agent created: {agent.Id}");

11. Roadmap and Future

October 2025
Azure AI Foundry Agent Service launches in Preview. Prompt Agent support with basic tools.
January 2026
Hosted Agents added, expanded model catalog (DeepSeek, xAI). First MCP server support.
March 2026
Generally Available — private networking, Voice Live preview, Model Router GA, Evaluations GA. 6 new regions for Hosted Agents.
April 2026
Memory Public Preview, Workflow Agents preview, OAuth Identity Passthrough for MCP, Toolbox preview.

Conclusion

Microsoft Foundry Agent Service is reshaping how enterprises build and deploy AI agents. With 3 agent types covering every use case, Model Router for cost optimization, Voice Live expanding into audio, and built-in enterprise security — this is the most compelling platform for any organization looking to bring AI agents to production on Azure infrastructure.

The biggest strength: its MCP-first approach and Toolbox — allowing agents to connect with the broader ecosystem through a standard protocol, avoiding vendor lock-in.
