GraphQL Federation — Unifying Microservices APIs into a Single Supergraph
Posted on: 4/22/2026 2:12:26 AM
Table of contents
- 1. The Problem: Traditional API Gateways Don't Scale with Organizations
- 2. What Is GraphQL Federation?
- 3. How Federation Composes Schemas: Key Directives
- 4. Apollo Router — The Brain of the Supergraph
- 5. Designing Subgraphs the Right Way
- 6. Schema Composition and CI/CD
- 7. Query Planning and Execution — Deep Dive into Router Internals
- 8. Security and Authorization in Federation
- 9. Performance Optimization
- 10. Federation in .NET with Hot Chocolate
- 11. Monitoring and Observability
- 12. When NOT to Use Federation
- 13. Alternatives Comparison
- Conclusion
In the world of microservices, each team owns a separate service with its own API schema. But on the client side — web apps, mobile apps — all they want is a single endpoint to query all the data they need. This is exactly the problem GraphQL Federation solves: it allows multiple subgraphs (each subgraph is a microservice) to be composed into a single supergraph, with a gateway in the middle orchestrating everything.
1. The Problem: Traditional API Gateways Don't Scale with Organizations
As microservices systems grow to dozens or hundreds of services, the traditional REST API Gateway model faces serious issues:
- Over-fetching / Under-fetching: Clients must call multiple separate endpoints, receive too much or too little data, then stitch results together manually.
- Gateway monolith: A central team must maintain all routing logic, becoming an organizational bottleneck.
- Schema coupling: When service A adds a new field, the gateway must redeploy — even though services B and C are unrelated.
- API versioning hell: REST forces maintaining v1, v2, v3… in parallel, increasing complexity over time.
Why not just use regular GraphQL?
Monolithic GraphQL solves over-fetching but creates a single schema monolith — one massive schema file that every team must commit to the same repo. Federation removes that constraint by letting each team own their own subgraph.
2. What Is GraphQL Federation?
GraphQL Federation is an architecture that lets you split your GraphQL schema into multiple independent subgraphs, each owned by a separate team/service. A router (gateway) automatically composes all subgraphs into a single supergraph that clients interact with.
graph TD
Client["🖥️ Client App"] --> Router["Apollo Router<br/>(Supergraph Gateway)"]
Router --> S1["Subgraph: Users<br/>Team A"]
Router --> S2["Subgraph: Products<br/>Team B"]
Router --> S3["Subgraph: Orders<br/>Team C"]
Router --> S4["Subgraph: Reviews<br/>Team D"]
style Client fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style Router fill:#e94560,stroke:#fff,color:#fff
style S1 fill:#2c3e50,stroke:#fff,color:#fff
style S2 fill:#2c3e50,stroke:#fff,color:#fff
style S3 fill:#2c3e50,stroke:#fff,color:#fff
style S4 fill:#2c3e50,stroke:#fff,color:#fff
2.1. Three Core Concepts
| Concept | Description | Example |
|---|---|---|
| Subgraph | An independent GraphQL service that defines part of the schema. Each team owns ≥1 subgraph. | Users subgraph defines type User |
| Supergraph | The composed schema, automatically built from all subgraphs. Clients only see the supergraph. | Query { user { orders { items } } } spans 3 subgraphs |
| Router | Gateway that receives client queries, creates a query plan, calls the right subgraphs, and merges results. | Apollo Router, Cosmo Router, Grafbase Gateway |
3. How Federation Composes Schemas: Key Directives
Federation uses special directives that allow subgraphs to "communicate" with each other through the router, without needing to know about each other directly.
3.1. @key — Entity Identification
The @key directive marks a type as an entity — something that can be resolved by multiple subgraphs based on a primary key.
# Subgraph: Users (owned by Team A)
type User @key(fields: "id") {
  id: ID!
  name: String!
  email: String!
  avatar: String
}

type Query {
  user(id: ID!): User
  users: [User!]!
}

# Subgraph: Reviews (owned by Team D)
# Team D "extends" the User type WITHOUT modifying the Users subgraph
type User @key(fields: "id") {
  id: ID!
  reviews: [Review!]! # Team D adds this field
}

type Review @key(fields: "id") {
  id: ID!
  rating: Int!
  comment: String!
  author: User!
}
Entity Resolution
When the router receives the query { user(id: "1") { name reviews { rating } } }, it will: (1) call the Users subgraph to get name, (2) send the Reviews subgraph an _entities query carrying the representation { __typename: "User", id: "1" }, which that subgraph's __resolveReference resolver turns into a User with reviews, then (3) merge the results. The client never knows the data came from 2 different services.
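The three steps can be simulated without any GraphQL library. The following is a minimal sketch, assuming in-memory data in place of real subgraphs; every function and variable name here is hypothetical and none of it is a router API.

```typescript
// Sketch of entity resolution; all names are illustrative, not a router API.
type UserRepresentation = { __typename: "User"; id: string };
type ReviewSummary = { rating: number };

// Hypothetical in-memory data standing in for the two subgraphs' data stores.
const userNames: Record<string, string> = { "1": "Ada" };
const reviewsByAuthor: Record<string, ReviewSummary[]> = {
  "1": [{ rating: 5 }, { rating: 4 }],
};

// Step 1: the Users subgraph answers the root field and returns the key.
function fetchUserFromUsersSubgraph(id: string) {
  return { __typename: "User" as const, id, name: userNames[id] };
}

// Step 2: the Reviews subgraph receives representations (typename + key fields)
// and runs its reference resolver once per representation.
function resolveEntitiesInReviewsSubgraph(representations: UserRepresentation[]) {
  return representations.map((rep) => ({
    reviews: reviewsByAuthor[rep.id] ?? [],
  }));
}

// Step 3: the router merges both partial results into one response object.
function resolveUserQuery(id: string) {
  const user = fetchUserFromUsersSubgraph(id);
  const extras = resolveEntitiesInReviewsSubgraph([
    { __typename: user.__typename, id: user.id },
  ]);
  return { user: { name: user.name, reviews: extras[0]?.reviews ?? [] } };
}

const mergedUserResult = resolveUserQuery("1");
```

Here mergedUserResult holds the name from the first store and both reviews from the second, which is the same shape the client would see from a single endpoint.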
3.2. @external, @requires, @provides — Cross-Subgraph Dependencies
# Subgraph: Products
type Product @key(fields: "id") {
  id: ID!
  name: String!
  price: Float!
  weight: Float!
}

# Subgraph: Shipping
type Product @key(fields: "id") {
  id: ID!
  weight: Float! @external # Declares a field from another subgraph
  shippingCost: Float! @requires(fields: "weight") # Needs weight to compute
}
When the client queries shippingCost, the router automatically fetches weight from the Products subgraph first, then passes it to the Shipping subgraph to calculate the cost.
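On the Shipping side, the resolver simply reads weight from the representation it was handed. A minimal sketch, assuming a made-up pricing rule (flat fee plus a per-kilogram rate); the constants and function name are hypothetical:

```typescript
// Representation the router sends to the Shipping subgraph: the @key field plus
// the @requires field (weight) that it fetched from the Products subgraph first.
type ProductRepresentation = { __typename: "Product"; id: string; weight: number };

// Hypothetical pricing rule: flat fee plus a per-kilogram rate.
const BASE_FEE = 5;
const RATE_PER_KG = 2;

function resolveShippingCost(product: ProductRepresentation): number {
  return BASE_FEE + RATE_PER_KG * product.weight;
}

// 5 + 2 * 3 = 11
const shippingCostExample = resolveShippingCost({ __typename: "Product", id: "p1", weight: 3 });
```

The Shipping subgraph never queries Products itself; it only depends on the weight value arriving in the representation.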
3.3. @shareable — Multiple Subgraphs Resolving the Same Field
# Both Users and Auth subgraphs can resolve email
type User @key(fields: "id") {
  id: ID!
  email: String! @shareable
}
4. Apollo Router — The Brain of the Supergraph
Apollo Router (written in Rust) is the central component in Federation architecture. It replaces the legacy Apollo Gateway (Node.js) with vastly superior performance.
graph LR
Q["Client Query"] --> QP["Query Planner<br/>(determines which subgraphs)"]
QP --> EE["Execution Engine<br/>(calls subgraphs in parallel)"]
EE --> S1["Subgraph A"]
EE --> S2["Subgraph B"]
EE --> S3["Subgraph C"]
S1 --> MR["Response Merger<br/>(combines results)"]
S2 --> MR
S3 --> MR
MR --> R["Response → Client"]
style Q fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style QP fill:#e94560,stroke:#fff,color:#fff
style EE fill:#e94560,stroke:#fff,color:#fff
style MR fill:#e94560,stroke:#fff,color:#fff
style R fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style S1 fill:#2c3e50,stroke:#fff,color:#fff
style S2 fill:#2c3e50,stroke:#fff,color:#fff
style S3 fill:#2c3e50,stroke:#fff,color:#fff
| Criteria | Apollo Gateway (Node.js) | Apollo Router (Rust) |
|---|---|---|
| RAM @ 10K QPS | ~800 MB | ~120 MB |
| Query planning overhead | 3–5 ms | ~0.5 ms |
| Cold start | Several seconds (Node.js boot) | <100 ms |
| Plugin system | JavaScript middleware | Rhai scripting + Rust native |
| Language | JavaScript/TypeScript | Rust |
| Status | Legacy (not recommended) | Production-ready, actively developed |
4.1. Basic Router Configuration
# router.yaml
supergraph:
  listen: 0.0.0.0:4000
  introspection: true
cors:
  origins:
    - https://app.example.com
headers:
  all:
    request:
      - propagate:
          named: "Authorization"
      - propagate:
          named: "X-Request-Id"
telemetry:
  exporters:
    tracing:
      otlp:
        enabled: true
        endpoint: http://otel-collector:4317
limits:
  max_depth: 15
  max_height: 200
coprocessor:
  url: http://auth-service:8080
  router:
    request:
      headers: true
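The coprocessor entry means the router POSTs a JSON payload to auth-service for each incoming request and waits for the (possibly modified) payload to come back. The sketch below shows only the decision logic such a service might contain. It assumes the payload carries version, stage, control and a headers map of string arrays, and that returning control: { break: 401 } stops the request; treat those field names as assumptions to check against the router documentation. The x-auth-checked header is hypothetical.

```typescript
// Minimal shape of the JSON payload exchanged at the RouterRequest stage.
// Only the fields this sketch touches are modeled (assumed shape).
type CoprocessorPayload = {
  version: number;
  stage: string;
  control: "continue" | { break: number };
  headers?: Record<string, string[]>;
};

// Hypothetical rule: reject requests that carry no Authorization header,
// otherwise tag the request and let the router continue.
function handleRouterRequest(payload: CoprocessorPayload): CoprocessorPayload {
  const headers = payload.headers ?? {};
  const authorization = headers["authorization"] ?? [];
  if (authorization.length === 0) {
    return { ...payload, control: { break: 401 } };
  }
  return {
    ...payload,
    control: "continue",
    headers: { ...headers, "x-auth-checked": ["true"] },
  };
}

const rejectedPayload = handleRouterRequest({
  version: 1,
  stage: "RouterRequest",
  control: "continue",
  headers: {},
});
const acceptedPayload = handleRouterRequest({
  version: 1,
  stage: "RouterRequest",
  control: "continue",
  headers: { authorization: ["Bearer token"] },
});
```

A real service would wrap this function in an HTTP handler listening on port 8080 and would validate the token rather than only checking that the header exists.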
5. Designing Subgraphs the Right Way
5.1. Separation of Concerns Principles
Each subgraph should represent a bounded context in the domain — similar to Domain-Driven Design.
graph TD
subgraph "E-commerce Supergraph"
UG["Users Subgraph<br/>Profiles, Auth, Preferences"]
PG["Products Subgraph<br/>Catalog, Inventory, Pricing"]
OG["Orders Subgraph<br/>Cart, Checkout, Fulfillment"]
RG["Reviews Subgraph<br/>Ratings, Comments, Moderation"]
SG["Search Subgraph<br/>Full-text, Filters, Suggestions"]
end
UG -.->|"@key: User.id"| RG
PG -.->|"@key: Product.id"| OG
PG -.->|"@key: Product.id"| RG
UG -.->|"@key: User.id"| OG
PG -.->|"@key: Product.id"| SG
style UG fill:#e94560,stroke:#fff,color:#fff
style PG fill:#2c3e50,stroke:#fff,color:#fff
style OG fill:#4CAF50,stroke:#fff,color:#fff
style RG fill:#ff9800,stroke:#fff,color:#fff
style SG fill:#9C27B0,stroke:#fff,color:#fff
5.2. Entity Ownership vs. Contribution
The Golden Rule
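Treat one subgraph as the owner of each entity: it defines the core fields and the primary resolver. Other subgraphs only contribute additional fields to that entity by declaring the same type with the same @key. For example: the Users subgraph owns User.name and User.email; the Reviews subgraph contributes User.reviews. Federation 2 composition does not enforce a single owner, so this is a team convention rather than a schema rule.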
5.3. Avoiding Circular Dependencies
# ❌ AVOID: Deep circular references
# Users subgraph
type User @key(fields: "id") {
  id: ID!
  orders: [Order!]! # User → Order
}

# Orders subgraph
type Order @key(fields: "id") {
  id: ID!
  buyer: User! # Order → User
  items: [OrderItem!]!
}

type OrderItem {
  product: Product! # OrderItem → Product
}

# Products subgraph
type Product @key(fields: "id") {
  id: ID!
  reviews: [Review!]! # Product → Review
}

# Reviews subgraph
type Review {
  author: User! # Review → User → cycle!
}

# ✅ BETTER: Keep references unidirectional, use IDs when needed
# Reviews subgraph
type Review @key(fields: "id") {
  id: ID!
  rating: Int!
  comment: String!
  authorId: ID! # Option 1: return only the ID and let the client resolve it if needed
  author: User! # Option 2: still resolve, but shallow (only @key fields)
}
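One way to spot such loops early is to model "type A has a field returning type B" as a directed graph and search it for cycles. The sketch below is a plain depth-first search over a hand-written graph; the two example graphs are assumptions based on the schemas above, not the output of any Federation tool. Cyclic references are legal GraphQL, so a check like this would flag them for review rather than reject them.

```typescript
// Edges mean "type A has a field that returns type B".
type TypeGraph = Record<string, string[]>;

// Depth-first search that returns the first reference cycle found, or null.
function findReferenceCycle(graph: TypeGraph): string[] | null {
  const visiting = new Set<string>();
  const done = new Set<string>();

  function visit(node: string, path: string[]): string[] | null {
    if (visiting.has(node)) {
      // The node is already on the current path: slice out the loop.
      return [...path.slice(path.indexOf(node)), node];
    }
    if (done.has(node)) return null;
    visiting.add(node);
    for (const next of graph[node] ?? []) {
      const cycle = visit(next, [...path, node]);
      if (cycle) return cycle;
    }
    visiting.delete(node);
    done.add(node);
    return null;
  }

  for (const start of Object.keys(graph)) {
    const cycle = visit(start, []);
    if (cycle) return cycle;
  }
  return null;
}

// Graph for the "avoid" schemas.
const circularGraph: TypeGraph = {
  User: ["Order"],
  Order: ["User", "OrderItem"],
  OrderItem: ["Product"],
  Product: ["Review"],
  Review: ["User"],
};

// Hypothetical one-way variant: User and Product no longer point back.
const oneWayGraph: TypeGraph = {
  Order: ["User", "OrderItem"],
  OrderItem: ["Product"],
  Review: ["User"],
  User: [],
  Product: [],
};

const detectedCycle = findReferenceCycle(circularGraph);
const noCycle = findReferenceCycle(oneWayGraph);
```

For circularGraph the search reports the shortest loop it meets first, User, Order, User; for oneWayGraph it returns null.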
6. Schema Composition and CI/CD
In production, schema composition happens in the CI/CD pipeline — not at runtime. This ensures every breaking change is detected before deployment.
graph LR
Dev["Developer pushes<br/>subgraph change"] --> CI["CI Pipeline"]
CI --> Check["rover subgraph check<br/>(composition + breaking change detection)"]
Check -->|Pass| Pub["rover subgraph publish<br/>(update schema registry)"]
Check -->|Fail| Fix["Fix schema<br/>incompatibility"]
Pub --> Router["Router hot-reloads<br/>new supergraph schema"]
Fix --> Dev
style Dev fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style CI fill:#2c3e50,stroke:#fff,color:#fff
style Check fill:#e94560,stroke:#fff,color:#fff
style Pub fill:#4CAF50,stroke:#fff,color:#fff
style Fix fill:#ff9800,stroke:#fff,color:#fff
style Router fill:#2c3e50,stroke:#fff,color:#fff
6.1. Rover CLI — Schema Management Tool
# Check if a subgraph change is compatible
rover subgraph check my-graph@production \
  --name users \
  --schema ./users/schema.graphql

# Publish subgraph to the registry
rover subgraph publish my-graph@production \
  --name users \
  --schema ./users/schema.graphql \
  --routing-url http://users-service:4001/graphql

# Fetch the current supergraph schema
rover supergraph fetch my-graph@production
6.2. Breaking Change Detection
| Change Type | Severity | Example |
|---|---|---|
| Remove a field in active use | 🔴 Breaking | Remove User.email while 50 operations use it |
| Change field type | 🔴 Breaking | price: Float! → price: String! |
| Add required argument (no default value) | 🔴 Breaking for callers of that field | users(limit: Int!): existing operations that omit limit fail validation |
| Add new field | 🟢 Safe | Add User.lastLoginAt |
| Deprecate field | 🟢 Safe | email: String! @deprecated(reason: "Use contactEmail") |
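The first, second and fourth rows of the table can be expressed as a small diff over two schema versions. This is a simplified sketch, assuming each schema is flattened into a map from "Type.field" to its type string; it ignores operation usage, arguments and deprecation, which a real schema check also considers. All names are hypothetical.

```typescript
// A schema is modeled as a flat map of "Type.field" to its GraphQL type string.
type FieldMap = Record<string, string>;
type SchemaChange = {
  field: string;
  kind: "removed" | "type-changed" | "added";
  breaking: boolean;
};

function diffFields(before: FieldMap, after: FieldMap): SchemaChange[] {
  const changes: SchemaChange[] = [];
  for (const field of Object.keys(before)) {
    const newType = after[field];
    if (newType === undefined) {
      // Field disappeared: existing operations selecting it will fail.
      changes.push({ field, kind: "removed", breaking: true });
    } else if (newType !== before[field]) {
      // Same field, different type: clients may parse the value incorrectly.
      changes.push({ field, kind: "type-changed", breaking: true });
    }
  }
  for (const field of Object.keys(after)) {
    if (!(field in before)) {
      // Purely additive change: existing operations are unaffected.
      changes.push({ field, kind: "added", breaking: false });
    }
  }
  return changes;
}

const fieldChanges = diffFields(
  { "User.email": "String!", "Product.price": "Float!" },
  { "Product.price": "String!", "User.lastLoginAt": "String" },
);
const breakingChanges = fieldChanges.filter((change) => change.breaking);
```

In this example breakingChanges contains the removal of User.email and the type change of Product.price, while User.lastLoginAt is reported as a safe addition.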
7. Query Planning and Execution — Deep Dive into Router Internals
When the router receives a query, it performs query planning — analyzing the query into an execution plan that calls the right subgraphs in the correct dependency order.
# Client sends this query:
query GetUserDashboard($userId: ID!) {
  user(id: $userId) {
    name # → Users subgraph
    email # → Users subgraph
    orders(last: 5) { # → Orders subgraph
      id
      total
      items {
        product {
          name # → Products subgraph
          price # → Products subgraph
        }
      }
    }
    reviews { # → Reviews subgraph
      rating
      comment
    }
  }
}
Router's Query Plan
The router creates an execution plan: Fetch User → Fetch Orders + Reviews (in parallel) → Fetch Products. Subgraph calls that don't depend on each other run in parallel; Products has to wait for Orders because the product keys come from the order items. The router also caches query plans (plan caching), so the same query next time only takes ~0.01ms instead of ~0.5ms.
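The staging and caching ideas can be sketched in a few lines. This is an illustration only, assuming each fetch is described by the subgraph it targets and the fetches it depends on; the real query planner works on the schema and the operation, not on a hand-written dependency list. All names are hypothetical.

```typescript
// One fetch per subgraph, with the fetches whose results it needs first.
type FetchNode = { subgraph: string; dependsOn: string[] };

// Group fetches into stages: everything in a stage can run in parallel,
// and each stage only starts after the previous one has finished.
function planStages(fetches: FetchNode[]): string[][] {
  const stages: string[][] = [];
  const finished = new Set<string>();
  let remaining = fetches.slice();
  while (remaining.length > 0) {
    const ready = remaining.filter((f) => f.dependsOn.every((d) => finished.has(d)));
    if (ready.length === 0) throw new Error("cyclic or unsatisfiable dependencies");
    stages.push(ready.map((f) => f.subgraph));
    ready.forEach((f) => finished.add(f.subgraph));
    remaining = remaining.filter((f) => ready.indexOf(f) === -1);
  }
  return stages;
}

// Plan cache keyed by the operation text, so repeated queries skip planning.
const planCache = new Map<string, string[][]>();
let planningRuns = 0;

function getPlan(operation: string, fetches: FetchNode[]): string[][] {
  const cached = planCache.get(operation);
  if (cached) return cached;
  planningRuns += 1;
  const plan = planStages(fetches);
  planCache.set(operation, plan);
  return plan;
}

const dashboardFetches: FetchNode[] = [
  { subgraph: "users", dependsOn: [] },
  { subgraph: "orders", dependsOn: ["users"] },
  { subgraph: "reviews", dependsOn: ["users"] },
  { subgraph: "products", dependsOn: ["orders"] },
];
const dashboardPlan = getPlan("GetUserDashboard", dashboardFetches);
getPlan("GetUserDashboard", dashboardFetches); // second call is served from the cache
```

For the dashboard query this yields three stages (users, then orders and reviews together, then products), and planningRuns stays at 1 after the second call.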
8. Security and Authorization in Federation
Authorization in a federated graph is more complex than in a monolith because data comes from multiple services. There are 2 common patterns:
8.1. Router-level Authentication + Subgraph-level Authorization
# router.yaml: JWT authentication at the router
authentication:
  router:
    jwt:
      jwks: # a list, so several key sets can be configured
        - url: https://auth.example.com/.well-known/jwks.json
      header_name: Authorization
      header_value_prefix: Bearer
# Subgraph: Orders (rules are declared in the subgraph schema; the router enforces them)
type Order @key(fields: "id") @authenticated {
  id: ID!
  total: Float!
  buyerId: ID!
  items: [OrderItem!]! @requiresScopes(scopes: [["order:read"]])
  internalNotes: String @requiresScopes(scopes: [["admin"]])
}
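The nested list in @requiresScopes reads as "any of the inner lists, all scopes within a list": the outer array is an OR, each inner array is an AND. A minimal sketch of that evaluation, with a hypothetical function name and made-up token scopes:

```typescript
// Outer array = alternatives (OR); inner array = scopes that must all be present (AND).
function satisfiesScopes(required: string[][], granted: string[]): boolean {
  return required.some((allOf) => allOf.every((scope) => granted.indexOf(scope) !== -1));
}

// A token carrying "order:read" and "profile" can read items but not internalNotes.
const canReadItems = satisfiesScopes([["order:read"]], ["order:read", "profile"]);
const canReadNotes = satisfiesScopes([["admin"]], ["order:read", "profile"]);

// (order:read AND order:write) OR admin: an admin-only token passes.
const canUseEither = satisfiesScopes([["order:read", "order:write"], ["admin"]], ["admin"]);
```

This is why scopes: [["order:read"]] is written with two pairs of brackets even when only one scope is required.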
8.2. @policy Directive for Business Rules
type User @key(fields: "id") {
  id: ID!
  name: String!
  email: String! @policy(policies: [["self_or_admin"]])
  salary: Float! @policy(policies: [["hr_department"]])
}
9. Performance Optimization
9.1. DataLoader Pattern — Solving N+1 Queries
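When a query returns a list of orders with buyer info, the router batches the buyer keys into a single _entities request to the Users subgraph, but inside that subgraph __resolveReference still runs once per buyer. Without DataLoader that means N separate database lookups (one per order); DataLoader batches them into a single call.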
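// Users subgraph: reference resolver backed by DataLoader
import DataLoader from 'dataloader';

// `db` and `User` come from the subgraph's own data layer (not shown here).
// Create one loader per request so cached users never leak between callers.
const createUserLoader = () =>
  new DataLoader<string, User | null>(async (ids) => {
    const users = await db.users.findMany({
      where: { id: { in: [...ids] } }
    });
    const userMap = new Map(users.map(u => [u.id, u]));
    // DataLoader expects one result per key, in the same order as `ids`.
    return ids.map(id => userMap.get(id) ?? null);
  });

const resolvers = {
  User: {
    // `context.userLoader` is assumed to be set to createUserLoader() for each request.
    __resolveReference(
      ref: { id: string },
      context: { userLoader: DataLoader<string, User | null> }
    ) {
      return context.userLoader.load(ref.id);
    }
  }
};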
9.2. @defer — Streaming Responses
# Client receives name/email immediately, reviews load later (streaming)
query {
  user(id: "1") {
    name
    email
    ... @defer(label: "reviews") {
      reviews {
        rating
        comment
      }
    }
  }
}
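On the client, a deferred response arrives as an initial payload followed by one or more patches that name where their data belongs. The sketch below merges one patch into the initial result. It assumes each patch carries a label, a path and a data object, which is how incremental delivery is commonly represented; check the exact wire format your router and client library use. The sample values are made up.

```typescript
type JsonObject = { [key: string]: unknown };
type IncrementalPatch = { label?: string; path: (string | number)[]; data: JsonObject };

// Merge one deferred patch into the result received so far (mutates `result`).
function applyPatch(result: JsonObject, patch: IncrementalPatch): JsonObject {
  let target: any = result;
  // Walk down to the object the patch refers to, e.g. ["user"].
  for (const segment of patch.path) {
    target = target[segment];
  }
  // Copy the deferred fields next to the fields that arrived first.
  for (const key of Object.keys(patch.data)) {
    target[key] = patch.data[key];
  }
  return result;
}

// Initial payload: only the non-deferred fields.
const initialPayload: JsonObject = { user: { name: "Ada", email: "ada@example.com" } };

// Later payload: the fragment labeled "reviews".
const deferredPatch: IncrementalPatch = {
  label: "reviews",
  path: ["user"],
  data: { reviews: [{ rating: 5, comment: "Great" }] },
};

const mergedPayload = applyPatch(initialPayload, deferredPatch) as any;
```

GraphQL client libraries that support @defer perform this merge for you; the sketch only shows what happens underneath.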
9.3. Persisted Queries
# router.yaml
persisted_queries:
  enabled: true
  safelist:
    enabled: true # Only allow registered queries
    require_id: true # Client sends hash instead of full query text
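With require_id enabled, the client sends an operation ID instead of the query text. The sketch below builds such a request body, assuming the ID is the SHA-256 hex digest of the operation string and travels in the persistedQuery extension. In a real setup the ID should be taken from the manifest that was registered, since tooling may normalize the operation before hashing. The function names and the sample query are hypothetical.

```typescript
import { createHash } from "node:crypto";

// SHA-256 hex digest of the operation text, used here as the operation ID.
function operationId(query: string): string {
  return createHash("sha256").update(query).digest("hex");
}

// Request body that carries only the ID and the variables, not the query text.
function persistedRequestBody(query: string, variables: Record<string, unknown>) {
  return {
    variables,
    extensions: { persistedQuery: { version: 1, sha256Hash: operationId(query) } },
  };
}

const userQuery = "query GetUser($id: ID!) { user(id: $id) { name } }";
const persistedBody = persistedRequestBody(userQuery, { id: "1" });
```

Because the body has no query field, a router configured with the safelist above can reject anything whose ID is not registered.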
Performance Note
Avoid designing subgraphs with too many cross-entity references: each hop between subgraphs adds ~2-5ms of network latency. If a query must pass through 4-5 subgraphs sequentially, that is roughly 8-25ms of overhead before any subgraph does real work. Design so that most queries only need 1-2 hops.
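The overhead estimate is plain multiplication of sequential hops by per-hop latency. A tiny sketch using the per-hop range quoted above (the function name is hypothetical, and parallel fetches within a stage are counted as one hop):

```typescript
// Overhead from sequential subgraph hops, ignoring the subgraphs' own work.
function sequentialOverheadMs(hops: number, perHopMs: number): number {
  return hops * perHopMs;
}

const bestCaseMs = sequentialOverheadMs(4, 2); // 4 hops at 2 ms each = 8 ms
const worstCaseMs = sequentialOverheadMs(5, 5); // 5 hops at 5 ms each = 25 ms
const twoHopWorstMs = sequentialOverheadMs(2, 5); // 2 hops at 5 ms each = 10 ms
```

Keeping a query to two sequential hops caps this overhead at about 10 ms even at the slow end of the range.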
10. Federation in .NET with Hot Chocolate
If you're using .NET, Hot Chocolate (ChilliCream) has full Federation 2 support:
// Program.cs: .NET 10 subgraph
var builder = WebApplication.CreateBuilder(args);

builder.Services
    .AddGraphQLServer()
    .AddApolloFederationV2()
    .AddQueryType<Query>()
    .AddType<UserType>()
    .AddType<ReviewType>()
    .RegisterService<IReviewRepository>();

var app = builder.Build();
app.MapGraphQL();
app.Run();

// UserType.cs: entity type with @key
[Key("id")]
public class UserType : ObjectType<User>
{
    protected override void Configure(IObjectTypeDescriptor<User> descriptor)
    {
        descriptor.Field(u => u.Id).Type<NonNullType<IdType>>();
        descriptor.Field("reviews")
            .ResolveWith<UserResolvers>(r => r.GetReviews(default!, default!));
    }
}

// Reference resolver: called by the router to resolve the User entity
[ReferenceResolver]
public static async Task<User?> GetUserById(
    [ID] int id,
    IReviewRepository repo)
{
    return await repo.GetUserReferenceAsync(id);
}
11. Monitoring and Observability
In a federated architecture, observability is especially important because a single query can traverse multiple subgraphs:
11.1. Distributed Tracing
# router.yaml: OpenTelemetry tracing
telemetry:
  exporters:
    tracing:
      otlp:
        enabled: true
        endpoint: http://otel-collector:4317
        protocol: grpc
  instrumentation:
    spans:
      router:
        attributes:
          graphql.document: true
      subgraph:
        attributes:
          subgraph.name: true
          graphql.operation.name: true
11.2. Key Metrics to Monitor
| Metric | Description | Alert Threshold |
|---|---|---|
| apollo.router.query_planning.duration | Time to create query plan | >5ms (may need plan cache warming) |
| apollo.router.http.request.duration | Total end-to-end request time | p99 >500ms |
| subgraph.request.duration per subgraph | Latency per subgraph | Slowest subgraph = bottleneck |
| apollo.router.cache.hit_rate | Query plan cache hit rate | <90% is concerning |
| graphql.error.count | GraphQL error count (partial responses) | Sudden spike = subgraph down |
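Two of these thresholds (p99 request time above 500ms, plan cache hit rate below 90%) can be checked with a few lines once the raw numbers are exported. The sketch below assumes you already have a list of request durations and hit/miss counters from your metrics backend; the function names, alert names and sample values are hypothetical.

```typescript
// Nearest-rank percentile over a list of samples.
function percentile(samples: number[], p: number): number {
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)] ?? 0;
}

// Share of plan lookups served from the cache (1 when there was no traffic).
function cacheHitRate(hits: number, misses: number): number {
  const total = hits + misses;
  return total === 0 ? 1 : hits / total;
}

type RouterAlert = "slow-requests" | "low-plan-cache-hit-rate";

// Apply the two thresholds from the table: p99 > 500 ms and hit rate < 90%.
function evaluateAlerts(requestMs: number[], hits: number, misses: number): RouterAlert[] {
  const alerts: RouterAlert[] = [];
  if (percentile(requestMs, 99) > 500) alerts.push("slow-requests");
  if (cacheHitRate(hits, misses) < 0.9) alerts.push("low-plan-cache-hit-rate");
  return alerts;
}

// One slow outlier pushes p99 over the limit; 850 hits out of 1000 is 85%.
const sampleDurations = [40, 55, 61, 48, 900, 52, 47, 58, 63, 50];
const firedAlerts = evaluateAlerts(sampleDurations, 850, 150);

// Healthy traffic: no alerts.
const quietAlerts = evaluateAlerts([40, 50, 60], 95, 5);
```

In practice these rules would live in the alerting system itself (for example as queries over the exported metrics) rather than in application code.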
12. When NOT to Use Federation
Federation Is Not a Silver Bullet
- Small teams (<3 backend devs): The overhead of managing multiple subgraphs, a router, and a schema registry outweighs the benefits. Monolithic GraphQL is sufficient.
- Rarely changing schema: If the API schema is stable, REST + OpenAPI is much simpler.
- Latency-critical (sub-ms): Each hop through the router adds overhead — high-frequency trading or game servers should use gRPC directly.
- No clear domain boundaries: Federation forces you to divide subgraphs — if the domain model isn't mature, you'll be refactoring constantly.
13. Alternatives Comparison
| Solution | Strengths | Weaknesses | Best For |
|---|---|---|---|
| GraphQL Federation | Unified schema, team autonomy, type-safe | Complexity, learning curve, router overhead | Large orgs, many teams, complex domains |
| REST + API Gateway | Simple, mature tooling, easy caching | Over-fetching, many endpoints, versioning | CRUD apps, public APIs, small teams |
| gRPC + Envoy | High performance, strong typing, streaming | Not frontend-friendly, binary protocol | Service-to-service, latency-critical |
| Schema Stitching | Simpler schema merging | Doesn't scale organizationally, fragile | 2-3 services, quick prototypes |
Conclusion
GraphQL Federation is a powerful architectural solution for the problem of "how to let multiple teams build a unified API without creating bottlenecks". With Apollo Router (Rust), schema composition in CI/CD, and an increasingly diverse ecosystem (Hot Chocolate for .NET, federation-jvm for Java/Kotlin), Federation is no longer an experiment — it's a production-proven pattern for large-scale organizations.
However, remember: Federation solves organizational scaling problems before technical ones. If your team is small and your domain is simple, a monolithic GraphQL server is still the better choice.