GraphQL Federation — Building a Unified API Gateway for Microservices

Posted on: 4/25/2026 6:12:31 AM

Tags: GraphQL Federation, Microservices, API Gateway, System Design

As microservice systems grow to tens or hundreds of services, the biggest question is no longer "REST or gRPC?" but rather: how can the frontend make a single request and access data from all services? REST API Gateways handle routing, but they don't solve data composition — when a page needs data from 5-10 different services, the frontend must make multiple API calls and stitch the data together. GraphQL Federation was built to solve exactly this problem: each team owns a subgraph, and the router automatically composes them into a single unified API. This article provides a deep analysis of the architecture, implementation on .NET with HotChocolate, and real-world production lessons.

1. The Problem — When REST API Gateway Falls Short

Imagine a product detail page on an e-commerce platform: you need product information (Product Service), pricing and inventory (Inventory Service), reviews (Review Service), recommendations (Recommendation Service), and seller information (Seller Service). With REST, the frontend makes at least 5 round-trips to the gateway, and any call that depends on an earlier response (seller information needs the sellerId returned with the product, for example) has to wait for it. The result is the familiar waterfall pattern.

With GraphQL Federation, the frontend sends 1 single query. The Router automatically fans out to relevant subgraphs in parallel and composes the response.

1: Request from frontend
340%: Adoption growth at Fortune 500
10x: Apollo Router faster than legacy Gateway
60%: GitHub reduced response time with GraphQL

2. Core Concepts

GraphQL Federation revolves around 3 main components (subgraph, supergraph, router) and a small set of schema directives:

Concept | Description | Example
Subgraph | A GraphQL service owned by one team. Defines the types and fields that team is responsible for. | Product Subgraph defines type Product { id, name, price }
Supergraph | The unified schema composed from all subgraphs. Clients only see the supergraph; they don't know how many subgraphs exist behind it. | type Product { id, name, price, stock, reviews }
Router | The runtime that receives client queries, creates a query plan, fans out to relevant subgraphs, and composes the response. | Apollo Router (Rust), Cosmo Router (Go), HotChocolate Fusion (.NET)
@key | Directive marking an entity. Allows other subgraphs to reference and extend this type. | type Product @key(fields: "id")
@external | Field defined in another subgraph, only referenced here. | extend type Product { id: ID! @external }
@requires | Field that needs data from another subgraph to resolve. | shippingCost: Float @requires(fields: "weight")
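
To make the subgraph/supergraph relationship concrete, here is a minimal Python sketch of what composition does conceptually: it merges the fields each subgraph contributes to a type and records which subgraph owns each field. The subgraph names, the dictionary representation of a schema, and the compose helper are assumptions for illustration; real composition also validates directives and is far stricter.

```python
from collections import defaultdict

# Hypothetical subgraph schemas, reduced to {type name: {field name: field type}}.
SUBGRAPHS = {
    "products": {"Product": {"id": "ID!", "name": "String!", "price": "Float!"}},
    "inventory": {"Product": {"id": "ID!", "stock": "Int!"}},
    "reviews": {"Product": {"id": "ID!", "reviews": "[Review!]!"}},
}


def compose(subgraphs):
    """Merge subgraph types into one supergraph and record field ownership."""
    supergraph = defaultdict(dict)
    owners = defaultdict(dict)
    for subgraph_name, types in subgraphs.items():
        for type_name, fields in types.items():
            for field_name, field_type in fields.items():
                known = supergraph[type_name].get(field_name)
                # Two subgraphs may not disagree about a field's type.
                if known is not None and known != field_type:
                    raise ValueError(
                        f"{type_name}.{field_name}: {known} conflicts with {field_type}"
                    )
                supergraph[type_name][field_name] = field_type
                owners[type_name].setdefault(field_name, []).append(subgraph_name)
    return dict(supergraph), dict(owners)


supergraph, owners = compose(SUBGRAPHS)
print(sorted(supergraph["Product"]))  # ['id', 'name', 'price', 'reviews', 'stock']
print(owners["Product"]["stock"])     # ['inventory']
```

The ownership map is the part the router cares about at runtime: it tells the query planner which subgraph to ask for each field.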

3. Federation v2 — Key Improvements

Federation v2 (current version: v2.11) brings significant improvements over v1:

Feature | Federation v1 | Federation v2
Shared types | Each type owned by only 1 subgraph | @shareable: multiple subgraphs can define the same field
Schema evolution | Breaking changes hard to manage | @override: safely migrate fields between subgraphs
Input types | Cannot be shared | @inaccessible: hide fields from supergraph without deleting
Composition | Runtime composition at gateway | Build-time composition: errors caught before deployment
Error messages | Vague, hard to debug | Detailed composition hints pointing to exact error locations
Progressive migration | Not supported | @override(label: "percent(50)"): canary migration

# Product Subgraph, owned by the Product Team
extend schema @link(
  url: "https://specs.apollo.dev/federation/v2.11"
  import: ["@key", "@shareable", "@override"]
)

# Custom scalar used by createdAt; the subgraph server must implement it.
scalar DateTime

type Product @key(fields: "id") {
  id: ID!
  name: String!
  description: String
  price: Float!
  weight: Float
  sellerId: ID!
  createdAt: DateTime!
}

# Minimal connection type so the schema is self-contained (shape assumed for illustration).
type ProductConnection {
  nodes: [Product!]!
}

type Query {
  product(id: ID!): Product
  products(first: Int = 10, after: String): ProductConnection!
}
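
The progressive form @override(label: "percent(50)") from the table can be pictured as a weighted choice, made per request, between the old and the new owner of a field. The sketch below is a simplified Python model of that idea, not Apollo's implementation; the subgraph names products and catalog and the helper pick_subgraph are hypothetical.

```python
import random


def pick_subgraph(percent_to_new, rng, old="products", new="catalog"):
    """Route roughly `percent_to_new` percent of requests to the new owner."""
    return new if rng.random() * 100 < percent_to_new else old


rng = random.Random(42)  # seeded only so the demo is repeatable
counts = {"products": 0, "catalog": 0}
for _ in range(10_000):
    counts[pick_subgraph(50, rng)] += 1
print(counts)
```

Raising the percentage step by step (for example 1, 10, 50, 100) moves traffic to the new subgraph gradually, and setting it back to 0 is the rollback.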

Entity Resolution — How the Router stitches data together

When a client queries product(id: "123") { name, stock, reviews }, the Router:

Step 1: Sends query to Product Subgraph for name and id.
Step 2: Uses id from step 1, sends requests in parallel to Inventory Subgraph (for stock) and Review Subgraph (for reviews).
Step 3: Merges results into a single JSON response.

This entire process is transparent to the client — it only sees one unified schema with no knowledge of where data originates.
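
The three steps can be simulated in a few lines of Python. This is a sketch under stated assumptions: the in-memory dictionaries stand in for real subgraphs, and the function names are hypothetical. What it does take from the Federation subgraph specification is the shape of an entity representation (a __typename plus the @key fields), which the router passes to a subgraph's _entities field.

```python
import asyncio

# Hypothetical in-memory stand-ins for three subgraphs.
PRODUCTS = {"123": {"id": "123", "name": "Mechanical keyboard"}}
STOCK = {"123": 42}
REVIEWS = {"123": [{"rating": 5, "body": "Great keys"}]}


async def product_subgraph(product_id):
    return dict(PRODUCTS[product_id])


async def inventory_entities(representations):
    # Mirrors the shape of the _entities(representations: [...]) field.
    return [{"stock": STOCK[r["id"]]} for r in representations]


async def review_entities(representations):
    return [{"reviews": REVIEWS[r["id"]]} for r in representations]


async def execute(product_id):
    # Step 1: fetch the root entity, including its @key field.
    product = await product_subgraph(product_id)
    representation = {"__typename": "Product", "id": product["id"]}
    # Step 2: resolve the remaining fields in parallel, keyed by the representation.
    inventory, reviews = await asyncio.gather(
        inventory_entities([representation]),
        review_entities([representation]),
    )
    # Step 3: merge and return only the fields the client asked for.
    merged = {**product, **inventory[0], **reviews[0]}
    return {"data": {"product": {k: merged[k] for k in ("name", "stock", "reviews")}}}


result = asyncio.run(execute("123"))
print(result)
```

Note that id is fetched in step 1 even though the client never asked for it: the router needs the key to build the representation, then drops it from the response.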

4. Choosing a Router — Apollo vs Cosmo vs HotChocolate

Criteria | Apollo Router | Cosmo Router | HotChocolate Fusion
Language | Rust | Go | C# (.NET)
License | Elastic v2 (source-available) | Apache 2.0 (open-source) | MIT (open-source)
Performance | Very high (Rust runtime) | High (Go runtime) | High (Kestrel + .NET 10)
Federation spec | v2.11 (spec authors) | v2.x compatible | v2.x + custom Fusion
REST Connectors | Yes (GraphOS) | No | Schema Stitching (legacy)
Managed platform | Apollo GraphOS (paid) | Cosmo Cloud (free tier) | Self-hosted
Best for | Polyglot teams, enterprise | Teams needing true open-source | .NET teams, ASP.NET Core integration

5. Implementation on .NET with HotChocolate

For .NET teams, HotChocolate v14 is the most established production-grade choice. It supports both the Apollo Federation v2 spec and ChilliCream's own Fusion ecosystem:

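// ProductSubgraph/Program.cs
// Assumes the HotChocolate.AspNetCore, HotChocolate.ApolloFederation, HotChocolate.Data,
// and HotChocolate.Data.EntityFramework packages are referenced.
var builder = WebApplication.CreateBuilder(args);

// The DataLoader below depends on IDbContextFactory<ProductDbContext>; register it
// here with AddDbContextFactory<ProductDbContext>(...) and your database provider.

builder.Services
    .AddGraphQLServer()
    .AddApolloFederation() // Enable Federation v2
    .AddQueryType<ProductQuery>()
    .AddType<Product>() // the entity class from Types/Product.cs
    .AddFiltering()
    .AddSorting()
    .RegisterDbContext<ProductDbContext>();

var app = builder.Build();
app.MapGraphQL();
app.Run();

// Types/Product.cs
using GreenDonut;
using Microsoft.EntityFrameworkCore;

[Key("id")] // Federation @key directive
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; } = default!;
    public decimal Price { get; set; }
    public float? Weight { get; set; }
    public int SellerId { get; set; }

    // Invoked when the router resolves a Product reference; going through the
    // DataLoader lets many references share a single database query.
    [ReferenceResolver]
    public static async Task<Product?> GetByIdAsync(
        int id,
        ProductByIdDataLoader loader,
        CancellationToken ct)
        => await loader.LoadAsync(id, ct);
}

// DataLoader: batches lookups by id to avoid N+1 queries
public class ProductByIdDataLoader : BatchDataLoader<int, Product>
{
    private readonly IDbContextFactory<ProductDbContext> _factory;

    public ProductByIdDataLoader(
        IDbContextFactory<ProductDbContext> factory,
        IBatchScheduler scheduler,
        DataLoaderOptions options) : base(scheduler, options) // options are passed explicitly
    {
        _factory = factory;
    }

    protected override async Task<IReadOnlyDictionary<int, Product>>
        LoadBatchAsync(IReadOnlyList<int> keys, CancellationToken ct)
    {
        // One context per batch; a single WHERE id IN (...) query loads every key.
        await using var ctx = await _factory.CreateDbContextAsync(ct);
        return await ctx.Products
            .Where(p => keys.Contains(p.Id))
            .ToDictionaryAsync(p => p.Id, ct);
    }
}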

DataLoader — the solution for N+1 in Federation

When the Router fans out a list query, a subgraph can be asked to resolve hundreds of entities at once (e.g., stock for 50 products arrives as a single _entities request carrying 50 representations). Without a DataLoader, each entity triggers its own database query: the classic N+1 problem.

HotChocolate has built-in DataLoader that automatically batches requests within the same execution cycle. 50 entity resolutions become 1 SQL query with WHERE id IN (...). This is the key difference between a PoC and production-grade Federation.
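
The batching idea itself is small enough to sketch outside any framework. The Python below is an illustrative model, not HotChocolate's implementation: BatchLoader, fetch_stock, and the QUERIES log are hypothetical names, and the "database" is a function that records the SQL it would have run. Keys requested in the same event-loop tick are collected and loaded with one call.

```python
import asyncio


class BatchLoader:
    """Collects keys requested in the same event-loop tick and loads them in one call."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn
        self._pending = {}  # key -> Future awaiting its value
        self._task = None   # dispatch task for the current batch, if any

    def load(self, key):
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            self._pending[key] = loop.create_future()
        if self._task is None:
            # The task starts only after the currently running code yields,
            # so every load() issued before then joins the same batch.
            self._task = loop.create_task(self._dispatch())
        return self._pending[key]

    async def _dispatch(self):
        pending, self._pending, self._task = self._pending, {}, None
        try:
            rows = await self._batch_fn(list(pending))
        except Exception as exc:
            for future in pending.values():
                future.set_exception(exc)
            return
        for key, future in pending.items():
            future.set_result(rows.get(key))


QUERIES = []  # records each simulated SQL statement


async def fetch_stock(ids):
    id_list = ", ".join(map(str, ids))
    QUERIES.append(f"SELECT id, stock FROM inventory WHERE id IN ({id_list})")
    return {i: i * 10 for i in ids}


async def main():
    loader = BatchLoader(fetch_stock)
    return await asyncio.gather(*(loader.load(i) for i in range(1, 51)))


stocks = asyncio.run(main())
print(len(stocks), len(QUERIES))  # 50 1
```

Fifty load calls produce fifty results but only one recorded query, which is the behavior the BatchDataLoader above provides inside a subgraph.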

6. Performance — Is Federation Slower Than REST?

A common concern: "doesn't adding a Router layer slow things down?" Short answer: the Router adds a few milliseconds of latency, but total end-to-end time typically decreases thanks to parallel fan-out and the elimination of frontend waterfall patterns.

Scenario | REST (5 API calls) | GraphQL Federation | Difference
Product detail page | ~450ms (5 sequential calls) | ~180ms (1 query, parallel fan-out) | -60%
Product list (20 items) | ~200ms (1 call, over-fetching) | ~220ms (1 query, exact fields) | +10% (but less data transfer)
Mobile, weak network | ~2.5s (5 round-trips x high latency) | ~800ms (1 round-trip) | -68%
Router overhead | N/A | ~3-8ms (query planning) | Negligible
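
The first row follows from simple arithmetic: sequential calls add up, while a federated query costs the root fetch plus the slowest of the parallel fetches plus router overhead. The Python sketch below works through that model. The per-service latencies are hypothetical values chosen to land on the table's figures, not measurements.

```python
def rest_waterfall_ms(call_latencies):
    """Sequential calls: the page waits for the sum of all round-trips."""
    return sum(call_latencies)


def federated_ms(root_latency, dependent_latencies, router_overhead):
    """One round-trip: root fetch, then dependent fetches in parallel, plus planning."""
    return router_overhead + root_latency + max(dependent_latencies, default=0)


# Hypothetical per-service latencies in milliseconds.
rest = rest_waterfall_ms([90, 90, 90, 90, 90])               # 450
fed = federated_ms(90, [85, 80, 70, 60], router_overhead=5)  # 180
print(rest, fed, f"{(fed - rest) / rest:.0%}")               # 450 180 -60%
```

The model also shows where Federation gains little: with a single backing service there is nothing to parallelize, so the router overhead is pure cost, which matches the product list row.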

Real benchmarks from The Guild (2026)

The Guild (creators of GraphQL Hive) published benchmarks comparing Federation routers:

Apollo Router (Rust): ~8,500 req/s, p99 latency ~12ms
Cosmo Router (Go): ~7,200 req/s, p99 latency ~15ms
Hive Gateway (TypeScript): ~3,800 req/s, p99 latency ~28ms

These numbers show that Router overhead is not the bottleneck — subgraph response times and database queries are the deciding factors.

7. Caching — GraphQL's Biggest Challenge

GraphQL's well-known weakness: CDN caching is hard. REST uses GET + Cache-Control headers for easy caching. GraphQL typically sends POST requests whose bodies differ from query to query, so there is no stable URL for a CDN to use as a cache key.

Strategy | Description | Effectiveness
Persisted Queries | Hash queries into IDs (e.g., abc123), send GET /graphql?id=abc123. CDN-cacheable like REST. | High, production recommended
Response Caching | Cache full responses at Router level by query hash + variables. | Medium, low hit rate with diverse queries
Entity Caching | Cache individual entities (e.g., Product:123) at Router. When another query needs Product:123, use cache instead of calling subgraph. | High, Apollo Router supports with Redis
Subgraph-level Caching | Each subgraph manages its own caching at data layer (Redis, in-memory). | High, full control, Router-independent
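
Persisted queries are the easiest of these to picture in code. The Python sketch below hashes a query into an ID and builds a stable GET URL from it. The PERSISTED registry, the function names, and the id/variables parameter names are assumptions for illustration that follow the GET /graphql?id=... shape in the table; a real router defines its own wire format.

```python
import hashlib
import json
from urllib.parse import urlencode

PERSISTED = {}  # hypothetical server-side registry: query id -> query text


def register(query):
    """Store a query under the SHA-256 hash of its text and return that id."""
    query_id = hashlib.sha256(query.encode("utf-8")).hexdigest()
    PERSISTED[query_id] = query
    return query_id


def cacheable_url(query_id, variables=None):
    """Build a GET URL that is identical for identical inputs."""
    params = {"id": query_id}
    if variables:
        # Sort keys so the same variables always yield the same cache key.
        params["variables"] = json.dumps(variables, sort_keys=True, separators=(",", ":"))
    return "/graphql?" + urlencode(params)


query = "query Product($id: ID!) { product(id: $id) { name price } }"
query_id = register(query)
print(cacheable_url(query_id, {"id": "123"}))
```

Because the URL depends only on the query hash and the normalized variables, two clients asking for the same product produce the same cache key, which is what a CDN needs.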

8. Apollo Connectors — Integrating REST into Federation

A breakthrough feature in Apollo Router v2: Connectors let you expose REST APIs through the supergraph without writing resolver code or running a separate subgraph server. You declare the mapping in the schema:

# Turn a REST endpoint into a GraphQL subgraph
# No separate server needed: the Router calls the REST API directly

extend schema
  @link(url: "https://specs.apollo.dev/federation/v2.11", import: ["@key"])
  @link(url: "https://specs.apollo.dev/connect/v0.4", import: ["@source", "@connect"])
  @source(name: "legacy", http: { baseURL: "https://api.legacy.com/v1" })

type Product @key(fields: "id") {
  id: ID!
  name: String!
  legacyData: LegacyProductData
    @connect(
      source: "legacy"
      http: { GET: "/products/{$this.id}" }
      # Map REST fields to GraphQL fields; "a: b" exposes REST field b as a
      selection: """
        sku
        category: categoryName
        tags: labels
      """
    )
}

# Shape of the mapped REST payload (field types assumed for illustration)
type LegacyProductData {
  sku: String
  category: String
  tags: [String!]
}

Connectors enable gradual migration

Instead of rewriting all REST APIs to GraphQL at once (high risk), you can:

1. Keep existing REST services running.
2. Use Connectors to expose them through Federation.
3. Migrate frontend to GraphQL queries.
4. Gradually replace REST services with native GraphQL subgraphs.

This is the "strangler fig pattern" applied to API migration.

9. When Should You Use Federation?

Scenario | Use Federation | Don't Use Federation
Service count | 5+ microservices, multiple teams | Monolith or 2-3 services
Frontend | Multiple platforms (web, mobile, partner API) | Single frontend, few complex pages
Data composition | One page needs data from 3+ services | Each page only needs data from 1 service
Team structure | Each team owns their service independently | One small team manages everything
Caching | Frequently changing, personalized data | Mostly static content, need strong CDN caching

Federation is not a silver bullet

1. Increased complexity: Adding Router, schema composition, entity resolution requires deep GraphQL understanding.

2. Harder debugging: When queries are slow, you need to trace through Router, subgraph, and database layers. Without proper observability, debugging is extremely difficult.

3. Schema governance: With 10+ subgraphs from multiple teams, you need schema change review processes (schema registry, CI checks) to prevent breaking changes.

4. Learning curve: Developers must learn GraphQL spec, Federation directives, DataLoader patterns, and caching strategies.

5. Doesn't fully replace API Gateway: You usually still need an API Gateway or edge layer for cross-cutting concerns such as authentication, rate limiting, and IP filtering. The Router's core job is GraphQL query planning and routing.
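
For the schema governance point, the core of a CI check is a diff between the published schema and the proposed one. The Python sketch below is deliberately simplified and rests on assumptions: schemas are plain dictionaries rather than parsed SDL, breaking_changes is a hypothetical helper, and every type change is treated as breaking, which real schema registries refine considerably.

```python
def breaking_changes(old_schema, new_schema):
    """List removed types, removed fields, and changed field types."""
    problems = []
    for type_name, old_fields in old_schema.items():
        new_fields = new_schema.get(type_name)
        if new_fields is None:
            problems.append(f"type {type_name} was removed")
            continue
        for field_name, old_type in old_fields.items():
            if field_name not in new_fields:
                problems.append(f"{type_name}.{field_name} was removed")
            elif new_fields[field_name] != old_type:
                problems.append(
                    f"{type_name}.{field_name} changed from {old_type} "
                    f"to {new_fields[field_name]}"
                )
    return problems


published = {"Product": {"id": "ID!", "name": "String!", "price": "Float!"}}
proposed = {"Product": {"id": "ID!", "name": "String", "sku": "String"}}

for problem in breaking_changes(published, proposed):
    print(problem)
# Product.name changed from String! to String
# Product.price was removed
```

Adding a field (sku here) passes silently, while removals and type changes are reported, so a pipeline can fail the build whenever the returned list is non-empty.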

10. Conclusion

GraphQL Federation solves a problem that REST API Gateways cannot: automatically composing data from multiple microservices into a single response, while maintaining each team's autonomy with their own subgraph. With Apollo Router (Rust, 10x faster than legacy gateway), HotChocolate Fusion (deep .NET 10 integration), and features like Connectors (integrate REST without code), Federation is production-ready at scale.

However, Federation isn't for every project. If your system only has 2-3 services and one frontend — REST + BFF is much simpler. Federation truly shines when you have 5+ services, multiple teams, and frontends needing data from multiple sources. Start with 1-2 subgraphs, validate the value, then expand gradually — don't migrate your entire system at once.
