Outbox Pattern — Never Lose a Message in Microservices

Posted on: 4/22/2026 8:13:24 PM

In microservices architecture, one of the hardest problems isn't scaling or deployment — it's ensuring data and messages stay consistent. When a service writes data to a database and publishes an event to a message broker, what happens if one of those operations fails? This is the dual-write problem — and the Outbox Pattern is engineered to eliminate it completely.

67%: Microservices hit data inconsistency from dual-writes
0: Messages lost with Outbox Pattern
<50ms: Average latency with CDC relay
At-least-once: Default delivery guarantee

1. The Dual-Write Problem

Imagine an order processing service. When a customer places an order, the service must: (1) save the order to the database and (2) publish an OrderCreated event to a message broker (RabbitMQ, Kafka...) so other services can react (send confirmation email, deduct inventory, process payment...).

sequenceDiagram
    participant S as Order Service
    participant DB as Database
    participant MQ as Message Broker
    S->>DB: INSERT order
    Note over DB: ✅ Success
    S->>MQ: Publish OrderCreated
    Note over MQ: ❌ Broker down!
    Note over S: DB has the order but<br/>no one knows it exists

Figure 1: Dual-write failure — database commit succeeds but message publish fails

Three scenarios can occur when performing two separate write operations:

Scenario | DB Write | Message Publish | Consequence
Happy path | Succeeds | Succeeds | Everything consistent
Lost message | Succeeds | Fails | Data exists but downstream is unaware → stuck order, no email, no inventory deduction
Lost data | Fails | Succeeds | Downstream processes a phantom event → deducts inventory for non-existent order
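
To make the "Lost message" row concrete, here is a minimal in-memory sketch of a naive dual write. FakeDatabase, FakeBroker, and DualWriteDemo are hypothetical stand-ins invented for this illustration, not real client libraries; the point is only that the two writes succeed or fail independently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-ins for a database and a message broker.
public sealed class FakeDatabase
{
    public List<string> Orders { get; } = new();
    public void InsertOrder(string orderId) => Orders.Add(orderId);
}

public sealed class FakeBroker
{
    public bool IsDown { get; set; }
    public List<string> Published { get; } = new();

    public void Publish(string message)
    {
        if (IsDown) throw new InvalidOperationException("Broker unavailable");
        Published.Add(message);
    }
}

public static class DualWriteDemo
{
    // Naive dual write: two independent operations with nothing tying them together.
    // Returns true only when both writes succeeded.
    public static bool PlaceOrder(FakeDatabase db, FakeBroker broker, string orderId)
    {
        db.InsertOrder(orderId); // write 1: stored immediately
        try
        {
            broker.Publish($"OrderCreated:{orderId}"); // write 2: can fail on its own
            return true;
        }
        catch (InvalidOperationException)
        {
            // The order is already stored and cannot be undone from here:
            // this is the "lost message" scenario.
            return false;
        }
    }
}
```

With the broker marked as down, PlaceOrder leaves one order in the fake database and nothing in the broker, which is exactly the inconsistent state the rest of this article sets out to prevent.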

Why distributed transactions (2PC) don't solve this

Two-Phase Commit (2PC) requires both the database and message broker to support XA transactions. Most modern message brokers (RabbitMQ, Kafka, Azure Service Bus) do not support XA. Even when supported, 2PC introduces high latency, low throughput, and creates a single point of failure at the coordinator. In microservices, 2PC is widely considered an anti-pattern.

2. The Outbox Pattern — How It Works

The core idea is elegantly simple: instead of writing to two different systems, write to only one — the database. Messages to be published are stored in an OutboxMessage table within the same transaction as the business data. A separate process (relay/publisher) then reads from the outbox table and publishes to the message broker.

graph LR
    A["Order Service"] -->|"BEGIN TRANSACTION"| B["Database"]
    B --> C["INSERT Order"]
    B --> D["INSERT OutboxMessage"]
    B -->|"COMMIT"| E["✅ Atomic"]
    F["Outbox Relay"] -->|"Poll / CDC"| B
    F -->|"Publish"| G["Message Broker"]
    G --> H["Inventory Service"]
    G --> I["Email Service"]
    G --> J["Payment Service"]
    style A fill:#e94560,stroke:#fff,color:#fff
    style E fill:#4CAF50,stroke:#fff,color:#fff
    style F fill:#2c3e50,stroke:#e94560,color:#fff
    style G fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style B fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style H fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style I fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style J fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50

Figure 2: Outbox Pattern — business data and message written in the same transaction

2.1. Outbox Table Schema

The outbox table must contain enough information for the relay to publish messages without knowing business logic:

CREATE TABLE OutboxMessage (
    Id              UNIQUEIDENTIFIER PRIMARY KEY DEFAULT NEWSEQUENTIALID(),
    OccurredOn      DATETIME2        NOT NULL DEFAULT SYSUTCDATETIME(),
    MessageType     NVARCHAR(256)    NOT NULL,  -- e.g. 'OrderCreated'
    Payload         NVARCHAR(MAX)    NOT NULL,  -- JSON serialized event
    CorrelationId   NVARCHAR(128)    NULL,
    Destination     NVARCHAR(256)    NULL,       -- routing key / topic
    ProcessedOn     DATETIME2        NULL,       -- NULL = not yet published
    RetryCount      INT              NOT NULL DEFAULT 0,
    Error           NVARCHAR(MAX)    NULL
);

CREATE INDEX IX_OutboxMessage_Unprocessed
    ON OutboxMessage (OccurredOn)
    WHERE ProcessedOn IS NULL;

Why NEWSEQUENTIALID() instead of NEWID()?

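NEWSEQUENTIALID() generates GUIDs that are each greater than the previous one produced on the same machine since Windows last started, so new rows are appended to the end of the clustered index instead of causing page splits and fragmentation. For high-throughput outbox tables (thousands of rows/second), this noticeably improves INSERT performance and scan efficiency.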

2.2. Write Path — Writing Within the Same Transaction

This is the most critical part: business data and the outbox message must be in the same database transaction. If the transaction rolls back, both roll back — there's never a state where data is lost but a message was published, or vice versa.

// C# (EF Core 10, .NET 10)
using System.Text.Json;
public class OrderService(AppDbContext db)
{
    public async Task<Order> CreateOrderAsync(CreateOrderCommand cmd)
    {
        var order = new Order
        {
            CustomerId = cmd.CustomerId,
            Items = cmd.Items.Select(i => new OrderItem
            {
                ProductId = i.ProductId,
                Quantity = i.Quantity,
                UnitPrice = i.UnitPrice
            }).ToList(),
            TotalAmount = cmd.Items.Sum(i => i.Quantity * i.UnitPrice),
            Status = OrderStatus.Created
        };

        db.Orders.Add(order);

        // Write outbox message in the SAME DbContext (same transaction)
        db.OutboxMessages.Add(new OutboxMessage
        {
            MessageType = nameof(OrderCreatedEvent),
            Payload = JsonSerializer.Serialize(new OrderCreatedEvent
            {
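                // Assumes a client-generated key (e.g. a Guid); a database-generated
                // identity value is not available until SaveChangesAsync() returns.
                OrderId = order.Id,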
                CustomerId = order.CustomerId,
                TotalAmount = order.TotalAmount,
                Items = order.Items.Select(i => new OrderItemDto
                {
                    ProductId = i.ProductId,
                    Quantity = i.Quantity
                }).ToList()
            }),
            CorrelationId = cmd.CorrelationId,
            Destination = "order-events"
        });

        await db.SaveChangesAsync(); // 1 transaction, atomic
        return order;
    }
}

3. Relay Strategies — Getting Messages From DB to Broker

Once messages are safely in the outbox table, the next step is delivering them to the message broker. There are two main strategies: Polling Publisher and Transaction Log Tailing (CDC).

3.1. Polling Publisher

The simplest approach: a background service periodically queries the outbox table, fetches unprocessed messages, publishes them to the broker, and marks them as processed.

sequenceDiagram
    participant R as Outbox Relay
    participant DB as Database
    participant MQ as Message Broker
    loop Every 1-5 seconds
        R->>DB: SELECT ... WHERE ProcessedOn IS NULL
        DB-->>R: Batch messages
        R->>MQ: Publish each message
        MQ-->>R: ACK
        R->>DB: UPDATE ProcessedOn = NOW()
    end

Figure 3: Polling Publisher — simple but with latency trade-offs

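// .NET 10: background service polling the outbox
using System.Text.Json;
using MassTransit;
using Microsoft.EntityFrameworkCore;

public class OutboxPollingService(
    IServiceScopeFactory scopeFactory,
    IBus bus, // IBus is a singleton; the scoped IPublishEndpoint cannot be injected into a hosted service
    ILogger<OutboxPollingService> logger) : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken ct)
    {
        while (!ct.IsCancellationRequested)
        {
            try
            {
                await using var scope = scopeFactory.CreateAsyncScope();
                var db = scope.ServiceProvider
                    .GetRequiredService<AppDbContext>();

                var messages = await db.OutboxMessages
                    .Where(m => m.ProcessedOn == null)
                    .OrderBy(m => m.OccurredOn)
                    .Take(100)
                    .ToListAsync(ct);

                foreach (var msg in messages)
                {
                    try
                    {
                        // MessageType holds the short name written with nameof(...).
                        // Type.GetType() cannot resolve a short name, so look the
                        // type up in the assembly that defines the events.
                        var eventType = typeof(OrderCreatedEvent).Assembly
                            .GetTypes()
                            .Single(t => t.Name == msg.MessageType);
                        var eventObj = JsonSerializer.Deserialize(
                            msg.Payload, eventType)!;

                        await bus.Publish(eventObj, eventType, ct);
                        msg.ProcessedOn = DateTime.UtcNow;
                    }
                    catch (Exception ex) when (ex is not OperationCanceledException)
                    {
                        // Record the failure and stop the batch so later messages
                        // are not published ahead of this one.
                        msg.RetryCount++;
                        msg.Error = ex.Message;
                        logger.LogError(ex, "Failed to publish outbox message {Id}", msg.Id);
                        break;
                    }
                }

                // Persist progress even if the batch stopped early.
                await db.SaveChangesAsync(ct);
            }
            catch (OperationCanceledException) when (ct.IsCancellationRequested)
            {
                break; // graceful shutdown, not an error
            }
            catch (Exception ex)
            {
                logger.LogError(ex, "Outbox relay error");
            }

            await Task.Delay(TimeSpan.FromSeconds(2), ct);
        }
    }
}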
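
The relay has to turn the MessageType string stored in the outbox row back into a CLR type before it can deserialize the payload. One way to keep that mapping explicit, sketched below, is a small registry of logical names. EventTypeRegistry and the simplified OrderCreatedEvent record are hypothetical names made up for this example; only System.Text.Json is used.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Simplified, hypothetical event type used only for this illustration.
public sealed record OrderCreatedEvent(Guid OrderId, decimal TotalAmount);

// Maps a stable logical name (the value stored in OutboxMessage.MessageType)
// to a CLR type, in both directions.
public sealed class EventTypeRegistry
{
    private readonly Dictionary<string, Type> _byName = new(StringComparer.Ordinal);
    private readonly Dictionary<Type, string> _byType = new();

    public EventTypeRegistry Register<TEvent>(string name)
    {
        _byName.Add(name, typeof(TEvent));
        _byType.Add(typeof(TEvent), name);
        return this;
    }

    // Write path: produce the (MessageType, Payload) pair for the outbox row.
    public (string MessageType, string Payload) Serialize<TEvent>(TEvent evt)
        where TEvent : notnull
    {
        if (!_byType.TryGetValue(typeof(TEvent), out var name))
            throw new InvalidOperationException($"Unregistered event type {typeof(TEvent)}");
        return (name, JsonSerializer.Serialize(evt));
    }

    // Relay path: turn the stored row back into a typed event object.
    public object Deserialize(string messageType, string payload)
    {
        if (!_byName.TryGetValue(messageType, out var type))
            throw new InvalidOperationException($"Unknown message type '{messageType}'");
        return JsonSerializer.Deserialize(payload, type)
            ?? throw new InvalidOperationException("Payload deserialized to null");
    }
}
```

A logical name such as "OrderCreated" also survives class renames and namespace moves, which a stored CLR type name would not.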

Polling Publisher pitfalls

  • Latency: messages may be delayed by up to one polling interval (1-5 seconds here).
  • Database load: continuous queries put pressure on the database.
  • Scaling: multiple relay instances polling simultaneously will publish duplicates unless each batch is claimed under a lock. Solution: select the batch with SELECT ... WITH (UPDLOCK, READPAST) in SQL Server or FOR UPDATE SKIP LOCKED in PostgreSQL, inside a transaction that stays open until the rows are marked as processed.
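
The locking hints named above can be sketched as a small helper that builds the batch-claim query for each database. OutboxClaimQuery and SqlDialect are hypothetical names; the SQL Server table and columns follow the schema in section 2.1, while the snake_case PostgreSQL names are an assumption made for illustration.

```csharp
using System;

public enum SqlDialect { SqlServer, PostgreSql }

public static class OutboxClaimQuery
{
    // Builds the SELECT that claims a batch of unprocessed rows.
    // Rows locked by another relay instance are skipped rather than waited on.
    public static string Build(SqlDialect dialect, int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        return dialect switch
        {
            SqlDialect.SqlServer =>
                $"SELECT TOP ({batchSize}) * FROM OutboxMessage " +
                "WITH (UPDLOCK, READPAST) " +
                "WHERE ProcessedOn IS NULL ORDER BY OccurredOn;",
            // Table and column names below are assumed for the PostgreSQL variant.
            SqlDialect.PostgreSql =>
                "SELECT * FROM outbox_message " +
                "WHERE processed_on IS NULL ORDER BY occurred_on " +
                $"LIMIT {batchSize} FOR UPDATE SKIP LOCKED;",
            _ => throw new ArgumentOutOfRangeException(nameof(dialect))
        };
    }
}
```

The locks only last as long as the surrounding transaction, so the relay would need to run this query, publish, and update ProcessedOn within one explicit transaction. With EF Core, a string like this could be passed to FromSqlRaw on the OutboxMessages set.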

3.2. Transaction Log Tailing (CDC)

Instead of polling the database, CDC (Change Data Capture) reads directly from the database's transaction log. When a new row appears in the outbox table, CDC streams it immediately to the relay without any query.

graph LR
    A["Database<br/>Transaction Log"] -->|"CDC Stream"| B["Debezium /<br/>SQL Server CDC"]
    B -->|"Outbox event"| C["Kafka Connect"]
    C --> D["Kafka Topic:<br/>order-events"]
    D --> E["Inventory Service"]
    D --> F["Email Service"]
    style A fill:#2c3e50,stroke:#e94560,color:#fff
    style B fill:#e94560,stroke:#fff,color:#fff
    style C fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style D fill:#f8f9fa,stroke:#e94560,color:#2c3e50

Figure 4: CDC-based relay with low latency and near-zero additional query load on the database

Criteria | Polling Publisher | CDC (Log Tailing)
Latency | 1-5 seconds (depends on interval) | <100ms (near real-time)
Database load | High (continuous queries) | Near zero (reads from log)
Implementation complexity | Low (just a background service) | High (requires Debezium/Kafka Connect)
Ordering guarantee | Needs extra handling | Guaranteed by transaction order
Infra dependency | None additional | Kafka + Kafka Connect + Debezium
When to use | Low-medium throughput, small teams | High throughput, low latency requirements

4. Implementation with MassTransit — Built-in Outbox

MassTransit, a widely used message bus library for .NET, has had built-in Transactional Outbox support since version 8. You don't need to write the outbox table or relay yourself; MassTransit provides both.

4.1. Configuring MassTransit Outbox

// Program.cs — .NET 10
builder.Services.AddMassTransit(x =>
{
    x.AddConsumers(typeof(Program).Assembly);

    x.AddEntityFrameworkOutbox<AppDbContext>(o =>
    {
        o.UseSqlServer();           // or UsePostgres()
        o.UseBusOutbox();           // enable outbox for publish
        o.QueryDelay = TimeSpan.FromSeconds(1);
        o.DuplicateDetectionWindow = TimeSpan.FromMinutes(5);
    });

    x.UsingRabbitMq((ctx, cfg) =>
    {
        cfg.Host("rabbitmq://localhost");
        cfg.ConfigureEndpoints(ctx);
    });
});

builder.Services.AddDbContext<AppDbContext>(opts =>
    opts.UseSqlServer(connectionString));

4.2. Using It in Business Logic

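Once the outbox is configured, the business code stays the same: MassTransit intercepts Publish() and writes the message to the outbox instead of sending it directly to the broker. One caveat, per the MassTransit documentation: UseBusOutbox() covers publishes made outside a consumer (for example from an API controller), while publishes made inside a consumer may only be captured when the receive endpoint is also configured with UseEntityFrameworkOutbox<AppDbContext>(). The consumer below assumes that setup: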

public class CreateOrderConsumer(
    AppDbContext db,
    IPublishEndpoint publisher) : IConsumer<CreateOrderCommand>
{
    public async Task Consume(ConsumeContext<CreateOrderCommand> context)
    {
        var order = new Order
        {
            CustomerId = context.Message.CustomerId,
            TotalAmount = context.Message.TotalAmount,
            Status = OrderStatus.Created
        };

        db.Orders.Add(order);

        // MassTransit writes to OutboxMessage table,
        // does NOT publish directly to RabbitMQ
        await publisher.Publish(new OrderCreatedEvent
        {
            OrderId = order.Id,
            CustomerId = order.CustomerId,
            TotalAmount = order.TotalAmount
        });

        await db.SaveChangesAsync();
        // Transaction commits → both Order and OutboxMessage persisted
        // MassTransit relay auto-publishes to RabbitMQ afterwards
    }
}

MassTransit manages the outbox tables

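When using AddEntityFrameworkOutbox, MassTransit relies on three tables: InboxState, OutboxState, and OutboxMessage. Add them to the model in the DbContext's OnModelCreating with modelBuilder.AddInboxStateEntity(), AddOutboxMessageEntity(), and AddOutboxStateEntity(), then run dotnet ef migrations add AddOutbox followed by dotnet ef database update. The relay runs automatically in the background, so there is no need to write a separate background service.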

5. Inbox Pattern — Process Each Message Exactly Once

The Outbox Pattern guarantees at-least-once delivery — messages will definitely be published, but may be published multiple times (when the relay crashes after publishing but before marking ProcessedOn). The consumer side needs the Inbox Pattern to ensure idempotency.

sequenceDiagram
    participant MQ as Message Broker
    participant C as Consumer
    participant DB as Database
    MQ->>C: OrderCreated (MessageId: abc-123)
    C->>DB: SELECT FROM InboxMessage WHERE Id = 'abc-123'
    alt Already processed
        DB-->>C: EXISTS
        C->>MQ: ACK (skip)
    else Not yet processed
        DB-->>C: NOT EXISTS
        C->>DB: BEGIN TRANSACTION
        C->>DB: INSERT InboxMessage(Id='abc-123')
        C->>DB: Execute business logic
        C->>DB: COMMIT
        C->>MQ: ACK
    end

Figure 5: Inbox Pattern — deduplicate messages on the consumer side

-- Inbox table
CREATE TABLE InboxMessage (
    MessageId      UNIQUEIDENTIFIER NOT NULL,
    ConsumerType   NVARCHAR(256)    NOT NULL,
    ReceivedOn     DATETIME2        NOT NULL DEFAULT SYSUTCDATETIME(),
    ProcessedOn    DATETIME2        NULL,
    -- Composite key: the same message may be handled once per consumer type
    CONSTRAINT PK_Inbox PRIMARY KEY (MessageId, ConsumerType)
);

// Idempotent consumer pattern
public class OrderCreatedHandler(AppDbContext db) : IConsumer<OrderCreatedEvent>
{
    public async Task Consume(ConsumeContext<OrderCreatedEvent> ctx)
    {
        var messageId = ctx.MessageId!.Value;

        var alreadyProcessed = await db.InboxMessages
            .AnyAsync(i => i.MessageId == messageId
                && i.ConsumerType == nameof(OrderCreatedHandler));

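        if (alreadyProcessed) return;

        // The check above is only a fast path. Under concurrent duplicate
        // deliveries, the key on InboxMessage is the real guard: the second
        // INSERT fails, its transaction rolls back, and the redelivery is skipped.
        await using var tx = await db.Database
            .BeginTransactionAsync(ctx.CancellationToken);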

        db.InboxMessages.Add(new InboxMessage
        {
            MessageId = messageId,
            ConsumerType = nameof(OrderCreatedHandler)
        });

        // Business logic: deduct inventory
        var order = ctx.Message;
        foreach (var item in order.Items)
        {
            var product = await db.Products
                .FirstAsync(p => p.Id == item.ProductId);
            product.Stock -= item.Quantity;
        }

        await db.SaveChangesAsync();
        await tx.CommitAsync();
    }
}
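
The effect of the inbox check can be seen in isolation with a minimal in-memory sketch. InMemoryInventory is a hypothetical class invented for this example: a HashSet plays the role of the InboxMessage table and a dictionary plays the role of the Products table.

```csharp
using System;
using System.Collections.Generic;

public sealed class InMemoryInventory
{
    // Stands in for the InboxMessage table: one entry per (message, consumer).
    private readonly HashSet<(Guid MessageId, string ConsumerType)> _inbox = new();
    // Stands in for the Products table: product id to units in stock.
    private readonly Dictionary<string, int> _stock = new();

    public void SetStock(string productId, int quantity) => _stock[productId] = quantity;
    public int GetStock(string productId) => _stock[productId];

    // Returns true if the message was applied, false if it was a duplicate.
    public bool HandleOrderCreated(Guid messageId, string productId, int quantity)
    {
        // HashSet.Add returns false when the key already exists, which mirrors
        // a key violation on the inbox table.
        if (!_inbox.Add((messageId, "OrderCreatedHandler")))
            return false;

        _stock[productId] -= quantity;
        return true;
    }
}
```

Delivering the same message id twice deducts stock only once, which is the property the real consumer gets from inserting the inbox row and applying the business change in one transaction.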

6. Production Operations — Key Considerations

6.1. Cleanup — Pruning the Outbox Table

The outbox table grows over time. A scheduled job should delete processed messages:

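-- Delete messages processed more than 7 days ago, in small batches so each
-- DELETE holds few locks and commits quickly
WHILE 1 = 1
BEGIN
    -- Batch size kept below SQL Server's roughly 5,000-lock escalation threshold
    DELETE TOP (4000) FROM OutboxMessage
    WHERE ProcessedOn IS NOT NULL
      AND ProcessedOn < DATEADD(DAY, -7, SYSUTCDATETIME());

    IF @@ROWCOUNT < 4000 BREAK;
    WAITFOR DELAY '00:00:01'; -- pause so concurrent inserts and the relay are not starved
END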

6.2. Message Ordering

The outbox guarantees causal ordering within a single transaction, but not across transactions. If you need strict per-entity ordering (all events for Order #123 must be in order), use a partition key = OrderId when publishing to Kafka.
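
Per-entity ordering works because every event with the same key is routed to the same partition. The sketch below shows that property with a stable FNV-1a hash; PartitionRouter is a hypothetical helper written for this example. Real Kafka clients ship their own default partitioner with a different hash function, so in practice you would pass the OrderId as the message key and let the client choose the partition.

```csharp
using System;
using System.Text;

public static class PartitionRouter
{
    // Stable FNV-1a hash of the key, reduced to a partition index.
    // string.GetHashCode() is randomized per process in .NET, so it is not
    // suitable when several relay instances must agree on the partition.
    public static int PartitionFor(string key, int partitionCount)
    {
        if (partitionCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(partitionCount));

        unchecked
        {
            uint hash = 2166136261;           // FNV-1a 32-bit offset basis
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;
                hash *= 16777619;             // FNV-1a 32-bit prime
            }
            return (int)(hash % (uint)partitionCount);
        }
    }
}
```

Because the result depends only on the key and the partition count, all events for one order land on one partition and are consumed in the order they were written there. Changing the partition count changes the mapping, so ordering is only preserved while the count stays fixed.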

6.3. Monitoring — Catching Issues Early

The most important metric to monitor: outbox lag — the count of unprocessed messages and the age of the oldest pending message.

-- Monitoring query: outbox health check
SELECT
    COUNT(*) AS PendingMessages,
    MIN(OccurredOn) AS OldestPending,
    DATEDIFF(SECOND, MIN(OccurredOn), SYSUTCDATETIME()) AS LagSeconds,
    MAX(RetryCount) AS MaxRetries
FROM OutboxMessage
WHERE ProcessedOn IS NULL;

-- Alert if lag > 30 seconds or pending > 1000
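
The alert rule in the comment above can be expressed as a small pure function that a health check could call with the query result. OutboxHealth and OutboxLagMonitor are hypothetical names for this sketch; the thresholds are the 30 seconds and 1000 messages mentioned in the query comment.

```csharp
using System;

// Result of the monitoring query; OldestPendingUtc is null when nothing is pending.
public sealed record OutboxHealth(int PendingMessages, DateTime? OldestPendingUtc);

public static class OutboxLagMonitor
{
    public static readonly TimeSpan MaxLag = TimeSpan.FromSeconds(30);
    public const int MaxPending = 1000;

    // Returns true when either threshold is exceeded.
    public static bool ShouldAlert(OutboxHealth health, DateTime utcNow)
    {
        if (health.PendingMessages > MaxPending)
            return true;

        // No pending rows means MIN(OccurredOn) is NULL: nothing is lagging.
        if (health.OldestPendingUtc is not { } oldest)
            return false;

        return utcNow - oldest > MaxLag;
    }
}
```

Passing the current time in as a parameter keeps the rule deterministic and easy to unit test.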

6.4. Dead Letter — Handling Failed Messages

After N retry failures, messages should be moved to a dead-letter state so they don't block other messages:

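// In the relay service, inside the catch block that handles a failed publish of msg
if (msg.RetryCount >= 5)
{
    msg.Error = $"Max retries exceeded. Last error: {ex.Message}";
    // Mark as processed so the relay stops picking it up; the non-null Error
    // is what distinguishes a dead-lettered row from a published one.
    msg.ProcessedOn = DateTime.UtcNow;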
    // Publish metric/alert for the team
    logger.LogCritical("Outbox message {Id} dead-lettered after {Retries} retries",
        msg.Id, msg.RetryCount);
    continue;
}
msg.RetryCount++;

7. End-to-End Architecture — Outbox + Inbox

graph TB
    subgraph "Order Service"
        A["API Controller"] --> B["Order Service"]
        B --> C["DbContext<br/>SaveChanges"]
        C --> D["Orders Table"]
        C --> E["Outbox Table"]
    end
    subgraph "Relay"
        F["Outbox Relay<br/>(Poll / CDC)"] --> E
        F --> G["RabbitMQ / Kafka"]
    end
    subgraph "Inventory Service"
        G --> H["Consumer"]
        H --> I["Inbox Check"]
        I --> J["Business Logic"]
        J --> K["Products Table"]
        J --> L["Inbox Table"]
    end
    style A fill:#e94560,stroke:#fff,color:#fff
    style F fill:#2c3e50,stroke:#e94560,color:#fff
    style H fill:#e94560,stroke:#fff,color:#fff
    style G fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style D fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style E fill:#4CAF50,stroke:#fff,color:#fff
    style L fill:#4CAF50,stroke:#fff,color:#fff

Figure 6: End-to-end Outbox + Inbox Pattern architecture

8. Comparison with Alternative Approaches

Approach | Consistency | Latency | Complexity | When to use
Outbox Pattern | Strong (atomic) | Medium | Medium | Default choice for most use cases
Saga Pattern | Eventual | High | High | Long-running business transactions across services
Event Sourcing | Strong | Low | Very High | Audit trail needs, complex domains
2PC / XA | Strong | Very High | Medium | Legacy systems, acceptable tight coupling
Best effort + retry | Weak | Low | Low | Non-critical notifications, analytics events

When you DON'T need the Outbox Pattern

If a lost message doesn't cause serious business impact (analytics events, non-critical notifications), best-effort publish with retry is sufficient. The Outbox Pattern adds complexity — only use it when the business requires zero message loss.
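
For comparison, best-effort publish with retry can be as small as the sketch below. BestEffortPublisher is a hypothetical helper, and the publish delegate stands in for whatever broker client call you use; nothing here is persisted, so a message is simply dropped once the attempts run out.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BestEffortPublisher
{
    // Tries the publish delegate up to maxAttempts times.
    // Returns false if every attempt failed; the message is then lost,
    // which is the accepted trade-off of this approach.
    public static async Task<bool> TryPublishAsync(
        Func<CancellationToken, Task> publish,
        int maxAttempts,
        TimeSpan delayBetweenAttempts,
        CancellationToken ct = default)
    {
        if (maxAttempts <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                await publish(ct);
                return true;
            }
            catch (Exception ex) when (ex is not OperationCanceledException)
            {
                // Wait before the next attempt, but not after the last one.
                if (attempt < maxAttempts)
                    await Task.Delay(delayBetweenAttempts, ct);
            }
        }
        return false;
    }
}
```

If the process crashes between the database commit and a successful attempt, no record of the pending message remains, which is the gap the outbox table closes.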

9. Conclusion

The Outbox Pattern isn't new — it has existed for decades in enterprise systems. But with the rise of microservices, it's become more important than ever. Key takeaways:

  • Never write to two different systems simultaneously — always write to one, then relay to the other
  • Polling is simple, CDC is powerful — start with polling, upgrade to CDC as throughput grows
  • Outbox + Inbox = effectively exactly-once processing: at-least-once delivery plus an idempotent consumer
  • MassTransit in .NET has it built-in, no need to build from scratch
  • Monitor outbox lag — this is the lifeline metric of your system

Start with the Outbox Pattern using a polling publisher. When the system scales, switch to CDC with Debezium. When you need exactly-once, add the Inbox Pattern. That's the practical roadmap for any microservices project.
