
Message Queues for .NET: RabbitMQ vs Azure Service Bus vs Kafka

How to pick a message queue for .NET: RabbitMQ, Azure Service Bus, Kafka. Queue vs topic, MassTransit vs raw client, at-least-once semantics.

Table of contents
  1. When does a message queue actually replace a synchronous call?
  2. What numbers should I budget for queue choice?
  3. What does the minimal architecture look like?
  4. What is the .NET 10 wiring with MassTransit?
  5. What failure modes does a queue introduce?
  6. When should you not use a queue?
  7. Where should you go from here?

The first time a synchronous HTTP request times out and you realise the receiver was just slow, not broken, you have met the queue's favourite use case. This chapter shows when a queue beats an HTTP call, which queue to pick for a .NET service, and the MassTransit wiring that makes the choice forgiving.

When does a message queue actually replace a synchronous call?

Three concrete signals.

Long work that exceeds the HTTP request budget. Generating a PDF takes 5 seconds; the user's browser will not wait. Accept the request, drop a message on the queue, return 202 Accepted with a status URL.

Bursty traffic with a slow downstream. A flash sale spikes checkout requests 100x. The payment provider supports 200 req/s. A queue absorbs the burst; the consumer drains at the supported rate.
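The drain time is worth computing before you commit. A back-of-the-envelope sketch using this section's figures (the 200 req/s provider limit is from the text; the 2,000 req/s spike and 60-second burst window are assumed for illustration):

```csharp
// Back-of-the-envelope drain time for the flash-sale burst.
const int burstRate = 2_000;   // checkout requests per second during the spike (assumed)
const int consumerRate = 200;  // what the payment provider sustains
const int burstSeconds = 60;   // length of the spike (assumed)

// Messages accumulate at (in - out) for the duration of the burst...
int backlog = (burstRate - consumerRate) * burstSeconds;

// ...then drain at the consumer rate once the spike ends.
double drainMinutes = backlog / (double)consumerRate / 60;

Console.WriteLine($"backlog: {backlog} msgs, drain: {drainMinutes:F0} min");
// backlog: 108000 msgs, drain: 9 min
```

Nine minutes of delayed confirmations may be fine for emails and fatal for payments; the arithmetic tells you which before the sale starts.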

Cross-service notification. Order created → ship inventory → send email → update analytics. Each receiver should not slow down the writer; each should retry independently. A topic broadcasts the event; consumers subscribe at their own pace.

If none of these are true, an HTTP call with Polly retries (chapter 11) is simpler and easier to debug.
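For comparison, the synchronous alternative is only a few lines. A minimal sketch using Polly v8's ResiliencePipeline; the endpoint URL and the status-code predicate are illustrative, not prescriptive:

```csharp
using Polly;
using Polly.Retry;

var http = new HttpClient();

// Retry transient failures up to 3 times with exponential backoff.
var pipeline = new ResiliencePipelineBuilder<HttpResponseMessage>()
    .AddRetry(new RetryStrategyOptions<HttpResponseMessage>
    {
        MaxRetryAttempts = 3,
        BackoffType = DelayBackoffType.Exponential,
        Delay = TimeSpan.FromMilliseconds(200), // ~200 ms, 400 ms, 800 ms
        ShouldHandle = new PredicateBuilder<HttpResponseMessage>()
            .Handle<HttpRequestException>()              // network faults
            .HandleResult(r => (int)r.StatusCode >= 500) // transient server errors
    })
    .Build();

var response = await pipeline.ExecuteAsync(
    async ct => await http.GetAsync("https://example.com/api/orders", ct));
```

Everything stays in one request scope: one stack trace, one timeout budget, no broker to operate.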

What numbers should I budget for queue choice?

Backend                  Throughput          Latency p99       Storage durability
RabbitMQ (single node)   ~30K msg/s          ~10 ms            optional disk
RabbitMQ (cluster)       ~100K msg/s         ~20 ms            disk + mirroring
Azure Service Bus Std    ~2K msg/s           ~50 ms            built-in
Azure Service Bus Prem   ~10-50K msg/s       ~20 ms            built-in
Amazon SQS               ~3K msg/s/queue     ~50-200 ms        built-in
Kafka                    100K-1M msg/s       ~5 ms             disk, replicated

For most .NET services the throughput numbers are irrelevant - any backend handles your traffic. Pick by semantics (queue vs topic, ordered vs unordered, replay vs no-replay) and operational fit (self-host vs managed, cloud lock-in tolerance).

What does the minimal architecture look like?

flowchart LR
    Producer[ASP.NET Core API] -->|publish| Q[(RabbitMQ queue)]
    Q -->|consume| Worker[Background Service]
    Worker --> DB[(Postgres)]
    Worker -. on retry exceeded .-> DLQ[(Dead-letter queue)]

Producer accepts the user request, returns 202, publishes a message. Consumer (a BackgroundService host) drains the queue, processes each message, acks it. Repeated failures route to a dead-letter queue for inspection. This is the shape of 90% of .NET background work and it scales by adding consumer replicas.

What is the .NET 10 wiring with MassTransit?

Producer side, in Program.cs:

builder.Services.AddMassTransit(x =>
{
    x.UsingRabbitMq((ctx, cfg) =>
    {
        cfg.Host(builder.Configuration.GetConnectionString("RabbitMq"));
        cfg.ConfigureEndpoints(ctx);
    });
});

// Publish from a controller:
[ApiController]
[Route("orders")]
public class OrderController(IPublishEndpoint bus) : ControllerBase
{
    [HttpPost]
    public async Task<IActionResult> Create(OrderRequest req)
    {
        var orderId = Guid.NewGuid();
        await bus.Publish(new OrderCreated(orderId, req.UserId, req.Items));
        // 202 plus a status URL the client can poll (path illustrative).
        return Accepted($"/orders/{orderId}/status", new { orderId });
    }
}

Consumer side, in a separate worker project:

// Message contract - must match the producer's type (MassTransit routes by full type name).
public record OrderCreated(Guid OrderId, Guid UserId, IReadOnlyList<Item> Items);

public class OrderCreatedConsumer(IOrderProcessor processor)
    : IConsumer<OrderCreated>
{
    public async Task Consume(ConsumeContext<OrderCreated> ctx)
    {
        // Idempotent - safe to call twice with same OrderId.
        await processor.ProcessAsync(ctx.Message.OrderId, ctx.CancellationToken);
    }
}

builder.Services.AddMassTransit(x =>
{
    x.AddConsumer<OrderCreatedConsumer>();
    x.UsingRabbitMq((ctx, cfg) =>
    {
        cfg.Host(builder.Configuration.GetConnectionString("RabbitMq"));
        cfg.ReceiveEndpoint("order-created", ep =>
        {
            ep.UseMessageRetry(r => r.Exponential(5,
                TimeSpan.FromSeconds(1), TimeSpan.FromMinutes(1), TimeSpan.FromSeconds(5)));
            ep.ConfigureConsumer<OrderCreatedConsumer>(ctx);
        });
    });
});

Three things to note. The same code runs against Azure Service Bus by changing one line (UsingAzureServiceBus). MassTransit handles retries with exponential backoff out of the box. The consumer must be idempotent: processing the same message twice must leave the same state as processing it once.
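What idempotent means for the consumer above can be sketched with a guard keyed by OrderId. The in-memory HashSet is illustration only; in production the guard is a unique-constrained database row written in the same transaction as the side effect:

```csharp
// In-memory for illustration; production replaces the HashSet with a
// unique-constrained "processed_messages" row committed in the same
// transaction as the side effect.
public class IdempotentOrderProcessor
{
    private readonly HashSet<Guid> _processed = new();

    public int ProcessedCount => _processed.Count;

    public Task ProcessAsync(Guid orderId, CancellationToken ct)
    {
        // At-least-once delivery: the same OrderId may arrive twice
        // if the first ack was lost in transit.
        if (!_processed.Add(orderId))
            return Task.CompletedTask; // duplicate: already handled, just ack

        // ... real side effect here (persist order, charge payment) ...
        return Task.CompletedTask;
    }
}
```

The key property: the duplicate path acks the message without repeating the side effect, so a redelivery is harmless.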

What failure modes does a queue introduce?

Five that show up first.

One: duplicate delivery. Delivery is at-least-once; if an ack is lost, the consumer sees the same message again. The handler must be idempotent.

Two: poison messages. A message that fails on every retry burns through the retry budget until it routes to the dead-letter queue - which someone has to watch.

Three: unbounded queue depth. When consumers fall behind producers, the queue grows until the broker hits its disk or memory limits.

Four: consumer lag. The producer returned 202 but the work is not done yet; users see stale state until the consumer catches up.

Five: lost ordering. With multiple consumer replicas, messages are processed out of order unless the broker provides partitioned ordering.

Chapter 13 shows how to expose queue_depth, consumer_lag_seconds, and dlq_count_total through OpenTelemetry.

When should you not use a queue?

Three smells.

One: synchronous user-visible work. If the user is staring at a spinner waiting for the result, a queue does not help. They want the answer in 200 ms; the queue makes them wait for the consumer to pick up the message. Use the cache layer or a faster query.

Two: tiny throughput. A queue at 1 msg/min is operational overhead with no benefit. A BackgroundService polling the database is simpler.
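That simpler alternative can be sketched as a hosted service on a PeriodicTimer. IPendingWorkStore, WorkItem, and the 30-second interval are assumptions for illustration:

```csharp
using Microsoft.Extensions.Hosting;

// Hypothetical store abstraction; the polling shape is what matters.
public record WorkItem(Guid Id);

public interface IPendingWorkStore
{
    Task<IReadOnlyList<WorkItem>> TakePendingAsync(CancellationToken ct);
    Task ProcessAsync(WorkItem item, CancellationToken ct);
}

public class PendingWorkPoller(IPendingWorkStore store) : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        // PeriodicTimer (since .NET 6) replaces the old Task.Delay loop.
        using var timer = new PeriodicTimer(TimeSpan.FromSeconds(30));
        while (await timer.WaitForNextTickAsync(stoppingToken))
        {
            foreach (var item in await store.TakePendingAsync(stoppingToken))
                await store.ProcessAsync(item, stoppingToken);
        }
    }
}
```

At 1 msg/min a 30-second poll adds at most half a minute of latency and removes an entire broker from your on-call surface.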

Three: workflow with strict cross-service consistency. A queue is at-least-once and unordered. If you need exactly-once multi-service consistency, the saga pattern is the right shape - and even then, queues are the transport, not the answer.

Where should you go from here?

Next chapter: API styles for .NET - REST, gRPC, GraphQL, and when each one wins. Synchronous APIs and async queues compose into every real .NET service; the next chapter completes the synchronous half.

Frequently asked questions

When does a queue beat a synchronous HTTP call?
Three cases: (1) the work takes longer than the HTTP timeout you can tolerate (sending email, generating PDF, ML inference); (2) the receiver may be temporarily down and you must not lose the request (order processing during a payment outage); (3) you want to absorb traffic spikes without scaling the receiver. If none apply, an HTTP call with retries is simpler.
RabbitMQ or Kafka for my .NET service?
RabbitMQ for task queues - one consumer takes a message, processes it, deletes it. Kafka for event logs - many consumers read the same stream at their own pace, replay history, and the producer doesn't know who is listening. The throughput claims of Kafka rarely matter for a .NET app under 10K msg/s; pick by semantics, not benchmarks.
Should I use MassTransit or the raw client?
MassTransit, almost always. It abstracts RabbitMQ, Azure Service Bus, and Amazon SQS behind one API, gives you sagas (chapter 12), retries with backoff, and instrumentation. The raw client is correct only when you need a feature MassTransit hides (low-level RabbitMQ exchange topology, Kafka headers) or when the abstraction overhead is measurable - rare under 10K msg/s.
What is at-least-once delivery and why does it matter?
Most queues do not guarantee 'exactly once' - they guarantee 'at least once'. A consumer may receive the same message twice if the ack was lost. Your handler must be idempotent: applying it twice produces the same state as applying it once. Chapter 10 is dedicated to this.