Design a Multi-Channel Notification System in .NET
How to build a notification service in .NET: email, SMS, push channels, preference store, deduplication, and the queue pipeline that delivers reliably.
Table of contents
- When does a dedicated notification service pay off?
- What numbers should I budget for?
- What does the architecture look like?
- What is the .NET 10 wiring for the routing worker?
- What scale-out path does this support?
- What failure modes does this introduce?
- When is a notification service overkill?
- Where should you go from here?
A notification system is the cleanest example of fan-out across channels with strict preference rules. This chapter designs one in .NET: events come in, preferences route them, channel-specific queues drain to providers, and delivery is tracked. The same patterns from earlier chapters - queue, idempotency, outbox - compose into a service that survives provider outages.
When does a dedicated notification service pay off?
Three signals.
Multiple senders, same logic. When five different services each send their own emails, you end up with five inconsistent implementations of templating, retry, and unsubscribe handling. Centralising stops that drift.
Preference and unsubscribe compliance. GDPR, CAN-SPAM, and similar regulations require auditable opt-out handling. A central service is the only practical place to enforce it.
Multi-channel routing. "Order shipped" goes to email + push + SMS depending on user preference. The routing logic does not belong in the order service.
If you only have one sender, one channel, and no compliance
constraints, a SendEmailAsync method on a service is enough.
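For that minimal case, a direct SMTP send is all you need. A sketch using MailKit; the host, port, and credentials are placeholders:

```csharp
using MailKit.Net.Smtp;
using MailKit.Security;
using MimeKit;

public class EmailSender
{
    // Minimal direct send - no queue, no preferences, no retry.
    public async Task SendEmailAsync(string to, string subject, string body)
    {
        var message = new MimeMessage();
        message.From.Add(MailboxAddress.Parse("orders@example.com"));
        message.To.Add(MailboxAddress.Parse(to));
        message.Subject = subject;
        message.Body = new TextPart("plain") { Text = body };

        using var client = new SmtpClient();
        await client.ConnectAsync("smtp.example.com", 587, SecureSocketOptions.StartTls);
        await client.AuthenticateAsync("smtp-user", "smtp-password");
        await client.SendAsync(message);
        await client.DisconnectAsync(true);
    }
}
```

Everything in the rest of this chapter exists to replace this method once it stops being enough.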
What numbers should I budget for?
- Events / day: 10M
- Channels per event (avg): 1.5
- Notifications / day: 15M
- Peak rate: 15M / ~100K seconds per day × 5 burst factor ≈ 750/s
- Email send rate: 100/s per provider
- SMS send rate: 10/s per provider
- Push send rate: ~1000/s
- Storage: 1 KB per notification × 15M = 15 GB/day
The numbers say: one ASP.NET Core fleet, one Postgres for storage and preferences, one Redis cache for idempotency keys, three queues (one per channel). The provider rate limits are usually the bottleneck.
What does the architecture look like?
```mermaid
flowchart LR
    Producer[Order service] --> Q0[(events queue)]
    Q0 --> Router[Routing worker]
    Router --> Pref[(Preferences DB)]
    Router --> QE[(email queue)]
    Router --> QS[(SMS queue)]
    Router --> QP[(push queue)]
    QE --> WE[Email worker] --> Mailgun
    QS --> WS[SMS worker] --> Twilio
    QP --> WP[Push worker] --> FCM
    WE --> NDB[(notifications log)]
    WS --> NDB
    WP --> NDB
```
Producer publishes one event per business action. Routing worker
looks up preferences, decomposes into channel-specific messages,
publishes to channel queues. Each channel worker drains its queue
to its provider. All deliveries write to a notifications log
for audit and analytics.
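The topology above maps directly onto bus configuration. A minimal sketch of the host wiring, assuming MassTransit over RabbitMQ; the queue names, consumer names, and concurrency limit are illustrative:

```csharp
// Program.cs - hypothetical wiring; one receive endpoint per queue in the diagram.
builder.Services.AddMassTransit(x =>
{
    x.AddConsumer<NotificationRouter>();   // drains the events queue
    x.AddConsumer<EmailSendConsumer>();    // drains the email queue
    x.AddConsumer<SmsSendConsumer>();
    x.AddConsumer<PushSendConsumer>();

    x.UsingRabbitMq((ctx, cfg) =>
    {
        cfg.ReceiveEndpoint("events", e => e.ConfigureConsumer<NotificationRouter>(ctx));
        cfg.ReceiveEndpoint("notify-email", e =>
        {
            e.ConcurrentMessageLimit = 8;  // keep well below the provider's 100/s cap
            e.ConfigureConsumer<EmailSendConsumer>(ctx);
        });
        cfg.ReceiveEndpoint("notify-sms", e => e.ConfigureConsumer<SmsSendConsumer>(ctx));
        cfg.ReceiveEndpoint("notify-push", e => e.ConfigureConsumer<PushSendConsumer>(ctx));
    });
});
```

Scaling a channel is then a deployment concern - add instances consuming that endpoint - not a code change.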
What is the .NET 10 wiring for the routing worker?
```csharp
public record OrderShipped(Guid OrderId, Guid UserId, string TrackingNumber);

public class NotificationRouter(
    IPreferenceService prefs,
    IPublishEndpoint bus,
    AppDbContext db) : IConsumer<OrderShipped>
{
    public async Task Consume(ConsumeContext<OrderShipped> ctx)
    {
        var msg = ctx.Message;
        var userPrefs = await prefs.GetAsync(msg.UserId, "order_shipped");

        // Persist the notification record before fanning out to channels.
        var notification = new Notification
        {
            Id = Guid.NewGuid(),
            EventType = "order_shipped",
            UserId = msg.UserId,
            Payload = JsonSerializer.Serialize(msg),
            CreatedAt = DateTimeOffset.UtcNow
        };
        db.Notifications.Add(notification);
        await db.SaveChangesAsync();

        // Fan out only to the channels the user has opted into.
        if (userPrefs.Email)
            await bus.Publish(new SendEmail(notification.Id, msg.UserId, "order_shipped", msg));
        if (userPrefs.Sms)
            await bus.Publish(new SendSms(notification.Id, msg.UserId, "order_shipped", msg));
        if (userPrefs.Push)
            await bus.Publish(new SendPush(notification.Id, msg.UserId, "order_shipped", msg));
    }
}
```
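The router writes to Postgres and then publishes to the bus: two systems, two failure points. The outbox pattern from earlier chapters closes that gap. A sketch of the MassTransit Entity Framework outbox configuration, assuming MassTransit v8 with the Postgres provider:

```csharp
// Hypothetical wiring: with the bus outbox enabled, Publish(...) calls made
// inside Consume are staged in the DbContext and flushed to the broker only
// after SaveChangesAsync commits - no half-written fan-out.
builder.Services.AddMassTransit(x =>
{
    x.AddEntityFrameworkOutbox<AppDbContext>(o =>
    {
        o.UsePostgres();                          // outbox tables live in the same DB
        o.UseBusOutbox();                         // defer publishes until commit
        o.QueryDelay = TimeSpan.FromSeconds(1);   // sweep interval for pending messages
    });
    // ... consumer registrations as before
});
```

Without this, a crash between SaveChangesAsync and the last Publish leaves a notification logged but never delivered on some channels.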
// Email worker - idempotent send
```csharp
// Email worker - idempotent send
public class EmailSendConsumer(IMailgunClient mailgun, IConnectionMultiplexer redis,
    ITemplateRenderer renderer, AppDbContext db)
    : IConsumer<SendEmail>
{
    public async Task Consume(ConsumeContext<SendEmail> ctx)
    {
        // Claim the idempotency key atomically; only one worker wins.
        // Claiming before the send trades a possible lost message (crash
        // between claim and send) for a guarantee of no duplicates.
        var key = $"sent:email:{ctx.Message.NotificationId}";
        var redisDb = redis.GetDatabase();
        if (!await redisDb.StringSetAsync(key, "1", TimeSpan.FromDays(7), When.NotExists))
            return; // already sent

        var template = await db.Templates.FirstAsync(t => t.EventType == ctx.Message.EventType
                                                          && t.Channel == "email");
        var rendered = renderer.Render(template.Body, ctx.Message.Payload);
        var providerId = await mailgun.SendAsync(ctx.Message.UserId, template.Subject, rendered);

        // Log the delivery with the provider's ID for bounce correlation.
        db.NotificationDeliveries.Add(new()
        {
            NotificationId = ctx.Message.NotificationId,
            Channel = "email",
            ProviderId = providerId,
            DeliveredAt = DateTimeOffset.UtcNow
        });
        await db.SaveChangesAsync();
    }
}
```
Three details. The StringSetAsync with When.NotExists is the atomic idempotency check: only one worker can claim the key, so a redelivered message skips the send. Note that the key is claimed before the send, which means a crash between the two drops the notification rather than duplicating it; reverse the order, or wrap both steps (as the Lua approach in the FAQ does), if at-least-once matters more for your channel. The template renderer is pluggable - swap Handlebars for Liquid for whatever - so non-engineers can edit templates without code changes. Every delivery is logged with the provider's ID so you can correlate bounces and complaints back to notifications.
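The renderer interface is small enough to sketch. A minimal, hypothetical stand-in for Handlebars.NET or Liquid, using regex substitution and a flat dictionary instead of the real engines' object models:

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public interface ITemplateRenderer
{
    string Render(string template, IReadOnlyDictionary<string, string> values);
}

// Replaces {{ key }} placeholders; unknown placeholders are left intact
// so a typo is visible in the output rather than silently dropped.
public class SimpleTemplateRenderer : ITemplateRenderer
{
    private static readonly Regex Placeholder = new(@"\{\{\s*([\w.]+)\s*\}\}");

    public string Render(string template, IReadOnlyDictionary<string, string> values) =>
        Placeholder.Replace(template, m =>
            values.TryGetValue(m.Groups[1].Value, out var v) ? v : m.Value);
}
```

A real engine adds conditionals, loops, and HTML escaping; the interface boundary is what matters here, because it is what lets you swap engines without touching the workers.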
What scale-out path does this support?
- Routing worker: parallelisable - partition by user ID hash so one user's events stay in order if needed.
- Channel workers: scale per channel; each channel often has a fixed provider rate limit, so the bottleneck is the provider.
- Storage: partition notifications by month; archive after 90 days.
- Preferences: cached in Redis; backed by Postgres.
For very high volume (>1M notifications/min), specialised systems like AWS SNS or Firebase replace the SMS/push workers entirely. The routing logic stays in your service.
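The cached preference lookup is a plain cache-aside read. A sketch, assuming the IPreferenceService interface the router consumes; the key shape, the ChannelPrefs record, the Preferences DbSet, and the opt-in defaults are all illustrative:

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;
using StackExchange.Redis;

public record ChannelPrefs(bool Email, bool Sms, bool Push);

public class CachedPreferenceService(IConnectionMultiplexer redis, AppDbContext db)
    : IPreferenceService
{
    public async Task<ChannelPrefs> GetAsync(Guid userId, string eventType)
    {
        var cache = redis.GetDatabase();
        var key = $"prefs:{userId}:{eventType}";

        var cached = await cache.StringGetAsync(key);
        if (cached.HasValue)
            return JsonSerializer.Deserialize<ChannelPrefs>(cached.ToString())!;

        // Miss: read Postgres, then cache for an hour (preferences change rarely).
        var row = await db.Preferences.FindAsync(userId, eventType);
        var prefs = row is null
            ? new ChannelPrefs(Email: true, Sms: false, Push: false) // illustrative default
            : new ChannelPrefs(row.Email, row.Sms, row.Push);

        await cache.StringSetAsync(key, JsonSerializer.Serialize(prefs),
            TimeSpan.FromHours(1));
        return prefs;
    }
}
```

On a preference update, delete the key rather than waiting out the TTL, so opt-outs take effect immediately.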
What failure modes does this introduce?
- Provider outage - email provider returns 503. Mitigation: Polly retries; fallback to secondary provider; queue does not lose messages.
- Spam complaints - users report your emails as spam, hurting domain reputation. Mitigation: enforce unsubscribe, suppress bounce list automatically, monitor complaint rate.
- Template error - placeholder typo crashes renderer for that event. Mitigation: schema validation at template-save time; isolate one bad event from the queue (poison-message handling).
- Loop / runaway - a notification triggers an event that triggers another notification. Mitigation: dedup based on (userId, eventType, contentHash) for short windows.
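The loop guard reduces to a deterministic key plus a Redis SET NX with a short TTL. A sketch of the key derivation; the field names and the five-minute window are illustrative:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class DedupKey
{
    // Deterministic key for (userId, eventType, contentHash); identical
    // content within the TTL window maps to the same key and is dropped.
    public static string For(Guid userId, string eventType, string content)
    {
        var hash = Convert.ToHexString(
            SHA256.HashData(Encoding.UTF8.GetBytes(content)))[..16];
        return $"dedup:{userId}:{eventType}:{hash}";
    }
}

// In the routing worker (sketch):
// if (!await redisDb.StringSetAsync(DedupKey.For(msg.UserId, "order_shipped", payload),
//         "1", TimeSpan.FromMinutes(5), When.NotExists))
//     return; // duplicate within the window - drop it
```

Hashing the content (not just the event ID) is what breaks loops: a runaway cycle regenerates the same payload, while two genuinely different notifications to the same user still get through.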
When is a notification service overkill?
When you send transactional emails only and have no preference
matrix. A MailKit call from the order service is fine for that
volume. Build the dedicated service when you have multiple senders,
multiple channels, or compliance requirements - not before.
Where should you go from here?
Next case study: file upload service - the shape changes (large binary objects, presigned URLs) but many of the same building blocks (queue, idempotency, observability) carry over.
Frequently asked questions
Why one queue per channel?
Because each channel has its own provider rate limit and its own failure modes. Separate queues let the email backlog grow during a Mailgun outage without delaying push, and let each worker pool drain at its provider's rate.
How do I avoid sending duplicates?
Each channel message carries an idempotency key: hash(eventId + userId + channel); the channel sender checks Redis for the key before sending. If present, skip. The send + key-write is wrapped in a Lua script for atomicity. The idempotency chapter explains the pattern in full.
Where does the user preference data live?
In Postgres, as a (user_id, event_type) -> channels[] mapping. Cache hot rows in Redis. Preferences change rarely; cache TTL of an hour is safe. The lookup happens in the routing worker - if the user has opted out of email, the email queue is never written to.
How do templates work?
Templates are stored per event type and channel, with placeholders (e.g. {{ user.name }}). The renderer (Handlebars.NET or Liquid) substitutes at send time. Localisation lives in a template_translations (template_id, locale, body) table. Versioning is by template_id; when a marketer edits, a new version is created and the old one is preserved for reproducibility.