Design a File Upload Service in .NET (Chunked, Resumable)
How to build a file upload service in .NET: presigned URLs, chunked resumable uploads to Azure Blob or S3, virus scanning pipeline, and metadata in Postgres.
Table of contents
- When does the file upload service get serious?
- What numbers should I budget for?
- What does the architecture look like?
- What is the .NET 10 wiring for the upload flow?
- What scale-out path does this support?
- What failure modes does this introduce?
- When is a custom service overkill?
- Where should you go from here?
A file upload service is one of those problems that look simple
until you have to scale. The first version - a multipart/form-data
endpoint that streams to local disk - falls over the moment you
have multiple replicas, large files, or a virus to scan. This
chapter designs the production shape: presigned URLs, chunked
resumable upload, and the asynchronous scanning pipeline.
When does the file upload service get serious?
Three signals.
Files larger than 10 MB. Streaming through your service uses bandwidth and threads you cannot afford under load.
Mobile or unreliable clients. A 50 MB upload over hotel Wi-Fi will fail; the client must be able to resume rather than restart.
Files come from untrusted users. Anywhere users upload, virus scanning is mandatory; downloads must check scan status before serving.
If none of these hold (small avatars from authenticated employees), a direct stream-to-blob is fine.
What numbers should I budget for?
| Dimension | Estimate |
| --- | --- |
| Uploads / day | 500K |
| Avg file size | 5 MB |
| Storage / year | 5 MB × 500K × 365 ≈ 900 TB |
| Peak uploads / s | 500K / ~100K s per day × 5 peak factor = 25/s |
| Bandwidth in (peak) | 25 × 5 MB = 125 MB/s |
| CDN egress (downloads) | typically 5× upload bandwidth |
The 900 TB / year number tells you object storage is the only practical answer; that volume in a database would be a disaster. The 25/s peak is small for the application tier - signing URLs is trivial work.
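The back-of-the-envelope above is easy to sanity-check in a few lines of plain arithmetic (using the usual ~100K-seconds-per-day rounding and decimal TB):

```csharp
// Capacity check for the budget table: 500K uploads/day at 5 MB average.
const double UploadsPerDay = 500_000;
const double AvgFileMb = 5;
const double SecondsPerDayApprox = 100_000; // 86,400 rounded up for easy math
const double PeakFactor = 5;               // peak-to-average ratio assumption

double storagePerYearTb = AvgFileMb * UploadsPerDay * 365 / 1_000_000; // MB -> TB
double peakUploadsPerSec = UploadsPerDay / SecondsPerDayApprox * PeakFactor;
double peakBandwidthMbPerSec = peakUploadsPerSec * AvgFileMb;

// storage ≈ 912.5 TB/year, peak = 25/s, inbound bandwidth = 125 MB/s
Console.WriteLine($"{storagePerYearTb} TB/year, {peakUploadsPerSec}/s, {peakBandwidthMbPerSec} MB/s");
```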
What does the architecture look like?
flowchart LR
Client --> App[ASP.NET Core API]
App -->|1. POST /uploads<br/>returns presigned URL| Client
Client -->|2. PUT chunks| Blob[(Azure Blob / S3)]
Client -->|3. POST /uploads/{id}/complete| App
App --> PG[(Postgres<br/>metadata)]
App --> Q[(scan queue)]
Q --> Scanner[Antivirus worker]
Scanner --> Blob
Scanner --> PG
Reader[Download client] --> App
App -->|signed CDN URL| CDN[(CDN)] --> Reader
Three paths. Init: client asks the API for an upload URL, gets a presigned blob URL plus an upload ID. Upload: client PUTs chunks directly to blob storage. Complete: client tells API the upload is done; API enqueues the scan. Download is a short signed CDN URL.
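The Upload step is where chunking and resumability live. Against Azure Blob Storage this maps onto the Block Blob API: each chunk is staged as a block, and a final commit assembles the blocks into the blob. A minimal client-side sketch (hypothetical helper; assumes the SAS from step 1 also carries Read permission so a reconnecting client can list already-staged blocks):

```csharp
using Azure;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

const int ChunkSize = 4 * 1024 * 1024; // 4 MB per block

async Task UploadChunkedAsync(Uri sasUri, Stream file)
{
    var blob = new BlockBlobClient(sasUri);

    // Resume support: blocks staged by a previous, interrupted attempt
    // survive server-side, so list them and skip re-uploading.
    var alreadyStaged = new HashSet<string>();
    try
    {
        BlockList existing = await blob.GetBlockListAsync(BlockListTypes.Uncommitted);
        foreach (var b in existing.UncommittedBlocks) alreadyStaged.Add(b.Name);
    }
    catch (RequestFailedException) { /* nothing staged yet: fresh upload */ }

    var blockIds = new List<string>();
    var buffer = new byte[ChunkSize];
    for (int index = 0; ; index++)
    {
        int read = await file.ReadAsync(buffer);
        if (read == 0) break;

        // Block ids must be base64 and the same length within one blob.
        string blockId = Convert.ToBase64String(
            System.Text.Encoding.UTF8.GetBytes($"block-{index:D6}"));
        blockIds.Add(blockId);

        if (alreadyStaged.Contains(blockId)) continue; // resume: skip this chunk

        using var chunk = new MemoryStream(buffer, 0, read);
        await blob.StageBlockAsync(blockId, chunk);
    }

    // Atomic assembly: the blob does not exist as a readable object until commit.
    await blob.CommitBlockListAsync(blockIds);
}
```

S3 has the same shape under different names: `CreateMultipartUpload`, `UploadPart`, `CompleteMultipartUpload`.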
What is the .NET 10 wiring for the upload flow?
public record InitUploadRequest(string FileName, string ContentType, long Size);

app.MapPost("/uploads", async (InitUploadRequest req, AppDbContext db,
    BlobContainerClient blobs, ClaimsPrincipal user) =>
{
    var upload = new UploadSession
    {
        Id = Guid.NewGuid(),
        UserId = user.GetUserId(),
        FileName = req.FileName,
        ContentType = req.ContentType,
        Size = req.Size,
        // Random prefix: user-supplied names can never collide or traverse paths.
        BlobName = $"{Guid.NewGuid()}/{req.FileName}",
        Status = "pending",
        CreatedAt = DateTimeOffset.UtcNow
    };
    db.Uploads.Add(upload);
    await db.SaveChangesAsync();

    // Short-lived SAS: the client PUTs bytes straight to blob storage.
    var blobClient = blobs.GetBlobClient(upload.BlobName);
    var sas = blobClient.GenerateSasUri(
        BlobSasPermissions.Write | BlobSasPermissions.Create,
        DateTimeOffset.UtcNow.AddHours(1));

    return Results.Ok(new { uploadId = upload.Id, uploadUrl = sas.ToString() });
})
.RequireAuthorization()
.RequireRateLimiting("per-user");
app.MapPost("/uploads/{id:guid}/complete", async (Guid id, AppDbContext db,
    IPublishEndpoint bus) =>
{
    var upload = await db.Uploads.FindAsync(id);
    if (upload is null) return Results.NotFound();

    upload.Status = "uploaded";
    upload.UploadedAt = DateTimeOffset.UtcNow;
    await db.SaveChangesAsync();

    // Hand off to the scan pipeline; the file is not servable until the verdict.
    await bus.Publish(new ScanRequested(upload.Id, upload.BlobName));
    return Results.Ok();
});
// Scan worker (message consumer)
public class ScanConsumer(BlobContainerClient blobs, IAntiVirus av, AppDbContext db)
    : IConsumer<ScanRequested>
{
    public async Task Consume(ConsumeContext<ScanRequested> ctx)
    {
        // Stream the blob through the scanner; never buffer the whole file.
        await using var stream = await blobs
            .GetBlobClient(ctx.Message.BlobName)
            .OpenReadAsync(cancellationToken: ctx.CancellationToken);
        var verdict = await av.ScanAsync(stream, ctx.CancellationToken);

        var upload = await db.Uploads.FindAsync(ctx.Message.UploadId);
        if (upload is null) return; // session deleted since the message was enqueued

        upload.Status = verdict.IsClean ? "clean" : "quarantined";
        upload.ScanResult = verdict.Detail;
        await db.SaveChangesAsync();
    }
}
// Download endpoint
app.MapGet("/files/{id:guid}", async (Guid id, AppDbContext db, BlobContainerClient blobs) =>
{
    var upload = await db.Uploads.FindAsync(id);
    // Status gate: pending, uploaded, and quarantined files are all invisible.
    if (upload is null || upload.Status != "clean") return Results.NotFound();

    var sas = blobs.GetBlobClient(upload.BlobName).GenerateSasUri(
        BlobSasPermissions.Read, DateTimeOffset.UtcNow.AddMinutes(15));
    return Results.Redirect(sas.ToString()); // 302 to CDN
});
Three details. The presigned URL has a tight expiry (1 hour for upload, 15 min for download) so leaked URLs die quickly. The scan status gate prevents serving infected files. The download endpoint redirects rather than streaming - the CDN serves the bytes.
What scale-out path does this support?
- API tier: stateless, scales horizontally with replicas.
- Object storage: S3 / Azure Blob scale infinitely; cost is storage + egress.
- Scanner workers: parallelisable; partition by file hash so duplicates dedupe.
- CDN: handles all download traffic; the API never serves bytes.
- Metadata DB: partition the uploads table by month after the first year.
The bandwidth bottleneck disappears with this design - every byte goes between the client and blob storage / CDN, never your service.
What failure modes does this introduce?
- Orphaned blobs - upload session created, client never completes. Mitigation: a nightly job deletes blobs for sessions older than 24 hours with status=pending.
- Scanner backlog - a viral upload spike fills the scan queue. Mitigation: alert on queue depth and scale the scanner pool on that metric.
- Virus uploaded then served - a race between upload-complete and scan finish. Mitigation: the status gate on download; never serve until status=clean.
- CDN cache poisoning - a signed URL cached by the CDN and reused after expiry. Mitigation: short signed-URL TTL plus a CDN cache key that includes the signature.
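The orphaned-blob mitigation is small enough to sketch. A hosted worker of this shape would run the cleanup (illustrative names: `OrphanCleanupService` is hypothetical, `AppDbContext` and the `pending` status come from the chapter's code; a daily `PeriodicTimer` stands in for whatever scheduler you already run):

```csharp
using Azure.Storage.Blobs;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

// Nightly sweep: sessions still "pending" after 24 hours lose their blob
// (if the client ever uploaded one) and their metadata row.
public class OrphanCleanupService(IServiceScopeFactory scopes,
    BlobContainerClient blobs) : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken ct)
    {
        using var timer = new PeriodicTimer(TimeSpan.FromHours(24));
        while (await timer.WaitForNextTickAsync(ct))
        {
            // DbContext is scoped; a singleton BackgroundService must resolve it per tick.
            await using var scope = scopes.CreateAsyncScope();
            var db = scope.ServiceProvider.GetRequiredService<AppDbContext>();

            var cutoff = DateTimeOffset.UtcNow.AddHours(-24);
            var stale = await db.Uploads
                .Where(u => u.Status == "pending" && u.CreatedAt < cutoff)
                .ToListAsync(ct);

            foreach (var u in stale)
            {
                // DeleteIfExists: the client may never have PUT a single byte.
                await blobs.GetBlobClient(u.BlobName)
                    .DeleteIfExistsAsync(cancellationToken: ct);
                db.Uploads.Remove(u);
            }
            await db.SaveChangesAsync(ct);
        }
    }
}
```

Delete the blob before the row, so a crash mid-sweep leaves a retryable row rather than an unreferenced blob.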
When is a custom service overkill?
For small avatars and document attachments inside a SaaS, hosted services like Uploadcare, Filestack, or Cloudinary handle the entire flow including scanning and CDN. They are cheaper than building, operating, and securing your own pipeline. Build custom when volume, compliance, or integration depth justifies it.
Where should you go from here?
Next case study: typeahead autocomplete - the data-structure-heavy case study, where Redis sorted sets and trie data structures earn their keep.
Frequently asked questions
Why presigned URLs instead of streaming through my service?
Because streaming burns your API tier's bandwidth and threads on work blob storage does better. With a presigned URL the client PUTs bytes directly to Azure Blob or S3; the API only signs URLs and records metadata, so it never becomes the bandwidth bottleneck.
How do chunked resumable uploads work?
The client splits the file into fixed-size chunks and uploads each one independently - as a staged block on an Azure Block Blob or a part in an S3 multipart upload. Chunks that succeeded before a disconnect are not re-sent; once every chunk is staged, a single commit call assembles them into the final object.
Where does virus scanning fit?
When the client calls the complete endpoint, an event goes to a scan queue, a worker pulls the file from blob storage, runs ClamAV or Defender, and updates the metadata to status=clean or quarantined. The download endpoint refuses to serve files in a non-clean state.