Load Testing for Distributed Systems — k6, NBomber and Performance Testing Strategies

Posted on: 4/25/2026 1:14:20 PM

You just deployed a new microservice — everything runs smoothly on staging with response times under 200ms and zero errors. But when production hits 10,000 concurrent users, the system collapses within 3 minutes. Load testing is the defense layer that helps you discover these weaknesses before they become production incidents.

This article dives deep into performance testing strategies for distributed systems, comparing two leading tools — Grafana k6 (JavaScript/TypeScript) and NBomber (.NET/C#) — along with practical patterns for integrating load testing into your CI/CD pipeline.


1. Why Load Testing Matters for Distributed Systems

In monolithic architecture, bottlenecks typically reside at a single point — the database or application server. But with distributed systems, things get much more complex: one slow service can cause cascading failures across the entire call chain, and connection pool exhaustion at one node can affect dozens of downstream services.
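The cascading behavior above has a simple queueing-theory intuition. Here is a back-of-the-envelope sketch (our own illustration, using the textbook M/M/1 formula, not something the architecture diagram below specifies):

```typescript
// M/M/1 queue: with service rate mu (req/s) and arrival rate lambda (req/s),
// the mean time in system is W = 1 / (mu - lambda) seconds.
// As utilization approaches 1, latency grows without bound -- which is how one
// saturated node stalls every caller upstream of it.

function meanLatencyMs(mu: number, lambda: number): number {
  if (lambda >= mu) return Infinity; // unstable: the backlog grows forever
  return 1000 / (mu - lambda);
}

// A service rated at 1000 req/s feels instant at half load,
// then degrades non-linearly as traffic creeps toward capacity:
for (const lambda of [500, 900, 990, 999]) {
  console.log(`${lambda} req/s -> ~${meanLatencyMs(1000, lambda)}ms`);
}
```

Note the jump from 900 to 999 req/s: throughput rises about 11% while mean latency grows 100x. This is why breaking points feel sudden in production.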

graph LR
    A[Client<br/>10K req/s] --> B[API Gateway]
    B --> C[Auth Service]
    B --> D[Product Service]
    B --> E[Order Service]
    D --> F[(Database)]
    E --> F
    E --> G[Payment Service]
    G --> H[External Payment API]
    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#2c3e50,stroke:#fff,color:#fff
    style C fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style D fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style E fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style F fill:#16213e,stroke:#fff,color:#fff
    style G fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style H fill:#ff9800,stroke:#fff,color:#fff

Typical microservices architecture — every node can be a bottleneck under high load

Load testing for distributed systems needs to answer these questions:

  • Maximum throughput — How many requests/second can the system handle before response time exceeds the acceptable threshold?
  • Breaking point — At what load level does the system start returning errors or timeouts?
  • Cascading failure — When one service is overloaded, how far does the impact spread?
  • Resource saturation — Which service exhausts CPU, memory, or connection pool first?
  • Recovery time — After load decreases, how long does it take the system to return to normal state?

2. Types of Performance Testing

Each test type serves a different purpose. Understanding each one helps you design a test suite that matches your system's SLA requirements.

graph TB
    subgraph Types["Performance Testing Types"]
        A["🔥 Smoke Test<br/>Basic validation<br/>1-5 VUs"]
        B["📊 Load Test<br/>Average load<br/>Target VUs"]
        C["💪 Stress Test<br/>Beyond limits<br/>2-3x Target"]
        D["⚡ Spike Test<br/>Sudden surge<br/>0 → Max → 0"]
        E["🕐 Soak Test<br/>Long-running<br/>4-24 hours"]
        F["🎯 Breakpoint Test<br/>Find limits<br/>Continuous ramp"]
    end
    style A fill:#4CAF50,stroke:#fff,color:#fff
    style B fill:#2196F3,stroke:#fff,color:#fff
    style C fill:#ff9800,stroke:#fff,color:#fff
    style D fill:#e94560,stroke:#fff,color:#fff
    style E fill:#9C27B0,stroke:#fff,color:#fff
    style F fill:#16213e,stroke:#fff,color:#fff

6 main performance testing types — from basic validation to finding system breaking points

| Test Type | Objective | Load Model | Duration | When to Use |
|---|---|---|---|---|
| Smoke Test | Verify script correctness | 1-5 constant VUs | 1-3 min | After every test script change |
| Load Test | Evaluate normal performance | Ramp up → steady → ramp down | 10-30 min | Every sprint/release |
| Stress Test | Find system limits | 2-3x normal load | 15-30 min | Before major launches |
| Spike Test | Test sudden traffic surges | 0 → peak → 0 in seconds | 5-10 min | Flash sales, viral events |
| Soak Test | Detect memory/connection leaks | Sustained average load | 4-24 hours | Before production releases |
| Breakpoint Test | Determine max capacity | Continuous ramp until failure | Variable | Capacity planning |

Always start with a Smoke Test to ensure the script works correctly, then run a Load Test to establish a baseline, before moving to Stress/Spike/Soak for advanced scenarios. Don't jump straight to stress testing — you'll waste time debugging your test script instead of debugging the system.

3. Grafana k6 — Modern Load Testing with JavaScript/TypeScript

3.1 Why Choose k6?

k6 is an open-source load testing tool from Grafana Labs, written in Go but allowing scripting in JavaScript/TypeScript. Its biggest strength is the high-performance engine — using 70% less CPU than similar tools, allowing thousands of virtual users on a single machine.

Since version 1.0, k6 supports native TypeScript — you get type safety and IDE autocomplete without a separate build step.

3.2 Writing Your First Test Script

// load-test.ts — k6 load test for an API endpoint
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend } from 'k6/metrics';

// Custom metrics
const errorRate = new Rate('errors');
const responseTime = new Trend('response_time_ms');

// Test configuration
export const options = {
  scenarios: {
    average_load: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '2m', target: 100 },  // Ramp up to 100 VUs
        { duration: '5m', target: 100 },  // Stay at 100 VUs
        { duration: '2m', target: 0 },    // Ramp down
      ],
    },
  },
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1000'],
    errors: ['rate<0.01'],  // Error rate < 1%
    response_time_ms: ['p(95)<400'],
  },
};

export default function () {
  const res = http.get('https://api.example.com/products', {
    headers: { 'Authorization': `Bearer ${__ENV.API_TOKEN}` },
    tags: { name: 'GetProducts' },
  });

  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
    'body has products': (r) => (r.json() as unknown[]).length > 0,
  });

  errorRate.add(res.status !== 200);
  responseTime.add(res.timings.duration);

  sleep(1); // Think time between requests
}

3.3 Advanced Scenarios and Executors

k6 provides multiple executor types for different load models:

export const options = {
  scenarios: {
    // Scenario 1: Constant arrival rate — control RPS
    constant_rps: {
      executor: 'constant-arrival-rate',
      rate: 200,           // 200 iterations/second
      timeUnit: '1s',
      duration: '5m',
      preAllocatedVUs: 50,
      maxVUs: 200,
    },
    // Scenario 2: Spike test
    spike: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '10s', target: 0 },
        { duration: '10s', target: 500 },  // Spike to 500
        { duration: '30s', target: 500 },
        { duration: '10s', target: 0 },    // Drop to 0
      ],
      startTime: '6m', // Start after scenario 1
    },
  },
};

Constant VUs vs Constant Arrival Rate

VU-based executors (constant-vus, ramping-vus): each VU finishes one iteration and immediately starts the next, so RPS depends on response time — a slower server yields lower RPS. Use these when simulating a fixed number of concurrent users.

Constant Arrival Rate (constant-arrival-rate): starts exactly N iterations per second regardless of response time, spawning extra VUs as needed up to maxVUs. Use it when measuring the system under a fixed throughput — ideal for SLA testing.
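A practical corollary of Little's law: the number of concurrently running iterations equals the arrival rate times the mean iteration duration, which is the minimum VU count an arrival-rate executor needs. The sketch below is a sizing rule of thumb; the 1.5x headroom factor is our own assumption, not a k6 default:

```typescript
// Minimum VUs for a constant-arrival-rate scenario, by Little's law:
// concurrency = rate (iterations/s) x mean iteration duration (s).
// Headroom covers response-time spikes so k6 does not drop iterations.

function estimateVUs(ratePerSec: number, meanIterationSec: number, headroom = 1.5): number {
  return Math.ceil(ratePerSec * meanIterationSec * headroom);
}

// 200 iterations/s with ~250ms per iteration:
console.log(estimateVUs(200, 0.25, 1)); // bare minimum: 50 VUs
console.log(estimateVUs(200, 0.25));    // with headroom: 75 VUs
```

If iteration duration grows under load and the pool is exhausted, k6 reports dropped iterations; raising maxVUs (or fixing the bottleneck) is the remedy.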

3.4 Browser Testing with k6

k6 integrates a Playwright-compatible browser API, allowing hybrid tests — combining protocol-level (HTTP) and browser-level testing in the same scenario:

import { browser } from 'k6/browser';
import http from 'k6/http';

export const options = {
  scenarios: {
    browser_test: {
      executor: 'constant-vus',
      exec: 'browser_test', // named scenarios must point at an exported function
      vus: 5,
      duration: '3m',
      options: { browser: { type: 'chromium' } },
    },
    api_test: {
      executor: 'constant-arrival-rate',
      exec: 'api_test',
      rate: 100,
      timeUnit: '1s',
      duration: '3m',
      preAllocatedVUs: 20,
    },
  },
};

export async function browser_test() {
  const page = await browser.newPage();
  await page.goto('https://app.example.com/dashboard');

  // Measure real Web Vitals
  const lcp = await page.evaluate(() => {
    return new Promise(resolve => {
      new PerformanceObserver(list => {
        const entries = list.getEntries();
        resolve(entries[entries.length - 1].startTime);
      }).observe({ type: 'largest-contentful-paint', buffered: true });
    });
  });

  console.log(`LCP: ${lcp}ms`);
  await page.close();
}

export function api_test() {
  // Protocol-level half of the hybrid test (endpoint is illustrative)
  http.get('https://app.example.com/api/health');
}

4. NBomber — Native .NET Load Testing for C# Developers

4.1 When to Choose NBomber

If your team primarily uses .NET and wants to write load tests in C#/F# — leveraging IDE support, debugging, and NuGet packages — NBomber is the ideal choice. NBomber works as a .NET library, installed via NuGet, and test scenarios run like regular unit tests.

4.2 Writing Scenarios with NBomber

using NBomber.Contracts.Stats; // ReportFormat
using NBomber.CSharp;
using NBomber.Http.CSharp;

var httpClient = new HttpClient();

var scenario = Scenario.Create("get_products", async context =>
{
    var request = Http.CreateRequest("GET", "https://api.example.com/products")
        .WithHeader("Authorization", "Bearer " + Environment.GetEnvironmentVariable("API_TOKEN"));

    var response = await Http.Send(httpClient, request);

    return response;
})
.WithLoadSimulations(
    Simulation.RampingInject(rate: 100, interval: TimeSpan.FromSeconds(1),
                             during: TimeSpan.FromMinutes(2)),
    Simulation.Inject(rate: 100, interval: TimeSpan.FromSeconds(1),
                      during: TimeSpan.FromMinutes(5)),
    Simulation.RampingInject(rate: 0, interval: TimeSpan.FromSeconds(1),
                             during: TimeSpan.FromMinutes(2))
);

NBomberRunner
    .RegisterScenarios(scenario)
    .WithReportFormats(ReportFormat.Html, ReportFormat.Csv)
    .WithReportFolder("./reports")
    .Run();

4.3 Multi-Protocol Testing

NBomber's strength lies in testing any protocol — HTTP, gRPC, WebSocket, database, message queue — within the same scenario:

// Assumes grpcClient (gRPC client) and channel (RabbitMQ channel) are created elsewhere
var scenario = Scenario.Create("mixed_workload", async context =>
{
    // Step 1: Call REST API
    var apiResponse = await Http.Send(httpClient,
        Http.CreateRequest("GET", "https://api.example.com/products"));

    if (apiResponse.StatusCode != "200")
        return Response.Fail();

    // Step 2: Call gRPC service
    var grpcResponse = await grpcClient.GetProductDetailsAsync(
        new ProductRequest { Id = context.ScenarioInfo.ThreadNumber });

    // Step 3: Publish message to RabbitMQ
    channel.BasicPublish(exchange: "", routingKey: "orders",
        body: Encoding.UTF8.GetBytes($"order-{context.InvocationNumber}"));

    return Response.Ok(statusCode: "200",
        sizeBytes: apiResponse.SizeBytes + grpcResponse.CalculateSize());
})
.WithLoadSimulations(
    Simulation.KeepConstant(copies: 50, during: TimeSpan.FromMinutes(10))
);

5. k6 vs NBomber Comparison

| Criteria | Grafana k6 | NBomber |
|---|---|---|
| Language | JavaScript / TypeScript | C# / F# |
| Engine | Go (goroutines) | .NET (async/await) |
| Installation | Standalone binary | NuGet package |
| IDE Support | VS Code + extensions | Visual Studio / Rider (full debug) |
| Browser Testing | Yes (built-in Chromium) | Yes (via Playwright NuGet) |
| Protocols | HTTP, WebSocket, gRPC (extension) | Any (HTTP, gRPC, WS, DB, MQ...) |
| Distributed Testing | k6 Cloud or k6-operator (K8s) | NBomber Cluster |
| Reporting | Grafana dashboards, JSON, CSV | HTML, CSV, TXT, Markdown |
| CI/CD | Native (exit code based on thresholds) | xUnit/NUnit runner, threshold assertions |
| Pricing | OSS free, Cloud paid | Free (personal), $99/month (business) |
| Best For | Multi-language teams, DevOps-driven | .NET teams, developer-driven testing |

Practical Recommendation

If your team primarily uses .NET and wants load tests running as unit tests in the CI pipeline → choose NBomber. If your team is multi-language or prefers fast TypeScript scripting → choose k6. Many large organizations use both: k6 for quick API-level testing in CI, NBomber for complex integration testing that requires debugging.

6. Load Testing Strategies for Microservices

6.1 Layered Testing Model

graph TB
    subgraph L1["Layer 1 — Component Testing"]
        A["Test individual<br/>services"]
        B["Mock dependencies"]
        C["Measure baseline<br/>response time"]
    end
    subgraph L2["Layer 2 — Integration Testing"]
        D["Test chains of<br/>2-3 services"]
        E["Real dependencies"]
        F["Measure end-to-end<br/>latency"]
    end
    subgraph L3["Layer 3 — System Testing"]
        G["Test entire<br/>system"]
        H["Production-like<br/>traffic"]
        I["Measure throughput<br/>& error rate"]
    end
    L1 --> L2 --> L3
    style A fill:#4CAF50,stroke:#fff,color:#fff
    style B fill:#4CAF50,stroke:#fff,color:#fff
    style C fill:#4CAF50,stroke:#fff,color:#fff
    style D fill:#2196F3,stroke:#fff,color:#fff
    style E fill:#2196F3,stroke:#fff,color:#fff
    style F fill:#2196F3,stroke:#fff,color:#fff
    style G fill:#e94560,stroke:#fff,color:#fff
    style H fill:#e94560,stroke:#fff,color:#fff
    style I fill:#e94560,stroke:#fff,color:#fff

3-layer performance testing model for microservices

6.2 Designing Realistic Test Scenarios

The most common mistake in load testing is creating traffic patterns that don't resemble reality. Production traffic is rarely uniform — there are typically hot paths (20% of endpoints receiving 80% of traffic) and cold paths.
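One lightweight way to encode such a hot-path distribution inside a single VU function, instead of splitting traffic across scenarios, is a weighted picker. A sketch in plain TypeScript (endpoint names and weights are illustrative):

```typescript
// Pick an item with probability proportional to its weight.
// An optional roll parameter makes the function deterministic for testing.

type Weighted<T> = { item: T; weight: number };

function pickWeighted<T>(entries: Weighted<T>[], roll: number = Math.random()): T {
  const total = entries.reduce((sum, e) => sum + e.weight, 0);
  let r = roll * total;
  for (const e of entries) {
    if ((r -= e.weight) < 0) return e.item;
  }
  return entries[entries.length - 1].item; // guard against float rounding
}

// 70/20/10 split mirroring the hot-path/cold-path ratio described above:
const endpoints: Weighted<string>[] = [
  { item: '/products', weight: 70 },
  { item: '/search',   weight: 20 },
  { item: '/orders',   weight: 10 },
];

console.log(pickWeighted(endpoints)); // usually '/products'
```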

// k6: Simulating realistic traffic patterns
import http from 'k6/http';
import { sleep } from 'k6';

const BASE_URL = __ENV.BASE_URL || 'https://api.example.com';

export const options = {
  scenarios: {
    // 70% traffic: Browse products (reads)
    browse: {
      executor: 'constant-arrival-rate',
      rate: 700,
      timeUnit: '1s',
      duration: '10m',
      preAllocatedVUs: 100,
      exec: 'browseProducts',
    },
    // 20% traffic: Search (read + compute)
    search: {
      executor: 'constant-arrival-rate',
      rate: 200,
      timeUnit: '1s',
      duration: '10m',
      preAllocatedVUs: 50,
      exec: 'searchProducts',
    },
    // 10% traffic: Checkout (writes)
    checkout: {
      executor: 'constant-arrival-rate',
      rate: 100,
      timeUnit: '1s',
      duration: '10m',
      preAllocatedVUs: 30,
      exec: 'checkout',
    },
  },
};

export function browseProducts() {
  const categoryId = Math.floor(Math.random() * 20) + 1;
  http.get(`${BASE_URL}/products?category=${categoryId}`);
  sleep(Math.random() * 3 + 1); // Think time 1-4s
}

export function searchProducts() {
  const terms = ['laptop', 'phone', 'tablet', 'headphone', 'camera'];
  const q = terms[Math.floor(Math.random() * terms.length)];
  http.get(`${BASE_URL}/search?q=${q}`);
  sleep(Math.random() * 2 + 0.5);
}

export function checkout() {
  const payload = JSON.stringify({
    productId: Math.floor(Math.random() * 1000) + 1,
    quantity: Math.floor(Math.random() * 3) + 1,
  });
  http.post(`${BASE_URL}/orders`, payload, {
    headers: { 'Content-Type': 'application/json' },
  });
  sleep(Math.random() * 5 + 2); // Longer think time for checkout
}

6.3 Thresholds and SLA Validation

Thresholds transform load tests from "reports" into "quality gates" — if metrics exceed the threshold, the test fails and the CI pipeline stops.

export const options = {
  thresholds: {
    // Global thresholds
    http_req_duration: [
      'p(50)<200',   // 50th percentile < 200ms
      'p(95)<500',   // 95th percentile < 500ms
      'p(99)<1000',  // 99th percentile < 1s
    ],
    http_req_failed: ['rate<0.01'],  // <1% errors

    // Per-endpoint thresholds
    'http_req_duration{name:GetProducts}': ['p(95)<300'],
    'http_req_duration{name:Checkout}': ['p(95)<800'],
    'http_req_duration{name:Search}': ['p(95)<600'],

    // Custom metrics
    'errors{scenario:checkout}': ['rate<0.005'], // Checkout: <0.5% errors
  },
};

7. Integrating Load Testing into CI/CD Pipelines

graph LR
    A[Code Push] --> B[Build & Unit Test]
    B --> C[Deploy to Staging]
    C --> D[Smoke Test<br/>k6 / NBomber]
    D -->|Pass| E[Load Test<br/>10 min]
    D -->|Fail| F[Alert & Stop]
    E -->|Thresholds OK| G[Deploy to Prod]
    E -->|Thresholds Fail| F
    G --> H[Canary Test<br/>5% traffic]
    H -->|Healthy| I[Full Rollout]
    H -->|Degraded| J[Rollback]
    style A fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style D fill:#4CAF50,stroke:#fff,color:#fff
    style E fill:#2196F3,stroke:#fff,color:#fff
    style F fill:#e94560,stroke:#fff,color:#fff
    style G fill:#4CAF50,stroke:#fff,color:#fff
    style H fill:#ff9800,stroke:#fff,color:#fff
    style I fill:#4CAF50,stroke:#fff,color:#fff
    style J fill:#e94560,stroke:#fff,color:#fff

Load testing in CI/CD — smoke test gate first, load test gate second

7.1 GitHub Actions + k6

# .github/workflows/load-test.yml
name: Load Test
on:
  pull_request:
    branches: [main]

jobs:
  load-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Deploy to staging
        run: ./deploy-staging.sh

      - name: Install k6
        run: |
          sudo gpg -k
          sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg \
            --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D68
          echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main" \
            | sudo tee /etc/apt/sources.list.d/k6.list
          sudo apt-get update && sudo apt-get install k6

      - name: Run smoke test
        run: k6 run --tag testid=smoke tests/smoke.ts
        env:
          API_TOKEN: ${{ secrets.STAGING_API_TOKEN }}

      - name: Run load test
        run: |
          mkdir -p results
          k6 run --out json=results/load.json tests/load.ts
        env:
          API_TOKEN: ${{ secrets.STAGING_API_TOKEN }}

      - name: Upload results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: k6-results
          path: results/

7.2 .NET CI + NBomber

// LoadTests/ProductApiLoadTest.cs — runs as xUnit test
using NBomber.CSharp;
using NBomber.Http.CSharp;
using Xunit;

public class ProductApiLoadTest
{
    [Fact]
    public void Products_endpoint_handles_100rps()
    {
        var httpClient = new HttpClient();

        var scenario = Scenario.Create("get_products", async context =>
        {
            var response = await Http.Send(httpClient,
                Http.CreateRequest("GET", "https://staging.api.example.com/products"));
            return response;
        })
        .WithLoadSimulations(
            Simulation.Inject(rate: 100, interval: TimeSpan.FromSeconds(1),
                              during: TimeSpan.FromMinutes(5))
        );

        var stats = NBomberRunner
            .RegisterScenarios(scenario)
            .Run();

        var scnStats = stats.ScenarioStats[0];

        // Assert SLA
        Assert.True(scnStats.Ok.Latency.Percent95 < 500,
            $"P95 latency {scnStats.Ok.Latency.Percent95}ms exceeds 500ms threshold");
        Assert.True(scnStats.Fail.Request.Percent < 1,
            $"Error rate {scnStats.Fail.Request.Percent}% exceeds 1% threshold");
    }
}

8. Analyzing Results and Debugging Bottlenecks

8.1 Critical Metrics to Monitor

| Metric | Meaning | Suggested Threshold | Signal When Violated |
|---|---|---|---|
| p50 (median) | Median response time | < 200ms | System-wide slowness |
| p95 | 95% of requests are faster than this | < 500ms | Tail latency impacting UX |
| p99 | 99% of requests are faster than this | < 1000ms | Outliers from GC, cold start, DB locks |
| Error rate | % of failed requests (4xx, 5xx, timeout) | < 1% | Service overloaded or bugs |
| Throughput (RPS) | Actual requests per second | ≥ target SLA | Bottleneck at compute or I/O |
| Active VUs | Virtual users currently active | As planned | Stuck VU = connection leak |
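To make the percentile rows concrete, here is a minimal nearest-rank implementation (tools often interpolate instead, so reported values can differ slightly from this sketch):

```typescript
// Nearest-rank percentile: sort samples, take the value at rank ceil(p/100 * n).

function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Ten latency samples (ms) with one slow outlier:
const latencies = [120, 95, 180, 210, 145, 990, 130, 160, 110, 155];
console.log(percentile(latencies, 50)); // 145 -- the outlier barely moves the median
console.log(percentile(latencies, 95)); // 990 -- but it owns the tail
```

This is also why p95/p99 belong in thresholds while plain averages do not: a single slow dependency can hide behind a healthy mean.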

8.2 Bottleneck Detection Patterns

Common Warning Signs

Response time increases linearly with VUs → CPU-bound: the application is serializing processing. Review async code or scale horizontally.

Response time spikes sharply at N VUs → Connection pool exhaustion: the database or downstream service has run out of connections. Check MaxPoolSize and HttpClient lifecycle.

Error rate spikes while response time drops → Circuit breaker is open or requests are being rejected early. Good for system self-protection, but thresholds need tuning.

Memory grows steadily during soak test → Memory leak: typically caused by unsubscribed event handlers, HttpClient created repeatedly, or cache without an eviction policy.
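The last warning sign can be turned into a cheap automated check: fit a least-squares line to periodic memory samples collected during the soak test and flag a persistently positive slope. This heuristic is our own sketch, not a built-in feature of k6 or NBomber:

```typescript
// Least-squares slope of memory samples taken at a fixed interval.
// Units: MB gained per sampling interval; near zero means a stable plateau.

function slopePerSample(samples: number[]): number {
  const n = samples.length;
  const meanX = (n - 1) / 2;
  const meanY = samples.reduce((s, y) => s + y, 0) / n;
  let num = 0;
  let den = 0;
  for (let i = 0; i < n; i++) {
    num += (i - meanX) * (samples[i] - meanY);
    den += (i - meanX) ** 2;
  }
  return num / den;
}

const steady  = [512, 515, 510, 514, 511, 513]; // plateaus after warm-up: fine
const leaking = [512, 540, 568, 601, 633, 660]; // climbs every interval: suspect

console.log(slopePerSample(steady).toFixed(2));
console.log(slopePerSample(leaking).toFixed(2)); // ~30 MB per interval
```

In practice you would sample from your APM at, say, one-minute intervals and alert only when the slope stays positive after the warm-up window.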

9. Best Practices for Production Load Testing

9.1 Golden Rules

  1. Test on a production-like environment — Same hardware specs, same data volume, same network topology. Testing on a laptop with 100 DB rows tells you nothing.
  2. Use realistic data — Create datasets that reflect real distributions: full product catalogs, diverse user profiles, search queries from actual access logs.
  3. Measure from the client side, not the server — Server metrics show processing response time, but user experience includes network latency, DNS resolution, and TLS handshake.
  4. Warm up before measuring — JIT compilation (.NET), connection pool initialization, and cache warming all affect results if you start measuring from the first request.
  5. Run multiple iterations, take averages — Single-run results have high variance. Run at least 3 times with the same config, remove outliers, and report average results.
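Rule 5 can be mechanized with a small helper: drop the best and worst run, then average the rest (our own sketch; the sample values are illustrative):

```typescript
// Trimmed mean across repeated runs: discard one outlier from each end.
// With exactly 3 runs this degenerates to the median, so prefer 5+ runs.

function trimmedMean(runs: number[]): number {
  if (runs.length < 3) throw new Error('need at least 3 runs');
  const sorted = [...runs].sort((a, b) => a - b);
  const kept = sorted.slice(1, -1);
  return kept.reduce((s, v) => s + v, 0) / kept.length;
}

// p95 latency (ms) from five runs of the same config; one run hit a noisy
// neighbor and would wreck a plain average:
console.log(trimmedMean([480, 495, 2100, 470, 505]).toFixed(1)); // 493.3
```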

9.2 Anti-Patterns to Avoid

  1. Testing without think time — Real users don't fire requests continuously. Add sleep(1-5s) between requests to simulate reading time.
  2. Hardcoded test data — Calling the same endpoint with the same parameters → 100% cache hits → results don't reflect reality.
  3. Skipping ramp-up — Firing 10,000 VUs simultaneously creates a thundering herd — that's not load testing, that's a DDoS. Always ramp up gradually.
  4. Testing only the happy path — Production has 404s, 401s, timeouts, and malformed requests. Test scripts should include error scenarios.
  5. Not monitoring server-side resources — Load tests only produce output metrics. Combine with APM (Application Performance Monitoring) to know server CPU/memory/disk I/O levels.

10. Conclusion

Load testing isn't a one-time activity before launch that you then forget about. In distributed systems, every new service addition, database schema change, or dependency upgrade can affect overall performance. Integrating load testing into your CI/CD pipeline — starting with a smoke test gate, progressing to a load test gate — is the only way to ensure performance regressions are caught early.

Whether you choose k6 for JavaScript/TypeScript flexibility or NBomber for the power of the .NET ecosystem, the most important thing is to start — a simple smoke test in a CI pipeline is more valuable than a perfect load testing plan that never gets executed.
