ClickHouse — Real-Time Analytics Database for Large-Scale Systems

Posted on: 4/25/2026 10:10:31 AM

When your system starts processing billions of events per day — from access logs and monitoring metrics to clickstream data — traditional databases begin to choke. PostgreSQL, MySQL, and SQL Server are all designed for OLTP (transaction processing), not for scanning millions of rows in the blink of an eye. That's exactly where ClickHouse comes in — an open-source columnar database, designed from the ground up for real-time analytics at massive scale.

  • 100x faster than PostgreSQL for analytical queries
  • 1B+ rows inserted per second (cluster)
  • 6x less storage compared to row-store
  • <100ms query latency on petabyte-scale data

1. Why Do We Need a Columnar Database?

In traditional row-oriented databases, each row is stored contiguously on disk. When you run an analytical query like SELECT AVG(price) FROM orders WHERE created_at > '2026-01-01', the database has to read every complete row — including columns you don't need (customer_name, address, phone...) — before extracting the price column.

Columnar databases flip this logic: data for each column is stored contiguously. The query above only needs to read 2 columns (price and created_at), completely skipping the rest. With a 50-column table, you only read 4% of the data.
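You can observe this effect directly with ClickHouse's EXPLAIN ESTIMATE, which reports how many parts, rows, and index marks a query would have to read (the orders table here is hypothetical):

```sql
-- Hypothetical table; EXPLAIN ESTIMATE reports the parts/rows/marks to be read
EXPLAIN ESTIMATE
SELECT avg(price)
FROM orders
WHERE created_at > '2026-01-01';
```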

graph LR
    subgraph ROW["Row-Oriented Storage"]
        R1["Row 1: id|name|price|date|addr"]
        R2["Row 2: id|name|price|date|addr"]
        R3["Row 3: id|name|price|date|addr"]
    end
    subgraph COL["Column-Oriented Storage"]
        C1["id: 1, 2, 3, 4, 5..."]
        C2["name: A, B, C, D, E..."]
        C3["price: 10, 20, 15, 30..."]
        C4["date: 01, 02, 03, 04..."]
    end
    Q["SELECT AVG(price) WHERE date > X"] --> COL
    Q -.->|"Reads entire rows"| ROW
    style ROW fill:#f8f9fa,stroke:#e0e0e0,color:#2c3e50
    style COL fill:#e8f5e9,stroke:#4CAF50,color:#2c3e50
    style Q fill:#e94560,stroke:#fff,color:#fff

How row-store vs column-store read data differently

2. How Does ClickHouse Work?

2.1. MergeTree Engine Architecture

At the heart of ClickHouse is the MergeTree engine family — a storage engine designed for extremely fast writes and high-performance analytical queries. When you INSERT data, ClickHouse doesn't insert rows one by one into a B-tree index like traditional databases. Instead:

  1. Writes as immutable "parts": Each INSERT batch creates a "part" — a directory containing data sorted by primary key, with each column stored as a separate file.
  2. Background merging: ClickHouse continuously merges smaller parts into larger ones in the background, similar to how LSM-trees work — but optimized for columnar data.
  3. Sparse primary index: Instead of indexing every row, ClickHouse only stores an index entry for every 8,192 rows (a granule). A 1-billion-row table needs only ~122,000 index entries — small enough to fit entirely in RAM.
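Each part shows up as a row in the system.parts table, which makes the write-then-merge behavior easy to inspect (assuming a table named api_logs, like the one designed later in this post):

```sql
-- Active parts for a table: name, row count, sparse-index marks, size on disk
SELECT name, rows, marks, formatReadableSize(bytes_on_disk) AS size
FROM system.parts
WHERE table = 'api_logs' AND active
ORDER BY modification_time DESC;
```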
graph TD
    INSERT["INSERT batch"] --> P1["Part 1 (sorted, immutable)"]
    INSERT --> P2["Part 2"]
    INSERT --> P3["Part 3"]
    P1 --> MERGE["Background Merge"]
    P2 --> MERGE
    P3 --> MERGE
    MERGE --> BIG["Merged Part (larger, optimized)"]
    BIG --> QUERY["Query Engine"]
    QUERY --> IDX["Sparse Index (1 entry / 8192 rows)"]
    IDX --> SCAN["Column Scan (reads only needed columns)"]
    style INSERT fill:#e94560,stroke:#fff,color:#fff
    style MERGE fill:#2c3e50,stroke:#fff,color:#fff
    style BIG fill:#16213e,stroke:#fff,color:#fff
    style QUERY fill:#4CAF50,stroke:#fff,color:#fff
    style IDX fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style SCAN fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50

Data flow in the MergeTree Engine

2.2. Vectorized Query Execution

ClickHouse processes data in batches (chunks) rather than one row at a time — a technique called vectorized execution, inspired by the MonetDB/X100 model. Each operator in the query pipeline receives a block of thousands of rows, applies computations to the entire block at once, fully leveraging SIMD instructions (Single Instruction, Multiple Data) of modern CPUs.

Why is vectorized execution faster?

In the traditional Volcano model (row-at-a-time), each row must pass through a chain of virtual function calls. With 1 billion rows, that's 1 billion virtual function invocations. Vectorized execution reduces this to just ~122,000 calls (1B / 8,192) — a 99.99% reduction in overhead. Combined with SIMD, the CPU can process 4-16 values simultaneously in a single instruction cycle.
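The block size handed to each pipeline operator is a tunable server setting; you can inspect the values on your own instance (max_block_size and max_threads are real ClickHouse setting names, defaults vary by version):

```sql
-- Rows per block handed to each pipeline operator, and the query thread pool size
SELECT name, value
FROM system.settings
WHERE name IN ('max_block_size', 'max_threads');
```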

2.3. Smart Data Compression

Because data in the same column shares the same type and often has repeating patterns, ClickHouse achieves extremely high compression ratios:

| Codec | Best For | Compression Ratio |
|---|---|---|
| LZ4 | Default, balanced speed and compression | 3-5x |
| ZSTD | Cold data, prioritize storage savings | 5-10x |
| Delta + ZSTD | Timestamp columns, monotonically increasing values | 10-20x |
| T64 | Integer columns with small range | 8-15x |
| DoubleDelta | Time-series data | 15-40x |

A 1TB log table in PostgreSQL can shrink to just 60-150GB in ClickHouse — significant storage cost savings, especially on cloud infrastructure.
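Codecs are declared per column. A minimal sketch for a hypothetical time-series table, plus a system.columns query to verify the ratio actually achieved once data is inserted:

```sql
-- Hypothetical table with explicit per-column codecs
CREATE TABLE sensor_readings
(
    ts DateTime CODEC(DoubleDelta, ZSTD),
    sensor_id UInt32 CODEC(T64, LZ4),
    value Float64 CODEC(Gorilla)
)
ENGINE = MergeTree()
ORDER BY (sensor_id, ts);

-- After inserting data, check the per-column compression ratio
SELECT
    name,
    round(data_uncompressed_bytes / data_compressed_bytes, 1) AS ratio
FROM system.columns
WHERE table = 'sensor_readings';
```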

3. MergeTree Engine Variants

ClickHouse offers more than just the basic MergeTree — the family includes several variants for different use cases:

| Engine | Function | Use Case |
|---|---|---|
| ReplacingMergeTree | Automatically deduplicates rows with the same primary key during merge | CDC (Change Data Capture), upsert patterns |
| AggregatingMergeTree | Pre-aggregates data during the merge process | Materialized views, metric rollups |
| CollapsingMergeTree | Supports update/delete logic by inserting "cancel" rows | Data with frequently changing state |
| SummingMergeTree | Automatically sums numeric columns during merge | Counters, revenue aggregation |
| VersionedCollapsingMergeTree | Like Collapsing but supports out-of-order inserts | Distributed ingestion without guaranteed ordering |
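As a concrete example, an upsert-style feed with ReplacingMergeTree might look like this (table and columns are illustrative). Keep in mind that deduplication only happens at merge time, so queries that need exact results must use FINAL:

```sql
-- Illustrative upsert pattern: the latest row per user_id wins (by updated_at)
CREATE TABLE user_profiles
(
    user_id UInt64,
    email String,
    updated_at DateTime
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY user_id;

-- FINAL forces deduplication at query time (slower, but exact)
SELECT * FROM user_profiles FINAL WHERE user_id = 42;
```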

4. Practical Schema Design for Log Analytics

Imagine you're building an API monitoring system handling hundreds of millions of requests per day. Here's how to design an optimized schema in ClickHouse:

CREATE TABLE api_logs
(
    timestamp DateTime64(3),
    date Date DEFAULT toDate(timestamp),
    service LowCardinality(String),
    endpoint LowCardinality(String),
    method Enum8('GET' = 1, 'POST' = 2, 'PUT' = 3, 'DELETE' = 4, 'PATCH' = 5),
    status_code UInt16,
    response_time_ms UInt32,
    request_size UInt32,
    response_size UInt32,
    user_id UInt64,
    trace_id UUID,
    error_message Nullable(String),
    tags Map(String, String)
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(date)
ORDER BY (service, endpoint, timestamp)
TTL date + INTERVAL 90 DAY DELETE
SETTINGS index_granularity = 8192;

Key Design Principles

  • LowCardinality: Use for string columns with few unique values (service name, endpoint). Creates automatic dictionary encoding, reducing storage 5-10x and speeding up filters.
  • Enum8/Enum16: For columns with a fixed set of values (HTTP method). Stores only 1-2 bytes instead of character strings.
  • ORDER BY: Order from low cardinality to high. Filtering by service + endpoint becomes extremely fast thanks to the sparse index.
  • PARTITION BY: Partition by month. TTL automatically deletes data older than 90 days — no cron job needed.
  • Map type: Flexible for non-fixed metadata (tags, labels) without needing ALTER TABLE.

4.1. Sample Analytical Queries

-- Top 10 slowest endpoints in the last 24 hours
SELECT
    service,
    endpoint,
    quantile(0.95)(response_time_ms) AS p95_ms,
    quantile(0.99)(response_time_ms) AS p99_ms,
    count() AS total_requests,
    countIf(status_code >= 500) AS errors,
    round(countIf(status_code >= 500) * 100.0 / count(), 2) AS error_rate
FROM api_logs
WHERE timestamp >= now() - INTERVAL 24 HOUR
GROUP BY service, endpoint
ORDER BY p95_ms DESC
LIMIT 10;
-- Hourly traffic pattern analysis (heatmap data)
SELECT
    toHour(timestamp) AS hour,
    toDayOfWeek(timestamp) AS day_of_week,
    count() AS requests,
    avg(response_time_ms) AS avg_latency
FROM api_logs
WHERE date >= today() - 7
GROUP BY hour, day_of_week
ORDER BY day_of_week, hour;

5. Materialized Views — Pre-compute for Speed

One of ClickHouse's most powerful features is Materialized Views — allowing you to pre-aggregate data as it's inserted, without needing a separate ETL pipeline.

-- Create destination table for per-minute metrics
CREATE TABLE api_metrics_1m
(
    timestamp DateTime,
    service LowCardinality(String),
    endpoint LowCardinality(String),
    requests AggregateFunction(count, UInt64),
    p95_latency AggregateFunction(quantile(0.95), UInt32),
    errors AggregateFunction(sum, UInt64)
)
ENGINE = AggregatingMergeTree()
ORDER BY (service, endpoint, timestamp);

-- Materialized view auto-aggregates on INSERT
CREATE MATERIALIZED VIEW api_metrics_1m_mv
TO api_metrics_1m AS
SELECT
    toStartOfMinute(timestamp) AS timestamp,
    service,
    endpoint,
    countState() AS requests,
    quantileState(0.95)(response_time_ms) AS p95_latency,
    sumState(if(status_code >= 500, 1, 0)) AS errors
FROM api_logs
GROUP BY timestamp, service, endpoint;

Real-World Impact

Querying the source table api_logs (1 billion rows): ~2-5 seconds. The same query on api_metrics_1m (pre-aggregated data only): ~10-50ms. That's a 100-500x speedup, with zero additional ETL effort.
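Because api_metrics_1m stores intermediate aggregation states (AggregateFunction columns), reading it requires the matching -Merge combinators to finalize those states:

```sql
-- Finalize the stored aggregate states with -Merge combinators
SELECT
    service,
    endpoint,
    countMerge(requests) AS total_requests,
    quantileMerge(0.95)(p95_latency) AS p95_ms,
    sumMerge(errors) AS errors
FROM api_metrics_1m
WHERE timestamp >= now() - INTERVAL 1 HOUR
GROUP BY service, endpoint;
```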

6. ClickHouse vs Other Analytics Solutions

| Criteria | ClickHouse | PostgreSQL | Elasticsearch |
|---|---|---|---|
| Storage Model | Columnar (OLAP) | Row-oriented (OLTP) | Inverted index + doc store |
| Analytical Query Speed | Ultra-fast (~ms on billions of rows) | Slow (minutes-hours on billions) | Moderate (seconds-minutes) |
| INSERT Throughput | Millions of rows/sec | Thousands of rows/sec | Tens of thousands of docs/sec |
| Data Compression | Excellent (6-20x) | Average (2-3x) | Poor (1.5-3x) |
| Full-text Search | Basic (bloom filter) | Good (tsvector) | Excellent (inverted index) |
| UPDATE/DELETE | Limited (async mutation) | Excellent (MVCC) | Good (doc-level) |
| Operational Cost | Low (less RAM, less storage) | Medium | High (RAM-heavy) |
| SQL Support | Very comprehensive | Most complete | Limited (DSL-first) |

7. Production Deployment Architecture

A production analytics system typically combines ClickHouse with other components following the Kappa architecture pattern:

graph LR
    APP["Application Services"] --> KAFKA["Apache Kafka / Redpanda"]
    KAFKA --> CH["ClickHouse Cluster"]
    CH --> GRAFANA["Grafana Dashboard"]
    CH --> API["Analytics API (.NET / Node.js)"]
    API --> UI["Frontend (Vue.js)"]
    CH --> MV["Materialized Views"]
    MV --> CH
    KAFKA --> REDIS["Redis (real-time cache)"]
    REDIS --> API
    style APP fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style KAFKA fill:#2c3e50,stroke:#fff,color:#fff
    style CH fill:#e94560,stroke:#fff,color:#fff
    style GRAFANA fill:#16213e,stroke:#fff,color:#fff
    style API fill:#4CAF50,stroke:#fff,color:#fff
    style UI fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
    style MV fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style REDIS fill:#ff9800,stroke:#fff,color:#fff

Kappa Architecture with ClickHouse as the analytics core

Data Flow Explained

  1. Ingestion: Application services send events to Kafka/Redpanda. They should never write directly to ClickHouse, because many small INSERTs are an anti-pattern.
  2. Batch insert: The ClickHouse Kafka engine or a consumer service reads batches from Kafka and inserts them in large blocks (10,000-100,000 rows/batch) — the optimal approach.
  3. Pre-aggregation: Materialized Views automatically aggregate data in real-time upon insert, serving dashboards faster.
  4. Query layer: Grafana connects directly (official plugin). An API layer serves custom analytics for the frontend.
  5. Hot cache: Redis caches frequently-requested query results, reducing ClickHouse load under high concurrent queries.
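Step 2 can be done entirely inside ClickHouse with the Kafka table engine plus a materialized view that drains the topic into api_logs. A minimal sketch — broker, topic, and consumer-group names are placeholders, and the column list is abbreviated:

```sql
-- Kafka engine table: a streaming "view" over a topic (names are placeholders)
CREATE TABLE api_logs_queue
(
    timestamp DateTime64(3),
    service String,
    endpoint String,
    status_code UInt16,
    response_time_ms UInt32
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',
         kafka_topic_list = 'api-logs',
         kafka_group_name = 'clickhouse-ingest',
         kafka_format = 'JSONEachRow';

-- Materialized view moves consumed rows into the MergeTree table in batches
CREATE MATERIALIZED VIEW api_logs_consumer TO api_logs AS
SELECT * FROM api_logs_queue;
```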

8. Query Performance Optimization

8.1. The Golden Rule: Read Less Data

ClickHouse is fastest when it reads the least data. Every optimization revolves around this principle:

Query Optimization Checklist

  • Filter first, transform later: Always put WHERE clauses filtering on columns in the ORDER BY key. ClickHouse uses the sparse index to skip entire irrelevant granules.
  • Use PREWHERE: For small filter columns. ClickHouse reads the PREWHERE column first, eliminates non-matching rows, then reads remaining columns — significantly reducing I/O.
  • Avoid SELECT *: Only select the columns you need. Each additional column = additional I/O.
  • Projections: Create alternative physical sort orders for queries with filter patterns different from the PRIMARY KEY.
  • Skip indexes: Bloom filter indexes for high-cardinality columns (trace_id, user_id) not in the ORDER BY.
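The projections item from the checklist can be applied to the existing api_logs table; this sketch builds an alternative physical sort order for user-centric lookups:

```sql
-- Alternative sort order for queries that filter on user_id instead of service
ALTER TABLE api_logs
    ADD PROJECTION by_user
    (
        SELECT * ORDER BY user_id, timestamp
    );

-- Rebuild the projection for data already in the table
ALTER TABLE api_logs MATERIALIZE PROJECTION by_user;
```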

8.2. PREWHERE and Skip Index Examples

-- Add bloom filter index for trace_id
ALTER TABLE api_logs
ADD INDEX idx_trace_id trace_id TYPE bloom_filter(0.01) GRANULARITY 4;

-- ClickHouse auto-moves lightweight filters to PREWHERE
-- But you can force it when needed:
SELECT timestamp, endpoint, response_time_ms, error_message
FROM api_logs
PREWHERE service = 'payment-service'
WHERE timestamp >= now() - INTERVAL 1 HOUR
  AND status_code >= 500;

9. ClickHouse Cloud and SharedMergeTree

ClickHouse Cloud (managed service) introduces SharedMergeTree — a new engine that completely separates storage and compute:

| Feature | MergeTree (self-hosted) | SharedMergeTree (Cloud) |
|---|---|---|
| Storage | Local disk or attached volume | Object storage (S3/GCS) |
| Compute Scaling | Must rebalance data when adding nodes | Add stateless nodes, query immediately |
| Merge Process | Runs on query nodes, affects latency | Runs on separate nodes, no query impact |
| Cost | Pay for servers 24/7 | Pay-per-query + storage |
| Best For | Stable workloads, full control | Variable workloads, cost optimization |

10. When NOT to Use ClickHouse

ClickHouse is not a silver bullet

  • OLTP applications: Need frequent UPDATE/DELETE, ACID transactions → use PostgreSQL, SQL Server.
  • Complex full-text search: Need relevance scoring, fuzzy matching → use Elasticsearch/OpenSearch.
  • Small data (<100GB): PostgreSQL with proper indexes is fast enough, no need for extra complexity.
  • Heavy multi-table JOINs: ClickHouse supports JOINs but isn't optimized like traditional RDBMS for complex join patterns.
  • Point lookups: Fetching a single row by primary key — row-stores are faster.

Conclusion

ClickHouse has proven itself as the leading solution for real-time data analytics at massive scale. With its columnar architecture, vectorized execution, and smart compression system, it enables processing billions of rows in milliseconds — something traditional databases need minutes or hours for. When combined with Kafka for ingestion, Redis for caching, and Grafana for visualization, ClickHouse becomes the centerpiece of a modern analytics architecture, meeting the increasingly complex observability and business intelligence needs of modern software systems.
