ClickHouse — Real-Time Analytics Database for Large-Scale Systems
Posted on: 4/25/2026 10:10:31 AM
Table of contents
- 1. Why Do We Need a Columnar Database?
- 2. How Does ClickHouse Work?
- 3. MergeTree Engine Variants
- 4. Practical Schema Design for Log Analytics
- 5. Materialized Views — Pre-compute for Speed
- 6. ClickHouse vs Other Analytics Solutions
- 7. Production Deployment Architecture
- 8. Query Performance Optimization
- 9. ClickHouse Cloud and SharedMergeTree
- 10. When NOT to Use ClickHouse
- Conclusion
When your system starts processing billions of events per day — from access logs, monitoring metrics, to clickstream data — traditional databases begin to choke. PostgreSQL, MySQL, or SQL Server are all designed for OLTP (transaction processing), not for scanning millions of rows in the blink of an eye. That's exactly where ClickHouse comes in — an open-source columnar database, designed from the ground up for real-time analytics at massive scale.
1. Why Do We Need a Columnar Database?
In traditional row-oriented databases, each row is stored contiguously on disk. When you run an analytical query like SELECT AVG(price) FROM orders WHERE created_at > '2026-01-01', the database has to read every complete row — including columns you don't need (customer_name, address, phone...) — before extracting the price column.
Columnar databases flip this logic: data for each column is stored contiguously. The query above only needs to read 2 columns (price and created_at), completely skipping the rest. With a 50-column table, you only read 4% of the data.
```mermaid
graph LR
    subgraph ROW["Row-Oriented Storage"]
        R1["Row 1: id|name|price|date|addr"]
        R2["Row 2: id|name|price|date|addr"]
        R3["Row 3: id|name|price|date|addr"]
    end
    subgraph COL["Column-Oriented Storage"]
        C1["id: 1, 2, 3, 4, 5..."]
        C2["name: A, B, C, D, E..."]
        C3["price: 10, 20, 15, 30..."]
        C4["date: 01, 02, 03, 04..."]
    end
    Q["SELECT AVG(price)
    WHERE date > X"] --> COL
    Q -.->|"Reads entire rows"| ROW
    style ROW fill:#f8f9fa,stroke:#e0e0e0,color:#2c3e50
    style COL fill:#e8f5e9,stroke:#4CAF50,color:#2c3e50
    style Q fill:#e94560,stroke:#fff,color:#fff
```
How row-store vs column-store read data differently
2. How Does ClickHouse Work?
2.1. MergeTree Engine Architecture
At the heart of ClickHouse is the MergeTree engine family — a storage engine designed for extremely fast writes and high-performance analytical queries. When you INSERT data, ClickHouse doesn't insert rows one by one into a B-tree index like traditional databases. Instead:
- Writes as immutable "parts": Each INSERT batch creates a "part" — a directory containing data sorted by primary key, with each column stored as a separate file.
- Background merging: ClickHouse continuously merges smaller parts into larger ones in the background, similar to how LSM-trees work — but optimized for columnar data.
- Sparse primary index: Instead of indexing every row, ClickHouse only stores an index entry for every 8,192 rows (a granule). A 1-billion-row table needs only ~122,000 index entries — small enough to fit entirely in RAM.
```mermaid
graph TD
    INSERT["INSERT batch"] --> P1["Part 1
    (sorted, immutable)"]
    INSERT --> P2["Part 2"]
    INSERT --> P3["Part 3"]
    P1 --> MERGE["Background Merge"]
    P2 --> MERGE
    P3 --> MERGE
    MERGE --> BIG["Merged Part
    (larger, optimized)"]
    BIG --> QUERY["Query Engine"]
    QUERY --> IDX["Sparse Index
    (1 entry / 8192 rows)"]
    IDX --> SCAN["Column Scan
    (reads only needed columns)"]
    style INSERT fill:#e94560,stroke:#fff,color:#fff
    style MERGE fill:#2c3e50,stroke:#fff,color:#fff
    style BIG fill:#16213e,stroke:#fff,color:#fff
    style QUERY fill:#4CAF50,stroke:#fff,color:#fff
    style IDX fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style SCAN fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
```
Data flow in the MergeTree Engine
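You can watch this part lifecycle directly via the `system.parts` system table. A quick sanity check (using the `api_logs` table defined later in section 4 as an example):

```sql
-- Inspect active data parts for a table; each INSERT batch initially
-- appears as its own part until background merges consolidate them
SELECT
    partition,
    name,
    rows,
    formatReadableSize(bytes_on_disk) AS size,
    level  -- 0 = freshly inserted part, higher = result of merges
FROM system.parts
WHERE table = 'api_logs' AND active
ORDER BY modification_time DESC;
```

Right after a burst of inserts you will see many small level-0 parts; re-running the query a minute later typically shows fewer, larger parts as background merges complete.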
2.2. Vectorized Query Execution
ClickHouse processes data in batches (chunks) rather than one row at a time — a technique called vectorized execution, inspired by the MonetDB/X100 model. Each operator in the query pipeline receives a block of thousands of rows, applies computations to the entire block at once, fully leveraging SIMD instructions (Single Instruction, Multiple Data) of modern CPUs.
Why is vectorized execution faster?
In the traditional Volcano model (row-at-a-time), each row must pass through a chain of virtual function calls. With 1 billion rows, that's 1 billion virtual function invocations. Vectorized execution reduces this to just ~122,000 calls (1B / 8,192) — a 99.99% reduction in overhead. Combined with SIMD, the CPU can process 4-16 values simultaneously in a single instruction cycle.
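You can inspect the vectorized pipeline ClickHouse builds for a query with `EXPLAIN PIPELINE` (the table here is the `api_logs` example from section 4):

```sql
-- Shows the chain of block-processing operators and the number of
-- threads per stage; every edge carries blocks of rows, not tuples
EXPLAIN PIPELINE
SELECT avg(response_time_ms)
FROM api_logs
WHERE status_code >= 500;
```

The output lists stages such as reading, filtering, and aggregation, each annotated with its degree of parallelism, which makes the block-at-a-time execution model visible.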
2.3. Smart Data Compression
Because data in the same column shares the same type and often has repeating patterns, ClickHouse achieves extremely high compression ratios:
| Codec | Best For | Compression Ratio |
|---|---|---|
| LZ4 | Default; balanced speed and compression | 3-5x |
| ZSTD | Cold data; prioritize storage savings | 5-10x |
| Delta + ZSTD | Timestamp columns, monotonically increasing values | 10-20x |
| T64 | Integer columns with a small range | 8-15x |
| DoubleDelta | Time-series data | 15-40x |
A 1TB log table in PostgreSQL can shrink to just 60-150GB in ClickHouse — significant storage cost savings, especially on cloud infrastructure.
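Codecs from the table above are declared per column at table creation. A minimal sketch (the schema is illustrative, not from the article's log example):

```sql
-- Per-column codecs matching the codec table above
CREATE TABLE sensor_readings
(
    ts      DateTime CODEC(Delta, ZSTD),  -- monotonically increasing timestamps
    reading UInt32   CODEC(T64, LZ4),     -- integers with a small value range
    note    String   CODEC(ZSTD(3))       -- cold text, favor compression ratio
)
ENGINE = MergeTree
ORDER BY ts;

-- Verify the achieved compression ratio per column
SELECT
    name,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed,
    formatReadableSize(data_compressed_bytes)   AS compressed,
    round(data_uncompressed_bytes / data_compressed_bytes, 1) AS ratio
FROM system.columns
WHERE table = 'sensor_readings';
```

The `system.columns` check is useful after loading real data: if a column compresses poorly, trying a different codec is a cheap, per-column change.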
3. MergeTree Engine Variants
ClickHouse offers more than just the basic MergeTree — the family includes several variants for different use cases:
| Engine | Function | Use Case |
|---|---|---|
| ReplacingMergeTree | Automatically deduplicates rows with the same primary key during merge | CDC (Change Data Capture), upsert patterns |
| AggregatingMergeTree | Pre-aggregates data during the merge process | Materialized views, metric rollups |
| CollapsingMergeTree | Supports update/delete logic by inserting "cancel" rows | Data with frequently changing state |
| SummingMergeTree | Automatically sums numeric columns during merge | Counters, revenue aggregation |
| VersionedCollapsingMergeTree | Like Collapsing, but supports out-of-order inserts | Distributed ingestion without guaranteed ordering |
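As an example of the first variant, a ReplacingMergeTree used for upserts (the schema is illustrative). One caveat: deduplication happens lazily at merge time, so exact reads must request it explicitly with FINAL:

```sql
-- Keeps, per user_id, the row with the greatest updated_at
CREATE TABLE user_profiles
(
    user_id    UInt64,
    email      String,
    updated_at DateTime
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY user_id;

-- Merges are asynchronous; FINAL forces deduplication at read time
SELECT * FROM user_profiles FINAL WHERE user_id = 42;
```

FINAL has a read-time cost, so for large scans it is common to deduplicate in the query instead (e.g. with argMax over updated_at).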
4. Practical Schema Design for Log Analytics
Imagine you're building an API monitoring system handling hundreds of millions of requests per day. Here's how to design an optimized schema in ClickHouse:
```sql
CREATE TABLE api_logs
(
    timestamp DateTime64(3),
    date Date DEFAULT toDate(timestamp),
    service LowCardinality(String),
    endpoint LowCardinality(String),
    method Enum8('GET' = 1, 'POST' = 2, 'PUT' = 3, 'DELETE' = 4, 'PATCH' = 5),
    status_code UInt16,
    response_time_ms UInt32,
    request_size UInt32,
    response_size UInt32,
    user_id UInt64,
    trace_id UUID,
    error_message Nullable(String),
    tags Map(String, String)
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(date)
ORDER BY (service, endpoint, timestamp)
TTL date + INTERVAL 90 DAY DELETE
SETTINGS index_granularity = 8192;
```
Key Design Principles
- LowCardinality: Use for string columns with few unique values (service name, endpoint). Creates automatic dictionary encoding, reducing storage 5-10x and speeding up filters.
- Enum8/Enum16: For columns with a fixed set of values (HTTP method). Stores only 1-2 bytes instead of character strings.
- ORDER BY: Order from low cardinality to high. Filtering by service + endpoint becomes extremely fast thanks to the sparse index.
- PARTITION BY: Partition by month. TTL automatically deletes data older than 90 days — no cron job needed.
- Map type: Flexible for non-fixed metadata (tags, labels) without needing ALTER TABLE.
4.1. Sample Analytical Queries
```sql
-- Top 10 slowest endpoints in the last 24 hours
SELECT
    service,
    endpoint,
    quantile(0.95)(response_time_ms) AS p95_ms,
    quantile(0.99)(response_time_ms) AS p99_ms,
    count() AS total_requests,
    countIf(status_code >= 500) AS errors,
    round(countIf(status_code >= 500) * 100.0 / count(), 2) AS error_rate
FROM api_logs
WHERE timestamp >= now() - INTERVAL 24 HOUR
GROUP BY service, endpoint
ORDER BY p95_ms DESC
LIMIT 10;

-- Hourly traffic pattern analysis (heatmap data)
SELECT
    toHour(timestamp) AS hour,
    toDayOfWeek(timestamp) AS day_of_week,
    count() AS requests,
    avg(response_time_ms) AS avg_latency
FROM api_logs
WHERE date >= today() - 7
GROUP BY hour, day_of_week
ORDER BY day_of_week, hour;
```
5. Materialized Views — Pre-compute for Speed
One of ClickHouse's most powerful features is Materialized Views — allowing you to pre-aggregate data as it's inserted, without needing a separate ETL pipeline.
```sql
-- Create destination table for per-minute metrics
CREATE TABLE api_metrics_1m
(
    timestamp DateTime,
    service LowCardinality(String),
    endpoint LowCardinality(String),
    requests AggregateFunction(count),
    p95_latency AggregateFunction(quantile(0.95), UInt32),
    errors AggregateFunction(sum, UInt64)
)
ENGINE = AggregatingMergeTree()
ORDER BY (service, endpoint, timestamp);

-- Materialized view auto-aggregates on INSERT
CREATE MATERIALIZED VIEW api_metrics_1m_mv
TO api_metrics_1m AS
SELECT
    toStartOfMinute(timestamp) AS timestamp,
    service,
    endpoint,
    countState() AS requests,
    quantileState(0.95)(response_time_ms) AS p95_latency,
    sumState(toUInt64(status_code >= 500)) AS errors
FROM api_logs
GROUP BY timestamp, service, endpoint;
```
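Reading the pre-aggregated table requires the matching -Merge combinators to finalize the partial aggregate states stored by the view:

```sql
-- Finalize the aggregate states written by the materialized view
SELECT
    service,
    endpoint,
    countMerge(requests) AS total_requests,
    quantileMerge(0.95)(p95_latency) AS p95_ms,
    sumMerge(errors) AS errors
FROM api_metrics_1m
WHERE timestamp >= now() - INTERVAL 1 HOUR
GROUP BY service, endpoint;
```

The -State / -Merge pairing matters: querying the AggregateFunction columns without the -Merge combinators returns opaque binary states rather than numbers.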
Real-World Impact
Querying the source table api_logs (1 billion rows): ~2-5 seconds. The same query on api_metrics_1m (pre-aggregated data only): ~10-50ms. That is a 40-500x speedup, with zero additional ETL effort.
6. ClickHouse vs Other Analytics Solutions
| Criteria | ClickHouse | PostgreSQL | Elasticsearch |
|---|---|---|---|
| Storage Model | Columnar (OLAP) | Row-oriented (OLTP) | Inverted Index + Doc Store |
| Analytical Query Speed | Ultra-fast (~ms on billions of rows) | Slow (minutes-hours on billions) | Moderate (seconds-minutes) |
| INSERT Throughput | Millions of rows/sec | Thousands of rows/sec | Tens of thousands docs/sec |
| Data Compression | Excellent (6-20x) | Average (2-3x) | Poor (1.5-3x) |
| Full-text Search | Basic (bloom filter) | Good (tsvector) | Excellent (inverted index) |
| UPDATE/DELETE | Limited (async mutation) | Excellent (MVCC) | Good (doc-level) |
| Operational Cost | Low (less RAM, less storage) | Medium | High (RAM-heavy) |
| SQL Support | Very comprehensive | Most complete | Limited (DSL-first) |
7. Production Deployment Architecture
A production analytics system typically combines ClickHouse with other components following the Kappa architecture pattern:
```mermaid
graph LR
    APP["Application
    Services"] --> KAFKA["Apache Kafka
    / Redpanda"]
    KAFKA --> CH["ClickHouse
    Cluster"]
    CH --> GRAFANA["Grafana
    Dashboard"]
    CH --> API["Analytics API
    (.NET / Node.js)"]
    API --> UI["Frontend
    (Vue.js)"]
    CH --> MV["Materialized
    Views"]
    MV --> CH
    KAFKA --> REDIS["Redis
    (real-time cache)"]
    REDIS --> API
    style APP fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style KAFKA fill:#2c3e50,stroke:#fff,color:#fff
    style CH fill:#e94560,stroke:#fff,color:#fff
    style GRAFANA fill:#16213e,stroke:#fff,color:#fff
    style API fill:#4CAF50,stroke:#fff,color:#fff
    style UI fill:#f8f9fa,stroke:#4CAF50,color:#2c3e50
    style MV fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style REDIS fill:#ff9800,stroke:#fff,color:#fff
```
Kappa Architecture with ClickHouse as the analytics core
Data Flow Explained
- Ingestion: Application services send events to Kafka/Redpanda. Never write directly to ClickHouse to avoid many small INSERTs (anti-pattern).
- Batch insert: The ClickHouse Kafka engine or a consumer service reads batches from Kafka and inserts them in large blocks (10,000-100,000 rows per batch), the ingestion pattern ClickHouse is optimized for.
- Pre-aggregation: Materialized Views automatically aggregate data in real-time upon insert, serving dashboards faster.
- Query layer: Grafana connects directly (official plugin). An API layer serves custom analytics for the frontend.
- Hot cache: Redis caches frequently-requested query results, reducing ClickHouse load under high concurrent queries.
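The Kafka-to-ClickHouse leg of the diagram can be wired entirely inside ClickHouse using the Kafka table engine plus a materialized view. A sketch (the broker address, topic, and column subset are placeholders):

```sql
-- 1. Kafka engine table: a streaming consumer, not durable storage
CREATE TABLE api_logs_queue
(
    timestamp DateTime64(3),
    service String,
    endpoint String,
    status_code UInt16,
    response_time_ms UInt32
)
ENGINE = Kafka
SETTINGS
    kafka_broker_list = 'kafka:9092',
    kafka_topic_list = 'api-logs',
    kafka_group_name = 'clickhouse-ingest',
    kafka_format = 'JSONEachRow';

-- 2. Materialized view continuously moves consumed rows into api_logs;
--    columns not selected here fall back to their DEFAULT expressions
CREATE MATERIALIZED VIEW api_logs_ingest_mv TO api_logs AS
SELECT timestamp, service, endpoint, status_code, response_time_ms
FROM api_logs_queue;
```

The Kafka engine consumes in background batches, so the small-INSERT anti-pattern is avoided without writing a separate consumer service.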
8. Query Performance Optimization
8.1. The Golden Rule: Read Less Data
ClickHouse is fastest when it reads the least data. Every optimization revolves around this principle:
Query Optimization Checklist
- Filter first, transform later: Always put WHERE clauses filtering on columns in the ORDER BY key. ClickHouse uses the sparse index to skip entire irrelevant granules.
- Use PREWHERE: for highly selective filters on small columns. ClickHouse reads the PREWHERE column first, eliminates non-matching rows, then reads the remaining columns only for the rows that survived, significantly reducing I/O.
- Avoid SELECT *: Only select the columns you need. Each additional column = additional I/O.
- Projections: Create alternative physical sort orders for queries with filter patterns different from the PRIMARY KEY.
- Skip indexes: Bloom filter indexes for high-cardinality columns (trace_id, user_id) not in the ORDER BY.
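The projections bullet in practice: api_logs is sorted by (service, endpoint, timestamp), so filtering by user_id alone scans everything. A projection adds a second physical sort order that ClickHouse maintains automatically (the projection name here is illustrative):

```sql
-- Alternative sort order for user-centric queries
ALTER TABLE api_logs
    ADD PROJECTION by_user
    (
        SELECT *
        ORDER BY user_id, timestamp
    );

-- Build the projection for existing parts
-- (newly inserted parts get it automatically)
ALTER TABLE api_logs MATERIALIZE PROJECTION by_user;

-- The optimizer can now serve this filter from the projection:
SELECT count() FROM api_logs WHERE user_id = 12345;
```

The trade-off is roughly doubled storage for the projected columns and extra work at merge time, so projections are best reserved for genuinely frequent alternative filter patterns.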
8.2. PREWHERE and Skip Index Examples
```sql
-- Add bloom filter index for trace_id
ALTER TABLE api_logs
    ADD INDEX idx_trace_id trace_id TYPE bloom_filter(0.01) GRANULARITY 4;

-- ClickHouse auto-moves lightweight filters to PREWHERE,
-- but you can force it when needed:
SELECT timestamp, endpoint, response_time_ms, error_message
FROM api_logs
PREWHERE service = 'payment-service'
WHERE timestamp >= now() - INTERVAL 1 HOUR
  AND status_code >= 500;
```
9. ClickHouse Cloud and SharedMergeTree
ClickHouse Cloud (managed service) introduces SharedMergeTree — a new engine that completely separates storage and compute:
| Feature | MergeTree (self-hosted) | SharedMergeTree (Cloud) |
|---|---|---|
| Storage | Local disk or attached volume | Object storage (S3/GCS) |
| Compute Scaling | Must rebalance data when adding nodes | Add stateless nodes, query immediately |
| Merge Process | Runs on query nodes, affects latency | Runs on separate nodes, no query impact |
| Cost | Pay for servers 24/7 | Pay-per-query + storage |
| Best For | Stable workloads, want full control | Variable workloads, optimize for cost |
10. When NOT to Use ClickHouse
ClickHouse is not a silver bullet
- OLTP applications: Need frequent UPDATE/DELETE, ACID transactions → use PostgreSQL, SQL Server.
- Complex full-text search: Need relevance scoring, fuzzy matching → use Elasticsearch/OpenSearch.
- Small data (<100GB): PostgreSQL with proper indexes is fast enough, no need for extra complexity.
- Heavy multi-table JOINs: ClickHouse supports JOINs but isn't optimized like traditional RDBMS for complex join patterns.
- Point lookups: Fetching a single row by primary key — row-stores are faster.
Conclusion
ClickHouse has proven itself as the leading solution for real-time data analytics at massive scale. With its columnar architecture, vectorized execution, and smart compression system, it enables processing billions of rows in milliseconds — something traditional databases need minutes or hours for. When combined with Kafka for ingestion, Redis for caching, and Grafana for visualization, ClickHouse becomes the centerpiece of a modern analytics architecture, meeting the increasingly complex observability and business intelligence needs of modern software systems.