A hands-on guide to ClickHouse's Iceberg integration - table engines, table functions, DataLakeCatalog, architecture patterns for hot/cold tiering and write-back, and performance tuning tips.
Storage and compute keep drifting apart in the modern data stack, and open table formats like Apache Iceberg have become the default way to manage analytical data at scale. ClickHouse - famous for sub-second queries on billions of rows - has built deep Iceberg integration over 2024 and 2025, letting you query and write to Iceberg tables directly.
There's a genuine architectural tension here. ClickHouse is blazing fast but historically locked data into its own MergeTree format. Iceberg gives you ACID transactions on object storage, schema evolution, and engine-agnostic access - but no high-performance real-time query layer. Put them together and you get the core of a data lakehouse: Iceberg as the open storage and interchange layer, ClickHouse as the real-time analytics engine. Below, we walk through how the integration works, which architecture patterns hold up in production, and where the rough edges still are.
How ClickHouse Integrates with Iceberg
ClickHouse offers three integration mechanisms: the Iceberg table engine for persistent table definitions, table functions for ad-hoc queries, and the DataLakeCatalog database engine for catalog-driven discovery. Each fits a different access pattern.
The Iceberg Table Engine
This creates a persistent ClickHouse table pointing to an existing Iceberg table on external storage. Four variants cover different backends: IcebergS3 (aliased as Iceberg), IcebergAzure, IcebergHDFS, and IcebergLocal.
CREATE TABLE my_iceberg_table
ENGINE = IcebergS3(
    's3://my-bucket/warehouse/my_table/',
    'access_key', 'secret_key'
)
The engine supports the features that make Iceberg worth using. Schema evolution just works - ClickHouse handles added, removed, reordered, and renamed columns, along with type promotions like int to long and float to double. Partition pruning skips irrelevant data files based on Iceberg's metadata when you set use_iceberg_partition_pruning = 1. Time travel (added in 25.4) lets you query historical snapshots:
-- Query by timestamp (milliseconds since epoch)
SELECT * FROM my_iceberg_table
SETTINGS iceberg_timestamp_ms = 1714636800000;
-- Or by snapshot ID
SELECT * FROM my_iceberg_table
SETTINGS iceberg_snapshot_id = 3547395809148285433;
Write support landed in 25.7 with INSERT INTO for existing Iceberg tables. Version 25.8 expanded this to CREATE TABLE for new Iceberg tables, ALTER DELETE mutations, and equality delete handling. 25.9 completed the picture with ALTER UPDATE and distributed writes, reaching read/write parity. Writes work with REST and AWS Glue catalogs.
The engine can also connect directly to catalogs through settings like storage_catalog_type, storage_catalog_url, and catalog_credential - supporting REST, Glue, and Unity catalog types.
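A catalog-connected engine definition might look like the following sketch; the catalog URL and credential values are illustrative placeholders, assuming a REST catalog endpoint:

```sql
-- Sketch: Iceberg engine resolving table metadata through a REST catalog.
-- The catalog URL and credential below are placeholders, not real endpoints.
CREATE TABLE catalog_backed_table
ENGINE = IcebergS3('s3://my-bucket/warehouse/my_table/', 'access_key', 'secret_key')
SETTINGS
    storage_catalog_type = 'rest',
    storage_catalog_url = 'http://rest-catalog:8181/v1',
    catalog_credential = 'client_id:client_secret';
```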
The Iceberg Table Function
For quick exploration, ClickHouse provides inline table functions - no persistent table definition needed:
SELECT count(*), avg(price)
FROM icebergS3('s3://my-bucket/warehouse/sales/')
WHERE sale_date > '2025-01-01';
The ClickHouse documentation actually recommends table functions over the engine for most read-only use cases, since ClickHouse wasn't originally designed for tables whose schemas change externally. Table functions are read-only but support all the same settings - partition pruning, time travel, metadata caching.
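Since table functions accept the same settings as the engine, time travel works in an ad-hoc query too. A sketch, with an illustrative snapshot ID:

```sql
-- Ad-hoc query pinned to a historical snapshot - no table definition needed.
-- The snapshot ID here is illustrative.
SELECT count(*)
FROM icebergS3('s3://my-bucket/warehouse/sales/')
WHERE sale_date >= '2025-01-01'
SETTINGS iceberg_snapshot_id = 3547395809148285433;
```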
For distributed processing, the icebergS3Cluster variant (since 24.11) spreads file-level work across cluster nodes, scaling roughly linearly:
SELECT * FROM icebergS3Cluster(
    'my_cluster',
    's3://my-bucket/warehouse/large_table/'
)
WHERE event_date = today();
The DataLakeCatalog Database Engine
The newest integration - and arguably the most useful - is the DataLakeCatalog database engine. It connects to external catalog services and auto-discovers all tables. Point ClickHouse at a catalog, and every table becomes queryable without manual definitions.
CREATE DATABASE my_lake
ENGINE = DataLakeCatalog
SETTINGS
    catalog_type = 'glue',
    region = 'us-east-2',
    aws_access_key_id = '...',
    aws_secret_access_key = '...';
-- All tables are immediately queryable
SHOW TABLES FROM my_lake;
SELECT * FROM my_lake.`my_schema.events` WHERE ts > now() - INTERVAL 1 DAY;
Catalog support has expanded quickly: REST catalogs and Apache Polaris since 24.12, AWS Glue and Databricks Unity Catalog since 25.3, Hive Metastore since 25.5, Microsoft OneLake since 25.11. The engine detects whether a table is Iceberg or Delta Lake from catalog metadata and routes queries accordingly - which also enables federated queries across multiple catalogs in a single statement. DataLakeCatalog entered beta on ClickHouse Cloud with version 25.8.
For a closer look at the metadata structures ClickHouse reads under the hood, see our Apache Iceberg Architecture Deep Dive.
Architecture Patterns for ClickHouse + Iceberg
Three patterns have taken hold in production, each with different trade-offs.
Pattern 1: Ad-Hoc Query Layer
The simplest approach. Data lives entirely in Iceberg; ClickHouse queries it via table functions or DataLakeCatalog with no data movement.
Good for data exploration, cross-system joins, and occasional analytical queries where Parquet decoding and network I/O overhead are tolerable. Think of it as an Athena replacement with ClickHouse's richer SQL dialect and stronger performance on complex queries. The catch: queries run in seconds, not milliseconds - roughly 2-3x slower than native MergeTree tables. That gap is shrinking as ClickHouse improves its native Parquet reader, but it's still a poor fit for high-concurrency, sub-second dashboards.
Pattern 2: Hot/Cold Tiered Architecture
The most common production pattern. Recent "hot" data goes into native ClickHouse MergeTree tables for maximum query speed. Historical "cold" data stays in Iceberg on object storage - accessible when needed but not burning expensive ClickHouse resources.
A typical setup streams data through Kafka and Flink, writing to both ClickHouse (hot data) and Iceberg (durable archive). The KFC architecture (Kafka, Flink, ClickHouse) extends naturally to include Iceberg as the cold tier.
Cross-tier queries use a UNION ALL view:
CREATE VIEW unified_events AS
SELECT * FROM events_hot WHERE event_date >= today() - 7
UNION ALL
SELECT * FROM icebergS3('s3://lake/events/') WHERE event_date < today() - 7;
There's real friction here, though. ClickHouse uses unsigned integers (UInt32, UInt64) heavily while Iceberg only supports signed types, causing type mismatches in UNION queries. ClickHouse also doesn't push aggregations down through UNION ALL subqueries - it materializes all rows first, then aggregates. Expensive for large cold datasets. Teams at scale have had to build workarounds, like Altinity's Hybrid Table engine that merges partial aggregation results from both tiers transparently.
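One common workaround is to normalize types explicitly inside the view so both branches agree on a single signed schema. A sketch, assuming the hot table stores `user_id` as UInt64:

```sql
-- Cast unsigned ClickHouse types to the signed equivalents Iceberg supports,
-- so both UNION ALL branches share one schema. Column names are illustrative.
CREATE VIEW unified_events AS
SELECT toInt64(user_id) AS user_id, event_date, price
FROM events_hot
WHERE event_date >= today() - 7
UNION ALL
SELECT user_id, event_date, price
FROM icebergS3('s3://lake/events/')
WHERE event_date < today() - 7;
```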
Netflix runs a variant of this at massive scale - 5 petabytes of logs daily at 10.6 million events per second on average, with ClickHouse serving sub-second queries on recent data and Iceberg handling cost-efficient long-term storage.
Pattern 3: ClickHouse as Iceberg Writer
ClickHouse doesn't just read Iceberg here - it writes back. Process and transform data in ClickHouse, then materialize results into Iceberg tables that Spark, Trino, Flink, or other engines can consume.
Write support arrived in 25.7 (INSERT INTO existing tables), grew in 25.8 (CREATE TABLE for new tables, equality deletes, ALTER DELETE), and reached full read/write parity in 25.9 (ALTER UPDATE, distributed writes). There's even experimental compaction via OPTIMIZE TABLE.
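A batch materialization job built on these capabilities might look like the following sketch; the table name, columns, and aggregation are illustrative, using the IcebergS3 signature shown earlier:

```sql
-- Create a new Iceberg table (25.8+) and materialize a daily aggregate into it.
-- Table name, schema, and source query are illustrative.
CREATE TABLE daily_revenue (day Date, revenue Float64)
ENGINE = IcebergS3('s3://lake/marts/daily_revenue/', 'access_key', 'secret_key');

INSERT INTO daily_revenue
SELECT toDate(ts) AS day, sum(price) AS revenue
FROM events_hot
GROUP BY day;

-- Experimental: compact the small data files left behind by repeated inserts.
OPTIMIZE TABLE daily_revenue;
```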
One critical limitation: Iceberg's commit model uses optimistic concurrency with atomic metadata swaps, serializing writes at the metadata level. Adobe hit a ceiling of roughly 15 commits per minute per table in production. High-frequency streaming writes from ClickHouse to Iceberg are not viable. This pattern fits batch transformations, aggregated outputs, and periodic materializations - not real-time pipelines.
Performance Considerations and Best Practices
Good performance requires tuning on both the ClickHouse and Iceberg sides.
Enable partition pruning. use_iceberg_partition_pruning lets ClickHouse skip irrelevant data files at query planning time using Iceberg's partition metadata. It was off by default in earlier releases and became enabled by default around 25.6. On older releases, set it to 1 explicitly. ClickHouse also leverages Iceberg v2 column-level statistics for additional file pruning automatically.
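On releases that predate the default flip, the setting can be enabled per query; the path and filter below are illustrative:

```sql
-- Explicitly enable file-level pruning from Iceberg partition metadata
-- (only needed on releases where it is not yet the default).
SELECT count(*)
FROM icebergS3('s3://lake/events/')
WHERE event_date = '2025-06-01'
SETTINGS use_iceberg_partition_pruning = 1;
```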
Keep metadata caching on. use_iceberg_metadata_files_cache (on by default) caches manifest files, manifest lists, and metadata JSON in memory. Saves you from re-reading metadata from S3 on every query. One caveat: cached metadata can go stale when external writers update the same tables.
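When an external writer may have committed since the last read, the cache can be bypassed for a single query; the path is illustrative:

```sql
-- Bypass the in-memory metadata cache to pick up a fresh snapshot
-- committed by an external writer (e.g. Spark or Flink).
SELECT count(*)
FROM icebergS3('s3://lake/events/')
SETTINGS use_iceberg_metadata_files_cache = 0;
```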
Use cluster table functions for large scans. icebergS3Cluster distributes file-level work across cluster nodes for near-linear scaling. Since 25.3, parallel replicas work without explicit cluster functions too.
Right-size your approach. For frequent, latency-sensitive queries, ingest into native MergeTree tables rather than querying Iceberg directly. Reserve direct Iceberg queries for ad-hoc analysis and infrequent access. The native Parquet reader (which bypasses Arrow's intermediate layer) has narrowed the performance gap, but it hasn't closed it.
Maintain your Iceberg tables. Tables that are well-compacted with managed snapshots perform noticeably better when queried from ClickHouse. High-frequency streaming into Iceberg produces many small Parquet files, and query planning degrades as file count grows. Regular compaction and snapshot expiry are not optional in production.
Watch for type mismatches. ClickHouse's type system doesn't map 1:1 to Iceberg's. Unsigned integers, Nullable handling, and complex nested types can behave unexpectedly. Test schemas early - especially if you plan to write from ClickHouse and read from other engines.
Wrapping Up
ClickHouse's Iceberg integration matured fast - from basic read-only table functions to full bidirectional access with catalog federation, all within about two years. The three architecture patterns (ad-hoc queries, hot/cold tiering, write-back) serve distinct needs, and the right pick depends on your latency requirements, query frequency, and how many engines need access to the same data.
Read workloads are production-ready. Write support is maturing quickly. But choosing the right pattern, tuning partition pruning and caching, and managing the Iceberg table lifecycle all take hands-on expertise. At BigData Boutique, we work with teams building ClickHouse and Iceberg-based lakehouse architectures daily - from initial design through production optimization. If you're evaluating this stack or hitting performance walls, reach out and let's talk.