Apache Iceberg is an open table format for huge analytic datasets, built to give data lakes the reliability that data warehouses had always taken for granted. It originated at Netflix in 2017, where Ryan Blue and Daniel Weeks created it to solve real production problems with Hive tables at scale. Netflix donated it to the Apache Software Foundation in 2018, and it graduated to a top-level project in May 2020.
The core idea: Iceberg sits between your storage system (S3, HDFS, ADLS) and whatever query engine you're using. It's not a database, not a server, nothing to run or deploy. It's a specification -- a set of rules for organizing Parquet files, tracking metadata, and managing table state -- plus libraries that implement those rules. Spark, Trino, Flink, Presto, Snowflake, Dremio, Athena, BigQuery, and Databricks all speak Iceberg natively. Any of them can read from and write to the same tables without stepping on each other.
Why Iceberg Exists
To understand what Iceberg solves, trace how we got here. Data warehouses dominated for decades -- batch overnight ETL, specialized hardware, predictable schema. Around 2010, Hadoop changed the model: dump everything into a distributed file system cheaply, worry about structure later. Schema-on-read made sense when managing schema was painful and data volumes were breaking traditional warehouse economics.
But schema does matter. Consistency matters. The ability to safely modify data without corrupting concurrent reads matters. Early data lakes -- even Hive tables on HDFS -- had none of that. Updating a row meant rewriting the whole partition. Two writers touching the same table at once? Either you got lucky or you got corruption. Changing a column name meant rewriting every file or breaking downstream readers. Querying a large table required listing every file in a directory hierarchy, which got comically slow at scale.
Iceberg and formats like it emerged from those pain points. Netflix had tens of petabytes in Hive tables and was hitting every one of those problems at once.
How Apache Iceberg Works
Iceberg organizes table data through four layers of metadata stacked on top of ordinary data files.
At the bottom are data files -- typically Parquet, though ORC and Avro work too. Standard columnar files in whatever object storage you're using.
Above that, manifest files track which data files belong to a table. They're not just path lists -- each entry records column-level statistics: value counts, null counts, min/max values. Those statistics let query engines skip files entirely during planning without opening them.
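The statistics-based skipping described above can be sketched in a few lines. This is an illustrative model, not the Iceberg library: plain dicts stand in for manifest entries, and the field names and file paths are invented for the example.

```python
# Hypothetical manifest entries: each records min/max values per column,
# which is what lets a planner skip files without opening them.
manifest_entries = [
    {"path": "s3://bucket/t/data-001.parquet", "min": {"user_id": 1},    "max": {"user_id": 5000}},
    {"path": "s3://bucket/t/data-002.parquet", "min": {"user_id": 5001}, "max": {"user_id": 9000}},
    {"path": "s3://bucket/t/data-003.parquet", "min": {"user_id": 9001}, "max": {"user_id": 12000}},
]

def files_for_equality_predicate(entries, column, value):
    """Keep only files whose [min, max] range could contain `value`."""
    return [
        e["path"]
        for e in entries
        if e["min"][column] <= value <= e["max"][column]
    ]

# A query like `WHERE user_id = 7042` plans against one file, not three.
print(files_for_equality_predicate(manifest_entries, "user_id", 7042))
```

Real manifests carry more than min/max (value counts, null counts, bounds per partition field), but the planning principle is the same.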
Multiple ingest operations produce multiple manifests, so manifest lists collect them. A manifest list represents one complete version of a table's file set -- it's what the query engine reads to figure out which manifests and data files to touch.
The metadata file holds everything together: schema, partition spec, and a history of snapshots. Each snapshot points to a manifest list. This is what makes time travel possible and enables atomic schema changes -- you're never modifying files in place, just creating a new snapshot pointing to a new view of the data while old snapshots remain intact and queryable.
On top sits a catalog -- a lightweight lookup mapping table names to current metadata file locations. It can be the Hive Metastore, a JDBC database, AWS Glue, Project Nessie, or any compatible implementation. The query engine asks the catalog where to find a table, gets pointed to the metadata file, and the rest unwinds from there.
This structure -- JSON and Parquet in a bucket -- is what delivers ACID transactions, time travel, schema evolution, and efficient file-level skipping without specialized storage or a dedicated server.
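The whole resolution chain can be sketched with dicts standing in for the JSON and Avro files in object storage. Every path, key, and table name here is invented; the point is only the order of lookups a query engine performs.

```python
# Layer 1: catalog maps table name -> current metadata file location.
catalog = {"db.events": "s3://bucket/events/metadata/v12.metadata.json"}

# Layer 2: metadata file holds schema, partition spec, snapshot history.
metadata_files = {
    "s3://bucket/events/metadata/v12.metadata.json": {
        "current-snapshot-id": 2,
        "snapshots": {
            1: "s3://bucket/events/metadata/snap-1.avro",
            2: "s3://bucket/events/metadata/snap-2.avro",
        },
    }
}

# Layer 3: each snapshot's manifest list names the manifests in play.
manifest_lists = {
    "s3://bucket/events/metadata/snap-2.avro": [
        "s3://bucket/events/metadata/manifest-a.avro",
        "s3://bucket/events/metadata/manifest-b.avro",
    ]
}

# Layer 4: manifests track the actual data files.
manifests = {
    "s3://bucket/events/metadata/manifest-a.avro": ["data-001.parquet"],
    "s3://bucket/events/metadata/manifest-b.avro": ["data-002.parquet"],
}

def resolve(table_name):
    """Catalog -> metadata file -> manifest list -> manifests -> data files."""
    meta = metadata_files[catalog[table_name]]
    manifest_list = meta["snapshots"][meta["current-snapshot-id"]]
    return [f for m in manifest_lists[manifest_list] for f in manifests[m]]

print(resolve("db.events"))
```

Note that swapping `current-snapshot-id` to 1 would resolve an older version of the table without touching any data file, which is the mechanism behind time travel.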
Key Features
ACID transactions with snapshot isolation. Concurrent writers work against the same table safely. Iceberg resolves conflicts at commit time using optimistic concurrency. Readers always see a consistent snapshot, unaffected by in-progress writes.
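The commit protocol can be reduced to a compare-and-swap on the table's current-metadata pointer. A conceptual sketch, not the real library: each writer records the version it started from, and a commit is accepted only if that base is still current.

```python
class CatalogPointer:
    """Toy model of a catalog's atomic pointer to a table's metadata version."""

    def __init__(self):
        self.version = 0

    def commit(self, base_version, new_version):
        """Advance the pointer iff `base_version` is still current."""
        if self.version != base_version:
            return False  # another writer committed first; caller must retry
        self.version = new_version
        return True

ptr = CatalogPointer()
writer_a_base = ptr.version  # both writers start from version 0
writer_b_base = ptr.version

assert ptr.commit(writer_a_base, writer_a_base + 1) is True   # A's commit lands
assert ptr.commit(writer_b_base, writer_b_base + 1) is False  # B's base is stale
assert ptr.commit(ptr.version, ptr.version + 1) is True       # B rebases and retries
```

Readers never consult in-flight writes: they resolve whatever snapshot the pointer named when their query began, which is what gives them a consistent view.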
Hidden partitioning. Most formats require explicit partitioning and queries that match the scheme exactly. Iceberg tracks the relationship between partition values and the transform that produced them, so the engine prunes correctly even with different query expressions. Partition layouts can change without touching existing data -- old files stay on the old layout, new files use the new one, queries span both transparently.
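The transform bookkeeping can be sketched as follows, assuming a hypothetical table partitioned by a day() transform over a timestamp column. Because the table records the transform itself, the planner applies it to the predicate's bounds; the query never mentions a partition column.

```python
import datetime

TRANSFORM = lambda ts: ts.date()  # the day() transform for this example

# Invented file layout: each data file is tagged with its partition value.
files = [
    {"path": "data-001.parquet", "partition": datetime.date(2024, 3, 1)},
    {"path": "data-002.parquet", "partition": datetime.date(2024, 3, 2)},
]

def prune(files, ts_lo, ts_hi):
    """Prune on a raw-timestamp predicate by transforming its bounds."""
    lo, hi = TRANSFORM(ts_lo), TRANSFORM(ts_hi)
    return [f["path"] for f in files if lo <= f["partition"] <= hi]

# The user filters on the timestamp column directly; no day column appears.
print(prune(files,
            datetime.datetime(2024, 3, 2, 0, 5),
            datetime.datetime(2024, 3, 2, 23, 0)))
```

In a Hive-style table, the same query would scan everything unless the user remembered to also filter on the derived partition column.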
Schema evolution. Columns are identified by internal integer IDs, not names or positions. Rename, reorder, add, or drop columns and old files read correctly -- the engine maps stored column IDs to the current schema. Most other formats assume file column order matches schema order.
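ID-based resolution can be sketched like this. The shapes are simplified for illustration: a data file's rows are keyed by field ID, and the current schema maps IDs to whatever the columns are called today, so a rename never touches old files.

```python
# Row as written when the schema was (1: "name", 2: "age").
old_file_row = {1: "alice", 2: 42}

# Current schema after renaming "name" -> "full_name"; ID 1 is unchanged.
current_schema = {
    1: "full_name",
    2: "age",
}

def read_row(file_row, schema):
    """Map stored field IDs to current column names; missing IDs read as absent."""
    return {name: file_row[fid] for fid, name in schema.items() if fid in file_row}

print(read_row(old_file_row, current_schema))
```

A dropped column simply disappears from the schema map, and a newly added column gets a fresh ID that no old file contains, so old rows read it as null rather than misaligning by position.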
Time travel. Every write creates a new snapshot; none are deleted until you explicitly run expiry. Query any historical snapshot by ID or timestamp -- useful for debugging, auditing, or reproducing dataset state at a specific point.
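Snapshot-by-timestamp lookup is a simple scan over the history kept in the metadata file. A sketch with invented snapshot IDs and Unix timestamps:

```python
snapshots = [  # (commit timestamp, snapshot id), oldest first
    (1_700_000_000, 101),
    (1_700_100_000, 102),
    (1_700_200_000, 103),
]

def snapshot_as_of(snapshots, ts):
    """Return the id of the newest snapshot committed at or before `ts`."""
    candidates = [sid for (t, sid) in snapshots if t <= ts]
    if not candidates:
        raise ValueError("no snapshot at or before that timestamp")
    return candidates[-1]

# "What did the table look like at this moment?" resolves to snapshot 102.
print(snapshot_as_of(snapshots, 1_700_150_000))
```

Engines expose this through SQL (e.g. Spark's `FOR TIMESTAMP AS OF` / `FOR VERSION AS OF` clauses on Iceberg tables); the mechanism underneath is just this lookup followed by the normal manifest-list resolution.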
Scalable metadata. File-skipping statistics in manifests mean query planning doesn't require opening data files or listing directories. Scales to tables with millions of files where directory-listing breaks down.
Where Iceberg Fits in a Data Lakehouse Stack
A lakehouse on Iceberg typically has five layers:
- Object storage -- S3, GCS, ADLS, or HDFS for the actual files.
- File format -- Parquet is standard; Iceberg also supports ORC and Avro.
- Table format -- Apache Iceberg provides the metadata layer making files behave like managed tables.
- Catalog -- AWS Glue, Nessie, Hive Metastore, or Iceberg REST Catalog tracks tables and metadata locations.
- Query engine -- Spark for batch, Trino or Dremio for interactive queries, Flink for streaming -- all reading the same tables.
The practical benefit over a traditional warehouse: no second copy of data into a proprietary system. Everything stays in open formats on cheap storage, and any Iceberg-capable engine can use it. When a better engine comes along, adopt it without migrating data.
Apache Iceberg vs Delta Lake vs Apache Hudi
All three are open table formats solving the same core problem. Differences come down to ecosystem fit and feature tradeoffs.
| | Apache Iceberg | Delta Lake | Apache Hudi |
|---|---|---|---|
| Engine support | Broadest -- Spark, Trino, Flink, Athena, BigQuery, Snowflake, Dremio, and more | Strong, Spark-native; growing beyond Databricks | Good, Spark-native; growing |
| Catalog interop | REST Catalog (open standard, cross-vendor) | Unity Catalog (Databricks), Delta Sharing | Hive Metastore, custom |
| Partition evolution | Yes, without rewriting existing data | Limited | Limited |
| Hidden partitioning | Yes | No | No |
| Deletion vectors | Yes (V3 spec) | Yes | Yes |
| Primary maintainer | Apache Software Foundation | Databricks (open source) | Apache Software Foundation |
| Best fit | Multi-engine, multi-cloud setups | Databricks-centric stacks | Upsert-heavy pipelines, CDC |
Iceberg is the right choice when engine portability matters and you want to avoid coupling your table format to a vendor. Delta Lake works well if you're deep in Databricks and staying there. Hudi tends to appear in heavy upsert and CDC pipelines, where its record-level indexing is a genuine advantage.
Cloud and Platform Support
Every major cloud platform now has native Iceberg support:
AWS offers Amazon S3 Tables for managed Iceberg storage, with Athena, EMR, Glue, and Redshift all reading and writing Iceberg on S3. AWS supports the Iceberg REST Catalog standard.
Google Cloud has BigQuery BigLake Iceberg Tables with streaming ingestion, auto-reclustering, and Vertex AI integration.
Snowflake provides managed Iceberg Tables with full DML, plus Snowflake Open Catalog (a managed version of Apache Polaris, the open-source REST Catalog reference).
Databricks added full Iceberg support through Unity Catalog's REST Catalog API -- a significant shift following their 2024 acquisition of Tabular, the company founded by Iceberg's creators.
Microsoft Fabric supports Iceberg via metadata virtualization and Data Factory integration.
The Iceberg REST Catalog
The REST Catalog is a vendor-neutral HTTP API spec for catalog operations -- creating tables, listing namespaces, committing metadata updates. Any engine implementing the client can work with any backend implementing the server, regardless of how either is built.
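The route shapes a client builds look roughly like the sketch below. The paths are paraphrased from the REST Catalog spec and the base URL is hypothetical; consult the published OpenAPI definition for the exact routes, prefixes, and request payloads before relying on them.

```python
BASE = "https://catalog.example.com/v1"  # hypothetical endpoint

def list_namespaces_url():
    """Namespace listing, roughly: GET /v1/namespaces"""
    return f"{BASE}/namespaces"

def load_table_url(namespace, table):
    """Table load, roughly: GET /v1/namespaces/{namespace}/tables/{table}"""
    return f"{BASE}/namespaces/{namespace}/tables/{table}"

# No network calls here; this only shows the shape of the interface.
print(load_table_url("db", "events"))
```

The table-load response carries the table's current metadata, which is all an engine needs to start the manifest-list resolution described earlier; commits go back through the same API so the catalog can enforce the atomic pointer swap.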
Apache Polaris (contributed to ASF by Snowflake) is the reference implementation. AWS, Google Cloud, Snowflake, Databricks, and Dremio all offer REST Catalog-compatible services. The practical effect: swap catalog backends without changing query engines, and new engines don't need custom integrations for each catalog.
Iceberg V3 Specification
The V3 spec, ratified spring 2025, adds capabilities that were V2 limitations:
Deletion vectors replace positional delete files for row-level deletes. Instead of a separate file listing deleted positions, V3 uses a compact binary bitmap. UPDATE and DELETE become significantly cheaper at both write and read time.
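The idea of a deletion vector can be shown with a plain position bitmap: one bit per row in a data file, set when that row is deleted. The real V3 encoding is a compressed (roaring-style) bitmap stored separately from the data file; a Python int is enough for the concept.

```python
class DeletionVector:
    """Toy bitmap of deleted row positions for one data file."""

    def __init__(self):
        self.bits = 0

    def delete(self, pos):
        self.bits |= 1 << pos           # mark row `pos` as deleted

    def is_deleted(self, pos):
        return bool(self.bits >> pos & 1)

dv = DeletionVector()
dv.delete(3)
dv.delete(7)

rows = ["r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7"]
live = [r for i, r in enumerate(rows) if not dv.is_deleted(i)]
print(live)  # rows 3 and 7 are filtered out at read time
```

Compared with V2 positional delete files, the reader applies one bitmap per data file instead of merging a list of (file, position) records, which is where the read-time savings come from.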
Row-level lineage adds metadata to individual records for precise change tracking, simplifying CDC pipelines and audit requirements.
New data types include Variant for semi-structured data, geospatial types (geometry and geography), and nanosecond-precision timestamps -- gaps that required workarounds in V2.
Is Iceberg Parquet? Is It a Database?
Two questions that come up often:
Iceberg is not Parquet. Parquet is a file format for columnar data storage. Iceberg is a table format managing a collection of Parquet files -- it adds the metadata layer for transactional semantics, schema management, and partition tracking. Iceberg tables almost always store data as Parquet, but could use ORC or Avro. Parquet alone has no concept of tables, schemas, or transactions.
Iceberg is not a database. There's no server process to run. It's a specification plus libraries. The "database-like" features come from the metadata layer alongside data files in object storage, interpreted by whichever engine you use.
Netflix, Apple, Airbnb, LinkedIn, Salesforce, Tencent, and Bloomberg are among the organizations running Iceberg at production scale.