Data Iceberg Snapshot Expiration and Orphan Files
Status: public · Confidence: medium (0.825) · Basis: verified_sources
## TL;DR

Iceberg table maintenance needs snapshot retention and orphan-file cleanup evidence before agents delete metadata or data files.

## Core Explanation

Iceberg snapshots preserve table state for readers, time travel, and rollback. Expiring old snapshots and removing orphan files can reduce metadata and storage cost, but unsafe retention windows can delete files still needed by in-progress writers or readers.

Agents should inspect the following before recommending cleanup:

- snapshot age
- branch and tag references
- metadata retention settings
- dry-run output
- file prefixes
- object-store paths
- active jobs

## Source-Mapped Facts

- Apache Iceberg documentation lists expire snapshots, remove old metadata files, and delete orphan files as recommended maintenance. ([source](https://iceberg.apache.org/docs/latest/maintenance/))
- Apache Iceberg documentation says orphan files are files not referenced by table metadata. ([source](https://iceberg.apache.org/docs/latest/maintenance/))
- The Apache Iceberg specification says a snapshot represents the state of a table at some time and is used to access the complete set of data files in the table. ([source](https://iceberg.apache.org/spec/))

## Further Reading

- [Apache Iceberg Maintenance](https://iceberg.apache.org/docs/latest/maintenance/)
- [Apache Iceberg Specification](https://iceberg.apache.org/spec/)
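## Illustrative Sketch

The pre-cleanup checks described in the Core Explanation can be sketched in plain Python. This is a minimal dry-run-style illustration, not an Iceberg API: `SnapshotInfo`, `expiration_candidates`, `orphan_candidates`, the example paths, and all thresholds are hypothetical names and values chosen for the example. It assumes the snapshot list, the snapshot IDs pinned by branches or tags, the referenced file paths, and the object-store listing have already been gathered by other means. It also simplifies reference handling by protecting only directly referenced snapshot IDs, not their ancestors.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class SnapshotInfo:
    """Hypothetical record for this sketch; not an Iceberg API type."""
    snapshot_id: int
    committed_at: datetime


def expiration_candidates(snapshots, referenced_ids, now, max_age, min_to_keep):
    """Return snapshot IDs that could be proposed for expiration (nothing is deleted).

    A snapshot is a candidate only if it is older than `max_age`, is not among
    the `min_to_keep` most recent snapshots, and is not pinned by a branch or tag.
    """
    newest_first = sorted(snapshots, key=lambda s: s.committed_at, reverse=True)
    keep_recent = {s.snapshot_id for s in newest_first[:min_to_keep]}
    cutoff = now - max_age
    return [
        s.snapshot_id
        for s in newest_first
        if s.committed_at < cutoff
        and s.snapshot_id not in keep_recent
        and s.snapshot_id not in referenced_ids
    ]


def orphan_candidates(listed_files, referenced_paths, now, min_age):
    """Return paths not referenced by table metadata and older than `min_age`.

    `listed_files` maps path -> last-modified time from an object-store listing.
    The age check leaves recently written files alone, since they may belong
    to a writer that has not committed yet.
    """
    cutoff = now - min_age
    return sorted(
        path
        for path, modified_at in listed_files.items()
        if path not in referenced_paths and modified_at < cutoff
    )


# Example inputs (all values are made up for illustration).
now = datetime(2024, 6, 30, tzinfo=timezone.utc)
snapshots = [
    SnapshotInfo(1, now - timedelta(days=30)),
    SnapshotInfo(2, now - timedelta(days=20)),
    SnapshotInfo(3, now - timedelta(days=10)),
    SnapshotInfo(4, now - timedelta(days=1)),
]
snapshot_report = expiration_candidates(
    snapshots,
    referenced_ids={2},  # assume a tag pins snapshot 2
    now=now,
    max_age=timedelta(days=7),
    min_to_keep=1,
)

listed_files = {
    "s3://example-bucket/tbl/data/a.parquet": now - timedelta(days=10),
    "s3://example-bucket/tbl/data/b.parquet": now - timedelta(days=10),
    "s3://example-bucket/tbl/data/c.parquet": now - timedelta(hours=1),
}
orphan_report = orphan_candidates(
    listed_files,
    referenced_paths={"s3://example-bucket/tbl/data/a.parquet"},
    now=now,
    min_age=timedelta(days=3),
)

print(snapshot_report)  # [3, 1]
print(orphan_report)    # ['s3://example-bucket/tbl/data/b.parquet']
```

Both functions only report candidates, mirroring the dry-run evidence an agent should collect first. Snapshot 2 is excluded because a tag references it, and `c.parquet` is excluded because it is too recent to rule out an in-progress writer.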