Data Delta Lake MERGE and Upserts
Status: public · Confidence: medium (0.685) · Basis: verified_sources
## TL;DR

Delta Lake MERGE evidence tells agents whether an upsert pipeline updated, inserted, deleted, or failed because source rows ambiguously matched target rows.

## Core Explanation

MERGE is a data-infrastructure control point, not just a SQL convenience. It joins a source change set to a target table and applies conditional actions. If the source has duplicate keys, schema drift, or wrong clause ordering, the pipeline can produce stale rows, duplicate facts, or ambiguous updates.

Agents should inspect source deduplication, match keys, `WHEN MATCHED` clauses, `WHEN NOT MATCHED` clauses, schema validation, table history, operation metrics, partitions, and transaction conflicts before replaying data or changing incremental logic.

## Source-Mapped Facts

- Delta Lake documentation says data can be upserted from a source table, view, or DataFrame into a target Delta table with the MERGE SQL operation. ([source](https://docs.delta.io/delta-update/))
- Delta Lake documentation says Delta Lake supports inserts, updates, and deletes in MERGE. ([source](https://docs.delta.io/delta-update/))
- Delta Lake documentation says whenMatched clauses execute when a source row matches a target table row based on the match condition. ([source](https://docs.delta.io/delta-update/))
- Delta Lake documentation says a merge operation can fail if multiple rows from the source dataset match and the merge attempts to update the same target rows. ([source](https://docs.delta.io/delta-update/))
- Databricks Delta merge documentation says only a single row from the source table can match a given row in the target table. ([source](https://docs.databricks.com/aws/en/delta/merge))

## Further Reading

- [Delta Lake Update and Merge](https://docs.delta.io/delta-update/)
- [Databricks Delta Merge](https://docs.databricks.com/aws/en/delta/merge)
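
## Illustrative Sketch: Source Deduplication Before MERGE

The Core Explanation names source deduplication and match keys as the first things to inspect. The sketch below illustrates that check in plain Python, with no Spark or Delta Lake dependency. It is not the Delta Lake API: the function names (`find_duplicate_keys`, `dedupe_latest`, `simulate_upsert`), the row layout, and the `ts` ordering column are all hypothetical. It is also stricter than the documented behavior, because it rejects any duplicated source key instead of only those that match and update the same target row.

```python
from collections import Counter


def find_duplicate_keys(source_rows, key):
    """Return the sorted match-key values that appear more than once in the source."""
    counts = Counter(row[key] for row in source_rows)
    return sorted(k for k, n in counts.items() if n > 1)


def dedupe_latest(source_rows, key, order_by):
    """Keep one row per key: the one with the largest order_by value.

    On a tie, the row seen first is kept (an arbitrary choice for this sketch).
    """
    latest = {}
    for row in source_rows:
        current = latest.get(row[key])
        if current is None or row[order_by] > current[order_by]:
            latest[row[key]] = row
    return list(latest.values())


def simulate_upsert(target_rows, source_rows, key):
    """Toy upsert: update matched keys, insert unmatched keys.

    Assumption: raises on ANY duplicated source key, which is stricter
    than the failure condition described in the Delta Lake documentation.
    """
    dupes = find_duplicate_keys(source_rows, key)
    if dupes:
        raise ValueError(f"ambiguous source rows for keys: {dupes}")

    merged = {row[key]: dict(row) for row in target_rows}
    metrics = {"updated": 0, "inserted": 0}
    for row in source_rows:
        if row[key] in merged:
            merged[row[key]].update(row)   # analogous to WHEN MATCHED ... UPDATE
            metrics["updated"] += 1
        else:
            merged[row[key]] = dict(row)   # analogous to WHEN NOT MATCHED ... INSERT
            metrics["inserted"] += 1
    return list(merged.values()), metrics


# Hypothetical data: key 2 appears twice in the source change set.
target = [{"id": 1, "v": "a", "ts": 1}, {"id": 2, "v": "b", "ts": 1}]
source = [
    {"id": 2, "v": "b2", "ts": 2},
    {"id": 2, "v": "b3", "ts": 3},
    {"id": 3, "v": "c", "ts": 2},
]

duplicates = find_duplicate_keys(source, "id")
clean_source = dedupe_latest(source, "id", "ts")
result, metrics = simulate_upsert(target, clean_source, "id")
```

Running the upsert on the raw `source` raises, while the deduplicated change set applies one update and one insert. In a real pipeline the equivalent check would run against the source DataFrame or view before the MERGE, and the resulting counts would be compared with the operation metrics recorded in the table history.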