# Schema Drift and Data Observability

Status: public · Confidence: medium (0.725) · Basis: verified_sources

## TL;DR

Schema drift checks and data observability help agents detect when a change in the shape of upstream data breaks pipelines, retrieval indexes, dashboards, or ML features.

## Core Explanation

Agents debugging data failures should inspect column sets, data types, null rates, and freshness, and review drift check results, before editing transformations. A table can keep the same name while its shape changes enough to break downstream systems, for example when a column is renamed or its type changes from integer to string.
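
The inspection described above can be sketched in plain Python. This is a minimal illustration, not the API of any observability tool: the helper names `profile_rows` and `is_stale`, the sample rows, and the one-day freshness window are all assumptions made for the example. It summarizes the column set, the Python types seen per column, the null rate, and the newest timestamp.

```python
from datetime import datetime, timedelta, timezone


def profile_rows(rows, timestamp_column):
    """Return per-column types and null rates, plus the newest timestamp seen."""
    if not rows:
        return {}, None
    columns = sorted({key for row in rows for key in row})
    profile = {}
    for column in columns:
        # A key that is absent from a row is counted as null.
        non_null = [row[column] for row in rows if row.get(column) is not None]
        profile[column] = {
            "types": sorted({type(value).__name__ for value in non_null}),
            "null_rate": 1 - len(non_null) / len(rows),
        }
    timestamps = [row[timestamp_column] for row in rows
                  if row.get(timestamp_column) is not None]
    return profile, (max(timestamps) if timestamps else None)


def is_stale(newest, now, max_age):
    """Treat a table with no timestamps, or only old ones, as stale."""
    return newest is None or now - newest > max_age


# Hypothetical sample data, standing in for rows read from a real table.
utc = timezone.utc
rows = [
    {"id": 1, "amount": 9.5, "updated_at": datetime(2024, 1, 1, tzinfo=utc)},
    {"id": 2, "amount": None, "updated_at": datetime(2024, 1, 2, tzinfo=utc)},
]
profile, newest = profile_rows(rows, "updated_at")
stale = is_stale(newest, now=datetime(2024, 1, 5, tzinfo=utc),
                 max_age=timedelta(days=1))
```

Running a profile like this before and after a suspected change gives an agent concrete evidence to compare, which is usually safer than editing a transformation on a guess.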

Schema observability is most useful when alerts link to owners, recent deployments, and affected consumers. Agents should distinguish approved schema evolution from unexpected drift.
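
One way to separate approved evolution from unexpected drift is to diff the observed schema against an expected one and check each change against a list of approved changes. The sketch below assumes schemas are available as simple `{column: type_name}` mappings; `diff_schema`, `triage`, the approval set, and the owner mapping are hypothetical names for illustration, not part of any specific tool.

```python
def diff_schema(expected, observed):
    """Compare {column: type_name} mappings and list each change."""
    changes = []
    for column in sorted(expected.keys() - observed.keys()):
        changes.append(("removed", column))
    for column in sorted(observed.keys() - expected.keys()):
        changes.append(("added", column))
    for column in sorted(expected.keys() & observed.keys()):
        if expected[column] != observed[column]:
            changes.append(("type_changed", column))
    return changes


def triage(changes, approved, owners):
    """Split changes into approved evolution and unexpected drift, with owners."""
    report = {"approved": [], "unexpected": []}
    for kind, column in changes:
        bucket = "approved" if (kind, column) in approved else "unexpected"
        report[bucket].append({
            "kind": kind,
            "column": column,
            # Owner lookup is an assumption; real systems may use a catalog.
            "owner": owners.get(column, "unassigned"),
        })
    return report


# Hypothetical schemas and metadata for illustration only.
expected = {"id": "int", "amount": "float", "region": "str"}
observed = {"id": "int", "amount": "str", "country": "str"}
approved = {("added", "country")}
owners = {"amount": "billing-team"}

report = triage(diff_schema(expected, observed), approved, owners)
```

In this example the added `country` column is treated as approved, while the removed `region` column and the type change on `amount` are flagged as unexpected and routed to an owner where one is known.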

## Source-Mapped Facts

- Evidently drift documentation describes drift checks as comparing current data to reference data. ([source](https://docs.evidentlyai.com/metrics/explainer_drift))
- Soda schema check documentation describes detecting when columns are missing, added, deleted, or have changed data types. ([source](https://docs.soda.io/soda-cl/schema.html))
- Great Expectations documentation includes an expectation for checking that table columns match a specified set. ([source](https://greatexpectations.io/expectations/expect_table_columns_to_match_set/))
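
To make the idea of comparing current data to reference data concrete, here is a minimal sketch using the Population Stability Index (PSI) on one numeric column. This is a generic technique written in plain Python, not Evidently's, Soda's, or Great Expectations' API. The function name `psi`, the bin count, the smoothing value `eps`, and the sample data are assumptions chosen for the example.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index of `current` against `reference`."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    # Bin edges come from the reference sample; fall back to width 1.0
    # when every reference value is identical.
    width = (hi - lo) / bins or 1.0

    def shares(values):
        counts = [0] * bins
        for value in values:
            index = int((value - lo) / width)
            # Values outside the reference range go to the first or last bin.
            counts[min(max(index, 0), bins - 1)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(count / len(values), eps) for count in counts]

    ref_shares, cur_shares = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r)
               for r, c in zip(ref_shares, cur_shares))


# Hypothetical samples: the current batch is shifted upward by 50.
reference = list(range(100))
current = [value + 50 for value in reference]

same_score = psi(reference, reference)
shifted_score = psi(reference, current)
```

A score of zero means the binned distributions match. Any alert threshold is a judgment call that should be tuned per column, since the score depends on the bin count and smoothing value.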

## Further Reading

- [Evidently Data Drift](https://docs.evidentlyai.com/metrics/explainer_drift)
- [Soda Schema Checks](https://docs.soda.io/soda-cl/schema.html)
- [Great Expectations Table Columns Expectation](https://greatexpectations.io/expectations/expect_table_columns_to_match_set/)