Data Quality Expectations and Validation Rules

Status: public · Confidence: medium (0.725) · Basis: verified_sources

## TL;DR

Data quality expectations and validation rules give agents executable evidence about whether a dataset is fit for use.

## Core Explanation

Agents answering data questions should not rely only on table names or row counts. They need quality signals such as failed tests, freshness checks, null thresholds, uniqueness constraints, accepted values, distribution drift, and validation history.
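To make a few of these signals concrete, here is a minimal sketch in plain Python of a null threshold, a uniqueness constraint, an accepted-values check, and a freshness check. It assumes rows arrive as a list of dictionaries; the `CheckResult` type, the function names, and the sample `orders` data are hypothetical and do not represent the API of Great Expectations, dbt, or Soda.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def check_null_rate(rows, column, max_null_rate):
    """Pass if the share of missing values in `column` is within the threshold."""
    nulls = sum(1 for r in rows if r.get(column) is None)
    rate = nulls / len(rows) if rows else 0.0
    return CheckResult(f"null_rate({column})", rate <= max_null_rate, f"{rate:.2%} null")


def check_unique(rows, column):
    """Pass if no non-null value in `column` appears more than once."""
    values = [r[column] for r in rows if r.get(column) is not None]
    dupes = len(values) - len(set(values))
    return CheckResult(f"unique({column})", dupes == 0, f"{dupes} duplicate value(s)")


def check_accepted_values(rows, column, accepted):
    """Pass if every non-null value in `column` is in the accepted set."""
    bad = sorted({r[column] for r in rows
                  if r.get(column) is not None and r[column] not in accepted})
    return CheckResult(f"accepted_values({column})", not bad, f"unexpected: {bad}")


def check_freshness(rows, column, max_age, now):
    """Pass if the newest timestamp in `column` is no older than `max_age`."""
    newest = max((r[column] for r in rows if r.get(column) is not None), default=None)
    fresh = newest is not None and now - newest <= max_age
    detail = f"newest: {newest.isoformat()}" if newest else "no timestamps"
    return CheckResult(f"freshness({column})", fresh, detail)


# Hypothetical sample data; a fixed "now" keeps the example deterministic.
NOW = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
orders = [
    {"order_id": 1, "status": "paid", "email": "a@example.com",
     "updated_at": NOW - timedelta(hours=2)},
    {"order_id": 2, "status": "refunded", "email": None,
     "updated_at": NOW - timedelta(hours=1)},
    {"order_id": 2, "status": "unknown", "email": "c@example.com",
     "updated_at": NOW - timedelta(hours=5)},
]

results = [
    check_null_rate(orders, "email", max_null_rate=0.5),
    check_unique(orders, "order_id"),
    check_accepted_values(orders, "status", {"paid", "refunded", "pending"}),
    check_freshness(orders, "updated_at", timedelta(hours=24), now=NOW),
]

for r in results:
    print(f"{'PASS' if r.passed else 'FAIL'}  {r.name}: {r.detail}")
```

An agent could attach results like these to its answer, so a reader sees which checks passed rather than only a table name and a row count.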

Validation rules become more useful when they are defined as part of, or alongside, the data product contract, because the contract can state what each failure means. A failed rule can then tell the agent whether to block a downstream answer, warn about incomplete evidence, open an incident, or request an owner decision.
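One way to picture this is a lookup from rule name to the actions the contract prescribes on failure. The sketch below is illustrative only: the `Action` enum, the `CONTRACT_RULES` mapping, the rule names, and the default of warning on unmapped rules are all assumptions, not part of any specific contract format or tool.

```python
from enum import Enum


class Action(Enum):
    BLOCK_ANSWER = "block_answer"
    WARN = "warn"
    OPEN_INCIDENT = "open_incident"
    REQUEST_OWNER_DECISION = "request_owner_decision"


# Hypothetical contract section: what the agent should do when each rule fails.
CONTRACT_RULES = {
    "unique(order_id)": [Action.BLOCK_ANSWER, Action.OPEN_INCIDENT],
    "null_rate(email)": [Action.WARN],
    "accepted_values(status)": [Action.REQUEST_OWNER_DECISION],
}


def plan_actions(failed_rules, contract_rules, default=(Action.WARN,)):
    """Return the distinct actions for the failed rules, in first-seen order.

    Rules the contract does not mention fall back to `default` (an assumption
    made for this sketch; a stricter policy could block instead).
    """
    planned = []
    for rule in failed_rules:
        for action in contract_rules.get(rule, default):
            if action not in planned:
                planned.append(action)
    return planned


actions = plan_actions(["unique(order_id)", "null_rate(email)"], CONTRACT_RULES)
print([a.value for a in actions])
```

Keeping the mapping in the contract rather than in agent code means the data owner, not the agent, decides how serious each failure is.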

## Source-Mapped Facts

- Great Expectations documentation describes running validations by validating Expectations against data and exploring the results. ([source](https://docs.greatexpectations.io/docs/core/run_validations/))
- dbt documentation describes data tests as assertions about models and other resources in a dbt project. ([source](https://docs.getdbt.com/docs/build/data-tests))
- SodaCL documentation describes Soda Checks Language as a YAML-based language for data quality checks. ([source](https://docs.soda.io/soda-documentation/soda-v3/soda-cl-overview))

## Further Reading

- [Great Expectations Run Validations](https://docs.greatexpectations.io/docs/core/run_validations/)
- [dbt Data Tests](https://docs.getdbt.com/docs/build/data-tests)
- [SodaCL Overview](https://docs.soda.io/soda-documentation/soda-v3/soda-cl-overview)