LLM Evaluation: Promptfoo Test Cases and Assertions

Status: public · Confidence: medium (0.685) · Basis: verified_sources

## TL;DR

Promptfoo test cases and assertions give agents a concrete way to inspect LLM evaluation inputs, expected behavior, and pass/fail criteria.

## Core Explanation

LLM evaluation needs more than a prompt and a score. When reviewing an evaluation, agents should identify the prompt template, provider configuration, test variables, expected outputs, assertion types and thresholds, fixtures, and whether the result is used as a CI gate. With these in hand, it becomes possible to tell a genuine model regression apart from a changed prompt, a changed provider, or brittle test data.
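
The checklist above can be sketched as a small audit over an already-parsed configuration. This is a minimal illustration, not part of Promptfoo: `audit_eval_config` is a hypothetical helper, the key names (`prompts`, `providers`, `tests`, `vars`, `assert`) are assumed to follow the general layout shown in the Promptfoo configuration guide, and the provider string is a placeholder.

```python
from typing import Any


def audit_eval_config(config: dict[str, Any]) -> list[str]:
    """Return human-readable gaps found in a parsed eval config.

    Hypothetical helper: key names are assumed to mirror a Promptfoo-style
    config (prompts, providers, tests -> vars / assert).
    """
    gaps: list[str] = []

    # Top-level pieces an agent needs before it can interpret any result.
    for key in ("prompts", "providers", "tests"):
        if not config.get(key):
            gaps.append(f"missing or empty '{key}'")

    # Per-test checks: inputs present, and at least one typed assertion.
    for index, test in enumerate(config.get("tests") or []):
        if not test.get("vars"):
            gaps.append(f"test {index}: no vars")
        assertions = test.get("assert") or []
        if not assertions:
            gaps.append(f"test {index}: no assertions")
        for assertion in assertions:
            if "type" not in assertion:
                gaps.append(f"test {index}: assertion without a type")
    return gaps


example_config = {
    "prompts": ["Summarize in one sentence: {{article}}"],
    "providers": ["openai:gpt-4o-mini"],  # placeholder provider id
    "tests": [
        {
            "vars": {"article": "The council approved the budget."},
            "assert": [{"type": "contains", "value": "budget"}],
        },
        # This case has inputs but no pass/fail criteria.
        {"vars": {"article": "Rain is expected on Friday."}},
    ],
}

print(audit_eval_config(example_config))
```

Here the second test case is flagged because it defines inputs without any assertion, which is the kind of gap that makes a later "pass" hard to interpret.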

Assertion-based evaluation is especially useful for developer workflows because it turns qualitative behavior into repeatable checks. The tradeoff is coverage: assertions only catch the failures they encode, so they should be reviewed against real failure modes and, when the task is subjective, supplemented with human review or statistical sampling.
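
As a rough sketch of that idea, the snippet below runs a few deterministic checks against model outputs, applies a suite-level pass-rate gate, and draws a reproducible sample for human review. It is not Promptfoo's implementation: the check names borrow common assertion styles, and `run_assertions`, `gate`, `sample_for_review`, and `min_pass_rate` are all hypothetical names introduced for illustration.

```python
import random
import re
from typing import Callable

# Hypothetical local checks, keyed by an assertion "type" string.
CHECKS: dict[str, Callable[[str, str], bool]] = {
    "contains": lambda output, value: value in output,
    "icontains": lambda output, value: value.lower() in output.lower(),
    "regex": lambda output, value: re.search(value, output) is not None,
}


def run_assertions(output: str, assertions: list[dict]) -> bool:
    """A case passes only if every one of its assertions passes."""
    return all(CHECKS[a["type"]](output, a["value"]) for a in assertions)


def gate(results: list[bool], min_pass_rate: float) -> bool:
    """CI-style gate: true when the share of passing cases meets the bar."""
    return bool(results) and sum(results) / len(results) >= min_pass_rate


def sample_for_review(case_ids: list[str], k: int, seed: int = 0) -> list[str]:
    """Pick a reproducible random subset of cases for human review."""
    rng = random.Random(seed)
    return rng.sample(case_ids, min(k, len(case_ids)))


outputs = {
    "case-1": "Refunds are processed within 5 business days.",
    "case-2": "I'm not sure about the refund policy.",
}
assertions = [
    {"type": "icontains", "value": "refund"},
    {"type": "regex", "value": r"\d+ business days"},
]

results = [run_assertions(text, assertions) for text in outputs.values()]
print(results, gate(results, min_pass_rate=0.9))
print(sample_for_review(list(outputs), k=1))
```

The second output mentions refunds but never states a timeframe, so it fails the regex check. Whether that is a real failure or an overly strict assertion is exactly the question that human review of sampled cases is meant to answer.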

## Source-Mapped Facts

- Promptfoo documentation describes its configuration guide as covering prompts, providers, test cases, assertions, and advanced features. ([source](https://www.promptfoo.dev/docs/configuration/guide/))
- Promptfoo configuration examples define tests with vars for prompt inputs. ([source](https://www.promptfoo.dev/docs/configuration/guide/))
- Promptfoo expected-output documentation describes assertions and metrics as the mechanism for validating LLM outputs. ([source](https://www.promptfoo.dev/docs/configuration/expected-outputs/))
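
To make the role of `vars` concrete, the sketch below fills a prompt template once per test case. It assumes double-brace placeholders such as `{{text}}` and uses a simple regular expression in place of a real template engine; `render_prompt` is a hypothetical helper, not a Promptfoo function.

```python
import re


def render_prompt(template: str, variables: dict[str, str]) -> str:
    """Replace {{ name }} placeholders with values from a test case's vars.

    Raises KeyError when the template needs a variable the test does not
    define, since a silently empty input would make results misleading.
    """

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"test case does not define var '{name}'")
        return variables[name]

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


template = "Translate to {{language}}: {{text}}"
tests = [
    {"vars": {"language": "French", "text": "Good morning"}},
    {"vars": {"language": "German", "text": "Good morning"}},
]

# One rendered prompt per test case; each would be sent to every provider.
for test in tests:
    print(render_prompt(template, test["vars"]))
```

Failing loudly on a missing variable is a design choice in this sketch: it surfaces brittle test data at render time rather than as a confusing model output.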

## Further Reading

- [Promptfoo Configuration Guide](https://www.promptfoo.dev/docs/configuration/guide/)
- [Promptfoo Expected Outputs Documentation](https://www.promptfoo.dev/docs/configuration/expected-outputs/)