Concept-Based Explainability: TCAV and Concept Bottleneck Models

Status: public · Confidence: medium (0.78) · Basis: verified_sources

## TL;DR
Concept-based explainability describes model behavior in terms of human-interpretable concepts rather than only raw features or saliency maps.

## Core Explanation
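Concept methods fall into three broad families. TCAV (Testing with Concept Activation Vectors) tests whether a trained model is sensitive to a user-defined concept: it learns a linear direction for the concept in a layer's activation space and measures how often moving along that direction increases a class score. Concept bottleneck models route predictions through a bottleneck: they first predict a set of human-specified concepts and then predict the label from those concepts, so a person can inspect or correct the intermediate concept values. ACE discovers visual concepts automatically by segmenting images, clustering the segments in activation space, and scoring the resulting clusters with TCAV. The quality of all three depends on how the concepts are defined, which datasets supply the concept examples, and how the explanations are evaluated.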
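
The sensitivity test can be illustrated with a minimal NumPy sketch. It is a toy under stated assumptions, not the reference TCAV implementation: the activations are synthetic, the model head `logit(a) = v · tanh(a)` is hypothetical, and the names `fit_cav` and `tcav_score` are introduced here for illustration only. The sketch fits a linear separator between concept and random activations, takes its unit normal as the concept activation vector (CAV), and reports the fraction of inputs whose class logit increases along that direction.

```python
import numpy as np


def fit_cav(concept_acts, random_acts, steps=500, lr=0.1):
    """Fit a logistic-regression separator; return its unit normal (the CAV)."""
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(concept)
        w -= lr * (X.T @ (p - y)) / len(y)      # gradient step on log loss
        b -= lr * np.mean(p - y)
    return w / np.linalg.norm(w)


def tcav_score(logit_grads, cav):
    """Fraction of inputs whose class logit increases along the CAV."""
    directional = logit_grads @ cav  # one directional derivative per input
    return float(np.mean(directional > 0))


# Synthetic layer activations (assumption): the concept shifts dimension 0.
rng = np.random.default_rng(0)
dim = 8
concept_acts = rng.normal(size=(100, dim))
concept_acts[:, 0] += 3.0
random_acts = rng.normal(size=(100, dim))
cav = fit_cav(concept_acts, random_acts)

# Hypothetical model head: logit(a) = v . tanh(a), so d logit / d a = v * (1 - tanh(a)^2).
v = np.array([1.5, -0.2, 0.1, 0.0, 0.3, -0.1, 0.0, 0.2])
class_acts = rng.normal(size=(200, dim))
logit_grads = v * (1.0 - np.tanh(class_acts) ** 2)

print("CAV component along the concept axis:", cav[0])
print("TCAV score:", tcav_score(logit_grads, cav))
```

A score near 1 suggests the class logit tends to increase with the concept, and a score near 0 suggests it tends to decrease. The TCAV paper additionally repeats the procedure with several random sets and applies a statistical test to filter out spurious directions, which this sketch omits.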

## Further Reading

- [Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors](https://arxiv.org/abs/1711.11279)
- [Concept Bottleneck Models](https://arxiv.org/abs/2007.04612)
- [Towards Automatic Concept-based Explanations (ACE)](https://arxiv.org/abs/1902.03129)