# State Space Models: S4, H3, and Mamba

Status: public · Confidence: medium (0.82) · Basis: verified_sources

## TL;DR

State space models (SSMs) are a family of sequence models that carry a compact hidden state forward through the sequence, updating it at each step. S4, H3, and Mamba matter because they explore alternatives to full self-attention for long sequences and language modeling.

## Core Explanation

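A state-space layer processes a sequence by updating a latent state at each step and computing each output from that state. This is computationally attractive for long inputs: the state has a fixed size, so the work per step does not grow with how much of the sequence has already been seen, whereas standard Transformer self-attention compares every pair of positions and scales quadratically with sequence length.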
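The state update described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the S4 parameterization: it assumes a scalar input per step, a diagonal state transition, and hypothetical names (`ssm_scan`, `a`, `b`, `c`) chosen only for this example.

```python
def ssm_scan(a, b, c, inputs):
    """Run a diagonal linear state-space recurrence over a 1-D sequence.

    Illustrative assumptions (not taken from any specific paper):
      a, b, c : lists of floats, one entry per state dimension
      inputs  : list of floats, one scalar per time step
    Returns one scalar output per time step.
    """
    state = [0.0] * len(a)  # fixed-size hidden state, independent of sequence length
    outputs = []
    for u in inputs:
        # State update: x_k = a * x_{k-1} + b * u_k (elementwise, since the transition is diagonal)
        state = [a_i * x_i + b_i * u for a_i, x_i, b_i in zip(a, state, b)]
        # Readout: y_k = sum_i c_i * x_k[i]
        outputs.append(sum(c_i * x_i for c_i, x_i in zip(c, state)))
    return outputs


# A single impulse decays geometrically through a one-dimensional state.
print(ssm_scan(a=[0.5], b=[1.0], c=[1.0], inputs=[1.0, 0.0, 0.0]))  # [1.0, 0.5, 0.25]
```

The loop visits each position once and touches only the fixed-size state, so the total work grows linearly with the sequence length.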

S4 showed that structured state-space parameterizations could model long-range dependencies effectively. H3 investigated what SSMs need for language modeling and highlighted two abilities in particular: recalling earlier tokens and comparing tokens across the sequence. Mamba then made state-space parameters input-dependent, so the model can decide from the content of each token what to keep in its state and what to let decay, while retaining a linear-time, scan-style computation.
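To make the idea of input-dependent parameters concrete, here is a deliberately simplified sketch of a selective scan in plain Python. It is not Mamba's implementation: the scalar input, the linear projections `w_delta`, `w_b`, `w_c`, and the function name `selective_scan` are all assumptions made for illustration. The only point it aims to show is that the step size and the input and output maps are recomputed from each token, while the loop remains a single linear pass.

```python
import math


def softplus(z):
    # Numerically stable softplus; keeps the step size strictly positive.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def selective_scan(inputs, a, w_delta, w_b, w_c):
    """Toy selective state-space scan over a 1-D sequence.

    Illustrative assumptions (hypothetical, for this sketch only):
      inputs  : list of floats, one scalar per time step
      a       : list of negative floats, one decay rate per state dimension
      w_delta : float, maps the input to a step size
      w_b     : list of floats, maps the input to the input projection
      w_c     : list of floats, maps the input to the output projection
    """
    state = [0.0] * len(a)
    outputs = []
    for u in inputs:
        delta = softplus(w_delta * u)   # input-dependent step size
        b = [w * u for w in w_b]        # input-dependent input map
        c = [w * u for w in w_c]        # input-dependent output map
        # exp(delta * a_i) lies in (0, 1) for negative a_i: a larger step
        # forgets more of the old state and writes more of the new input.
        state = [
            math.exp(delta * a_i) * x_i + delta * b_i * u
            for a_i, x_i, b_i in zip(a, state, b)
        ]
        outputs.append(sum(c_i * x_i for c_i, x_i in zip(c, state)))
    return outputs


ys = selective_scan([1.0, -0.5, 2.0], a=[-1.0, -0.1],
                    w_delta=0.5, w_b=[1.0, 0.3], w_c=[0.7, 1.0])
print(len(ys))  # one output per input position
```

Because the per-step parameters depend on the current input, two tokens at the same position can affect the state very differently, which is the content-selective behavior described above.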

## Related Articles

- [Long-Context Language Models: Memory, Retrieval, and Evaluation](../long-context-models.md)
- [Transformer Architecture Variants](../transformer-architecture-variants.md)
- [Attention Mechanism: Query-Key-Value and Contextual Representation](../attention-mechanism.md)