# Recurrent Neural Networks (RNN)

Status: public · Confidence: medium (0.78) · Basis: verified_sources

## TL;DR

Recurrent neural networks process sequences by updating a hidden state over time. LSTM and GRU architectures add gates to make recurrent models easier to train on longer dependencies.
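
As a rough illustration of the hidden-state update, the following Python sketch runs a scalar vanilla RNN over a short sequence. The function name `rnn_forward`, the scalar weights `w_x`, `w_h`, `b`, and the tanh nonlinearity are assumptions chosen for illustration; real models use weight matrices and vector-valued states.

```python
import math


def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Run a scalar vanilla RNN over a sequence and return every hidden state.

    All parameters are hypothetical scalars used only for illustration.
    """
    h = h0
    states = []
    for x in inputs:
        # The new state depends on the current input and the previous state.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


# A single non-zero input followed by zeros: the later states are driven
# only by what the hidden state carried forward from the first step.
states = rnn_forward([1.0, 0.0, 0.0], w_x=0.5, w_h=0.8, b=0.0)
print(states)
```

The same weights are reused at every step; only the hidden state changes as the sequence is consumed.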

## Core Explanation

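Vanilla RNNs can struggle with vanishing or exploding gradients over long sequences: backpropagation through time multiplies the gradient by one recurrent Jacobian per step, so it tends to shrink or grow roughly exponentially with sequence length. LSTMs address this with a separate cell state whose contents are regulated by input, forget, and output gates. GRUs simplify the gated design to two gates (update and reset) and a single hidden state. Transformers later showed that attention-only sequence models could replace recurrence for many machine-translation and NLP tasks, though recurrent designs remain useful in some streaming and resource-constrained settings.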
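
The gating idea can be sketched with a single scalar GRU step. This is a minimal sketch, not a reference implementation: the function `gru_step`, the parameter dictionary keys, and all numeric values are hypothetical, and the convention for the update gate (here `z` weights the candidate state) varies between papers and libraries.

```python
import math


def sigmoid(a):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))


def gru_step(x, h_prev, p):
    """One scalar GRU step; p is a dict of hypothetical weights and biases."""
    # Update gate: how much of the old state to overwrite.
    z = sigmoid(p["w_z"] * x + p["u_z"] * h_prev + p["b_z"])
    # Reset gate: how much of the old state feeds the candidate.
    r = sigmoid(p["w_r"] * x + p["u_r"] * h_prev + p["b_r"])
    # Candidate state built from the input and the reset-scaled old state.
    h_cand = math.tanh(p["w_h"] * x + p["u_h"] * (r * h_prev) + p["b_h"])
    # Blend: z near 0 keeps the old state, z near 1 takes the candidate.
    return (1.0 - z) * h_prev + z * h_cand


# Hypothetical parameters; b_z = -20 pushes the update gate toward 0 ("keep").
params_keep = {
    "w_z": 1.0, "u_z": 1.0, "b_z": -20.0,
    "w_r": 1.0, "u_r": 1.0, "b_r": 0.0,
    "w_h": 1.0, "u_h": 1.0, "b_h": 0.0,
}
# Same weights, but b_z = +20 pushes the gate toward 1 ("overwrite").
params_write = dict(params_keep, b_z=20.0)

h_keep = h_write = 0.7
for x in [1.0, -2.0, 0.5]:
    h_keep = gru_step(x, h_keep, params_keep)
    h_write = gru_step(x, h_write, params_write)

print(h_keep)   # stays very close to the initial 0.7
print(h_write)  # follows the candidate computed from recent inputs
```

With a strongly negative update-gate bias the state is carried through almost unchanged, which is the behaviour that lets gated cells retain information across many steps. In a trained network the gate values are learned and depend on the input rather than being pinned by a bias.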

## Further Reading

- [Long Short-Term Memory](https://doi.org/10.1162/neco.1997.9.8.1735)
- [Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation](https://arxiv.org/abs/1406.1078)
- [Attention Is All You Need](https://arxiv.org/abs/1706.03762)

## Related Articles

- [Activation Functions in Neural Networks](../activation-functions.md)
- [AI for Fraud Detection: Graph Neural Networks, Anti-Money Laundering, and Financial Crime](../ai-for-fraud-detection.md)
- [Convolutional Neural Networks (CNN)](../convolutional-neural-networks-cnn.md)