## TL;DR
Self-supervised learning (SSL) derives its supervisory signal from the unlabeled data itself, so models can learn useful representations without expensive human annotation. SSL underpins the pretraining of modern foundation models.
## Core Explanation
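SSL trains on pretext tasks whose labels come from the data itself. Three families are common: masked language modeling, which predicts masked words from their context (BERT); contrastive learning, which pulls representations of similar instances (typically augmented views of the same input) together and pushes dissimilar ones apart (SimCLR, MoCo); and generative approaches, which reconstruct a corrupted input (masked autoencoders). The learned representations transfer to downstream tasks with little labeled data.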
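
To make the contrastive objective concrete, here is a minimal sketch of an InfoNCE-style loss in plain Python. It is a one-directional simplification, not the exact loss used by SimCLR or MoCo. The function names (`cosine`, `info_nce`), the temperature of 0.1, and the toy embeddings are all assumptions chosen for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch (illustrative, one direction only).

    anchors[i] should match positives[i]; every positives[j] with j != i
    acts as a negative for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the true partner.
        total += log_denom - logits[i]
    return total / len(anchors)

# Toy embeddings (made-up numbers): two "views" of two instances.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9]]

aligned = info_nce(view_a, view_b)         # each anchor paired with its own view
shuffled = info_nce(view_a, view_b[::-1])  # partners deliberately swapped
```

The loss is low when each anchor is closest to its own partner and high when the pairing is wrong, which is the "pull together, push apart" behavior described above.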
## Detailed Analysis
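BYOL (Grill et al., 2020) removed the need for negative pairs: an online network with a predictor head is trained to match the output of a target network whose weights are an exponential moving average of the online weights (a momentum encoder). In practice this avoids representational collapse without a contrastive loss. MAE (He et al., 2022) masks 75% of image patches and reconstructs the missing pixels from the remaining 25%, so the model cannot rely on simple local interpolation and has to capture higher-level structure in the image.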
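
The two mechanisms named above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not either paper's implementation: `ema_update` and `random_patch_mask` are hypothetical helper names, weights are flat lists of floats, the momentum value is illustrative, and 196 patches assumes a 14x14 patch grid.

```python
import random

def ema_update(target, online, tau=0.99):
    """Momentum-encoder update: target <- tau * target + (1 - tau) * online.

    The target weights trail the online weights and receive no gradients.
    """
    return [tau * t + (1.0 - tau) * o for t, o in zip(target, online)]

def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into (visible, masked) for MAE-style masking."""
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)  # uniform random choice of which patches to hide
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked

# Illustrative values only: a two-weight "network" and a 14x14 patch grid.
target = ema_update([0.0, 1.0], [1.0, 0.0], tau=0.9)
visible, masked = random_patch_mask(196, mask_ratio=0.75, seed=0)
```

With a 75% ratio on 196 patches, 147 patches are hidden and only 49 remain visible as input, which is why reconstruction requires more than copying nearby pixels.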
## Further Reading
- LeCun: Self-Supervised Learning (AAAI 2020 Keynote)
- Lilian Weng: Self-Supervised Representation Learning
- Papers With Code: Self-Supervised Learning