Model Merging and Ensembling

Status: public · Confidence: medium (0.82) · Basis: verified_sources

## TL;DR

Model merging and ensembling combine multiple trained models or checkpoints to improve accuracy, robustness, or uncertainty estimates.

## Core Explanation

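Weight-space methods combine parameters directly, producing a single model whose inference cost matches that of one member. Prediction-space ensembles keep the models separate and combine their outputs at inference time, so inference cost grows with the number of members.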
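
To make the distinction concrete, here is a minimal sketch in plain Python. It assumes a toy linear classifier whose parameters live in a dictionary of flat lists. The names `average_weights`, `predict`, and `ensemble_predict` are hypothetical helpers invented for this illustration and are not taken from any library or from the cited papers.

```python
import math

def average_weights(state_dicts):
    """Weight-space merge: element-wise mean of each named parameter."""
    n = len(state_dicts)
    first = state_dicts[0]
    return {
        name: [sum(sd[name][i] for sd in state_dicts) / n
               for i in range(len(first[name]))]
        for name in first
    }

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(state_dict, x):
    """Toy linear classifier; 'weight' is stored row-major, one row per class."""
    w, b = state_dict["weight"], state_dict["bias"]
    d = len(x)
    logits = [sum(w[c * d + j] * x[j] for j in range(d)) + b[c]
              for c in range(len(b))]
    return softmax(logits)

def ensemble_predict(state_dicts, x):
    """Prediction-space ensemble: run every member, then average probabilities."""
    probs = [predict(sd, x) for sd in state_dicts]
    n = len(probs)
    return [sum(p[c] for p in probs) / n for c in range(len(probs[0]))]

# Two hypothetical fine-tuned checkpoints with the same architecture.
model_a = {"weight": [1.0, 0.0, 0.0, 1.0], "bias": [0.0, 0.0]}
model_b = {"weight": [3.0, 0.0, 0.0, 1.0], "bias": [0.0, 0.0]}
x = [1.0, 0.0]

merged = average_weights([model_a, model_b])         # one model, one forward pass
merged_probs = predict(merged, x)
ensemble_probs = ensemble_predict([model_a, model_b], x)  # two forward passes
```

The merged model needs a single forward pass, while the ensemble needs one per member. The two approaches generally give different probabilities because the softmax is nonlinear. Averaging weights also only makes sense when the checkpoints share an architecture and parameter layout.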

## Source-Mapped Facts

- Model Soups reports that averaging the weights of multiple models fine-tuned with different hyperparameter configurations often improves accuracy without increasing inference time. ([source](https://arxiv.org/abs/2203.05482))
- TIES-Merging proposes trimming small parameter changes, resolving sign conflicts, and merging parameters aligned with the agreed sign. ([source](https://arxiv.org/abs/2306.01708))
- Deep ensembles train multiple neural networks and combine their predictions to produce strong predictive uncertainty estimates. ([source](https://arxiv.org/abs/1612.01474))
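
The trim, sign-election, and merge steps described for TIES-Merging can be sketched as follows. This is an illustrative simplification, not the authors' implementation: it assumes each model is a flat list of floats, and the function name `ties_merge` and the parameters `keep_fraction` and `scale` are hypothetical names chosen for this example.

```python
def ties_merge(base, finetuned, keep_fraction=0.2, scale=1.0):
    """Sketch of a TIES-style merge over flat parameter lists.

    base:          parameters of the shared starting checkpoint
    finetuned:     list of fine-tuned parameter lists, same length as base
    keep_fraction: share of largest-magnitude changes kept per model (assumed knob)
    scale:         multiplier applied to the merged change (assumed knob)
    """
    n = len(base)
    # Task vectors: how far each fine-tuned model moved from the base.
    task_vectors = [[ft[i] - base[i] for i in range(n)] for ft in finetuned]

    # Step 1 (trim): zero out all but the largest-magnitude changes.
    # Ties at the threshold are all kept, so slightly more than k may survive.
    k = max(1, round(keep_fraction * n))
    trimmed = []
    for tv in task_vectors:
        threshold = sorted((abs(v) for v in tv), reverse=True)[k - 1]
        trimmed.append([v if abs(v) >= threshold else 0.0 for v in tv])

    merged = list(base)
    for i in range(n):
        # Step 2 (elect sign): the sign of the summed trimmed changes wins.
        total = sum(tv[i] for tv in trimmed)
        if total == 0:
            continue  # no surviving change, or an exact cancellation
        sign = 1.0 if total > 0 else -1.0
        # Step 3 (merge): average only the changes that agree with that sign.
        agreeing = [tv[i] for tv in trimmed if tv[i] * sign > 0]
        merged[i] = base[i] + scale * sum(agreeing) / len(agreeing)
    return merged

base = [0.0, 0.0, 0.0, 0.0, 0.0]
ft_a = [1.0, 0.1, -2.0, 0.0, 0.5]
ft_b = [-0.6, 0.05, -1.0, 3.0, 0.2]
result = ties_merge(base, [ft_a, ft_b], keep_fraction=0.6)
print(result)  # [1.0, 0.0, -1.5, 3.0, 0.5]
```

In this toy run the first parameter shows the sign-conflict case: the two models moved in opposite directions, the positive direction wins the vote, and only the agreeing change is kept. The second parameter is trimmed away in both models and stays at its base value.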

## Further Reading

- [Model Soups: Averaging Weights of Multiple Fine-Tuned Models Improves Accuracy Without Increasing Inference Time](https://arxiv.org/abs/2203.05482)
- [TIES-Merging: Resolving Interference When Merging Models](https://arxiv.org/abs/2306.01708)
- [Simple and Scalable Predictive Uncertainty Estimation Using Deep Ensembles](https://arxiv.org/abs/1612.01474)