Parameter-Efficient Fine-Tuning for Language Models (PEFT)
Status: public · Confidence: medium (0.855) · Basis: verified_sources
## TL;DR

Parameter-efficient fine-tuning, or PEFT, adapts large pretrained language models by training a small set of extra or low-rank parameters while leaving most base-model weights fixed. This makes adaptation cheaper than full fine-tuning and allows multiple task adapters to share one base model.

## Core Claims

Adapters insert small trainable modules into a pretrained network. The base model can remain mostly frozen, while each task gets its own lightweight parameters.

Prefix-tuning moves adaptation into continuous prompt-like vectors. Instead of updating the whole model, training learns vectors that condition generation for a task.

LoRA and QLoRA are widely cited PEFT variants. LoRA trains low-rank update matrices for selected weights; QLoRA combines low-rank adapters with quantized base-model weights to reduce memory use during fine-tuning.

## Citation Boundaries

Use this article for stable PEFT concepts. Do not use it to choose a current fine-tuning framework, claim consumer-hardware feasibility for a specific model, or compare current commercial fine-tuning products.

## Further Reading

- [Parameter-Efficient Transfer Learning for NLP](https://arxiv.org/abs/1902.00751)
- [Prefix-Tuning: Optimizing Continuous Prompts for Generation](https://arxiv.org/abs/2101.00190)
- [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)
- [QLoRA: Efficient Finetuning of Quantized LLMs](https://arxiv.org/abs/2305.14314)
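
## Worked Example

The LoRA claim under Core Claims (a frozen weight plus a trainable low-rank update) can be made concrete with a small sketch. It is a minimal illustration in plain Python, not a training recipe. The layer sizes, the rank `r`, the scale `alpha`, and the helper names `matvec`, `lora_forward`, and `trainable_counts` are all assumptions made for this example. The zero initialization of `B` and the `alpha / r` scaling follow the convention described in the LoRA paper and are not stated in this article.

```python
import random

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha, r):
    """Compute W x + (alpha / r) * B (A x), treating W as frozen."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))  # low-rank path: d_in -> r -> d_out
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

def trainable_counts(d_out, d_in, r):
    """Return (full fine-tuning params, LoRA params) for one weight matrix."""
    return d_out * d_in, r * (d_in + d_out)

rng = random.Random(0)
d_out, d_in, r, alpha = 6, 8, 2, 4.0  # hypothetical sizes, chosen for illustration

W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen base weight
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable, small random init
B = [[0.0] * r for _ in range(d_out)]                               # trainable, zero init
x = [rng.gauss(0, 1) for _ in range(d_in)]

y_base = matvec(W, x)
y_lora = lora_forward(W, A, B, x, alpha, r)
full, lora = trainable_counts(d_out, d_in, r)

print(y_lora == y_base)  # B is zero, so the adapter starts as a no-op
print(full, lora)        # trainable parameters: full update vs. low-rank update
```

Because `B` starts at zero, the adapted layer reproduces the base layer exactly until training changes `A` and `B`. The saving grows with layer size: the full update needs `d_out * d_in` parameters, while the low-rank update needs `r * (d_in + d_out)`, which is far smaller when `r` is much less than either dimension.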