Parameter-Efficient Fine-Tuning for Language Models (PEFT)

Status: public · Confidence: medium (0.855) · Basis: verified_sources

## TL;DR

Parameter-efficient fine-tuning, or PEFT, adapts large pretrained language models by training a small set of extra or low-rank parameters while leaving most base-model weights fixed. This makes adaptation cheaper than full fine-tuning and allows multiple task adapters to share one base model.

## Core Claims

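Adapters insert small trainable bottleneck modules into the layers of a pretrained network. The base model stays mostly frozen while each task gets its own lightweight adapter parameters, so a new task can be added without retraining or disturbing earlier ones.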
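As a rough illustration of the adapter idea, the sketch below applies a bottleneck module with a residual connection to one hidden vector. The dimensions, the ReLU nonlinearity, the exact zero initialization of the up-projection, and the omission of biases are simplifying assumptions made here for clarity, not details taken from this article.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def adapter(h, w_down, w_up):
    """Bottleneck adapter with a residual connection: h + W_up @ relu(W_down @ h)."""
    z = [max(0.0, a) for a in matvec(w_down, h)]   # project down, apply ReLU
    delta = matvec(w_up, z)                        # project back up
    return [x + d for x, d in zip(h, delta)]       # residual keeps the base signal

random.seed(0)
d_model, d_bottleneck = 8, 2   # toy sizes, chosen only for illustration

# Trainable adapter weights. The up-projection starts at zero (an assumption),
# so the adapter initially behaves as the identity function.
w_down = [[random.gauss(0, 0.1) for _ in range(d_model)] for _ in range(d_bottleneck)]
w_up = [[0.0] * d_bottleneck for _ in range(d_model)]

h = [random.gauss(0, 1) for _ in range(d_model)]   # stand-in for a frozen layer's output
out = adapter(h, w_down, w_up)

# Trainable parameters per adapter (biases omitted in this sketch).
adapter_params = 2 * d_model * d_bottleneck
```

Because the up-projection starts at zero, `out` equals `h` before any training, so inserting the module does not change the pretrained network's behavior at the start of fine-tuning.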

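Prefix-tuning moves adaptation into continuous prompt-like vectors. Instead of updating the model's weights, training learns a short sequence of task-specific vectors (the prefix) that later tokens attend to as if they were virtual tokens, which conditions generation for the task.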
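One way to picture this is a single attention step in which trainable prefix keys and values are placed in front of the keys and values computed from real tokens. The sketch below uses toy numbers and a single head; `prefix_keys`, `prefix_values`, and the helper names are hypothetical, and a real model would repeat this in every layer.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return output, weights

# Keys and values produced by the frozen base model for two real tokens (toy numbers).
token_keys = [[1.0, 0.0], [0.0, 1.0]]
token_values = [[1.0, 0.0], [0.0, 1.0]]

# The only trainable parameters in this sketch: one prefix position.
prefix_keys = [[2.0, 2.0]]
prefix_values = [[5.0, 5.0]]

query = [1.0, 1.0]
base_out, base_weights = attend(query, token_keys, token_values)
prefix_out, prefix_weights = attend(
    query, prefix_keys + token_keys, prefix_values + token_values
)
```

The query now distributes attention over three positions instead of two, so the learned prefix shifts the output even though no base-model weight has changed.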

LoRA and QLoRA are widely cited PEFT variants. LoRA freezes the pretrained weights and trains a pair of low-rank matrices whose product is added to selected weight matrices; QLoRA trains the same kind of low-rank adapters on top of a frozen, 4-bit quantized base model to reduce memory use during fine-tuning.
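The low-rank update can be sketched in a few lines. The example below forms an effective weight `W + (alpha / r) * B @ A` from a frozen `W` and a small adapter pair, then compares parameter counts for one weight matrix. The matrix sizes, the rank, and the "trained" values of `B` are illustrative assumptions; quantization of the base weights, as in QLoRA, is not modeled here.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_weight(w, a, b, alpha, r):
    """Return W + (alpha / r) * B @ A without modifying the frozen W."""
    delta = matmul(b, a)
    scale = alpha / r
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

d_out, d_in, r, alpha = 4, 3, 1, 2.0   # toy sizes, chosen only for illustration
w = [[float(i + j) for j in range(d_in)] for i in range(d_out)]   # frozen base weight

a_task = [[0.1, 0.2, 0.3]]                # A: r x d_in, trainable
b_zero = [[0.0] for _ in range(d_out)]    # B: d_out x r, zero at initialization
w_init = lora_weight(w, a_task, b_zero, alpha, r)

# Hypothetical values of B after some training; W is still untouched.
b_trained = [[1.0], [0.0], [0.0], [0.0]]
w_task = lora_weight(w, a_task, b_trained, alpha, r)

def param_counts(d_out, d_in, r):
    """Parameters in the full matrix versus in the low-rank pair (B and A)."""
    return d_out * d_in, r * (d_out + d_in)

full, lora = param_counts(4096, 4096, 8)
```

With `B` at zero the effective weight equals the base weight, so training starts from the pretrained behavior. For a 4096 × 4096 matrix at rank 8, the adapter pair holds 65,536 parameters against 16,777,216 in the full matrix, and several such pairs can be stored alongside one shared copy of `W`.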

## Citation Boundaries

Use this article for stable PEFT concepts. Do not use it to choose a current fine-tuning framework, claim consumer-hardware feasibility for a specific model, or compare current commercial fine-tuning products.

## Further Reading

- [Parameter-Efficient Transfer Learning for NLP](https://arxiv.org/abs/1902.00751)
- [Prefix-Tuning: Optimizing Continuous Prompts for Generation](https://arxiv.org/abs/2101.00190)
- [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)
- [QLoRA: Efficient Finetuning of Quantized LLMs](https://arxiv.org/abs/2305.14314)