PEFT


State-of-the-art parameter-efficient fine-tuning methods for large language models, enabling adapter-based training with minimal GPU resources.

Author: Hugging Face
Open Sourced: 2022-11-25
Last Commit: Unknown

Overview

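PEFT (Parameter-Efficient Fine-Tuning) is Hugging Face's library for adapting large pretrained models by training only a small fraction of their parameters while the base weights stay frozen. It implements LoRA, prefix tuning, prompt tuning, and other adapter methods, and supports QLoRA-style training, in which LoRA adapters are trained on top of a quantized base model. This makes fine-tuning large models feasible on consumer GPUs with far less memory than full fine-tuning.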

Key Features

LoRA, QLoRA, AdaLoRA, and IA3 adapter methods
Prefix tuning, prompt tuning, and P-tuning v2
Seamless integration with Hugging Face Transformers and Accelerate
Adapter merging, mixing, and loading utilities
Support for saving and sharing adapters on the Hugging Face Hub
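
As a rough illustration of the LoRA, saving, and merging features listed above, the sketch below wraps an already loaded Transformers model with a LoRA adapter. The hyperparameter values and the target module names (q_proj, v_proj) are assumptions chosen for illustration and depend on the model architecture; the helper function names are hypothetical, and the peft imports are deferred so the snippet can be read without the library installed.

```python
# Keyword arguments for peft.LoraConfig. Values are illustrative, not tuned.
LORA_KWARGS = {
    "r": 8,                                  # rank of the low-rank update
    "lora_alpha": 16,                        # scaling factor applied to the update
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "v_proj"],  # assumption: names vary by architecture
    "bias": "none",
    "task_type": "CAUSAL_LM",
}


def add_lora_adapter(base_model):
    """Wrap a loaded Transformers model so that only LoRA weights are trainable."""
    from peft import LoraConfig, get_peft_model  # requires `pip install peft`

    peft_model = get_peft_model(base_model, LoraConfig(**LORA_KWARGS))
    peft_model.print_trainable_parameters()  # reports trainable vs. total counts
    return peft_model


def save_adapter(peft_model, output_dir):
    """Write only the adapter weights and config, not the base model."""
    peft_model.save_pretrained(output_dir)


def load_and_merge(base_model, adapter_dir):
    """Reattach a saved adapter and fold it into the base weights."""
    from peft import PeftModel

    model = PeftModel.from_pretrained(base_model, adapter_dir)
    return model.merge_and_unload()
```

A typical flow under these assumptions would be to load a causal language model with Transformers, pass it to add_lora_adapter, train as usual, and call save_adapter on the result.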

Use Cases

Fine-tuning LLMs on single consumer GPUs via QLoRA
Creating domain-specific adapters without full model training
Multi-task adaptation by combining multiple LoRA adapters
Rapid experimentation with different fine-tuning strategies
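
The single-GPU QLoRA use case might look roughly like the following sketch, which loads a base model in 4-bit precision through bitsandbytes and attaches a LoRA adapter. The model_id argument is a placeholder for whichever checkpoint is being adapted, the quantization settings are common choices rather than recommendations from this page, and the function name is hypothetical. Running it needs torch, transformers, peft, bitsandbytes, and a supported GPU.

```python
# Settings forwarded to transformers.BitsAndBytesConfig. Illustrative values.
BNB_4BIT_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
}


def load_4bit_with_lora(model_id, lora_kwargs):
    """Load `model_id` in 4-bit precision and attach a trainable LoRA adapter.

    `lora_kwargs` is a dict of peft.LoraConfig arguments, for example
    {"r": 8, "lora_alpha": 16, "task_type": "CAUSAL_LM"}.
    """
    # Heavy dependencies are imported lazily, only when the function is called.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    quant_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16, **BNB_4BIT_KWARGS
    )
    base = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=quant_config, device_map="auto"
    )
    # Prepares the quantized model for training (e.g. casts some layers to fp32).
    base = prepare_model_for_kbit_training(base)
    return get_peft_model(base, LoraConfig(**lora_kwargs))
```

The quantized base weights stay frozen; only the adapter weights added by get_peft_model receive gradients.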

Technical Details

Works with most Hugging Face Transformers models; target modules can be specified manually for architectures without built-in defaults
Reduces trainable parameters by more than 90% compared to full fine-tuning; typical LoRA setups train under 1% of the model's weights
Supports 8-bit and 4-bit quantization via bitsandbytes integration
Adapters are typically 10-100 MB, so they are easy to share and version
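
The parameter and size figures above can be sanity-checked with simple arithmetic. A LoRA adapter on a linear layer with input width d_in and output width d_out adds two matrices, A (r x d_in) and B (d_out x r), for r * (d_in + d_out) parameters. The sketch below applies this to a hypothetical 32-layer model with hidden size 4096, rank 8, two adapted projections per layer, and a round 7 billion base parameters. All of these numbers are illustrative assumptions, not measurements of a specific model.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Parameters in one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Illustrative assumptions, not taken from any specific model card.
HIDDEN = 4096                      # width of each adapted projection
NUM_LAYERS = 32
ADAPTED_PER_LAYER = 2              # e.g. query and value projections
RANK = 8
TOTAL_BASE_PARAMS = 7_000_000_000  # round figure for the frozen base model

adapter_params = (
    NUM_LAYERS * ADAPTED_PER_LAYER * lora_param_count(HIDDEN, HIDDEN, RANK)
)
trainable_fraction = adapter_params / TOTAL_BASE_PARAMS
adapter_bytes_fp32 = adapter_params * 4  # 4 bytes per fp32 weight
adapter_mib_fp32 = adapter_bytes_fp32 / (1024 ** 2)

print(f"adapter parameters:  {adapter_params:,}")
print(f"trainable fraction:  {trainable_fraction:.4%}")
print(f"adapter size (fp32): {adapter_mib_fp32:.1f} MiB")
```

Under these assumptions the adapter has about 4.2 million parameters, roughly 0.06% of the base model, and occupies 16 MiB in fp32. Higher ranks or more adapted modules scale the count linearly.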