LoRA and PEFT

🟒 Free Lesson

Figure: LoRA architecture. A standard layer holds a d x d weight matrix W (100% of the parameters). A LoRA layer keeps W frozen and adds the low-rank matrices A (r x d) and B (d x r), which account for about 0.1-1% of the parameters. Benefits: memory efficient, fast training, multiple adapters, easy deployment.

What is LoRA?

Low-Rank Adaptation (LoRA) is a parameter-efficient fine-tuning (PEFT) method. It freezes the pretrained weights and injects small trainable rank-decomposition matrices into transformer layers, so only a small fraction of the model's parameters (typically around 0.1-1%) is trained.
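As a sketch of what "rank decomposition" means here, the adapted layer can be written as follows. The symbols $W_0$, $d_{\text{in}}$, $d_{\text{out}}$, $r$ and $\alpha$ are notation assumed for illustration; they are chosen to line up with the `rank` and `alpha` arguments used in the code later in this lesson.

```latex
h = W_0 x + \frac{\alpha}{r}\, B A x,
\qquad
A \in \mathbb{R}^{r \times d_{\text{in}}},\quad
B \in \mathbb{R}^{d_{\text{out}} \times r},\quad
r \ll \min(d_{\text{in}}, d_{\text{out}})

\text{trainable parameters: } r\,(d_{\text{in}} + d_{\text{out}})
\quad\text{versus}\quad
d_{\text{in}}\, d_{\text{out}} \text{ for the full matrix } W_0
```

Only $A$ and $B$ receive gradients; $W_0$ stays fixed, which is where the parameter savings come from.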

How LoRA Works

import torch
import torch.nn as nn

class LoRALayer(nn.Module):
    def __init__(self, in_features, out_features, rank=8, alpha=16):
        super().__init__()
        self.rank = rank
        self.alpha = alpha

        # Original weight, frozen during fine-tuning
        self.W = nn.Linear(in_features, out_features, bias=False)
        self.W.weight.requires_grad = False

        # LoRA trainable matrices: A projects down to `rank`, B projects back up
        self.A = nn.Linear(in_features, rank, bias=False)
        self.B = nn.Linear(rank, out_features, bias=False)
        # Zero-initialize B so the adapter contributes nothing at the start of training
        nn.init.zeros_(self.B.weight)

        # Scaling factor applied to the low-rank update
        self.scaling = alpha / rank

    def forward(self, x):
        # Frozen output + scaled low-rank update
        return self.W(x) + self.B(self.A(x)) * self.scaling

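# Usage: count parameters from the layer itself instead of hardcoding the sizes
layer = LoRALayer(768, 768, rank=8)
frozen = sum(p.numel() for p in layer.parameters() if not p.requires_grad)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"Original params: {frozen:,}")
print(f"LoRA params: {trainable:,}")
print(f"Reduction: {(1 - trainable / frozen) * 100:.1f}%")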
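To see how the choice of rank drives the savings, the small sketch below tabulates the trainable-versus-frozen parameter counts for a single 768 x 768 layer at a few ranks. The helper name `lora_param_counts` and the list of ranks are illustrative assumptions, and the count covers only the two low-rank matrices (no biases).

```python
def lora_param_counts(d_in, d_out, rank):
    """Return (full_matrix_params, lora_params) for one linear layer."""
    full = d_in * d_out              # frozen weight matrix W
    lora = rank * (d_in + d_out)     # A is rank x d_in, B is d_out x rank
    return full, lora


# Illustrative sweep over a few ranks for a 768 x 768 projection
for rank in (4, 8, 16, 64):
    full, lora = lora_param_counts(768, 768, rank)
    print(f"rank={rank:>2}: {lora:>7,} trainable vs {full:,} frozen ({lora / full:.2%})")
```

The trainable count grows linearly with the rank, so doubling the rank doubles the adapter size while the frozen matrix stays the same.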

Using the PEFT Library

from peft import LoraConfig, get_peft_model, TaskType
from transformers import AutoModelForCausalLM

def setup_lora(model_name, rank=8, alpha=16):
    model = AutoModelForCausalLM.from_pretrained(model_name)

    lora_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=rank,
        lora_alpha=alpha,
        lora_dropout=0.1,
        target_modules=["q_proj", "v_proj", "k_proj", "o_proj"]
    )

    peft_model = get_peft_model(model, lora_config)
    peft_model.print_trainable_parameters()

    return peft_model

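# Example usage (the Llama 2 weights are gated; request access on the Hugging Face Hub first)
model = setup_lora("meta-llama/Llama-2-7b-hf", rank=16)
# With rank=16 on four 4096 x 4096 projections in each of 32 layers, expect
# 32 * 4 * 2 * 4096 * 16 = 16,777,216 trainable params (about 0.25% of all params)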
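The trainable-parameter count reported by `print_trainable_parameters()` can be sanity-checked by hand. The sketch below does so under stated assumptions: the function name `lora_trainable_params` is hypothetical, and the shapes (32 layers, hidden size 4096, square attention projections) are the commonly cited Llama-2-7B dimensions rather than values read from the model config. Two configurations are shown: the four-projection, rank-16 setup used above, and a smaller rank-8 setup on two projections for comparison.

```python
def lora_trainable_params(hidden_size, num_layers, num_target_modules, rank):
    """Count LoRA parameters when every targeted projection is hidden_size x hidden_size."""
    per_module = rank * (hidden_size + hidden_size)  # A is r x d, B is d x r
    return num_layers * num_target_modules * per_module


# Assumed shapes: 32 layers, hidden size 4096, square q/k/v/o projections
four_proj_r16 = lora_trainable_params(4096, 32, num_target_modules=4, rank=16)
two_proj_r8 = lora_trainable_params(4096, 32, num_target_modules=2, rank=8)

print(f"q/k/v/o, rank 16: {four_proj_r16:,}")
print(f"two projections, rank 8: {two_proj_r8:,}")
```

If the number printed by the library differs from this kind of estimate, the usual causes are a different `target_modules` list or projections that are not square.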

LoRA Variants

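Variant | Description
QLoRA | LoRA adapters trained on top of a quantized, frozen base model
AdaLoRA | Adaptive rank allocation: the rank budget is distributed across weight matrices during training
DoRA | Weight-decomposed LoRA: weights are split into magnitude and direction, with LoRA applied to the direction
LoRA+ | Different learning rates for the A and B matrices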
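To make the QLoRA row concrete, here is a rough, weights-only estimate of why quantizing the frozen base model matters. The helper name `weight_memory_gib` and the round 7-billion parameter count are illustrative assumptions. The estimate ignores activations, optimizer state, the adapter itself, and quantization metadata, so real memory use will be higher.

```python
def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Memory needed to store the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


NUM_PARAMS = 7_000_000_000  # assumed round figure for a "7B" model

# Compare common storage precisions for the frozen base weights
for label, bits in (("fp16/bf16", 16), ("8-bit", 8), ("4-bit", 4)):
    print(f"{label:>9}: {weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
```

Storing the base weights in 4 bits takes a quarter of the space of 16-bit storage, which is the main reason QLoRA fits larger models on a single GPU.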

Training with LoRA

from transformers import TrainingArguments, Trainer

training_args = TrainingArguments(
    output_dir="./lora-output",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    learning_rate=2e-4,
    fp16=True,
    save_steps=500,
    logging_steps=100,
)

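# `model` is the PEFT-wrapped model returned by setup_lora() above.
# `train_dataset` and `eval_dataset` are assumed to be tokenized datasets prepared elsewhere.
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)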

trainer.train()
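After training, the low-rank update can be folded back into the frozen weight, so the adapted layer runs at the same inference cost as the original one. The toy example below checks that equivalence with plain Python lists. The matrices, sizes, and helper names (`matvec`, `matmul`, `merge_lora`) are made up for illustration; they mirror the `W(x) + B(A(x)) * scaling` forward pass from earlier in the lesson, not any library's internals.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply matrix a by matrix b (both lists of rows)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(W, A, B, alpha, rank):
    """Return W + (alpha / rank) * B @ A as a new matrix."""
    scaling = alpha / rank
    delta = matmul(B, A)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    return [[w + scaling * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


# Toy sizes: d_in = d_out = 3, rank = 1 (values chosen arbitrarily)
W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [3.0, 0.0, 1.0]]
A = [[0.5, -1.0, 0.25]]      # r x d_in
B = [[2.0], [0.0], [-1.0]]   # d_out x r
alpha, rank = 2, 1
x = [1.0, 2.0, 4.0]

# Unmerged: frozen path plus scaled low-rank path, as in LoRALayer.forward
low_rank = matvec(B, matvec(A, x))
unmerged = [b + (alpha / rank) * l for b, l in zip(matvec(W, x), low_rank)]

# Merged: a single matrix multiply with the folded weight
merged = matvec(merge_lora(W, A, B, alpha, rank), x)

print(unmerged)
print(merged)
```

In the PEFT library, the corresponding operation for LoRA adapters is `merge_and_unload()` on the wrapped model.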

Summary

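LoRA and the PEFT library make it possible to fine-tune large models by training only small low-rank adapters, often well under 1% of the parameters, while the base weights stay frozen. That cuts memory and training cost enough to make model customization accessible to more developers.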

Next: We'll explore model quantization techniques.

⭐
