Transfer Learning — Pre-trained Models Complete Guide




Transfer learning reuses a model pre-trained on one task as the starting point for a new task, which typically reduces both the labeled data and the training time required.


Why Transfer Learning?

Training from scratch:
├─ Often needs very large labeled datasets
├─ Can take days or weeks on GPUs
├─ Expensive compute
└─ High risk of overfitting when data is scarce

Transfer learning:
├─ Start with pre-trained weights
├─ Fine-tune on a small dataset
├─ Often hours instead of weeks
└─ Often better accuracy when labeled data is limited

Strategies

Strategy 1: Feature Extraction
├─ Freeze pre-trained layers
├─ Only train new classification head
├─ Fast, less overfitting
└─ Use when: Small dataset, similar domain

Strategy 2: Partial Fine-Tuning
├─ Unfreeze the top (last) pre-trained layers
├─ Train with a small learning rate
├─ Usually better accuracy than feature extraction
└─ Use when: Small to moderate dataset, different domain

Strategy 3: Full Fine-Tuning
├─ Unfreeze everything
├─ Train with a very small learning rate
├─ Highest potential accuracy, but highest compute cost and overfitting risk
└─ Use when: Large dataset
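
Strategies 2 and 3 differ mainly in how many pre-trained blocks are left trainable. The sketch below illustrates that idea under some assumptions: the helper name `freeze_all_but_last`, the `make_block` function, and the tiny stand-in `backbone` are hypothetical and used only so the example runs without downloading weights. A real workflow would apply the same pattern to a pre-trained network.

```python
import torch
import torch.nn as nn


def make_block(n_in: int, n_out: int) -> nn.Sequential:
    """Hypothetical building block for the stand-in backbone."""
    return nn.Sequential(nn.Linear(n_in, n_out), nn.ReLU())


def freeze_all_but_last(backbone: nn.Sequential, n_trainable: int) -> None:
    """Freeze every block, then unfreeze the last `n_trainable` blocks."""
    for param in backbone.parameters():
        param.requires_grad = False
    blocks = list(backbone.children())
    if n_trainable > 0:
        for block in blocks[-n_trainable:]:
            for param in block.parameters():
                param.requires_grad = True


# Stand-in for a pre-trained network (assumption: randomly initialized here)
backbone = nn.Sequential(make_block(16, 32), make_block(32, 32), make_block(32, 8))
head = nn.Linear(8, 3)  # new classification head, trainable by default

# Partial fine-tuning: only the last backbone block and the head are trained
freeze_all_but_last(backbone, n_trainable=1)
trainable = [
    p for p in list(backbone.parameters()) + list(head.parameters())
    if p.requires_grad
]

# A learning rate smaller than the 0.001 used for a fresh head (value is illustrative)
optimizer = torch.optim.Adam(trainable, lr=1e-4)
```

Calling `freeze_all_but_last(backbone, n_trainable=len(backbone))` unfreezes every block, which corresponds to full fine-tuning.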

Implementation

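import torch
import torch.nn as nn
from torchvision import models

num_classes = 10  # placeholder: set to the number of classes in your dataset

# Load ResNet-50 with ImageNet pre-trained weights
# (the `weights` argument replaces the deprecated `pretrained=True`)
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# Freeze all pre-trained layers
for param in model.parameters():
    param.requires_grad = False

# Replace the classifier; the new layer is trainable by default
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Optimize only the new layer's parameters
optimizer = torch.optim.Adam(model.fc.parameters(), lr=0.001)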
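
To show what "only the new layer trains" means in practice, here is a minimal sketch of a single training step. It assumes a tiny stand-in `backbone` and random `inputs` and `labels` (all hypothetical, chosen so the example runs without downloading weights or data), and it checks that the frozen weights stay fixed while the head is updated.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a frozen pre-trained backbone (assumption: not a real pre-trained model)
backbone = nn.Sequential(nn.Linear(16, 8), nn.ReLU())
for param in backbone.parameters():
    param.requires_grad = False
backbone.eval()  # keeps layers such as BatchNorm or Dropout in inference mode

head = nn.Linear(8, 3)  # new classification head
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Random batch standing in for real data
inputs = torch.randn(4, 16)
labels = torch.tensor([0, 1, 2, 1])

# Snapshots used to compare weights before and after the step
backbone_before = backbone[0].weight.detach().clone()
head_before = head.weight.detach().clone()

# One training step: frozen features, trainable head
with torch.no_grad():
    features = backbone(inputs)
logits = head(features)
loss = criterion(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

backbone_unchanged = torch.equal(backbone_before, backbone[0].weight)
head_changed = not torch.equal(head_before, head.weight)
```

Computing the features under `torch.no_grad()` is optional when the backbone is frozen, but it avoids storing activations for a backward pass that is never needed.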

When to Use

Small data + Similar domain → Feature extraction
Small data + Different domain → Fine-tune top layers
Large data + Similar domain → Fine-tune all layers
Large data + Different domain → Fine-tune or train from scratch
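
The decision table above can be written as a small lookup. This is only a sketch: the function name `choose_strategy` and its boolean arguments are hypothetical, and what counts as "small" data or a "similar" domain is left to the reader's judgment.

```python
def choose_strategy(small_data: bool, similar_domain: bool) -> str:
    """Return the suggested strategy from the table above."""
    table = {
        (True, True): "feature extraction",
        (True, False): "fine-tune top layers",
        (False, True): "fine-tune all layers",
        (False, False): "fine-tune or train from scratch",
    }
    return table[(small_data, similar_domain)]


print(choose_strategy(small_data=True, similar_domain=False))  # fine-tune top layers
```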

Key Takeaways

  1. Transfer learning dramatically reduces data and compute needs
  2. Feature extraction is safest for small datasets
  3. Fine-tuning with a small learning rate helps limit catastrophic forgetting
  4. ImageNet pre-trained models are a strong starting point for many vision tasks
  5. BERT/GPT-style pre-trained models are a strong starting point for many NLP tasks
  6. Discriminative learning rates: use lower learning rates for early layers
  7. Gradual unfreezing: unfreeze layer by layer, starting from the top
  8. Transfer learning is the default approach in modern ML
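
Takeaways 6 and 7 can be sketched with PyTorch optimizer parameter groups and a simple unfreezing schedule. The helper names (`discriminative_param_groups`, `unfreeze_up_to`), the decay factor of 0.1, and the tiny stand-in network are assumptions made for illustration, not a prescribed recipe.

```python
import torch
import torch.nn as nn


def make_block(n_in: int, n_out: int) -> nn.Sequential:
    """Hypothetical building block for the stand-in backbone."""
    return nn.Sequential(nn.Linear(n_in, n_out), nn.ReLU())


# Stand-in for pre-trained blocks, ordered from earliest to latest
blocks = nn.ModuleList([make_block(16, 32), make_block(32, 32), make_block(32, 8)])
head = nn.Linear(8, 3)


def discriminative_param_groups(blocks, head, head_lr=1e-3, decay=0.1):
    """Give the head the highest rate and shrink it for each earlier block."""
    groups = [{"params": list(head.parameters()), "lr": head_lr}]
    lr = head_lr
    for block in reversed(list(blocks)):
        lr *= decay
        groups.append({"params": list(block.parameters()), "lr": lr})
    return groups


optimizer = torch.optim.Adam(discriminative_param_groups(blocks, head))
lrs = [group["lr"] for group in optimizer.param_groups]  # head first, earliest block last


def unfreeze_up_to(blocks, n_unfrozen: int) -> None:
    """Make only the last `n_unfrozen` blocks trainable."""
    block_list = list(blocks)
    for idx, block in enumerate(block_list):
        trainable = idx >= len(block_list) - n_unfrozen
        for param in block.parameters():
            param.requires_grad = trainable


# Gradual unfreezing: each epoch makes one more block trainable, from the top down
for epoch in range(len(blocks)):
    unfreeze_up_to(blocks, epoch + 1)
    # ... run one epoch of training here ...
```

Parameters that are still frozen receive no gradients, so the optimizer leaves them unchanged until their block is unfrozen.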
