Introduction
Regularization techniques reduce overfitting by constraining model complexity. The ones covered here are dropout, L1/L2 weight penalties, activity regularization, and batch normalization.
Dropout
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(256, activation='relu', input_shape=(784,)),
    layers.Dropout(0.5),  # drop 50% of units during training
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.3),  # drop 30% of units during training
    layers.Dense(10, activation='softmax')
])
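To make concrete what a dropout layer does to its inputs, here is a minimal NumPy sketch of inverted dropout. It is an illustration under stated assumptions, not Keras's implementation: the helper name `dropout` and the explicit `training` flag are hypothetical. Kept units are rescaled by 1 / (1 - rate) during training so the expected activation is unchanged, and the layer is the identity at inference.

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Inverted dropout sketch (hypothetical helper, not the Keras API)."""
    if not training or rate == 0.0:
        return x  # inference: inputs pass through unchanged
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob  # True where a unit is kept
    # Rescale kept units so the expected value matches the input
    return np.where(mask, x / keep_prob, 0.0)

rng = np.random.default_rng(0)
x = np.ones((4, 1000))

train_out = dropout(x, rate=0.5, training=True, rng=rng)
infer_out = dropout(x, rate=0.5, training=False, rng=rng)

print("values seen in training:", np.unique(train_out))
print("mean in training:", train_out.mean())
print("inference output unchanged:", np.array_equal(infer_out, x))
```

With rate=0.5 each surviving unit is doubled, which is why the training-time mean stays close to the input mean.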
L1/L2 Regularization
from tensorflow.keras import regularizers

# L1 (Lasso): penalizes the sum of absolute weights, promotes sparsity
layers.Dense(64, activation='relu',
             kernel_regularizer=regularizers.l1(0.01))

# L2 (Ridge): penalizes the sum of squared weights, shrinks them toward zero
layers.Dense(64, activation='relu',
             kernel_regularizer=regularizers.l2(0.01))

# Combined L1 + L2 (elastic net style)
layers.Dense(64, activation='relu',
             kernel_regularizer=regularizers.l1_l2(l1=0.01, l2=0.01))
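The penalties behind these regularizers are simple enough to compute by hand. The sketch below uses plain NumPy with hypothetical helper names (`l1_penalty`, `l2_penalty`) and a made-up weight matrix. It also shows the gradients, which hint at why L1 tends to produce sparse weights (a constant-size push toward zero) while L2 only shrinks them (a push proportional to the weight).

```python
import numpy as np

def l1_penalty(w, l1):
    # l1 * sum of absolute weights
    return l1 * np.sum(np.abs(w))

def l2_penalty(w, l2):
    # l2 * sum of squared weights
    return l2 * np.sum(np.square(w))

# Illustrative weights only, not taken from any trained model
w = np.array([[0.5, -1.0],
              [0.0,  2.0]])

p1 = l1_penalty(w, 0.01)   # 0.01 * 3.5
p2 = l2_penalty(w, 0.01)   # 0.01 * 5.25
p12 = p1 + p2              # combined L1 + L2 penalty

# Gradients of each penalty with respect to the weights
grad_l1 = 0.01 * np.sign(w)   # same magnitude for every nonzero weight
grad_l2 = 2 * 0.01 * w        # proportional to the weight itself

print(p1, p2, p12)
```

The penalty is added to the training loss, so larger coefficients trade fit on the training data for smaller weights.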
Activity Regularization
# Penalize the layer's outputs (activations) rather than its weights
layers.Dense(64, activation='relu',
             activity_regularizer=regularizers.l2(0.01))
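The difference from weight regularization is that an activity penalty depends on the data passing through the layer. The NumPy sketch below (hypothetical helpers, random illustrative data) computes an L2 penalty on ReLU outputs for two inputs that differ only in scale; the weights, and therefore any weight penalty, are identical in both cases. How a framework scales this penalty across the batch is left out here.

```python
import numpy as np

def relu_dense(x, w):
    # Dense layer without bias, followed by ReLU
    return np.maximum(x @ w, 0.0)

def l2_activity_penalty(activations, l2=0.01):
    # l2 * sum of squared activations
    return l2 * np.sum(np.square(activations))

rng = np.random.default_rng(0)
w = rng.normal(size=(3, 4))        # illustrative weights
x_small = rng.normal(size=(2, 3))  # illustrative batch
x_large = 10.0 * x_small           # same batch, 10x larger inputs

pen_small = l2_activity_penalty(relu_dense(x_small, w))
pen_large = l2_activity_penalty(relu_dense(x_large, w))

# ReLU scales linearly for positive factors, so squaring gives a 100x penalty
print(pen_small, pen_large)
```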
Batch Normalization
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=(784,)),
    layers.BatchNormalization(),
    layers.Dense(64, activation='relu'),
    layers.BatchNormalization(),
    layers.Dense(10, activation='softmax')
])

# Explicit momentum and epsilon (these values match the Keras defaults)
layers.BatchNormalization(momentum=0.99, epsilon=0.001)
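The roles of `momentum` and `epsilon` are easier to see in a small NumPy sketch of the computation. This is a simplified illustration, not Keras's exact implementation; the function names and the `gamma`/`beta` arguments are assumptions for the example. During training the layer normalizes with the current batch statistics and updates moving averages; at inference it normalizes with those moving averages instead.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, moving_mean, moving_var,
                     momentum=0.99, epsilon=0.001):
    # Normalize each feature using statistics of the current batch
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + epsilon)  # epsilon avoids division by zero
    # Exponential moving averages, used later at inference time
    new_mean = momentum * moving_mean + (1.0 - momentum) * mean
    new_var = momentum * moving_var + (1.0 - momentum) * var
    return gamma * x_hat + beta, new_mean, new_var

def batch_norm_infer(x, gamma, beta, moving_mean, moving_var, epsilon=0.001):
    # Inference uses the stored moving statistics, not the batch statistics
    return gamma * (x - moving_mean) / np.sqrt(moving_var + epsilon) + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(256, 4))  # illustrative batch
gamma, beta = np.ones(4), np.zeros(4)
moving_mean, moving_var = np.zeros(4), np.ones(4)

out, moving_mean, moving_var = batch_norm_train(
    x, gamma, beta, moving_mean, moving_var)
infer_out = batch_norm_infer(x, gamma, beta, moving_mean, moving_var)

print("train output mean:", out.mean(axis=0))
print("train output std:", out.std(axis=0))
print("moving mean after one step:", moving_mean)
```

With momentum=0.99 the moving statistics change slowly, so after a single step they are still far from the batch statistics and the inference output differs noticeably from the training output.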
Practice Problems
- Add dropout between layers
- Apply L2 regularization to weights
- Use batch normalization
- Combine dropout with L2
- Tune dropout rate