Introduction
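Dimensionality reduction techniques reduce the number of features in a dataset while preserving as much of its structure as possible, such as the overall variance (PCA) or local neighborhood relationships (t-SNE).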
PCA
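from sklearn.decomposition import PCA
# X is assumed to be a numeric feature matrix of shape (n_samples, n_features)
pca = PCA(n_components=2)  # keep the two directions of highest variance
X_pca = pca.fit_transform(X)
# Fraction of the total variance captured by each retained component
print(f"Explained variance ratio: {pca.explained_variance_ratio_}")
print(f"Total variance explained: {pca.explained_variance_ratio_.sum():.2f}")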
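The PCA snippet leaves X undefined. A minimal self-contained sketch follows, assuming the iris dataset bundled with scikit-learn as a stand-in for X. It also standardizes the features first, which is an added step not in the original: PCA is sensitive to feature scale, so this is a common precaution rather than a requirement.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: iris stands in for the tutorial's X (150 samples, 4 features)
X = load_iris().data

# Put every feature on the same scale so no single one dominates the variance
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

print(X_pca.shape)  # (150, 2)
print(f"Total variance explained: {pca.explained_variance_ratio_.sum():.2f}")
```

Whether to standardize depends on the data: if all features already share a unit, skipping the scaler may be the better choice.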
Explained Variance
import matplotlib.pyplot as plt
# Fit PCA with all components to see how the explained variance accumulates
pca_full = PCA().fit(X)
cumulative_variance = pca_full.explained_variance_ratio_.cumsum()
plt.plot(range(1, len(cumulative_variance) + 1), cumulative_variance)
plt.xlabel("Number of Components")
plt.ylabel("Cumulative Explained Variance")
plt.show()
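The cumulative curve is usually read off at a target level to decide how many components to keep. The sketch below shows one way to do that numerically, assuming the digits dataset as a stand-in for X and a 95% target (both are illustrative choices, not from the original). It also uses scikit-learn's option of passing a fraction as n_components, which selects the count automatically when svd_solver="full".

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Assumption: digits stands in for the tutorial's X (1797 samples, 64 features)
X = load_digits().data
threshold = 0.95  # illustrative target for cumulative explained variance

# Manual route: fit all components, then find the first count reaching the target
pca_full = PCA(svd_solver="full").fit(X)
cumulative = pca_full.explained_variance_ratio_.cumsum()
n_needed = int(np.searchsorted(cumulative, threshold) + 1)

# Built-in route: a float in (0, 1) asks PCA to pick the count itself
pca_95 = PCA(n_components=threshold, svd_solver="full").fit(X)

print(f"Components needed (manual): {n_needed}")
print(f"Components chosen by PCA:   {pca_95.n_components_}")
```

The two routes should agree; the manual one is useful when the curve itself is also needed for plotting.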
t-SNE
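from sklearn.manifold import TSNE
# perplexity roughly sets the neighborhood size; it must be smaller than the number of samples
tsne = TSNE(n_components=2, perplexity=30, random_state=42)
X_tsne = tsne.fit_transform(X)  # TSNE has no separate transform() for new data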
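t-SNE can be slow on wide data, so a common pattern is to compress the features with PCA before embedding. The sketch below illustrates that pattern under stated assumptions: a 200-sample slice of the digits dataset stands in for X, and 30 PCA components is an arbitrary intermediate size chosen for the example.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Assumption: a 200-sample slice of digits stands in for the tutorial's X
X = load_digits().data[:200]

# Compress the 64 pixel features to 30 principal components first
X_reduced = PCA(n_components=30, random_state=42).fit_transform(X)

# Same t-SNE settings as in the section; perplexity (30) is below n_samples (200)
tsne = TSNE(n_components=2, perplexity=30, random_state=42)
X_tsne = tsne.fit_transform(X_reduced)

print(X_tsne.shape)  # (200, 2)
```

The resulting coordinates are meant for plotting only; distances between far-apart clusters in a t-SNE embedding are generally not meaningful.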
Practice Problems
- Reduce dimensions with PCA
- Visualize explained variance
- Apply t-SNE for visualization
- Use PCA for noise filtering
- Combine PCA with a classifier
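As a possible starting point for the last exercise, the sketch below chains scaling, PCA, and a classifier in a single scikit-learn pipeline. The dataset (digits), the classifier (logistic regression), and the choice of 20 components are all assumptions made for illustration; any estimator could take the classifier's place.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: digits is used as an example labeled dataset
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# The pipeline fits the scaler and PCA on the training split only,
# then applies the same transforms to the test split when scoring
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=20),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Wrapping PCA in the pipeline keeps test data out of the fit, which matters once the number of components is tuned with cross-validation.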