Dimensionality Reduction

Introduction

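Dimensionality reduction techniques reduce the number of features in a dataset while preserving as much of the important information as possible. This lesson covers PCA, a linear projection that keeps the directions of greatest variance, and t-SNE, a nonlinear embedding used mainly for visualization.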

PCA

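from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Example data, assumed here for illustration; substitute your own feature matrix X
X = load_iris().data

# Project the data onto its first two principal components
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# Fraction of the total variance captured by each component, then their sum
print(f"Explained variance: {pca.explained_variance_ratio_}")
print(f"Total variance explained: {pca.explained_variance_ratio_.sum():.2f}")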
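To make the projection less of a black box, here is a minimal NumPy sketch of the same idea: center the data, take an SVD, and project onto the leading directions. The synthetic data and every name in it (X_demo, mixing, and so on) are assumptions made for illustration; this is not scikit-learn's implementation.

```python
import numpy as np

# Synthetic data (an assumption for illustration): 200 samples, 3 correlated
# features generated from 2 hidden factors plus a little noise
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = np.array([[2.0, 0.5, 1.0],
                   [0.0, 1.0, -1.0]])
X_demo = latent @ mixing + 0.1 * rng.normal(size=(200, 3))

# 1. Center each feature so the directions pass through the data mean
X_centered = X_demo - X_demo.mean(axis=0)

# 2. SVD of the centered data: the rows of Vt are the principal directions,
#    ordered by decreasing singular value
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# 3. Variance along each direction, and its share of the total variance
explained_variance = S**2 / (X_centered.shape[0] - 1)
explained_variance_ratio = explained_variance / explained_variance.sum()

# 4. Project onto the first two directions
n_components = 2
X_demo_pca = X_centered @ Vt[:n_components].T

print(explained_variance_ratio)
print(X_demo_pca.shape)
```

Because the demo data has only two hidden factors, almost all of the variance should land in the first two components, and the projected columns come out uncorrelated with each other.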

Explained Variance

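import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Fit PCA with all components kept so the full variance spectrum is available
pca_full = PCA().fit(X)

# Cumulative share of variance explained by the first k components
cumulative = pca_full.explained_variance_ratio_.cumsum()
plt.plot(range(1, len(cumulative) + 1), cumulative, marker="o")
plt.xlabel("Number of Components")
plt.ylabel("Cumulative Explained Variance")
plt.show()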
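The cumulative curve is usually read against a target, such as "keep enough components to explain 95% of the variance". The sketch below finds that number in two ways: by scanning the cumulative sum, and by passing a fraction as n_components, which scikit-learn's PCA accepts. The digits dataset and the 0.95 threshold are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Example data (assumed for illustration): 1797 samples, 64 pixel features
X_digits = load_digits().data

threshold = 0.95  # illustrative target, not a universal rule

# Option 1: fit all components and find the first k that reaches the threshold
pca_all = PCA().fit(X_digits)
cumulative = np.cumsum(pca_all.explained_variance_ratio_)
k = int(np.argmax(cumulative >= threshold)) + 1

# Option 2: a float in (0, 1) asks PCA to pick the number of components itself
pca_95 = PCA(n_components=threshold, svd_solver="full").fit(X_digits)

print(f"Components needed (manual): {k}")
print(f"Components chosen by PCA:   {pca_95.n_components_}")
```

Both approaches should agree. The second is shorter, while the first keeps the full curve around for plotting.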

t-SNE

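from sklearn.manifold import TSNE

# Embed X in 2-D for visualization; perplexity must be smaller than the number of samples
tsne = TSNE(n_components=2, perplexity=30, random_state=42)
# t-SNE has no separate transform(); fit_transform() embeds only the data it is given
X_tsne = tsne.fit_transform(X)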
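A common pattern, shown here only as a sketch, is to compress the data with PCA first and then run t-SNE on the reduced features, which tends to make t-SNE faster on wide datasets. The digits subset, the 30 intermediate components, and the variable names are assumptions for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Example data (assumed for illustration): first 500 digit images, 64 features each
X_digits = load_digits().data[:500]

# Step 1: linear compression from 64 to 30 features with PCA
X_reduced = PCA(n_components=30, random_state=42).fit_transform(X_digits)

# Step 2: nonlinear 2-D embedding of the compressed data
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=42)
X_embedded = tsne.fit_transform(X_reduced)

print(X_embedded.shape)  # (500, 2)
```

The resulting coordinates are meant for plotting. Distances between well-separated clusters in a t-SNE plot are not reliable measurements, so treat the picture as a qualitative view of the data.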

Practice Problems

  1. Reduce the dimensions of a dataset with PCA
  2. Visualize the cumulative explained variance
  3. Apply t-SNE for visualization
  4. Use PCA for noise filtering
  5. Combine PCA with a classifier
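As a possible starting point for problems 4 and 5, the sketch below filters noise by projecting onto a few components and reconstructing with inverse_transform, then chains PCA with a classifier in a pipeline. The digits dataset, the noise level, the 50% variance target, the 20 components, and the choice of LogisticRegression are all assumptions for illustration, not the lesson's reference solution.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Example data (assumed for illustration): 8x8 digit images flattened to 64 features
X_digits, y_digits = load_digits(return_X_y=True)

# Problem 4: noise filtering
# Add Gaussian noise, keep only the components that explain 50% of the
# variance, then map back to pixel space
rng = np.random.default_rng(0)
X_noisy = X_digits + rng.normal(scale=4.0, size=X_digits.shape)
denoiser = PCA(n_components=0.5, svd_solver="full").fit(X_noisy)
X_denoised = denoiser.inverse_transform(denoiser.transform(X_noisy))

mse_noisy = np.mean((X_noisy - X_digits) ** 2)
mse_denoised = np.mean((X_denoised - X_digits) ** 2)
print(f"MSE vs. clean data: noisy={mse_noisy:.2f}, denoised={mse_denoised:.2f}")

# Problem 5: PCA followed by a classifier
# The pipeline fits the scaler and PCA on the training split only, so no
# information from the test split leaks into the projection
X_train, X_test, y_train, y_test = train_test_split(
    X_digits, y_digits, test_size=0.25, random_state=0, stratify=y_digits
)
clf = make_pipeline(
    StandardScaler(),
    PCA(n_components=20),
    LogisticRegression(max_iter=1000),
)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"Test accuracy with 20 components: {accuracy:.2f}")
```

Try varying the variance target in the first part and n_components in the second to see how much information can be discarded before the results degrade.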
