Model Evaluation: ROC, AUC, PR Curves
Classification Metrics Deep Dive
Beyond accuracy, we need metrics that capture different aspects of model performance.
The Confusion Matrix
Definition: Confusion Matrix
A table that summarizes the performance of a classification model by comparing predicted labels against true labels. For binary classification, it consists of four entries: True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN).
CONFUSION MATRIX LAYOUT:
                    Predicted
                 Negative   Positive
Actual Negative  [  TN    |   FP   ]
       Positive  [  FN    |   TP   ]
Where:
• TP (True Positive): Correctly predicted positive
• TN (True Negative): Correctly predicted negative
• FP (False Positive): Incorrectly predicted positive (Type I Error)
• FN (False Negative): Incorrectly predicted negative (Type II Error)
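As a small illustration of the layout above, the following sketch builds a confusion matrix with scikit-learn and unpacks its four entries. The label arrays are made-up toy values, not data from this tutorial.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical toy labels (1 = positive, 0 = negative), invented for illustration
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 0, 0])

# Rows are actual classes, columns are predicted classes: [[TN, FP], [FN, TP]]
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

print(cm)
print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")  # TN=5, FP=1, FN=1, TP=3
```

With the labels ordered as `[0, 1]`, `cm.ravel()` returns the counts in the order TN, FP, FN, TP, which matches the layout read row by row.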
Derived Metrics
Accuracy:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
Here,
- $TP$ = True Positives
- $TN$ = True Negatives
- $FP$ = False Positives (Type I error)
- $FN$ = False Negatives (Type II error)
Precision (Positive Predictive Value):
$$\text{Precision} = \frac{TP}{TP + FP}$$
Here,
- $TP$ = True Positives
- $FP$ = False Positives
Recall (Sensitivity, True Positive Rate):
$$\text{Recall} = \frac{TP}{TP + FN}$$
Here,
- $TP$ = True Positives
- $FN$ = False Negatives
Specificity (True Negative Rate):
$$\text{Specificity} = \frac{TN}{TN + FP}$$
Here,
- $TN$ = True Negatives
- $FP$ = False Positives
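To make these four ratios concrete, here is a minimal sketch that evaluates them on hypothetical counts. The numbers are invented for illustration and describe 1000 samples of which 5% are positive.

```python
# Hypothetical confusion-matrix counts (invented for illustration)
tp, fn, fp, tn = 40, 10, 20, 930

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
specificity = tn / (tn + fp)

print(f"Accuracy:    {accuracy:.3f}")     # 0.970
print(f"Precision:   {precision:.3f}")    # 0.667
print(f"Recall:      {recall:.3f}")       # 0.800
print(f"Specificity: {specificity:.3f}")  # 0.979
```

Accuracy reaches 0.97 here even though one in three positive predictions is wrong, which is why accuracy alone says little when the positive class is rare.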
F1 Score (Harmonic Mean):
$$F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
The F1 score is the harmonic mean of precision and recall, which penalizes extreme values more than the arithmetic mean. A classifier must have both high precision AND high recall to achieve a high F1 score. The F-beta generalization allows weighting recall more heavily than precision (beta > 1) or vice versa (beta < 1).
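The sketch below illustrates both claims using the standard F-beta formula, $F_\beta = (1 + \beta^2) \cdot P \cdot R / (\beta^2 \cdot P + R)$. The helper name `f_beta` and the precision/recall pairs are hypothetical and chosen only for illustration.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score from precision and recall; returns 0.0 when both are zero."""
    denom = beta**2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + beta**2) * precision * recall / denom

# Hypothetical precision/recall pairs chosen for illustration
balanced = (0.5, 0.8)
lopsided = (1.0, 0.1)

for p, r in (balanced, lopsided):
    arithmetic = (p + r) / 2
    print(f"P={p}, R={r}: mean={arithmetic:.3f}, F1={f_beta(p, r):.3f}, "
          f"F2={f_beta(p, r, beta=2):.3f}, F0.5={f_beta(p, r, beta=0.5):.3f}")
```

For the lopsided pair the arithmetic mean is 0.55 while F1 drops below 0.2, which shows how the harmonic mean punishes a single weak component. For the first pair, where recall exceeds precision, F2 comes out higher than F0.5 because beta > 1 gives recall more weight.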
Complete Metrics Implementation
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_predict
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.metrics import (
    confusion_matrix, classification_report,
    precision_score, recall_score, f1_score, accuracy_score
)
import warnings
warnings.filterwarnings('ignore')
np.random.seed(42)
print("=" * 70)
print("CLASSIFICATION METRICS DEEP DIVE")
print("=" * 70)
# Generate imbalanced dataset
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=10,
    n_classes=2, weights=[0.7, 0.3],
    random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(f"Training set: {X_train.shape[0]} samples")
print(f"Test set: {X_test.shape[0]} samples")
print(f"Class distribution (train): {np.bincount(y_train)}")
print(f"Class distribution (test): {np.bincount(y_test)}")
# Train model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
# Get predictions
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)[:, 1]
# Confusion matrix
cm = confusion_matrix(y_test, y_pred)
print("\nConfusion Matrix:")
print(cm)
print(f"\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=['Negative', 'Positive']))
ROC Curve and AUC
The Receiver Operating Characteristic (ROC) curve plots True Positive Rate vs False Positive Rate at different thresholds.
Definition: ROC Curve
A graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied. It is created by plotting the True Positive Rate (TPR) against the False Positive Rate (FPR) at various threshold settings.
Mathematical Definition
True Positive Rate (Recall/Sensitivity):
$$\text{TPR} = \frac{TP}{TP + FN}$$
Here,
- $TP$ = True Positives
- $FN$ = False Negatives
False Positive Rate (1 - Specificity):
$$\text{FPR} = \frac{FP}{FP + TN}$$
Here,
- $FP$ = False Positives
- $TN$ = True Negatives
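Each point on a ROC curve is one (FPR, TPR) pair obtained at a single threshold. A minimal sketch of that computation follows; the helper name `tpr_fpr` and the toy labels and scores are hypothetical and invented for illustration.

```python
import numpy as np

def tpr_fpr(y_true, scores, threshold):
    """Return (TPR, FPR) when scores >= threshold are predicted positive."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(scores) >= threshold
    tp = np.sum(y_pred & (y_true == 1))
    fp = np.sum(y_pred & (y_true == 0))
    fn = np.sum(~y_pred & (y_true == 1))
    tn = np.sum(~y_pred & (y_true == 0))
    return float(tp / (tp + fn)), float(fp / (fp + tn))

# Hypothetical labels and scores, invented for illustration
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]

for t in (0.2, 0.5, 0.85):
    tpr, fpr = tpr_fpr(y_true, scores, t)
    print(f"threshold={t:.2f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```

Raising the threshold moves the operating point toward the bottom-left of the ROC plane: both TPR and FPR fall, here from (FPR, TPR) = (0.75, 1.00) at 0.2 to (0.00, 0.25) at 0.85.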
AUC (Area Under Curve):
$$\text{AUC} = \int_0^1 \text{TPR} \, d(\text{FPR})$$
Theorem: AUC Probabilistic Interpretation
The AUC of a classifier is equivalent to the probability that a randomly chosen positive instance is ranked higher (has a higher predicted probability) than a randomly chosen negative instance. That is, $\text{AUC} = P(s(x^+) > s(x^-))$, where $x^+$ and $x^-$ are drawn from the positive and negative classes respectively and $s(\cdot)$ is the classifier's predicted score.
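The ranking interpretation can be checked numerically by comparing every positive score with every negative score. The sketch below does this on synthetic Gaussian scores (an assumption made purely for illustration) and compares the result with `roc_auc_score`.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic scores (an assumption for illustration): positives score higher on average
pos_scores = rng.normal(loc=1.0, scale=1.0, size=150)
neg_scores = rng.normal(loc=0.0, scale=1.0, size=350)

# Compare every positive with every negative; ties count as half a win
diff = pos_scores[:, None] - neg_scores[None, :]
pairwise_auc = np.mean(diff > 0) + 0.5 * np.mean(diff == 0)

y_true = np.concatenate([np.ones(150, dtype=int), np.zeros(350, dtype=int)])
y_score = np.concatenate([pos_scores, neg_scores])
sklearn_auc = roc_auc_score(y_true, y_score)

print(f"Pairwise estimate: {pairwise_auc:.4f}")
print(f"roc_auc_score:     {sklearn_auc:.4f}")
```

The two numbers agree up to floating-point error. The pairwise version needs a number of comparisons equal to positives times negatives, so it is meant as a check on the interpretation rather than as a practical way to compute AUC.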
Visual Representation
ROC CURVE INTERPRETATION:
[Figure: ROC curves plotted as TPR (y-axis) against FPR (x-axis), both running from 0.0 to 1.0. Perfect classifier: AUC = 1.0. Good classifier: AUC = 0.9. Random classifier (diagonal): AUC = 0.5. Worst classifier: AUC = 0.0.]
Interpretation:
• AUC = 0.5: Random guessing (diagonal)
• AUC > 0.5: Better than random
• AUC = 1.0: Perfect classifier
• AUC < 0.5: Worse than random (inverted predictions)
AUC is threshold-independent: it evaluates the classifier across all possible thresholds. This makes it useful for comparing models without committing to a specific operating point. However, ROC AUC can look overly optimistic on highly imbalanced datasets, because FPR is computed over the large negative class and stays small even when false positives outnumber true positives; in such cases, use PR AUC (Average Precision) instead.
Complete ROC Implementation
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
import warnings
warnings.filterwarnings('ignore')
np.random.seed(42)
print("\n" + "=" * 70)
print("ROC CURVE AND AUC")
print("=" * 70)
# Generate dataset
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=10,
    n_classes=2, weights=[0.7, 0.3], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
# Train classifier
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
# Get predicted probabilities
y_proba = clf.predict_proba(X_test)[:, 1]
# Compute ROC curve
fpr, tpr, thresholds = roc_curve(y_test, y_proba)
roc_auc = auc(fpr, tpr)
print(f"ROC AUC: {roc_auc:.4f}")
# Visualize ROC Curve
fig, axes = plt.subplots(1, 2, figsize=(14, 6))
# Plot 1: ROC Curve
ax1 = axes[0]
ax1.plot(fpr, tpr, 'b-', linewidth=2, label=f'ROC Curve (AUC = {roc_auc:.4f})')
ax1.plot([0, 1], [0, 1], 'k--', linewidth=1, label='Random Classifier (AUC = 0.5)')
ax1.fill_between(fpr, tpr, alpha=0.2)
ax1.set_xlabel('False Positive Rate (1 - Specificity)')
ax1.set_ylabel('True Positive Rate (Recall)')
ax1.set_title('ROC Curve')
ax1.legend(loc='lower right')
ax1.grid(True, alpha=0.3)
ax1.set_xlim([0, 1])
ax1.set_ylim([0, 1.05])
# Mark optimal threshold (Youden's J statistic)
J = tpr - fpr
optimal_idx = np.argmax(J)
optimal_threshold = thresholds[optimal_idx]
ax1.scatter(fpr[optimal_idx], tpr[optimal_idx], c='red', marker='o', s=100,
            label=f'Optimal Threshold = {optimal_threshold:.3f}')
ax1.legend(loc='lower right')
# Plot 2: Threshold vs TPR/FPR
ax2 = axes[1]
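# Skip the first ROC point: roc_curve prepends a sentinel threshold that lies
# above every score, so it is not a usable cutoff and distorts the x-axis
ax2.plot(thresholds[1:], tpr[1:], 'b-', linewidth=2, label='TPR (Recall)')
ax2.plot(thresholds[1:], fpr[1:], 'r-', linewidth=2, label='FPR')
ax2.plot(thresholds[1:], (tpr - fpr)[1:], 'g--', linewidth=2, label="Youden's J")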
ax2.axvline(x=optimal_threshold, color='gray', linestyle='--', alpha=0.5)
ax2.set_xlabel('Threshold')
ax2.set_ylabel('Score')
ax2.set_title('TPR and FPR vs Threshold')
ax2.legend()
ax2.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('roc_curve.png', dpi=150, bbox_inches='tight')
plt.show()
Comparing Multiple Classifiers with ROC
# Compare multiple classifiers
classifiers = {
    'Logistic Regression': LogisticRegression(max_iter=1000, random_state=42),
    'Random Forest': RandomForestClassifier(n_estimators=100, random_state=42),
    'Gradient Boosting': GradientBoostingClassifier(n_estimators=100, random_state=42),
    'SVM (probability)': SVC(probability=True, random_state=42)
}
fig, ax = plt.subplots(figsize=(10, 8))
colors = ['blue', 'red', 'green', 'orange']
for (name, clf), color in zip(classifiers.items(), colors):
    clf.fit(X_train, y_train)
    y_proba = clf.predict_proba(X_test)[:, 1]
    fpr, tpr, _ = roc_curve(y_test, y_proba)
    roc_auc = auc(fpr, tpr)
    ax.plot(fpr, tpr, color=color, linewidth=2,
            label=f'{name} (AUC = {roc_auc:.4f})')
    print(f"{name:25s} AUC: {roc_auc:.4f}")
ax.plot([0, 1], [0, 1], 'k--', linewidth=1, label='Random (AUC = 0.5)')
ax.set_xlabel('False Positive Rate', fontsize=12)
ax.set_ylabel('True Positive Rate', fontsize=12)
ax.set_title('ROC Curves for Multiple Classifiers', fontsize=14)
ax.legend(loc='lower right', fontsize=10)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('roc_comparison.png', dpi=150, bbox_inches='tight')
plt.show()
Precision-Recall Curves
PR curves are especially useful for imbalanced datasets.
Definition: Precision-Recall Curve
A plot of precision (y-axis) versus recall (x-axis) for different threshold settings. Unlike ROC curves, PR curves focus on the positive (minority) class and are more informative when classes are imbalanced.
When to Use PR Curves
ROC vs PR CURVES:
| ROC Curve | PR Curve |
|---|---|
| Uses FPR and TPR | Uses Precision and Recall |
| Can look optimistic under class imbalance | Better for imbalanced data |
| Good when classes are balanced | Good when positive class is rare |
| Shows overall performance | Focuses on positive class |
Example: Fraud Detection (1% fraud)
• ROC AUC might be 0.95 (looks good)
• PR AUC might be 0.30 (reality check!)
USE PR CURVES WHEN:
• Positive class is rare (< 20%)
• Cost of false positives is high
• You care more about positive predictions
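The gap between the two summaries on rare-positive data can be reproduced with a small simulation. The sketch below uses synthetic Gaussian scores with 1% positives; the distributions and sample sizes are assumptions chosen for illustration, and the exact values printed will differ from the fraud example above.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)

# Synthetic scores with 1% positives (an assumption for illustration)
n_neg, n_pos = 19_800, 200
y_true = np.concatenate([np.zeros(n_neg, dtype=int), np.ones(n_pos, dtype=int)])
y_score = np.concatenate([
    rng.normal(loc=0.0, scale=1.0, size=n_neg),  # negatives
    rng.normal(loc=2.0, scale=1.0, size=n_pos),  # positives score higher on average
])

roc_auc = roc_auc_score(y_true, y_score)
ap = average_precision_score(y_true, y_score)
prevalence = y_true.mean()

print(f"ROC AUC:                {roc_auc:.3f}")
print(f"Average Precision:      {ap:.3f}")
print(f"Prevalence (PR baseline): {prevalence:.3f}")
```

In this setup the ROC AUC comes out far higher than the Average Precision, because even a small false positive rate applied to 19,800 negatives yields many false positives relative to 200 positives.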
Mathematical Definition
Average Precision (AP):
$$\text{AP} = \sum_n (R_n - R_{n-1}) \, P_n$$
Here,
- $R_n$ = Recall at threshold n
- $P_n$ = Precision at threshold n
PR AUC:
$$\text{PR AUC} = \int_0^1 P(R) \, dR$$
The baseline for a PR curve is the positive class prevalence (e.g., 0.10 for 10% positive cases), not 0.5 as in ROC curves. A model with PR AUC below the baseline is worse than random. The area under the PR curve (Average Precision) provides a single summary statistic, but the shape of the curve reveals trade-offs at different operating points.
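As a sketch of how the Average Precision sum relates to the curve returned by scikit-learn, the code below recomputes AP as a recall-weighted sum of precisions and compares it with `average_precision_score` and with the prevalence baseline. The labels and scores are synthetic, generated only for illustration.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, average_precision_score

rng = np.random.default_rng(1)

# Synthetic labels and scores with roughly 10% positives (an assumption for illustration)
y_true = (rng.random(2000) < 0.10).astype(int)
y_score = rng.normal(loc=1.5 * y_true, scale=1.0)

precision, recall, _ = precision_recall_curve(y_true, y_score)

# recall is returned in decreasing order, so np.diff(recall) is <= 0;
# negating the sum yields sum_n (R_n - R_{n-1}) * P_n
ap_manual = -np.sum(np.diff(recall) * precision[:-1])
ap_sklearn = average_precision_score(y_true, y_score)
baseline = y_true.mean()

print(f"AP (manual sum):          {ap_manual:.4f}")
print(f"average_precision_score:  {ap_sklearn:.4f}")
print(f"Baseline (prevalence):    {baseline:.4f}")
```

The step-wise sum is deliberately not a trapezoidal area: interpolating linearly between PR points can overstate performance, which is one reason to prefer `average_precision_score` over `auc(recall, precision)` as the summary.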
Complete PR Curve Implementation
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import (
    precision_recall_curve, average_precision_score,
    PrecisionRecallDisplay
)
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
import warnings
warnings.filterwarnings('ignore')
np.random.seed(42)
print("\n" + "=" * 70)
print("PRECISION-RECALL CURVES")
print("=" * 70)
# Generate imbalanced dataset
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=10,
    n_classes=2, weights=[0.9, 0.1],
    random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(f"Class distribution: {np.bincount(y)}")
print(f"Positive class ratio: {y.mean():.2%}")
# Train classifier
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
# Get probabilities
y_proba = clf.predict_proba(X_test)[:, 1]
# Compute PR curve
precision, recall, thresholds = precision_recall_curve(y_test, y_proba)
avg_precision = average_precision_score(y_test, y_proba)
print(f"\nAverage Precision: {avg_precision:.4f}")
# Visualize PR Curve
fig, axes = plt.subplots(1, 2, figsize=(14, 6))
# Plot 1: PR Curve
ax1 = axes[0]
ax1.plot(recall, precision, 'b-', linewidth=2, label=f'PR Curve (AP = {avg_precision:.4f})')
ax1.fill_between(recall, precision, alpha=0.2)
ax1.set_xlabel('Recall')
ax1.set_ylabel('Precision')
ax1.set_title('Precision-Recall Curve')
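ax1.grid(True, alpha=0.3)
# Baseline: a no-skill classifier has precision equal to the positive prevalence
baseline = y_test.mean()
ax1.axhline(y=baseline, color='gray', linestyle='--', alpha=0.5,
            label=f'Baseline (prevalence = {baseline:.3f})')
# Build the legend after all labeled artists exist so the baseline appears in it
ax1.legend(loc='lower left')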
# Plot 2: Precision and Recall vs Threshold
ax2 = axes[1]
ax2.plot(thresholds, precision[:-1], 'b-', linewidth=2, label='Precision')
ax2.plot(thresholds, recall[:-1], 'r-', linewidth=2, label='Recall')
ax2.plot(thresholds, 2 * (precision[:-1] * recall[:-1]) /
         (precision[:-1] + recall[:-1] + 1e-10), 'g--', linewidth=2, label='F1 Score')
ax2.set_xlabel('Threshold')
ax2.set_ylabel('Score')
ax2.set_title('Precision, Recall, and F1 vs Threshold')
ax2.legend()
ax2.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('pr_curve.png', dpi=150, bbox_inches='tight')
plt.show()
Threshold Tuning
Optimal Threshold Selection
Definition: Youden's J Statistic
A threshold selection method that maximizes the sum of sensitivity and specificity minus 1. It finds the threshold where $J = \text{Sensitivity} + \text{Specificity} - 1 = \text{TPR} - \text{FPR}$ is maximized, providing a balance between capturing positives and avoiding false alarms.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, precision_recall_curve, f1_score
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
import warnings
warnings.filterwarnings('ignore')
np.random.seed(42)
print("\n" + "=" * 70)
print("THRESHOLD TUNING")
print("=" * 70)
# Generate dataset
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=10,
    n_classes=2, weights=[0.7, 0.3], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
# Train model
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
y_proba = clf.predict_proba(X_test)[:, 1]
# Method 1: Youden's J
fpr, tpr, thresholds_roc = roc_curve(y_test, y_proba)
J = tpr - fpr
optimal_idx_youden = np.argmax(J)
optimal_threshold_youden = thresholds_roc[optimal_idx_youden]
# Method 2: Maximize F1 Score
precision, recall, thresholds_pr = precision_recall_curve(y_test, y_proba)
f1_scores = 2 * (precision * recall) / (precision + recall + 1e-10)
optimal_idx_f1 = np.argmax(f1_scores[:-1])
optimal_threshold_f1 = thresholds_pr[optimal_idx_f1]
# Method 3: Equal error rate (FPR = FNR)
fnr = 1 - tpr
equal_error_idx = np.argmin(np.abs(fpr - fnr))
optimal_threshold_eer = thresholds_roc[equal_error_idx]
print(f"Method Optimal Threshold")
print("-" * 50)
print(f"Youden's J: {optimal_threshold_youden:.4f}")
print(f"Max F1: {optimal_threshold_f1:.4f}")
print(f"Equal Error Rate: {optimal_threshold_eer:.4f}")
The choice of threshold depends on the cost structure of your problem. If false negatives are costly (e.g., missing a disease), lower the threshold to increase recall. If false positives are costly (e.g., spam filter blocking legitimate email), raise the threshold to increase precision. Always choose the threshold based on the business context, not just the default 0.5.
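A minimal sketch of this trade-off: the same predicted probabilities are converted to hard labels at three thresholds, and precision and recall are recomputed each time. The labels and probabilities are hypothetical toy values invented for illustration.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels and predicted probabilities, invented for illustration
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
y_proba = np.array([0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9])

results = {}
for threshold in (0.3, 0.5, 0.85):
    y_pred = (y_proba >= threshold).astype(int)
    p = precision_score(y_true, y_pred)
    r = recall_score(y_true, y_pred)
    results[threshold] = (p, r)
    print(f"threshold={threshold:.2f}  precision={p:.3f}  recall={r:.3f}")
```

On this toy data the low threshold catches every positive (recall 1.0) at the price of three false alarms, while the high threshold makes no false alarms (precision 1.0) but finds only one of the four positives.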
Cost-Sensitive Threshold Selection
Total Classification Cost
$$\text{Cost} = C_{FP} \cdot FP + C_{FN} \cdot FN$$
Here,
- $C_{FP}$ = Cost of a false positive
- $C_{FN}$ = Cost of a false negative
- $FP$ = Number of false positives
- $FN$ = Number of false negatives
from sklearn.metrics import confusion_matrix

def calculate_cost(y_true, y_pred, cost_fp=1, cost_fn=10):
    """Calculate total misclassification cost given hard predictions."""
    # labels=[0, 1] guarantees a 2x2 matrix even if one class is never predicted
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return fp * cost_fp + fn * cost_fn

# Reuses y_test, y_proba and optimal_threshold_youden from the threshold-tuning code above.
# Round the grid so accumulated floating-point error (e.g. 0.30000000000000004) does not leak in.
thresholds_to_try = np.round(np.arange(0.1, 0.9, 0.05), 2)
costs = []
print(f"\n{'Threshold':<12} {'FP Cost':<10} {'FN Cost':<10} {'Total Cost':<12} {'Accuracy'}")
print("-" * 60)
for threshold in thresholds_to_try:
    y_pred = (y_proba >= threshold).astype(int)
    cost = calculate_cost(y_test, y_pred, cost_fp=1, cost_fn=10)
    accuracy = (y_test == y_pred).mean()
    costs.append(cost)
    # Compare floats with a tolerance instead of `in`, which tests exact equality
    if np.isclose(threshold, [0.3, 0.5, 0.7, optimal_threshold_youden]).any():
        cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
        fp_cost = cm[0, 1] * 1
        fn_cost = cm[1, 0] * 10
        print(f"{threshold:<12.2f} {fp_cost:<10} {fn_cost:<10} {cost:<12} {accuracy:.4f}")
optimal_cost_threshold = thresholds_to_try[np.argmin(costs)]
print(f"\nOptimal threshold for cost sensitivity: {optimal_cost_threshold:.4f}")
Multi-Class Evaluation Strategies
One-vs-Rest and One-vs-One
MULTI-CLASS EVALUATION STRATEGIES:
Given 3 classes: A, B, C
ONE-vs-REST (OvR):
• Train K binary classifiers
• A vs (B+C), B vs (A+C), C vs (A+B)
• Combine predictions
ONE-vs-ONE (OvO):
• Train K(K-1)/2 binary classifiers
• A vs B, A vs C, B vs C
• Majority voting
MACRO vs MICRO vs WEIGHTED:
• Macro: Average metric across classes (treats all classes equally)
• Micro: Aggregate TP, FP, FN across classes (biased toward majority)
• Weighted: Weight by class frequency
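The classifier counts for the two decomposition strategies can be checked with scikit-learn's `OneVsRestClassifier` and `OneVsOneClassifier` wrappers. The sketch below uses a synthetic 4-class dataset and logistic regression as the base estimator; both choices are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier

# Synthetic 4-class problem (an assumption for illustration)
k = 4
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=6,
    n_classes=k, random_state=0
)

ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)
ovo = OneVsOneClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

# Each wrapper stores its fitted binary models in `estimators_`
print(f"OvR binary classifiers: {len(ovr.estimators_)} (K = {k})")
print(f"OvO binary classifiers: {len(ovo.estimators_)} (K(K-1)/2 = {k * (k - 1) // 2})")
```

With K = 4 this gives 4 binary models for OvR and 6 for OvO. OvO grows quadratically in K, but each of its models is trained only on the samples of two classes.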
Definition: Macro vs Micro Averaging
Macro averaging computes the metric independently for each class and takes the average, treating all classes equally regardless of size. Micro averaging aggregates the contributions of all classes to compute the average metric, which is dominated by the majority class. Weighted averaging weights each class's metric by its support (number of true instances).
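The difference between the averaging schemes shows up clearly on a small imbalanced example. The sketch below uses recall as the metric and hypothetical toy labels (6, 3, and 1 samples per class) invented for illustration.

```python
import numpy as np
from sklearn.metrics import recall_score

# Hypothetical imbalanced 3-class labels (6, 3, and 1 samples), invented for illustration
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 2])
y_pred = np.array([0, 0, 0, 0, 0, 1, 1, 1, 0, 1])

per_class = recall_score(y_true, y_pred, average=None)
macro = recall_score(y_true, y_pred, average='macro')
micro = recall_score(y_true, y_pred, average='micro')
weighted = recall_score(y_true, y_pred, average='weighted')

print(f"Per-class recall: {np.round(per_class, 3)}")  # [0.833 0.667 0.   ]
print(f"Macro:    {macro:.3f}")     # 0.500
print(f"Micro:    {micro:.3f}")     # 0.700
print(f"Weighted: {weighted:.3f}")  # 0.700
```

The single sample of class 2 is missed, which pulls macro recall down to 0.50 while micro recall stays at 0.70. For recall specifically, weighted averaging coincides with micro averaging; for precision or F1 the two generally differ.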
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    roc_curve, auc, precision_recall_curve, average_precision_score,
    roc_auc_score
)
from sklearn.preprocessing import label_binarize
from itertools import cycle
import warnings
warnings.filterwarnings('ignore')
np.random.seed(42)
print("\n" + "=" * 70)
print("MULTI-CLASS EVALUATION")
print("=" * 70)
# Load digits dataset (10 classes)
digits = load_digits()
X, y = digits.data, digits.target
# Use only first 5 classes for clarity
X = X[y < 5]
y = y[y < 5]
n_classes = len(np.unique(y))
print(f"Dataset: {X.shape}")
print(f"Number of classes: {n_classes}")
# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
# Binarize labels for ROC
y_test_bin = label_binarize(y_test, classes=range(n_classes))
# Train classifier
clf = RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)
clf.fit(X_train, y_train)
y_proba = clf.predict_proba(X_test)
# Calculate different AUC averaging methods
macro_auc = roc_auc_score(y_test_bin, y_proba, average='macro')
micro_auc = roc_auc_score(y_test_bin, y_proba, average='micro')
weighted_auc = roc_auc_score(y_test_bin, y_proba, average='weighted')
print(f"\nMacro AUC: {macro_auc:.4f} (treats all classes equally)")
print(f"Micro AUC: {micro_auc:.4f} (aggregates all predictions)")
print(f"Weighted AUC: {weighted_auc:.4f} (weights by class frequency)")
Key Takeaways
- Confusion Matrix is the foundation for all metrics
- Precision measures how many positive predictions are correct
- Recall measures how many actual positives are captured
- F1 Score balances precision and recall
- ROC Curve shows performance across all thresholds
- AUC provides a single number summary (threshold-independent)
- PR Curves are better for imbalanced datasets
- Threshold tuning is crucial for real-world applications
- Multi-class evaluation requires macro/micro/weighted averaging
- Always consider costs when choosing thresholds
Summary Table
| Metric | Formula | Best For | Interpretation |
|---|---|---|---|
| Accuracy | (TP+TN)/N | Balanced data | Overall correctness |
| Precision | TP/(TP+FP) | High FP cost | Positive prediction quality |
| Recall | TP/(TP+FN) | High FN cost | Capture all positives |
| F1 | 2·P·R/(P+R) | Balance P&R | Harmonic mean |
| AUC-ROC | ∫TPR dFPR | Balanced data | Threshold-independent |
| AP | ∫P dR | Imbalanced data | Positive class focus |
When to Use What
DECISION GUIDE:
Is your data balanced?
├── Yes → Use AUC-ROC
└── No → Use PR AUC (Average Precision)
What matters more?
├── Catch all positives (e.g., disease) → Optimize Recall
├── Don't raise false alarms (e.g., spam) → Optimize Precision
└── Balance both → Optimize F1 or F-beta
Multi-class?
├── All classes equally important → Macro average
├── Care more about majority → Micro average
└── Weight classes by frequency → Weighted average
Summary: Model Evaluation (ROC, AUC, PR Curves)
- The confusion matrix (TP, TN, FP, FN) is the foundation from which all classification metrics are derived.
- Accuracy = (TP+TN)/N is misleading for imbalanced data; prefer precision, recall, and F1.
- The F1 score is the harmonic mean of precision and recall; it penalizes extreme imbalances between the two.
- The ROC curve plots TPR vs FPR across thresholds; AUC gives a threshold-independent summary. AUC = P(positive ranked higher than negative).
- PR curves are more informative than ROC for imbalanced datasets because they focus on the positive class. The baseline is the positive class prevalence, not 0.5.
- Average Precision (AP) is the area under the PR curve and provides a single-number summary for imbalanced evaluation.
- Threshold selection should be driven by the cost structure: Youden's J for balanced costs, cost-sensitive optimization for asymmetric costs.
- For multi-class problems, use macro (equal class weight), micro (aggregate all), or weighted (frequency-based) averaging of AUC.
- Always report confidence intervals (via bootstrap) for AUC to understand uncertainty in your evaluation.
- Match your metric to your problem: ROC-AUC for balanced data, PR-AUC for imbalanced data, cost-sensitive metrics when error types have different consequences.
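The bootstrap confidence interval mentioned in the list above can be sketched as follows. This is a percentile bootstrap on synthetic test-set scores; the data, the number of resamples, and the 95% level are assumptions chosen for illustration rather than a prescribed procedure.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)

# Synthetic test-set labels and scores (an assumption for illustration)
y_true = (rng.random(500) < 0.3).astype(int)
y_score = rng.normal(loc=1.0 * y_true, scale=1.0)

n_bootstraps = 1000
boot_aucs = []
for _ in range(n_bootstraps):
    # Resample test indices with replacement
    idx = rng.integers(0, len(y_true), size=len(y_true))
    if len(np.unique(y_true[idx])) < 2:
        continue  # AUC is undefined when a resample contains a single class
    boot_aucs.append(roc_auc_score(y_true[idx], y_score[idx]))

point_estimate = roc_auc_score(y_true, y_score)
ci_low, ci_high = np.percentile(boot_aucs, [2.5, 97.5])

print(f"AUC = {point_estimate:.3f} (95% CI: {ci_low:.3f} to {ci_high:.3f})")
```

Only the test set is resampled here, so the interval reflects uncertainty from the finite test sample and not variability from retraining the model.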
Practice Exercises
Exercise 1: Medical Diagnosis
"""
Build a medical diagnosis system:
1. Generate imbalanced dataset (5% disease prevalence)
2. Train multiple classifiers
3. Compare ROC and PR curves
4. Choose threshold minimizing FN (missing disease)
5. Calculate cost of different error types
"""
# Your code here
Exercise 2: Anomaly Detection
"""
Anomaly detection evaluation:
1. Generate data with 1% anomalies
2. Train anomaly detector
3. Evaluate with PR curve (not ROC!)
4. Find threshold with 95% recall
5. Report precision at that threshold
"""
# Your code here
Exercise 3: Multi-class Comparison
"""
Compare multi-class evaluation strategies:
1. Use Iris or Digits dataset
2. Train 3 classifiers
3. Calculate macro, micro, weighted AUC
4. Analyze per-class performance
5. Identify worst-performing classes
"""
# Your code here
Exercise 4: Cost-Sensitive Learning
"""
Implement cost-sensitive evaluation:
1. Define cost matrix for different errors
2. Train standard classifier
3. Find cost-optimal threshold
4. Compare with default threshold (0.5)
5. Visualize cost landscape
"""
# Your code here
Congratulations! You've completed Module 2: Machine Learning. You now have a solid foundation in:
- Cross-validation and model evaluation
- Hyperparameter tuning strategies
- Unsupervised learning (clustering)
- Dimensionality reduction (PCA)
- Comprehensive model evaluation metrics
Next: We'll explore Deep Learning fundamentals in Module 3!