Introduction
This guide covers techniques for evaluating models and choosing among them: cross-validation, classification metrics, and the confusion matrix.
Cross-Validation
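from sklearn.model_selection import cross_val_score, KFold
# 5-fold cross-validation; `model` is any unfitted estimator, X and y are the features and targets
scores = cross_val_score(model, X, y, cv=5)
print(f"Mean score: {scores.mean():.3f} (std {scores.std():.3f})")
# Custom CV: shuffle before splitting, with a fixed seed so the folds are reproducible
kf = KFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=kf)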
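The snippet above leaves `model`, `X`, and `y` undefined. A minimal end-to-end sketch follows; the breast cancer dataset, the logistic regression pipeline, and the use of `StratifiedKFold` are illustrative assumptions, not part of the original material.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumed example data: a small binary classification dataset bundled with scikit-learn
X, y = load_breast_cancer(return_X_y=True)

# Scaling sits inside the pipeline so it is refit on each training fold,
# which keeps information from the validation fold out of preprocessing
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Stratified folds preserve the class ratio in every split
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print("Fold accuracies:", scores.round(3))
print(f"Mean = {scores.mean():.3f}, std = {scores.std():.3f}")
```

Reporting the standard deviation alongside the mean gives a rough sense of how much the score depends on the particular split.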
Metrics
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    confusion_matrix, roc_auc_score, roc_curve
)
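# Classification metrics; assumes `model` is already fitted on the training set
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
# average="weighted" averages the per-class scores, weighted by each class's support
precision = precision_score(y_test, y_pred, average="weighted")
recall = recall_score(y_test, y_pred, average="weighted")
f1 = f1_score(y_test, y_pred, average="weighted")
# ROC-AUC needs scores rather than labels; column 1 is the positive-class probability (binary case)
y_prob = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, y_prob)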
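`roc_curve` is imported above but never called. The following sketch shows one way to obtain the points of the ROC curve for a binary classifier; the dataset, the train/test split, and the model are assumptions chosen only to make the example self-contained.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumed example data and split
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Probability of the positive class for each test sample
y_prob = model.predict_proba(X_test)[:, 1]

# One (false positive rate, true positive rate) pair per decision threshold
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
auc = roc_auc_score(y_test, y_prob)

print(f"AUC = {auc:.3f} over {len(thresholds)} thresholds")
```

Plotting `fpr` against `tpr` (for example with `matplotlib.pyplot.plot(fpr, tpr)`) draws the curve; the diagonal from (0, 0) to (1, 1) corresponds to random guessing.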
Confusion Matrix
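from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns
# Rows are true labels, columns are predicted labels
cm = confusion_matrix(y_test, y_pred)
ax = sns.heatmap(cm, annot=True, fmt="d")
ax.set(xlabel="Predicted label", ylabel="True label")
plt.show()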
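To make the layout of the matrix concrete, here is a small sketch on hand-written labels (the six labels are invented for illustration). In scikit-learn's convention, rows are true classes and columns are predicted classes, so for a binary problem `ravel()` yields the counts in the order TN, FP, FN, TP.

```python
from sklearn.metrics import confusion_matrix

# Made-up binary labels, purely for illustration
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

cm = confusion_matrix(y_true, y_pred)
# cm is [[2, 1],
#        [1, 2]]  (rows: true 0, true 1; columns: predicted 0, predicted 1)

# Unpack the four cells of the binary matrix
tn, fp, fn, tp = cm.ravel()

# Precision and recall follow directly from the cell counts
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(cm)
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"precision={precision:.3f} recall={recall:.3f}")
```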
Practice Problems
- Perform k-fold cross-validation
- Calculate multiple metrics
- Plot ROC curves
- Use a confusion matrix
- Compare models statistically
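For the last problem, one possible starting point is a paired t-test on per-fold scores from two models evaluated on identical folds. This is a sketch under stated assumptions: the dataset and both models are arbitrary choices, and because cross-validation folds share training data the scores are not independent, so the resulting p-value tends to be optimistic and should be read as a rough indication only.

```python
from scipy.stats import ttest_rel
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumed example data and candidate models
X, y = load_breast_cancer(return_X_y=True)
model_a = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model_b = DecisionTreeClassifier(random_state=0)

# A fixed seed makes both models see exactly the same 10 folds,
# so the scores can be compared pairwise
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores_a = cross_val_score(model_a, X, y, cv=cv)
scores_b = cross_val_score(model_b, X, y, cv=cv)

# Paired t-test on the per-fold differences
result = ttest_rel(scores_a, scores_b)
t_stat, p_value = result.statistic, result.pvalue

print(f"Model A mean accuracy: {scores_a.mean():.3f}")
print(f"Model B mean accuracy: {scores_b.mean():.3f}")
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```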