Model Interpretability — Complete Guide
Interpretability explains why a model makes specific predictions. It is essential for trust, debugging, and regulatory compliance.
Interpretability Methods
Global (model-level):
├─ Feature importance (tree-based)
├─ Permutation importance
├─ Partial dependence plots
└─ SHAP summary plots
Local (prediction-level):
├─ LIME
├─ SHAP waterfall plots
├─ Counterfactual explanations
└─ Anchors
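To make one of the global methods concrete, here is a minimal from-scratch sketch of permutation importance: shuffle one feature column at a time and measure how much accuracy drops. The function name `permutation_importance`, the toy model `toy_predict`, and the synthetic data are all assumptions made for illustration. In practice a library implementation would normally be used instead.

```python
import random

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled.

    predict: callable taking a list of rows and returning a list of labels.
    X: list of rows (lists of numbers); y: list of true labels.
    """
    rng = random.Random(seed)

    def accuracy(rows):
        preds = predict(rows)
        return sum(p == t for p, t in zip(preds, y)) / len(y)

    baseline = accuracy(X)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link to the target.
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(shuffled))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy model (hypothetical): predicts 1 when feature 0 is positive, ignores feature 1.
def toy_predict(rows):
    return [1 if row[0] > 0 else 0 for row in rows]

data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
y = toy_predict(X)

importances = permutation_importance(toy_predict, X, y)
print(importances)  # feature 0 scores well above feature 1, which scores 0.0
```

Because the toy model never reads feature 1, shuffling it leaves every prediction unchanged and its importance is exactly zero. Shuffling feature 0 pushes accuracy toward chance.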
SHAP Implementation
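import shap
# TreeExplainer is the fast explainer for tree models; `model` must already be fitted
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
# Note: for some classifiers and SHAP versions, shap_values holds one set of
# values per class; select a single class before plotting or indexing rows
# Summary plot: global view of feature impact across X_test
shap.summary_plot(shap_values, X_test)
# Force plot (single prediction): explains row 0 of X_test
shap.initjs()  # loads the JavaScript that force plots need in a notebook
shap.force_plot(explainer.expected_value, shap_values[0], X_test.iloc[0])
# Dependence plot: replace "feature_name" with a real column of X_test
shap.dependence_plot("feature_name", shap_values, X_test)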
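The property that makes these plots readable is additivity: the base value plus the per-feature attributions equals the model output for that row. The sketch below illustrates this by computing exact Shapley values for a tiny toy function, enumerating every feature coalition. It is not how the `shap` library computes values internally. The names `exact_shapley` and `toy_model` are hypothetical, and replacing "absent" features with a fixed baseline vector is a simplifying assumption (SHAP explainers typically average over background data instead).

```python
from itertools import combinations
from math import factorial

def exact_shapley(f, x, baseline):
    """Exact Shapley values for f at x, with absent features set to baseline."""
    n = len(x)

    def value(subset):
        # Features in `subset` keep their real value; the rest take the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

# Toy model (hypothetical) with an interaction between features 0 and 1.
def toy_model(z):
    return 2.0 * z[0] + z[0] * z[1] + 3.0 * z[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(toy_model, x, baseline)
base_value = toy_model(baseline)

print(phi)                                 # approximately [3.0, 1.0, 9.0]
print(base_value + sum(phi), toy_model(x)) # both approximately 13.0
```

The interaction term `z[0] * z[1]` contributes 2.0 to the output and is split evenly between features 0 and 1, which is why feature 0 receives 3.0 rather than 2.0. Enumeration costs grow exponentially with the number of features, so this approach is only practical for very small inputs.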
LIME Implementation
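from lime.lime_tabular import LimeTabularExplainer
# Build the explainer from training data so perturbations follow its feature statistics
explainer = LimeTabularExplainer(
    X_train.values,
    feature_names=feature_names,
    class_names=['Not Fraud', 'Fraud'],
    mode='classification'
)
# Explain a single prediction; the model must expose predict_proba
explanation = explainer.explain_instance(
    X_test.iloc[0].values,
    model.predict_proba,
    num_features=10  # number of features shown in the explanation
)
# Renders in a Jupyter notebook; explanation.as_list() returns (feature, weight) pairs instead
explanation.show_in_notebook()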
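The idea behind this call can be sketched in a few lines: perturb the instance, query the model on the perturbed points, weight each point by its proximity to the instance, and fit a weighted linear model. The sketch below assumes NumPy is available and is only an illustration of the principle. The function `local_surrogate`, the Gaussian sampling, the kernel form, and the toy `black_box` are assumptions, and the real library differs in its sampling, discretization, and feature selection.

```python
import numpy as np

def local_surrogate(predict, x, n_samples=2000, scale=0.5, kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear model to `predict` around the point x.

    Returns (intercept, coefficients) of the local linear approximation.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # 1. Perturb: sample points in a Gaussian neighbourhood of x.
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    # 2. Query the black-box model on the perturbed points.
    targets = predict(Z)
    # 3. Weight each sample by its proximity to x (exponential kernel).
    distances = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)
    # 4. Weighted least squares: scale rows by sqrt(weight), then solve.
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], targets * sw, rcond=None)
    return coef[0], coef[1:]

# Toy black box (hypothetical): nonlinear in feature 0, constant in feature 1.
def black_box(Z):
    return Z[:, 0] ** 2

intercept, coefs = local_surrogate(black_box, [3.0, 1.0])
print(coefs)  # slope for feature 0 lands near 6 (the local gradient); feature 1 near 0
```

The surrogate is only valid near the explained instance: the same black box explained at a different point would yield a different slope, which is the sense in which LIME explanations are local.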
Key Takeaways
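- SHAP attributes each prediction to features using Shapley values, so the attributions sum to the difference between the prediction and the expected model output
- LIME explains a single prediction by fitting a simple surrogate model to the black-box model's behavior near that instance
- Feature importance ranks features by their overall contribution to the model
- Partial dependence plots show how the average prediction changes as one feature varies
- Counterfactuals explain "what would need to change" to obtain a different prediction
- Model-agnostic methods (LIME, permutation importance) need only the model's prediction function, so they work with any model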
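- Regulations such as GDPR and the EU AI Act impose transparency obligations on some automated decisions and high-risk AI systems, so interpretability can be a legal requirement depending on the use case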
- Use interpretability for debugging and trust-building
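As an illustration of the counterfactual idea from the takeaways, the following minimal sketch searches along a single feature for the smallest change that flips a decision. The function `find_counterfactual`, the toy rule `approve`, and the applicant values are hypothetical. Practical counterfactual methods search over several features at once and add constraints such as plausibility and actionability.

```python
def find_counterfactual(predict, x, feature, step, max_steps=100):
    """Search along one feature for the smallest change that flips predict(x).

    Returns a modified copy of x, or None if no flip is found within
    max_steps in either direction.
    """
    original = predict(x)
    for k in range(1, max_steps + 1):
        for direction in (1, -1):
            candidate = list(x)
            candidate[feature] = x[feature] + direction * k * step
            if predict(candidate) != original:
                return candidate
    return None

# Toy credit rule (hypothetical): approve when income minus half the debt reaches 50.
def approve(applicant):
    income, debt = applicant
    return income - 0.5 * debt >= 50

applicant = [40, 20]  # income 40, debt 20: score 30, so rejected
counterfactual = find_counterfactual(approve, applicant, feature=0, step=5)
print(counterfactual)  # [60, 20]
```

Read as an explanation, the result says: "the application would have been approved if income had been 60 instead of 40, with debt unchanged."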