Model Interpretability: SHAP, LIME and Feature Importance

Why Interpretability Matters

Modern ML models achieve remarkable predictive accuracy, but black-box predictions are insufficient when trust, debugging, and regulatory compliance are required.

The Interpretability Imperative

Stakeholder	Need	Example
Regulators	Legal compliance	EU GDPR Article 22 (automated decision-making, often cited as a "right to explanation")
Domain Experts	Scientific validation	Do feature effects align with theory?
Engineers	Debugging and monitoring	Which features drive distribution shifts?
End Users	Trust and adoption	Why was my loan denied?

When Interpretability Is Critical

⚠️

A model with 99% accuracy that cannot explain its predictions is risky in regulated industries, where interpretability is often not optional: it can be a legal and ethical requirement.

Interpretability Spectrum

Models range from fully interpretable (you can read the rules) to completely opaque (only inputs and outputs visible).

Intrinsic vs Post-Hoc Interpretability

Category	Method	Pros	Cons
Intrinsic	Linear models, trees	Built-in, no extra computation	Limited to simple models
Post-Hoc (Model-Specific)	Tree feature importance	Fast, native to model	Only for that model type
Post-Hoc (Model-Agnostic)	SHAP, LIME, PDP	Works on any model	Computationally expensive

Feature Importance

Permutation Feature Importance

Measures importance by randomly shuffling each feature and measuring the drop in model performance.

Algorithm:

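Train model $f$, compute baseline score $s = \text{score}(f, X, y)$
For each feature $j$:
- Create permuted dataset $\tilde{X}^{(j)}$ (column $j$ shuffled randomly)
- Compute $s_j = \text{score}(f, \tilde{X}^{(j)}, y)$
Feature importance: $I_j = s - s_j$

Theoretical Foundation:

Permutation importance approximates the expected performance drop:

$$I_j \approx \mathbb{E}\left[\text{score}(f, X, y)\right] - \mathbb{E}\left[\text{score}(f, \tilde{X}^{(j)}, y)\right]$$

where $\tilde{X}^{(j)}$ denotes $X$ with feature $j$ replaced by a random permutation.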

ℹ️

Permutation importance is model-agnostic and measures the decrease in model performance when a single feature's values are randomly shuffled. It captures both linear and nonlinear effects.
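
The shuffle-and-rescore loop can be sketched without any library support. The following is a minimal illustration, not scikit-learn's implementation: it uses mean squared error as the metric (so importance is the increase in loss rather than the drop in score), and `toy_model`, the toy data, and `permutation_importance_sketch` are hypothetical names introduced only for this example.

```python
import random

def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance_sketch(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in MSE per feature after shuffling that feature's column."""
    rng = random.Random(seed)
    baseline = mean_squared_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j while leaving every other column untouched
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mean_squared_error(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Toy data (assumption): the target depends on feature 0 only; feature 1 is noise
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
y = [3.0 * row[0] for row in X]

def toy_model(row):
    # Stand-in for a fitted model that has learned the true relationship
    return 3.0 * row[0]

importances = permutation_importance_sketch(toy_model, X, y)
print(importances)
```

Because `toy_model` never reads feature 1, shuffling that column leaves the predictions unchanged and its importance is exactly zero, while shuffling feature 0 produces a clearly positive increase in error.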

Implementation

import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer

# Load data
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42
)

# Train model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Permutation importance
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=42
)

# Display results
import pandas as pd
importance_df = pd.DataFrame({
    'Feature': data.feature_names,
    'Importance_Mean': result.importances_mean,
    'Importance_Std': result.importances_std
}).sort_values('Importance_Mean', ascending=False)

print(importance_df.head(10))

Feature Importance Comparison

Gini Importance (Mean Decrease in Impurity)

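For tree-based models, Gini importance measures the total reduction of impurity (Gini or entropy) contributed by each feature across all trees:

$$\text{Imp}(j) = \sum_{t=1}^{T} \sum_{n \in t \,:\, \text{split}(n) = j} \Delta i(n, t)$$

where $\Delta i(n, t)$ is the impurity decrease at node $n$ in tree $t$, and the inner sum runs over the nodes of tree $t$ that split on feature $j$.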
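
As a rough illustration of the impurity-decrease sum, the sketch below works on a hand-written forest in which each split records the class counts of the parent and child nodes. The data structure, the sample-fraction weighting, and the final normalization to sum to 1 are assumptions chosen for this example; library implementations differ in such details.

```python
def gini(counts):
    """Gini impurity of a node given its per-class sample counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def mdi_importance(forest, n_features):
    """Sum sample-weighted impurity decreases per feature, then normalize."""
    totals = [0.0] * n_features
    for tree in forest:
        n_root = sum(tree[0]["parent"])  # first split in each tree is the root
        for split in tree:
            n_parent = sum(split["parent"])
            n_left, n_right = sum(split["left"]), sum(split["right"])
            decrease = (
                gini(split["parent"])
                - (n_left / n_parent) * gini(split["left"])
                - (n_right / n_parent) * gini(split["right"])
            )
            # Weight by the fraction of training samples reaching this node
            totals[split["feature"]] += (n_parent / n_root) * decrease
    grand_total = sum(totals)
    return [t / grand_total for t in totals]

# Hypothetical two-tree forest on a balanced binary problem (100 samples)
forest = [
    [
        {"feature": 0, "parent": [50, 50], "left": [45, 5], "right": [5, 45]},
        {"feature": 1, "parent": [45, 5], "left": [45, 0], "right": [0, 5]},
    ],
    [
        {"feature": 0, "parent": [50, 50], "left": [50, 0], "right": [0, 50]},
    ],
]

mdi = mdi_importance(forest, n_features=2)
print(mdi)
```

Feature 0 dominates because it is used at both roots, where all samples are affected; feature 1 only refines a node that half the samples reach.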

SHAP (SHapley Additive exPlanations)

Game Theory Foundation

SHAP is grounded in cooperative game theory. Each feature is treated as a "player" in a game, and SHAP values compute the fair marginal contribution of each feature.

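The Shapley value for feature $i$ is:

$$\phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,(|F| - |S| - 1)!}{|F|!} \left[ v(S \cup \{i\}) - v(S) \right]$$

where:

$F$ is the set of all features
$S$ is a subset of features not including $i$
$v(S)$ is the value function (model prediction using features in $S$)
$\frac{|S|!\,(|F| - |S| - 1)!}{|F|!}$ is the weighting factor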
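
For a handful of features the Shapley value can be computed directly by enumerating subsets. The sketch below does that for a small hypothetical value function `v` (an assumption made up for illustration, standing in for "model prediction using only the features in the coalition"); the cost grows exponentially with the number of features, so this is only practical for toy cases.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_players):
    """Exact Shapley values by enumerating every subset that excludes player i."""
    players = range(n_players)
    phi = [0.0] * n_players
    for i in players:
        others = [p for p in players if p != i]
        for size in range(len(others) + 1):
            # Weight |S|! (|F| - |S| - 1)! / |F|! depends only on the subset size
            weight = (
                factorial(size) * factorial(n_players - size - 1)
                / factorial(n_players)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

def v(coalition):
    """Hypothetical value function: features 0 and 1 matter and interact."""
    value = 0.0
    if 0 in coalition:
        value += 10.0
    if 1 in coalition:
        value += 4.0
    if 0 in coalition and 1 in coalition:
        value += 6.0  # interaction bonus, shared between features 0 and 1
    return value      # feature 2 never changes the value

phi = shapley_values(v, n_players=3)
print(phi)
```

The interaction bonus of 6 is split evenly between features 0 and 1, the unused feature 2 receives zero, and the values sum to the difference between the full coalition and the empty one.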

Key SHAP Properties

Property	Definition	Implication
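Efficiency	$\sum_{i} \phi_i = f(x) - \mathbb{E}[f(X)]$	SHAP values sum to the deviation from the expected prediction
Symmetry	If $i$ and $j$ contribute equally, $\phi_i = \phi_j$	Fair allocation
Null Player	If $i$ doesn't affect output, $\phi_i = 0$	Irrelevant features get zero
Linearity	$\phi_i(v + w) = \phi_i(v) + \phi_i(w)$	Additive decomposition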
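
These properties can be checked numerically on a small game. The sketch below uses the equivalent "average over orderings" formulation of the Shapley value (each player's marginal contribution, averaged over every order in which players can join). The two games are hypothetical examples constructed so that players 0 and 1 are interchangeable and player 2 is a null player in the first game.

```python
from itertools import permutations

def shapley_by_orderings(value_fn, n_players):
    """Average each player's marginal contribution over all join orders."""
    phi = [0.0] * n_players
    orderings = list(permutations(range(n_players)))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            phi[player] += value_fn(coalition | {player}) - value_fn(coalition)
            coalition = coalition | {player}
    return [p / len(orderings) for p in phi]

def game(coalition):
    # Players 0 and 1 are symmetric; player 2 never changes the value
    return 5.0 * len(coalition & {0, 1}) ** 2

def other_game(coalition):
    # Only player 2 matters here
    return 2.0 if 2 in coalition else 0.0

def combined_game(coalition):
    return game(coalition) + other_game(coalition)

phi_game = shapley_by_orderings(game, 3)
phi_other = shapley_by_orderings(other_game, 3)
phi_combined = shapley_by_orderings(combined_game, 3)

everyone = frozenset({0, 1, 2})
print("efficiency: ", sum(phi_game), "==", game(everyone) - game(frozenset()))
print("symmetry:   ", phi_game[0], "==", phi_game[1])
print("null player:", phi_game[2])
print("linearity:  ", phi_combined, "==", [a + b for a, b in zip(phi_game, phi_other)])
```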

SHAP Additive Explanation

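The SHAP explanation model is:

$$g(z') = \phi_0 + \sum_{i=1}^{M} \phi_i z'_i$$

where $z'_i \in \{0, 1\}$ indicates whether feature $i$ is included, $M$ is the number of features, and $\phi_0 = \mathbb{E}[f(X)]$.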
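
A linear model gives a convenient worked case of the additive form. Assuming independent features, a standard result is that the SHAP value of feature $i$ is its weight times its deviation from the feature mean, $\phi_i = w_i (x_i - \mathbb{E}[x_i])$, with base value $\phi_0 = f(\mathbb{E}[X])$. The weights, means, and instance below are made-up numbers used only to show the bookkeeping.

```python
# Hypothetical linear model f(x) = bias + sum_i w_i * x_i
weights = [2.0, -1.0, 0.5]
bias = 1.0
feature_means = [1.0, 4.0, 10.0]  # assumed E[x_i] over the background data
x = [3.0, 2.0, 10.0]              # instance to explain

def f(values):
    return bias + sum(w * v for w, v in zip(weights, values))

# Base value: for a linear model, E[f(X)] equals f evaluated at the feature means
phi_0 = f(feature_means)

# Per-feature contributions under the independence assumption
phi = [w * (xi - mu) for w, xi, mu in zip(weights, x, feature_means)]

def g(z):
    """Additive explanation model; z[i] = 1 means feature i is included."""
    return phi_0 + sum(p * zi for p, zi in zip(phi, z))

print("phi_0 =", phi_0)                    # base value
print("phi   =", phi)                      # contributions
print("g(all features) =", g([1, 1, 1]))   # matches f(x)
print("f(x)            =", f(x))
```

With every feature included, the explanation model reproduces the prediction; with none included, it falls back to the base value.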

TreeSHAP: Efficient Exact Computation

For tree ensembles, TreeSHAP computes exact Shapley values in polynomial time (not exponential):

$$O(TLD^2)$$

where:

$T$ = number of trees
$L$ = maximum number of leaves in any tree
$D$ = maximum depth

TreeSHAP avoids enumerating all feature subsets by exploiting the recursive structure of trees.
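
To get a feel for the gap, the following back-of-the-envelope sketch compares the number of feature subsets a brute-force computation ranges over with the $T \cdot L \cdot D^2$ term. The ensemble size is a hypothetical example, and big-O constants are ignored, so the figures are order-of-magnitude only.

```python
def subset_count(n_features):
    """Number of feature subsets a brute-force Shapley computation ranges over."""
    return 2 ** n_features

def treeshap_cost(n_trees, n_leaves, max_depth):
    """Order-of-magnitude operation count T * L * D^2 (constants ignored)."""
    return n_trees * n_leaves * max_depth ** 2

# Hypothetical ensemble: 100 trees of depth 6 (at most 2^6 = 64 leaves), 30 features
brute = subset_count(30)
tree = treeshap_cost(n_trees=100, n_leaves=64, max_depth=6)

print(f"feature subsets: {brute:,}")
print(f"T * L * D^2:     {tree:,}")
```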

SHAP Implementation

import shap
import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load data and train XGBoost
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42
)

model = xgb.XGBClassifier(n_estimators=100, eval_metric='logloss')
model.fit(X_train, y_train)

# Compute SHAP values using TreeSHAP
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Summary plot (global feature importance)
shap.summary_plot(shap_values, X_test, feature_names=data.feature_names)

# Waterfall plot (single prediction)
shap.waterfall_plot(shap.Explanation(
    values=shap_values[0],
    base_values=explainer.expected_value,
    data=X_test[0],
    feature_names=data.feature_names
))

# Dependence plot (feature interaction)
shap.dependence_plot("worst radius", shap_values, X_test, 
                     feature_names=data.feature_names)

SHAP for Deep Learning (DeepSHAP / GradientSHAP)

For neural networks, approximate Shapley values using:

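import shap
import torch
import torchvision

# Load pretrained model (older torchvision versions use pretrained=True instead)
model = torchvision.models.resnet18(weights="DEFAULT")
model.eval()

# Background samples define the reference distribution. Random noise is only a
# placeholder here; a sample of real training images is the usual choice.
background = torch.randn(100, 3, 224, 224)

# Batch of preprocessed images to explain, shape (N, 3, 224, 224).
# Placeholder values; replace with real images.
test_images = torch.randn(2, 3, 224, 224)

# DeepExplainer for deep learning
e = shap.DeepExplainer(model, background)
shap_values = e.shap_values(test_images)

# Visualize (image_plot expects channel-last arrays, so the inputs and SHAP
# values may need transposing from NCHW to NHWC first)
shap.image_plot(shap_values, -test_images.numpy())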

LIME (Local Interpretable Model-agnostic Explanations)

Core Idea

LIME explains individual predictions by fitting a local linear model around the instance of interest in the perturbed input space.

LIME Algorithm

Objective:

$$\xi(x) = \arg\min_{g \in G} \; \mathcal{L}(f, g, \pi_x) + \Omega(g)$$

where:

$f$ is the black-box model
$g \in G$ is an interpretable model (e.g., linear model)
$\pi_x$ is a kernel measuring proximity to instance $x$
$\Omega(g)$ is a complexity penalty (e.g., number of features)

Step-by-step:

Perturb: Generate samples $z_1, \dots, z_N$ around $x$ by sampling from a perturbation distribution
Predict: Get $f(z_k)$ for each perturbed sample
Weight: Compute weights $w_k = \pi_x(z_k)$
Fit: Train interpretable model $g$ on weighted dataset
Explain: Return coefficients of $g$ as local explanation
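
The five steps can be sketched in a few lines of NumPy. This is a simplified illustration and not the `lime` library's procedure: it assumes Gaussian perturbations, an exponential kernel on Euclidean distance, a plain weighted least-squares surrogate with no complexity penalty, and a hypothetical `black_box` function in place of a trained model.

```python
import numpy as np

def lime_sketch(predict, x, n_samples=2000, scale=0.5, kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear surrogate around x; return (intercept, slopes)."""
    rng = np.random.default_rng(seed)

    # 1. Perturb: sample points around the instance (Gaussian noise is an assumption)
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.shape[0]))

    # 2. Predict: query the black-box model on the perturbed samples
    y = predict(Z)

    # 3. Weight: closer samples count more (exponential kernel, assumed form)
    distances = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(distances ** 2) / kernel_width ** 2)

    # 4. Fit: weighted least squares, done by scaling rows with sqrt(weight)
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    sqrt_w = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sqrt_w[:, None], y * sqrt_w, rcond=None)

    # 5. Explain: the slopes are the local feature effects
    return coef[0], coef[1:]

def black_box(Z):
    # Hypothetical nonlinear model standing in for a trained classifier or regressor
    return np.sin(Z[:, 0]) + Z[:, 1] ** 2

x = np.array([0.0, 1.0])
intercept, slopes = lime_sketch(black_box, x)
print("local slopes:", slopes)
```

At this instance the true gradient of `black_box` is (1, 2), and the fitted slopes land close to it. Widening `scale` or `kernel_width` makes the surrogate average over a larger neighborhood, which is one reason LIME explanations can vary with these settings.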

LIME Implementation

import lime
import lime.lime_tabular
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load data and train model
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42
)

model = GradientBoostingClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Create LIME explainer
explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=X_train,
    feature_names=data.feature_names,
    class_names=list(data.target_names),  # ['malignant', 'benign']: class 0 is malignant
    mode='classification',
    discretize_continuous=True
)

# Explain a single prediction
instance = X_test[0]
explanation = explainer.explain_instance(
    instance,
    model.predict_proba,
    num_features=10
)

# Visualize
explanation.show_in_notebook()

# Get feature contributions as a list of (feature, weight) pairs
exp_df = explanation.as_list()
print("LIME Explanation:")
for feature, weight in exp_df:
    print(f"  {feature}: {weight:.4f}")

LIME for Images and Text

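# Image classification with LIME
# Assumes `model`, `preprocess`, and `image` (an H x W x 3 array) are defined elsewhere
from lime import lime_image
from skimage.segmentation import slic

explainer = lime_image.LimeImageExplainer()

def predict_fn(images):
    """Model prediction function for LIME.

    Must return class probabilities of shape (n_images, n_classes).
    """
    return model.predict(preprocess(images))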

explanation = explainer.explain_instance(
    image,
    predict_fn,
    top_labels=5,
    hide_color=0,
    num_samples=1000,
    segmentation_fn=slic  # Superpixel segmentation
)

# Get explanation for top predicted label
temp, mask = explanation.get_image_and_mask(
    explanation.top_labels[0],
    positive_only=True,
    num_features=5,
    hide_rest=False
)

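# Text classification with LIME
# Assumes `classifier.predict_proba` accepts a list of raw strings
# (e.g., a pipeline that includes a vectorizer)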
from lime.lime_text import LimeTextExplainer

text_explainer = LimeTextExplainer(class_names=['Negative', 'Positive'])
text_exp = text_explainer.explain_instance(
    "This movie was absolutely fantastic!",
    classifier.predict_proba,
    num_features=6
)

Partial Dependence Plots (PDP)

Mathematical Definition

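The partial dependence function shows the marginal effect of feature $x_S$ on the prediction:

$$\hat{f}_S(x_S) = \mathbb{E}_{X_C}\left[ f(x_S, X_C) \right] = \int f(x_S, x_C)\, p(x_C)\, dx_C$$

where $x_C$ denotes all features except $x_S$.

The empirical estimate is:

$$\hat{f}_S(x_S) = \frac{1}{n} \sum_{i=1}^{n} f\left(x_S, x_C^{(i)}\right)$$

Individual Conditional Expectation (ICE)

ICE curves extend PDP by showing the effect for each individual instance:

$$\hat{f}_S^{(i)}(x_S) = f\left(x_S, x_C^{(i)}\right)$$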

While PDP shows the average effect, ICE reveals heterogeneity in the effect across instances.
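
The relationship between the two is easy to see in code: an ICE curve fixes one row and sweeps the feature of interest over a grid, and the PDP is the pointwise average of those curves. The sketch below uses a hypothetical interaction model and a tiny hand-made dataset, chosen so that the average hides effects that the individual curves reveal.

```python
def ice_curves(predict, X, feature, grid):
    """One curve per row: predictions as `feature` is swept over `grid`."""
    curves = []
    for row in X:
        curve = []
        for value in grid:
            modified = list(row)
            modified[feature] = value  # overwrite only the feature of interest
            curve.append(predict(modified))
        curves.append(curve)
    return curves

def partial_dependence(predict, X, feature, grid):
    """PDP = pointwise average of the ICE curves."""
    curves = ice_curves(predict, X, feature, grid)
    return [sum(c[k] for c in curves) / len(curves) for k in range(len(grid))]

def model(row):
    # Hypothetical model with a pure interaction: the effect of feature 0
    # flips sign depending on feature 1
    return row[0] * row[1]

X = [[0.0, -1.0], [0.0, 1.0], [0.5, -1.0], [0.5, 1.0]]
grid = [-1.0, 0.0, 1.0]

ice = ice_curves(model, X, feature=0, grid=grid)
pdp = partial_dependence(model, X, feature=0, grid=grid)

print("ICE:", ice)
print("PDP:", pdp)
```

Here the PDP is flat at zero even though every individual curve has a slope of +1 or -1, which is exactly the kind of heterogeneity ICE plots are meant to expose.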

PDP and ICE Implementation

from sklearn.inspection import PartialDependenceDisplay
import matplotlib.pyplot as plt
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load data and train model
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42
)

model = GradientBoostingClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# One-way PDP and ICE curves for the first four features
fig, ax = plt.subplots(figsize=(10, 6))
PartialDependenceDisplay.from_estimator(
    model, X_test,
    features=[0, 1, 2, 3],  # Feature indices
    feature_names=data.feature_names,
    kind='both',  # PDP + ICE
    subsample=50,  # Number of ICE curves
    grid_resolution=50,
    ax=ax
)
plt.suptitle('Partial Dependence and Individual Conditional Expectation')
plt.tight_layout()
plt.show()

# 2D PDP (feature interaction)
fig, ax = plt.subplots(figsize=(10, 8))
PartialDependenceDisplay.from_estimator(
    model, X_test,
    features=[(0, 1)],  # 2D interaction
    feature_names=data.feature_names,
    grid_resolution=30,
    ax=ax
)
plt.title('2D Partial Dependence Plot (Feature Interaction)')
plt.show()

Practical Comparison: When to Use What

Method	Scope	Speed	Faithfulness	Best For
Permutation Importance	Global	Fast	High	Quick feature ranking
SHAP	Global + Local	Slow	Very High	Detailed explanations, theory
LIME	Local	Medium	Medium	Quick local explanations
PDP	Global	Fast	High	Feature effect visualization
ICE	Global + Individual	Medium	High	Heterogeneity detection

Decision Framework

Advanced: SHAP Interaction Values

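SHAP can decompose effects into main effects and interaction effects:

$$\phi_i = \phi_{i,i} + \sum_{j \neq i} \phi_{i,j}$$

where $\phi_{i,j}$ captures the interaction between features $i$ and $j$, and $\phi_{i,i}$ is the main effect of feature $i$.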
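
For a toy game, this decomposition can be computed by brute force using the Shapley interaction index, where the main effect is defined as the Shapley value minus the feature's off-diagonal interaction terms. The value function `v` below is a hypothetical example; real use relies on TreeSHAP rather than subset enumeration.

```python
from itertools import combinations
from math import factorial

def subsets(items):
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def shapley_value(v, n, i):
    others = [p for p in range(n) if p != i]
    total = 0.0
    for s in subsets(others):
        weight = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
        total += weight * (v(s | {i}) - v(s))
    return total

def interaction_value(v, n, i, j):
    """Shapley interaction index for i != j (half of the pairwise effect)."""
    others = [p for p in range(n) if p not in (i, j)]
    total = 0.0
    for s in subsets(others):
        weight = factorial(len(s)) * factorial(n - len(s) - 2) / (2 * factorial(n - 1))
        total += weight * (v(s | {i, j}) - v(s | {i}) - v(s | {j}) + v(s))
    return total

def interaction_matrix(v, n):
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                matrix[i][j] = interaction_value(v, n, i, j)
        # Main effect: Shapley value minus the off-diagonal interaction terms
        matrix[i][i] = shapley_value(v, n, i) - sum(matrix[i])
    return matrix

def v(coalition):
    """Hypothetical game: feature 0 adds 2; features 0 and 1 together add 1 more."""
    value = 2.0 if 0 in coalition else 0.0
    if 0 in coalition and 1 in coalition:
        value += 1.0
    return value

matrix = interaction_matrix(v, 3)
for row in matrix:
    print(row)
```

Each row of the matrix sums to that feature's Shapley value, and the joint bonus of 1 is split evenly between the two off-diagonal entries for features 0 and 1.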

# SHAP interaction values
explainer = shap.TreeExplainer(model)
shap_interaction = explainer.shap_interaction_values(X_test)

# Visualization
shap.summary_plot(shap_interaction, X_test, feature_names=data.feature_names)

Implementation: Complete Pipeline

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import shap
import lime
import lime.lime_tabular
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.inspection import permutation_importance, PartialDependenceDisplay
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score

# Load data
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42
)

# Train model
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

print(f"Test Accuracy: {accuracy_score(y_test, model.predict(X_test)):.4f}")

# 1. Permutation Feature Importance
perm_result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=42)
perm_df = pd.DataFrame({
    'Feature': data.feature_names,
    'Importance': perm_result.importances_mean
}).sort_values('Importance', ascending=False)
print("\n=== Permutation Feature Importance (Top 10) ===")
print(perm_df.head(10).to_string(index=False))

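# 2. SHAP Analysis
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Select class 1 (benign; class 0 is malignant in this dataset).
# Older shap versions return a list with one array per class; newer versions
# return a single array of shape (n_samples, n_features, n_classes).
if isinstance(shap_values, list):
    shap_values_pos = shap_values[1]
else:
    shap_values_pos = shap_values[:, :, 1]

# Summary plot
shap.summary_plot(shap_values_pos, X_test, feature_names=data.feature_names, show=False)
plt.title("SHAP Summary Plot (Benign Class)")
plt.tight_layout()
plt.savefig("shap_summary.png", dpi=150)
plt.show()

# Single prediction explanation
idx = 0
shap.waterfall_plot(shap.Explanation(
    values=shap_values_pos[idx],
    base_values=explainer.expected_value[1],
    data=X_test[idx],
    feature_names=data.feature_names
))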

# 3. LIME Explanation
lime_explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=X_train,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),  # ['malignant', 'benign']: class 0 is malignant
    mode='classification'
)
lime_exp = lime_explainer.explain_instance(
    X_test[idx],
    model.predict_proba,
    num_features=10
)

# 4. Partial Dependence Plot
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
PartialDependenceDisplay.from_estimator(
    model, X_test,
    features=['worst radius', 'worst concave points'],
    feature_names=data.feature_names,
    kind='both',
    subsample=50,
    ax=axes
)
plt.suptitle('Partial Dependence Plots')
plt.tight_layout()
plt.savefig("pdp_plots.png", dpi=150)
plt.show()

print("\n=== Analysis Complete ===")

Evaluation: Knowledge Check

Q1. What is the fundamental difference between permutation importance and SHAP-based importance?

Q2. Why does SHAP use Shapley values rather than simple marginal contributions?

Q3. When would LIME provide a better explanation than SHAP?

Q4. What does the spread of ICE curves around the PDP reveal?

Q5. Prove that Shapley values satisfy the efficiency property: $\sum_{i} \phi_i = f(x) - \mathbb{E}[f(X)]$.

Key Takeaways

SHAP provides theoretically grounded explanations with consistency guarantees
LIME is faster for single-instance explanations but lacks global consistency
PDP/ICE reveals how features affect predictions on average and individually
Permutation importance is the simplest model-agnostic global method
TreeSHAP makes exact Shapley computation feasible for tree ensembles
Always validate explanations against domain knowledge; no method is perfect

ℹ️

The field of Explainable AI (XAI) is evolving rapidly. Recent advances include Counterfactual Explanations, Anchors, Concept-based Explanations, and Influence Functions. The techniques covered here form the foundation for understanding any new method that emerges.

Next: Hyperparameter Tuning. Learn systematic approaches to optimizing model performance through Bayesian optimization, grid search, and random search.
