Model Interpretability: SHAP, LIME

Introduction

Model interpretability is crucial for building trust, meeting regulatory requirements, and debugging model behavior. As models become more complex, understanding why a model makes a prediction becomes as important as the prediction itself.

Interpretability Spectrum:
═══════════════════════════════════════════════════════════════════

 High Interpretability                          Low Interpretability
 ◄─────────────────────────────────────────────────────────────────►
 │                                                                  │
 Linear    Decision   Rule    Random   Gradient    Neural    Deep   │
 Regression Trees    Lists   Forest   Boosting    Networks  Learning│
 │                                                                  │
 ◄── Readable directly ──►  ◄── Post-hoc methods (SHAP, LIME) ──►   │
      (coefficients, rules)   (local and global explanations)      │
═══════════════════════════════════════════════════════════════════

Why Interpretability Matters

Interpretability Benefits:
═══════════════════════════════════════════════════

 1. Trust Building              2. Debugging
 ┌─────────────────────┐       ┌─────────────────────┐
 │ "Why was my loan    │       │ Model relies on     │
 │  rejected?"         │       │ correlation instead │
 │                     │       │ of causation        │
 │ Transparent decision│       │ → Fix data/features │
 │ process             │       │                     │
 └─────────────────────┘       └─────────────────────┘

 3. Regulatory Compliance       4. Scientific Discovery
 ┌─────────────────────┐       ┌─────────────────────┐
 │ GDPR "right to      │       │ Feature importance  │
 │ explanation"        │       │ reveals new         │
 │                     │       │ domain insights     │
 │ EU AI Act requires  │       │                     │
 │ transparency        │       │ "What drives Y?"    │
 └─────────────────────┘       └─────────────────────┘
═══════════════════════════════════════════════════

SHAP (SHapley Additive exPlanations)

Theory

SHAP values are based on Shapley values from cooperative game theory. They provide a unified measure of feature contribution.

Definition: Shapley Value

For a player $i$ in a game with player set $N$, the Shapley value is the marginal contribution of $i$, averaged with combinatorial weights over all coalitions $S$ that do not contain $i$.

Shapley Value Formula

$$\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!(|N|-|S|-1)!}{|N|!} \left[ v(S \cup \{i\}) - v(S) \right]$$

Here,

$N$ = set of all features (players)
$S$ = subset of features that does not include $i$
$v(S)$ = value of coalition $S$: the model's expected prediction when only the features in $S$ are known
$\phi_i$ = SHAP value for feature $i$
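
As a worked instance of the formula, consider a hypothetical two-player game whose coalition values are chosen purely for illustration: $v(\emptyset) = 0$, $v(\{1\}) = 10$, $v(\{2\}) = 20$, $v(\{1,2\}) = 50$. With $|N| = 2$, both coalition weights equal $1/2$:

```latex
\begin{aligned}
\phi_1 &= \tfrac{0!\,1!}{2!}\,\bigl[v(\{1\}) - v(\emptyset)\bigr]
        + \tfrac{1!\,0!}{2!}\,\bigl[v(\{1,2\}) - v(\{2\})\bigr]
        = \tfrac{1}{2}(10) + \tfrac{1}{2}(30) = 20 \\
\phi_2 &= \tfrac{1}{2}\,\bigl[v(\{2\}) - v(\emptyset)\bigr]
        + \tfrac{1}{2}\,\bigl[v(\{1,2\}) - v(\{1\})\bigr]
        = \tfrac{1}{2}(20) + \tfrac{1}{2}(40) = 30 \\
\phi_1 + \phi_2 &= 50 = v(N) - v(\emptyset)
\end{aligned}
```

Player 2 receives more because it adds more value both alone and when joining player 1.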

Theorem: Shapley Value Axioms (Unique Solution)

The Shapley value is the unique allocation that satisfies four desirable axioms:

Efficiency: The Shapley values sum to the total value of the game: $\sum_{i \in N} \phi_i = v(N) - v(\emptyset)$
Symmetry: If two players contribute equally to all coalitions, they receive equal Shapley values
Dummy: A player that contributes nothing to any coalition receives a Shapley value of zero
Additivity: The Shapley value of a sum of games equals the sum of individual Shapley values

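Because the Shapley value is the only allocation that satisfies all four axioms simultaneously, SHAP attributions rest on a unique, theoretically grounded definition. The guarantee is exact only when the values are computed exactly (as TreeSHAP does for tree ensembles); sampling-based estimators such as KernelSHAP satisfy it only approximately.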
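
The axioms can be checked numerically on a small game. The following is a minimal sketch, assuming a hypothetical three-player game in which players A and B are interchangeable and player C never adds value; the function name `shapley_values` and the game `v` are illustrative, not part of any library.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values by enumerating every coalition S that excludes player i."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(n):
            # Combinatorial weight |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                S = frozenset(subset)
                total += weight * (v(S | {i}) - v(S))
        phi[i] = total
    return phi

def v(S):
    """Hypothetical game: value depends only on how many of A and B are present."""
    return {0: 0.0, 1: 10.0, 2: 30.0}[len(S & {"A", "B"})]

phi = shapley_values(["A", "B", "C"], v)
print({name: round(value, 6) for name, value in phi.items()})
# {'A': 15.0, 'B': 15.0, 'C': 0.0}
```

A and B receive equal shares (symmetry), C receives zero (dummy), and the shares add up to $v(N) - v(\emptyset) = 30$ (efficiency). The enumeration visits every subset, so it is only practical for a handful of players.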

SHAP Properties:

Efficiency Property

$$\sum_{i=1}^{M} \phi_i = f(x) - E[f(X)]$$

Here,

$M$ = number of features ($M = |N|$)
$\phi_i$ = SHAP value for feature $i$
$f(x)$ = model prediction for instance $x$
$E[f(X)]$ = expected (average) model prediction

The SHAP values sum to the difference between the prediction and the average prediction.

ℹ️ Why SHAP Values Sum to the Difference

This is the efficiency axiom applied to a model: with all features known the coalition value is the prediction itself, $v(N) = f(x)$, and with no features known it is the average prediction, $v(\emptyset) = E[f(X)]$. The feature contributions therefore add up exactly to the gap between the individual prediction and the average prediction, so the explanation is complete: no part of that gap is left unattributed.

SHAP Value Decomposition Example:
═══════════════════════════════════════════════════

 Model Prediction: 750 (Credit Score)
 Average Prediction: 680

 SHAP Values:
 ┌──────────────────────────────────────────────┐
 │ Income:        +45                           │
 │ Age:           +20                           │
 │ Debt:          -30                           │
 │ Credit History: +35                          │
 │ ──────────────────────────────────────       │
 │ Total SHAP:    +70                           │
 │                                              │
 │ 680 (avg) + 70 = 750 (prediction)            │
 └──────────────────────────────────────────────┘
═══════════════════════════════════════════════════
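
The same additive decomposition can be reproduced from scratch. This is a minimal sketch, not how the `shap` library computes values: it assumes a hypothetical model and a tiny hypothetical background dataset, defines $v(S)$ by fixing the features in $S$ to the instance's values and averaging the model over background rows for the rest, and averages marginal contributions over all feature orderings (equivalent to the subset formula).

```python
from itertools import permutations

def model(x):
    """Hypothetical model f with an interaction between features 1 and 2."""
    return 2.0 * x[0] + x[1] * x[2]

# Hypothetical background data used to estimate expectations.
background = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 2.0, 0.0],
    [3.0, 1.0, 1.0],
]
x = [2.0, 3.0, 1.0]  # instance to explain

def value(S):
    """v(S): mean prediction with features in S taken from x, the rest from background rows."""
    preds = []
    for row in background:
        z = [x[j] if j in S else row[j] for j in range(len(x))]
        preds.append(model(z))
    return sum(preds) / len(preds)

def shap_values(num_features):
    """Average each feature's marginal contribution over all feature orderings."""
    phi = [0.0] * num_features
    orderings = list(permutations(range(num_features)))
    for order in orderings:
        S = set()
        for j in order:
            before = value(S)
            S.add(j)
            phi[j] += value(S) - before
    return [p / len(orderings) for p in phi]

phi = shap_values(len(x))
base_value = value(set())  # E[f(X)] estimated on the background data
prediction = model(x)      # f(x)
print("SHAP values:", [round(p, 4) for p in phi])
print("sum of SHAP values:", round(sum(phi), 4))
print("f(x) - E[f(X)]:", round(prediction - base_value, 4))
```

The last two printed numbers agree because the marginal contributions along any ordering telescope to $v(N) - v(\emptyset)$.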

SHAP Implementation

import numpy as np
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingRegressor
import shap
import matplotlib.pyplot as plt

# ═══════════════════════════════════════════════════
# Load and Prepare Data
# ═══════════════════════════════════════════════════
data = fetch_california_housing()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Train model
model = GradientBoostingRegressor(
    n_estimators=200,
    max_depth=5,
    learning_rate=0.1,
    random_state=42
)
model.fit(X_train, y_train)

print(f"R² Score: {model.score(X_test, y_test):.4f}")

# ═══════════════════════════════════════════════════
# SHAP Explainer
# ═══════════════════════════════════════════════════
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# ═══════════════════════════════════════════════════
# 1. Summary Plot (Global Feature Importance)
# ═══════════════════════════════════════════════════
plt.figure(figsize=(10, 8))
shap.summary_plot(shap_values, X_test, plot_type="bar", show=False)
plt.title("Global Feature Importance (SHAP)")
plt.tight_layout()
plt.savefig('shap_summary_bar.png', dpi=150)
plt.show()

# ═══════════════════════════════════════════════════
# 2. Beeswarm Plot (Feature Effects)
# ═══════════════════════════════════════════════════
plt.figure(figsize=(10, 8))
shap.summary_plot(shap_values, X_test, show=False)
plt.title("Feature Effects (SHAP Beeswarm)")
plt.tight_layout()
plt.savefig('shap_beeswarm.png', dpi=150)
plt.show()

# ═══════════════════════════════════════════════════
# 3. Dependence Plot (Feature Interaction)
# ═══════════════════════════════════════════════════
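# dependence_plot opens its own figure unless an axis is passed in,
# so create the axis explicitly and hand it over via `ax`.
fig, ax = plt.subplots(figsize=(10, 6))
shap.dependence_plot(
    "MedInc", shap_values, X_test,
    interaction_index="AveRooms",
    ax=ax,
    show=False
)
ax.set_title("MedInc Dependence (colored by AveRooms)")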
plt.tight_layout()
plt.savefig('shap_dependence.png', dpi=150)
plt.show()

# ═══════════════════════════════════════════════════
# 4. Waterfall Plot (Individual Prediction)
# ═══════════════════════════════════════════════════
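sample_idx = 0
# TreeExplainer may return expected_value as a one-element array;
# the waterfall plot needs a scalar base value.
base_value = float(np.ravel(explainer.expected_value)[0])
plt.figure(figsize=(10, 8))
shap.waterfall_plot(
    shap.Explanation(
        values=shap_values[sample_idx],
        base_values=base_value,
        data=X_test.iloc[sample_idx].values,
        feature_names=X_test.columns.tolist()
    ),
    show=False
)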
plt.title("Individual Prediction Explanation")
plt.tight_layout()
plt.savefig('shap_waterfall.png', dpi=150)
plt.show()

# ═══════════════════════════════════════════════════
# 5. Force Plot (Single Prediction)
# ═══════════════════════════════════════════════════
shap.initjs()
force_plot = shap.force_plot(
    explainer.expected_value,
    shap_values[sample_idx],
    X_test.iloc[sample_idx]
)
shap.save_html('shap_force_plot.html', force_plot)

# ═══════════════════════════════════════════════════
# SHAP Interaction Values
# ═══════════════════════════════════════════════════
shap_interaction = explainer.shap_interaction_values(X_test[:100])

plt.figure(figsize=(12, 10))
shap.summary_plot(
    shap_interaction,
    X_test[:100],
    show=False
)
plt.title("Feature Interactions")
plt.tight_layout()
plt.savefig('shap_interactions.png', dpi=150)
plt.show()

LIME (Local Interpretable Model-agnostic Explanations)

Theory

LIME explains individual predictions by approximating the model locally with an interpretable model.

Definition: LIME Optimization Problem

LIME finds an interpretable model $g$ that approximates the complex model $f$ in the local neighborhood of the instance $x$, balancing fidelity to $f$ with simplicity of $g$.

LIME Objective

$$\xi(x) = \arg\min_{g \in G} L(f, g, \pi_x) + \Omega(g)$$

Here,

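$f$ = complex (black-box) model
$g$ = interpretable model (e.g., a linear model)
$G$ = class of candidate interpretable models
$\pi_x$ = proximity measure that defines the local neighborhood around $x$
$\Omega(g)$ = complexity penalty for $g$
$L$ = fidelity loss measuring how poorly $g$ approximates $f$ in the neighborhood defined by $\pi_x$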
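
One common concrete choice for the fidelity term, used in the original LIME paper, is a locally weighted squared loss over a set of perturbed samples. The symbols $Z$ (the set of perturbed samples) and $z'$ (the interpretable representation of a sample $z$) are notation introduced here for this sketch:

```latex
L(f, g, \pi_x) = \sum_{z,\, z' \in Z} \pi_x(z)\,\bigl(f(z) - g(z')\bigr)^2
```

Samples close to $x$ receive large weights $\pi_x(z)$, so the surrogate is penalized mainly for disagreeing with $f$ near the instance being explained.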

Sample Weighting:

LIME Neighborhood Weighting

$$\pi_x(z) = \exp\left(-\frac{D(x, z)^2}{\sigma^2}\right)$$

Here,

$D(x, z)$ = distance between the original instance $x$ and the perturbed instance $z$
$\sigma$ = kernel width controlling locality

💡 Kernel Width Selection

The kernel width $\sigma$ controls the "locality" of the explanation. A smaller $\sigma$ makes the explanation more local (only very nearby points matter), while a larger $\sigma$ considers a broader neighborhood. For tabular data, the LIME library defaults to $\sigma = 0.75 \sqrt{d}$, where $d$ is the number of features.
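
The effect of the kernel width is easy to see numerically. This is a minimal sketch that implements the weighting formula exactly as written in this section; the function name `lime_kernel` is illustrative, and the library's internal kernel may differ in detail.

```python
import math

def lime_kernel(distance, sigma):
    """pi_x(z) = exp(-D(x, z)^2 / sigma^2), with D(x, z) passed in as `distance`."""
    return math.exp(-(distance ** 2) / sigma ** 2)

num_features = 8  # the California housing data used in this tutorial has 8 features
default_sigma = 0.75 * math.sqrt(num_features)

# Compare a narrow kernel, the default-style width, and a wide kernel.
for sigma in (0.5, default_sigma, 5.0):
    weights = [round(lime_kernel(d, sigma), 3) for d in (0.0, 1.0, 2.0, 4.0)]
    print(f"sigma={sigma:.2f}: weights at distances 0, 1, 2, 4 -> {weights}")
```

With the narrow kernel, samples at distance 2 or more get almost no weight, while the wide kernel treats them nearly the same as the instance itself.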

LIME Explanation Process:
═══════════════════════════════════════════════════

 Original Data Point x:
 ┌─────────────────────────────────────────┐
 │ Income: $75,000  Age: 35  Debt: $15,000 │
 │ Credit Score: 720                       │
 └─────────────────────────────────────────┘
                    │
                    ▼
 ┌─────────────────────────────────────────┐
 │ Step 1: Generate Neighborhood           │
 │ • Perturb features around x             │
 │ • Create weighted samples               │
 │                                         │
 │   ○ ○ ○ ○ ○ ● ○ ○ ○ ○ ○                 │
 │    (● = original, ○ = perturbed)        │
 └─────────────────────────────────────────┘
                    │
                    ▼
 ┌─────────────────────────────────────────┐
 │ Step 2: Get Model Predictions           │
 │ • Run complex model on perturbations    │
 │ • Assign weights by distance            │
 └─────────────────────────────────────────┘
                    │
                    ▼
 ┌─────────────────────────────────────────┐
 │ Step 3: Fit Interpretable Model         │
 │ • Train linear model on weighted data   │
 │ • Feature importance = coefficients     │
 │                                         │
 │ Income:    +0.45  ← Most important      │
 │ Credit:    +0.35                        │
 │ Age:       +0.12                        │
 │ Debt:      -0.25                        │
 └─────────────────────────────────────────┘
═══════════════════════════════════════════════════
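
The three steps in the diagram can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the `lime` library's algorithm: the black-box model is hypothetical, perturbations are plain Gaussian noise around $x$, and the surrogate is an unregularized weighted least-squares fit with no discretization or feature selection.

```python
import numpy as np

def black_box(Z):
    """Hypothetical nonlinear model standing in for the complex model f."""
    return Z[:, 0] ** 2 + 3.0 * Z[:, 1]

def lime_sketch(x, predict_fn, num_samples=5000, scale=0.5, seed=0):
    """Return (intercept, coefficients) of a weighted linear surrogate around x."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    kernel_width = 0.75 * np.sqrt(x.size)

    # Step 1: generate a neighborhood by adding Gaussian noise to x.
    Z = x + rng.normal(0.0, scale, size=(num_samples, x.size))

    # Step 2: query the black box and weight each sample by proximity to x.
    y = predict_fn(Z)
    dist = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)

    # Step 3: weighted least squares on offsets from x
    # (scaling rows by sqrt(weight) turns it into ordinary least squares).
    A = np.column_stack([np.ones(num_samples), Z - x])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1:]

intercept, local_coefs = lime_sketch([1.0, 2.0], black_box)
print("local coefficients:", np.round(local_coefs, 2))
```

At $x = (1, 2)$ the local slope of this model is $(2, 3)$, and the fitted coefficients should land close to those values. Changing `seed` or `scale` shifts them slightly, which is the sampling variability discussed in this section.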

Theorem: LIME Local Approximation Guarantees

LIME aims for locally faithful explanations: for a given instance $x$, the surrogate $g$ is fit to approximate the complex model $f$ in a neighborhood around $x$. Unlike SHAP, however, LIME offers no consistency guarantee. Because it relies on random sampling and on the chosen kernel width, explanations can differ noticeably between nearby instances, and even between repeated runs on the same instance unless the random seed is fixed.

📝 LIME Explanation for Loan Application

A bank uses a gradient boosting model to predict loan defaults. For a specific applicant:

Applicant features: Income = $75,000, Age = 35, Debt = $15,000, Credit Score = 720

LIME Explanation (local linear model; positive weights push the prediction toward "low risk"):

Income: +0.45 (high income → lower risk)
Credit Score: +0.35 (good score → lower risk)
Debt: -0.25 (moderate debt → slight risk increase)
Age: +0.12 (prime working age → slight positive)

Interpretation: The model predicts "low risk" primarily because of the applicant's high income and good credit score; the debt partially offsets these positives. The explanation is valid only for this prediction: a different applicant may receive different feature contributions.

LIME Implementation

from lime.lime_tabular import LimeTabularExplainer

# ═══════════════════════════════════════════════════
# LIME Explainer Setup
# ═══════════════════════════════════════════════════
lime_explainer = LimeTabularExplainer(
    training_data=X_train.values,
    feature_names=X_train.columns.tolist(),
    mode='regression',
    discretize_continuous=True,
    random_state=42
)

# ═══════════════════════════════════════════════════
# Explain Individual Predictions
# ═══════════════════════════════════════════════════
# Explain 5 different samples
sample_indices = [0, 50, 100, 200, 500]

for idx in sample_indices:
    # top_labels only applies to classifiers, so it is omitted in regression mode
    exp = lime_explainer.explain_instance(
        X_test.iloc[idx].values,
        model.predict,
        num_features=10
    )

    print(f"\n{'=' * 60}")
    print(f"Sample {idx}")
    # y_test is a NumPy array (data.target), so index it positionally
    print(f"Actual: {y_test[idx]:.4f}")
    print(f"Predicted: {model.predict(X_test.iloc[[idx]])[0]:.4f}")
    print(f"\nTop Feature Contributions:")
    for feat, weight in exp.as_list()[:5]:
        print(f"  {feat}: {weight:.4f}")

    # Save HTML explanation
    exp.save_to_file(f'lime_explanation_{idx}.html')

# ═══════════════════════════════════════════════════
# Visualize LIME Explanation
# ═══════════════════════════════════════════════════
exp = lime_explainer.explain_instance(
    X_test.iloc[0].values,
    model.predict,
    num_features=10
)

fig = exp.as_pyplot_figure()
plt.title("LIME Local Explanation")
plt.tight_layout()
plt.savefig('lime_explanation.png', dpi=150)
plt.show()

Partial Dependence Plots

from sklearn.inspection import PartialDependenceDisplay

# ═══════════════════════════════════════════════════
# Partial Dependence (PDP) with Individual Conditional Expectation (ICE) Curves
# ═══════════════════════════════════════════════════
fig, axes = plt.subplots(2, 4, figsize=(20, 10))
features = ['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms',
            'Population', 'AveOccup', 'Latitude', 'Longitude']

for i, feature in enumerate(features):
    row, col = i // 4, i % 4
    PartialDependenceDisplay.from_estimator(
        model, X_train, [feature],
        ax=axes[row, col],
        kind='both',  # ICE + PDP
        subsample=50,  # number of ICE curves drawn per panel
        grid_resolution=50,
        random_state=42  # make the ICE subsample reproducible
    )
    axes[row, col].set_title(f'PDP: {feature}')

plt.suptitle("Partial Dependence Plots", fontsize=16, y=1.02)
plt.tight_layout()
plt.savefig('partial_dependence_plots.png', dpi=150, bbox_inches='tight')
plt.show()

# ═══════════════════════════════════════════════════
# 2D Partial Dependence (Feature Interactions)
# ═══════════════════════════════════════════════════
fig, ax = plt.subplots(figsize=(10, 8))
PartialDependenceDisplay.from_estimator(
    model, X_train, [('MedInc', 'AveRooms')],
    ax=ax,
    grid_resolution=50
)
plt.title("2D PDP: MedIncome × AveRooms Interaction")
plt.tight_layout()
plt.savefig('pdp_2d_interaction.png', dpi=150)
plt.show()
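
For reference, the quantity these plots display can be computed by hand: for each grid value, overwrite one feature in every row, predict, and average. This is a minimal sketch assuming a hypothetical model and toy data; `partial_dependence_1d` is an illustrative helper, not the scikit-learn function.

```python
def partial_dependence_1d(predict, rows, feature, grid):
    """For each grid value, set one feature to that value in every row and average the predictions."""
    curve = []
    for value in grid:
        preds = []
        for row in rows:
            modified = list(row)
            modified[feature] = value
            preds.append(predict(modified))
        curve.append(sum(preds) / len(preds))
    return curve

def predict(row):
    """Hypothetical model, for illustration only."""
    return 2.0 * row[0] + row[1] ** 2

rows = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
grid = [0.0, 1.0, 2.0]

pdp = partial_dependence_1d(predict, rows, feature=0, grid=grid)
print([round(p, 3) for p in pdp])
```

The curve rises by 2 per unit of feature 0, matching that feature's coefficient in the toy model. Keeping the per-row predictions instead of averaging them gives the ICE curves.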

Global vs Local Interpretability

Global vs Local Interpretability:
═══════════════════════════════════════════════════

 GLOBAL (Overall Model Behavior)
 ────────────────────────────────
 Methods:
 • Feature Importance Rankings
 • Partial Dependence Plots
 • SHAP Summary Plots

 Question: "What features matter most overall?"

 ┌──────────────────────────────────────────────┐
 │ Feature Importance:                          │
 │ 1. Income        ████████████████  (0.35)    │
 │ 2. Credit Score  ██████████████    (0.30)    │
 │ 3. Age           ████████          (0.15)    │
 │ 4. Debt          ██████            (0.12)    │
 │ 5. Employment    ████              (0.08)    │
 └──────────────────────────────────────────────┘

 LOCAL (Individual Prediction)
 ────────────────────────────────
 Methods:
 • LIME Explanations
 • SHAP Waterfall Plots
 • Counterfactual Explanations

 Question: "Why did THIS prediction happen?"

 ┌──────────────────────────────────────────────┐
 │ Sample #142:                                 │
 │ Base Value: 680                              │
 │ Income:     +45 (high income)                │
 │ Credit:     +35 (good history)               │
 │ Debt:       -30 (high debt)                  │
 │ Age:        +20 (prime age)                  │
 │ ───────────────────────────────              │
 │ Prediction: 750                              │
 └──────────────────────────────────────────────┘
═══════════════════════════════════════════════════
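
Local and global views are linked: a common global ranking, and the one the SHAP bar summary plot shows, is the mean absolute SHAP value of each feature across many local explanations. This is a minimal sketch with a hypothetical SHAP matrix; the numbers are illustrative only and do not come from a trained model.

```python
def global_importance(shap_matrix, feature_names):
    """Rank features by mean absolute SHAP value across rows (one row per prediction)."""
    n_rows = len(shap_matrix)
    scores = {
        name: sum(abs(row[j]) for row in shap_matrix) / n_rows
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical local SHAP values for three predictions.
feature_names = ["Income", "Credit", "Debt", "Age"]
shap_matrix = [
    [45.0, 35.0, -30.0, 20.0],
    [-10.0, 25.0, -40.0, 5.0],
    [30.0, -15.0, 10.0, -5.0],
]

ranking = global_importance(shap_matrix, feature_names)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Taking absolute values matters: Debt pushes predictions in both directions here, so a plain mean would understate its influence.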

Comparison: SHAP vs LIME

SHAP vs LIME Comparison:
═══════════════════════════════════════════════════════════════════

 Feature       │ SHAP                    │ LIME
 ══════════════╪═════════════════════════╪═══════════════════════
 Theory        │ Game Theory (Shapley)   │ Local Surrogate Model
 Speed         │ TreeSHAP: fast, exact   │ Sampling-based;
               │ KernelSHAP: slow        │ cost set by sample count
 Consistency   │ Guaranteed              │ Not guaranteed
 Global        │ Yes (summary plots)     │ No (local only)
 Local         │ Yes (waterfall)         │ Yes (primary use)
 Interactions  │ Yes (SHAP interaction)  │ No (independent)
 ══════════════╧═════════════════════════╧═══════════════════════

 When to Use:
 ─────────────
 • SHAP: When you need theoretical guarantees and both global/local
 • LIME: When you need fast, model-agnostic local explanations
 • Both: For comprehensive interpretability analysis
═══════════════════════════════════════════════════════════════════

📋 Key Takeaways

SHAP is grounded in game theory and satisfies the four Shapley axioms (efficiency, symmetry, dummy, additivity), providing theoretically unique and consistent explanations
LIME approximates the model locally with an interpretable surrogate, offering fast model-agnostic explanations without theoretical guarantees
Efficiency Property: SHAP values always sum to $f(x) - E[f(X)]$, providing complete attribution
Feature importance rankings help with global model understanding; partial dependence plots reveal feature effects and interactions
Local vs. Global: SHAP supports both local (waterfall) and global (summary) explanations; LIME is primarily local
Validation: Always validate interpretations with domain experts — statistical correlation does not imply causation

Practice Exercises

SHAP vs LIME Agreement: Compare SHAP and LIME explanations for the same predictions - when do they disagree?
Adversarial Analysis: Find predictions where high feature values lead to low SHAP values (feature correlation effects)
Interactive Dashboard: Build a Streamlit app that shows SHAP explanations for user-selected predictions
Causal Interpretation: Discuss when SHAP values can vs cannot be interpreted causally
