Multiple Linear Regression

Extending Regression to Multiple Predictors

Multiple linear regression models the relationship between a response variable and several predictors simultaneously. It estimates each variable's unique contribution while controlling for others.

Real Estate: predict house prices from size, location, age, and amenities
Medicine: assess treatment effects while controlling for patient demographics
Marketing: quantify individual channel contributions to total sales

Each coefficient estimates the expected change in the response for a one-unit increase in its predictor, holding all other predictors constant.
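
To make "holding all others constant" concrete, here is a minimal sketch on simulated data. The data, the true effects (120 per square foot, 8000 per bedroom), and the helper `ols` are all assumptions made for illustration. Bedrooms are generated so that they correlate with size; when size is left out of the model, the bedrooms slope also picks up part of the size effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical data: larger houses tend to have more bedrooms
size = rng.uniform(1000, 3500, n)
bedrooms = np.clip(np.round(size / 700 + rng.normal(0, 0.7, n)), 1, 6)

# Assumed true partial effects: 120 per square foot, 8000 per bedroom
price = 50000 + 120 * size + 8000 * bedrooms + rng.normal(0, 25000, n)

def ols(columns, y):
    """Least-squares coefficients with an intercept column prepended."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Bedrooms alone: the slope also absorbs the effect of the omitted size variable
b_simple = ols([bedrooms], price)[1]

# Bedrooms with size in the model: the slope is the partial effect of bedrooms
b_partial = ols([size, bedrooms], price)[2]

print(f"bedrooms slope, size omitted:  {b_simple:,.0f}")
print(f"bedrooms slope, size included: {b_partial:,.0f}")
```

With size omitted, the bedrooms slope should come out several times larger than the assumed 8000; with size included it should land near 8000, up to sampling noise.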

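Multiple regression extends simple regression from one predictor to p predictors: y = β0 + β1·x1 + β2·x2 + ... + βp·xp + ε, where ε is a random error term.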
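
As a rough sketch of what fitting such a model involves, the coefficients can be estimated by least squares on a design matrix X whose first column is all ones (the intercept). The example below is illustrative only: the data, the "true" coefficients in `beta_true`, and the variable names are assumptions. It solves the normal equations (X'X) b = X'y directly and compares the result with `np.linalg.lstsq`, which is usually the numerically safer choice.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Illustrative predictors (assumed ranges for a house-price setting)
size = rng.uniform(1000, 3500, n)
bedrooms = rng.integers(1, 6, n)   # integers 1..5
age = rng.uniform(0, 50, n)

# Assumed true coefficients: intercept, size, bedrooms, age
beta_true = np.array([50000.0, 120.0, 8000.0, -500.0])

# Design matrix: a column of ones for the intercept, then one column per predictor
X = np.column_stack([np.ones(n), size, bedrooms, age])
y = X @ beta_true + rng.normal(0, 25000, n)

# Normal equations: solve (X'X) b = X'y
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Alternative: SVD-based least squares, more robust to ill-conditioning
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

for name, b in zip(["const", "size", "bedrooms", "age"], beta_lstsq):
    print(f"{name:10s} {b:12.2f}")
```

Both approaches should agree to within floating-point error here; they can diverge when predictors are nearly collinear.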


import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy import stats
import matplotlib.pyplot as plt

np.random.seed(42)
n = 200

# House price data: size, bedrooms, age
house_size = np.random.uniform(1000, 3500, n)
bedrooms   = np.random.choice([1,2,3,4,5], n, p=[0.05,0.2,0.4,0.25,0.1])
age        = np.random.uniform(0, 50, n)

price = (50000 + 120*house_size + 8000*bedrooms - 500*age
         + np.random.normal(0, 25000, n))

df = pd.DataFrame({'price':price,'size':house_size,'bedrooms':bedrooms,'age':age})

X = sm.add_constant(df[['size','bedrooms','age']])
model = sm.OLS(df['price'], X).fit()
print(model.summary())

# Interpretation
print("\nCoefficient Interpretation:")
for name, coef, pval in zip(model.params.index, model.params, model.pvalues):
    sig = "***" if pval<0.001 else "**" if pval<0.01 else "*" if pval<0.05 else "ns"
    print(f"  {name:12s}: {coef:>10.2f}  (p={pval:.4f} {sig})")

# F-test: overall model significance
print(f"\nF({model.df_model:.0f},{model.df_resid:.0f}) = {model.fvalue:.2f}, p = {model.f_pvalue:.6f}")
print(f"R² = {model.rsquared:.4f}, Adj R² = {model.rsquared_adj:.4f}")



# Prediction with a confidence interval (for the mean price)
# and a prediction interval (for a single new house)
new_house = pd.DataFrame({'const':[1.0],'size':[2000],'bedrooms':[3],'age':[10]})
pred = model.get_prediction(new_house)
summary = pred.summary_frame(alpha=0.05)
print("\nPrediction for 2000 sqft, 3 bed, 10yr old:")
print(f"  Predicted: ${summary['mean'].iloc[0]:,.0f}")
print(f"  95% CI: (${summary['mean_ci_lower'].iloc[0]:,.0f}, ${summary['mean_ci_upper'].iloc[0]:,.0f})")
print(f"  95% PI: (${summary['obs_ci_lower'].iloc[0]:,.0f}, ${summary['obs_ci_upper'].iloc[0]:,.0f})")
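
The 95% PI printed above is wider than the 95% CI because the two intervals answer different questions. The CI covers the mean price of all houses with those characteristics, while the PI covers the price of one new house and so also includes the residual noise. The sketch below computes both by hand with numpy and scipy on separately simulated data; the data, the two-predictor model, and the variable names are assumptions for illustration. It is not meant as a reproduction of statsmodels' internals.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 200

# Illustrative data (assumed): price depends on size and age
size = rng.uniform(1000, 3500, n)
age = rng.uniform(0, 50, n)
X = np.column_stack([np.ones(n), size, age])
y = 50000 + 120 * size - 500 * age + rng.normal(0, 25000, n)

# Fit by least squares and estimate the residual variance
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
df_resid = n - X.shape[1]
sigma2 = resid @ resid / df_resid

# New observation: intercept, 2000 sqft, 10 years old
x0 = np.array([1.0, 2000.0, 10.0])
y_hat = x0 @ beta

# Quadratic form x0' (X'X)^-1 x0
h0 = x0 @ np.linalg.solve(X.T @ X, x0)

se_mean = np.sqrt(sigma2 * h0)       # uncertainty in the fitted mean
se_obs = np.sqrt(sigma2 * (1 + h0))  # adds the noise of a single new house

t_crit = stats.t.ppf(0.975, df_resid)
ci = (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean)
pi = (y_hat - t_crit * se_obs, y_hat + t_crit * se_obs)

print(f"Predicted: {y_hat:,.0f}")
print(f"95% CI: ({ci[0]:,.0f}, {ci[1]:,.0f})")
print(f"95% PI: ({pi[0]:,.0f}, {pi[1]:,.0f})")
```

The only difference between the two standard errors is the extra `1 +` term, which is why the PI does not shrink toward zero width as the sample grows, whereas the CI does.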
