Linear Regression: Math, Code and Assumptions

The Foundation of Machine Learning

Linear regression is one of the most fundamental algorithms in machine learning. Despite its simplicity, understanding it deeply gives insight into many other supervised learning methods.

ML Algorithm Landscape

Linear regression is the foundation: understand this first!

1. Simple Linear Regression

Mathematical Formulation

Model:

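y = \beta_0 + \beta_1 x + \epsilon

Prediction (the error term is unobserved, so it does not appear in the fitted value):

\hat{y} = \beta_0 + \beta_1 x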

Where:

$\beta_0$ = intercept (bias): value of $y$ when $x = 0$
$\beta_1$ = slope (weight): change in $y$ for unit change in $x$
$\epsilon$ = error term: $\epsilon \sim N(0, \sigma^2)$

2. Cost Function (Ordinary Least Squares)

Mean Squared Error (MSE):

J(\beta_0, \beta_1) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - (\beta_0 + \beta_1 x_i))^2

Goal: Find $\beta_0, \beta_1$ that minimize $J$
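To make the cost function concrete, here is a minimal sketch that evaluates $J$ for two candidate parameter pairs. The function name `mse_cost` and the four-point dataset are illustrative assumptions, not part of the original article.

```python
import numpy as np

def mse_cost(beta0, beta1, x, y):
    """Mean squared error J(beta0, beta1) for a simple linear model."""
    y_hat = beta0 + beta1 * x          # model predictions
    return np.mean((y - y_hat) ** 2)   # average squared residual

# Hand-made data lying exactly on y = 1 + 2x (illustrative only)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

cost_exact = mse_cost(1.0, 2.0, x, y)  # parameters that generated the data
cost_off = mse_cost(0.0, 2.0, x, y)    # intercept off by 1, so every residual is 1

print(cost_exact, cost_off)
```

The pair that generated the data has zero cost, while shifting the intercept by 1 raises the cost to 1, which is the behavior the minimization goal relies on.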

Closed-Form Solution (Normal Equation):

\beta_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{\text{Cov}(X,Y)}{\text{Var}(X)}

\beta_0 = \bar{y} - \beta_1 \bar{x}

The cost function is convex, so gradient descent with a suitably small learning rate converges to the global minimum.
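The closed-form estimates can be computed directly from the sample means. The sketch below assumes synthetic data drawn from $y = 4 + 3x + \epsilon$; the helper name `fit_simple_ols` is hypothetical, and `np.polyfit` is used only as a cross-check.

```python
import numpy as np

def fit_simple_ols(x, y):
    """Closed-form simple linear regression: returns (beta0, beta1)."""
    x_bar, y_bar = x.mean(), y.mean()
    # Slope = sum of cross-deviations / sum of squared x-deviations
    beta1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    # Intercept makes the line pass through (x_bar, y_bar)
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1

# Synthetic data (an assumption for illustration)
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=200)
y = 4 + 3 * x + rng.normal(0, 1, size=200)

beta0, beta1 = fit_simple_ols(x, y)

# Cross-check: a degree-1 polynomial fit returns [slope, intercept]
slope_np, intercept_np = np.polyfit(x, y, deg=1)
print(beta0, beta1)
```

Both routes solve the same least-squares problem, so the two sets of estimates should agree up to floating-point error.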

3. Gradient Descent

Update Rule:

\beta_j := \beta_j - \alpha \frac{\partial J}{\partial \beta_j}

Partial Derivatives:

\frac{\partial J}{\partial \beta_0} = -\frac{2}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)

\frac{\partial J}{\partial \beta_1} = -\frac{2}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i) \cdot x_i

Where $\alpha$ = learning rate (step size)
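The update rule and partial derivatives above translate almost line for line into code. This is a minimal sketch, assuming a fixed learning rate, a fixed iteration count, and zero initialization; the function name `gradient_descent` and the default values are illustrative choices, not prescriptions.

```python
import numpy as np

def gradient_descent(x, y, alpha=0.1, n_iters=2000):
    """Batch gradient descent for simple linear regression under MSE."""
    beta0, beta1 = 0.0, 0.0              # assumed starting point
    n = len(x)
    for _ in range(n_iters):
        residual = y - (beta0 + beta1 * x)           # y_i - y_hat_i
        grad0 = -2.0 / n * np.sum(residual)          # dJ/d(beta0)
        grad1 = -2.0 / n * np.sum(residual * x)      # dJ/d(beta1)
        beta0 -= alpha * grad0                       # simultaneous update
        beta1 -= alpha * grad1
    return beta0, beta1

# Noise-free data on y = 1 + 2x, so the minimizer is known exactly
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

gd_beta0, gd_beta1 = gradient_descent(x, y)
print(gd_beta0, gd_beta1)
```

If $\alpha$ is too large for the scale of the features the iterates diverge instead of converging, which is one reason features are often standardized before running gradient descent.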

4. Multiple Linear Regression

Model:

\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p = \beta_0 + \sum_{j=1}^{p} \beta_j x_j

Matrix Form:

\hat{\mathbf{y}} = \mathbf{X}\boldsymbol{\beta}

Where $\mathbf{X} \in \mathbb{R}^{n \times (p+1)}$ (design matrix with intercept column)

Normal Equation (Matrix):

\boldsymbol{\hat{\beta}} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}
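A sketch of the matrix form in NumPy follows. The synthetic data, the coefficient vector `true_beta`, and the variable names are assumptions for illustration. Rather than forming $(\mathbf{X}^T\mathbf{X})^{-1}$ explicitly, the code solves the equivalent linear system, which is generally the more numerically stable choice.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 3
features = rng.normal(size=(n, p))
true_beta = np.array([4.0, 3.0, -2.0, 0.5])   # intercept first (assumed values)

# Design matrix: a column of ones for the intercept, then the p features
X = np.column_stack([np.ones(n), features])
y = X @ true_beta + rng.normal(0, 0.1, size=n)

# Normal equation: solve (X^T X) beta = X^T y
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check with NumPy's least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(X.shape)        # (n, p + 1)
print(beta_normal)
```

When columns of $\mathbf{X}$ are nearly collinear, $\mathbf{X}^T\mathbf{X}$ becomes ill-conditioned, and a solver such as `np.linalg.lstsq` is usually the safer option.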

5. Model Evaluation Metrics

R² Score (Coefficient of Determination):

R^2 = 1 - \frac{SS_{res}}{SS_{tot}} = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}

$R^2 = 1$ : Perfect fit
$R^2 = 0$ : Model predicts the mean
$R^2 < 0$ : Model is worse than predicting the mean

Adjusted R²:

R^2_{adj} = 1 - \frac{(1-R^2)(n-1)}{n-p-1}

Example: with $SS_{res} = 30$ and $SS_{tot} = 100$, $R^2 = 1 - 30/100 = 0.70$ (70% of the variance explained).
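Both metrics can be computed by hand from the formulas above. The following is a small sketch with a made-up five-point example; the function name `r2_and_adjusted` is hypothetical, and `p` is taken to be the number of predictors excluding the intercept.

```python
import numpy as np

def r2_and_adjusted(y, y_hat, p):
    """Return (R^2, adjusted R^2) for predictions y_hat with p predictors."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)         # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return r2, r2_adj

# Illustrative values: one prediction is off by 1
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_hat = np.array([1.0, 2.0, 3.0, 4.0, 4.0])

r2, r2_adj = r2_and_adjusted(y, y_hat, p=1)
print(r2, r2_adj)

# Predicting the mean everywhere gives R^2 = 0
r2_mean, _ = r2_and_adjusted(y, np.full(5, y.mean()), p=1)
```

Here $SS_{res} = 1$ and $SS_{tot} = 10$, so $R^2 = 0.9$. Adjusted R² is lower because it penalizes the number of predictors relative to the sample size.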

6. Assumptions of Linear Regression

Checking Assumptions with Residual Plots
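The assumptions usually listed for ordinary least squares are linearity, independent errors, constant error variance, and approximately normal errors. The sketch below computes the quantities one would typically plot or inspect when checking them. It assumes a single feature and synthetic data; the function name `residual_diagnostics` and the choice of summary statistics are illustrative, not a definitive diagnostic suite.

```python
import numpy as np

def residual_diagnostics(x, y):
    """Fit a line by least squares and return residual-based summaries."""
    slope, intercept = np.polyfit(x, y, deg=1)
    fitted = intercept + slope * x
    residuals = y - fitted
    # Durbin-Watson statistic: near 2 suggests little lag-1 autocorrelation.
    # It is only meaningful when observations have a natural order (e.g. time).
    dw = np.sum(np.diff(residuals) ** 2) / np.sum(residuals ** 2)
    # Correlation of |residual| with fitted value: a crude hint of
    # non-constant variance when it is far from 0.
    spread_corr = np.corrcoef(np.abs(residuals), fitted)[0, 1]
    return {
        "fitted": fitted,
        "residuals": residuals,
        "mean_residual": residuals.mean(),
        "durbin_watson": dw,
        "spread_corr": spread_corr,
    }

# Synthetic data that satisfies the assumptions (for illustration)
rng = np.random.default_rng(42)
x = rng.uniform(0, 2, size=300)
y = 4 + 3 * x + rng.normal(0, 1, size=300)

diag = residual_diagnostics(x, y)
print(diag["mean_residual"], diag["durbin_watson"], diag["spread_corr"])
```

A residual plot is then a scatter of `diag["fitted"]` against `diag["residuals"]`, for example with `plt.scatter`. A structureless horizontal band around zero is consistent with linearity and constant variance, while a curve or funnel shape suggests a violation.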

7. Implementation in Python

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

# Generate sample data
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit model
model = LinearRegression()
model.fit(X_train, y_train)

# Predictions
y_pred = model.predict(X_test)

# Evaluate
print(f"Intercept (β₀): {model.intercept_[0]:.4f}")
print(f"Slope (β₁): {model.coef_[0][0]:.4f}")
print(f"R² Score: {r2_score(y_test, y_pred):.4f}")
print(f"RMSE: {np.sqrt(mean_squared_error(y_test, y_pred)):.4f}")

# Visualize
plt.scatter(X_test, y_test, color='blue', alpha=0.6, label='Actual')
plt.plot(X_test, y_pred, color='red', linewidth=2, label='Predicted')
plt.xlabel('Feature')
plt.ylabel('Target')
plt.title('Linear Regression Fit')
plt.legend()
plt.show()

Key Takeaways

Linear regression finds the best-fit line through the data points
The cost function (MSE) measures prediction error, and fitting means minimizing it
Gradient descent iteratively updates the weights to find the minimum
The R² score tells you how much of the variance the model explains
Validate the assumptions before trusting the model
Regularization (Ridge/Lasso) can reduce overfitting

Next: Logistic Regression

Extend linear regression to classification with the sigmoid function.
