Logistic Regression — Complete Guide for Classification

Despite its name, logistic regression is a classification algorithm. It predicts the probability that an input belongs to a class.


From Linear to Logistic Regression

Linear Regression:    ŷ = wx + b (output: any real number)
Logistic Regression:  ŷ = σ(wx + b) (output: probability between 0 and 1)

The Sigmoid Function (with z = wx + b):
σ(z) = 1 / (1 + e⁻ᶻ)

Output:
z = 0   → σ(z) = 0.5
z = 2   → σ(z) = 0.88
z = -2  → σ(z) = 0.12
z → ∞   → σ(z) → 1
z → -∞  → σ(z) → 0

Decision boundary:
σ(z) ≥ 0.5 → Class 1
σ(z) < 0.5 → Class 0
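
To make the table above concrete, here is a minimal NumPy sketch of the sigmoid and the 0.5 threshold. The function name `sigmoid` and the sample inputs are choices made for this illustration, not part of any library.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 2.0])
probs = sigmoid(z)                    # approximately [0.12, 0.50, 0.88]
labels = (probs >= 0.5).astype(int)   # threshold at 0.5 gives [0, 1, 1]

for zi, p, c in zip(z, probs, labels):
    print(f"z = {zi:+.1f}  sigma(z) = {p:.2f}  class = {c}")
```

Since σ(z) ≥ 0.5 exactly when z ≥ 0, thresholding the probability at 0.5 is the same as checking the sign of z.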

Cost Function

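Linear regression uses MSE, but MSE is a poor fit for logistic regression.

Why?
With a sigmoid output, MSE is no longer convex in the weights → gradient descent can stall
The MSE gradient contains the factor ŷ(1-ŷ), which is close to 0 when ŷ is near 0 or 1
If y=1 and ŷ ≈ 0 (confidently wrong) → error is large, but the gradient is tiny → slow learning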

Solution: Binary Cross-Entropy (Log Loss)

L = -[y log(ŷ) + (1-y) log(1-ŷ)]

If y=1: L = -log(ŷ)
  ŷ=0.9 → L = 0.11 (good)
  ŷ=0.1 → L = 2.30 (bad)

If y=0: L = -log(1-ŷ)
  ŷ=0.1 → L = 0.11 (good)
  ŷ=0.9 → L = 2.30 (bad)
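
The numbers above can be reproduced with a short NumPy sketch. The helper name `binary_cross_entropy` and the clipping constant `eps` are assumptions made for this illustration. The second half compares the gradients with respect to z for a confidently wrong prediction, using the standard results dL/dz = ŷ - y for cross-entropy and dL/dz = 2(ŷ - y)ŷ(1 - ŷ) for squared error.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean log loss; eps clips probabilities so log(0) never occurs."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(losses))

loss_good = binary_cross_entropy([1], [0.9])   # about 0.105
loss_bad = binary_cross_entropy([1], [0.1])    # about 2.303
print(f"y=1, y_hat=0.9 -> L = {loss_good:.3f}")
print(f"y=1, y_hat=0.1 -> L = {loss_bad:.3f}")

# Confidently wrong case: true label 1, but z = -6 so y_hat is near 0.
y, z = 1.0, -6.0
y_hat = 1.0 / (1.0 + np.exp(-z))
grad_bce = y_hat - y                                # stays close to -1
grad_mse = 2.0 * (y_hat - y) * y_hat * (1 - y_hat)  # shrinks with y_hat(1 - y_hat)
print(f"cross-entropy gradient: {grad_bce:.4f}")
print(f"squared-error gradient: {grad_mse:.4f}")
```

The cross-entropy gradient stays large when the model is confidently wrong, which is the practical reason it trains faster than MSE here.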

Python Implementation

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Generate a synthetic binary classification dataset
X, y = make_classification(n_samples=1000, n_features=10,
                           n_informative=5, random_state=42)

# Hold out 20% for testing; fix the seed and stratify so results are reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Fit (scikit-learn applies L2 regularization by default)
model = LogisticRegression()
model.fit(X_train, y_train)

# Predict hard labels and class-1 probabilities
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]

# Evaluate: accuracy uses the labels, AUC-ROC uses the probabilities
print(f"Accuracy: {accuracy_score(y_test, y_pred):.3f}")
print(f"AUC-ROC: {roc_auc_score(y_test, y_prob):.3f}")
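
For intuition about what a fit step does, here is a from-scratch sketch using batch gradient descent on the mean cross-entropy loss. It is not how scikit-learn fits the model (its default solver is different). The function `fit_logistic`, the learning rate, the iteration count, and the toy data are all assumptions made for this illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iters=2000):
    """Batch gradient descent on the mean binary cross-entropy (no regularization)."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        y_hat = sigmoid(X @ w + b)
        error = y_hat - y                      # gradient of the loss w.r.t. z
        w -= lr * (X.T @ error) / n_samples    # average gradient w.r.t. weights
        b -= lr * error.mean()                 # average gradient w.r.t. bias
    return w, b

# Toy data (made up for illustration): two Gaussian blobs in 2D.
rng = np.random.default_rng(0)
X_toy = np.vstack([rng.normal(-1.5, 1.0, size=(100, 2)),
                   rng.normal(+1.5, 1.0, size=(100, 2))])
y_toy = np.concatenate([np.zeros(100), np.ones(100)])

w, b = fit_logistic(X_toy, y_toy)
train_acc = np.mean((sigmoid(X_toy @ w + b) >= 0.5) == y_toy)
print(f"weights = {w}, bias = {b:.3f}, training accuracy = {train_acc:.3f}")
```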

Decision Boundary

The decision boundary is where σ(z) = 0.5, which happens exactly when z = 0. With two features:

w₁x₁ + w₂x₂ + b = 0

This is a LINE in 2D, a PLANE in 3D,
and a HYPERPLANE in higher dimensions.
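
As a rough check of the 2D case, the sketch below fits a model on two features, solves w₁x₁ + w₂x₂ + b = 0 for x₂, and confirms that points on that line get a class-1 probability of 0.5. The synthetic dataset and variable names are assumptions made for this illustration, and the code assumes the fitted w₂ is not zero.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic 2-feature dataset (parameters chosen only for this example).
X2, y2 = make_classification(n_samples=200, n_features=2, n_informative=2,
                             n_redundant=0, n_clusters_per_class=1,
                             random_state=0)
clf = LogisticRegression().fit(X2, y2)

w1, w2 = clf.coef_[0]
b = clf.intercept_[0]

# Solve w1*x1 + w2*x2 + b = 0 for x2 to get points on the boundary line.
x1 = np.linspace(-2.0, 2.0, 5)
x2 = -(w1 * x1 + b) / w2
boundary_points = np.column_stack([x1, x2])

# Every point on the boundary should have probability 0.5 for class 1.
boundary_probs = clf.predict_proba(boundary_points)[:, 1]
print(boundary_probs)
```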

For nonlinear boundaries:
- Add polynomial features
- Use kernel logistic regression
- Or use neural networks
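
The first option can be sketched with scikit-learn's `PolynomialFeatures` in a pipeline. The concentric-circles dataset and its parameters are assumptions chosen because no straight line separates the two classes. Accuracy is measured on the training data only to keep the example short.

```python
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Two concentric rings: not linearly separable in (x1, x2).
Xc, yc = make_circles(n_samples=500, noise=0.1, factor=0.4, random_state=0)

# Plain logistic regression: linear boundary in the original features.
linear_clf = LogisticRegression().fit(Xc, yc)

# Degree-2 features add x1^2, x1*x2, x2^2, so a circular boundary becomes
# linear in the expanded feature space.
poly_clf = make_pipeline(PolynomialFeatures(degree=2),
                         LogisticRegression()).fit(Xc, yc)

linear_acc = linear_clf.score(Xc, yc)
poly_acc = poly_clf.score(Xc, yc)
print(f"linear features:   {linear_acc:.3f}")
print(f"degree-2 features: {poly_acc:.3f}")
```

The model is still logistic regression; only the feature space changed.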

Multiclass Extension

Binary → Multiclass:

Method 1: One-vs-Rest (OvR)
├─ Train K binary classifiers (one per class)
├─ Each classifier: "Is it class k or not?"
└─ Class with highest probability wins

Method 2: Softmax Regression (Multinomial)
├─ Generalizes sigmoid to multiple classes
├─ softmax(z)ᵢ = exp(zᵢ) / Σⱼ exp(zⱼ)
└─ Output: probability distribution over classes
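
A minimal NumPy sketch of softmax follows. The function name, the max-subtraction trick for numerical stability, and the example scores are assumptions made for this illustration. The last lines check that with two classes and scores [0, z], softmax gives the same class-1 probability as the sigmoid of z.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = np.asarray(z, dtype=float)
    shifted = z - z.max(axis=-1, keepdims=True)   # avoids overflow in exp
    exp_z = np.exp(shifted)
    return exp_z / exp_z.sum(axis=-1, keepdims=True)

scores = np.array([2.0, 1.0, 0.1])        # hypothetical scores for 3 classes
probs = softmax(scores)                   # non-negative, sums to 1
predicted_class = int(np.argmax(probs))   # class with the highest probability
print(probs, predicted_class)

# Two-class special case: softmax([0, z]) matches sigmoid(z) for class 1.
z = 1.5
two_class = softmax([0.0, z])
sigmoid_z = 1.0 / (1.0 + np.exp(-z))
print(two_class[1], sigmoid_z)
```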

Evaluation Metrics

Confusion Matrix:
                 Predicted
                 0    1
Actual  0    [ TN   FP ]
        1    [ FN   TP ]

Accuracy:  (TP + TN) / Total
Precision: TP / (TP + FP)  — of predicted positives, how many correct?
Recall:    TP / (TP + FN)  — of actual positives, how many found?
F1 Score:  2 × (Precision × Recall) / (Precision + Recall)
AUC-ROC:   Area under ROC curve (threshold-independent)
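
The formulas above can be checked on a tiny hand-made example. The ten labels below are made up for this illustration. The counts come from scikit-learn's `confusion_matrix`, which for labels 0 and 1 uses the same layout as the matrix shown above.

```python
from sklearn.metrics import (confusion_matrix, f1_score,
                             precision_score, recall_score)

# Hypothetical labels for ten samples.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

# ravel() flattens [[TN, FP], [FN, TP]] row by row.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)            # 7 / 10 = 0.70
precision = tp / (tp + fp)                            # 3 / 5  = 0.60
recall = tp / (tp + fn)                               # 3 / 4  = 0.75
f1 = 2 * precision * recall / (precision + recall)    # about 0.667

print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.3f}")
```

The hand-computed values agree with `precision_score`, `recall_score`, and `f1_score` on the same labels.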

Key Takeaways

  1. Logistic regression outputs probabilities using the sigmoid function
  2. Binary cross-entropy is the cost function (not MSE)
  3. Decision boundary is linear in feature space
  4. Multiclass: use Softmax regression or One-vs-Rest
  5. AUC-ROC is threshold-independent and more informative than accuracy on imbalanced datasets; also check precision and recall there
  6. Logistic regression is fast, interpretable, and a great baseline
  7. Add polynomial features for nonlinear boundaries
  8. Use regularization (L1/L2) to prevent overfitting
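
As a small illustration of takeaway 8, scikit-learn's `LogisticRegression` controls regularization strength through `C`, the inverse of the strength, with an L2 penalty by default. The dataset and the three `C` values below are assumptions picked for this sketch. Smaller `C` should pull the weights toward zero.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic dataset (parameters chosen only for this example).
Xr, yr = make_classification(n_samples=300, n_features=10,
                             n_informative=5, random_state=0)

norms = {}
for C in (0.01, 1.0, 100.0):
    # Smaller C means a stronger L2 penalty on the weights.
    clf = LogisticRegression(C=C, max_iter=1000).fit(Xr, yr)
    norms[C] = float(np.linalg.norm(clf.coef_))
    print(f"C = {C:>6}: ||w|| = {norms[C]:.3f}")
```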
