Supervised Learning

If-Then Rules That Learn — The Most Interpretable Algorithm

Decision trees split data using simple if-then-else rules. They are easy to visualize, handle mixed data types, and form the basis for powerful ensemble methods.

Gini Impurity — Measuring node purity for optimal splits
Information Gain — Entropy-based splitting criterion
Pruning — Preventing overfitting by limiting tree complexity

"A decision tree is the only ML algorithm that can be explained to your grandmother."

Decision Trees — Complete Guide

Decision trees make predictions by learning simple rules from data — like a flowchart of if-then-else decisions.

How Decision Trees Work
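
To make the flowchart picture concrete, here is a minimal sketch of what a fitted tree amounts to at prediction time: a nested if-then-else function. The function name `classify_iris` and both thresholds are illustrative assumptions chosen for this example, not values produced by any fit in this guide.

```python
def classify_iris(petal_length_cm: float, petal_width_cm: float) -> str:
    """Hand-written stand-in for a depth-2 tree (thresholds are illustrative)."""
    if petal_length_cm <= 2.45:   # root node test
        return "setosa"
    if petal_width_cm <= 1.75:    # second-level test
        return "versicolor"
    return "virginica"


print(classify_iris(1.4, 0.2))
print(classify_iris(4.5, 1.4))
print(classify_iris(6.0, 2.2))
```

Each path from the first test to a `return` corresponds to one root-to-leaf path in the tree, which is why a trained tree can be read as a list of rules.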

Splitting Criteria

Gini Impurity
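
The Gini impurity of a node is 1 minus the sum of the squared class proportions, so it is 0 for a pure node and grows as the classes mix. The helper below is a minimal sketch of that formula; the name `gini_impurity` is an assumption for illustration and is not a scikit-learn function.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


print(gini_impurity(["a", "a", "a", "a"]))  # pure node -> 0.0
print(gini_impurity(["a", "a", "b", "b"]))  # evenly mixed two-class node -> 0.5
```

A split is considered good when the children it produces have a lower size-weighted impurity than the parent.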

Information Gain (Entropy)
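
Information gain is the entropy of the parent node minus the size-weighted entropy of its children. The sketch below computes it for a binary split; the helper names `entropy` and `information_gain` are assumptions for illustration, not library functions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits: sum of -p * log2(p) over the classes present."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


parent = ["a", "a", "b", "b"]
print(information_gain(parent, ["a", "a"], ["b", "b"]))  # perfect split -> 1.0
print(information_gain(parent, ["a", "b"], ["a", "b"]))  # uninformative split -> 0.0
```

In scikit-learn, passing `criterion='entropy'` to `DecisionTreeClassifier` selects this entropy-based criterion in place of Gini.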

CART Algorithm

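from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Load the iris dataset (150 samples, 4 features, 3 classes).
iris = load_iris()

# Hold out 20% for testing; a fixed seed and stratification keep the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42, stratify=iris.target
)

# Fit a shallow tree; max_depth=3 keeps the printed rules readable.
tree = DecisionTreeClassifier(max_depth=3, criterion='gini', random_state=42)
tree.fit(X_train, y_train)

# Accuracy on the held-out test set.
print(f"Accuracy: {tree.score(X_test, y_test):.3f}")

# Print the learned if-then-else rules as text.
print(export_text(tree, feature_names=iris.feature_names))

# Impurity-based importance of each feature (the values sum to 1).
for name, imp in zip(iris.feature_names, tree.feature_importances_):
    print(f"{name}: {imp:.3f}")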
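
CART grows a tree greedily: at each node it tries binary splits of the form "feature <= threshold" and keeps the one whose children have the lowest size-weighted impurity. The sketch below shows that search for a single node. It is a simplified illustration, not scikit-learn's implementation, and the names `gini` and `best_split` are assumptions.

```python
import numpy as np


def gini(y):
    """Gini impurity of a label array."""
    if len(y) == 0:
        return 0.0
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - np.sum(p ** 2))


def best_split(X, y):
    """Return (feature index, threshold, weighted Gini) of the best binary split.

    Returns (None, None, parent impurity) if no split improves on the parent.
    """
    n_samples, n_features = X.shape
    best = (None, None, gini(y))  # a split must beat the unsplit node
    for j in range(n_features):
        values = np.unique(X[:, j])  # sorted distinct values of feature j
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in (values[:-1] + values[1:]) / 2:
            left = y[X[:, j] <= t]
            right = y[X[:, j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n_samples
            if score < best[2]:
                best = (j, float(t), score)
    return best


# Toy data: feature 0 separates the two classes at 2.5.
X = np.array([[1.0, 5.0], [2.0, 4.0], [3.0, 7.0], [4.0, 6.0]])
y = np.array([0, 0, 1, 1])
print(best_split(X, y))
```

Building a full tree means applying this search recursively to the left and right subsets until a stopping rule such as a maximum depth is reached.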

Pruning
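
Tree complexity can be limited in two ways: before growth, with constraints such as `max_depth` or `min_samples_leaf`, or after growth, by collapsing subtrees that add little. The sketch below illustrates the second approach with scikit-learn's cost-complexity pruning (`ccp_alpha`). The dataset, split, and seed are assumptions chosen to match the iris example, and the variable names are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Grow an unconstrained tree and ask for the alphas at which subtrees collapse.
full_tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
path = full_tree.cost_complexity_pruning_path(X_train, y_train)

results = []
for alpha in path.ccp_alphas:
    alpha = max(float(alpha), 0.0)  # guard against tiny negative rounding errors
    pruned = DecisionTreeClassifier(random_state=42, ccp_alpha=alpha)
    pruned.fit(X_train, y_train)
    results.append((alpha, pruned.get_n_leaves(), pruned.score(X_test, y_test)))

for alpha, n_leaves, acc in results:
    print(f"alpha={alpha:.4f}  leaves={n_leaves}  test acc={acc:.3f}")
```

Larger values of `ccp_alpha` remove more of the tree. In practice the value would be chosen by cross-validation on the training data rather than by looking at test accuracy, which is printed here only to show the trade-off.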

Feature Importance

Key Takeaways

What to Learn Next

-> Random Forest: an ensemble of decision trees for better accuracy and stability.

-> XGBoost: gradient boosting taken to the extreme, often among the top performers on tabular data.

-> Ensemble Methods: bagging, boosting, and stacking for stronger models.
