Anomaly Detection

Introduction

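Anomaly detection identifies data points that deviate from the bulk of a dataset. The scikit-learn estimators in this lesson are unsupervised: they learn from unlabeled data and mark each point as 1 (normal) or -1 (anomaly).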

Isolation Forest

from sklearn.ensemble import IsolationForest
from sklearn.datasets import make_blobs
import numpy as np

X, _ = make_blobs(n_samples=100, centers=1, random_state=42)

# contamination = expected fraction of anomalies in the data (here 10%)
iso = IsolationForest(contamination=0.1, random_state=42)
labels = iso.fit_predict(X)

# -1 for anomalies, 1 for normal
anomalies = X[labels == -1]
print(f"Number of anomalies: {len(anomalies)}")

# Anomaly scores (lower = more anomalous; negative values are flagged as anomalies)
scores = iso.decision_function(X)
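To make the link between scores and labels concrete, the sketch below refits the same model and ranks points by their score. It assumes the same synthetic make_blobs data as above; most_anomalous and labels_from_scores are names introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.ensemble import IsolationForest

# Same synthetic data as in the lesson (an assumption for this sketch)
X, _ = make_blobs(n_samples=100, centers=1, random_state=42)

iso = IsolationForest(contamination=0.1, random_state=42).fit(X)
labels = iso.predict(X)
scores = iso.decision_function(X)

# Sort indices by score, lowest (most anomalous) first, and keep the top 5
most_anomalous = np.argsort(scores)[:5]
print("Most anomalous points:")
print(X[most_anomalous])
print("Their scores:", scores[most_anomalous])

# Rebuild the labels from the sign of the score: negative means anomaly
labels_from_scores = np.where(scores < 0, -1, 1)
print("Labels match score sign:", np.array_equal(labels, labels_from_scores))
```

Ranking by score is often more useful than the hard labels, because it lets you review the most suspicious points first without committing to a fixed contamination value.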

One-Class SVM

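from sklearn.datasets import make_blobs
from sklearn.svm import OneClassSVM

# Illustrative data (an assumption): clean training points and a wider test sample
X_normal, _ = make_blobs(n_samples=100, centers=[[0, 0]], random_state=42)
X_test, _ = make_blobs(n_samples=50, centers=[[0, 0]], cluster_std=3.0, random_state=0)

# nu bounds the fraction of training points treated as outliers
ocsvm = OneClassSVM(kernel='rbf', gamma='auto', nu=0.1)
ocsvm.fit(X_normal)

labels = ocsvm.predict(X_test)
# 1 = normal, -1 = anomaly

# Decision function scores (negative = anomaly)
scores = ocsvm.decision_function(X_test)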
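The nu parameter plays a role similar to contamination: it is an upper bound on the fraction of training points that end up outside the learned boundary. The sketch below is one way to see that effect. The data is synthetic and the names X_train and flagged_fraction are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import OneClassSVM

# Synthetic training data (an assumption for illustration)
X_train, _ = make_blobs(n_samples=200, centers=[[0, 0]], random_state=42)

flagged_fraction = {}
for nu in (0.05, 0.1, 0.2):
    model = OneClassSVM(kernel="rbf", gamma="auto", nu=nu).fit(X_train)
    # Share of training points that the fitted model labels as anomalies
    flagged_fraction[nu] = float(np.mean(model.predict(X_train) == -1))

for nu, frac in flagged_fraction.items():
    print(f"nu={nu:.2f} -> flagged {frac:.2%} of training points")
```

In practice the flagged share tends to sit close to nu, though the solver's numerical tolerance can push it slightly above or below.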

Local Outlier Factor

from sklearn.neighbors import LocalOutlierFactor

lof = LocalOutlierFactor(n_neighbors=20, contamination=0.1)
labels = lof.fit_predict(X)  # -1 = anomaly, 1 = normal

# Negative outlier factor (more negative = more anomalous)
outlier_scores = lof.negative_outlier_factor_
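The following sketch shows how the labels follow from negative_outlier_factor_ and the fitted threshold offset_. It reuses the lesson's synthetic data as an assumption; lof_values and labels_from_threshold are illustrative names.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neighbors import LocalOutlierFactor

# Same synthetic data as in the lesson (an assumption for this sketch)
X, _ = make_blobs(n_samples=100, centers=1, random_state=42)

lof = LocalOutlierFactor(n_neighbors=20, contamination=0.1)
labels = lof.fit_predict(X)
outlier_scores = lof.negative_outlier_factor_

# Flip the sign to get the LOF itself: close to 1 for inliers, larger for outliers
lof_values = -outlier_scores

# With a numeric contamination, offset_ is the score threshold below which
# a training point is labeled as an anomaly
labels_from_threshold = np.where(outlier_scores < lof.offset_, -1, 1)

print("Threshold (offset_):", lof.offset_)
print("Points flagged:", int((labels == -1).sum()))
print("Matches fit_predict:", np.array_equal(labels, labels_from_threshold))
```

Because LOF compares each point with its n_neighbors nearest neighbors, changing n_neighbors can change which points are flagged even when contamination stays the same.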

Novelty Detection

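from sklearn.datasets import make_blobs
from sklearn.ensemble import IsolationForest

# Illustrative data (an assumption): clean training points and a wider batch of new points
X_normal, _ = make_blobs(n_samples=100, centers=[[0, 0]], random_state=42)
X_new_data, _ = make_blobs(n_samples=20, centers=[[0, 0]], cluster_std=3.0, random_state=0)

# Train on normal data only (novelty=True is a LocalOutlierFactor option, not an IsolationForest one)
iso = IsolationForest(contamination=0.1, random_state=42)
iso.fit(X_normal)

# Predict on new data
labels = iso.predict(X_new_data)
scores = iso.decision_function(X_new_data)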
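LocalOutlierFactor can also be used for novelty detection, but only when it is constructed with novelty=True. A minimal sketch follows, assuming a clean synthetic training cluster and a handful of hand-picked new points. X_near and X_far are hypothetical names and values chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neighbors import LocalOutlierFactor

# Hypothetical data: a clean training cluster plus a few hand-picked new points
X_normal, _ = make_blobs(n_samples=200, centers=[[0, 0]], cluster_std=1.0, random_state=42)
X_near = np.array([[0.0, 0.0], [0.2, -0.1], [-0.3, 0.2]])  # inside the cluster
X_far = np.array([[8.0, 8.0], [-9.0, 7.0], [10.0, -8.0]])  # far from the cluster
X_new_data = np.vstack([X_near, X_far])

# novelty=True enables predict and decision_function on unseen data
lof = LocalOutlierFactor(n_neighbors=20, novelty=True)
lof.fit(X_normal)

labels = lof.predict(X_new_data)            # 1 = normal, -1 = novelty
scores = lof.decision_function(X_new_data)  # negative = novelty

for point, label, score in zip(X_new_data, labels, scores):
    print(point, label, round(float(score), 3))
```

In novelty mode, fit_predict is not available, and predict should be applied to new data only, not to the training set.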

Practice Problems

  1. Detect outliers with IsolationForest
  2. Use One-Class SVM for novelty detection
  3. Compare LOF vs IsolationForest
  4. Set contamination parameter
  5. Extract anomaly scores
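As a possible starting point for problems 3 and 4, the sketch below runs both detectors on the same data with the same contamination value and measures how much their flagged sets overlap. The data, the variable names, and the choice of Jaccard overlap as the comparison metric are all assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

# Same synthetic data as in the lesson (an assumption for this sketch)
X, _ = make_blobs(n_samples=100, centers=1, random_state=42)
contamination = 0.1  # try other values to see how the flagged sets change

iso_labels = IsolationForest(contamination=contamination, random_state=42).fit_predict(X)
lof_labels = LocalOutlierFactor(n_neighbors=20, contamination=contamination).fit_predict(X)

# Indices of the points each detector flags as anomalies
iso_flagged = set(np.flatnonzero(iso_labels == -1).tolist())
lof_flagged = set(np.flatnonzero(lof_labels == -1).tolist())

both = iso_flagged & lof_flagged
either = iso_flagged | lof_flagged
# Jaccard overlap: 1.0 means identical flagged sets, 0.0 means no shared points
jaccard = len(both) / len(either) if either else 1.0

print("IsolationForest flagged:", len(iso_flagged))
print("LOF flagged:", len(lof_flagged))
print("Flagged by both:", len(both))
print(f"Jaccard overlap: {jaccard:.2f}")
```

The two methods need not agree: IsolationForest scores points by how easily random splits isolate them, while LOF compares local densities, so points on the edge of a cluster can be ranked differently.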
