Scikit-Learn Basics

Introduction

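Scikit-Learn provides a consistent interface for machine learning algorithms. Every estimator learns from data with fit(); predictors add predict(), transformers add transform(), and most models report a quality measure through score().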
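
The transformer side of this interface can be sketched with StandardScaler. This is a minimal illustration, assuming a toy one-feature array (X_demo is a made-up name with made-up values); it shows fit() learning statistics and transform() applying them.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature matrix: three samples, one feature (illustrative values only)
X_demo = np.array([[1.0], [2.0], [3.0]])

scaler = StandardScaler()
scaler.fit(X_demo)                   # learn the per-feature mean and standard deviation
X_scaled = scaler.transform(X_demo)  # apply the learned scaling to the data

print(scaler.mean_)      # learned per-feature mean
print(X_scaled.ravel())  # centered values with unit variance
```

Calling scaler.fit_transform(X_demo) would combine both steps in a single call.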

Estimator Interface

from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier

# Every estimator has fit(); predictors such as these also have predict()
reg = LinearRegression()
X = [[1], [2], [3], [4], [5]]
y = [1, 2, 3, 4, 5]
reg.fit(X, y)
print(reg.predict([[6]]))  # [6.]

clf = DecisionTreeClassifier()
X_clf = [[0, 0], [1, 1], [0, 1], [1, 0]]
y_clf = [0, 1, 1, 0]
clf.fit(X_clf, y_clf)
print(clf.predict([[1, 1]]))  # [1]
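
Because all classifiers share the same fit() and predict() methods, swapping one for another usually requires no other code changes. The sketch below reuses the toy data from above; the choice of models and the n_neighbors=1 and random_state=0 settings are illustrative assumptions, not recommendations.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X_clf = [[0, 0], [1, 1], [0, 1], [1, 0]]
y_clf = [0, 1, 1, 0]

# Three different algorithms behind the same interface
models = [
    DecisionTreeClassifier(random_state=0),
    KNeighborsClassifier(n_neighbors=1),
    LogisticRegression(),
]

predictions = {}
for model in models:
    model.fit(X_clf, y_clf)          # identical call for every estimator
    name = type(model).__name__
    predictions[name] = model.predict([[1, 1]])
    print(name, predictions[name])
```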

Working with Data

from sklearn.datasets import load_iris, make_classification

# Load built-in dataset
iris = load_iris()
X, y = iris.data, iris.target
print(f"Features: {iris.feature_names}")

# Generate synthetic data (random_state makes the result reproducible)
X_syn, y_syn = make_classification(n_samples=100, n_features=4, n_classes=2, random_state=42)
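
Before fitting a model it often helps to check what a dataset contains. The following is a small sketch, assuming the variable names X_iris and X_syn and an arbitrary seed of 42; it inspects array shapes and confirms that a fixed random_state yields the same synthetic data on every call.

```python
import numpy as np
from sklearn.datasets import load_iris, make_classification

# Built-in dataset: 150 flowers, 4 measurements each
iris = load_iris()
X_iris, y_iris = iris.data, iris.target
print(X_iris.shape, y_iris.shape)
print(list(iris.target_names))

# Synthetic dataset: the seed value 42 is arbitrary
X_syn, y_syn = make_classification(
    n_samples=100, n_features=4, n_classes=2, random_state=42
)
X_again, y_again = make_classification(
    n_samples=100, n_features=4, n_classes=2, random_state=42
)

# The same seed reproduces the same arrays
reproducible = np.array_equal(X_syn, X_again) and np.array_equal(y_syn, y_again)
print(X_syn.shape, reproducible)
```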

Estimator Properties

from sklearn.linear_model import LinearRegression

reg = LinearRegression(fit_intercept=True)
reg.fit([[1], [2], [3]], [1, 2, 3])

# Attributes after fitting
print(f"Coefficients: {reg.coef_}")
print(f"Intercept: {reg.intercept_}")
# For regressors, score() returns the coefficient of determination (R^2)
print(f"R^2 score: {reg.score([[1], [2], [3]], [1, 2, 3])}")
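
Estimators keep two kinds of state: hyperparameters passed to the constructor, and learned attributes whose names end in an underscore and exist only after fit(). A minimal sketch of reading and changing hyperparameters with get_params() and set_params(), using the same toy data as above:

```python
from sklearn.linear_model import LinearRegression

reg = LinearRegression(fit_intercept=True)

# Hyperparameters are available before fitting
params = reg.get_params()
print(params["fit_intercept"])

# Change a hyperparameter, then fit
reg.set_params(fit_intercept=False)
reg.fit([[1], [2], [3]], [1, 2, 3])

# Learned attributes (trailing underscore) exist only after fit()
print(reg.coef_)
print(reg.intercept_)  # 0.0, because no intercept was fitted
```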

Model Persistence

import joblib
from sklearn.linear_model import LogisticRegression

# Save model
model = LogisticRegression()
model.fit([[1], [2], [3]], [0, 1, 1])
joblib.dump(model, 'model.joblib')

# Load model
loaded_model = joblib.load('model.joblib')
print(loaded_model.predict([[1.5]]))
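
A quick way to check a save and load round trip is to compare predictions before and after. The sketch below writes to a temporary directory so it leaves no files behind; the names X_train, restored, and same are illustrative.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

X_train = [[1], [2], [3]]
y_train = [0, 1, 1]
model = LogisticRegression().fit(X_train, y_train)

# Save and reload inside a temporary directory that is cleaned up afterwards
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.joblib")
    joblib.dump(model, path)
    restored = joblib.load(path)

# The restored model should behave exactly like the original
same = np.array_equal(model.predict(X_train), restored.predict(X_train))
print(same)
```

Files written by joblib rely on pickle, so load them only from sources you trust, and ideally with the same scikit-learn version that saved them.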

Practice Problems

  1. Fit a linear regression model on a housing dataset
  2. Fit a decision tree classifier
  3. Extract a fitted model's coefficients and intercept
  4. Save and load a trained model
  5. Fit a different estimator on the same data
