Linear Algebra — The Language of Data
ℹ️ Why It Matters
Every image, text, dataset, and neural network in AI is represented as numbers arranged in grids. Linear algebra is the math of those grids (matrices and vectors). Without it, there is no modern AI.
What is a Vector?
A vector is a list of numbers arranged in order. Think of it as an arrow pointing to a location in space.
Simple Analogy: Think of a vector like a GPS coordinate — it tells you exactly where something is.
Definition: Vector
A vector is an ordered tuple of numbers that represents a point or direction in space. A vector in n-dimensional space is written as:
Vector Notation
$$\mathbf{v} = (v_1, v_2, \dots, v_n) \in \mathbb{R}^n$$
Here,
- $\mathbf{v}$ = The vector (an ordered list of numbers)
- $v_1, v_2, \dots, v_n$ = Individual components of the vector
- $\mathbb{R}^n$ = n-dimensional real coordinate space
Real-World Example:
- A student's grades: `[Math: 85, English: 90, Science: 78]` is a 3D vector
- An RGB pixel color: `[255, 128, 0]` is a 3D vector (orange color)
- Word embeddings in NLP: the word "king" might be represented as a 300-dimensional vector
Vector Operations
Addition: Add corresponding elements
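Vector Addition
$$\mathbf{u} + \mathbf{v} = (u_1 + v_1,\ u_2 + v_2,\ \dots,\ u_n + v_n)$$
Here,
- $\mathbf{u}, \mathbf{v}$ = Two vectors of the same dimension
- $\mathbf{u} + \mathbf{v}$ = Resultant vector with summed components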
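📝Example: Vector Addition
If $\mathbf{u} = (1, 2, 3)$ and $\mathbf{v} = (4, 5, 6)$, then:
$$\mathbf{u} + \mathbf{v} = (1 + 4,\ 2 + 5,\ 3 + 6) = (5, 7, 9)$$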
Scalar Multiplication: Multiply every element by a number
Scalar Multiplication
$$c\,\mathbf{v} = (c\,v_1,\ c\,v_2,\ \dots,\ c\,v_n)$$
Here,
- $c$ = A scalar (single real number)
- $\mathbf{v}$ = The vector to be scaled
- $c\,\mathbf{v}$ = The scaled vector
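📝Example: Scalar Multiplication
If $c = 2$ and $\mathbf{v} = (4, 5, 6)$, then:
$$c\,\mathbf{v} = (2 \cdot 4,\ 2 \cdot 5,\ 2 \cdot 6) = (8, 10, 12)$$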
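Addition and scalar multiplication can be sketched in NumPy as follows. The values of `u`, `v`, and `c` are illustrative assumptions, chosen only so the arithmetic is easy to check by hand.

```python
import numpy as np

# Illustrative values (assumed for this sketch)
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
c = 2.0

total = u + v     # adds corresponding elements
scaled = c * v    # multiplies every element by c

print(total)      # [5. 7. 9.]
print(scaled)     # [ 8. 10. 12.]
```

Both operations require nothing beyond matching dimensions for `u` and `v`; NumPy applies them element by element.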
Dot Product: Multiply corresponding elements and sum them up
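Dot Product
$$\mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^{n} u_i v_i = u_1 v_1 + u_2 v_2 + \dots + u_n v_n$$
Here,
- $\mathbf{u} \cdot \mathbf{v}$ = The dot product (scalar result)
- $u_i, v_i$ = The i-th components of vectors u and v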
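📝Example: Dot Product
If $\mathbf{u} = (1, 2, 3)$ and $\mathbf{v} = (4, 5, 6)$:
$$\mathbf{u} \cdot \mathbf{v} = 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 = 4 + 10 + 18 = 32$$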
💡 Why dot product matters
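It measures how closely two vectors align: $\mathbf{u} \cdot \mathbf{v} = \lVert \mathbf{u} \rVert \, \lVert \mathbf{v} \rVert \cos\theta$, where $\theta$ is the angle between them. If the dot product is positive, the vectors point in broadly the same direction (similar). If it's zero, they're perpendicular (unrelated). If negative, they point in broadly opposite directions (opposite). Because the value also grows with the lengths of the vectors, divide by both lengths (cosine similarity) when only direction should count.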
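The three cases (same direction, perpendicular, opposite) can be checked with a short sketch. The helper name `cosine_similarity` and the test vectors are assumptions made for illustration; dividing by the magnitudes removes the effect of vector length.

```python
import numpy as np

def cosine_similarity(u, v):
    """Dot product divided by both magnitudes (assumes non-zero vectors)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

a = np.array([1.0, 2.0])
same = np.array([2.0, 4.0])         # same direction as a, twice as long
perp = np.array([-2.0, 1.0])        # perpendicular to a
opposite = np.array([-1.0, -2.0])   # opposite direction

print(np.dot(a, same))       # 10.0 (positive)
print(np.dot(a, perp))       # 0.0  (perpendicular)
print(np.dot(a, opposite))   # -5.0 (negative)

# Cosine similarity is close to +1, 0, and -1 for the three cases
for other in (same, perp, opposite):
    print(round(cosine_similarity(a, other), 6))
```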
Magnitude (Length of a Vector)
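Vector Magnitude (Euclidean Norm)
$$\lVert \mathbf{v} \rVert = \sqrt{\sum_{i=1}^{n} v_i^2} = \sqrt{v_1^2 + v_2^2 + \dots + v_n^2}$$
Here,
- $\lVert \mathbf{v} \rVert$ = The magnitude (length) of vector v
- $v_i$ = The i-th component of vector v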
📝Example: Magnitude
For $\mathbf{v} = (3, 4)$:
$$\lVert \mathbf{v} \rVert = \sqrt{3^2 + 4^2} = \sqrt{25} = 5$$
Unit Vector
A unit vector is a vector with magnitude 1. It preserves direction while removing scale.
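Unit Vector
$$\hat{\mathbf{v}} = \frac{\mathbf{v}}{\lVert \mathbf{v} \rVert}$$
Here,
- $\hat{\mathbf{v}}$ = The unit vector in the direction of v
- $\mathbf{v}$ = The original vector
- $\lVert \mathbf{v} \rVert$ = The magnitude of v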
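📝Example: Unit Vector
For $\mathbf{v} = (3, 4)$ with $\lVert \mathbf{v} \rVert = 5$:
$$\hat{\mathbf{v}} = \left( \tfrac{3}{5}, \tfrac{4}{5} \right) = (0.6, 0.8)$$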
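Magnitude and normalization can be sketched together. The function name `normalize` and the sample vector are assumptions for illustration; the zero vector is rejected because it has no direction to preserve.

```python
import numpy as np

def normalize(v):
    """Scale v to length 1; the zero vector cannot be normalized."""
    norm = np.linalg.norm(v)
    if norm == 0:
        raise ValueError("cannot normalize the zero vector")
    return v / norm

v = np.array([3.0, 4.0])           # illustrative vector
length = np.linalg.norm(v)         # sqrt(3^2 + 4^2)
v_hat = normalize(v)

print(length)                      # 5.0
print(v_hat)                       # [0.6 0.8]
print(np.linalg.norm(v_hat))       # approximately 1.0
```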
What is a Matrix?
A matrix is a rectangular grid of numbers arranged in rows and columns.
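Definition: Matrix
A matrix is a rectangular array of numbers arranged in rows and columns. An $m \times n$ matrix has $m$ rows and $n$ columns.
Matrix Notation
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \in \mathbb{R}^{m \times n}$$
Here,
- $A$ = The matrix
- $a_{ij}$ = Element in row i, column j
- $m \times n$ = Dimensions (rows × columns)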
Real-World Examples:
- Dataset: Each row = one data point, each column = one feature
- Grayscale Image: Each pixel is a number (0=black, 255=white)
- Neural Network Weights: The weights connecting one layer to the next form a matrix of connection strengths
Matrix Operations
Addition: Add corresponding elements (matrices must be same size)
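📝Example: Matrix Addition
$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} + \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix} = \begin{pmatrix} 6 & 8 \\ 10 & 12 \end{pmatrix}$$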
Matrix Multiplication: Row × Column, then sum
Matrix Multiplication
$$C = AB, \qquad c_{ij} = \sum_{k=1}^{n} a_{ik}\, b_{kj}$$
Here,
- $A$ = First matrix (m × n)
- $B$ = Second matrix (n × p)
- $C$ = Result matrix (m × p)
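📝Example: Matrix Multiplication
$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix} = \begin{pmatrix} 1 \cdot 5 + 2 \cdot 7 & 1 \cdot 6 + 2 \cdot 8 \\ 3 \cdot 5 + 4 \cdot 7 & 3 \cdot 6 + 4 \cdot 8 \end{pmatrix} = \begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix}$$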
⚠️ Important Rule
For $AB$ to work, the number of columns of $A$ must equal the number of rows of $B$.
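The shape rule can be seen directly in NumPy. This is a small sketch with made-up matrices: a (2, 3) matrix times a (3, 2) matrix works, while multiplying the (2, 3) matrix by itself raises an error.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3)
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])       # shape (3, 2)

# Columns of A (3) match rows of B (3), so the product is defined
C = A @ B                      # shape (2, 2); rows are [58, 64] and [139, 154]
print(C)

# Each entry is a row of A dotted with a column of B
print(C[0, 0] == np.dot(A[0, :], B[:, 0]))   # True

# A has 3 columns but only 2 rows, so A @ A violates the rule
try:
    A @ A
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)         # True
```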
Transpose
Flip rows and columns (mirror along the diagonal).
Matrix Transpose
$$(A^T)_{ij} = a_{ji}$$
Here,
- $A^T$ = The transpose of matrix A
- $a_{ji}$ = Element from row j, column i of original
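📝Example: Transpose
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \quad \Rightarrow \quad A^T = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}$$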
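In NumPy the transpose is available as the `.T` attribute. The matrix below is an illustrative assumption; the sketch checks that rows and columns swap and that transposing twice returns the original.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])     # shape (2, 3)
At = A.T                      # shape (3, 2)

print(At.shape)               # (3, 2)
print(At[2, 0] == A[0, 2])    # True: entry (i, j) of A^T is entry (j, i) of A
print(np.array_equal(At.T, A))  # True: transposing twice gives A back
```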
Identity Matrix
The "1" of matrices. Multiplying anything by it gives the same thing back.
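Definition: Identity Matrix
The identity matrix $I$ is a square matrix with 1s on the diagonal and 0s elsewhere. For any matrix $A$ of compatible size: $AI = A$ and $IA = A$.
📝Example: 3×3 Identity Matrix
$$I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$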
Inverse Matrix
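Like dividing. If $AB = BA = I$, then $B$ is the inverse of $A$, written as $A^{-1}$. Only square matrices with a non-zero determinant have an inverse.
Matrix Inverse
$$A A^{-1} = A^{-1} A = I$$
Here,
- $A$ = The original matrix
- $A^{-1}$ = The inverse matrix
- $I$ = The identity matrix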
ℹ️ Why it matters in AI
Used in solving systems of linear equations, which comes up in regression and optimization.
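As a rough sketch of that use, the system $A\mathbf{x} = \mathbf{b}$ below is solved two ways: by multiplying with the inverse, and with `np.linalg.solve`, which avoids forming the inverse and is usually preferred numerically. The matrix and right-hand side are illustrative assumptions.

```python
import numpy as np

# Illustrative system: 2x + y = 3 and x + 3y = 5
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(2)))   # True: A times its inverse is I

x_via_inverse = A_inv @ b                  # x = A^-1 b
x = np.linalg.solve(A, b)                  # same solution without an explicit inverse

print(x)                                   # approximately [0.8 1.4]
print(np.allclose(x, x_via_inverse))       # True
```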
Eigenvalues and Eigenvectors
The Big Idea: When you multiply a matrix by certain special vectors, the vector stays on the same line through the origin: it only gets stretched, shrunk, or (for a negative factor) flipped. Those special non-zero vectors are eigenvectors, and the scaling factor is the eigenvalue.
Eigenvalue Equation
$$A\mathbf{v} = \lambda \mathbf{v}$$
Here,
- $A$ = The matrix
- $\mathbf{v}$ = The eigenvector
- $\lambda$ = The eigenvalue (scalar)
Analogy: Imagine putting a rubber sheet on a board and stretching it. Most arrows on the sheet will change direction. But some arrows, when you stretch, still point the same way — just longer or shorter. Those are eigenvectors.
ℹ️ How to find them
Solve the characteristic equation:
$$\det(A - \lambda I) = 0$$
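📝Example: Finding Eigenvalues
Let $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$. Then
$$\det(A - \lambda I) = (2 - \lambda)^2 - 1 = \lambda^2 - 4\lambda + 3 = (\lambda - 1)(\lambda - 3) = 0$$
so the eigenvalues are $\lambda_1 = 3$ and $\lambda_2 = 1$, with eigenvectors $(1, 1)$ and $(1, -1)$ respectively.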
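A numerical check of the eigenvalue equation might look like the sketch below. The symmetric 2×2 matrix is an illustrative assumption; `np.linalg.eig` returns the eigenvectors as the columns of its second result, in no guaranteed order.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])     # illustrative matrix

eigenvalues, eigenvectors = np.linalg.eig(A)   # eigenvectors are the columns

# Verify A v = lambda v for every eigenpair
pairs_ok = all(
    np.allclose(A @ v, lam * v)
    for lam, v in zip(eigenvalues, eigenvectors.T)
)
print(pairs_ok)                # True

# Each eigenvalue is a root of det(A - lambda I) = 0
residuals = [np.linalg.det(A - lam * np.eye(2)) for lam in eigenvalues]
print(np.allclose(residuals, 0.0))   # True

print(np.sort(eigenvalues))    # approximately [1. 3.]
```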
Applications in AI/DS:
- Principal Component Analysis (PCA): Find the directions of maximum variance in data
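- Google's PageRank: Page scores are the entries of the dominant eigenvector of the web's link matrix
- Stability analysis: Check if a system is stable by inspecting its eigenvalues
- Face recognition (Eigenfaces): Use the leading eigenvectors of the covariance matrix of face images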
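The PCA item can be sketched as an eigendecomposition of the covariance matrix. Everything here is an assumption made for illustration: the synthetic 2D data is generated so that it varies mostly along the direction (1, 1), and `np.linalg.eigh` is used because a covariance matrix is symmetric.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed): large spread along (1, 1), small spread along (1, -1)
t = rng.normal(size=200)
noise = 0.1 * rng.normal(size=200)
X = np.column_stack([t + noise, t - noise])    # 200 samples, 2 features

Xc = X - X.mean(axis=0)                        # center each feature
cov = np.cov(Xc, rowvar=False)                 # 2x2 covariance matrix

# eigh returns eigenvalues of a symmetric matrix in ascending order
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]          # largest variance first
eigenvalues = eigenvalues[order]
components = eigenvectors[:, order]

first_pc = components[:, 0]                    # direction of maximum variance
explained = eigenvalues / eigenvalues.sum()    # share of variance per component

print(first_pc)     # close to +/-(0.707, 0.707)
print(explained)    # first entry close to 1
```

The sign of `first_pc` is arbitrary, since both $\mathbf{v}$ and $-\mathbf{v}$ are eigenvectors for the same eigenvalue.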
Matrix Decompositions
LU Decomposition
Break a matrix into Lower × Upper triangular matrices.
LU Decomposition
$$A = LU$$
Here,
- $A$ = The original matrix
- $L$ = Lower triangular matrix
- $U$ = Upper triangular matrix
Use: Solving systems of equations efficiently.
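A minimal sketch of the factorization, assuming no row swaps are needed (every pivot is non-zero). The function `lu_no_pivot` is a hypothetical helper written for this example; production code normally uses a pivoted routine from a library instead.

```python
import numpy as np

def lu_no_pivot(A):
    """Doolittle LU without row swaps (assumes every pivot is non-zero)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.eye(n)
    U = A.copy()
    for k in range(n - 1):
        if U[k, k] == 0:
            raise ZeroDivisionError("zero pivot: this sketch does not pivot")
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]        # multiplier stored in L
            U[i, k:] -= L[i, k] * U[k, k:]     # eliminate the entry below the pivot
    return L, U

# Illustrative system: 4x + 3y = 10 and 6x + 3y = 12
A = np.array([[4.0, 3.0],
              [6.0, 3.0]])
b = np.array([10.0, 12.0])

L, U = lu_no_pivot(A)
print(np.allclose(L @ U, A))    # True

# Solve A x = b in two triangular steps: L y = b, then U x = y.
# np.linalg.solve is a general solver used here for brevity; a dedicated
# triangular solver would exploit the structure.
y = np.linalg.solve(L, b)
x = np.linalg.solve(U, y)
print(x)                        # approximately [1. 2.]
```

Once `L` and `U` are known, they can be reused for many right-hand sides `b`, which is where the efficiency comes from.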
SVD (Singular Value Decomposition)
The most important matrix decomposition in data science!
Singular Value Decomposition
$$A = U \Sigma V^T$$
Here,
- $A$ = The original matrix
- $U$ = Left singular vectors (orthogonal)
- $\Sigma$ = Diagonal matrix of singular values
- $V^T$ = Right singular vectors (orthogonal), transposed
Analogy: SVD is like breaking down a number into its prime factors, but for matrices. Every matrix can be decomposed this way.
Applications:
- Image compression: Keep only the largest singular values, throw away the rest
- Recommendation systems (Netflix Prize): Decompose the user-movie rating matrix
- Noise reduction: Remove small singular values (noise)
- Latent Semantic Analysis: Find hidden topics in text
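The "keep only the largest singular values" idea behind compression and noise reduction can be sketched as a truncated SVD. The matrix below is synthetic (an assumption for illustration): a rank-2 signal plus a little noise, standing in for an image or a rating matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 6x5 matrix: rank-2 structure plus small noise (illustrative)
A = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 5)) + 0.01 * rng.normal(size=(6, 5))

# s holds the singular values in descending order
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(np.allclose(U @ np.diag(s) @ Vt, A))     # True: the factors rebuild A

# Keep the k largest singular values and drop the rest
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

relative_error = np.linalg.norm(A - A_k) / np.linalg.norm(A)
print(np.linalg.matrix_rank(A_k))              # 2
print(relative_error)                          # small: only the noise was discarded
```

Storing `U[:, :k]`, `s[:k]`, and `Vt[:k, :]` takes fewer numbers than storing `A` itself once the matrix is large and `k` is small.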
QR Decomposition
QR Decomposition
$$A = QR$$
Here,
- $A$ = The original matrix
- $Q$ = Orthogonal matrix
- $R$ = Upper triangular matrix
Use: Least squares problems, eigenvalue algorithms.
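For the least squares use, one common approach is to factor the design matrix as $X = QR$ and solve $R\mathbf{w} = Q^T \mathbf{y}$. The sketch below fits a straight line to made-up points that lie exactly on $y = 1 + 2x$; the data and variable names are assumptions for illustration.

```python
import numpy as np

# Illustrative data lying exactly on y = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Design matrix with a column of ones for the intercept
X = np.column_stack([np.ones_like(x), x])      # shape (4, 2)

# Reduced QR: Q is (4, 2) with orthonormal columns, R is (2, 2) upper triangular
Q, R = np.linalg.qr(X)

# Least squares solution from R w = Q^T y
w = np.linalg.solve(R, Q.T @ y)
print(w)                                       # approximately [1. 2.]
```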
Vector Spaces and Subspaces
Definition: Vector Space
A vector space is a collection of vectors where you can add them and multiply by scalars and still stay in the space.
Definition: Subspace
A subspace is a smaller vector space inside a bigger one.
Examples:
- All 2D vectors `[x, y]` form a vector space (the entire 2D plane)
- All vectors `[x, 0]` (y=0) form a subspace (the x-axis)
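Definition: Basis
A basis is a set of linearly independent vectors that can represent every vector in the space.
Definition: Dimension
The dimension is the number of vectors in any basis of the space.
Definition: Span
The span is all possible linear combinations of a set of vectors.
📝Example: Span
The span of $(1, 0)$ and $(0, 1)$ is the entire 2D plane, because every $(x, y)$ equals $x\,(1, 0) + y\,(0, 1)$. The span of $(1, 2)$ and $(2, 4)$ is only the line $y = 2x$, because the second vector is a multiple of the first.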
Definition: Linear Independence
Vectors are linearly independent if none of them can be written as a combination of the others.
Definition: Rank
The rank is the number of linearly independent rows (or columns) in a matrix.
ℹ️ Why it matters
- Rank tells you how much "information" a matrix contains
- If rank < number of features, your data has redundancy (you can compress it)
- PCA finds a low-rank approximation of your data
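Redundancy shows up directly in the rank. In the illustrative matrix below (an assumption for this sketch), the third feature column is the sum of the first two, so the rank is 2 even though there are 3 features.

```python
import numpy as np

# 4 samples, 3 features; column 3 = column 1 + column 2 (redundant)
X = np.array([[1.0, 2.0, 3.0],
              [2.0, 0.0, 2.0],
              [0.0, 1.0, 1.0],
              [3.0, 1.0, 4.0]])

rank = np.linalg.matrix_rank(X)
print(rank)                    # 2
print(rank < X.shape[1])       # True: fewer independent columns than features

# Dropping the redundant column loses no rank
print(np.linalg.matrix_rank(X[:, :2]))   # 2
```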
Norms
A norm is a way to measure the "size" of a vector.
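| Norm | Formula | Name | Use Case |
|---|---|---|---|
| L1 norm | $\lVert \mathbf{v} \rVert_1 = \sum_{i=1}^{n} \lvert v_i \rvert$ | Manhattan (taxicab) norm | Lasso regularization, sparsity |
| L2 norm | $\lVert \mathbf{v} \rVert_2 = \sqrt{\sum_{i=1}^{n} v_i^2}$ | Euclidean norm | Ridge regularization, distances |
| L∞ norm | $\lVert \mathbf{v} \rVert_\infty = \max_i \lvert v_i \rvert$ | Maximum (Chebyshev) norm | Bounding the largest component |
| Lp norm | $\lVert \mathbf{v} \rVert_p = \left( \sum_{i=1}^{n} \lvert v_i \rvert^p \right)^{1/p}$ | General p-norm ($p \ge 1$) | Generalizes L1 and L2 (p = 1, 2); L∞ is the limit as p → ∞ |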
Applications in AI:
- L1 regularization (Lasso): Shrinks some weights to exactly 0 → feature selection
- L2 regularization (Ridge): Shrinks all weights toward 0 → prevents overfitting
- Distance metrics: KNN and K-means clustering compare points using norm-based distances (typically L2)
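The common norms can be computed with `np.linalg.norm` and its `ord` parameter. The vector is an illustrative assumption, picked so the results are easy to verify by hand.

```python
import numpy as np

v = np.array([3.0, -4.0, 0.0])            # illustrative vector

l1 = np.linalg.norm(v, ord=1)             # |3| + |-4| + |0| = 7
l2 = np.linalg.norm(v)                    # sqrt(9 + 16) = 5 (default is L2)
linf = np.linalg.norm(v, ord=np.inf)      # largest absolute component = 4
l3 = np.linalg.norm(v, ord=3)             # (27 + 64) ** (1/3)

print(l1, l2, linf, l3)

# For any vector, L-infinity <= L2 <= L1
print(linf <= l2 <= l1)                   # True
```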
Linear Algebra in Machine Learning — Summary
📋Summary: Linear Algebra
- A vector is an ordered list of numbers representing magnitude and direction
- Vector addition adds corresponding components
- Scalar multiplication scales each component by a number
- Dot product measures alignment between two vectors (returns scalar)
- A matrix is a rectangular grid of numbers arranged in rows and columns
- Eigenvalues/eigenvectors reveal hidden structure in data
- SVD decomposes any matrix into orthogonal components
- Norms measure vector/matrix size
| ML Concept | Linear Algebra Concept |
|---|---|
| Dataset | Matrix (rows=samples, cols=features) |
| Neural Network layer | Matrix multiplication + bias |
| PCA | Eigendecomposition / SVD |
| Recommendations | Matrix factorization |
| Image processing | Matrix operations |
| Word embeddings | Vector spaces |
| Regression (XᵀX)⁻¹Xᵀy | Matrix inverse, transpose |