← Math|71 of 100
Discrete Mathematics

Set Theory

Master set operations, Venn diagrams, power sets, and their applications in logic and databases.

📂 Sets📖 Lesson 71 of 100🎓 Free Course

Advertisement

Set Theory

ℹ️ Why It Matters

Sets are the foundational building block of all mathematics and computer science. Every database query, every search algorithm, every machine learning pipeline relies on set operations. When you query a database with JOIN or UNION, you are performing set operations. When a neural network filters features, it partitions data into sets. Understanding sets unlocks your ability to reason about collections of data formally, efficiently, and correctly. Without set theory, modern computation—from SQL databases to GPU-accelerated linear algebra—would not exist.


Set Definition

DfSet

A set is a well-defined collection of distinct objects, called elements or members. The order and repetition of elements do not matter.

Sets are typically denoted with curly braces. For example:

A={1,2,3},B={red,blue,green},C={xR:x>0}A = \{1, 2, 3\}, \quad B = \{\text{red}, \text{blue}, \text{green}\}, \quad C = \{x \in \mathbb{R} : x > 0\}

DfSet Membership

We write aAa \in A to mean "aa is an element of set AA" and aAa \notin A to mean "aa is not an element of AA."

DfEmpty Set

The empty set \emptyset is the unique set containing no elements. It is a subset of every set: A\emptyset \subseteq A for all AA.

DfSubset

ABA \subseteq B if and only if every element of AA is also an element of BB. Formally:

AB    x(xAxB)A \subseteq B \iff \forall x (x \in A \rightarrow x \in B)

ABA \subset B (proper subset) means ABA \subseteq B and ABA \neq B.

DfUniversal Set

The universal set UU is the set of all objects under consideration. The complement of AA is defined relative to UU: Ac={xU:xA}A^c = \{x \in U : x \notin A\}.

DfCardinality

The cardinality A|A| of a finite set AA is the number of distinct elements in AA. For example, {1,2,3}=3|\{1, 2, 3\}| = 3 and =0|\emptyset| = 0.


Set Operations

Union

Intersection

Set Difference

Complement

Symmetric Difference

Summary Table

OperationNotationDefinition
UnionABA \cup BElements in AA or BB (or both)
IntersectionABA \cap BElements in both AA and BB
DifferenceABA \setminus BElements in AA but not BB
ComplementAcA^cElements in UU but not AA
Symmetric DifferenceAΔBA \Delta BElements in exactly one of AA, BB

Venn Diagrams

Venn diagrams are visual representations of set relationships. They use overlapping circles (or other shapes) inside a rectangle representing the universal set UU.

  • Two-set Venn: Two overlapping circles. The overlap is ABA \cap B. The region in AA but outside BB is ABA \setminus B.
  • Three-set Venn: Three overlapping circles create 8 distinct regions (including the outside region). This is useful for visualizing relationships among three sets.
  • Disjoint sets: Two sets with no overlap, AB=A \cap B = \emptyset, are represented as non-overlapping circles.
  • Subset: If ABA \subseteq B, the circle for AA is drawn entirely inside the circle for BB.

Venn diagrams are widely used in probability theory, database query visualization, survey analysis, and data science exploratory analysis.


Power Set

DfPower Set

The power set of a set AA, denoted P(A)\mathcal{P}(A) or 2A2^A, is the set of all subsets of AA, including the empty set and AA itself.

Power Set Cardinality

📝Power Set Example

For A={1,2,3}A = \{1, 2, 3\}, the power set is:

P(A)={,{1},{2},{3},{1,2},{1,3},{2,3},{1,2,3}}\mathcal{P}(A) = \{\emptyset, \{1\}, \{2\}, \{3\}, \{1,2\}, \{1,3\}, \{2,3\}, \{1,2,3\}\}

P(A)=23=8|\mathcal{P}(A)| = 2^3 = 8 subsets.

💡Generating a Power Set in Python

def power_set(s):
    """Generate all subsets of set s."""
    result = [[]]
    for elem in s:
        result = result + [subset + [elem] for subset in result]
    return result

A = {1, 2, 3}
ps = power_set(list(A))
print(f"Power set of {A}:")
for subset in ps:
    print(subset)
print(f"Total subsets: {len(ps)}")  # 8

Cartesian Product

DfCartesian Product

The Cartesian product of sets AA and BB, denoted A×BA \times B, is the set of all ordered pairs (a,b)(a, b) where aAa \in A and bBb \in B:

A×B={(a,b):aA and bB}A \times B = \{(a, b) : a \in A \text{ and } b \in B\}

Cardinality of Cartesian Product

📝Cartesian Product Example

Let A={1,2}A = \{1, 2\} and B={x,y,z}B = \{x, y, z\}. Then:

A×B={(1,x),(1,y),(1,z),(2,x),(2,y),(2,z)}A \times B = \{(1,x), (1,y), (1,z), (2,x), (2,y), (2,z)\}
A×B=2×3=6|A \times B| = 2 \times 3 = 6

⚠️ Not Commutative

A×BB×AA \times B \neq B \times A in general. Ordered pairs respect position: (1,x)(x,1)(1, x) \neq (x, 1) unless A=BA = B.

The Cartesian product is fundamental to defining relations, functions, and the structure of coordinate systems. In databases, it underlies the concept of a cross join. In machine learning, feature spaces are Cartesian products of individual feature domains.


Partitions

DfPartition

A partition of a set SS is a collection of non-empty subsets {A1,A2,,An}\{A_1, A_2, \ldots, A_n\} of SS such that:

  1. Non-empty: AiA_i \neq \emptyset for all ii.
  2. Pairwise disjoint: AiAj=A_i \cap A_j = \emptyset for all iji \neq j.
  3. Covering: A1A2An=SA_1 \cup A_2 \cup \cdots \cup A_n = S.

Every element of SS belongs to exactly one block of the partition.

📝Partition Examples

  • The sets {1,3},{2,5},{4,6}\{1, 3\}, \{2, 5\}, \{4, 6\} form a partition of {1,2,3,4,5,6}\{1, 2, 3, 4, 5, 6\}.
  • The equivalence classes of an equivalence relation on SS always form a partition of SS.
  • {even integers,odd integers}\{ \text{even integers}, \text{odd integers} \} is a partition of Z\mathbb{Z}.

Partitions are critical in probability (events that partition the sample space), database design (table normalization), and clustering algorithms (each data point belongs to exactly one cluster).


Inclusion-Exclusion Principle

Two-Set Inclusion-Exclusion

Three-Set Inclusion-Exclusion

General Inclusion-Exclusion

📝Inclusion-Exclusion Problem

Problem: In a survey of 200 people, 120 like tea, 80 like coffee, and 30 like both. How many like neither?

Solution: TC=120+8030=170|T \cup C| = 120 + 80 - 30 = 170. So 200170=30200 - 170 = 30 people like neither.

💡Derangements via Inclusion-Exclusion

The number of derangements (permutations with no fixed points) of nn elements is:

Dn=n!k=0n(1)kk!D_n = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}

This is derived directly from inclusion-exclusion applied to the sets Ai={permutations where element i is fixed}A_i = \{\text{permutations where element } i \text{ is fixed}\}.


Python Implementation

from itertools import combinations
from functools import reduce

# Basic set operations
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# Union
print(f"A ∪ B = {A | B}")              # {1, 2, 3, 4, 5, 6, 7, 8}

# Intersection
print(f"A ∩ B = {A & B}")              # {4, 5}

# Difference
print(f"A - B = {A - B}")             # {1, 2, 3}
print(f"B - A = {B - A}")             # {6, 7, 8}

# Symmetric Difference
print(f"A △ B = {A ^ B}")             # {1, 2, 3, 6, 7, 8}

# Complement (given a universal set)
U = set(range(1, 11))
print(f"Aᶜ = {U - A}")               # {6, 7, 8, 9, 10}

# Power Set
def power_set(s):
    """Generate all subsets of set s."""
    result = [[]]
    for elem in s:
        result = result + [subset + [elem] for subset in result]
    return result

A_list = [1, 2, 3]
ps = power_set(A_list)
print(f"\nPower set of {A_list}: {len(ps)} subsets")
for subset in ps:
    print(f"  {subset}")

# Cartesian Product
def cartesian_product(set_a, set_b):
    """Compute the Cartesian product of two sets."""
    return {(a, b) for a in set_a for b in set_b}

X = {1, 2}
Y = {'a', 'b', 'c'}
cp = cartesian_product(X, Y)
print(f"\nX × Y = {cp}")
print(f"|X × Y| = {len(cp)}")

# Inclusion-Exclusion
def inclusion_exclusion_two(set_a, set_b):
    """|A ∪ B| using inclusion-exclusion."""
    return len(set_a) + len(set_b) - len(set_a & set_b)

A_sets = {1, 2, 3, 4, 5}
B_sets = {4, 5, 6, 7}
print(f"\n|A ∪ B| = {inclusion_exclusion_two(A_sets, B_sets)}")

# Partition check
def is_partition(sets, universe):
    """Check if sets form a partition of universe."""
    flat = set()
    for s in sets:
        if not s:
            return False
        flat |= s
    if flat != universe:
        return False
    for i, s1 in enumerate(sets):
        for j, s2 in enumerate(sets):
            if i < j and s1 & s2:
                return False
    return True

U = {1, 2, 3, 4, 5, 6}
parts = [{1, 3}, {2, 5}, {4, 6}]
print(f"\nIs partition? {is_partition(parts, U)}")  # True

# Equivalence relation and partition
def equivalence_classes(relation, elements):
    """Group elements into equivalence classes."""
    visited = set()
    classes = []
    for e in elements:
        if e not in visited:
            cls = {x for x in elements if (e, x) in relation}
            classes.append(cls)
            visited |= cls
    return classes

# Parity equivalence: (a,b) in relation if a ≡ b (mod 2)
rel = {(a, b) for a in range(6) for b in range(6) if a % 2 == b % 2}
elements = set(range(6))
print(f"Equivalence classes (mod 2): {equivalence_classes(rel, elements)}")

Applications in AI/ML

Feature Engineering

In machine learning, features are often subsets of a larger feature space. Set operations enable efficient feature engineering:

  • Feature selection identifies a subset F{f1,f2,,fn}F \subseteq \{f_1, f_2, \ldots, f_n\} of relevant features. The power set of all features has size 2n2^n, making exhaustive search intractable for large nn.
  • Set intersection identifies features shared across multiple datasets or models.
  • Set difference finds features unique to one dataset, aiding in anomaly detection.
  • Cartesian product constructs multi-dimensional feature spaces. If feature XX takes 5 values and feature YY takes 10 values, the joint feature space has 5×10=505 \times 10 = 50 possible combinations.
  • Partitions form the basis of clustering. K-means partitions data points into kk groups, where each point belongs to exactly one cluster. Decision trees partition the feature space into regions with distinct class distributions.
  • Inclusion-exclusion is used in probabilistic graphical models to avoid double-counting when computing joint probabilities over overlapping event sets.
  • Power sets appear in model selection. The set of all possible feature subsets is P(features)\mathcal{P}(\text{features}), and subset selection algorithms (greedy forward selection, backward elimination) search this space.

Database Queries

SQL directly implements set operations:

-- UNION (set union)
SELECT name FROM employees UNION SELECT name FROM contractors;

-- INTERSECT (set intersection)
SELECT name FROM employees INTERSECT SELECT name FROM managers;

-- EXCEPT (set difference)
SELECT name FROM employees EXCEPT SELECT name FROM fired;

-- CROSS JOIN (Cartesian product)
SELECT * FROM colors CROSS JOIN sizes;

Probability

Set theory underpins probability. Events are subsets of a sample space Ω\Omega. The probability of event AA is a measure P:2Ω[0,1]P: 2^\Omega \rightarrow [0, 1]. Inclusion-exclusion computes P(AB)P(A \cup B) without double-counting.


Common Mistakes

MistakeWhy It's WrongCorrection
{1,2,3}={3,2,1}\{1, 2, 3\} = \{3, 2, 1\} is wrongSets are unorderedThey are equal: A=BA = B regardless of element order
{1,1,2}\{1, 1, 2\} is a valid setSets have distinct elements{1,1,2}={1,2}\{1, 1, 2\} = \{1, 2\}; duplicates are removed
AB=ABA \cup B = A \cap B alwaysUnion and intersection are differentOnly when A=BA = B or AB=A \cap B = \emptyset
A\emptyset \subset A is falseEmpty set is a proper subset of every non-empty setA\emptyset \subseteq A is true for all AA
(AB)c=AcBc(A \cup B)^c = A^c \cup B^cDe Morgan's Law states otherwise(AB)c=AcBc(A \cup B)^c = A^c \cap B^c
A×B=B×AA \times B = B \times AOrdered pairs are position-sensitive(a,b)(b,a)(a,b) \neq (b,a) unless a=ba = b
2n2^n subsets is too many for large nnPower set size grows exponentiallyn=20220106n = 20 \Rightarrow 2^{20} \approx 10^6; n=30109n = 30 \Rightarrow 10^9
AB=A+B\|A \cup B\| = \|A\| + \|B\|This overcounts the intersectionAB=A+BAB\|A \cup B\| = \|A\| + \|B\| - \|A \cap B\|

Interview Questions

📝Interview Question 1

Q: Given two arrays, find their union and intersection. Approach: Convert both arrays to sets. Union is set_a | set_b. Intersection is set_a & set_b. Both are O(n+m)O(n + m) average case.

📝Interview Question 2

Q: Check if one set is a subset of another. Approach: Use A <= B or A.issubset(B). Equivalent to verifying ABc=A \cap B^c = \emptyset. O(min(A,B))O(\min(|A|, |B|)).

📝Interview Question 3

Q: Find the symmetric difference of two sorted arrays. Approach: Symmetric difference is elements in exactly one set. Convert to sets and use A ^ B. For sorted arrays, use two-pointer technique for O(n+m)O(n + m).

📝Interview Question 4

Q: How many subsets of {1,2,,10}\{1, 2, \ldots, 10\} have an even sum? Approach: There are 210=10242^{10} = 1024 subsets. Exactly half have an even sum (512). This follows from the bijection: adding or removing the element 1 toggles parity.

📝Interview Question 5

Q: Given three sets AA, BB, CC with A=30|A| = 30, B=25|B| = 25, C=20|C| = 20, AB=10|A \cap B| = 10, AC=8|A \cap C| = 8, BC=6|B \cap C| = 6, ABC=3|A \cap B \cap C| = 3. Find ABC|A \cup B \cup C|. Approach: ABC=30+25+201086+3=54|A \cup B \cup C| = 30 + 25 + 20 - 10 - 8 - 6 + 3 = 54.


Practice Problems

📝Problem 1: Basic Operations

Let A={1,2,3,4,5,6}A = \{1, 2, 3, 4, 5, 6\}, B={4,5,6,7,8,9}B = \{4, 5, 6, 7, 8, 9\}, and U={1,2,,10}U = \{1, 2, \ldots, 10\}. Find: (a) ABA \cup B (b) ABA \cap B (c) ABA \setminus B (d) BAB \setminus A (e) AcA^c (f) AΔBA \Delta B

💡Solution 1

(a) AB={1,2,3,4,5,6,7,8,9}A \cup B = \{1, 2, 3, 4, 5, 6, 7, 8, 9\} (b) AB={4,5,6}A \cap B = \{4, 5, 6\} (c) AB={1,2,3}A \setminus B = \{1, 2, 3\} (d) BA={7,8,9}B \setminus A = \{7, 8, 9\} (e) Ac={7,8,9,10}A^c = \{7, 8, 9, 10\} (f) AΔB={1,2,3,7,8,9}A \Delta B = \{1, 2, 3, 7, 8, 9\}

📝Problem 2: Power Set

List all subsets of {a,b,c}\{a, b, c\} that contain at least two elements. How many are there?

💡Solution 2

Subsets with at least 2 elements: {a,b},{a,c},{b,c},{a,b,c}\{a,b\}, \{a,c\}, \{b,c\}, \{a,b,c\}. There are 4 such subsets. In general, for a set of size nn, the number of subsets with at least kk elements is i=kn(ni)\sum_{i=k}^{n} \binom{n}{i}.

📝Problem 3: Inclusion-Exclusion

In a class of 50 students, 30 study math, 25 study physics, and 10 study both. How many study neither?

💡Solution 3

MP=30+2510=45|M \cup P| = 30 + 25 - 10 = 45. So 5045=550 - 45 = 5 students study neither.

📝Problem 4: Cartesian Product

If A=4|A| = 4 and A×B=20|A \times B| = 20, what is B|B|?

💡Solution 4

A×B=AB=4B=20|A \times B| = |A| \cdot |B| = 4 \cdot |B| = 20, so B=5|B| = 5.

📝Problem 5: Partition Validation

Is {{1,3,5},{2,4}}\{\{1, 3, 5\}, \{2, 4\}\} a partition of {1,2,3,4,5}\{1, 2, 3, 4, 5\}? Is {{1,2},{3,4},{5}}\{\{1, 2\}, \{3, 4\}, \{5\}\}?

💡Solution 5

First: Yes. Both sets are non-empty, disjoint, and their union is {1,2,3,4,5}\{1, 2, 3, 4, 5\}. Second: Yes. Three non-empty, pairwise disjoint sets that cover the entire universe.

📝Problem 6: De Morgan's Verification

Verify De Morgan's Law for A={1,2,3}A = \{1, 2, 3\}, B={2,3,4}B = \{2, 3, 4\}, U={1,2,3,4,5}U = \{1, 2, 3, 4, 5\}.

💡Solution 6

(AB)c={1,2,3,4}c={5}(A \cup B)^c = \{1, 2, 3, 4\}^c = \{5\}

AcBc={4,5}{1,5}={5}A^c \cap B^c = \{4, 5\} \cap \{1, 5\} = \{5\}. Confirmed: (AB)c=AcBc(A \cup B)^c = A^c \cap B^c.


Quick Reference

ConceptFormula / Rule
UnionAB={x:xAxB}A \cup B = \{x : x \in A \lor x \in B\}
IntersectionAB={x:xAxB}A \cap B = \{x : x \in A \land x \in B\}
DifferenceAB={x:xAxB}A \setminus B = \{x : x \in A \land x \notin B\}
ComplementAc=UAA^c = U \setminus A
Symmetric DiffAΔB=(AB)(BA)A \Delta B = (A \setminus B) \cup (B \setminus A)
Power Set SizeP(A)=2A|\mathcal{P}(A)| = 2^{|A|}
Cartesian ProductA×B=AB|A \times B| = |A| \cdot |B|
Inclusion-Exclusion (2)AB=A+BAB\|A \cup B\| = \|A\| + \|B\| - \|A \cap B\|
De Morgan 1(AB)c=AcBc(A \cup B)^c = A^c \cap B^c
De Morgan 2(AB)c=AcBc(A \cap B)^c = A^c \cup B^c
IdempotentAA=AA \cup A = A, AA=AA \cap A = A
DominationAU=UA \cup U = U, A=A \cap \emptyset = \emptyset
AbsorptionA(AB)=AA \cup (A \cap B) = A, A(AB)=AA \cap (A \cup B) = A
Complement LawsAAc=UA \cup A^c = U, AAc=A \cap A^c = \emptyset

Cross-References

  • Boolean Algebra: Set operations mirror logical operations. Union corresponds to OR, intersection to AND, complement to NOT. This connection powers digital circuit design and search engines.
  • Probability Theory: Events are subsets of a sample space. P(AB)=P(A)+P(B)P(AB)P(A \cup B) = P(A) + P(B) - P(A \cap B) is the probabilistic form of inclusion-exclusion.
  • Database Theory: SQL UNION, INTERSECT, and EXCEPT directly implement set operations. Understanding set theory is essential for writing correct and efficient queries.
  • Graph Theory: Vertex and edge sets define graphs. Graph isomorphism checks whether two graphs have the same edge set.
  • Linear Algebra — The set of solutions to a homogeneous linear system forms a vector space (a special kind of set closed under addition and scalar multiplication).
  • Formal Languages: Languages are sets of strings. Regular expressions, context-free grammars, and automata all operate on sets of strings.
  • Combinatorics — Counting techniques rely on partitions, inclusion-exclusion, and the pigeonhole principle—all rooted in set theory.
  • Machine Learning — Feature spaces, label sets, training/test splits, and ensemble methods all involve partitioning data into sets.
Lesson Progress71 / 100