Python Data Types and Structures

Module 1: Introduction & Python Basics

Understanding data types and structures is fundamental to efficient data manipulation in Python.

Primitive Types

# Numeric Types
x_int = 42          # int
x_float = 3.14      # float
x_complex = 2+3j    # complex

# Boolean
flag = True         # bool (subclass of int: True == 1, False == 0)

# String
name = "Data Science"  # str

Type Checking and Conversion

type(42)          # <class 'int'>
type(3.14)        # <class 'float'>
int(3.99)         # 3 (truncates toward zero)
float(42)         # 42.0
str(100)          # '100'
bool(0)           # False
bool("")          # False
bool([])          # False
bool(None)        # False
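
The conversions above can fail or surprise when the input is text. The sketch below illustrates a few such cases; `to_int` is a hypothetical helper introduced only for this example, not part of the lesson or of the standard library.

```python
# Hypothetical helper (an assumption for illustration): convert a value to
# int, returning a default instead of raising when conversion fails.
def to_int(value, default=None):
    try:
        return int(value)
    except (TypeError, ValueError):
        return default

print(to_int("42"))          # 42
print(to_int("3.14"))        # None: int() rejects strings with a decimal point
print(int(float("3.14")))    # 3: go through float first, then truncate

# Truthiness depends on emptiness, not on what the text says.
print(bool("False"))         # True: any non-empty string is truthy
print(bool("0"))             # True
print(bool(0.0))             # False
```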

Collections Overview

list
  • Ordered
  • Mutable
  • Allows duplicates
  • Indexed [0, 1, 2...]
  • Example: [1, 2, 3, "a", 4.5]
  • Use: sequences, stacks, queues

tuple
  • Ordered
  • Immutable
  • Allows duplicates
  • Indexed [0, 1, 2...]
  • Example: (1, 2, 3, "a", 4.5)
  • Use: dict keys, function returns

dict
  • Key-value pairs
  • Mutable
  • Keys are unique
  • Keys must be hashable
  • Example: {"a": 1, "b": 2}
  • Use: mappings, JSON, config

set
  • Unordered
  • Mutable
  • No duplicates
  • Elements must be hashable
  • Example: {1, 2, 3, "a"}
  • Use: membership testing, set ops
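
The hashability requirement for dict keys and set elements is easiest to see by trying it. The following is a minimal sketch; the `grid` and `unique` names are made up for the example.

```python
# Tuples of hashable items can serve as dict keys.
grid = {(0, 0): "start", (2, 3): "goal"}
print(grid[(2, 3)])  # goal

# Lists are mutable and therefore unhashable, so they cannot be keys.
try:
    bad = {[0, 0]: "start"}
except TypeError as exc:
    print("TypeError:", exc)

# A tuple is only hashable if everything inside it is hashable.
try:
    hash((1, [2, 3]))
except TypeError as exc:
    print("TypeError:", exc)

# Building a set drops duplicates.
unique = set([3, 1, 3, 2, 1])
print(sorted(unique))  # [1, 2, 3]
```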

Mutability Comparison

# Mutable: list, dict, set
lst = [1, 2, 3]
lst[0] = 99          # Valid: [99, 2, 3]

# Immutable: int, float, str, tuple, frozenset
tup = (1, 2, 3)
tup[0] = 99          # TypeError: 'tuple' object does not support item assignment
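
Mutability matters most when two names refer to the same object. The sketch below, with variable names chosen only for illustration, contrasts aliasing, shallow copies, and deep copies.

```python
import copy

original = [1, 2, 3]
alias = original            # same object, no copy is made
alias.append(4)
print(original)             # [1, 2, 3, 4]

shallow = original.copy()   # new outer list
shallow.append(5)
print(original)             # [1, 2, 3, 4] (unchanged)

# A shallow copy still shares the inner lists; a deep copy does not.
nested = [[1, 2], [3, 4]]
shallow_nested = nested.copy()
deep_nested = copy.deepcopy(nested)
shallow_nested[0].append(99)
print(nested)               # [[1, 2, 99], [3, 4]]
print(deep_nested)          # [[1, 2], [3, 4]]

# "Changing" an immutable value builds a new object instead.
s = "data"
t = s
s += " science"
print(t)                    # data
```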

Core Operations

List Operations

lst = [3, 1, 4, 1, 5, 9, 2, 6]

lst.append(7)        # [3, 1, 4, 1, 5, 9, 2, 6, 7]
lst.insert(0, 0)     # [0, 3, 1, 4, 1, 5, 9, 2, 6, 7]
lst.pop()            # returns 7, lst = [0, 3, 1, 4, 1, 5, 9, 2, 6]
lst.remove(1)        # removes first occurrence of 1
lst.sort()           # in-place sort
sorted_lst = sorted(lst)  # returns new sorted list
lst.reverse()        # in-place reverse
lst.extend([10, 11]) # appends all elements
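
A common stumbling block with the in-place methods above is that they return None. This short sketch shows that behavior, plus sorting by a key; the `scores` data is illustrative only.

```python
nums = [3, 1, 2]
result = nums.sort()        # sorts in place and returns None
print(result)               # None
print(nums)                 # [1, 2, 3]

# sorted() accepts a key function and leaves the input untouched.
scores = [("Alice", 95), ("Bob", 88), ("Charlie", 92)]
by_score = sorted(scores, key=lambda pair: pair[1], reverse=True)
print(by_score)             # [('Alice', 95), ('Charlie', 92), ('Bob', 88)]
print(scores[0])            # ('Alice', 95)
```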

Dictionary Operations

data = {"name": "Alice", "age": 30, "city": "NYC"}

data["name"]              # "Alice"
data.get("salary", 0)     # 0 (default if key missing)
data["salary"] = 85000    # add new key-value pair
del data["city"]          # remove key-value pair
data.keys()               # dict_keys(["name", "age", "salary"])
data.values()             # dict_values(["Alice", 30, 85000])
data.items()              # dict_items([("name", "Alice"), ...])
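
Beyond single lookups, dictionaries are usually iterated and filtered. The sketch below continues from the state of `data` after the operations above and shows a few common patterns; the "skills" key is an assumption added for the example.

```python
data = {"name": "Alice", "age": 30, "salary": 85000}

# Iterate over key-value pairs.
for key, value in data.items():
    print(f"{key}: {value}")

# Indexing a missing key raises KeyError; pop() and get() accept defaults.
try:
    data["city"]
except KeyError as exc:
    print("missing key:", exc)       # missing key: 'city'
city = data.pop("city", "unknown")   # 'unknown', and no error

# setdefault() inserts a default only when the key is absent.
data.setdefault("skills", []).append("Python")
print(data["skills"])                # ['Python']

# A dict comprehension builds a filtered copy.
numeric = {k: v for k, v in data.items() if isinstance(v, int)}
print(numeric)                       # {'age': 30, 'salary': 85000}
```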

Set Operations

A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

A | B    # Union:                {1, 2, 3, 4, 5, 6, 7, 8}
A & B    # Intersection:         {4, 5}
A - B    # Difference:           {1, 2, 3}
A ^ B    # Symmetric difference: {1, 2, 3, 6, 7, 8}

Time Complexity of Operations

Operation            | list                     | dict     | set      | tuple
Access by index/key  | O(1)                     | O(1) avg | N/A      | O(1)
Search (x in s)      | O(n)                     | O(1) avg | O(1) avg | O(n)
Insert               | O(1) append, O(n) insert | O(1) avg | O(1) avg | N/A
Delete               | O(n)                     | O(1) avg | O(1) avg | N/A
Update               | O(1) by index            | O(1) avg | N/A      | N/A
Length               | O(1)                     | O(1)     | O(1)     | O(1)
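
The gap between O(n) and O(1) average search can be measured with the standard-library timeit module. This is a rough sketch rather than a rigorous benchmark: the collection size and repeat count are arbitrary assumptions, and absolute timings will vary by machine.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1   # last element: the worst case for a linear list scan

# Time 200 membership tests against each container.
list_time = timeit.timeit(lambda: target in as_list, number=200)
set_time = timeit.timeit(lambda: target in as_set, number=200)

print(f"list: {list_time:.4f}s  set: {set_time:.6f}s")
```

On typical hardware the set lookup is expected to be several orders of magnitude faster, which is why converting a list to a set pays off when it is searched repeatedly.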

Slicing Syntax

lst = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

lst[2:5]       # [2, 3, 4]
lst[:3]        # [0, 1, 2]
lst[7:]        # [7, 8, 9]
lst[::2]       # [0, 2, 4, 6, 8]  (every 2nd element)
lst[::-1]      # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]  (reversed)
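
A few further slicing behaviors are worth knowing: negative indices, out-of-range bounds, copies, and slice assignment. The sketch below demonstrates each on the same ten-element list.

```python
lst = list(range(10))

print(lst[-3:])        # [7, 8, 9]  (negative index counts from the end)
print(lst[8:100])      # [8, 9]     (out-of-range bounds are clipped)
print(lst[1:8:3])      # [1, 4, 7]  (start:stop:step)

# A slice is a new list, so changing it leaves the original alone.
copy_of_lst = lst[:]
copy_of_lst[0] = 99
print(lst[0])          # 0

# Slice assignment can replace a range with a different number of items.
lst[2:5] = ["a", "b"]
print(lst)             # [0, 1, 'a', 'b', 5, 6, 7, 8, 9]

# The same syntax works on strings and tuples.
print("Data Science"[:4])   # Data
print((1, 2, 3, 4)[1:3])    # (2, 3)
```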

Unpacking and Packing

# Tuple/list unpacking
a, b, *rest = [1, 2, 3, 4, 5]
# a=1, b=2, rest=[3, 4, 5]

first, *middle, last = (10, 20, 30, 40, 50)
# first=10, middle=[20, 30, 40], last=50

# Dictionary unpacking
dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}
merged = {**dict1, **dict2}  # {"a": 1, "b": 3, "c": 4}
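
The same star syntax applies in function calls, loops, and literals. In this sketch `describe` is a hypothetical function defined only to show argument unpacking.

```python
# Swap two values without a temporary variable.
a, b = 1, 2
a, b = b, a
print(a, b)                          # 2 1

# Hypothetical function used to illustrate call-site unpacking.
def describe(name, age, city="unknown"):
    return f"{name}, {age}, {city}"

args = ("Alice", 30)
kwargs = {"city": "NYC"}
print(describe(*args, **kwargs))     # Alice, 30, NYC

# Unpack nested pairs directly in a for loop.
for index, (name, grade) in enumerate([("Alice", 95), ("Bob", 88)]):
    print(index, name, grade)

# Star-unpack several iterables into one list literal.
combined = [*[1, 2], *(3, 4)]
print(combined)                      # [1, 2, 3, 4]
```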

Nested Structures

# List of dicts (common in data science)
students = [
    {"name": "Alice", "grade": 95, "courses": ["ML", "Stats"]},
    {"name": "Bob", "grade": 88, "courses": ["DL", "NLP"]},
    {"name": "Charlie", "grade": 92, "courses": ["Stats", "DL"]}
]

# Access nested data
students[0]["courses"][1]  # "Stats"

# Dict of lists
matrix = {
    "row1": [1, 2, 3],
    "row2": [4, 5, 6],
    "row3": [7, 8, 9]
}
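
Nested structures like the list of dicts above are typically queried with comprehensions and small loops. The sketch below repeats the `students` data so it runs on its own; the 90-point threshold is an arbitrary choice for the example.

```python
students = [
    {"name": "Alice", "grade": 95, "courses": ["ML", "Stats"]},
    {"name": "Bob", "grade": 88, "courses": ["DL", "NLP"]},
    {"name": "Charlie", "grade": 92, "courses": ["Stats", "DL"]},
]

# Filter with a list comprehension.
top = [s["name"] for s in students if s["grade"] >= 90]
print(top)                           # ['Alice', 'Charlie']

# Aggregate with a generator expression.
average = sum(s["grade"] for s in students) / len(students)
print(round(average, 2))             # 91.67

# Find the record with the highest grade.
best = max(students, key=lambda s: s["grade"])
print(best["name"])                  # Alice

# Invert the structure: course -> names of enrolled students.
by_course = {}
for s in students:
    for course in s["courses"]:
        by_course.setdefault(course, []).append(s["name"])
print(by_course["Stats"])            # ['Alice', 'Charlie']
```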

Specialized Collections

from collections import defaultdict, Counter, namedtuple, deque

# defaultdict - auto-initializes missing keys
word_count = defaultdict(int)
for word in ["hello", "world", "hello"]:
    word_count[word] += 1
# dict(word_count) == {"hello": 2, "world": 1}

# Counter - frequency counting
Counter(["a", "b", "a", "c", "a"])  # Counter({'a': 3, 'b': 1, 'c': 1})

# namedtuple - immutable with named fields
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)
p.x  # 3

# deque - O(1) appends and pops at both ends
dq = deque([1, 2, 3])
dq.appendleft(0)  # deque([0, 1, 2, 3])
dq.pop()          # returns 3
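
Each of these types has a feature or two that often motivates choosing it. The following sketch shows a few; the sample sentence and the window size of 3 are assumptions made for illustration.

```python
from collections import Counter, deque, namedtuple

# Counter: most_common() ranks items, and missing keys count as 0.
words = "the quick brown fox jumps over the lazy dog the end".split()
counts = Counter(words)
print(counts.most_common(1))    # [('the', 3)]
print(counts["missing"])        # 0

# deque with maxlen keeps only the most recent items (a sliding window).
window = deque(maxlen=3)
for value in [1, 2, 3, 4, 5]:
    window.append(value)
print(window)                   # deque([3, 4, 5], maxlen=3)

# namedtuple: unpacks like a tuple; _replace() returns a modified copy.
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)
x, y = p
moved = p._replace(x=10)
print(moved)                    # Point(x=10, y=4)
print(p)                        # Point(x=3, y=4)
```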

Summary

  • Lists → ordered, mutable sequences (most flexible)
  • Tuples → ordered, immutable sequences (hashable when every element is hashable; lower memory overhead than lists)
  • Dicts → key-value mappings (O(1) average-case lookup by key)
  • Sets → unordered unique elements (set operations)
  • Choose the right structure based on mutability needs and access patterns.
