Pandas Data Manipulation

Introduction

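Pandas provides tools for cleaning, transforming, and analyzing tabular data. This lesson covers four everyday DataFrame operations: adding and modifying columns, filtering rows, sorting, and handling duplicates.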

Adding and Modifying Columns

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Add new column
df["c"] = df["a"] + df["b"]

# Conditional column
df["large"] = df["a"] > 1

# Apply function element-wise (for simple arithmetic, df["a"] * 2 is faster)
df["double"] = df["a"].apply(lambda x: x * 2)
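The column assignments above modify df in place. As a minimal sketch of an alternative, the block below builds the same columns with assign, which returns a new DataFrame, and uses np.where for a labeled conditional column. The names out and size are illustrative and not part of the lesson.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# assign returns a new DataFrame; df itself is left unchanged
out = df.assign(
    c=lambda d: d["a"] + d["b"],
    double=lambda d: d["a"] * 2,  # vectorized equivalent of apply(lambda x: x * 2)
)

# np.where picks a label per row based on a boolean condition
out["size"] = np.where(out["a"] > 1, "large", "small")

print(out)
```

Whether to assign in place or use assign is mostly a style choice; assign tends to read well when several steps are chained together.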

Filtering

# Boolean mask
df[df["a"] > 1]

# Multiple conditions
df[(df["a"] > 1) & (df["b"] < 6)]

# Query method
df.query("a > 1 and b < 6")
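To make the filtering forms above concrete, here is a small sketch on the same sample data. It checks that the boolean mask and query give the same rows, and adds three related patterns that are not in the lesson: referencing a Python variable inside query with @, combining conditions with | (or), and negating a mask with ~. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Boolean mask: each condition needs its own parentheses because
# & binds more tightly than the comparison operators
mask_result = df[(df["a"] > 1) & (df["b"] < 6)]

# query: @threshold refers to the local Python variable
threshold = 1
query_result = df.query("a > @threshold and b < 6")

# "or" uses |, negation uses ~
either = df[(df["a"] == 1) | (df["b"] == 6)]
not_two = df[~df["a"].isin([2])]

print(mask_result)
print(mask_result.equals(query_result))
```

Each of these expressions returns a new DataFrame and keeps the original row labels, so the filtered rows can still be traced back to df.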

Sorting

df.sort_values("a")
df.sort_values("a", ascending=False)
df.sort_values(["a", "b"], ascending=[True, False])
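The sample df has no repeated values in "a", so the multi-column sort above never has to break a tie. The sketch below uses a hypothetical DataFrame, scores, with ties in "a" to show how the second key and the per-column ascending list interact.

```python
import pandas as pd

# Hypothetical data with repeated values in "a"
scores = pd.DataFrame({"a": [2, 1, 2, 1], "b": [4, 5, 6, 7]})

# "a" ascending; within equal "a", "b" descending
ordered = scores.sort_values(["a", "b"], ascending=[True, False])

# sort_values returns a new DataFrame and keeps the original row labels;
# reset_index(drop=True) renumbers the rows from 0
renumbered = ordered.reset_index(drop=True)

print(ordered)
print(renumbered)
```

Note that scores itself is not reordered; assign the result back (or to a new name, as here) to keep the sorted version.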

Handling Duplicates

df.drop_duplicates()               # Return a copy without fully duplicated rows
df.drop_duplicates(subset=["a"])   # Compare only column "a"
df.duplicated().sum()              # Count rows that repeat an earlier row
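The sample df contains no duplicates, so the calls above would leave it unchanged. As a rough illustration, the sketch below uses a hypothetical DataFrame, dup, and also shows the keep parameter of drop_duplicates, which the lesson does not mention: it controls which of the repeated rows survives.

```python
import pandas as pd

# Hypothetical data: row 1 repeats row 0 exactly; rows 2 and 3 share only "a"
dup = pd.DataFrame({"a": [1, 1, 2, 2], "b": [4, 4, 5, 6]})

n_dupes = int(dup.duplicated().sum())        # rows identical to an earlier row

full = dup.drop_duplicates()                 # drops only the exact repeat
first = dup.drop_duplicates(subset=["a"])    # keeps the first row per "a" (default)
last = dup.drop_duplicates(subset=["a"], keep="last")    # keeps the last row per "a"
unique_only = dup.drop_duplicates(subset=["a"], keep=False)  # drops every repeated "a"

print(n_dupes)
print(first)
print(last)
```

Sorting before calling drop_duplicates is a common way to decide which row counts as "first" or "last" for each key.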

Practice Problems

  1. Filter DataFrame by multiple criteria
  2. Add computed columns
  3. Sort by multiple columns
  4. Remove duplicates intelligently
  5. Use apply with custom functions
