Introduction
This section covers advanced aggregation techniques in pandas: groupby with custom functions, named aggregation, transform, and group filtering.
Custom Aggregations
import pandas as pd
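def range_func(x):
    # Spread between the largest and smallest value in the group
    return x.max() - x.min()

def coefficient_of_variation(x):
    # Sample standard deviation relative to the group mean
    return x.std() / x.mean()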
# df is assumed to be a DataFrame with "category" and "value" columns
df.groupby("category").agg({
    "value": [range_func, coefficient_of_variation, "mean", "std"]
})
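The following is a minimal, self-contained sketch of the same pattern. The sample data is invented purely for illustration, and flattening the column names at the end is an optional convenience, not something the approach requires.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "category": ["a", "a", "a", "b", "b"],
    "value": [10.0, 20.0, 30.0, 5.0, 15.0],
})

def range_func(x):
    # Spread between the largest and smallest value in the group
    return x.max() - x.min()

def coefficient_of_variation(x):
    # Sample standard deviation relative to the group mean
    return x.std() / x.mean()

# Custom functions and built-in names can be mixed in one list
summary = df.groupby("category").agg({
    "value": [range_func, coefficient_of_variation, "mean"]
})

# The result has two-level columns such as ("value", "range_func");
# joining the levels gives flat names like "value_range_func"
summary.columns = ["_".join(col) for col in summary.columns]
print(summary)
```

Custom functions appear in the result under their Python function name, so descriptive names pay off here.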
Named Aggregation (pandas 0.25+)
df.groupby("category").agg(
    total=("value", "sum"),
    average=("value", "mean"),
    count=("value", "count"),
)
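As a sketch of how named aggregation behaves, the example below runs it on invented sample data. Each keyword becomes an output column, and its value is a (column, function) pair. The `spread` column and the `range_func` helper are illustrative additions; `pd.NamedAgg` is the explicit spelling of the same tuple.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "category": ["a", "a", "a", "b", "b"],
    "value": [10, 20, 30, 5, 15],
})

def range_func(x):
    # Spread between the largest and smallest value in the group
    return x.max() - x.min()

result = df.groupby("category").agg(
    total=("value", "sum"),
    average=("value", "mean"),
    spread=("value", range_func),  # custom functions work too
    count=pd.NamedAgg(column="value", aggfunc="count"),  # same as a tuple
)
print(result)
```

Unlike the dictionary form, the result has flat column names in the order the keywords were given.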
Transform
# Add group statistics back to dataframe
df["group_mean"] = df.groupby("category")["value"].transform("mean")
df["group_normalized"] = df["value"] / df["group_mean"]
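A short sketch of why transform suits this job: it returns one value per original row, so the result aligns with the DataFrame it came from. The data below is invented for illustration, and the within-group z-score is an assumed example of a custom transform, not part of the original snippet.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "category": ["a", "a", "a", "b", "b"],
    "value": [10.0, 20.0, 30.0, 5.0, 15.0],
})

# Built-in transform: every row receives the mean of its own group
df["group_mean"] = df.groupby("category")["value"].transform("mean")

# Custom transform: standardize each value within its group
df["group_zscore"] = df.groupby("category")["value"].transform(
    lambda x: (x - x.mean()) / x.std()
)

# The row count is unchanged, unlike with agg()
print(df)
```

A group with a single row has an undefined sample standard deviation, so its z-score would come out as NaN.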
Filter Groups
# Keep only the groups that meet a criterion (here, a total value above 100)
df.groupby("category").filter(lambda x: x["value"].sum() > 100)
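The sketch below applies the same filter to invented sample data and compares it with a boolean mask built from transform. The mask variant is an alternative worth knowing about, shown here as an assumption about what a reader might want next rather than as the recommended method.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "category": ["a", "a", "a", "b", "b"],
    "value": [50, 40, 30, 5, 15],
})

# filter() keeps or drops whole groups; the lambda returns one bool per group
kept = df.groupby("category").filter(lambda g: g["value"].sum() > 100)

# Same selection as a row mask: broadcast the group total, then compare
mask = df.groupby("category")["value"].transform("sum") > 100
kept_via_mask = df[mask]

print(kept)
```

Both results keep the original row order and index. filter() returns a new DataFrame and leaves df unchanged, so the result has to be assigned to be used.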
Practice Problems
- Create custom aggregation functions
- Apply multiple transformations
- Filter groups by criteria
- Add computed columns to original data
- Implement rolling aggregations