The Interview Question
"You notice that revenue dropped 15% week-over-week. Walk me through how you would investigate and diagnose the root cause."
This is a classic structured problem-solving question that tests your analytical thinking, business acumen, and ability to work under pressure.
Why Companies Ask This
ℹ️
Amazon and Uber want data scientists who can be the "first responder" when something goes wrong. They need someone who can quickly narrow down the problem space, identify the root cause, and recommend action — not just run queries aimlessly.
Interviewers are evaluating:
- Structured Thinking — Can you decompose a complex problem systematically?
- Business Knowledge — Do you understand revenue drivers?
- Hypothesis-Driven Approach — Can you form and test hypotheses efficiently?
- Data Literacy — Do you know which data to look at and when?
- Communication Under Pressure — Can you explain your thinking clearly?
The Revenue Investigation Framework
Phase 1: Clarify & Scope (2 minutes)
Before diving into data, clarify:
- What metric dropped? (Gross revenue? Net revenue? By product line?)
- Time period? (When did it start? Is it a single event or trend?)
- Segmentation? (Is it across all users or specific segments?)
- External factors? (Holidays? Competitor actions? Economic events?)
Phase 2: Decompose & Segmentation (5 minutes)
Break the revenue metric into its components:
Revenue = Users × Conversion Rate × Average Order Value × Frequency
Or for subscription businesses:
Revenue = (Existing Subscribers × Retention Rate + New Subscribers) × ARPU
Phase 3: Hypothesize & Prioritize (3 minutes)
Generate hypotheses and prioritize by likelihood and impact:
High likelihood, high impact — Test first High likelihood, low impact — Quick check Low likelihood, high impact — Don't ignore Low likelihood, low impact — Skip for now
Phase 4: Investigate & Validate (10 minutes)
Systematically test each hypothesis with data.
Phase 5: Synthesize & Recommend (5 minutes)
Pull findings together and provide actionable recommendations.
Example: Investigating a 15% Revenue Drop at Uber
Phase 1: Clarify
"Before diving in, let me clarify: Is this 15% drop in gross bookings, net revenue after driver payouts, or profit? And is it a sudden drop this week, or has it been declining over several weeks?"
Interviewer: "It's gross bookings, dropped suddenly this week compared to last week."
Phase 2: Decompose
Gross Bookings = Trips × Average Trip Value
Where:
Trips = Active Riders × Trips per Rider
Average Trip Value = Distance × Price per Mile + Surcharges
Phase 3: Hypotheses
hypotheses = [
{
'hypothesis': 'Fewer active riders (acquisition or retention issue)',
'likelihood': 'medium',
'impact': 'high',
'data_needed': 'DAU, new user signups, churn rate',
},
{
'hypothesis': 'Existing riders taking fewer trips',
'likelihood': 'medium',
'impact': 'high',
'data_needed': 'trips per active rider, frequency distribution',
},
{
'hypothesis': 'Average trip value decreased (shorter trips, lower prices)',
'likelihood': 'high',
'impact': 'medium',
'data_needed': 'trip distance, pricing, surge multipliers',
},
{
'hypothesis': 'External event (weather, competitor, holiday)',
'likelihood': 'medium',
'impact': 'medium',
'data_needed': 'calendar events, weather data, competitor news',
},
{
'hypothesis': 'Product bug or pricing error',
'likelihood': 'low',
'impact': 'high',
'data_needed': 'error logs, pricing system status',
},
{
'hypothesis': 'Seasonal pattern (this is normal)',
'likelihood': 'medium',
'impact': 'n/a',
'data_needed': 'year-over-year comparison, seasonal indices',
},
]
Phase 4: Investigation
import pandas as pd
import numpy as np
def investigate_revenue_drop(bookings_df, users_df, trips_df):
"""
Systematic investigation of revenue drop.
"""
results = {}
# 1. Top-level decomposition
this_week = bookings_df[bookings_df['week'] == 'current']
last_week = bookings_df[bookings_df['week'] == 'previous']
results['revenue_change'] = (
(this_week['gross_bookings'].sum() - last_week['gross_bookings'].sum()) /
last_week['gross_bookings'].sum()
)
# 2. Driver analysis: Users vs. Frequency vs. Value
metrics = {
'active_users': this_week['active_users'].sum() / last_week['active_users'].sum(),
'trips_per_user': (
this_week['trips'].sum() / this_week['active_users'].sum()
) / (
last_week['trips'].sum() / last_week['active_users'].sum()
),
'avg_trip_value': (
this_week['gross_bookings'].sum() / this_week['trips'].sum()
) / (
last_week['gross_bookings'].sum() / last_week['trips'].sum()
),
}
results['driver_decomposition'] = metrics
# 3. Geographic segmentation
geo_impact = (
this_week.groupby('city')['gross_bookings'].sum() /
last_week.groupby('city')['gross_bookings'].sum() - 1
).sort_values()
results['geo_impact'] = geo_impact.head(5) # Worst performing cities
# 4. User segment analysis
segment_impact = (
this_week.groupby('user_segment')['gross_bookings'].sum() /
last_week.groupby('user_segment')['gross_bookings'].sum() - 1
).sort_values()
results['segment_impact'] = segment_impact
# 5. Time-of-day analysis
hourly_change = (
this_week.groupby('hour')['trips'].sum() /
last_week.groupby('hour')['trips'].sum() - 1
)
results['hourly_pattern'] = hourly_change
# 6. Cohort analysis (are new users behaving differently?)
new_user_bookings = (
this_week[this_week['is_new_user']]['gross_bookings'].sum() /
last_week[last_week['is_new_user']]['gross_bookings'].sum() - 1
)
existing_user_bookings = (
this_week[~this_week['is_new_user']]['gross_bookings'].sum() /
last_week[~last_week['is_new_user']]['gross_bookings'].sum() - 1
)
results['new_vs_existing'] = {
'new_user_change': new_user_bookings,
'existing_user_change': existing_user_bookings,
}
return results
# Example output:
# revenue_change: -0.15
# driver_decomposition: {
# 'active_users': 0.95, # -5% (not the main driver)
# 'trips_per_user': 0.92, # -8% (significant)
# 'avg_trip_value': 1.00 # 0% (not a factor)
# }
#
# → Primary driver: Existing users taking fewer trips
# → Next question: Why? Weather? Competitor? Product change?
Amazon-Specific Investigation Approach
The "Working Backwards" Method
Amazon prefers starting from the customer and working backwards:
- Customer Impact — Which customers are affected?
- Root Cause — What changed in their experience?
- System Impact — What technical or process change caused it?
- Business Impact — What's the financial impact?
- Corrective Action — How do we fix it and prevent recurrence?
Amazon Revenue Streams
For Amazon, revenue drops could be in:
- E-commerce sales — Product purchases
- AWS — Cloud services
- Advertising — Ad revenue
- Subscriptions — Prime memberships
- Third-party seller services — Marketplace fees
Each has different drivers and investigation paths.
Uber-Specific Investigation Approach
The "Marketplace Health" Framework
Uber's revenue depends on marketplace liquidity:
Revenue = Requests × Acceptance Rate × Completion Rate × Average Fare
Investigate each component:
- Requests — Are fewer people opening the app?
- Acceptance Rate — Are drivers declining more rides?
- Completion Rate — Are more rides being cancelled?
- Average Fare — Are trips shorter or is surge pricing lower?
Common Mistakes to Avoid
⚠️
These mistakes show a lack of structured thinking:
- Jumping to conclusions without data — Form hypotheses, then test
- Not decomposing the metric — Revenue is an aggregate; break it down
- Ignoring base rates — Is 15% actually unusual for this time period?
- Forgetting external factors — Check calendar, weather, competitors
- Not considering data quality — Is the metric accurate?
- Over-complicating — Sometimes it's simple (holiday, bug, data pipeline issue)
How to Structure Your Answer
Minute 1-2: Clarify the problem Minute 3-5: Decompose the metric Minute 6-8: Generate and prioritize hypotheses Minute 9-14: Investigate each hypothesis with data Minute 15-17: Synthesize findings Minute 18-20: Recommend actions and next steps