The Interview Question
"How would you improve Instagram Reels? What data would you look at, and what experiments would you run?"
This question tests your ability to think like a product-minded data scientist: someone who doesn't just analyze data but uses it to drive product decisions.
Why Companies Ask This
Meta and Airbnb want data scientists who can bridge the gap between data and product. They're not looking for someone who just runs SQL queries; they want someone who can identify opportunities, propose solutions, and measure impact.
Interviewers evaluate:
- Product Empathy: Do you understand user needs and pain points?
- Data-Driven Thinking: Can you identify the right data to inform decisions?
- Prioritization: Can you separate signal from noise in user behavior?
- Experimentation Design: Can you test hypotheses rigorously?
- Impact Orientation: Do you connect improvements to business outcomes?
The Product Improvement Framework
Step 1: Understand the Product & Users
- Who are the users? What are their goals?
- What is the product's value proposition?
- What does the current user journey look like?
Step 2: Identify Problems & Opportunities
- Where are users struggling or dropping off?
- What do the data patterns reveal?
- What are users saying (qualitative data)?
Step 3: Propose Solutions
- What specific improvements would address the problems?
- What are the trade-offs of each approach?
- How would you prioritize?
Step 4: Design Experiments
- How would you validate your hypotheses?
- What metrics would you track?
- What are the risks and mitigation strategies?
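Validating a hypothesis usually starts with knowing how much traffic the test needs. The sketch below estimates users per arm for a two-sided two-proportion z-test using the standard normal-approximation formula. The function name, the 5% baseline conversion rate, and the 10% relative lift are placeholder assumptions, not figures from any real product.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, min_detectable_lift, alpha=0.05, power=0.8):
    """Approximate users needed per arm for a two-sided two-proportion z-test.

    baseline_rate: control conversion rate, e.g. 0.05
    min_detectable_lift: relative lift to detect, e.g. 0.10 for +10%
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + min_detectable_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = NormalDist().inv_cdf(power)          # critical value for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


# Assumed inputs: 5% baseline conversion, detect a 10% relative lift
n_per_arm = sample_size_per_arm(baseline_rate=0.05, min_detectable_lift=0.10)
print(f"Users needed per arm: {n_per_arm:,}")
```

With these assumed inputs the estimate is on the order of 31,000 users per arm. Small lifts on low baseline rates get expensive quickly, which is worth saying out loud when proposing an experiment.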
Step 5: Measure & Iterate
- How would you know if it worked?
- What would you do next based on results?
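One common way to answer "did it work?" for a conversion metric is a pooled two-proportion z-test on the experiment results. The following is a minimal sketch; the function name and the counts are made up purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(conv_control, n_control, conv_treatment, n_treatment):
    """Two-sided z-test for a difference in conversion rates (pooled variance)."""
    p_c = conv_control / n_control
    p_t = conv_treatment / n_treatment
    pooled = (conv_control + conv_treatment) / (n_control + n_treatment)
    se = sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treatment))
    z = (p_t - p_c) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {
        "absolute_lift": p_t - p_c,
        "relative_lift": (p_t - p_c) / p_c,
        "z": z,
        "p_value": p_value,
    }


# Hypothetical experiment: 10,000 users per arm, 500 vs. 560 conversions
result = two_proportion_ztest(500, 10_000, 560, 10_000)
print(result)
```

In this made-up case the relative lift is 12%, but z is about 1.89 and the p-value about 0.058, so the result would not clear a 5% significance threshold. A reasonable next step would be to extend the test or check segments before deciding, rather than shipping on the point estimate alone.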
Example: Improving Airbnb's Search Experience
Step 1: Understanding the Product
"Airbnb's search is fundamentally different from Google: it's not just about relevance, it's about inspiration and trust. Users aren't just looking for a place to sleep; they're looking for an experience."
Key user segments:
- Vacation planners: browsing for inspiration, flexible on dates
- Business travelers: need specific locations, fixed dates, fast booking
- Group travelers: coordinating multiple people, need space
- Budget-conscious: price is the primary filter
Step 2: Identifying Problems from Data
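import pandas as pd

# Analyze the search-to-booking funnel
def analyze_search_funnel(searches_df, bookings_df):
    """Identify drop-off points in the search funnel.

    Assumes one row per search in searches_df, with boolean columns
    clicked_listing, saved_listing, and sent_message.
    """
    # Merge search sessions with booking outcomes
    funnel = searches_df.merge(
        bookings_df[['search_id', 'booked', 'booking_value']],
        on='search_id',
        how='left'
    )
    # Searches without a booking row are NaN after the left merge; treat them as not booked
    funnel['booked'] = funnel['booked'].eq(True)

    # Searches per user, restricted to users who booked at least once.
    # This counts all of a booker's searches, as an approximation of
    # "searches before booking".
    searches_per_user = funnel.groupby('user_id')['search_id'].count()
    booking_users = funnel.loc[funnel['booked'], 'user_id'].unique()

    # Overall funnel metrics
    metrics = {
        'total_searches': len(funnel),
        'searches_with_clicks': funnel['clicked_listing'].sum(),
        'searches_with_saves': funnel['saved_listing'].sum(),
        'searches_with_inquiry': funnel['sent_message'].sum(),
        'searches_with_booking': funnel['booked'].sum(),
        'conversion_rate': funnel['booked'].mean(),
        'avg_searches_before_booking': searches_per_user.loc[booking_users].mean(),
    }
    return metrics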
# Identify where users drop off
drop_off_analysis = {
    'view_to_click': 0.45,    # 45% of searches result in a click
    'click_to_save': 0.12,    # 12% of clicks result in a save
    'save_to_inquiry': 0.25,  # 25% of saves result in a message
    'inquiry_to_book': 0.35,  # 35% of inquiries result in booking
}
# Biggest drop-off at a required step: view -> click (55% abandonment).
# Saving and messaging are optional, so their lower rates are not pure abandonment.
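To see what these step rates imply end to end, they can be multiplied through. The sketch below treats the funnel as strictly sequential, which is a simplifying assumption: in practice a user can book without saving or messaging, so the real booking rate would be higher than this chain suggests.

```python
from itertools import accumulate
from operator import mul

# Step-to-step rates copied from the dictionary above
step_rates = {
    "view_to_click": 0.45,
    "click_to_save": 0.12,
    "save_to_inquiry": 0.25,
    "inquiry_to_book": 0.35,
}

# Running product: share of all searches that reach each stage
cumulative = dict(zip(step_rates, accumulate(step_rates.values(), mul)))

for step, share in cumulative.items():
    print(f"{step:<16} {share:.4%} of searches reach this stage")
```

Under the sequential assumption, fewer than 1 in 200 searches would end in a booking through the save-and-inquire path, which is a useful sanity check against the overall `conversion_rate` computed earlier.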
Key findings from data:
| Finding | Evidence | Implication |
|---|---|---|
| Users search 8+ times before booking | 67th percentile of search count | Discovery is hard |
| 55% abandon after viewing results | View-to-click rate | Results may not match intent |
| Saved listings have 3x higher conversion | Save-to-booking rate | Users need time to decide |
| Mobile users have 40% lower conversion | Device-level analysis | Mobile UX needs improvement |
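A finding like the device-level gap in the last row typically comes from a grouped conversion rate. The sketch below assumes a session-level table with hypothetical `device` and `booked` columns. The toy data is constructed so that mobile converts 40% lower than desktop; it is not real Airbnb data.

```python
import pandas as pd


def conversion_by_segment(sessions, segment_col="device", outcome_col="booked"):
    """Conversion rate per segment, plus the relative gap to the best segment."""
    summary = sessions.groupby(segment_col)[outcome_col].agg(
        sessions="count", bookings="sum", conversion_rate="mean"
    )
    best = summary["conversion_rate"].max()
    summary["gap_vs_best"] = summary["conversion_rate"] / best - 1
    return summary.sort_values("conversion_rate", ascending=False)


# Toy data: 10 desktop sessions with 5 bookings, 10 mobile sessions with 3 bookings
toy_sessions = pd.DataFrame({
    "device": ["desktop"] * 10 + ["mobile"] * 10,
    "booked": [1] * 5 + [0] * 5 + [1] * 3 + [0] * 7,
})
segment_summary = conversion_by_segment(toy_sessions)
print(segment_summary)
```

A raw gap like this is a starting point, not a conclusion. Mobile and desktop users may differ in intent (browsing versus booking), so it helps to check whether the gap persists within the same user segment before blaming the mobile UX.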
Step 3: Proposing Solutions
Solution A: Intent-Based Search Re-ranking
"Re-rank search results based on inferred user intent. For users who consistently save beach properties, prioritize beachfront listings even if they didn't explicitly filter."
Solution B: Enhanced Visual Previews
"Show more photos and a 360° preview on the search results page, reducing the need to click through to individual listings."
Solution C: Smart Date Flexibility
"For users with flexible dates, show price calendars highlighting the cheapest nearby dates, similar to Google Flights."
Solution D: Social Proof Integration
"Add '12 people are looking at this' and 'Booked 3 times this week' to create urgency and trust."
Prioritization Matrix
prioritization = pd.DataFrame({
    'solution': ['Intent Re-ranking', 'Visual Previews',
                 'Date Flexibility', 'Social Proof'],
    'impact_score': [8, 7, 6, 5],         # Expected impact on conversion
    'effort_score': [6, 8, 4, 3],         # Engineering effort (10=hardest)
    'confidence': [0.7, 0.6, 0.8, 0.5],   # Data support confidence
})
# ICE = impact x confidence x ease, where ease = 10 - effort
prioritization['ice_score'] = (
    prioritization['impact_score'] *
    prioritization['confidence'] *
    (10 - prioritization['effort_score'])
)
# Rank by ICE score, highest first
ranked = prioritization.sort_values('ice_score', ascending=False)
print(ranked[['solution', 'ice_score']])
# With these scores, Date Flexibility ranks first (about 28.8),
# ahead of Intent Re-ranking (about 22.4)
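Under the scores above, Date Flexibility has the highest ICE score (6 × 0.8 × (10 − 4) = 28.8). As a rough illustration of what Solution C's core lookup might involve, the sketch below finds the cheapest stays that start within a few days of the requested check-in. The function name, the price dictionary, and the window size are all hypothetical and do not describe how Airbnb or Google Flights implements this.

```python
from datetime import date, timedelta


def cheapest_nearby_dates(nightly_prices, check_in, nights, window_days=3, top_n=3):
    """Return up to top_n (total_price, start_date) pairs, cheapest first.

    nightly_prices: dict mapping date -> nightly price for one listing
    Only start dates within +/- window_days of check_in are considered,
    and only stays where every night has a known price.
    """
    options = []
    for offset in range(-window_days, window_days + 1):
        start = check_in + timedelta(days=offset)
        stay = [start + timedelta(days=i) for i in range(nights)]
        if all(night in nightly_prices for night in stay):
            options.append((sum(nightly_prices[night] for night in stay), start))
    return sorted(options)[:top_n]


# Toy price calendar for ten consecutive nights starting 2025-06-01
toy_prices = {
    date(2025, 6, 1) + timedelta(days=i): price
    for i, price in enumerate([100, 100, 90, 80, 80, 120, 150, 150, 100, 90])
}
suggestions = cheapest_nearby_dates(
    toy_prices, check_in=date(2025, 6, 5), nights=2, window_days=2
)
print(suggestions)
```

In this toy calendar, shifting the check-in one day earlier drops a two-night stay from 200 to 160. That price difference is the kind of saving the feature would surface, and the share of flexible-date users who accept a suggested shift would be a natural primary metric for the experiment.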
Meta-Specific Product Sense
The "Move Fast" Culture
Meta values speed of iteration. Your answer should show you can:
- Ship a minimum viable experiment quickly
- Learn from results and iterate
- Not over-engineer before validating hypotheses
The "Social Graph" Angle
For Meta products, always consider:
- Network effects: How does the feature affect other users?
- Content creation: Does it incentivize or disincentivize creation?
- Privacy implications: What data are you using and how?
Real Example: Instagram Stories Feature Improvement
"Instagram Stories had a 70% drop-off from viewing to creating. The data showed that users who added even one sticker or text overlay had 3x higher retention. The improvement: a 'Quick Create' template that pre-populated trending stickers and text styles, reducing creation time from 45 seconds to 15 seconds."
Airbnb-Specific Product Sense
The "Belong Anywhere" Angle
Airbnb's mission is about belonging, not just lodging. Your improvements should consider:
- Trust: How do users build confidence in unfamiliar listings?
- Community: How does the feature connect hosts and guests?
- Experience: How does it enhance the travel experience, not just the booking?
Real Example: Airbnb Wish List Improvement
"Wish Lists had low engagement because users saved listings but never returned. The data showed that wish lists with 5+ items had 40% higher return rates. The improvement: auto-suggest similar listings when saving, creating richer collections, and sending personalized notifications when saved listings drop in price."
Data Sources to Consider
Quantitative Data
- Behavioral logs: What users actually do
- Funnel metrics: Where users drop off
- Cohort analysis: How behavior changes over time
- Segment analysis: How different user groups behave differently
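Cohort analysis is the least self-explanatory item in this list, so here is a minimal weekly retention sketch in pandas. It assumes an event log with hypothetical `user_id`, `signup_date`, and `event_date` columns, and that every user has at least one event in their signup week; both are assumptions for illustration.

```python
import pandas as pd


def weekly_retention(events):
    """Rows: signup-week cohort. Columns: weeks since signup.
    Values: share of the cohort's users active in that week."""
    df = events.copy()
    df["cohort_week"] = df["signup_date"].dt.to_period("W").dt.start_time
    df["weeks_since_signup"] = (df["event_date"] - df["signup_date"]).dt.days // 7
    active = (
        df.groupby(["cohort_week", "weeks_since_signup"])["user_id"]
        .nunique()
        .unstack(fill_value=0)
    )
    # Cohort size = distinct users seen in the log for that signup week
    cohort_size = df.groupby("cohort_week")["user_id"].nunique()
    return active.div(cohort_size, axis=0)


# Toy log: two users sign up the same week; only user 1 returns the following week
toy_events = pd.DataFrame({
    "user_id": [1, 1, 2],
    "signup_date": pd.to_datetime(["2025-01-06"] * 3),
    "event_date": pd.to_datetime(["2025-01-06", "2025-01-14", "2025-01-06"]),
})
retention = weekly_retention(toy_events)
print(retention)
```

Reading the table row by row shows whether newer cohorts retain better than older ones, which is how a product change shows up in retention without being confounded by the age of the user base.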
Qualitative Data
- User surveys: What users say they want
- Usability testing: Where users struggle
- Support tickets: What users complain about
- Social media: What users say publicly
# Example: Combining quantitative and qualitative data
def mixed_method_analysis(behavioral_data, survey_data):
    """
    Triangulate findings from behavioral and survey data.

    Note: calculate_churn_correlation is not defined in this article
    and must be supplied separately.
    """
    # Flag rows that show behavioral signs of frustration
    frustration_signals = behavioral_data[
        (behavioral_data['rage_clicks'] > 3) |
        behavioral_data['session_abort'].eq(True) |
        (behavioral_data['back_button_rate'] > 0.8)
    ]
    # Cross-reference with survey responses
    frustrated_users = frustration_signals['user_id'].unique()
    survey_responses = survey_data[
        survey_data['user_id'].isin(frustrated_users)
    ]
    # Identify top pain points among frustrated users
    pain_points = (
        survey_responses['pain_point']
        .value_counts()
        .head(5)
    )
    return {
        # Share of distinct users (not rows) with any frustration signal
        'frustrated_user_pct': (
            len(frustrated_users) / behavioral_data['user_id'].nunique()
        ),
        'top_pain_points': pain_points,
        'correlation_with_churn': calculate_churn_correlation(
            frustrated_users, behavioral_data
        )
    }
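The function above calls `calculate_churn_correlation`, which the article does not define. One possible version is sketched below. It assumes `behavioral_data` carries a boolean `churned` column, which is a hypothetical column not mentioned in the original, and it returns the Pearson correlation between a per-user frustration flag and churn.

```python
import pandas as pd


def calculate_churn_correlation(frustrated_users, behavioral_data, churn_col="churned"):
    """Correlation between 'user showed frustration' and 'user churned'.

    Assumes one or more rows per user and a boolean churn column
    (hypothetical; adapt to however churn is actually recorded).
    Returns NaN if either flag is constant across users.
    """
    # Collapse to one row per user: churned if any of the user's rows says so
    churned = behavioral_data.groupby("user_id")[churn_col].max().astype(float)
    frustrated = pd.Series(
        churned.index.isin(frustrated_users), index=churned.index
    ).astype(float)
    return frustrated.corr(churned)


# Toy data: users 1 and 2 are frustrated and both churn; users 3 and 4 do neither
toy_behavior = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 4],
    "churned": [True, True, True, False, False],
})
corr = calculate_churn_correlation([1, 2], toy_behavior)
print(corr)
```

A correlation here only says that frustrated users churn more, not that frustration causes churn. That distinction is exactly why the framework pairs this kind of analysis with an experiment.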
Common Mistakes to Avoid
These mistakes signal that you don't understand product-driven data science:
- Proposing solutions without understanding the problem: Always start with data
- Ignoring user segments: Different users have different needs
- Overcomplicating the solution: Start with the simplest experiment
- Forgetting about negative outcomes: What if your change hurts something else?
- Not considering implementation constraints: Be realistic about what's feasible
- Only thinking about metrics, not users: Remember the human behind the data
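The "negative outcomes" mistake is commonly handled with guardrail metrics: metrics that must not get meaningfully worse even if the primary metric improves. The sketch below is one simple way to encode that check. The function name, metric names, and thresholds are hypothetical, and it assumes every guardrail is a "higher is worse" metric.

```python
def evaluate_launch(primary_lift, guardrail_increases, max_allowed_increase):
    """Recommend 'ship' only if the primary metric improved and no guardrail
    rose past its tolerated limit.

    guardrail_increases: observed relative increase per guardrail metric
    max_allowed_increase: tolerated relative increase per guardrail metric
                          (a guardrail with no listed limit tolerates none)
    """
    breaches = {
        metric: change
        for metric, change in guardrail_increases.items()
        if change > max_allowed_increase.get(metric, 0.0)
    }
    decision = "ship" if primary_lift > 0 and not breaches else "investigate"
    return decision, breaches


# Hypothetical readout: bookings up 3%, but support tickets up 8% against a 5% limit
decision, breaches = evaluate_launch(
    primary_lift=0.03,
    guardrail_increases={"cancellation_rate": 0.01, "support_tickets": 0.08},
    max_allowed_increase={"cancellation_rate": 0.02, "support_tickets": 0.05},
)
print(decision, breaches)
```

Naming the guardrails and their limits before the experiment starts makes the ship decision much harder to rationalize after the fact.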