rca · hard
Streaming metric dropped 80%: walk me through your root cause analysis
A key streaming metric dropped 80% overnight. Walk me through your root cause analysis.
An 80% drop is not a behavioral signal. It is a data alarm. The interviewer’s first implicit test is whether you recognize that before you build a single hypothesis.
Structure a strong answer
strong
"Before I hypothesize anything, I want to flag the magnitude. An 80% drop in a streaming metric almost never reflects real user behavior change overnight. My first working assumption is that something broke in how we're measuring, not what users are doing. So I'll start there, then build out from data layer to product layer to external causes.
Five clarifying questions before I go further: Which metric exactly, and how is it calculated? Watch time per subscriber? Stream starts? Completion rate? When did it start, and was the drop sudden or gradual? Is it isolated to one platform (iOS, Android, web) or uniform across all? Is it all users or a specific cohort or region? And what shipped in the prior 48 hours, including any recommendation model updates?
Assuming I hear 'sudden, overnight, iOS only, nothing obviously shipped': I go straight to technical triage. Bad app release causing playback failures, a CDN routing change misconfigured for a region, or a DRM error on that client. I'd check Crashlytics for playback errors, CDN logs for that surface, and rollback candidates from the last release. If it's uniform across platforms, I move to data layer first: did the event pipeline ingest correctly, did a logging configuration change, is the dashboard reading from a stale or misconfigured data source?
My full hypothesis tree, in priority order: (1) Data layer, including CDN logging failure, event pipeline outage, instrumentation or dashboard misconfiguration; (2) Technical, including playback errors, buffering spikes, DRM failures, device-specific crashes; (3) Product change, including recommendation model push, UI redesign, autoplay behavior change, search ranking update; (4) Content, including catalog removal, a batch of AI-generated titles with low completion signal tanking top-slot recommendations; (5) External, including ISP outage in a key region, competitor content launch, or seasonality. At a streaming company in 2026, I'd treat a recommendation model rollout as a co-equal hypothesis with instrumentation, because continuous model deployment is now standard practice. Netflix and Spotify ship model updates constantly, and a regression in personalization ranking can crater watch time or completion rate without any engineer touching UI code.
I'd close with a sequenced action plan: data validation and pipeline audit first, then segment by platform and cohort to isolate the failure surface, then an engineering sync if technical causes survive triage, then a cross-functional loop with the data science team if the model deployment path looks live. The fix depends entirely on what we find, but the first hour is always measurement integrity before user behavior."
weak
"I'd look at internal and external factors. Maybe a competitor launched a big new show, or the content slate wasn't strong this month. I'd also consider seasonality. I'd segment by platform and then investigate further." This treats 80% as a plausible signal from real user behavior rather than a data alarm. It opens with hypotheses that assume the measurement is correct, which experienced interviewers at streaming companies specifically watch for. It lacks a prioritized order, a validation step for each hypothesis, and any acknowledgment of AI model deployment as a failure mode.
What the interviewer is testing
The magnitude of 80% is deliberate. Interviewers at Netflix and Spotify use this number to check whether you question the data before you explain it. Candidates who jump to content quality or churn cohort explanations fail that test without knowing it.
Beyond the data-first instinct, the question tests three things: whether your clarifying questions are diagnostic (not generic), whether you can build a streaming-specific hypothesis tree rather than reciting generic internal/external buckets, and whether you understand that recommendation model regressions are now first-class failure modes, not edge cases.
Netflix publicly reported per-subscriber engagement down roughly 20% in H2 2024 versus H1 2023, attributed to fewer originals and a shrinking content slate. That kind of drop is gradual, behavioral, and well-understood. An 80% overnight drop is a different category of problem, and conflating them is the tell.
Completion rate dropping 80% on one platform points to a bad release or CDN config. Completion rate dropping 80% uniformly across all platforms points to a recommendation or content change. The platform segmentation question is not housekeeping: it determines which layer of the hypothesis tree you enter first.
The 2026 addition: any streaming RCA that doesn’t name AI recommendation model rollout as a top-three hypothesis is working from an outdated map of how these products operate. Model deployment is now continuous, the failure surface is large, and a PM who can name it, triage it, and loop in data science from call one reads as someone who knows the actual system.