Retention dropped 5% last week. Walk me through how you'd diagnose it.

How to run a retention RCA in a PM interview. Covers data integrity gates, cohort decay types, segmentation order, and 2026 AI-cause additions.

Retention dropped 5%: diagnose it

This question is not asking you to find the answer. It is asking you to demonstrate that you will not waste a company’s time chasing the wrong signal. The interviewers who report the clearest failures all describe the same move: the candidate skips the integrity check, assumes the metric is real, and spends ten minutes segmenting a measurement artifact.

Clarify before diagnosing

Five percent of what, measured how? D7, D30, or rolling 28-day retention have completely different diagnostic paths. A sudden drop (single-day cliff) points to a technical or logging cause. A gradual decay over two to four weeks points to a product or activation cause. Ask both questions, then proceed.

Structure a strong answer

strong

"Before I segment anything, I want to rule out a measurement problem. Did the metric definition change, did a logging pipeline go down, or did the denominator shift (for example, a change in how we count active users)? A surprising share of sudden drops are artifacts. If the data checks out, I want to know which cohorts are affected, because that changes everything."

"If new cohorts (users acquired in the last six to eight weeks) show the drop and older cohorts are stable, this is an acquisition or onboarding problem: the people coming in are lower quality or hitting a broken first-run experience. If all cohorts are decaying equally, the core product loop broke for existing users. If D1 and D7 retention are flat but D30 is down, the re-engagement system (push, email, in-app) is probably failing, not the product itself."

"Once I know the decay type, I segment in this order: cohort vintage, acquisition channel, platform and device, geography, feature usage. I stop when one segment accounts for seventy percent or more of the drop. For each hypothesis I want a falsifiability test: if it is an iOS notification permission regression, I expect push-open rates on iOS to have dropped in the same window."

"One thing I'd add for 2026: if this product has AI features, I'd check whether a model update, prompt change, or eval threshold change shipped in the window before the drop. AI-driven recommendation or generation changes can silently degrade the core use case without a traditional bug report."

"On business impact: a five-percent D30 drop on a ten-million MAU product is roughly five hundred thousand users not returning. I'd anchor leadership on that number before diving deep, and I'd name a leading indicator to watch within one to two weeks rather than waiting a full cohort cycle to confirm the fix."

weak

"It could be a competitor launch, or seasonality, or a bug, or the onboarding flow changed, or push notifications stopped working, or we had an iOS update." Listing eight causes with no prioritization principle is what interviewers call hypothesis dumping. It signals that you cannot operate under ambiguity, which is exactly what the question is testing.

What the interviewer is actually testing

The surface question is “can you be structured.” The real question is: do you know that retention is a lagging metric (visible problems typically started two to four weeks earlier), and do you know the three distinct decay types that require separate investigations? A candidate who treats a five-percent drop as a single, undifferentiated number and segments by geography first is chasing a correlation that may have nothing to do with cause.

The six failure modes to avoid: hypothesis dumping without a prioritization principle; skipping the data integrity gate; not asking which retention metric is dropping; cohort blindness (segmenting by region or device before checking cohort vintage); vague recovery recommendations (“I’d A/B test something”); and, at AI-native companies, omitting model and prompt changes as a cause category entirely.

Recovery is part of the answer

Most candidates stop at “I’d investigate further.” Strong candidates go one step further: identify which sub-segment of churned users is recoverable versus structurally lost, and name the minimum viable experiment to test the fix. That is not gold-plating the answer. It is what a strong PM would actually do on Monday.

For more on how leading versus lagging signals shape this kind of diagnosis, see leading vs. lagging indicators. For a parallel question where the cohort split matters the same way, see MAU flat, DAU down.

Clarify before diagnosing

Structure a strong answer

What the interviewer is actually testing

Recovery is part of the answer

Related