Cohort analysis
Grouping users by a shared starting point, then tracking what share return over time, to diagnose retention rather than just measure it.
Cohort analysis groups users by a shared starting point and tracks what percentage return at each elapsed interval. What separates it from ordinary segmentation is the time dimension: the question is not “who are these users?” but “how does this type of user behave as they age?” That framing is what makes the technique useful for diagnosing retention problems and for performing well in analytical PM interviews, where interviewers are specifically testing whether you go from “what happened” to “why” to “what we should change.”
The retention triangle
A cohort heatmap has a fixed structure. Rows are cohorts, defined by sign-up week or month. Columns are time elapsed since cohort start. Each cell holds the percentage of the original cohort still active at that elapsed point.
Two reads are required. Across a row: track how a single cohort ages. Down a column: compare different cohorts at the same lifecycle point, for example January sign-ups at week 2 versus February sign-ups at week 2. Reading only a single cell or headline number skips the diagnosis.
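The structure above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `retention_triangle`, the input shapes, the monthly granularity, and the sample users are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def retention_triangle(signups, activity):
    """Build a cohort retention table.

    signups:  {user_id: signup_date}            (hypothetical input shape)
    activity: iterable of (user_id, active_date)
    Returns {cohort (year, month): {months_elapsed: pct_of_cohort_active}}.
    """
    # Rows: group users into cohorts by sign-up month.
    cohort_users = defaultdict(set)
    for user, signed_up in signups.items():
        cohort_users[(signed_up.year, signed_up.month)].add(user)

    # Cells: distinct users seen at each elapsed month, per cohort.
    seen = defaultdict(set)
    for user, active_on in activity:
        signed_up = signups.get(user)
        if signed_up is None:
            continue  # activity from a user with no known sign-up
        elapsed = (active_on.year - signed_up.year) * 12 + (active_on.month - signed_up.month)
        if elapsed >= 0:
            seen[((signed_up.year, signed_up.month), elapsed)].add(user)

    # Each cell is a share of the ORIGINAL cohort size, not of the prior month.
    triangle = defaultdict(dict)
    for (cohort, elapsed), users in seen.items():
        triangle[cohort][elapsed] = round(100 * len(users) / len(cohort_users[cohort]), 1)
    return {c: dict(sorted(row.items())) for c, row in sorted(triangle.items())}


# Made-up sample data: two users per cohort.
signups = {
    "a": date(2026, 1, 5), "b": date(2026, 1, 20),
    "c": date(2026, 2, 3), "d": date(2026, 2, 10),
}
activity = [
    ("a", date(2026, 1, 6)), ("b", date(2026, 1, 21)),
    ("a", date(2026, 2, 8)), ("a", date(2026, 3, 2)),
    ("c", date(2026, 2, 4)), ("d", date(2026, 2, 11)),
    ("c", date(2026, 3, 1)),
]

triangle = retention_triangle(signups, activity)
row_read = triangle[(2026, 1)]                                    # one cohort as it ages
column_read = {c: row.get(1) for c, row in triangle.items()}      # all cohorts at month 1
```

Here `row_read` is the across-a-row read (the January cohort over time) and `column_read` is the down-a-column read (every cohort at the same elapsed month). Newer cohorts have fewer filled cells, which is what gives the table its triangular shape.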
Three curve shapes and what they tell you
Most products lose 70 to 75% of users in the first week. What matters is the shape of what follows.
- Perpetual decline: retention keeps falling toward zero. This is a product-market fit (PMF) problem. Onboarding improvements will not fix a product that gives users no reason to return.
- Flatten-and-hold: the curve drops early, then stabilizes at a floor. PMF exists with a retained core. The job is expanding how many users reach that floor.
- Smile curve: the curve drops, then ticks back up as churned users return without any campaign. This is the rarest and most commercially promising shape. Organic reactivation means the product solves a recurring need in users’ lives, which is a signal no re-engagement budget can manufacture.
Naming the shape rather than quoting a number is the first mark of a strong answer.
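As a rough illustration of naming the shape programmatically, the sketch below labels a single cohort's retention series with one of the three shapes. The function `classify_curve`, the one-point `flat_tolerance`, and the three-point tail are arbitrary assumptions chosen for the example, not established thresholds; a real curve should still be inspected by eye.

```python
def classify_curve(retention, flat_tolerance=1.0):
    """Label a retention series (percentages by elapsed period).

    Heuristic only. flat_tolerance is a hypothetical threshold, in
    percentage points, for treating movement as noise.
    """
    if len(retention) < 3:
        raise ValueError("need at least three points to judge a shape")

    # Smile: the latest point sits clearly above the lowest point so far.
    if retention[-1] - min(retention) > flat_tolerance:
        return "smile"

    # Flatten-and-hold: the last three points barely move.
    tail = retention[-3:]
    if max(tail) - min(tail) <= flat_tolerance:
        return "flatten-and-hold"

    # Otherwise the curve is still falling.
    return "perpetual decline"


# Made-up series, each starting after the initial first-week drop.
declining = classify_curve([30, 22, 16, 11, 7, 4])
holding = classify_curve([30, 24, 21, 20.5, 20.2, 20.0])
smiling = classify_curve([30, 22, 18, 17, 19, 22])
```

With these sample series, `declining` is "perpetual decline", `holding` is "flatten-and-hold", and `smiling` is "smile".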
Acquisition cohorts vs behavioral cohorts
Acquisition cohorts answer “what happened”: are last month’s sign-ups sticking longer than the prior month’s? Descriptive, useful for spotting trends across time.
Behavioral cohorts answer “why” and become prescriptive. Segment users by whether they completed a specific early action, then compare long-term retention across the two groups. If users who created a playlist in week one (Spotify) show 28-plus percentage points higher month-3 retention than users who did not, that is not a correlation to note. That is an instruction: shorten time-to-first-playlist in onboarding. The same logic applies to Slack’s “2,000 messages sent” threshold and Duolingo’s “completed three lessons in the first session.” Behavioral cohorts stop being academic the moment the retention delta is large enough to justify an onboarding redesign. That is the leap most weak answers miss.
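The comparison described above reduces to a small calculation: split users by whether they completed the early action, compute each group's retention, and take the difference in percentage points. The sketch below assumes a hypothetical record shape with boolean `activated` and `retained_m3` fields, and the sample users are invented for illustration.

```python
def retention_delta(users):
    """Compare month-3 retention between users who did and did not
    complete the early action.

    users: iterable of dicts with boolean 'activated' and 'retained_m3'
           (hypothetical field names).
    Returns (pct_activated, pct_not_activated, delta_in_percentage_points).
    """
    users = list(users)
    did = [u for u in users if u["activated"]]
    did_not = [u for u in users if not u["activated"]]
    if not did or not did_not:
        raise ValueError("both groups need at least one user")

    def rate(group):
        return 100 * sum(1 for u in group if u["retained_m3"]) / len(group)

    activated_rate = rate(did)
    baseline_rate = rate(did_not)
    return activated_rate, baseline_rate, activated_rate - baseline_rate


# Made-up data: 4 users completed the action (3 retained),
# 5 did not (2 retained).
sample = (
    [{"activated": True, "retained_m3": True}] * 3
    + [{"activated": True, "retained_m3": False}]
    + [{"activated": False, "retained_m3": True}] * 2
    + [{"activated": False, "retained_m3": False}] * 3
)

activated_rate, baseline_rate, delta_pp = retention_delta(sample)
# activated_rate == 75.0, baseline_rate == 40.0, delta_pp == 35.0
```

The delta is expressed in percentage points (75% minus 40% is 35 points), which is the unit the playlist example uses.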
The 2026 complication: AI tourists
AI-assisted onboarding has made sign-up frictionless. Products now regularly acquire large cohorts of users who explore briefly, then churn within two to three weeks before reaching any activation event. These AI tourists inflate cohort sizes and depress early retention metrics, making a product with genuine PMF look weaker than it is.
The correction is the M3 rebase: measure month-12 retention against the month-3 baseline rather than month-0. By month three, tourist churn has cleared and the remaining users are closer to the true addressable audience. Mentioning this unprompted in an interview signals that you understand how modern growth mechanics interact with measurement, not just how to read a heatmap. No existing benchmark or “best practice” from before 2024 accounts for this distortion.
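The rebase itself is a single division: each later retention figure is restated as a share of the month-3 survivors instead of the original sign-ups. A minimal sketch, with the function name `rebase` and the sample percentages invented for illustration:

```python
def rebase(retention_by_month, base_month=3):
    """Re-express retention relative to a later baseline month.

    retention_by_month: {month_elapsed: pct_of_original_cohort}
    Returns {month_elapsed: pct_of_users_still_active_at_base_month}
    for months at or after the baseline.
    """
    base = retention_by_month[base_month]
    if base <= 0:
        raise ValueError("baseline retention must be positive")
    return {
        month: round(100 * pct / base, 1)
        for month, pct in sorted(retention_by_month.items())
        if month >= base_month
    }


# Made-up cohort: 12% of sign-ups remain at month 3, 9% at month 12.
raw = {0: 100.0, 1: 30.0, 3: 12.0, 6: 10.0, 12: 9.0}
rebased = rebase(raw)
# rebased == {3: 100.0, 6: 83.3, 12: 75.0}
```

In this invented example, month-12 retention reads as 9% against month 0 but 75% against the month-3 baseline: the same cohort, with the early tourist churn taken out of the denominator.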
From analysis to intervention
Reading the triangle is the starting point. The PM’s job is connecting the reading to a concrete change.
In 2026, warehouse-native analytics stacks let behavioral cohort audiences sync directly to messaging tools. A cohort of “users who did not complete the key activation event by day 7” can become a targeted intervention audience within hours, no manual export required. That closes the loop from diagnosis to action: identify the activation event that predicts retention, find users who have not hit it, intervene before they churn.
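The audience definition in that loop can be sketched as a simple filter. The example below assumes a hypothetical user record with `id`, `signup`, and `activated_on` fields, and selects users who are past day 7 and still have not hit the activation event. Syncing the resulting list to a messaging tool depends on the stack in use and is not shown.

```python
from datetime import date


def at_risk_audience(users, today, window_days=7):
    """Return ids of users who are past the activation window and
    have still not completed the key activation event.

    users: iterable of dicts with 'id', 'signup' (date), and
           'activated_on' (date or None); hypothetical field names.
    """
    audience = []
    for user in users:
        days_since_signup = (today - user["signup"]).days
        past_window = days_since_signup >= window_days
        if past_window and user["activated_on"] is None:
            audience.append(user["id"])
    return audience


# Made-up users, evaluated on 15 March 2026.
users = [
    {"id": "u1", "signup": date(2026, 3, 1), "activated_on": date(2026, 3, 3)},   # activated
    {"id": "u2", "signup": date(2026, 3, 2), "activated_on": None},               # past day 7, not activated
    {"id": "u3", "signup": date(2026, 3, 12), "activated_on": None},              # still inside the window
]

audience = at_risk_audience(users, today=date(2026, 3, 15))
# audience == ["u2"]
```

Users still inside the window are left out on purpose: they have not yet missed the activation deadline, so they belong to onboarding rather than to the intervention audience.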
A flat or smiling retention curve is viability evidence. In a funding or roadmap conversation, it answers the question “why keep investing?” with data rather than conviction. Feasibility is no longer the hard part; retention is.
Weak vs strong interview answer
weak
"Our January cohort had higher 7-day retention than February, so January was a better month." Single number, no curve shape, no control for external factors (holidays, competitor outages, a campaign), no behavioral segmentation, no activation event identified, no intervention named. Interviewers at Meta and Google are testing whether you go from "what happened" to "why" to "what we should change." This answer stops at the first step.
strong
"I'd start with the acquisition cohort heatmap to read the curve shape, not the headline number. Perpetual decline signals a PMF problem. Flatten-and-hold means PMF found with a retained subset. Smile curve means organic reactivation: the rarest and best signal. If we recently ran a large AI-assisted onboarding push, I'd rebase to month 3 before concluding anything, so tourist churn doesn't contaminate the read. Once I have the shape, I'd pivot to behavioral cohorts: segment by users who hit the key activation event versus those who didn't. The delta in six-month retention between those two groups tells me exactly what to optimize onboarding toward. I'd connect it to a concrete change: shorten time-to-activation, not add more tooltips."
For a worked application of this thinking to a metric drop, see how to diagnose a DAU drop. For the mechanics of what retention actually measures, see retention and DAU/MAU.