How would you define the north star metric for this feature?

How to pick a north star metric for a specific feature in a PM interview, including the AI feature variant and why DAU always fails the test.

Define the north star metric for a feature

The north star for a feature is the single number that, if it rises, you are confident users got value. Not that they visited. Not that they clicked. That they got something done and it mattered.

Interviewers ask this to see whether you distinguish between activity and value delivery. Most candidates fail by picking a volume metric and calling it a day.

What the question is actually testing

Three things at once: whether you can clarify before you commit (stage? business model?), whether you know the difference between a north star, a guardrail, and a launch success metric, and whether you understand that a leading indicator predicts retention or revenue rather than just counting surface interactions.

Facebook’s activation target of 7 friends in 10 days is the canonical example because it operates at the feature level. It predicts long-term retention better than DAU on the feature ever could, because it captures a behavior that only occurs when the product delivered real value. Slack’s north star, usually described as messages sent within a team (or the number of paid teams), works for the same reason: it is a feature-level input that leads to retention, not a headcount of people who opened the app.

Structure a strong answer

Strong answer

"Before I name a metric, I want to clarify two things: what stage is this feature at, and what does value look like for the user in this specific context? [Interviewer confirms: it's a new AI summarization feature inside a B2B writing tool, targeting activation.] Given that, I'd anchor on 'percentage of users who edit or act on a summary within the same session.' Not views, not saves: a downstream action that only happens if the summary was genuinely useful. This is a leading indicator of retention. If users trust the output enough to act on it, they'll build a habit. I'd track DAU on the feature and summary-generation volume as health metrics in parallel, and I'd add a guardrail on 'summaries discarded without interaction' to catch quality degradation. I'm not using raw task completion because in a writing tool, abandoning the summary to write manually is a distinct signal worth separating. The north star should be the behavior that, if it rises, we are confident users got value. Everything else is diagnostic."

Weak answer

"I'd pick Daily Active Users for this feature. It tells us how many people are using it every day, which shows engagement and adoption. I'd also track session length and click-through rate as supporting metrics."

Why it fails: DAU goes up when users open the feature and leave frustrated. Session length on an AI feature goes up when the model is slow or hallucinating and users spend time cleaning up the output. Listing three equal-priority metrics signals no prioritization instinct. The interviewer hears: this person would ship friction and call it a win.

The 2026 trap: AI features make standard metrics actively misleading

Session length, DAU, and click-through rate all have a failure mode that predates AI but is now unavoidable: they reward surface activity, not value delivery. For a non-deterministic feature (a summarizer, a Copilot, a recommendation engine), those numbers go up for the wrong reasons. Interviewers at Meta, Google, and Stripe report that candidates default to model accuracy or latency when asked about AI feature north stars. Those are health metrics, not north stars.

The right anchor for an AI feature is a value-exchange signal: did the user accept the output and take the next step in their actual job? “Accepted and acted on” passes the viable test and the lovable test at once. Viable means someone pays to have this problem solved at scale. Lovable means the feature met them where they were and got them to the next step without making them fix its work.
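To make "accepted and acted on" concrete, here is a minimal Python sketch of how the summarization example might be computed from an event log. The event names (`summary_generated`, `summary_edited`, `summary_acted_on`), the `Event` shape, and the choice to measure the guardrail per session are all assumptions for illustration, not a real tracking schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    user_id: str
    session_id: str
    kind: str  # hypothetical event names, e.g. "summary_generated"


# Assumption: these downstream actions count as "the user got value".
VALUE_ACTIONS = {"summary_edited", "summary_acted_on"}


def feature_metrics(events):
    """Return a north star, a guardrail, and a volume diagnostic."""
    kinds_by_session = defaultdict(set)
    for e in events:
        kinds_by_session[(e.user_id, e.session_id)].add(e.kind)

    # Volume diagnostic: anyone who touched the feature at all.
    active_users = {user for user, _ in kinds_by_session}

    # Sessions in which a summary was actually produced.
    exposed = {
        key: kinds
        for key, kinds in kinds_by_session.items()
        if "summary_generated" in kinds
    }
    exposed_users = {user for user, _ in exposed}

    # North star: users who acted on a summary in the same session.
    acted_users = {
        user for (user, _), kinds in exposed.items() if kinds & VALUE_ACTIONS
    }

    # Guardrail: sessions where the summary got no follow-up at all.
    discarded = [key for key, kinds in exposed.items()
                 if kinds == {"summary_generated"}]

    return {
        "feature_active_users": len(active_users),
        "acted_on_rate": len(acted_users) / len(exposed_users) if exposed_users else 0.0,
        "discard_rate": len(discarded) / len(exposed) if exposed else 0.0,
    }
```

In this sketch a user who opens the feature, generates a summary, and leaves still raises `feature_active_users`, but only lowers `acted_on_rate` and raises `discard_rate`, which is the separation between activity and value the answer is aiming for.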

The lifecycle-stage trap

A north star that is right for activation is wrong for a mature feature. At activation, you want to see the first instance of value delivery (first summary acted on, first document completed). At retention, you want frequency and depth (percentage of sessions where the feature was used at least once). Candidates who skip this clarification almost always name a retention metric for an activation-stage feature and lose the thread.
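The two stages call for differently shaped calculations, which a short Python sketch can illustrate. The function names, the input shapes, and the 7-day activation window are assumptions chosen for the example. The right window depends on the product.

```python
from datetime import date, timedelta


def activation_rate(signup_dates, first_value_dates, window_days=7):
    """Activation stage: share of new users whose first value event
    (e.g. first summary acted on) fell within the window after signup.

    signup_dates and first_value_dates map user_id -> date.
    window_days=7 is an illustrative assumption.
    """
    if not signup_dates:
        return 0.0
    window = timedelta(days=window_days)
    activated = 0
    for user, signed_up in signup_dates.items():
        first_value = first_value_dates.get(user)
        if first_value is None:
            continue  # never reached value
        if timedelta(0) <= first_value - signed_up <= window:
            activated += 1
    return activated / len(signup_dates)


def feature_session_share(session_used_feature):
    """Retention stage: share of sessions in which the feature
    was used at least once. Input is a list of booleans, one per session."""
    if not session_used_feature:
        return 0.0
    return sum(session_used_feature) / len(session_used_feature)
```

The activation function counts each user at most once and ignores anything after the first value event, while the retention function looks at every session. Reporting the second for a feature that has not yet proven first value is the mismatch described above.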

The PM judgment

Pick the number that only rises when the user succeeded at their actual task. Treat volume metrics as diagnostics. If the metric is gameable by making the feature slower, more verbose, or more intrusive, it is the wrong north star.

Further reading: north star metric framework, leading vs. lagging indicators, measure success for a new AI product.