Product metrics definition

A product metric is a quantitative signal that tells you whether something you care about is moving in the right direction. The error most candidates make is treating metrics as a vocabulary list (DAU, NPS, LTV, churn) rather than as a judgment layer on top of a specific product’s value exchange. Metrics questions make up roughly 24% of PM interview loops at MAANG-tier companies, tested not because interviewers want to hear you recite DAU, but because how you build a measurement system reveals whether you understand what a product is actually doing for users.

The definitional hierarchy

These four terms are often conflated. Each has a distinct job.

Metric: Any quantitative signal you track. Metrics are neutral. DAU is a metric. So is error rate.

KPI (key performance indicator): A metric the team has committed to move. KPIs track operational health: are we executing well? They change when strategy changes.

North star metric: The single metric that captures the core value exchange between the product and its users. A north star is not a KPI. KPIs track how the machine is running; a north star tracks whether the machine is doing the right thing. Slack’s north star is messages sent per active team per week, not DAU. DAU is an output; message volume is the behavior that predicts whether teams are actually getting value from the tool.

OKR (objective and key result): A goal-setting construct. The objective is qualitative; the key results are measurable signals of progress. OKRs organize work around a north star; they do not replace one.

The hierarchy: metrics are the raw material. KPIs select the ones the team commits to move. The north star picks the one that captures the core value exchange. OKRs give that ambition a time-bound structure. Every answer that jumbles these signals the candidate has only seen the terms in a glossary.

Leading vs. lagging indicators

A lagging indicator confirms what already happened: retention rate, revenue, churn. A leading indicator predicts what will happen: feature adoption rate, activation rate, task recurrence interval.

The decision rule: when diagnosing a problem, start with lagging indicators to confirm you have one. When instrumenting a new feature, set up leading indicators first so you can intervene before the lag metric moves. State which type you are discussing and why. Interviewers test this distinction directly in the “a metric dropped, what do you do?” format.

The conflicting-metric tradeoff

The conflicting-metric tradeoff has replaced the funnel drop-off question as the dominant analytical move in 2026 PM interviews. Nearly every Meta and Stripe loop includes a version of it: engagement is up, revenue is down. What do you do?

A strong answer names which metric is closer to the core value exchange (north star), which is a guardrail (signals you are breaking something to hit the other), and what the tradeoff implies about user behavior right now. “It depends” without specifying what it depends on is not an answer.

The metrics dump anti-pattern

Listing metric types without connecting them to the product’s specific value exchange is the metrics dump. It signals memorization, not judgment. Instagram optimized for MAU for years while the DAU/MAU ratio eroded. The gap was the real signal: users were returning less frequently even as the count grew. Tracking the wrong metric at the wrong abstraction level is a judgment problem, not a data problem.

The 2026 addition: AI product measurement

For any AI product, a single north star is insufficient. Strong candidates use a three-layer measurement model:

Business outcome: The north star. Documents completed per active user per week for an AI writing assistant.
Model performance: Precision, recall, F1 on the task. These are offline metrics, evaluated on held-out data before shipping.
Trust and reliability: Correction rate (how often users edit the AI output significantly), escalation rate, task abandonment rate.

The offline vs. online distinction is a seniority signal. Offline measures the model on held-out data. Online measures what users actually do in production. A north star that climbs while the model quietly degrades is a known failure mode; flag it before you’re asked.

Strong vs. weak answer

strong

"I start with the value exchange: what does this product do for the user, and what does the user do for the business? Then I build a metric hierarchy: one north star that captures that exchange, two or three leading indicators that predict it, and one or two guardrail metrics that flag I am not breaking something else to hit the north star. For an AI product, I add the three-layer model: business outcome, model performance (precision, recall), and trust metrics (correction rate, escalation rate). I close by naming the tradeoff I would watch: the two metrics most likely to conflict, and my decision rule for resolving them. For an AI writing assistant, my north star is documents completed per active user per week. Leading indicators: session depth and prompt iterations. Guardrail: task abandonment rate. Trust layer: correction rate, because rising completion numbers with a high correction rate means users are tolerating the output, not relying on it."

weak

"I would track DAU, NPS, churn, and conversion rate." No north star. No connection to the specific product's value exchange. No leading vs. lagging distinction. No guardrail logic. Interviewers call this the metrics dump. It confirms that the candidate memorized a list but has not built a measurement system before.

Where to go deeper

The north star metric entry covers picking and defending one under interviewer pressure. For the interview application, see two metrics conflict: engagement vs. revenue and measure success for a new AI product. The KPI tree is the right tool for decomposing a north star into its component levers when the interviewer asks you to go deeper.