Vanity metrics vs. actionable metrics
Best for: Metrics questions, success-measurement questions, RCA rounds, and stakeholder reviews
A vanity metric is a number that can climb in your deck while your product moves in the wrong direction. An actionable metric is one you can act on with a specific change, trace to a cause, and compare reliably over time. Eric Ries named the distinction in The Lean Startup (2011), but what matters in interviews is not the definition. It is the three-part test you apply on the fly to any number you or the interviewer puts on the table.
The three-part test
Run every candidate metric through these questions in order. If any answer is no, the metric is doing vanity work regardless of what it’s called.
- Can I take a specific action based on this number? Not “investigate further” but a concrete change to the product, the onboarding flow, the pricing tier, or the go-to-market. If your response to a metric moving is always “let’s dig in,” you have a proxy for a metric, not a metric.
- Can I isolate what caused it to move? A metric you cannot link to a specific input is a weather report. It tells you what happened; it cannot tell you what to do next or whether the last thing you shipped helped.
- Is it reproducible and comparable over time? A number that shifts definition between quarters or that a single large customer can swing 20 points is not a reliable basis for a team decision.
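The three questions above can be written down as a small checklist. This is a minimal sketch, not a standard tool: the names `MetricCheck`, `first_failed_question`, and `verdict` are hypothetical, and the yes/no answers are judgment calls you supply for each metric, not something the code can infer.

```python
from dataclasses import dataclass
from typing import Optional

# The three questions, in the order the test applies them.
QUESTIONS = (
    "Can I take a specific action based on this number?",
    "Can I isolate what caused it to move?",
    "Is it reproducible and comparable over time?",
)


@dataclass
class MetricCheck:
    """Hypothetical record of your answers for one candidate metric."""
    name: str
    specific_action: bool       # a concrete change, not "investigate further"
    cause_isolable: bool        # can be linked to a specific input
    comparable_over_time: bool  # stable definition, not swung by one customer


def first_failed_question(check: MetricCheck) -> Optional[str]:
    """Return the first question answered 'no', or None if all three pass."""
    answers = (check.specific_action, check.cause_isolable, check.comparable_over_time)
    for question, answer in zip(QUESTIONS, answers):
        if not answer:
            return question
    return None


def verdict(check: MetricCheck) -> str:
    """Any single 'no' means the metric is doing vanity work."""
    return "vanity" if first_failed_question(check) else "actionable"


downloads = MetricCheck("Total app downloads", False, False, True)
d7 = MetricCheck("D7 retention on activated users", True, True, True)
print(downloads.name, "->", verdict(downloads), "| failed:", first_failed_question(downloads))
print(d7.name, "->", verdict(d7))
```

Returning the first failed question, not just the label, mirrors how the test is used in conversation: you name the question the metric fails, then propose the swap.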
This test has a named source (Ries, extended by Dave McClure’s formulation: if you can act on it, it’s actionable), and it gives you a live protocol rather than a list to memorize. That matters because metrics questions appear in roughly 21% of PM interviews, and interviewers at senior levels will immediately stress-test a label by throwing a specific number at you.
Why context decides the label
The persistent nuance interviewers test: the same metric can be actionable in one context and vanity in another. DAU/MAU ratio is the canonical example.
Reported raw across an entire product, DAU/MAU is vanity. It averages over radically different user types, use cases, and lifecycle stages. A ratio of 0.40 tells you nothing about whether casual monthly browsers are converting to daily power users or whether daily users are the ones most at risk of churning.
Segmented by activation cohort, DAU/MAU becomes actionable. For example, users who completed onboarding step three in their first session may show a 0.62 ratio, while users who skipped it show 0.21. Now you have a specific intervention point and a measurable expected lift. Same metric, opposite verdict.
The rule: a metric earns the actionable label when you can segment it by a variable you control. If you cannot segment it, you have aggregate noise.
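To make the segmentation concrete, here is a minimal sketch of DAU/MAU computed per cohort. It assumes DAU/MAU means average daily actives divided by distinct actives in the window, which is one common definition. The function name `dau_mau_by_segment`, the cohort labels, and the toy four-day window are all illustrative; a real calculation would typically use a 28- or 30-day window.

```python
from collections import defaultdict
from datetime import date


def dau_mau_by_segment(events, segment_of, days_in_window):
    """Average daily actives / distinct actives in the window, per segment.

    events: iterable of (user_id, day) activity records inside one window.
    segment_of: dict mapping user_id -> segment label (hypothetical labels).
    """
    daily = defaultdict(lambda: defaultdict(set))  # segment -> day -> users
    window = defaultdict(set)                      # segment -> users
    for user_id, day in events:
        seg = segment_of[user_id]
        daily[seg][day].add(user_id)
        window[seg].add(user_id)

    ratios = {}
    for seg, users in window.items():
        avg_dau = sum(len(u) for u in daily[seg].values()) / days_in_window
        ratios[seg] = avg_dau / len(users)
    return ratios


# Toy data: users a and b completed onboarding step three, c and d skipped it.
days = [date(2026, 1, d) for d in range(1, 5)]
events = [("a", d) for d in days] + [
    ("b", days[0]), ("b", days[2]), ("c", days[1]), ("d", days[3]),
]
segment_of = {
    "a": "completed_step_3", "b": "completed_step_3",
    "c": "skipped_step_3", "d": "skipped_step_3",
}

by_cohort = dau_mau_by_segment(events, segment_of, days_in_window=4)
overall = dau_mau_by_segment(events, {u: "all" for u in segment_of}, days_in_window=4)
print(overall)    # {'all': 0.5}
print(by_cohort)  # {'completed_step_3': 0.75, 'skipped_step_3': 0.25}
```

In this toy data the aggregate ratio of 0.5 hides a 0.75 versus 0.25 split between cohorts. The segmenting variable here is an onboarding step, something the team controls, which is what makes the split usable.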
Named examples with the swap
| Vanity version | Why it fails the test | Actionable swap |
|---|---|---|
| Total app downloads | Says nothing about use, intent, or retention | D7 retention on users who completed any core action |
| Page views | Inflated by bots, bounces, and accidental visits | Engaged sessions (defined threshold of time + action) |
| Messages sent | Volume inflated by automation and low-intent activity | D7 retention on users who sent at least one message in session one |
| NPS score (raw) | High NPS from users who have touched only 20% of features is a churn signal, not a win | NPS segmented by activation depth and feature breadth |
| Average SaaS activation rate | The cross-product average is ~37% (Userpilot benchmark across 547 products); yours relative to that number means nothing without your cohort baseline | Activation rate by acquisition channel, so you know which sources produce users who actually get to value |
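Several of the swaps in the table are cohort-filtered D7 retention. The sketch below shows one way to compute it, assuming "D7 retained" means the user was active exactly seven days after signup. Teams define this differently (some count any activity in days 7 to 13, for instance), so treat the definition, the function name `d7_retention`, and the sample data as assumptions.

```python
from datetime import date, timedelta


def d7_retention(signup_day, active_days, qualified):
    """Share of qualified users who were active exactly 7 days after signup.

    signup_day: dict user_id -> signup date.
    active_days: dict user_id -> set of dates with activity.
    qualified: set of user_ids who completed the core action (the cohort filter).
    Returns None when the cohort is empty.
    """
    cohort = [u for u in signup_day if u in qualified]
    if not cohort:
        return None
    retained = sum(
        1 for u in cohort
        if signup_day[u] + timedelta(days=7) in active_days.get(u, set())
    )
    return retained / len(cohort)


start = date(2026, 1, 1)
day7 = start + timedelta(days=7)
signup_day = {u: start for u in ["u1", "u2", "u3", "u4", "u5"]}
active_days = {
    "u1": {day7},
    "u2": {date(2026, 1, 3), day7},
    "u3": {date(2026, 1, 2)},
    "u5": {day7},  # came back on day 7 but never completed a core action
}
completed_core_action = {"u1", "u2", "u3", "u4"}

rate = d7_retention(signup_day, active_days, completed_core_action)
print(rate)  # 0.5
```

The `qualified` filter is the point of the swap: it restricts the denominator to users who did something meaningful, so the number reflects activation quality and not raw install volume.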
The AI-product additions (2026)
Feasibility largely ceased to be a binding constraint for product teams in 2025-2026. Almost anything can be built quickly, so teams can ship constantly and there is always something impressive to point at. Vanity metrics multiply because the surface area for things that look good in a board review has expanded dramatically.
The new vanity metrics showing up in AI PM interviews:
- LLM citation count / AI visibility score: measures whether an AI mentions your product, not whether the resulting user visited, converted, or stayed.
- Token throughput: a capacity and cost metric masquerading as a quality signal. High throughput is table stakes, not product health.
- Model calls made: tells you the system is busy; says nothing about whether users got what they came for.
The actionable equivalents:
- Conversion from AI-referred traffic (citation to paying account)
- Task completion rate in agentic flows, with human-verified ground truth
- Hallucination-corrected resolution rate (the proportion of agent responses that resolved the user’s intent without a re-prompt or manual correction)
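The resolution-rate definition above translates directly into a ratio. This is a sketch under assumptions: the record type `AgentResponse`, its field names, and the function `clean_resolution_rate` are hypothetical, and `resolved_intent` is assumed to come from human-verified labels, not from the agent's own self-report.

```python
from dataclasses import dataclass


@dataclass
class AgentResponse:
    """Hypothetical per-response record; field names are illustrative."""
    resolved_intent: bool     # human-verified: did it resolve what the user wanted?
    needed_reprompt: bool     # user had to ask again
    manually_corrected: bool  # user or staff fixed the output by hand


def clean_resolution_rate(responses):
    """Proportion of responses that resolved intent with no re-prompt or manual fix.

    Returns None for an empty sample to avoid reporting a misleading 0.
    """
    if not responses:
        return None
    clean = sum(
        1 for r in responses
        if r.resolved_intent and not r.needed_reprompt and not r.manually_corrected
    )
    return clean / len(responses)


sample = [
    AgentResponse(True, False, False),   # clean resolution
    AgentResponse(True, True, False),    # resolved, but only after a re-prompt
    AgentResponse(False, False, False),  # never resolved
    AgentResponse(True, False, False),   # clean resolution
]
print(clean_resolution_rate(sample))  # 0.5
```

A plain "resolved" count on the same sample would be 3 of 4. The stricter ratio is lower because it refuses credit for the response that needed a re-prompt.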
The underlying principle: in 2026, the metrics that prove a product is worth building must speak to viability (someone is paying for this, at a margin that sustains the business) and lovability (users return and expand usage because the product is genuinely better, not because switching is slightly inconvenient). Impressions, raw active users, and AI-activity metrics do not clear either bar.
Use it, do not recite it
Strong answer
"I use a three-part check: can I take a specific action based on this number, can I isolate what caused it to move, and can I compare it reproducibly over time? If any of those fails, it's doing vanity work regardless of what it's called.
Apply that to a messaging product. Messages sent sounds actionable, but it's vanity if I can't segment by intent: a reminder bot inflating send volume tells me nothing about whether users are getting value. D7 retention on users who sent at least one message in their first session is actionable because it isolates activation quality and I can move it with a specific onboarding change. I'd set a guardrail on message quality signals (reply rate, thread continuation) so I don't just push users to send one low-effort message to hit the threshold.
For an AI product in 2026: model citation count is the new page view. It looks great in a board deck but it doesn't tell me if the user completed their task or if the agent's response was accurate. Task completion rate with human-verified ground truth is the actionable swap. It clears both bars I care about: viability (someone paid for this to work) and lovability (it worked well enough that they didn't have to redo it).
When I present metrics to stakeholders I always include the 'so what': the specific decision this number unlocks or closes. If I can't answer that in one sentence, the metric doesn't belong in the review."
Weak answer
"Vanity metrics are things like page views and followers that look good but don't mean anything. Actionable metrics actually help you make decisions, like conversion rate or retention."
This defines the terms correctly but signals no depth. It names no context, applies no test, and cannot defend a specific choice. When the interviewer follows with "OK, but is DAU a vanity metric?" (and they will), this answer has nowhere to go. Any candidate who read one blog post says the same thing.
In an RCA
When a metric drops, the vanity-vs-actionable distinction surfaces as a diagnostic question: are you looking at the right level? A drop in “weekly active users” is often not the actionable signal; it’s a symptom. The actionable layer is one level down: which activation cohort dropped, which acquisition channel, which feature surface. The RCA framing is identical to the test: what action does this specific breakdown unlock? If you’re still in the aggregate, you’re still in vanity territory.
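Going "one level down" can be sketched as a simple decomposition: split the aggregate change by segment and see which segment accounts for it. The function name `change_by_segment`, the channel names, and the numbers below are invented for illustration only.

```python
def change_by_segment(before, after):
    """Break an aggregate change into per-segment deltas.

    before, after: dicts mapping segment -> weekly active users.
    Returns (segment, delta, share_of_total_change) tuples, largest drop first.
    share_of_total_change is 0.0 when the aggregate did not move.
    """
    total_change = sum(after.values()) - sum(before.values())
    rows = []
    for seg in sorted(set(before) | set(after)):
        delta = after.get(seg, 0) - before.get(seg, 0)
        share = delta / total_change if total_change else 0.0
        rows.append((seg, delta, share))
    return sorted(rows, key=lambda row: row[1])


# Hypothetical WAU by acquisition channel, week over week.
before = {"organic": 5000, "paid_social": 3000, "referral": 2000}
after = {"organic": 4950, "paid_social": 2200, "referral": 2050}

for seg, delta, share in change_by_segment(before, after):
    print(f"{seg:12s} {delta:+6d} {share:+.2%}")
```

In this made-up data the aggregate fell by 800 and the paid_social channel alone fell by 800, with small organic and referral moves cancelling out. The aggregate says "WAU is down 8%"; the breakdown says where to look and what to change.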
Credibility in stakeholder reviews
The credibility test for a PM: can you defend why you chose a metric, not just report it? Interviewers at Stripe, Google, and Meta routinely push back with “why not X instead?” The answer that clears the bar names the specific decision the metric enables, the intervention you’d make if it moved, and the guardrail that prevents gaming. Reporting a number without that context is presenting a vanity metric regardless of what the number is.
See also: North Star Metric, AARRR pirate metrics, and proving viability for AI products.