NPS (Net Promoter Score) definition for product managers

NPS is the metric interviewers use to test PM maturity, not the metric mature PMs optimize. Knowing the formula is table stakes. What separates strong candidates is knowing when NPS misleads, what it structurally cannot measure, and why the “recommend to a friend” question breaks down entirely for agentic and AI-native products.

The formula and thresholds

Ask users: “How likely are you to recommend this product to a friend or colleague?” on a 0-10 scale.

Promoters (9-10): actively advocate and tend to expand.
Passives (7-8): satisfied but not enthusiastic; they count toward the total number of responses but are neither added to nor subtracted from the score.
Detractors (0-6): at risk of churn and may actively discourage others.

Formula: % Promoters minus % Detractors = NPS (-100 to +100).
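The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a standard library: the function name nps and the sample responses are hypothetical, and it assumes each response is an integer from 0 to 10.

```python
def nps(scores):
    """Return NPS (-100 to +100) for a list of 0-10 survey responses."""
    if not scores:
        raise ValueError("need at least one response")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    # Passives (7-8) are part of len(scores) but of neither count,
    # so they dilute both percentages without being added or subtracted.
    return 100 * (promoters - detractors) / len(scores)


# Hypothetical sample: 4 promoters, 3 passives, 3 detractors.
responses = [10, 9, 9, 8, 7, 7, 6, 4, 10, 3]
print(nps(responses))  # 40% promoters - 30% detractors -> 10.0
```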

Score interpretation: above 0 means more promoters than detractors; 50+ is strong; 70+ is world-class.

2026 benchmarks

Segment	Average NPS
B2B SaaS	~29
B2C software	~47
Consumer electronics	50-65
E-commerce	45-55

Notable scores: Apple 68-72, Tesla 73-78, Netflix 50-64, Amazon 60-62. A SaaS company at NPS 30-40 is in the normal range, not failing. A consumer app at 18 has a real problem. Use these numbers in interviews to sanity-check whatever figure a company cites.

Why NPS is a lagging indicator, not a north star

Gartner predicted that more than 75% of organizations would abandon NPS as their primary CX metric by 2025. As of 2026, only 23% of enterprise CX leaders still use it as their primary metric. Fred Reichheld, the metric’s inventor, publicly stated he is “sick of surveys” and called NPS “the worst misbranding.” This context matters for interviews: reaching for NPS as your first answer is a tell that you have not rethought your metric stack.

NNGroup identified six structural flaws worth knowing cold:

1. Information loss through binning. A detractor moving from 2 to 5 shows measurable improvement in user sentiment but is invisible to NPS. The detractor band runs 0-6; a 0 and a 6 are treated identically.

2. Passives carry no weight. The 7-8 segment is often the most movable cohort, yet these responses count only toward the denominator; they are neither added to nor subtracted from the score.

3. Loyalty does not equal usability. One product can score NPS 60 because users are locked in by switching costs, while another scores -10 because a genuinely better competitor just launched. The score captures perception, not product quality.

4. Small-sample invalidity. NPS is statistically unreliable below several hundred responses per segment. Most B2B products never reach this threshold for meaningful subgroup comparisons.

5. Too broad to detect granular changes. A single question cannot isolate onboarding friction from core-value delivery from support quality.

6. Susceptible to gaming. Teams that route happy customers first, offer discounts for high scores, or time surveys post-resolution can move NPS 10-15 points with no underlying product improvement.
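The binning flaw (item 1) is easy to see with numbers. The sketch below uses made-up responses in which three detractors move from 2 to 5: the mean rating rises, but NPS does not change. The helper nps and both datasets are hypothetical.

```python
from statistics import mean


def nps(scores):
    """Return NPS (-100 to +100) for a list of 0-10 survey responses."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


# Hypothetical survey waves: the same ten users, three of whom
# improve from 2 to 5 but stay inside the 0-6 detractor band.
before = [2, 2, 2, 7, 8, 9, 9, 10, 10, 10]
after = [5, 5, 5, 7, 8, 9, 9, 10, 10, 10]

print(nps(before), nps(after))    # 20.0 20.0 -> NPS sees no change
print(mean(before), mean(after))  # 6.9 7.8  -> mean rating improved
```

Reporting the mean or the full score distribution alongside NPS is one way to recover the signal the bands throw away.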

There is also a geographic calibration problem: Japanese and Korean users report negative NPS for products they actually like (averages of -47 and -11, respectively). US raters weight quality; UK raters weight ease; Dutch raters weight innovation. A global NPS figure without geographic segmentation is nearly uninterpretable.
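A minimal sketch of geographic segmentation, assuming each response arrives as a (region, score) pair. The function names and the sample data are hypothetical and only illustrate how a blended figure can hide very different regional scores.

```python
from collections import defaultdict


def nps(scores):
    """Return NPS (-100 to +100) for a list of 0-10 survey responses."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


def nps_by_segment(responses):
    """Group (segment, score) pairs and compute NPS per segment."""
    groups = defaultdict(list)
    for segment, score in responses:
        groups[segment].append(score)
    return {segment: nps(scores) for segment, scores in groups.items()}


# Hypothetical responses tagged by region.
responses = [
    ("US", 10), ("US", 9), ("US", 6), ("US", 8),
    ("JP", 7), ("JP", 6), ("JP", 8), ("JP", 5),
]

print(nps([score for _, score in responses]))  # blended: -12.5
print(nps_by_segment(responses))               # {'US': 25.0, 'JP': -50.0}
```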

When not to use NPS

NPS works reasonably well as a directional trend tracker for high-volume B2C products with a clear recommendation moment. It underperforms or breaks down in several cases:

B2B SaaS with complex buying committees. The champion who fills out the survey is often not the economic buyer. A promoter champion at a company about to downgrade is noise. Average B2B NPS is 29 versus B2C software at 47; the gap reflects structural measurement error as much as product quality differences.

Low-volume enterprise products. Under a few hundred responses per segment, the number is statistical noise shaped like a metric.

Agentic and AI-native products. When the product is an agent completing work autonomously, there is no natural “recommend to a friend” moment. Users judge agents by task completion, error rate, and the right level of autonomy, not brand advocacy. NPS measures brand perception; for agents, task completion rate and effort score are more direct signals.

Early-stage products. NPS with fewer than a few hundred active users is noise. Run interviews instead.
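To put a rough number on "noise", the sketch below estimates a 95% margin of error for NPS using a normal approximation: each response is coded +1 (promoter), 0 (passive), or -1 (detractor), and the standard error of the mean of those codes is scaled to NPS points. The function name and the sample counts are hypothetical, and the approximation is itself shaky at very small sample sizes.

```python
import math


def nps_margin_of_error(promoters, passives, detractors, z=1.96):
    """Approximate margin of error, in NPS points, at the given z-value."""
    n = promoters + passives + detractors
    if n == 0:
        raise ValueError("need at least one response")
    p = promoters / n
    d = detractors / n
    # Variance of a variable coded +1 / 0 / -1: E[X^2] - (E[X])^2.
    variance = (p + d) - (p - d) ** 2
    return 100 * z * math.sqrt(variance / n)


# Same hypothetical mix (50% promoters, 30% passives, 20% detractors,
# so NPS = 30) at two sample sizes.
print(round(nps_margin_of_error(25, 15, 10), 1))     # n = 50
print(round(nps_margin_of_error(500, 300, 200), 1))  # n = 1000
```

Under these assumptions, an NPS of 30 from 50 responses carries a margin of roughly 22 points either way, against roughly 5 points at 1,000 responses, which is why small-segment comparisons rarely mean much.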

A decision tool: NPS vs. the alternatives

Signal you need	Better tool
Willingness to recommend, trend direction	NPS
Friction in a specific flow	CES (Customer Effort Score): 94% of low-effort users intend to repurchase vs. 4% of high-effort users
Post-interaction satisfaction	CSAT
Actual retention health	Cohort retention curves
AI and agent task quality	Task completion rate, error recovery rate, effort score

CES is worth naming explicitly in interviews. The research finding of 94% repurchase intent after low-effort interactions versus 4% after high-effort ones makes effort a far stronger predictor of retention than NPS, and almost no one reaches for it first.

How to use NPS in a metrics interview

Strong answer

"NPS measures brand loyalty via the recommend-to-a-friend question, scored 0-10. Promoters (9-10) minus Detractors (0-6) gives a -100 to +100 range. For B2B SaaS, a 29-35 is typical; 50+ is genuinely strong. I use NPS as a directional signal and trend tracker, not a north star. Three reasons I don't rely on it alone: first, the detractor band runs 0-6, so a user who scores 2 and one who scores 6 are treated identically, which loses signal. Second, it's a lagging indicator: by the time NPS drops, churn has usually already started. Third, for B2B SaaS it correlates poorly with expansion revenue because your champion promoter often isn't the economic buyer. I'd pair NPS with retention cohorts, feature adoption rates, and qualitative detractor interviews. For AI products specifically I'd replace or supplement NPS with task completion rate and effort score, because there's rarely a natural 'recommend this agent to a friend' moment."

Weak answer

"NPS stands for Net Promoter Score. You ask users to rate 0-10 how likely they are to recommend your product. Promoters are 9-10, passives 7-8, detractors 0-6. You subtract detractors from promoters. A good score is above 50."

This fails on three counts: it's purely definitional (any candidate who Googled for five minutes can say this), it shows no critical thinking about when NPS misleads, and it ignores the 2026 context where NPS's credibility as a primary metric has significantly eroded. Interviewers at companies like Stripe, Google, and Anthropic use NPS questions to test whether you'll critically evaluate a metric or just recite it.

The viable/lovable problem with NPS

NPS measures willingness to recommend, a rough proxy for perceived value. That is a viability signal: users recommend what they believe is worth the cost. It says almost nothing about whether the product is lovable, meaning whether it meets people where they work, anticipates their needs, and reduces the effort of getting something done. A product can score NPS 60 because it is the only option in the market. A product users genuinely rely on and would miss can score NPS 30 because it is hard to evangelize to non-users. The score is downstream of decisions made long before the survey was sent.

For related concepts, see churn, retention, and DAU/MAU.