ai pm · thesis

When not to use AI: killing an idea as the senior move

Updated Jun 2026 · Calibrated to the strong-hire bar

Killing an AI feature on purpose is the highest-signal PM behavior in a 2026 interview. Not because the idea failed in testing, not because engineering said it was too hard, but because you ran the calculus early and saw that it would not clear the viability and lovability bars. That judgment is what interviewers on senior AI PM panels are testing when they ask “tell me about a product decision you reversed” or “when would you not ship an AI feature?” Junior PMs answer with a memorized list of AI limitations. Senior PMs answer with a specific decision and the exact signal that triggered it.

Why kill decisions matter more now

In 2026, feasibility is essentially free. If you can describe an AI feature precisely enough, you can build it. That collapse has moved the real gate: the question is no longer “can we build it?” It is “will anyone pay, and will anyone stay?”

This shows up in the numbers. AI initiative abandonment jumped from 17% to 42% in a single year as CFOs ended the blank-check period for AI innovation. A pilot without a defined path to scale is deferred waste. Stopping an AI idea before it consumes build capacity is sound economics, not risk aversion. Interviewers at frontier labs know this and now explicitly test for restraint.

Three failure modes, named precisely

An AI feature can fail viability for three distinct reasons. The interview answer is stronger when it names the specific mode rather than gesturing at “it wasn’t ready.”

Relevance failure. The feature is not tied to a real, paying problem. Users engage in a demo but have no existing spend category that maps to the outcome the AI produces. The kill signal: no workarounds, no adjacent tool spend, no waiting list with skin in the game.

Realism failure. The feature cannot be economically maintained at the accuracy bar the use case requires. Token cost per query at the required context length exceeds the margin that the user’s lifetime value can support. Or: the accuracy bar for trust in this domain is higher than any model reliably clears, and the errors erode trust faster than the feature builds it. Either way, drift monitoring and accuracy evals become a permanent cost center that cuts into gross margin.

Practicality failure. The org cannot operationally support what the feature needs to keep from drifting. The feedback loop requires annotation capacity, eval pipelines, or human review that the team does not have. A feature that degrades silently because no one is running evals costs trust and generates support load with no path to correction.
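The three failure modes above can be sketched as a small back-of-envelope check. This is a minimal illustration, not a standard framework: every field name, threshold, and number below is a hypothetical assumption, and it simplifies "margin available from lifetime value" to a monthly per-seat budget. It shows how token cost per seat is compared against the AI budget that a target gross margin leaves, and how each kill signal maps to one named mode.

```python
from dataclasses import dataclass


@dataclass
class FeatureEconomics:
    """Hypothetical per-seat inputs; all fields are illustrative assumptions."""
    queries_per_seat_month: int   # expected monthly query volume per seat
    tokens_per_query: int         # prompt + completion tokens at the required context length
    price_per_1k_tokens: float    # blended model price, in dollars
    seat_price_month: float       # what the customer pays per seat per month
    target_gross_margin: float    # 0.70 means 70% of seat price must remain as margin
    other_cogs_per_seat: float    # non-AI cost to serve one seat per month


def token_cost_per_seat(e: FeatureEconomics) -> float:
    """Monthly model spend for one seat."""
    return e.queries_per_seat_month * e.tokens_per_query / 1000 * e.price_per_1k_tokens


def ai_budget_per_seat(e: FeatureEconomics) -> float:
    """Monthly dollars left for AI cost after the margin target and other costs."""
    return e.seat_price_month * (1 - e.target_gross_margin) - e.other_cogs_per_seat


def kill_signals(
    e: FeatureEconomics,
    has_existing_spend: bool,   # workarounds, adjacent tool spend, or a committed waitlist
    eval_accuracy: float,       # measured accuracy on the team's own eval set
    required_accuracy: float,   # accuracy bar users need before they trust the output
    has_eval_owner: bool,       # someone staffed to run evals and human review
) -> list[str]:
    """Return the failure modes the feature trips, in the order the text names them."""
    signals = []
    if not has_existing_spend:
        signals.append("relevance")
    if token_cost_per_seat(e) > ai_budget_per_seat(e) or eval_accuracy < required_accuracy:
        signals.append("realism")
    if not has_eval_owner:
        signals.append("practicality")
    return signals


if __name__ == "__main__":
    # Made-up numbers: 2,000 queries/month at 6,000 tokens each, $0.002 per 1k tokens,
    # on a $60 seat with a 70% margin target and $4 of other cost per seat.
    heavy_seat = FeatureEconomics(2000, 6000, 0.002, 60.0, 0.70, 4.0)
    print(round(token_cost_per_seat(heavy_seat), 2))  # model spend per seat
    print(round(ai_budget_per_seat(heavy_seat), 2))   # budget the margin target allows
    print(kill_signals(heavy_seat, True, 0.97, 0.95, True))
```

With these assumed inputs the model spend per seat exceeds the budget the margin target allows, so the feature trips the realism mode even though accuracy clears its bar. In an interview answer, the useful part is naming which input moved, not the script itself.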

What a strong kill-decision answer looks like

Name the feature specifically, walk through why it initially looked viable and lovable, then identify the exact signal that changed the calculus. End with what you built instead and the metric that confirmed the decision.

strong

"We had an AI-suggested reply feature for our customer support tool. It looked strong early: agents said they felt faster and the tech was straightforward to ship. The kill signal came from unit economics. At the query volume our heaviest support orgs ran, token cost per seat exceeded what we could recover from the price tier those customers were on. We also found the accuracy bar we needed for escalations and refund disputes was higher than the model cleared reliably, so it generated suggestions agents actively distrusted. Distrust is worse than no suggestion: it adds a verification step to every response. We killed the inline suggestion feature and instead shipped a search-and-summarize mode agents invoke explicitly. Cost per query dropped 60% and agent satisfaction improved because the tool was useful when called rather than intrusive by default."

weak

"We decided not to use AI because it wasn't ready yet. Hallucinations were a problem and we didn't think the accuracy was good enough."

This is a caveat list, not a decision. It implies you'd build the feature later without specifying what "ready" means or what metric would clear the bar. It frames the kill as a deferral rather than deliberate judgment. The interviewer hears: this person knows AI limitations in the abstract but has not exercised product judgment about a specific feature's economics or fit.

Lovable is about absence, not just presence

The second bar a killed AI feature fails is lovability. Lovable in 2026 means the product meets users where they are and anticipates needs without being obnoxious. An AI feature that surfaces at the wrong moment in a workflow, adds cognitive load to a task the user already figured out, or interrupts to offer help the user did not want is not lovable, regardless of accuracy.

If the user only needs the outcome once and a deterministic solution is cheaper, faster, and more predictable, the AI feature fails the lovability test by existing. The honest answer to “should a model even be here?” is sometimes no.

The kill is the product decision. Knowing when to make it is what separates a senior AI PM from someone who adds AI because the brief asked for it.

For the viability framework behind these economics, see proving viability. For the feasibility-is-free context that makes kill decisions more frequent, see feasibility is free. For patterns that should be killed on lovability grounds alone, see obnoxious AI antipatterns.