
Should a model even be here?

Updated Jun 2026. Calibrated to the strong-hire bar.

The question that separates a senior AI PM from a capable feature designer is not “which model should we use?” It is “should a model be here at all?” In 2026, when feasibility is effectively free, the gate that technical constraints used to own now belongs to the PM. If you do not run the should-we-even test before you start designing, interviewers at frontier labs will notice, and what gives you away is that you never asked.

Why feasibility being free makes the gate harder, not easier

In 2023, “can we build this with AI?” was a real question. Foundation models were expensive, brittle, and slow. Technical feasibility filtered out a large share of bad ideas before they consumed eng capacity.

That filter is gone. In 2026, you can assemble almost any AI capability in a day or two using frontier APIs and agent frameworks. The collapse of feasibility as a gate has two consequences. First, bad AI ideas reach the build phase far more often. Second, the PM is now the only person in the room whose job it is to run the gate that feasibility used to run automatically.

Viable (real willingness to pay, a market large enough to cover cost and margin) and lovable (meets people where they work, anticipates needs without being obnoxious) are the two filters that cannot be automated away. “Yes, we could build this” answers neither of them. The PM’s job has shifted: you now own both filters explicitly.

The four gates

Run these in order before any feature design work starts. If a problem fails an early gate, you do not need to evaluate the later ones.

Gate 1: Problem type. Can you solve this with a well-designed form, a filter, or a lookup table? If yes, you do not have a model-worthy problem. AI earns its place on problems that are ambiguous and context-dependent, where human judgment currently does not scale. An AI-powered onboarding assistant for a B2B tool where the actual problem is a confusing UI does not clear this gate. The correct fix is better UI. A model explaining bad UI is still bad UI with overhead.

Gate 2: Error cost asymmetry. The question is not just “how often will the model be wrong?” It is “what does wrong cost, and is it reversible?” A wrong movie recommendation is low-stakes and correctable. A wrong dosage instruction is not. High error cost plus low reversibility is a disqualifier. The model does not belong in the decision path; it can belong in the research path, with a human making the call.

Gate 3: Trust trajectory. Autonomous action by an AI system requires users to have a working mental model of its behavior. Before that model exists, suggestions requiring confirmation are the correct starting point, not because of technical limits, but because of adoption physics. If your design assumes the level of trust that takes six months to establish, the feature will fail in the first three months regardless of accuracy. Gate 3 asks: where are users on the trust curve today, and does the interaction you are designing match that position?

Gate 4: Cost structure at scale. API costs, latency, prompt versioning, and model update maintenance are non-zero, and they compound. For any problem a lookup table can actually solve, at 1 million queries per day it beats an LLM call on margin, latency, and reliability simultaneously. Before writing a model into any design, know the per-query cost at the scale the business needs to hit margin. If the number does not work, no amount of product polish fixes it.
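The Gate 4 arithmetic can be sketched in a few lines. This is a minimal illustration, not a pricing model: the token counts, per-million-token prices, revenue per query, and target margin are all hypothetical placeholders, and QueryCostModel, daily_cost, and clears_gate_4 are names invented for this example. It covers API cost only and ignores latency, prompt versioning, and maintenance, which the gate also counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryCostModel:
    """Hypothetical per-query inputs. Every number passed in is a placeholder."""
    input_tokens: int         # tokens sent to the model per query
    output_tokens: int        # tokens generated per query
    usd_per_m_input: float    # assumed price per 1M input tokens
    usd_per_m_output: float   # assumed price per 1M output tokens

    def cost_per_query(self) -> float:
        """API cost of a single query in USD."""
        total = (self.input_tokens * self.usd_per_m_input
                 + self.output_tokens * self.usd_per_m_output)
        return total / 1_000_000


def daily_cost(model: QueryCostModel, queries_per_day: int) -> float:
    """API spend per day at a given query volume."""
    return model.cost_per_query() * queries_per_day


def clears_gate_4(model: QueryCostModel,
                  revenue_per_query: float,
                  target_margin: float) -> bool:
    """True if the per-query cost leaves at least target_margin of revenue."""
    if revenue_per_query <= 0:
        return False
    margin = (revenue_per_query - model.cost_per_query()) / revenue_per_query
    return margin >= target_margin


if __name__ == "__main__":
    # Placeholder numbers chosen only to make the arithmetic easy to follow.
    m = QueryCostModel(input_tokens=1500, output_tokens=300,
                       usd_per_m_input=2.0, usd_per_m_output=8.0)
    print(f"cost per query: ${m.cost_per_query():.4f}")
    print(f"daily cost at 1M queries: ${daily_cost(m, 1_000_000):,.0f}")
    print("clears gate at $0.01/query:", clears_gate_4(m, 0.01, 0.70))
    print("clears gate at $0.05/query:", clears_gate_4(m, 0.05, 0.70))
```

With these placeholder inputs the same feature fails the gate at one cent of revenue per query and clears it at five, which is the point of the exercise: the answer depends on the business's numbers, not on the model's quality.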

Five seconds of boring thinking

Before writing a model into any design, spend five seconds asking: “What is the simplest non-AI solution to this specific user need, and why is it insufficient?” If you cannot answer “why insufficient,” you do not have a model-worthy problem. This is the tell interviewers listen for. The strongest signal a senior AI PM sends, unprompted, is naming a product decision where they chose a lookup table or a simple rule over a model because it was a tenth of the cost and more reliable. That sentence carries scar tissue. It tells the interviewer you have shipped, not just studied.

The interview trap: the premise already contains a model

A common setup in AI PM interviews: the prompt already contains a model. “Design an AI feature for X.” “How would you improve Y using AI?” Weak candidates accept the premise and start designing. Strong candidates surface it.

The correct move is to name the assumption before answering it. Say out loud that you are checking whether a model is the right tool before designing with one. Interviewers at companies like OpenAI, Anthropic, Google DeepMind, and Meta now explicitly weigh this instinct as heavily as the design itself. Their engineering teams are full of people who can build. They need PMs who prevent unnecessary builds.

Strong answer

"Before I design anything, I want to check whether a model is the right tool here. Let me run four gates quickly. Problem type: is this ambiguous and context-dependent, or can a filter or lookup handle it? Error cost: if the output is wrong, what does that cost and can the user correct it? Trust trajectory: where are users today on trusting autonomous AI action, and does the interaction I'd design match that? Cost structure: what is the per-query cost at the scale we need? If those four pass, a model earns its place. If gate one fails, I'd use a rule-based filter and move on. If gate two fails with high error cost and low reversibility, the model can inform a human decision but should not make it. I would state clearly which gate the problem clears or fails before touching any feature design, and if a model does belong here, I'd name the fallback for when it gets it wrong."

Weak answer

"We could use a model to handle this. I'd probably start with GPT-4o and iterate on the prompt."

This accepts the premise, skips the gate, and jumps to implementation. It signals the candidate has not been burned by a bad model-inclusion decision. The interviewer hears: this person treats AI as the default tool and determinism as the exception. That is the wrong orientation for a PM role at a company where eng can already build anything you can describe.

What the call looks like

The call is binary. Either the problem clears all four gates or it does not. A strong answer ends with a stated reason in one sentence: “This is a deterministic filtering problem; a model would add latency and cost without adding accuracy, so I’d use a rule-based filter.” Or: “This is ambiguous enough that human judgment does not scale and the error cost is low and recoverable, so a model earns its place here.” Then, if you decide a model belongs, name the fallback for when the model fails. That last step is what separates people who have shipped from people who have read about shipping.
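The whole sequence, from the first gate to the one-sentence call, can be written down as a short decision function. This is a sketch under stated assumptions rather than a scoring tool: the Problem fields, the Call type, and should_a_model_be_here are hypothetical names, each gate is reduced to a yes/no judgment the PM supplies, and treating a missing fallback as an error is one possible reading of the rule above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Problem:
    """Hypothetical PM judgments, one per gate. Field names are illustrative."""
    solvable_by_rule: bool       # Gate 1: a form, filter, or lookup table suffices
    error_cost_high: bool        # Gate 2: a wrong output is expensive
    error_reversible: bool       # Gate 2: the user can undo or correct it
    design_matches_trust: bool   # Gate 3: interaction fits users' current trust
    unit_cost_works: bool        # Gate 4: per-query cost fits margin at scale
    fallback: Optional[str] = None  # what happens when the model is wrong


@dataclass(frozen=True)
class Call:
    model_earns_place: bool
    failed_gate: Optional[int]   # None when all four gates pass
    reason: str                  # the one-sentence stated reason


def should_a_model_be_here(p: Problem) -> Call:
    """Run the gates in order and stop at the first failure."""
    if p.solvable_by_rule:
        return Call(False, 1, "Deterministic problem; use a rule-based filter "
                              "or lookup instead of a model.")
    if p.error_cost_high and not p.error_reversible:
        return Call(False, 2, "High error cost and low reversibility; the model "
                              "can inform a human decision but not make it.")
    if not p.design_matches_trust:
        return Call(False, 3, "The design assumes more trust than users have "
                              "today; start with suggestions that need confirmation.")
    if not p.unit_cost_works:
        return Call(False, 4, "Per-query cost does not work at the scale "
                              "needed to hit margin.")
    if p.fallback is None:
        # Assumption: a passing call is incomplete without a named fallback.
        raise ValueError("All gates pass, but no fallback is named for when "
                         "the model is wrong.")
    return Call(True, None, "Ambiguous, low and recoverable error cost; a model "
                            f"earns its place. Fallback: {p.fallback}.")


if __name__ == "__main__":
    # Hypothetical example: the onboarding assistant whose real problem is UI.
    confusing_ui = Problem(solvable_by_rule=True, error_cost_high=False,
                           error_reversible=True, design_matches_trust=True,
                           unit_cost_works=True)
    call = should_a_model_be_here(confusing_ui)
    print(call.failed_gate, "-", call.reason)
```

The early returns mirror the instruction to stop at the first failed gate, so a problem that fails Gate 1 is never evaluated on cost or trust.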

For the economics that underpin gate four, see “feasibility is free.” For what to do when you already have a model in production and it outputs something wrong, see “when the AI is wrong.” For the kill-decision framework when a model was already in the design and should come out, see “kill the AI idea.”