How would you price an AI product without killing margin?

Q: How would you price an AI product without killing margin?

The AI PM pricing question. Token economics, the autonomy-attribution matrix, and margin levers, structured as a spoken interview answer.

This question is not a pricing 101 check. In 2026, with feasibility close to free, interviewers at AI-first companies are testing whether you understand that pricing discipline is the product strategy. A PM who cannot hold the unit economics conversation is a PM who will ship features that erode the business’s ability to keep building. The tell is whether you reach for the cost stack before you reach for a model name.

Structure a strong answer

strong

"Before naming a price, I'd clarify three things in under a minute: What segment (consumer, SMB, enterprise)? What is the core value unit (time saved, tasks completed, revenue generated)? And is this a standalone AI product or a feature layered onto an existing product? Those answers constrain everything that follows.

Then I'd map the true cost stack, not the list price. The visible user-facing tokens are 10 to 20 percent of actual inference cost. Fully-loaded COGS includes internal reasoning tokens, retrieval and RAG calls, system prompt context, and infra overhead (roughly 10 to 15 percent of inference spend on top). In agentic products, internal token consumption runs 50 to 90 percent of total usage. This is the Token Iceberg: what users see is the surface. ICONIQ's January 2026 data puts inference at an average of 23 percent of revenue at scaling-stage AI companies. I'd model that number explicitly before setting any floor.

To choose a pricing model I'd use an autonomy-attribution matrix. Two axes: how much does the AI act without human confirmation (autonomy), and how directly can you trace economic value back to an AI action (attribution)? Low-low maps to seat-based pricing (familiar, predictable, misaligned with actual value created). High-high maps to outcome-based pricing: Intercom Fin at $0.99 per resolved ticket, Salesforce Agentforce's credit-per-completed-task model. Harvey charges a fixed monthly fee per AI agent. GitHub Copilot moved to usage-based billing in June 2026. Middle-ground products go hybrid: base subscription covering a token or action allowance, with overage priced per unit. 92 percent of AI software companies now use mixed models. Traditional SaaS captures 10 to 20 percent of created value; AI products priced on outcomes capture 25 to 50 percent.

For the floor: fully-loaded unit cost divided by target gross margin. ICONIQ 2026 shows the average AI product gross margin at 52 percent, up from 41 percent in 2024 and 45 percent in 2025, versus 80 to 90 percent for traditional SaaS. Below 60 percent, the LTV:CAC math institutional investors use to evaluate B2B software breaks down. For a concrete illustration: a $80 per month SaaS seat that adds AI inference absorbs roughly $15 in direct variable cost, dropping gross margin from 80 to around 65 percent. Price the AI tier wrong, and that falls below 60 percent at heavy usage.

For willingness-to-pay, I'd run 15 to 20 customer interviews using Van Westendorp: acceptable, expensive, prohibitively expensive. Price near 'expensive,' not 'prohibitively expensive.'

Then I'd design for margin preservation using three levers PMs control directly. Model routing: route 80 percent of queries to smaller, cheaper models and escalate to frontier only when complexity requires it. Prompt caching: major APIs offer a 90 percent discount on cached context; for products with long system prompts or repeated retrieval, this is significant. Usage tier design: at maximum consumption within any tier, margins must stay above 55 percent. Free tiers that generate negative-margin load are the anti-pattern. Combined, routing plus caching can cut inference cost 50 to 70 percent without measurable quality loss at median usage.

Finally, stress-test the model. Does pricing survive a 20 percent provider cost increase? Are the heaviest users profitable? What happens if a third-party API embedded in the workflow raises rates? A concrete recommendation: hybrid base subscription with a token or action allowance, overage at X per unit, targeting 60 percent gross margin at median usage and 70-plus percent at scale as model costs compress. The success metric is gross margin per customer cohort, not blended margin."

weak

"I'd use value-based pricing. We'd research what customers are willing to pay and price accordingly, then benchmark against competitors." This answer skips the cost floor entirely. Interviewers at AI companies will immediately probe the token economics and expose the gap. Picking per-seat without naming the autonomy-attribution trade-off, ignoring that power users can become margin-negative at high usage, or defaulting to cost-plus ("figure out our cost and mark it up 30 percent") all signal that the candidate has not connected pricing to the economics that determine whether the product can keep being built. Naming only tiering as a margin lever, without mentioning model routing or prompt caching, signals a shallow understanding of what a PM actually controls.

The Token Iceberg: why weak answers fail on first follow-up

The most common failure is treating the user-visible interaction as the unit of cost. A typical chat turn appears to cost 200 to 300 tokens. The true fully-loaded cost is often 10 times that once internal reasoning, retrieval, system prompts, and infra overhead are counted. In agentic products, internal token consumption is 50 to 90 percent of total usage. Interviewers at companies building agentic products (Salesforce, Harvey, Cursor, Sierra) will probe this directly. Name the Token Iceberg unprompted. It signals you have shipped something real, not just theorized about it.

The autonomy-attribution matrix as a live decision tool

The matrix has two axes. Autonomy: how much does the AI act without human confirmation? Attribution: how directly can you trace economic value back to an AI action? Low-low maps to seat pricing (safe, predictable, misaligned with value created). High-high maps to outcome pricing (highest value capture, but you absorb cost variance, so cost floor math must be precise). Middle zones map to hybrid and workflow-based models. Use this framing explicitly in the interview room. It shows you have a decision tool, not a memorized list of examples.

The 60 percent gross margin threshold

ICONIQ’s January 2026 data shows the average AI product gross margin at 52 percent, up from 41 percent in 2024 and 45 percent in 2025. Traditional SaaS runs at 80 to 90 percent. The 60 percent threshold matters because below it, the LTV:CAC math that institutional investors apply to B2B software breaks. Gross margin target zones by product type: AI-native products should target 50 to 60 percent; AI-enabled products 60 to 79 percent; AI-augmented products around 80 percent. Naming these zones shows awareness of where your product sits in the spectrum and what economic targets are realistic versus aspirational.

Margin preservation levers PMs control

Most candidates name tiering as the only lever. Strong candidates name three, and explain why each is a PM-level decision (not just an engineering optimization):

Model routing: the PM advocates for routing 80 percent of queries to smaller, cheaper models at the architecture stage. This is a product decision because it involves tradeoffs on quality, latency, and edge-case handling that PMs must own.
Prompt caching: a 90 percent discount on cached context with major providers. For products with long system prompts or repeated retrieval context, this is a meaningful structural cost lever, and the PM needs to know to push for it during planning.
Usage tier design: every tier must be profitable at its consumption ceiling, not just at median usage. Defining that ceiling is a PM decision. The anti-pattern is free tiers that generate negative-margin load from high-intent users.

The 2026 context

In 2026, feasibility is table stakes. An LLM-powered feature that took six months of ML work in 2022 ships in a sprint. This question is actually testing whether you understand that pricing discipline has become the product strategy itself. With feasibility commoditized, the companies that win identify viable problems (markets willing to pay, margins that support continued building) and build products people genuinely want to use, meeting users where they work and completing tasks with the right level of AI involvement. The PM who cannot hold the unit economics conversation is a PM who will ship features that erode the business’s ability to keep building. Treat pricing not as a decision made after the product is built, but as a constraint that shapes architecture and defines what “success” means.