
Product backlog definition

An ordered list of everything the team might build, used to sequence work against outcomes rather than accumulate requests.

Updated Jun 2026 · Calibrated to the strong-hire bar

A product backlog is not a to-do list. The Scrum Guide defines it as “an emergent, ordered list of what is needed to improve the product” and the single source of work for the Scrum team. In practice, most backlogs drift into request dumps: long, unsorted queues that no one trusts and no one trims. In 2026, with AI making most builds fast and cheap, the backlog’s core job has shifted from sequencing good work to blocking bad work. Every item should answer two questions before it ranks: is this a problem someone is willing to pay to have solved (viable), and will this solution make their life meaningfully better, not just marginally different (lovable)?

Product backlog vs sprint backlog

The product backlog contains everything the product might do, in priority order, with no fixed time boundary. The sprint backlog is the subset the team pulls from the top and plans to deliver within a single sprint. The product owner is accountable for the product backlog, including its order. The developers own the sprint backlog for the duration of the sprint. Interviewers test this distinction directly; conflating the two signals a surface-level understanding of how Scrum is structured.

Grooming (refinement): what actually happens

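Backlog refinement is not a formal Scrum event. The Scrum Guide treats it as an ongoing activity. Editions up to 2017 suggested it usually consumes no more than 10% of the team’s capacity; the 2020 edition dropped that figure, though many teams still use it as a rule of thumb. In practice, refinement means regular sessions where the team estimates, splits, and clarifies items before they reach sprint planning.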

In a healthy refinement:

  • Items at the top have clear acceptance criteria and a shared size estimate.
  • Items that no longer connect to the current outcome are deleted, not deprioritized.
  • New requests are scored against existing commitments, not just appended to the bottom.

The bottleneck is usually the input. Backlogs balloon when PMs say yes to requests instead of routing them through a viability question first.
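The refinement checks above can be sketched as a small pass over an ordered backlog. This is a minimal illustration, not a prescribed tool: the `Item` fields, the `refine` function, and the `ready_depth` cutoff for "the top" are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    # Hypothetical backlog item; field names are assumptions for illustration.
    title: str
    outcome: Optional[str] = None  # the outcome this item is meant to move
    acceptance_criteria: list[str] = field(default_factory=list)
    estimate: Optional[int] = None  # shared size estimate, e.g. story points


def refine(backlog: list[Item], current_outcome: str, ready_depth: int = 3):
    """Return (kept, deleted, not_ready) for an ordered backlog."""
    # Items that no longer connect to the current outcome are deleted, not parked.
    kept = [i for i in backlog if i.outcome == current_outcome]
    deleted = [i for i in backlog if i.outcome != current_outcome]
    # Only the top of the list needs acceptance criteria and an estimate.
    not_ready = [
        i for i in kept[:ready_depth]
        if not i.acceptance_criteria or i.estimate is None
    ]
    return kept, deleted, not_ready


backlog = [
    Item("Export to CSV", "activation", ["File downloads as UTF-8 CSV"], 3),
    Item("Dark mode", "delight"),
    Item("Onboarding checklist", "activation"),
]
kept, deleted, not_ready = refine(backlog, current_outcome="activation")
print([i.title for i in kept])       # ['Export to CSV', 'Onboarding checklist']
print([i.title for i in deleted])    # ['Dark mode']
print([i.title for i in not_ready])  # ['Onboarding checklist']
```

The pass returns the deleted items instead of silently dropping them, so the team can see what was cut and why.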

Prioritization frameworks: choose by constraint

RICE (Reach × Impact × Confidence / Effort): best for continuous discovery, where you are comparing diverse items. Originated at Intercom. Confidence is the most underused lever: a 50% confidence halves the item’s score and forces you to name what you do not know.
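A minimal sketch of the RICE arithmetic, showing how confidence scales the score. The function name and the units are assumptions for illustration: reach as users per quarter, impact as a relative multiplier, confidence as a fraction between 0 and 1, and effort in person-months.

```python
def rice_score(reach: float, impact: float, confidence: float, effort: float) -> float:
    """Reach x Impact x Confidence / Effort."""
    if effort <= 0:
        raise ValueError("effort must be positive")
    if not 0 <= confidence <= 1:
        raise ValueError("confidence must be a fraction between 0 and 1")
    return reach * impact * confidence / effort


# Same item, scored with full confidence and with 50% confidence.
print(rice_score(reach=2000, impact=2, confidence=1.0, effort=4))  # 1000.0
print(rice_score(reach=2000, impact=2, confidence=0.5, effort=4))  # 500.0
```

Because confidence is a straight multiplier, an honest 50% cuts the score in half regardless of how large reach and impact look on paper.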

MoSCoW (Must / Should / Could / Won’t): useful for fixed-scope releases or contract-bound delivery. A poor fit for continuous discovery because it implies a release boundary that may not exist.

WSJF (Weighted Shortest Job First): Cost of Delay / Job Duration. Best when delay has a real, quantifiable cost: a regulatory deadline, a competitor window, a seasonal moment.
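As a rough sketch, WSJF ranking divides each job's cost of delay by its duration and sorts in descending order. The job names and numbers below are invented for illustration, and the units are assumed: cost of delay in thousands of dollars per week, duration in weeks.

```python
def wsjf(cost_of_delay: float, job_duration: float) -> float:
    """Cost of Delay / Job Duration."""
    if job_duration <= 0:
        raise ValueError("job_duration must be positive")
    return cost_of_delay / job_duration


# Hypothetical jobs: (cost of delay, duration).
jobs = {
    "Regulatory report": (90, 3),   # 30.0
    "Seasonal promo": (40, 1),      # 40.0
    "Platform rewrite": (100, 10),  # 10.0
}
ranked = sorted(jobs, key=lambda name: wsjf(*jobs[name]), reverse=True)
print(ranked)  # ['Seasonal promo', 'Regulatory report', 'Platform rewrite']
```

The job with the largest cost of delay ranks last here, because its long duration would block the shorter, time-sensitive jobs.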

Value vs Effort: a rough signal, not a ranking. Use it early to cull obvious no-gos, then switch to a more precise lens.

No framework substitutes for talking to users. Scores are estimates, and estimates are almost always optimistic.

How AI changed backlog management in 2026

Tools like Jira Rovo, ClickUp Brain, and Linear AI now handle the mechanical layer of grooming: auto-triaging requests, deduplicating similar items, and surfacing confidence-weighted scores before a human reviews them. Product Owners running AI-assisted refinement report saving up to 10 hours per week.

The structural shift matters more. Backlogs at AI-first companies now include agent tasks alongside user stories: multi-step autonomous workflows that require acceptance criteria covering guardrails, confidence thresholds, and rollback conditions, not just functional behavior. “The agent books the flight” is not a story. It is a starting point for a harder question about what the agent is allowed to do and how a user recovers if it fails.
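One way to make that harder question concrete is a readiness check that treats guardrails, a confidence threshold, and rollback conditions as required parts of the acceptance criteria. This is an illustrative sketch only; `AgentTask` and its field names are hypothetical and do not come from any tool named above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AgentTask:
    # Hypothetical structure; every field name is an assumption for illustration.
    title: str
    functional_criteria: list[str] = field(default_factory=list)
    guardrails: list[str] = field(default_factory=list)           # what the agent may not do
    confidence_threshold: Optional[float] = None                  # below this, hand off to a human
    rollback_conditions: list[str] = field(default_factory=list)  # how a user recovers from failure

    def missing(self) -> list[str]:
        """Name the acceptance-criteria sections that are still empty."""
        gaps = []
        if not self.functional_criteria:
            gaps.append("functional_criteria")
        if not self.guardrails:
            gaps.append("guardrails")
        if self.confidence_threshold is None:
            gaps.append("confidence_threshold")
        if not self.rollback_conditions:
            gaps.append("rollback_conditions")
        return gaps


draft = AgentTask("The agent books the flight")
print(draft.missing())
# ['functional_criteria', 'guardrails', 'confidence_threshold', 'rollback_conditions']

ready = AgentTask(
    "The agent books the flight",
    functional_criteria=["Booking matches the requested dates and route"],
    guardrails=["Never spends above the user's stated budget"],
    confidence_threshold=0.9,
    rollback_conditions=["User can cancel within the airline's free-cancellation window"],
)
print(ready.missing())  # []
```

A task with any gaps stays a starting point for discussion, not a story ready for sprint planning.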

Linear caps the backlog to items plausibly shippable in the next two quarters. Stripe operates similarly. A backlog that extends beyond two quarters is a decision-avoidance mechanism, not a plan.

What clears the bar in the interview

The classic prompt: “Walk me through how you prioritize your backlog.”

weak

"I score every item using RICE and prioritize by the highest score." This treats prioritization as arithmetic rather than judgment. RICE scores are only as good as the estimates behind them, which are almost always political or optimistic. It also skips the upstream step: are these items solving real, paid-for problems at all? Interviewers hear this constantly and it signals a PM who optimizes the list rather than questions it.

strong

"Before touching a framework, I define what winning looks like this quarter: a specific north star movement or milestone. Then I ask which items, if shipped, would demonstrably cause that movement, and which are requests that wouldn't change the number. The second category gets killed or parked. For survivors, I pick the lens that fits the constraint: RICE for continuous discovery; MoSCoW for a fixed release; WSJF when cost of delay is the real variable. I'm explicit about what no framework can tell me: confidence estimates are guesses, and real prioritization requires talking to users. In 2026 I'd add: our AI triage surfaces and scores requests automatically. My job is to audit the top of the list weekly and delete things, not add them."

The clearest seniority signal: does the candidate start with the item list or start with the outcome? A senior PM defines “what would have to be true” before ranking anything. That is what separates a PM who manages a list from one who shapes a direction.

For a full model answer and follow-up questions, see “How do you prioritize your backlog?” For the frameworks referenced here, see the entries on RICE, MoSCoW, and WSJF.