Dogfooding (definition)
Using your own product in real work, not demos, to surface issues before external users encounter them.
Dogfooding is the practice of using your own product as a real work tool, not a demo environment. The term traces to a 1988 internal email from Microsoft manager Paul Maritz to test manager Brian Valentine, titled “Eating our own Dogfood,” challenging him to increase internal usage of LAN Manager. Most interview answers stop there. What actually matters for a PM is the distinction between shallow dogfooding (opening the app, clicking around) and operational dependency (removing the tool would disrupt your own team’s workflow). The second version is a PMF signal. The first is closer to QA.
Why operational dependency is the only version that counts
Shallow dogfooding catches obvious bugs. Operational dogfooding catches whether the product is actually worth using. The difference shows up in what you learn.
When Amazon built AWS, its internal rule was that no team could bypass the public API. Engineers consumed their own infrastructure the same way external customers did. That pressure directly shaped AWS’s design: every sharp edge in the developer experience was felt first by an internal team that had to live with it. The constraint was not “give us feedback.” It was “you cannot escape using this.”
JetBrains ran Junie, their AI coding assistant, internally from December 2024 before any closed beta. The YouTrack team manages YouTrack’s own development inside YouTrack. That is not a policy choice. It is a structural commitment that prevents the team from designing features they do not personally depend on.
Anthropic calls their internal program “antfooding” (employees are the ants). As of 2026, 59% of Anthropic employees use Claude daily, up from 28% a year prior. Productivity gains more than doubled, from roughly +20% to +50%. Code output per developer increased 200%. Task complexity scores rose from 3.2 to 3.8. Most strikingly, 27% of Claude-assisted work involves tasks that would not otherwise have been completed at all. That last number is worth quoting in an interview, because it is not a satisfaction score or a vanity metric. It is a measure of capability expansion.
Anthropic’s product process has shifted accordingly: instead of writing PRDs first, teams build working prototypes with Claude Code in hours, ship them internally, and let usage patterns inform the spec. The prototype becomes the spec; usage becomes the research.
Where dogfooding fails or misleads
Internal users are not representative users. They know the product better than any customer will. They know which workflows to avoid, which edge cases to route around, which error messages to ignore. This suppresses signal on the problems that will hit customers hardest.
Power dynamics make it worse. When a VP uses a product daily, their team reads that behavior as an endorsement, whatever the VP says about wanting candid criticism. Honest negative feedback requires explicit permission and structural separation from the people building the product.
Dogfooding also cannot replace alpha or beta testing. Internal usage tells you whether your team finds the product useful for internal workflows. Beta testing tells you whether target users find it useful for their workflows. For a product built by PMs and tested by PMs, those may align. For a consumer healthcare product built by engineers, they almost certainly do not.
To run dogfooding without suppressing honest signal:
- Collect behavioral data (feature adoption rates, task completion, time on task) rather than relying on self-reported feedback.
- Include people in onboarding situations who have not been briefed on the product’s goals.
- Separate the feedback channel from the product team’s social graph: anonymous logging or a researcher collecting sessions rather than a shared Slack channel where criticism is visible to the people who built the thing.
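The behavioral measures in the first bullet can be computed from a plain event log. The sketch below is a minimal illustration, assuming a hypothetical log format (`user`, `feature`, `task_id`, `event`, and `ts` in seconds); none of these field names come from a real telemetry system, and `behavioral_summary` is an illustrative helper.

```python
from collections import defaultdict
from statistics import median

# Hypothetical event log: each task emits a "start" and, if finished, a "complete".
EVENTS = [
    {"user": "a", "feature": "search", "task_id": 1, "event": "start", "ts": 0},
    {"user": "a", "feature": "search", "task_id": 1, "event": "complete", "ts": 40},
    {"user": "b", "feature": "search", "task_id": 2, "event": "start", "ts": 10},
    {"user": "b", "feature": "export", "task_id": 3, "event": "start", "ts": 100},
    {"user": "b", "feature": "export", "task_id": 3, "event": "complete", "ts": 220},
    {"user": "c", "feature": "search", "task_id": 4, "event": "start", "ts": 5},
    {"user": "c", "feature": "search", "task_id": 4, "event": "complete", "ts": 65},
]


def behavioral_summary(events, total_users):
    """Per-feature adoption rate, task completion rate, and median time on task."""
    users = defaultdict(set)       # feature -> users who touched it
    started = defaultdict(int)     # feature -> number of tasks started
    durations = defaultdict(list)  # feature -> seconds from start to complete
    start_ts = {}                  # task_id -> timestamp of its "start" event

    for e in sorted(events, key=lambda e: e["ts"]):
        feature = e["feature"]
        users[feature].add(e["user"])
        if e["event"] == "start":
            start_ts[e["task_id"]] = e["ts"]
            started[feature] += 1
        elif e["event"] == "complete" and e["task_id"] in start_ts:
            durations[feature].append(e["ts"] - start_ts[e["task_id"]])

    summary = {}
    for feature, n_started in started.items():
        done = durations[feature]
        summary[feature] = {
            "adoption_rate": len(users[feature]) / total_users,
            "completion_rate": len(done) / n_started,
            "median_time_on_task": median(done) if done else None,
        }
    return summary


# total_users counts everyone with access, including people who never used a feature.
SUMMARY = behavioral_summary(EVENTS, total_users=4)
```

In this toy log, one of the three `search` tasks is started but never completed. An abandoned task like that is the kind of signal a feedback survey tends to miss.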
How AI changes the stakes
Traditional dogfooding surfaces bugs and UX friction. AI dogfooding surfaces something harder to find: probabilistic failure.
AI systems are context-sensitive and non-deterministic. A demo runs on a clean context with a cooperative prompt. Real use runs on messy inputs, long context windows, multi-step pipelines, and edge cases that were never in the test set. Hallucinations, latency problems, silent errors in automated pipelines, and brittle behavior on unusual inputs do not appear in controlled pilots. They appear after weeks of real use by people who are actually trying to get work done.
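A rough calculation shows why controlled pilots miss these failures. The sketch below assumes, purely for illustration, that each interaction fails independently with a fixed probability `p`. Real failures cluster around particular contexts, so treat the numbers as an order-of-magnitude guide; the 2% rate and the usage volumes are made up.

```python
import math


def p_at_least_one_failure(p, n):
    """Probability of seeing one or more failures in n independent runs."""
    return 1 - (1 - p) ** n


def runs_needed(p, confidence=0.95):
    """Smallest n for which at least one failure appears with the given probability."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))


# Illustrative numbers only: a 2% failure rate, a 10-run demo, and
# 20 people each doing 15 real tasks a day for 10 working days.
DEMO = p_at_least_one_failure(0.02, 10)
REAL_USE = p_at_least_one_failure(0.02, 20 * 15 * 10)
NEEDED = runs_needed(0.02)
```

Under these assumptions, a ten-run demo surfaces the failure less than one time in five, while two weeks of real use almost certainly does.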
In 2026, feasibility is free. Any PM can ship an AI feature. The question is whether it is viable (people actually use it when it counts) and lovable (it meets people where they work, without being obnoxious). Shallow dogfooding tests neither. Operational dogfooding, where removing the tool would break your own team’s workflow, is the only version that confirms both at the same time.
This is also why sustained internal adoption has become a credibility signal with investors. Some VCs and late-stage investors now include AI authenticity checks in their due diligence, asking founders to show internal adoption data before a funding round closes. Voluntary adoption that spreads organically across functions (including legal, accounting, and growth) to the point where removing the tool would disrupt real work is the genuine signal. Self-reported NPS is not.
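One way to make “spreads organically across functions” measurable is to count the functions outside the building team that show sustained weekly usage. The sketch below is a hypothetical illustration: the record format, the `min_weeks` threshold, and the function names are assumptions, not an investor standard.

```python
from collections import defaultdict

# Hypothetical usage records: (ISO week, function, user)
USAGE = [
    ("2026-W01", "product", "p1"), ("2026-W02", "product", "p1"),
    ("2026-W03", "product", "p1"), ("2026-W04", "product", "p1"),
    ("2026-W02", "legal", "l1"), ("2026-W03", "legal", "l1"),
    ("2026-W04", "legal", "l2"),
    ("2026-W03", "accounting", "a1"),
    ("2026-W01", "growth", "g1"), ("2026-W02", "growth", "g1"),
    ("2026-W03", "growth", "g2"), ("2026-W04", "growth", "g1"),
]


def sustained_functions(usage, builders, min_weeks=3):
    """Functions outside the building team that were active in at least min_weeks weeks."""
    weeks_active = defaultdict(set)
    for week, function, _user in usage:
        if function not in builders:
            weeks_active[function].add(week)
    return sorted(f for f, weeks in weeks_active.items() if len(weeks) >= min_weeks)


# Exclude the team that built the tool; its usage does not show the value generalizes.
SPREAD = sustained_functions(USAGE, builders={"product"})
```

Here accounting tried the tool once and stopped, so it does not count as adoption; only legal and growth clear the threshold.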
Dogfooding vs. alpha and beta testing
These are distinct stages with different purposes.
Dogfooding: internal employees, real work, no assigned scenarios. The goal is operational signal and early failure discovery before controlled testing begins.
Alpha testing: a small closed group, often employees plus invited external users, with specific scenarios or tasks to evaluate.
Beta testing: a broader external group, closer to launch, testing scale, edge cases, and market readiness.
Dogfooding precedes both. It tells you whether the product is coherent enough to put in front of external users at all.
Talking about dogfooding in interviews
The question surfaces in a few forms: “How do you validate product direction internally before launch?”, “Tell me about a time you used your own product to catch something,” or “How do you build a culture of internal feedback?”
The weak answer is a process recitation: “We set up a dogfooding program with a Slack channel and weekly readouts.” The strong answer connects internal usage to a product decision. What did you learn that changed the roadmap? What would you have shipped without that internal signal? What did usage data show that surveys missed?
Weak answer:
"We dogfooded the product before launch. I used it every day and filed bugs. It was really helpful for building empathy with users."
Why it falls short: no signal tracked, and no distinction between surface bugs and PMF signal. It treats dogfooding as a pre-launch checklist item rather than a judgment tool.
Strong answer:
"We ran internal dogfooding, but I tracked two things beyond bug counts: whether adoption was spreading organically to teams outside product and design, and whether people were actually changing their workflows around the tool. The first tells you if the value proposition generalizes beyond the people who built it. The second tells you if the dependency is real. For our AI features specifically, I required real work use, not test scenarios, because hallucinations and latency issues only surface under the variance of actual tasks. We also collected behavioral telemetry rather than relying on feedback surveys, because social pressure inside a team makes self-reported data unreliable. That telemetry is what caught that our power users had built a manual workaround for one workflow, which told us exactly where the product was failing them."
For related validation methods, see A/B testing, feature flags, and MVP. For the broader framing on why viability is the hard problem in 2026, see feasibility is free and proving viability.