AI tools in PM take-home assignments: the company policy matrix
The first question candidates ask about a PM take-home is “can I use AI?” That is the wrong question. The right question is: what does this company’s policy reveal about what they are actually testing, and how do I pass that test? A company that bans AI is testing whether you can produce quality work without leverage. A company that requires it is testing whether your judgment survives when the mechanical work is handled for you. The policy is not a rule about tools. It is a signal about the bar.
The policy spectrum
PM take-home policies cluster into five categories. Most companies never state which one they are, so candidates must infer.
Full ban. The take-home is explicitly designed to detect and penalize AI use. Anthropic is the primary example: its published candidate guidance states that take-homes must be completed without Claude unless the instructions explicitly say otherwise, and that live interviews are always no-AI. This is not soft discouragement. Anthropic has redesigned its take-home test multiple times since early 2026 because Claude kept acing the original version, and the test is now explicitly AI-resistant. When Anthropic reversed its earlier blanket AI ban in hiring (Fortune, July 2025), it replaced the old format with one that requires the kind of specific reasoning a model cannot generate from scratch. Worth noting: Anthropic actively encourages Claude use during the application phase (resume, cover letter). The take-home is the exception, not the rule.
Silent (ask first). The assignment instructions say nothing about AI. This is the most common policy. The correct move is to email your recruiter before you start: “The instructions don’t address AI tool use. Can you clarify whether that’s permitted for this assignment?” Most will answer clearly. If they don’t respond before the deadline, treat the assignment as a full ban and document that you asked.
AI-assisted welcome. Stripe, Shopify, Vercel, and most YC-stage startups operate here. The implicit norm: use what you’d use on the job, and be ready to explain every choice in the debrief. Stripe PMs run internal “AI-staycations,” extended vibe-coding sessions where full AI use is the expectation. Their take-homes skew toward developer-facing problems (APIs, dashboards, developer experience) where tooling judgment is part of the signal. Not using AI here signals you’re out of step with how they work.
AI-expected (written strategy). Some companies state that AI assistance is expected and score the quality of your reasoning, not the absence of tool use. Amazon’s approach is to probe process in the debrief: “What did you use AI for? Where did you override it? Why?” The Amazon intern rescission case involved a candidate who disclosed using an AI overlay tool throughout their entire interview process with no demonstrated independent judgment. The problem was not AI use. It was the absence of a point of view distinct from the model’s output.
AI-required (vibe coding). Google’s PM loop now includes a 45-minute vibe-coding round, confirmed as of April 2026, where AI tools are required, not optional. Adobe includes vibe coding in its take-home assessment. Over 30% of top-50 tech companies include prototyping rounds in PM interviews as of 2026. These rounds score prompting fluency, scope judgment, tradeoff reasoning, and user-centeredness. The AI tool is assumed. Everything else is the interview.
Per-company quick reference
| Company | Policy | What it is actually testing |
|---|---|---|
| Anthropic | Full ban (take-homes); AI encouraged (resume, cover letter) | Can you reason without the tool? The test is designed to be AI-resistant after Claude aced earlier versions. |
| Google | AI-required in vibe-coding round; written rounds vary by team | Prompting fluency plus judgment. The tool is given; your choices are not. |
| Meta | No formal PM policy; SWE added AI-enabled coding round (Oct 2025) | Varies by team. Ask recruiter. Written strategy rounds default to no AI. |
| Amazon | Silent; debrief heavily weights process transparency | Writing quality and reasoning under the Leadership Principles. Rescinded an intern offer over AI overlay use with no demonstrated independent judgment. |
| Stripe | AI-assisted welcome | Developer empathy, viability reasoning, clear writing about tradeoffs. |
| Shopify | AI-assisted welcome; disclosure appreciated | Speed and judgment; they want to see how you work, not that you can work without tools. |
| Figma | Silent; design judgment is the bar | Can you build something users would actually want, not just something that runs? |
| Adobe | AI-required (vibe coding component in take-home) | Same five-dimension rubric as Google’s vibe-coding round. |
| Perplexity | No published policy; AI-native culture | Ask first. Assume disclosure is appreciated. |
| OpenAI | Silent | Contact recruiter. Culture is AI-native but policy varies by role and team. |
What detection looks like from your side
Detection rates of AI-generated interview answers more than doubled, from 15% in June 2025 to 35% in December 2025, per Fabric’s 50,000-plus candidate dataset. Reviewers are not running software. They are doing two things: reading for the tells that mark AI-drafted prose, and asking five follow-up questions in the debrief.
The written tells reviewers catch without a debrief:
- Uniform structure. Every section is the same length. Headers are balanced. The conclusion restates the intro. Real strategic memos are lumpy: some sections are two sentences because there is nothing more to say, others run long because the problem is hard.
- Generic tradeoffs. AI-generated tradeoffs are always clean: “Option A is faster to ship but less scalable.” Real tradeoffs are messier: “Option A requires legal sign-off in two markets and the PM in EMEA hasn’t been looped in.”
- No opinion. The document presents three options and recommends one but never says anything that could be wrong. Real recommendations have a bet in them.
The debrief tell is more reliable than any written signal. The method is simple: ask the candidate to explain one decision made in their submission, then follow up four more times. AI-drafted work collapses under specific follow-up because the decision was never actually made. It was generated. If you used AI to draft any part of your submission (permitted or not), you must be able to walk through every claim from memory, explain why you chose the structure you did, name what you considered and rejected, and give a before/after on any metric you cited.
Written strategy vs. vibe-coding: different norms entirely
A written strategy take-home (PRD, strategy memo, prioritization exercise) and a vibe-coding round have almost no shared norms.
Written strategy take-homes are evaluated on the quality of your judgment: the clarity of your framing, the specificity of your tradeoffs, whether your recommendation would survive a real business review. AI drafting is detectable not by fingerprint analysis but by absence of a point of view. A strong submission has a specific opinion a reviewer could disagree with. An AI-washed submission covers all sides and commits to nothing. AI helps with structure and coverage; it hurts with opinions and specificity.
Vibe-coding rounds are a completely different format. The AI tool is not just allowed; it is the mechanism. Interviewers score across five dimensions: problem framing and scope control, prompting and tool fluency, tradeoff reasoning, user-centered decisions, and communication under pressure. Silence is a scoring event. The most common failure is building too many features and delivering nothing that works end-to-end. See the full vibe-coding round guide for the minute-by-minute structure.
The disclosure script
If asked “did you use AI for this?” in any debrief, the right answer is specific and affirmative, not defensive.
Strong: “Yes. I used [tool] to draft the initial user research synthesis and to stress-test my recommendation against objections I might have missed. I edited both heavily. The prioritization logic and the business case numbers came from my own analysis. Here is where I overrode what the model suggested and why.”
Weak: “I used some AI tools to help organize my thoughts.” (No specificity, no agency, reads as a hedge.)
Catastrophic: denying AI use when you used it, or claiming full AI use with no demonstrated judgment of your own. Both end the process.
The disclosure script works when there is real judgment behind it. The companies worth working for are not trying to catch you using tools. They are trying to catch you substituting the tool’s judgment for yours.
The viable/lovable filter
In 2026, feasibility is free. Any PM with Cursor, Bolt, or Lovable can produce a working prototype in 45 minutes. Take-home AI policies are no longer gatekeeping a capability advantage. They are a viability and lovability filter.
Companies that ban AI are testing whether your output quality holds without mechanical leverage: a proxy for cultural fit. Can you work the way we work? Companies that require AI are testing whether your judgment survives when the mechanical work is handled for you. Did you build something worth building (viable) and does it actually fit how users work (lovable)? A technically complete prototype that solves the wrong problem fails the vibe-coding round just as cleanly as a polished memo that recommends the obvious.
Sundar Pichai noted in April 2026 that 75% of new Google code is AI-generated and engineer-approved. The floor on execution has risen so high that judgment is the only remaining differentiator. The matrix question is not “can I use AI?” It is “what is this company testing now that AI handles the mechanical part, and do I have a real answer?”
Related: “How interviewers catch AI answers” covers the debrief mechanics that surface AI-drafted work. The vibe-coding round guide covers the format, five-dimension rubric, and minute-by-minute structure for rounds where AI is required. “Feasibility is free” gives the 2026 context for why the bar shifted from “can you build?” to “is it viable and lovable?”