technical · standard
"Explain caching to a non-technical executive"
Explain caching to a non-technical executive, and tell me the tradeoff you'd own as the PM.
This question tests whether you can translate engineering concepts into product decisions, not whether you know how Redis works. The interviewer is checking for two things: clarity (can a non-technical exec follow you?) and judgment (do you know which technical choices are actually product choices in disguise?). Both Stripe and Google use this question deliberately: their products touch data where staleness is genuinely consequential, so the answer reveals whether you understand the business cost of being wrong.
Structure a strong answer
Give a one-line analogy, then name the tradeoff you own. Do not stop at the analogy.
strong
"A cache keeps a copy of an answer nearby so we skip recomputing it every time someone asks. Think of it like keeping your most-used spice on the counter instead of walking to the pantry each time you cook. It makes things faster and cheaper, and that speed matters: research shows 100ms of added latency can drop conversions by around 7%. The tradeoff I own as PM is staleness: a cached price or account balance can be out of date. My rule is to cache aggressively for slow-changing content like product images or help articles, and never cache anything where the user's decision depends on it being current, like a payment balance, real-time inventory, or a live auction price. Where to draw that line is a product call, not just an engineering one. In an AI product context, we also cache the computation of long system prompts across thousands of requests. That's where I track cache hit rate as a PM metric, because a drop below 70% tells me we've broken something in our prompt structure."
weak
"Caching uses Redis to store key-value pairs in memory with a TTL." This fails on two counts: it doesn't land with a non-technical exec (jargon without business meaning), and it skips the PM's actual job, which is deciding what to cache and what never to cache. A slightly better but still weak answer gives only the analogy without the tradeoff. That scores well on communication but shows no product judgment. Interviewers at Stripe and Google specifically probe the staleness decision because their products make it genuinely consequential: Stripe touches payment and balance data; Google runs Search index freshness and Maps real-time accuracy.
The interviewer’s rubric
A 3/5 answer gives the analogy and stops. A 4/5 adds the staleness tradeoff with a concrete example. A 5/5 does both and ties the decision to a specific product context. In 2026 that includes AI products, where prompt caching is a cost lever the PM owns directly.
What separates 4/5 from 5/5 is naming the verticals where the call is genuinely hard. Showing a cached bank balance to a user who is mid-transfer is a trust problem, not a latency problem. Showing a stale help article is usually fine. The PM’s job is to hold that distinction explicitly so engineers have a clear rule.
In the feasibility-is-free era, the question is no longer “can we afford to cache this?” It is: “what level of staleness will your users tolerate before trust breaks?” That reframe shifts caching from infrastructure into product strategy, and the interviewer is listening for exactly that shift.
The 2026 angle: prompt caching as a PM decision
In AI products, caching has become a direct unit economics lever that PMs own. When an LLM processes the same long system prompt thousands of times per day, you can cache the computed representation of that prefix and skip re-processing it on each call.
Anthropic charges cached input tokens at 0.1x the base rate (90% off). OpenAI applies automatic caching at 0.5x (50% off). Google Gemini ranges from 0.1x to 0.25x plus storage fees. These are cache-read multipliers: they vary by model and change over time, and Anthropic adds a surcharge on cache writes. A real example: a customer-support agent team dropped their monthly LLM bill from $4,200 to $680 (84% reduction) purely through restructuring prompts to maximize cache hit rate.
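The arithmetic behind those multipliers can be sketched in a few lines. This is a rough model, not a billing calculator: the function names, the $1,000 spend, and the 80% cached share are hypothetical, and it ignores cache-write surcharges, storage fees, and output tokens.

```python
def blended_input_cost(base_cost: float, cached_fraction: float,
                       cached_multiplier: float) -> float:
    """Input cost when a share of tokens is billed at a discounted cache-read rate.

    base_cost:         what the input tokens would cost with no caching
    cached_fraction:   share of input tokens served from cache (0.0 to 1.0)
    cached_multiplier: provider's cache-read rate, e.g. 0.1 for "0.1x"
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    uncached = base_cost * (1.0 - cached_fraction)
    cached = base_cost * cached_fraction * cached_multiplier
    return uncached + cached


def reduction_pct(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100.0


if __name__ == "__main__":
    # Hypothetical workload: $1,000 of input tokens, 80% of them cacheable.
    print(round(blended_input_cost(1000.0, 0.8, 0.1), 2))  # 280.0 at a 0.1x rate
    print(round(blended_input_cost(1000.0, 0.8, 0.5), 2))  # 600.0 at a 0.5x rate
    # The bill change quoted above: $4,200 down to $680.
    print(round(reduction_pct(4200, 680), 1))              # 83.8, i.e. about 84%
```

The same 80% cached share saves 72% at a 0.1x rate but only 40% at 0.5x, which is why the multiplier and the hit rate have to be read together.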
The decision is not always obvious. Prompt caching only delivers ROI for repeat-prefix patterns: agents that reuse a long system prompt, RAG pipelines with a fixed retrieval preamble, code editors that send the same repository context. One-off queries see no benefit. So the viability question is workload shape, not just cost, and that is a PM judgment call.
One operational detail worth knowing: prefix caching depends on an exact match, so a single stray whitespace character in a system prompt changes the prefix and forces a cache miss. That makes prompt versioning a process the PM should own, not a nice-to-have.
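To see why one stray character matters, here is a minimal illustration. It uses SHA-256 as a stand-in for whatever exact-match mechanism a provider uses internally, which is an assumption; the prompts and function names are hypothetical.

```python
import hashlib


def prefix_fingerprint(system_prompt: str) -> str:
    """Fingerprint of the exact prompt text; any character change alters it."""
    return hashlib.sha256(system_prompt.encode("utf-8")).hexdigest()


def matches_approved(system_prompt: str, approved_fingerprint: str) -> bool:
    """True only if the prompt is byte-for-byte the approved version."""
    return prefix_fingerprint(system_prompt) == approved_fingerprint


# Hypothetical prompts: the second has one extra space after "Acme."
APPROVED_PROMPT = "You are a support agent for Acme. Answer politely."
EDITED_PROMPT = "You are a support agent for Acme.  Answer politely."

if __name__ == "__main__":
    approved = prefix_fingerprint(APPROVED_PROMPT)
    print(matches_approved(APPROVED_PROMPT, approved))  # True
    print(matches_approved(EDITED_PROMPT, approved))    # False
```

A check like `matches_approved` could run before each deploy, so an unreviewed prompt edit is caught before it shows up as a falling hit rate.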
If the interviewer digs deeper
Be ready for three follow-up probes.
“What is TTL?” Time to live: how long the cached copy is kept before it expires and forces a fresh lookup. Your answer: “I set TTL based on how fast the underlying data changes and what the cost of showing stale data is to the user. For a product image, days. For a payment balance, zero.”
“How do you handle cache invalidation?” The hard part of caching is knowing when to clear it. Your PM framing: “I think about events that should trigger a refresh, like a confirmed payment, a price change, or a live inventory update. I work with engineering to define those triggers as product requirements rather than leaving them as implementation details.”
“What metric do you track?” Cache hit rate: the share of requests served from the cache. Above 80% is healthy for most workloads. Below 70% signals a structural problem with what you are caching or how your prompts are structured, not a tuning issue. For AI products specifically, a falling hit rate often means the start of your prompt changes between requests (for example, a timestamp or user name placed ahead of the shared instructions) or your team has been editing the system prompt without a versioning process.
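The three probes above fit together in one small sketch: a TTL per data type, an invalidation hook for product events, and a hit-rate counter. This is a toy in-memory cache under stated assumptions, not a production design. `TTL_POLICY`, `TTLCache`, and `hit_rate_status` are hypothetical names, the TTL values are placeholders, and the "watch" label for the 70% to 80% band is an assumption, since the text only defines the two ends.

```python
import time
from typing import Any, Callable, Dict, Tuple

# Hypothetical product policy: TTL in seconds per data type; 0 means never cache.
TTL_POLICY: Dict[str, int] = {
    "product_image": 3 * 24 * 3600,  # days
    "help_article": 24 * 3600,
    "payment_balance": 0,            # always fetch fresh
}


class TTLCache:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[Any, float]] = {}  # key -> (value, expiry)
        self.hits = 0
        self.misses = 0

    def get(self, key: str, kind: str, load: Callable[[], Any]) -> Any:
        ttl = TTL_POLICY[kind]
        if ttl == 0:
            return load()  # never cached, so not counted in the hit rate
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now < entry[1]:
            self.hits += 1
            return entry[0]
        self.misses += 1
        value = load()
        self._store[key] = (value, now + ttl)
        return value

    def invalidate(self, key: str) -> None:
        """Call on a product event (price change, confirmed payment)."""
        self._store.pop(key, None)

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def hit_rate_status(rate: float) -> str:
    """Thresholds from the text; the middle band label is an assumption."""
    if rate > 0.80:
        return "healthy"
    if rate < 0.70:
        return "structural problem"
    return "watch"
```

Keeping the policy in one table is the point of the design: the TTL for each data type is written down as a product rule that engineering can review, instead of being scattered through the code.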
What’s actually scored
Communication and judgment. Can you translate between engineering and the business? Do you know that the staleness decision is a product call, not purely a technical one? In 2026, a candidate who connects this to prompt caching for AI products, with a concrete hit-rate number and cost example, signals the engineering-to-business translation skill that AI PM roles are specifically hiring for. The bar is not knowing how caching works. It is knowing which caching decisions are really product decisions about what your users will trust.