system design · hard

Design WhatsApp: system design for PM interviews

Design WhatsApp.

Updated Jun 2026 · Calibrated to the strong-hire bar

This question is a technical-depth check wrapped in a familiar product. The interviewer is not looking for a distributed systems implementation; they are checking whether you understand WhatsApp’s core privacy contract and can reason about what it allows and forecloses at the product level. The single most common failure mode is treating end-to-end encryption as an infrastructure detail rather than the architectural constraint that every feature decision flows from.

Structure a strong answer

strong

"I'll scope to 1:1 and small group chat, text and media, with offline delivery. I'm treating E2E encryption as a hard constraint because it's the privacy contract that gives WhatsApp its trust with two-billion-plus users. Every other choice flows from that. Should I also cover linked devices or voice/video, or stay focused here?"

Then walk the life of a message using the three tick states as your spine: (1) The sender's client encrypts with the Signal Protocol and assigns a client_msg_id idempotency key before anything touches the network. One grey tick fires when the server persists the encrypted blob to the recipient's inbox. (2) The server tries to push to the recipient's open WebSocket; if the recipient is offline, it sends a best-effort FCM/APNs wake-up. The wake-up is not delivery. Two grey ticks fire when the recipient device acknowledges receipt. (3) Two blue ticks fire when the recipient opens the message. Each tick is a distinct backend event and write path.
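
The tick progression above can be sketched as a small forward-only state machine. This is an illustrative sketch, not WhatsApp's implementation: the names `Tick`, `MessageStatus`, and `on_push_wakeup_sent` are hypothetical, and the assumption that a late or duplicate event is simply ignored is ours.

```python
from enum import IntEnum


class Tick(IntEnum):
    PENDING = 0    # encrypted on the client, not yet persisted by the server
    SENT = 1       # one grey tick: server persisted the encrypted blob
    DELIVERED = 2  # two grey ticks: recipient device acknowledged receipt
    READ = 3       # two blue ticks: recipient opened the message


class MessageStatus:
    """Tracks the tick state of one message; the state only moves forward."""

    def __init__(self) -> None:
        self.tick = Tick.PENDING

    def apply(self, event: Tick) -> bool:
        """Apply a backend event. Returns True if the visible tick changed."""
        if event <= self.tick:
            return False  # stale or duplicate event, e.g. a retried ack
        self.tick = event
        return True


def on_push_wakeup_sent(status: MessageStatus) -> None:
    """An FCM/APNs wake-up is only a hint, so it deliberately changes nothing."""
    return None


status = MessageStatus()
status.apply(Tick.SENT)        # server persisted the blob
on_push_wakeup_sent(status)    # recipient offline: wake-up sent, tick unchanged
status.apply(Tick.DELIVERED)   # recipient device acked
status.apply(Tick.SENT)        # late duplicate is ignored
```

The point worth saying out loud in the interview is the no-op: the push wake-up never advances the tick, only a device acknowledgement does.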

Proactively name the E2E tradeoffs before being asked: server-side full-text search is gone (client-side index only); server-side content moderation is gone (metadata rate analysis and user reports instead); server-side AI processing of message content is gone. When Meta added Meta AI to WhatsApp, it solved this by routing @Meta AI messages to an explicit non-E2E endpoint. The user opts in; the UI surfaces the break in encryption. That is not a hack; it is the only honest solution to the AI-in-chat problem under true E2E.
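
To make "client-side index only" concrete, here is a minimal sketch of an on-device inverted index that is fed plaintext only after local decryption. The class name `LocalSearchIndex` and its methods are hypothetical, and a real client would persist the index in local storage rather than in memory.

```python
import re
from collections import defaultdict


class LocalSearchIndex:
    """Inverted index kept on the device; the server never sees these tokens."""

    def __init__(self) -> None:
        self._postings = defaultdict(set)  # token -> set of message ids

    def add(self, msg_id: str, plaintext: str) -> None:
        """Index a message after it has been decrypted locally."""
        for token in re.findall(r"\w+", plaintext.lower()):
            self._postings[token].add(msg_id)

    def search(self, query: str) -> set:
        """Return ids of messages that contain every term in the query."""
        terms = re.findall(r"\w+", query.lower())
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result
```

The product consequence follows directly: search only covers messages that have reached this particular device, which is why history on a newly linked device can be incomplete.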

On group chat at scale: in a 1024-member group, the sender's client must encrypt a separate copy of its group key for every recipient device. In a Signal-style Sender Keys scheme this happens when the key is first distributed or rotated (for example, after a member leaves) rather than on every message, but it is still client CPU and bandwidth cost, not server fan-out cost. It is one reason E2E group size limits exist. Read receipts in large groups are approximated or batched above roughly 100 members because naive per-user acks mean O(n) writes for every message. Name both facts unprompted and you have separated yourself from most candidates.
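
A back-of-the-envelope version of that arithmetic, as a sketch. The function `group_costs` and the `devices_per_member` parameter are hypothetical, and the model assumes every other member eventually reads the message and ignores the sender's own linked devices.

```python
import math


def group_costs(members: int, devices_per_member: float = 1.0) -> dict:
    """Rough per-sender costs in an E2E group (illustrative model only)."""
    recipients = members - 1  # everyone except the sender
    return {
        # one encrypted key copy per recipient device, produced on the client
        "key_copies": math.ceil(recipients * devices_per_member),
        # one receipt write per recipient if every read is acked individually
        "read_receipt_writes": recipients,
    }


print(group_costs(1024))
print(group_costs(1024, devices_per_member=2))
```

With one device per member, a 1024-member group means 1023 key copies and up to 1023 receipt writes per message; doubling the assumed device count doubles only the key copies.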

On multi-device sync: the server never decrypts anything, so there is no server-side sync. The sender encrypts separately for each registered device in the recipient's linked set. When a new device is linked, the primary phone re-encrypts session keys for it. The server brokers key exchange by relaying public keys but never holds private keys.
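
The per-device fan-out can be sketched as follows. Everything here is hypothetical naming, and `placeholder_encrypt` is explicitly not real cryptography: it is an XOR keystream stand-in so the sketch runs, where a real client would use a per-device Signal Protocol session.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    device_id: str
    ciphertext: bytes


def placeholder_encrypt(device_key: bytes, data: bytes) -> bytes:
    """NOT real cryptography: a toy XOR keystream standing in for a
    per-device Signal session. Applying it twice returns the input."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(device_key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def fan_out(plaintext: bytes, linked_devices: dict) -> list:
    """Client-side: build one envelope per device in the recipient's linked set."""
    return [
        Envelope(device_id, placeholder_encrypt(key, plaintext))
        for device_id, key in linked_devices.items()
    ]


# Hypothetical linked set: device id -> key material known only to the clients.
envelopes = fan_out(b"hello", {"phone": b"k-phone", "laptop": b"k-laptop"})
```

The server's role in this picture is to route each `Envelope` by `device_id`; it has nothing it could decrypt, which is why "sync" has to be a client responsibility.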

Close honestly: "The biggest unresolved tension in 2026 is ambient AI assistance. Thread summarization requires reading plaintext. True E2E and server-side summarization are mutually exclusive. The product resolution is to make AI invocation explicit and consensual, not ambient. That is a viable product choice, not a failure of the architecture."

weak

"We'll use HTTPS to encrypt messages and store them in a database. For scale, we add more chat servers." This conflates transport encryption with end-to-end encryption, which are different things. It also misses that push notification delivery and actual message delivery are not the same event. When asked about spam moderation or AI summarization, saying "the server scans messages before delivery" is architecturally incompatible with E2E and is an immediate credibility loss at any company that ships E2E messaging. The interviewer is not testing implementation depth; they are testing whether you understand what the product's privacy promise actually means in engineering terms.

The PM judgment

Two things separate a strong-hire answer from a pass. First, framing E2E encryption as a product constraint rather than a feature: the server is intentionally blind, and every capability decision (moderation, search, AI, backup) has to be made with that blindness as a given. Second, recognizing the 2026 tension: WhatsApp is now a privacy-contract platform that also needs to carry AI, business messaging, and payments. A viable resolution lets Meta monetize via the Business API without breaking the trust that drives retention. The candidate who frames the design around that tension demonstrates PM thinking. The candidate who produces a generic scalable chat backend demonstrates SWE thinking.

A table-stakes detail interviewers use as a filter: if you do not mention the client_msg_id idempotency key that prevents duplicate messages on mobile network retries, you have signaled unfamiliarity with the basics. It is a small thing that matters.
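
A minimal sketch of how that key prevents duplicates, assuming the server deduplicates on the pair (sender, client_msg_id). The `Inbox` class and `new_client_msg_id` helper are hypothetical names, and an in-memory dict stands in for a durable store.

```python
import uuid


class Inbox:
    """Server-side store of opaque encrypted blobs, deduplicated per sender."""

    def __init__(self) -> None:
        self._seen = {}   # (sender_id, client_msg_id) -> slot in self.blobs
        self.blobs = []   # encrypted payloads; the server cannot read them

    def persist(self, sender_id: str, client_msg_id: str, blob: bytes) -> int:
        """Store the blob once; a retry with the same key returns the same slot."""
        key = (sender_id, client_msg_id)
        if key in self._seen:
            return self._seen[key]  # duplicate retry: acknowledge, do not store again
        self.blobs.append(blob)
        self._seen[key] = len(self.blobs) - 1
        return self._seen[key]


def new_client_msg_id() -> str:
    """Generated once on the client, before the first send attempt."""
    return str(uuid.uuid4())


inbox = Inbox()
msg_id = new_client_msg_id()
first = inbox.persist("alice", msg_id, b"<ciphertext>")
retry = inbox.persist("alice", msg_id, b"<ciphertext>")  # flaky network resend
```

Because the id is minted on the client and reused for every retry, the resend maps to the same slot and the recipient sees the message once.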
