product sense · hard

"Design a news feed" (2026 version)

Design a news feed.

Updated Jun 2026 · Calibrated to the strong-hire bar

In 2026, “design a news feed” is not a ranking question. Feasibility is effectively free: you can build a transformer-based system that maximizes any signal you name. The hard problem is picking the right objective. A feed optimized for time-spent is not viable long-term (advertiser trust erodes, regulatory risk accumulates, creator supply dries up) and not lovable (users report feeling worse after a doomscroll session, trust declines with each session they regret). The interviewer is checking whether you understand that the constraint has shifted from “can we rank this?” to “what should we optimize for, and how do we know when we have gone too far?”

Start with a clarifying question that actually constrains the design

Before touching any architecture, ask one question: is this a social feed (posts from people you follow), a content discovery feed (algorithmic best-of-topic), or a hybrid? The answer changes everything. A social feed is a two-sided market problem where creator reach and consumer relevance are in constant tension. A discovery feed is a cold-start and taste-mapping problem. A hybrid is both. Get the answer before proposing a north star.

Structure a strong answer

State the objective function explicitly. Then build the ranking system from that objective, not the other way around.

strong

"First: is this social, discovery, or hybrid? I will assume hybrid, the most interesting case. My north star is the share of sessions where the user returns within 48 hours having acted on at least one piece of content beyond a passive scroll. That operationalizes long-term retention without optimizing for doomscrolling. It also gives me a failure signal: if the 48-hour return rate is high but content interaction depth is flat, users are coming back out of habit, not because the feed served them.

The ranking system has three explicit layers. Retrieval: a transformer-based dual encoder matches user interest embeddings against content embeddings rather than relying on keyword matching. LinkedIn's generative recommender processes 1,000+ historical interactions as an ordered sequence with causal attention, treating the user's history as a trajectory rather than a bag of signals. Converting raw engagement counts to percentile buckets improved their recall@10 by 15%. That is the retrieval approach I would propose. Ranking: a multi-objective model that jointly predicts the probability of deep engagement (comment, save, share) and predicted regret (would this user later flag or hide this content?). A weighted sum with a regret penalty keeps the feed from optimizing toward engagement-bait. Re-ranking: a diversity pass that enforces creator breadth and topic variety, plus an integrity check. No two consecutive posts come from the same creator, and no cluster of posts reinforces a single viewpoint without a counter-signal.
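A minimal sketch of the ranking and re-ranking layers described above, to make the regret penalty and the same-creator rule concrete. The `Candidate` fields, the `REGRET_WEIGHT` value, and the greedy re-rank strategy are all illustrative assumptions, not a description of any production system; the two probabilities are assumed to come from an upstream model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    post_id: str
    creator_id: str
    p_deep_engagement: float  # assumed model output: P(comment, save, or share)
    p_regret: float           # assumed model output: P(user later hides or flags)


REGRET_WEIGHT = 2.0  # hypothetical penalty weight; would need tuning


def score(c: Candidate, regret_weight: float = REGRET_WEIGHT) -> float:
    """Weighted sum: reward deep engagement, penalize predicted regret."""
    return c.p_deep_engagement - regret_weight * c.p_regret


def rerank_no_consecutive_creator(ranked: list[Candidate]) -> list[Candidate]:
    """Greedy diversity pass over an already-ranked list.

    For each slot, take the highest-ranked remaining post whose creator
    differs from the previous slot. If every remaining post is by the same
    creator, fall back to the top remaining post.
    """
    remaining = list(ranked)
    feed: list[Candidate] = []
    while remaining:
        prev = feed[-1].creator_id if feed else None
        idx = next(
            (i for i, c in enumerate(remaining) if c.creator_id != prev), 0
        )
        feed.append(remaining.pop(idx))
    return feed


def build_ranked_feed(candidates: list[Candidate]) -> list[Candidate]:
    ranked = sorted(candidates, key=score, reverse=True)
    return rerank_no_consecutive_creator(ranked)
```

With this scoring, a post with very high predicted engagement but high predicted regret sinks below calmer posts, which is the behavior the regret penalty is meant to produce. The viewpoint counter-signal check is omitted here because it would need viewpoint labels the sketch does not assume.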

Monetization sits inside the auction, not bolted on top. Sponsored posts compete with organic content and must clear a predicted quality threshold. If the ad does not clear the quality floor, it does not show, regardless of bid price. LinkedIn is at roughly 40% sponsored content density in 2026, which is near the ceiling before user trust visibly erodes. I would instrument the quality-floor rejection rate and track it against CSAT on sponsored post interactions.
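The quality-floor rule can be sketched as a filter that runs before any bid comparison. This is a simplified illustration: the `Ad` fields, the `QUALITY_FLOOR` value, and ranking eligible ads by bid times quality are assumptions, and the competition against organic content is left out to keep the floor logic visible.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ad:
    ad_id: str
    bid: float                # advertiser bid, arbitrary currency units
    predicted_quality: float  # assumed model score in [0, 1]


QUALITY_FLOOR = 0.5  # hypothetical threshold


def run_slot_auction(
    ads: list[Ad], quality_floor: float = QUALITY_FLOOR
) -> Optional[Ad]:
    """Return the winning ad, or None if no ad clears the quality floor.

    The floor is applied first, so a high bid cannot buy past it.
    """
    eligible = [a for a in ads if a.predicted_quality >= quality_floor]
    if not eligible:
        return None
    # Assumed ranking rule among eligible ads: bid weighted by quality.
    return max(eligible, key=lambda a: a.bid * a.predicted_quality)


def rejection_rate(
    auctions: list[list[Ad]], quality_floor: float = QUALITY_FLOOR
) -> float:
    """Share of auctions that produced no qualifying ad."""
    if not auctions:
        return 0.0
    rejected = sum(
        1 for ads in auctions if run_slot_auction(ads, quality_floor) is None
    )
    return rejected / len(auctions)
```

Logging `rejection_rate` per day gives the instrumented metric named above; an unfilled slot simply falls back to organic content.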

On AI curation: the atomic unit of a feed is no longer necessarily a post. For news-type content, I would surface a brief AI synthesis of a story cluster with sources visible, and put the original posts one tap away. Reuters Institute 2026 found that news audiences increasingly want 'explain the impact in my life' rather than a list of articles. I would A/B that against raw posts with a 14-day retention metric as the primary readout, not click-through rate.

One regulatory constraint to name before the interviewer does: the EU Digital Services Act requires very large online platforms to offer at least one recommender option that is not based on profiling, which in practice usually means a chronological feed toggle. Meta is testing this now. I would design the chronological toggle from the start, because retrofitting it onto a ranking system built without it breaks the re-ranking integrity layer.
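One way to read "design the toggle from the start" is to keep integrity filtering independent of the ordering step, so switching modes changes only the sort key. The sketch below assumes hypothetical `Post` fields and a precomputed `passes_integrity` flag; it is an illustration of the layering, not a claim about how any platform implements it.

```python
from dataclasses import dataclass
from enum import Enum


class FeedMode(Enum):
    RANKED = "ranked"
    CHRONOLOGICAL = "chronological"


@dataclass(frozen=True)
class Post:
    post_id: str
    created_at: int         # unix seconds
    rank_score: float       # assumed output of the ranking layer
    passes_integrity: bool  # assumed output of an upstream integrity check


def order_feed(posts: list[Post], mode: FeedMode) -> list[Post]:
    """Apply integrity filtering first, then order by the selected mode.

    Because filtering does not depend on rank_score, the chronological
    mode keeps the same integrity guarantees as the ranked mode.
    """
    safe = [p for p in posts if p.passes_integrity]
    if mode is FeedMode.CHRONOLOGICAL:
        return sorted(safe, key=lambda p: p.created_at, reverse=True)
    return sorted(safe, key=lambda p: p.rank_score, reverse=True)
```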

The trade-off I would force a decision on: creator reach versus consumer relevance. LinkedIn's generative recommender boosted native video performance by 69%, but organic post reach for text-only creators dropped 50 to 65%. That is not a side effect; it is a structural outcome of the objective function. If deep engagement signals correlate with video, small text-heavy creators get de-amplified. The feed is a two-sided market, and killing small-creator supply eventually kills content diversity, which eventually kills consumer retention. I would propose a creator health guardrail: track the share of creators in the bottom quartile by follower count who received at least a minimum threshold of organic impressions in the last 30 days. If that guardrail drops below threshold, the ranking objective gets a reach-diversity weight added."
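The creator health guardrail in that answer is simple enough to compute directly. A minimal sketch, assuming a hypothetical `Creator` record with a follower count and 30-day organic impressions; the quartile cut (integer division, at least one creator) and both thresholds are illustrative choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Creator:
    creator_id: str
    followers: int
    organic_impressions_30d: int


def creator_health_rate(creators: list[Creator], min_impressions: int) -> float:
    """Share of bottom-quartile creators (by followers) that reached
    at least min_impressions organic impressions in the last 30 days."""
    if not creators:
        return 0.0
    by_followers = sorted(creators, key=lambda c: c.followers)
    quartile_size = max(1, len(by_followers) // 4)
    bottom = by_followers[:quartile_size]
    healthy = sum(
        1 for c in bottom if c.organic_impressions_30d >= min_impressions
    )
    return healthy / len(bottom)


def needs_reach_diversity_weight(rate: float, guardrail_threshold: float) -> bool:
    """True when the guardrail is breached and the ranking objective
    should get a reach-diversity weight added."""
    return rate < guardrail_threshold
```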

weak

"I'd use machine learning to personalize the feed based on what users engage with, show content from people they're close to, and insert ads every few posts." This fails for four specific reasons:

  • It names the output without specifying the objective function (what is the model predicting, and why that proxy?).
  • "Engage with" is undefined: a like, a 5-second dwell, a share, and a return visit tomorrow are wildly different signals with different optimization implications.
  • "Insert ads every few posts" treats monetization as slot-filling rather than an auction and quality problem, ignoring that at 40% ad density, adding more ads erodes organic signal, not just user sentiment.
  • There is no failure-mode thinking: no filter bubble concern, no creator supply risk, no regulatory constraint.

An interviewer at Meta or LinkedIn in 2026 will immediately ask "what happens to creators with small audiences under your system?" and the candidate who only modeled the consumer side will have nothing to say.

The two-sided market most candidates miss

The consumer side of a feed is what most candidates analyze. The creator side is where answers get separated. If your ranking system de-amplifies small creators in favor of high-engagement video content, you compress the supply of diverse content over time. The feed gets more homogeneous, serendipity drops, and the “saw something unexpected that made me think” sessions that drive long-term retention disappear. The creator health guardrail is not altruistic; it is a supply-chain concern that bears directly on viability.

North star choices and their trade-offs

Four candidates for north star, each with a specific failure mode:

  • Time in feed: easy to game with outrage and autoplay; does not distinguish healthy engagement from regret; regulatory and advertiser risk as a headline metric.
  • DAU/MAU ratio: measures habit, not value; a platform people check out of anxiety has strong DAU/MAU and terrible user sentiment.
  • Content diversity score: protects against filter bubbles but can be gamed by surface-level topic variation while keeping viewpoint narrow.
  • 48-hour return with action: harder to game because it requires the user to voluntarily return and do something intentional. Closest proxy to “time well spent” without relying on self-report. Its weakness: slow feedback loop for A/B tests.
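To show what "48-hour return with action" could look like as a computed metric, here is a minimal sketch over session logs. It assumes one particular reading: a session counts if the same user starts a later session within 48 hours and that later session contains at least one intentional action. The `Session` fields are hypothetical, and sessions near the end of the log are not corrected for having an incomplete 48-hour window.

```python
from collections import defaultdict
from dataclasses import dataclass

WINDOW_SECONDS = 48 * 3600


@dataclass(frozen=True)
class Session:
    user_id: str
    start_ts: int             # unix seconds
    intentional_actions: int  # e.g. comments, saves, shares; not passive scroll


def return_with_action_rate(sessions: list[Session]) -> float:
    """Share of sessions followed, within 48 hours, by another session
    from the same user that contains at least one intentional action."""
    by_user: dict[str, list[Session]] = defaultdict(list)
    for s in sessions:
        by_user[s.user_id].append(s)

    total = 0
    qualifying = 0
    for user_sessions in by_user.values():
        user_sessions.sort(key=lambda s: s.start_ts)
        for i, current in enumerate(user_sessions):
            total += 1
            for later in user_sessions[i + 1:]:
                if later.start_ts - current.start_ts > WINDOW_SECONDS:
                    break  # sorted by time, so nothing later can qualify
                if later.intentional_actions > 0:
                    qualifying += 1
                    break
    return qualifying / total if total else 0.0
```

The slow feedback loop noted in the last bullet shows up here directly: a session cannot be scored until its 48-hour window has closed.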

What interviewers at specific companies are actually probing for

  • At Meta, the live question is the EU DSA chronological toggle: how do you build a ranking system that degrades gracefully to chronological without breaking integrity guarantees? Name it before they ask. The EdgeRank-to-ML history also matters: a 1% retention uplift equals billions in ad revenue at scale, which is why the quality floor on ads is an economic calculation, not just a values statement.
  • At LinkedIn, they want the two-sided creator-consumer trade-off and the ability to articulate the generative recommender’s objective function at a conceptual level, not just “we use ML.” Mobile is 72% of LinkedIn usage, and first-hour comment response boosts visibility roughly 35%; both are constraints on ranking latency worth naming.
  • At Reddit, the question is community health: does your ranking system surface subreddit-native content or does it homogenize the feed across communities?
  • At Perplexity, the question is the atomic unit: is the feed item a post, a source, or an AI synthesis? Deciding that is the design question, not the ranking question.

Guardrail metrics alongside the north star

A strong answer names guardrails explicitly. Candidates who list only a north star leave the interviewer wondering if they understand failure modes:

  • Creator health rate: share of bottom-quartile creators (by follower count) reaching a minimum impression threshold per month.
  • Content regret rate: share of sessions with at least one hide or flag action; tracks engagement-bait in the ranking model.
  • Sponsored content quality floor rejection rate: how often the ad auction fails to produce a qualifying ad. A rising rejection rate means the quality floor is binding. A sharply falling rate with no matching improvement in ad quality suggests standards have been lowered.
  • Filter bubble index: diversity of viewpoints in a given user’s top-50 impressions over 30 days; a concentrated score triggers a serendipity injection in the re-ranker.
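Two of the guardrails above can be sketched as small functions. The regret rate follows the definition in the list directly. The filter bubble index has no formula in the text, so this sketch assumes one: normalized Shannon entropy over viewpoint labels in a user's top impressions, where the labels, the number of viewpoint categories, and the serendipity threshold are all hypothetical inputs.

```python
import math
from collections import Counter


def content_regret_rate(regret_actions_per_session: list[int]) -> float:
    """Share of sessions with at least one hide or flag action."""
    if not regret_actions_per_session:
        return 0.0
    with_regret = sum(1 for n in regret_actions_per_session if n > 0)
    return with_regret / len(regret_actions_per_session)


def viewpoint_diversity(viewpoint_labels: list[str], num_viewpoints: int) -> float:
    """Normalized Shannon entropy in [0, 1] over viewpoint labels.

    0.0 means every impression shares one viewpoint; 1.0 means impressions
    are spread evenly across all num_viewpoints categories.
    """
    if num_viewpoints < 2:
        raise ValueError("num_viewpoints must be at least 2")
    counts = Counter(viewpoint_labels)
    if len(counts) <= 1:
        return 0.0
    n = len(viewpoint_labels)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(num_viewpoints)


def needs_serendipity_injection(diversity: float, threshold: float) -> bool:
    """True when the feed is concentrated enough to trigger the re-ranker."""
    return diversity < threshold
```

Normalizing by the total number of viewpoint categories, rather than by the number actually seen, keeps a feed split evenly between only two of many viewpoints from scoring as fully diverse.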

For the viable/lovable framing that grounds this design, see “feasibility is free” and “lovable, not just usable.” For the north star trade-off methodology, see “north star metric.”