Design an LLM Serving Platform (ChatGPT)

Section 01

Problem Understanding

Restate the problem in your own words.

What this step looks like in a real interview: Before you draw a single box, demonstrate that you understand the system you're designing. Read the problem statement and example products at the top of this section, then write what the system is, who uses it, and the headline scale + UX expectations — in your own words. Strong candidates anchor the rest of the interview by getting the framing right here.

Problem: What you're designing

Design a ChatGPT-class LLM serving platform. Users send chat completion requests that carry multi-turn conversation context. The system routes each request to the right model variant, batches requests on GPUs for throughput, streams tokens back as they are generated (time to first token, TTFT, under 1 s), reuses the per-conversation KV-cache, and runs moderation on both input and output. GPUs are the scarce resource, so utilisation is the headline cost metric. The decisive trade-offs are static versus continuous batching, per-conversation KV-cache versus prompt-prefix-only caching, and how to route across model variants (small, large, specialised) to balance cost against quality.
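To make the static versus continuous batching trade-off concrete, here is a minimal toy simulation in Python. It is a sketch under loud assumptions, not a description of how any real serving engine schedules work: every request is reduced to a number of output tokens, each decode step produces one token for every active request, and prefill cost, GPU memory limits, and request arrival times are all ignored. The function names and the sample workload are hypothetical.

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Decode steps when each fixed batch runs until its longest request ends.

    Short requests finish early but their slots stay idle until the whole
    batch is done (toy model: one token per request per step).
    """
    steps = 0
    for start in range(0, len(lengths), batch_size):
        steps += max(lengths[start:start + batch_size])
    return steps


def continuous_batching_steps(lengths, batch_size):
    """Decode steps when a freed slot is refilled before the next step."""
    waiting = deque(lengths)   # output lengths of requests not yet admitted
    running = []               # tokens still to generate per active request
    steps = 0
    while waiting or running:
        # Admit waiting requests into any free slots.
        while waiting and len(running) < batch_size:
            running.append(waiting.popleft())
        steps += 1
        # Each active request emits one token; finished requests leave.
        running = [left - 1 for left in running if left > 1]
    return steps


def utilisation(lengths, batch_size, steps):
    """Fraction of slot-steps that actually produced a token."""
    return sum(lengths) / (steps * batch_size)


# Hypothetical workload: mostly short replies with occasional long ones.
lengths = [1, 1, 1, 8, 1, 1, 1, 8]
static = static_batching_steps(lengths, batch_size=4)
continuous = continuous_batching_steps(lengths, batch_size=4)

print("static steps:", static, "utilisation:", utilisation(lengths, 4, static))
print("continuous steps:", continuous,
      "utilisation:", utilisation(lengths, 4, continuous))
```

On this workload the static scheduler needs 16 decode steps and the continuous one needs 10, because slots freed by short requests are reused instead of waiting for the longest request in the batch. The gap grows with the variance in output length, which is why the batching policy bears directly on the GPU utilisation metric named in the problem statement.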

Real-world examples

ChatGPT (OpenAI): Conversational LLM serving hundreds of millions of users; full serving stack, including moderation.
Claude (Anthropic): Same overall shape, with emphasis on a constitutional AI safety layer.
Gemini (Google): Multi-modal LLM serving stack running on Google's TPU fleet.
GitHub Copilot: Specialised LLM serving for code completion, where latency is the defining constraint.
Together AI / Replicate: Multi-model API platforms built on the same routing and batching primitives.

Your task: read the problem above, then write what the system is, who uses it, the rough scale, and the headline UX expectation — in your own words. Submit for AI review when you're ready.
