Problem Understanding
Restate the problem in your own words.
Design a Distributed Search Engine (Google / Elasticsearch)
Design a distributed search engine: users submit free-text queries (with optional operators, filters, and quoted phrases) and get back a ranked list of matching documents with snippets in under 300 ms, across billions of documents. The architecture has a query coordinator that scatter-gathers across N inverted-index shards, a multi-stage ranker combining BM25, PageRank, freshness, and personalisation, and a snippet generator that highlights matching context. Indexing is incremental as the crawler produces new and updated documents. The decisive trade-offs are shard-by-document vs shard-by-term, real-time vs near-real-time indexing, and tail-latency mitigation when one slow shard delays the whole query.
- Google Search: The canonical web search; trillions of pages; sub-300 ms p95.
- Bing: Microsoft's web search; same architecture; powers many downstream APIs.
- Elasticsearch / OpenSearch: Open-source distributed inverted index; powers most internal enterprise search.
- Algolia / Vespa: Hosted relevance-tuned search; Vespa is Yahoo's high-throughput ranker.
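To make the scatter-gather idea in the problem statement concrete, here is a minimal sketch in Python, assuming shard-by-document partitioning. The names `Shard`, `scatter_gather`, `K1`, `B`, and `timeout_s` are hypothetical, chosen for illustration only. Each shard scores with BM25 using shard-local statistics (a simplification; a real system would usually share global term statistics so scores are comparable across shards), and the coordinator drops shards that miss the deadline as one possible tail-latency mitigation.

```python
import heapq
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, wait

K1, B = 1.2, 0.75  # commonly used BM25 parameters (assumed defaults)


class Shard:
    """One document-partitioned shard with its own small inverted index."""

    def __init__(self, docs):
        # docs: {doc_id: text}; tokenisation is naive whitespace splitting.
        self.doc_len = {}
        self.postings = {}  # term -> {doc_id: term frequency}
        for doc_id, text in docs.items():
            terms = text.lower().split()
            self.doc_len[doc_id] = len(terms)
            for term, tf in Counter(terms).items():
                self.postings.setdefault(term, {})[doc_id] = tf
        self.avg_len = sum(self.doc_len.values()) / max(len(self.doc_len), 1)

    def search(self, query, k):
        """Return this shard's top-k (doc_id, BM25 score) pairs."""
        scores = Counter()
        n = len(self.doc_len)
        for term in query.lower().split():
            plist = self.postings.get(term, {})
            if not plist:
                continue
            # Shard-local IDF: an approximation of the global value.
            idf = math.log(1 + (n - len(plist) + 0.5) / (len(plist) + 0.5))
            for doc_id, tf in plist.items():
                norm = tf + K1 * (1 - B + B * self.doc_len[doc_id] / self.avg_len)
                scores[doc_id] += idf * tf * (K1 + 1) / norm
        return heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])


def scatter_gather(shards, query, k=3, timeout_s=0.3):
    """Fan the query out to every shard, then merge the per-shard top-k.

    Shards that miss the deadline are skipped, so the caller gets a
    possibly partial result plus the number of shards that were dropped.
    """
    pool = ThreadPoolExecutor(max_workers=len(shards))
    futures = [pool.submit(shard.search, query, k) for shard in shards]
    done, not_done = wait(futures, timeout=timeout_s)
    pool.shutdown(wait=False)  # do not block on stragglers
    hits = [hit for future in done for hit in future.result()]
    return heapq.nlargest(k, hits, key=lambda kv: kv[1]), len(not_done)


if __name__ == "__main__":
    shards = [
        Shard({"a": "distributed search engine", "b": "cooking recipes"}),
        Shard({"c": "search search ranking", "d": "weather report"}),
    ]
    top, dropped = scatter_gather(shards, "search", k=2)
    print(top, "shards dropped:", dropped)
```

Returning a partial result when a shard is slow trades completeness for latency; other options, such as hedged requests to a replica, fit the same coordinator shape. Later ranking stages (PageRank, freshness, personalisation) and snippet generation would run on the merged top-k and are left out of this sketch.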
Your task: read the problem above, then write what the system is, who uses it, the rough scale, and the headline UX expectation — in your own words. Submit for AI review when you're ready.