Design Uber (Dispatch & Trip)
Geospatial driver index, dispatch matching, and the trip state machine across services.
Intro
Uber's design centres on real-time geolocation: drivers stream location updates; riders ask 'who's near me?'. The matching, billing and trip lifecycle services hang off that geospatial core.
Functional
- Riders request rides with category and destination.
- System matches a nearby suitable driver in seconds.
- Trip lifecycle: requested → matched → started → completed/cancelled.
- Surge pricing per region/time.
Non-functional
- Match within ~3 seconds at p95.
- Driver location updates: 1 every 4 seconds × 5 M active drivers ≈ 1.25 M QPS.
- Geographically partitioned to keep latency low.
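The update-rate figure above is simple arithmetic, sketched here as a back-of-envelope check. The driver count and interval come from the list above; the 100-byte payload size is an assumption added only to show how ingest bandwidth would follow from it.

```python
ACTIVE_DRIVERS = 5_000_000      # from the non-functional requirements
UPDATE_INTERVAL_S = 4           # one update per driver every 4 seconds
PAYLOAD_BYTES = 100             # ASSUMPTION: illustrative size of one update

# Aggregate write rate hitting the location service.
updates_per_second = ACTIVE_DRIVERS / UPDATE_INTERVAL_S

# Ingest bandwidth implied by the assumed payload size.
ingest_mb_per_second = updates_per_second * PAYLOAD_BYTES / 1_000_000

print(f"{updates_per_second:,.0f} updates/s")      # 1,250,000 updates/s
print(f"{ingest_mb_per_second:,.0f} MB/s ingest")  # 125 MB/s ingest
```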
Components
Driver location service
High-throughput updates; writes to in-memory geo index.
Geospatial index
Quadtree / Google S2 cells; sharded by region.
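A minimal sketch of a cell-based in-memory index follows. It uses a fixed latitude/longitude grid as a stand-in for S2 cells or quadtree leaves, not either of those structures; `GridIndex`, `cell_of` and the 0.01-degree cell size are hypothetical names and values chosen for illustration.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # ASSUMPTION: cell edge in degrees (about 1.1 km of latitude)

def cell_of(lat, lng):
    """Map a coordinate to its integer grid cell."""
    return (math.floor(lat / CELL_DEG), math.floor(lng / CELL_DEG))

class GridIndex:
    """In-memory driver index keyed by grid cell (one instance per region shard)."""

    def __init__(self):
        self.cells = defaultdict(set)  # cell -> set of driver ids
        self.where = {}                # driver id -> (cell, lat, lng)

    def upsert(self, driver_id, lat, lng):
        """Record a driver's latest position, moving it between cells if needed."""
        new_cell = cell_of(lat, lng)
        old = self.where.get(driver_id)
        if old is not None and old[0] != new_cell:
            self.cells[old[0]].discard(driver_id)
        self.cells[new_cell].add(driver_id)
        self.where[driver_id] = (new_cell, lat, lng)

    def nearby(self, lat, lng, ring=1):
        """Return driver ids in the query cell and its neighbours within `ring` cells."""
        cx, cy = cell_of(lat, lng)
        found = []
        for dx in range(-ring, ring + 1):
            for dy in range(-ring, ring + 1):
                found.extend(self.cells.get((cx + dx, cy + dy), ()))
        return found
```

A fixed grid has cells whose ground area shrinks toward the poles, which is one of the reasons a production system would prefer S2 or a similar scheme.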
Matching service
On request, queries index for k nearest drivers; ranks; offers.
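The query-and-rank step can be sketched as below. This is an illustration under assumptions: candidates are plain dictionaries with hypothetical `category` and `available` fields, and ranking uses straight-line (haversine) distance, whereas a real matcher would more likely rank by estimated time of arrival.

```python
import heapq
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def k_nearest(rider, candidates, category, k=3):
    """Return the k closest available drivers of the requested category.

    rider      -- (lat, lng) tuple
    candidates -- drivers already fetched from the geo index (hypothetical dict shape)
    """
    eligible = [d for d in candidates if d["available"] and d["category"] == category]
    return heapq.nsmallest(
        k,
        eligible,
        key=lambda d: haversine_km(rider[0], rider[1], d["lat"], d["lng"]),
    )

drivers = [
    {"id": "d1", "lat": 37.7750, "lng": -122.4195, "category": "x", "available": True},
    {"id": "d2", "lat": 37.7800, "lng": -122.4200, "category": "x", "available": True},
    {"id": "d3", "lat": 37.7749, "lng": -122.4194, "category": "black", "available": True},
    {"id": "d4", "lat": 37.7760, "lng": -122.4194, "category": "x", "available": False},
]
ranked = k_nearest((37.7749, -122.4194), drivers, category="x", k=2)
print([d["id"] for d in ranked])  # ['d1', 'd2']
```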
Trip service
State machine — emits domain events to Kafka.
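A minimal sketch of such a state machine is shown here, using the lifecycle states listed under Functional. Two things are assumptions: cancellation is allowed from every non-terminal state, and `publish` is a hypothetical callback standing in for a Kafka producer (no real Kafka client API is used).

```python
from enum import Enum

class TripState(Enum):
    REQUESTED = "requested"
    MATCHED = "matched"
    STARTED = "started"
    COMPLETED = "completed"
    CANCELLED = "cancelled"

# ASSUMPTION: a trip may be cancelled from any non-terminal state.
ALLOWED = {
    TripState.REQUESTED: {TripState.MATCHED, TripState.CANCELLED},
    TripState.MATCHED: {TripState.STARTED, TripState.CANCELLED},
    TripState.STARTED: {TripState.COMPLETED, TripState.CANCELLED},
    TripState.COMPLETED: set(),
    TripState.CANCELLED: set(),
}

class Trip:
    def __init__(self, trip_id, publish):
        self.trip_id = trip_id
        self.state = TripState.REQUESTED
        self._publish = publish  # hypothetical event sink, e.g. a producer wrapper

    def transition(self, new_state):
        """Move to new_state if the transition is legal, then emit a domain event."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state.value} -> {new_state.value}")
        old_state = self.state
        self.state = new_state
        self._publish({"trip_id": self.trip_id, "from": old_state.value, "to": new_state.value})
```

Rejecting illegal transitions inside the trip service keeps downstream consumers from ever seeing, for example, a completed trip that was never started.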
Pricing service
Surge multiplier per cell; reads demand/supply ratios.
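One way a per-cell multiplier could be derived from a demand/supply ratio is sketched below. The formula (the raw ratio, floored at 1.0 and capped at 3.0) is purely an illustrative assumption, not a description of any production pricing model.

```python
def surge_multiplier(open_requests, available_drivers, cap=3.0):
    """Return a price multiplier for one cell.

    ASSUMPTION: multiplier = demand/supply ratio, clamped to [1.0, cap].
    """
    if open_requests <= 0:
        return 1.0
    ratio = open_requests / max(available_drivers, 1)  # avoid division by zero
    return round(min(cap, max(1.0, ratio)), 2)

print(surge_multiplier(10, 20))   # 1.0  (supply exceeds demand)
print(surge_multiplier(30, 20))   # 1.5
print(surge_multiplier(100, 5))   # 3.0  (capped)
```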
Payments
Idempotent charge on trip completion.
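Idempotency here means that retrying the same charge request must not bill the rider twice. A minimal in-memory sketch follows; `PaymentLedger` is a hypothetical name, and deriving the idempotency key from the trip ID is an assumption. A real implementation would need an atomic insert-if-absent in a durable store rather than a dictionary.

```python
class PaymentLedger:
    """Toy ledger that deduplicates charges by idempotency key."""

    def __init__(self):
        self._charges = {}  # idempotency key -> receipt

    def charge(self, idempotency_key, amount_cents):
        """Charge once per key; a retry returns the original receipt unchanged."""
        existing = self._charges.get(idempotency_key)
        if existing is not None:
            return existing
        receipt = {
            "key": idempotency_key,
            "amount_cents": amount_cents,
            "charge_no": len(self._charges) + 1,
        }
        self._charges[idempotency_key] = receipt
        return receipt

    def total_charges(self):
        return len(self._charges)

ledger = PaymentLedger()
first = ledger.charge("trip-42:completion", 1850)   # ASSUMPTION: key built from trip ID
retry = ledger.charge("trip-42:completion", 1850)   # e.g. a duplicate completion event
print(first == retry, ledger.total_charges())       # True 1
```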
Trade-offs
Quadtree vs. S2 vs. Geohash
Pros
- S2: spherical geometry, near-uniform cell areas at each level, used at Uber/Google.
- Geohash: simplest, string-keyed.
Cons
- Quadtrees skew with population density without rebalancing.
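To make the "string-keyed" point about Geohash concrete, here is a short sketch of the standard encoding: longitude and latitude bits are interleaved and emitted five at a time as base-32 characters. The function name `geohash_encode` is chosen here for illustration.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)

def geohash_encode(lat, lng, precision=6):
    """Encode a coordinate as a Geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lng = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lng:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                bits = (bits << 1) | 1
                lng_lo = mid
            else:
                bits <<= 1
                lng_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lng = not use_lng
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

print(geohash_encode(57.64911, 10.40744, precision=6))  # u4pruy
```

Because a shorter hash is a prefix of a longer one for the same point, a prefix scan over a sorted key-value store returns everything inside the coarser cell, which is what makes Geohash the simplest option to deploy.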
Push-offer vs. broadcast-pool
Pros
- Push-offer keeps individual driver UX simple.
- Broadcast-pool fits Pool/Express better.
Cons
- Push-offer can underuse drivers; broadcast adds complexity.
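The push-offer flow can be sketched as a sequential loop over ranked drivers. `push_offer` and the `accepts` callback are hypothetical; in practice each offer would block on a driver response or a timeout, and that waiting is where the under-use mentioned above comes from.

```python
def push_offer(ranked_driver_ids, accepts):
    """Offer the trip to one driver at a time, best-ranked first.

    accepts -- hypothetical callback returning True if the driver takes the offer
    Returns (driver_id or None, number of offers made).
    """
    offers_made = 0
    for driver_id in ranked_driver_ids:
        offers_made += 1
        if accepts(driver_id):
            return driver_id, offers_made
    return None, offers_made

# Example: the first two drivers decline, the third accepts.
winner, attempts = push_offer(["d1", "d2", "d3"], lambda d: d == "d3")
print(winner, attempts)  # d3 3
```

A broadcast-pool variant would instead send the offer to several drivers at once and resolve the race between acceptances, which is the added complexity noted in the cons.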
Scale concerns
- Region failover: matching must continue if a city's index is offline.
- Hot cells (downtown 9 AM) vs. quiet cells — autoscaling per cell.
- Update batching: coalesce raw location updates per driver and write only each driver's latest position to the index every few seconds.
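Update batching can be sketched as a buffer that keeps only the newest position per driver and hands the whole batch to the index on each flush. `CoalescingBuffer` is a hypothetical name, and the flush trigger (a timer every few seconds) is left to the caller.

```python
class CoalescingBuffer:
    """Keeps the latest update per driver between flushes."""

    def __init__(self):
        self._latest = {}  # driver id -> (timestamp, lat, lng)

    def record(self, driver_id, lat, lng, ts):
        """Store an update unless a newer one for this driver is already buffered."""
        current = self._latest.get(driver_id)
        if current is None or ts >= current[0]:
            self._latest[driver_id] = (ts, lat, lng)

    def flush(self):
        """Return the coalesced batch and start a new, empty one."""
        batch, self._latest = self._latest, {}
        return batch

buf = CoalescingBuffer()
buf.record("d1", 37.7749, -122.4194, ts=1)
buf.record("d1", 37.7751, -122.4190, ts=2)   # supersedes the first update
buf.record("d2", 37.7800, -122.4200, ts=2)
buf.record("d1", 37.7740, -122.4199, ts=0)   # late, out-of-order update is ignored
batch = buf.flush()
print(len(batch), batch["d1"])  # 2 (2, 37.7751, -122.419)
```

Four raw updates collapse into two index writes here; the cost is that positions in the index can be stale by up to one flush interval.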