Design DoorDash
Variants:
DoorDash System Design Requirements
Functional Requirements
Search Restaurants
Users should be able to search for restaurants by location, cuisine, and price.
Place Orders
Users should be able to select items from a menu and place an order.
Real-time Order Tracking
Users should be able to track the status of their order in real-time, from preparation to delivery.
Driver Dispatch
The system must efficiently dispatch drivers (Dashers) to pick up and deliver orders.
Non-Functional Requirements
High Availability
The service must be highly available, especially during peak meal times.
Low Latency
Search results and order status updates should be fast.
Scalability
The system must be able to handle a high volume of concurrent orders and real-time location updates.
Reliability
Orders should not be lost, and delivery estimates should be accurate.
CAP Theorem Trade-offs
Trade-off Explanation:
For a food delivery service, we prioritize Availability and Partition Tolerance (AP). It's more important for users to be able to place orders and for Dashers to receive them than to have strict consistency. Eventual consistency is acceptable for most parts of the system, though we'll need stronger consistency for the order processing and payment components.
Scale Estimates
API Design
The API for a food delivery service like DoorDash needs to serve three different types of users: customers, restaurants, and Dashers. Your interviewer will expect you to consider the different needs of each user type when designing the API.
For simplicity, we'll focus on the core customer-facing endpoints for searching restaurants, placing an order, and tracking its status.
Search for restaurants by location and cuisine.
Place a new food order.
Get the status and location of an order.
Database Schema
The database schema for DoorDash needs to manage a complex set of relationships between users, restaurants, menus, orders, and Dashers. Your interviewer will be interested in how you model these entities to support the entire order lifecycle.
We'll have tables for users
, restaurants
, menu_items
, orders
, and dashers
. The orders
table will be central to the system, linking together the customer, restaurant, and Dasher for each transaction.
restaurants
menu_items
orders
dashers
High-Level Architecture
The architecture for a food delivery service like DoorDash is a complex, real-time system with many moving parts. A microservices architecture is a good choice to manage this complexity and allow for independent scaling of different components.
Core Services
- Order Service: This service is the orchestrator of the entire system. It manages the state of an order from creation to completion, coordinating with all other services.
- Restaurant Service: This service manages restaurant information, menus, and hours of operation.
- Dasher Service: This service manages Dasher profiles, real-time locations, and availability.
- Dispatch Service: This is the "brains" of the operation. It uses geospatial data and other signals to match an incoming order with the best available Dasher.
Deep Dive: Dasher Matching (Geospatial Queries)
A critical part of any delivery service is efficiently matching a new order with the best available driver. In an interview, you'll need to explain how you would find the closest available Dashers to a restaurant in real-time.
Geospatial Indexing
To efficiently query for Dashers within a certain radius of a restaurant, we need to use a geospatial index. Standard database indexes are not designed for this type of query. A common approach is to use a geohashing algorithm, which converts latitude and longitude coordinates into a short string. We can then use a standard B-tree index on the geohash strings to perform efficient location-based searches. Many databases and search engines, like Elasticsearch and PostgreSQL (with PostGIS), have built-in support for geospatial indexing.
Geohashing
Why: Geohashing is a technique that encodes a geographic location into a short string of letters and numbers. It's a hierarchical system, so the longer the geohash, the more precise the location. This makes it a great choice for finding nearby points of interest.
How it works: The world is divided into a grid of 32 cells, each represented by a character. Each cell is then recursively divided into 32 smaller cells, and so on. This creates a hierarchical structure that allows us to efficiently find all points within a given area by searching for all geohashes that share a common prefix.
We can then store these geohashes in a standard B-tree index in our database, or in a key-value store like Redis. When a new order comes in, we can calculate the geohash for the restaurant's location and then query for all Dashers with a matching geohash prefix.
Trade-offs:
- Pros: Extremely fast and efficient for real-time geospatial queries.
- Cons: Redis is an in-memory store, so it can be expensive at a very large scale. Requires careful data management to ensure that the location data in Redis is always up-to-date.
Database with Geospatial Indexing
Why: For a more persistent and integrated solution, we can use a database that has built-in support for geospatial indexing, such as PostgreSQL with the PostGIS extension. This keeps our location data in the same database as the rest of our Dasher information, which can simplify the system architecture.
How it works: We would store the Dasher's location in a geometry
or geography
column in the dashers
table. We would then create a spatial index (such as a Quadtree, R-tree, or Geohash-based index) on this column. When a new order comes in, we can perform a spatial query to find all Dashers within a certain distance of the restaurant. For example, in PostGIS, we could use the ST_DWithin
function to efficiently find all Dashers within a given radius.
Trade-offs:
- Pros: Provides strong consistency and durability. Keeps the location data in the same database as the rest of the Dasher's information.
- Cons: Can be slower than an in-memory solution like Redis. May not scale as well for very high-throughput location updates.
Deep Dive: Real-time Order Tracking
Customers expect to be able to track their order in real-time, from the moment it's placed until it arrives at their door. Your interviewer will want to know how you would implement this feature.
WebSockets
Why: WebSockets provide a persistent, bidirectional communication channel between the client and the server, making them ideal for pushing real-time location updates to the customer.
How it works: When a customer opens the order tracking page, their client establishes a WebSocket connection with our server. The Dasher's app will periodically publish their location to the Dasher Service. This service will then push the new location to the customer's client via the WebSocket connection.
Trade-offs:
- Pros: Low latency, efficient.
- Cons: Can be complex to manage at scale. Requires careful handling of connection state.
gRPC Streaming
Why: gRPC is a high-performance RPC framework that is particularly well-suited for mobile clients, where battery life and network efficiency are critical.
How it works: The client and server establish a persistent gRPC stream. The server can then push location updates to the client as they become available. gRPC uses HTTP/2, which is more efficient than the HTTP/1.1 used by WebSockets, and it uses Protocol Buffers for serialization, which is more compact than JSON.
Trade-offs:
- Pros: Highly efficient, great for mobile clients.
- Cons: Not natively supported by browsers (requires a proxy), more complex to set up than WebSockets.
Long Polling
Why: Long polling is a simpler alternative to WebSockets that can be used as a fallback for older clients or in environments where WebSockets are not supported.
How it works: The client makes an HTTP request to the server, which holds the request open until a location update is available. Once an update is sent, the client immediately makes another request.
Trade-offs:
- Pros: Simple to implement, supported by all browsers.
- Cons: Higher latency and less efficient than WebSockets or gRPC.
Deep Dive: Search
A powerful and intuitive search experience is a key feature of any successful food delivery app. Your interviewer will expect you to discuss how you would build a search service that can handle a large number of restaurants and provide relevant results to users.
Elasticsearch for Search
Why: Elasticsearch is a powerful, open-source search engine that is purpose-built for complex search queries. It provides advanced features like full-text search, filtering, ranking, and geospatial search out of the box.
How it works: We would create a separate Elasticsearch cluster and index all of our restaurant and menu data in it. The Restaurant Service would be responsible for keeping the Elasticsearch index in sync with the primary database. When a user performs a search, the request would go directly to the Search Service, which would then query the Elasticsearch cluster.
Trade-offs:
- Pros: Highly scalable and performant. Provides a rich set of search features.
- Cons: Adds another system to manage and maintain. Keeping the search index in sync with the primary database can be challenging.
Complete Design
Now that we've covered all the major components individually, let's look at how everything fits together in our complete DoorDash system design. This diagram shows the end-to-end flow from a customer placing an order to a Dasher delivering it.
The complete architecture demonstrates how the Order Service, Restaurant Service, Dasher Service, and Dispatch Service work together to provide a scalable and reliable food delivery platform. The use of a geospatial index for Dasher matching and a real-time tracking system ensures an efficient and user-friendly experience.