Rate Limiter in System Design. Part 2 - Commonly Used Algorithms
In Part 1, I provided a general introduction at the conceptual level. In this Part 2, I will discuss the algorithms most commonly used when building a Rate Limiter.
Leaky Bucket
Conceptually, the Leaky Bucket algorithm operates as follows: the server maintains a fixed-size bucket that holds a certain number of tokens, and the bottom of the bucket has a hole (output) through which tokens leak out at a constant rate. In this context, a token represents a request sent to the server.
Tokens are sent to the server by the user (client), and the server acts as the agent that allows tokens to enter or exit the bucket.
To facilitate visualization, you can refer to the accompanying illustration below:
The Leaky Bucket algorithm is not overly complex to implement and maintain on a server or load balancer. Memory management is also relatively straightforward, as everything can be configured from the beginning. Due to its high accuracy and efficient resource utilization, many large-scale systems utilize the Leaky Bucket algorithm as a Rate Limiter, with NGINX being a notable example.
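To make the idea concrete, here is a minimal Python sketch of a leaky bucket, not NGINX's actual implementation; the class and parameter names (`LeakyBucket`, `capacity`, `leak_rate`) are illustrative choices of my own:

```python
import time

class LeakyBucket:
    """Leaky-bucket limiter sketch: queued requests 'leak' out of a
    fixed-size bucket at a constant rate; when the bucket is full,
    new requests are rejected."""

    def __init__(self, capacity: int, leak_rate: float):
        self.capacity = capacity        # maximum tokens the bucket holds
        self.leak_rate = leak_rate      # tokens leaked per second
        self.level = 0.0                # current fill level of the bucket
        self.last_leak = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Leak out tokens in proportion to the elapsed time.
        self.level = max(0.0, self.level - (now - self.last_leak) * self.leak_rate)
        self.last_leak = now
        if self.level + 1 <= self.capacity:
            self.level += 1             # the request fits in the bucket
            return True
        return False                    # bucket full: reject

bucket = LeakyBucket(capacity=3, leak_rate=1.0)
results = [bucket.allow() for _ in range(5)]
print(results)  # [True, True, True, False, False]
```

With a capacity of 3 and a leak rate of 1 request/second, a burst of 5 near-simultaneous requests fills the bucket after the third; the remaining two are rejected until enough time passes for the bucket to drain.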
Despite its many advantages, the Leaky Bucket algorithm also has some drawbacks:
Burstiness: One drawback of the Leaky Bucket algorithm is that it does not handle sudden or bursty loads of requests well. Because requests drain from the bucket at a fixed rate, a large burst quickly fills the bucket, and the excess requests are rejected or forced to wait even when the system has spare capacity to serve them.
Delayed response: The Leaky Bucket algorithm can result in delayed response for access requests that occur after a prolonged "idle" interval, where no requests are being processed. In such cases, the "leak" of requests from the bucket only occurs when the next interval begins, resulting in a longer waiting time for the requests to be processed.
Lack of prioritization: requests leave the bucket in arrival order, so the algorithm has no flexibility to handle prioritized or urgent requests ahead of others.
Implementation errors: If there is a mistake in implementing the algorithm and the bucket is not handled correctly, some requests may be accepted and processed even though the request rate has exceeded the set limit. This can allow a large number of requests to flood the system, causing overload and potentially making the server unstable or slow.
It's important to consider these limitations and evaluate whether the Leaky Bucket algorithm is suitable for specific use cases or if alternative rate-limiting algorithms should be employed.
Fixed Window Counter
The Fixed Window Counter algorithm (abbreviated as FWC) differs from the Leaky Bucket algorithm in that it is based on time rather than on a dynamically drained capacity to control requests. FWC divides time into equal intervals, and each interval has its own counter. When a request arrives, it is assigned to the appropriate interval based on its timestamp, and the corresponding counter is updated (e.g., incremented by 1).
When the counter reaches a certain threshold, the request is denied execution and either needs to wait or retry in the next time interval.
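The counting scheme above can be sketched in a few lines of Python; this is an illustrative single-process version (names like `FixedWindowCounter` and `window_seconds` are my own), whereas a production setup would typically keep the counters in a shared store such as Redis:

```python
import time

class FixedWindowCounter:
    """Fixed-window limiter sketch: time is cut into equal windows and
    each window keeps its own counter."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}  # window start timestamp -> request count

    def allow(self, now=None) -> bool:
        now = time.time() if now is None else now
        # Map the timestamp onto the start of its window.
        window_start = int(now // self.window) * self.window
        count = self.counters.get(window_start, 0)
        if count >= self.limit:
            return False    # this window's counter hit the threshold
        self.counters[window_start] = count + 1
        return True

limiter = FixedWindowCounter(limit=2, window_seconds=60)
decisions = [limiter.allow(now=t) for t in (0, 1, 2, 61)]
print(decisions)  # [True, True, False, True] -- the request at t=2
                  # exceeds the window's limit; t=61 lands in a new window
```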
With the Fixed Window Counter algorithm, we have the following complexities:
Space Complexity: O(1) for storing the counter for each window.
Time Complexity: O(1) for performing operations like Get and incrementing the counter, as they are simple operations.
The Fixed Window Counter algorithm offers several advantages:
Easy implementation: The algorithm is relatively simple to implement, as it involves dividing time into fixed intervals and maintaining counters for each interval.
Lower memory usage: The algorithm requires minimal memory usage, as it only needs to store the counters for each window.
Compatibility with built-in concurrency technologies: The Fixed Window Counter algorithm can leverage built-in concurrency technologies such as Redis, which provides efficient support for distributed and concurrent operations.
However, there are still several drawbacks to the Fixed Window Counter algorithm:
Lack of flexibility: Due to its reliance on pre-defined configurations, the algorithm lacks the ability to scale up dynamically when needed. Developers have to manually adjust the configurations for each campaign or when there is a sudden increase in request volume.
Window-boundary bursts: because every counter resets at a window boundary, a burst that straddles two adjacent windows can let through up to twice the configured limit in a short span. The algorithm has no visibility into requests from the previous window and cannot compensate for them.
Inability to prioritize or handle urgent requests: The Fixed Window Counter algorithm treats all requests within a window equally and does not differentiate or prioritize urgent requests. This limitation may not be suitable for scenarios where certain requests require immediate processing.
Sliding Window Log & Sliding Window Counter
The two aforementioned algorithms, although easy to implement, each have their own limitations. The Sliding Window Log & Counter (SWLC) algorithms are used to mitigate some of these challenges.
The Sliding Window Log & Counter algorithm (SWLC) stores state per requesting agent. When a request is sent to the server, the system identifies the requester based on specific criteria such as tokens, IP addresses, etc.
SWLC commonly utilizes an SLA (Service Level Agreement) table, where it logs all user actions. It is a simple key-value structure, with the key being the (token, IP) combination and the value the request data sent by the user.
The Sliding Window Log and Counter algorithm described in the diagram above sets a maximum limit of two requests per user within each window. The counter will track and reject any third request from the same user within the same time window.
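A minimal Sliding Window Log can be sketched as a per-key timestamp log; this is an illustrative in-memory version (names like `SlidingWindowLog` are my own) matching the two-requests-per-window scenario described above:

```python
import time
from collections import defaultdict, deque

class SlidingWindowLog:
    """Sliding-window-log sketch: a per-key log of request timestamps;
    a request is allowed only if fewer than `limit` entries fall inside
    the trailing window."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.logs = defaultdict(deque)  # key (token, IP, ...) -> timestamps

    def allow(self, key, now=None) -> bool:
        now = time.time() if now is None else now
        log = self.logs[key]
        # Evict timestamps that have slid out of the trailing window.
        while log and log[0] <= now - self.window:
            log.popleft()
        if len(log) >= self.limit:
            return False
        log.append(now)
        return True

limiter = SlidingWindowLog(limit=2, window_seconds=60)
decisions = [limiter.allow("user-a", now=t) for t in (0, 10, 20, 65)]
print(decisions)  # [True, True, False, True] -- the third request finds
                  # two entries still inside the trailing 60s; by t=65
                  # the t=0 entry has expired
```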
At its basic level, the Sliding Window Log and Counter algorithm has the following complexities:
Space Complexity: O(n), where n is the number of requests seen by the server in one window. The server must retain these entries to identify the user/IP associated with each request.
Time Complexity (Sliding Window Log): O(n), where n is the number of requests seen by the server in one window.
Time Complexity (Sliding Window Counter): O(1). This is a key difference between Sliding Window Counter (SWC) and Sliding Window Log (SWL). SWC only stores the current window (without the need to look back at previous windows).
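One common way to get that O(1) behavior is to keep only the current and previous window counts and weight the previous count by its remaining overlap with the sliding window. The sketch below shows this for a single key (names are my own; a per-user version would keep these fields per key):

```python
import time

class SlidingWindowCounter:
    """Sliding-window-counter sketch: stores only the current and
    previous fixed-window counts, weighting the previous count by how
    much of that window still overlaps the trailing sliding window."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.current_start = 0.0
        self.current_count = 0
        self.previous_count = 0

    def allow(self, now=None) -> bool:
        now = time.time() if now is None else now
        window_start = (now // self.window) * self.window
        if window_start != self.current_start:
            # Roll forward: the old current window becomes "previous";
            # anything older than one full window is discarded.
            gap = window_start - self.current_start
            self.previous_count = self.current_count if gap < 2 * self.window else 0
            self.current_count = 0
            self.current_start = window_start
        # Fraction of the previous window still inside the sliding window.
        overlap = 1.0 - (now - window_start) / self.window
        estimated = self.previous_count * overlap + self.current_count
        if estimated >= self.limit:
            return False
        self.current_count += 1
        return True

limiter = SlidingWindowCounter(limit=4, window_seconds=60)
decisions = [limiter.allow(now=t) for t in (10, 20, 30, 40, 50, 70)]
print(decisions)  # [True, True, True, True, False, True]
```

Note that this is an approximation: it assumes requests in the previous window were evenly distributed, trading a little accuracy for constant memory per key.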
The advantages of this algorithm include addressing the drawbacks of the two algorithms mentioned earlier and combining the three elements of quantity, agent, and time mentioned in Part 1. So, what are the disadvantages of this algorithm?
Difficult to implement: Combining all three factors certainly adds complexity.
High resource consumption: storing the request log for every user within a window requires significant memory.
Complexity increases with the number of requests.
Summary
Here are some commonly used algorithms for building a Rate Limiter system. If there are any other algorithms or better ones, please let me and everyone else know!
Thank you all
Read more
Part 1: Rate Limiter in System Design. Part 1 - Concepts and Applications