hour are allowed → A user makes 5,000 requests in the last minute of one hour → and another 5,000 requests in the first minute of the next hour → for a total of 10,000 requests within 2 minutes → This burst may overload the server Malte Schlüter
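The boundary burst described above can be reproduced with a minimal fixed-window counter. This is only a sketch: the class name `FixedWindowLimiter`, the user name, and the explicit `now` timestamps (in seconds) are illustrative assumptions, and the limit of 5,000 per hour is taken from the figures used in the examples.

```python
from collections import defaultdict

WINDOW_SECONDS = 3600  # one-hour window
LIMIT = 5000           # assumed hourly limit, taken from the example figures


class FixedWindowLimiter:
    """Hypothetical helper: counts requests per user per clock-aligned hour."""

    def __init__(self, limit=LIMIT, window=WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self.counts = defaultdict(int)  # (user, window index) -> request count

    def allow(self, user, now):
        # All timestamps inside the same hour share one counter.
        key = (user, int(now // self.window))
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True


limiter = FixedWindowLimiter()

# 5,000 requests spread over the last minute of hour 0 (seconds 3540 to 3599)
last_minute = sum(limiter.allow("alice", 3540 + i * 60 / 5000) for i in range(5000))

# 5,000 requests spread over the first minute of hour 1 (seconds 3600 to 3659)
first_minute = sum(limiter.allow("alice", 3600 + i * 60 / 5000) for i in range(5000))

accepted = last_minute + first_minute
print(accepted)
```

Every request is accepted because the counter resets at the hour boundary, even though all of them arrive within two minutes.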
→ A user makes 4,000 requests in the previous hour and 0 requests this hour → At the beginning of this hour, the previous hour still counts in full, so 1,000 requests remain → 5,000 - 100% * 4,000 = 1,000
→ A user makes 4,000 requests in the previous hour and 0 requests this hour → 15 minutes into the current hour is 25% of the window, so the previous hour still counts for 75% → 2,000 requests actually remain → 5,000 - 75% * 4,000 = 2,000
→ A user makes 4,000 requests in the previous hour and 0 requests this hour → 30 minutes into the current hour is 50% of the window, so the previous hour still counts for 50% → 3,000 requests actually remain → 5,000 - 50% * 4,000 = 3,000
→ A user makes 4,000 requests in the previous hour and 500 requests this hour → 30 minutes into the current hour is 50% of the window, so the previous hour still counts for 50% → 2,500 requests actually remain → 5,000 - 50% * 4,000 - 500 = 2,500
→ A user makes 4,000 requests in the previous hour and 500 requests this hour → 15 minutes into the current hour is 25% of the window, so the previous hour still counts for 75% → 1,500 requests actually remain → 5,000 - 75% * 4,000 - 500 = 1,500
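The worked examples all follow one rule: remaining = limit - (1 - elapsed fraction) * previous count - current count. A minimal sketch of that calculation follows; the function name `remaining_requests` and its parameter names are illustrative assumptions, not part of the slides.

```python
def remaining_requests(limit, previous_count, current_count, elapsed_fraction):
    """Requests still available under the sliding-window estimate.

    elapsed_fraction is how far we are into the current window (0.0 to 1.0).
    The previous window is weighted by the share of it that still overlaps
    the sliding window, which is (1 - elapsed_fraction).
    """
    weighted_previous = (1.0 - elapsed_fraction) * previous_count
    return max(0, round(limit - weighted_previous - current_count))


# The five examples, with a limit of 5,000 requests per hour
print(remaining_requests(5000, 4000, 0, 0.0))     # start of the hour
print(remaining_requests(5000, 4000, 0, 0.25))    # 15 minutes in
print(remaining_requests(5000, 4000, 0, 0.5))     # 30 minutes in
print(remaining_requests(5000, 4000, 500, 0.5))   # 30 minutes in, 500 used
print(remaining_requests(5000, 4000, 500, 0.25))  # 15 minutes in, 500 used
```

Only two counters per user are needed (previous and current window), which is why this approximation is cheaper than storing a timestamp for every request.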
set of tokens → A new token is added to the bucket at a predefined rate (e.g. one every second) → If the bucket still contains tokens, the event is allowed; otherwise, it’s denied → If the bucket is at full capacity, new tokens are discarded
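The token bucket rules above can be sketched as follows. Several details are assumptions made for illustration: the class name `TokenBucket`, a bucket that starts full, one token consumed per allowed event, and timestamps passed in explicitly so the behaviour is deterministic (a real limiter would more likely read a clock such as `time.monotonic()`).

```python
class TokenBucket:
    """Hypothetical token bucket: one token is added every refill_interval seconds."""

    def __init__(self, capacity, refill_interval=1.0, now=0.0):
        self.capacity = capacity
        self.refill_interval = refill_interval
        self.tokens = capacity  # assumption: the bucket starts full
        self.last_refill = now

    def _refill(self, now):
        # Number of whole refill intervals that have passed since the last refill.
        new_tokens = int((now - self.last_refill) // self.refill_interval)
        if new_tokens > 0:
            # Tokens beyond the capacity are discarded.
            self.tokens = min(self.capacity, self.tokens + new_tokens)
            self.last_refill += new_tokens * self.refill_interval

    def allow(self, now):
        self._refill(now)
        if self.tokens > 0:
            self.tokens -= 1  # assumption: each allowed event consumes one token
            return True
        return False


bucket = TokenBucket(capacity=3)
burst = [bucket.allow(0.0) for _ in range(4)]  # three allowed, fourth denied
after_one_second = bucket.allow(1.0)           # one token has been added
too_soon = bucket.allow(1.5)                   # no new token yet
bucket.allow(100.0)                            # long pause: refill is capped at capacity
print(burst, after_one_second, too_soon, bucket.tokens)
```

The capacity bounds the size of a burst, while the refill interval sets the sustained rate.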