Apidays New York 2024 - The subtle art of API rate limiting by Josh Twist, Zuplo

The subtle art of API rate limiting
Josh Twist, Co-founder & CEO at Zuplo

Apidays New York 2024: The API Economy in the AI Era (April 30 & May 1, 2024)

apidays

May 14, 2024

Transcript

  1. Me
     - Founded several services in Microsoft Azure: API Management, Logic Apps, Mobile Services, Power Automate Pro
     - Head of Product at Stripe and Facebook
     - Been a vendor and a customer of API gateways / management
  2. Agenda
     - Why do we need this?
     - What’s the canonical approach?
     - Private or Public
     - Latency tradeoffs and optimizing the customer experience
     - Observability
     - Demo Content
  3. Why do we need rate limiting?
     - Protect against excessive resource consumption
     - Prevent resource starvation
     - Ensure a good experience for all customers
     - Discourage abusers
     - Enforce the business model
  4. A canonical response
     - Status - 429
     - Status Text - Too Many Requests
     - Header - Retry-After: 3600
     - Body - Use Problem Details format (IETF RFC 7807)
  5. A canonical response (same content as the previous slide)
     - Do you want to do this?
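As a rough illustration of the canonical response described above, the following Python sketch builds the status, headers, and Problem Details body. The function name `too_many_requests` and the `type` URI are hypothetical, chosen only for this example. RFC 7807 has since been obsoleted by RFC 9457, which keeps the core members used here.

```python
import json


def too_many_requests(retry_after_seconds: int, detail: str):
    """Build a 429 response as a (status, headers, body) tuple."""
    headers = {
        # Media type defined for Problem Details documents.
        "Content-Type": "application/problem+json",
        # Retry-After accepts a delay in seconds (or an HTTP date).
        "Retry-After": str(retry_after_seconds),
    }
    problem = {
        # Hypothetical problem type URI, for illustration only.
        "type": "https://example.com/problems/rate-limit-exceeded",
        "title": "Too Many Requests",
        "status": 429,
        "detail": detail,
    }
    return 429, headers, json.dumps(problem)


status, headers, body = too_many_requests(
    3600, "Rate limit exceeded. Retry after 3600 seconds."
)
print(status, headers["Retry-After"])
print(body)
```

Keeping the delay in one place and deriving both the header and the `detail` text from it is one way to avoid the two drifting apart.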
  6. Private or public limits?
     - Disclosing limits allows malicious users to easily maximize their consumption of your resources without hitting the limit
     - Not disclosing limits makes it hard for consumers to avoid hitting your rate limit
     - Informal partnership (e.g. Free Tier): hide limits
     - Formal partnership (e.g. Contracted Customer): disclose limits
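One possible way to act on this split is to attach limit information only to responses for contracted customers. The sketch below is an assumption-laden illustration: the tier name `"contracted"`, the function `limit_headers`, and the `X-RateLimit-*` header names (a common de facto convention, not something the talk prescribes) are all chosen for this example.

```python
def limit_headers(tier: str, limit: int, remaining: int) -> dict:
    """Return rate limit headers to disclose, or nothing for informal tiers."""
    # Hypothetical tier name: "contracted" stands for a formal partnership.
    if tier != "contracted":
        # Informal partnership (e.g. free tier): keep the limits private.
        return {}
    return {
        "X-RateLimit-Limit": str(limit),
        # Never report a negative remaining count.
        "X-RateLimit-Remaining": str(max(remaining, 0)),
    }


print(limit_headers("free", 100, 40))
print(limit_headers("contracted", 1000, 250))
```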
  7. The latency / accuracy tradeoff
     - Being accurate means completing the rate limit check before allowing the request to proceed, which means a slower API for everyone
     - You incur the latency on every request, whether a problem is afoot or not
     - Consider a more lenient approach where you run the check asynchronously and cache blocked consumers
     - Here’s how
  8. Async rate limiting
     - Have a local cache to store known limited consumers (store the retry time also)
     - Check the rate limit in parallel with performing the primary request
     - If the rate limit check comes back blocked, update the cache and, if the primary request has not completed, override the response. Otherwise, the next request will be blocked anyway.
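The steps above can be sketched with Python's `asyncio`. This is a minimal illustration under stated assumptions, not the speaker's implementation: `AsyncRateLimiter`, `check_limit`, and `primary` are hypothetical names, `check_limit` stands in for whatever remote rate limit service is used, and error handling for a failing check is omitted.

```python
import asyncio
import time


class AsyncRateLimiter:
    """Run the rate limit check in parallel with the primary request."""

    def __init__(self, check_limit):
        # check_limit is a hypothetical async callable: consumer -> seconds to
        # wait before retrying if blocked, or None if the consumer is allowed.
        self._check_limit = check_limit
        self._blocked = {}        # consumer -> monotonic time the block expires
        self._background = set()  # strong references to in-flight checks

    def _cached_retry_after(self, consumer):
        expires = self._blocked.get(consumer)
        if expires is None:
            return None
        remaining = expires - time.monotonic()
        if remaining <= 0:
            del self._blocked[consumer]  # block has expired
            return None
        return remaining

    async def _check_and_cache(self, consumer):
        retry_after = await self._check_limit(consumer)
        if retry_after is not None:
            # Remember the blocked consumer together with the retry time.
            self._blocked[consumer] = time.monotonic() + retry_after
        return retry_after

    async def handle(self, consumer, primary):
        """Return (status, payload); primary is an async callable doing the work."""
        cached = self._cached_retry_after(consumer)
        if cached is not None:
            return 429, cached  # known limited consumer: no remote check needed

        check = asyncio.create_task(self._check_and_cache(consumer))
        self._background.add(check)
        check.add_done_callback(self._background.discard)
        work = asyncio.create_task(primary())

        await asyncio.wait({check, work}, return_when=asyncio.FIRST_COMPLETED)

        if not work.done() and check.done() and check.result() is not None:
            work.cancel()  # override the response while it is still pending
            return 429, check.result()
        # Otherwise serve the response; a late "blocked" result still updates
        # the cache, so the next request from this consumer is rejected.
        return 200, await work


async def demo():
    async def check(consumer):  # stand-in for a remote rate limit service
        await asyncio.sleep(0.01)
        return 30.0 if consumer == "abuser" else None

    async def primary():
        await asyncio.sleep(0.05)
        return "ok"

    limiter = AsyncRateLimiter(check)
    print(await limiter.handle("good", primary))
    print(await limiter.handle("abuser", primary))


asyncio.run(demo())
```

The tradeoff in this design is that a consumer over the limit may get one extra response through when the primary request finishes before the check does, in exchange for not adding the check's latency to every request.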
  9. Observability? What are the minimum requirements?
     - RPS (requests per second) report
     - See rate limited responses
     - Break down by bucket (IP, user, etc.)
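To make these minimum requirements concrete, here is a small in-memory sketch that tracks requests per second, counts rate limited responses, and breaks them down by bucket. The class `RateLimitMetrics` and its method names are hypothetical; a real deployment would more likely emit these values to an existing metrics system.

```python
from collections import Counter, defaultdict


class RateLimitMetrics:
    """Collect the three views listed above from a stream of responses."""

    def __init__(self):
        self.requests_per_second = Counter()            # whole second -> requests
        self.limited_responses = 0                      # total 429 responses
        self.limited_by_bucket = defaultdict(Counter)   # bucket kind -> key -> 429s

    def record(self, timestamp: float, status: int, buckets: dict):
        """Record one response; buckets maps a kind ("ip", "user") to its key."""
        self.requests_per_second[int(timestamp)] += 1
        if status == 429:
            self.limited_responses += 1
            for kind, key in buckets.items():
                self.limited_by_bucket[kind][key] += 1

    def peak_rps(self) -> int:
        return max(self.requests_per_second.values(), default=0)


metrics = RateLimitMetrics()
metrics.record(100.2, 200, {"ip": "203.0.113.7", "user": "alice"})
metrics.record(100.9, 429, {"ip": "203.0.113.7", "user": "alice"})
metrics.record(101.1, 429, {"ip": "203.0.113.7", "user": "bob"})
print(metrics.peak_rps(), metrics.limited_responses)
print(dict(metrics.limited_by_bucket["ip"]))
```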