A11 · Intermediate

Rate Limiting Design

30 min · When designing new APIs

Format: Design a rate limiting strategy for a given scenario.

Scenario: You have an AI chat API and need to design rate limiting.

Decisions to Make:

  1. Limiting dimension: By user? By IP? By API key?
  2. Limiting window: Per minute? Per hour? Per day?
  3. Limit amounts: How many for free users? How many for paid users?
  4. Response when exceeded: Which HTTP status code and error body to return?
  5. Quota visibility: How do users learn their remaining quota? (Response headers? Separate API?)

Reference Design:

Free users: 10 requests per minute, 100 requests per day
Basic paid: 60 requests per minute, 5000 requests per day
Pro paid: 300 requests per minute, 50000 requests per day
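
One way to enforce these tiers is a fixed-window counter, sketched below. This is a minimal in-memory illustration, not a production design: the names FixedWindowLimiter, Decision, and TIER_LIMITS are hypothetical, the limiting key is assumed to be a user ID or API key, and windows are assumed to align to clock minutes and UTC days. A real deployment would more likely keep the counters in a shared store and expire old windows.

```python
import time
from typing import NamedTuple

# Limits copied from the reference design: (per-minute, per-day).
TIER_LIMITS = {
    "free": (10, 100),
    "basic": (60, 5000),
    "pro": (300, 50000),
}

MINUTE = 60
DAY = 86400


class Decision(NamedTuple):
    allowed: bool
    limit: int        # limit of the window being reported
    remaining: int    # requests left in that window
    reset_at: float   # epoch seconds when that window ends


class FixedWindowLimiter:
    """Hypothetical in-memory fixed-window counter (old windows are never evicted)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._counts = {}  # (key, window_seconds, window_index) -> request count

    def check(self, key, tier):
        now = self._clock()
        per_minute, per_day = TIER_LIMITS[tier]
        slots = []
        # First pass: reject without counting if any window is already full.
        for size, limit in ((MINUTE, per_minute), (DAY, per_day)):
            index = int(now // size)
            slot = (key, size, index)
            if self._counts.get(slot, 0) >= limit:
                return Decision(False, limit, 0, (index + 1) * size)
            slots.append((slot, limit, (index + 1) * size))
        # Second pass: record the request in every window.
        for slot, _, _ in slots:
            self._counts[slot] = self._counts.get(slot, 0) + 1
        # Report the window with the least quota left.
        slot, limit, reset_at = min(slots, key=lambda s: s[1] - self._counts[s[0]])
        return Decision(True, limit, limit - self._counts[slot], reset_at)
```

Fixed windows are simple but allow a burst of up to twice the limit across a window boundary; a sliding window or token bucket smooths that out at the cost of more state.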

Exceeded limit response:
HTTP 429 Too Many Requests
{
  "error": "rate_limit_exceeded",
  "message": "Too many requests, please try again later",
  "retry_after_seconds": 30,
  "limit": 10,
  "remaining": 0,
  "reset_at": "2026-02-16T12:00:00Z"
}
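
The response above could be assembled as in the following sketch, which also touches on decision 5 by repeating the quota in response headers. The function name build_rate_limit_response is hypothetical. Retry-After is a standard HTTP header; the X-RateLimit-* names are a widely used convention rather than a standard, and are an assumption here, not part of the reference design.

```python
import json
import math
from datetime import datetime, timezone


def build_rate_limit_response(limit, reset_at_epoch, now_epoch):
    """Return (status, headers, body) for a request that exceeded its limit."""
    # Round up so clients never retry before the window has actually reset.
    retry_after = max(0, math.ceil(reset_at_epoch - now_epoch))
    reset_iso = datetime.fromtimestamp(reset_at_epoch, tz=timezone.utc).strftime(
        "%Y-%m-%dT%H:%M:%SZ"
    )
    body = {
        "error": "rate_limit_exceeded",
        "message": "Too many requests, please try again later",
        "retry_after_seconds": retry_after,
        "limit": limit,
        "remaining": 0,
        "reset_at": reset_iso,
    }
    headers = {
        "Content-Type": "application/json",
        "Retry-After": str(retry_after),           # standard HTTP header
        "X-RateLimit-Limit": str(limit),           # conventional, assumed names
        "X-RateLimit-Remaining": "0",
        "X-RateLimit-Reset": str(int(reset_at_epoch)),
    }
    return 429, headers, json.dumps(body)
```

Sending the same quota headers on successful responses as well lets clients slow down before they hit the limit, without a separate quota API.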

Discussion Questions:

  • What if a user rotates through multiple IPs to bypass the limit?
  • If limits are too strict, will the experience of normal users suffer?
  • If limits are too loose, will malicious users abuse the API?
