To ensure service stability, Anyone enforces rate limits on API requests.

Rate limit rules

  • Each API token has a requests-per-minute (RPM) limit
  • Different models may have different rate limits
  • Exceeding the limit results in a 429 Too Many Requests error

What to do when rate limited

When you exceed the limit, the API returns a 429 error with a body like:
{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
How to handle it:
  1. Wait and retry — the Retry-After response header tells you how many seconds to wait
  2. Use exponential backoff — double the delay between each retry (1s → 2s → 4s → 8s)
  3. Control concurrency — reduce the number of simultaneous requests
import time
import openai

def call_with_retry(client, max_retries=3, **kwargs):
    """Call the chat completions API, retrying on 429s with exponential backoff."""
    for i in range(max_retries):
        try:
            return client.chat.completions.create(**kwargs)
        except openai.RateLimitError:
            if i == max_retries - 1:
                raise  # out of retries; surface the original error
            time.sleep(2 ** i)  # exponential backoff: 1s, 2s, 4s, ...
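A fixed exponential schedule ignores the Retry-After header mentioned in step 1. As a sketch (assuming the 429 response's headers are available to you as a dict, e.g. from the SDK's error object or an HTTP response), the wait calculation could prefer the server's hint and fall back to exponential backoff:

```python
def backoff_seconds(headers, attempt):
    """Seconds to wait after a 429: honor Retry-After when present,
    otherwise fall back to exponential backoff (1s, 2s, 4s, ...)."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)  # server told us exactly how long to wait
    return float(2 ** attempt)
```

You would call this in place of the fixed `2 ** i` delay, passing the headers from the rate-limited response.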

Increasing your limit

If the default limit isn’t enough, you can:
  • Top up more credits (higher balance generally means more generous limits)
  • Contact support to request a higher limit
The robust approach is to build retry logic into your client rather than assuming requests will never be rate limited.