To ensure service stability, Anyone enforces rate limits on API requests.
Rate limit rules
- Each token has a requests per minute limit
- Different models may have different rate limits
- Exceeding the limit results in a 429 Too Many Requests error
What to do when rate limited
When you receive a 429 error:
```json
{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
```
How to handle it:
- Wait and retry — the Retry-After response header tells you how many seconds to wait
- Use exponential backoff — double the delay between each retry (1s → 2s → 4s → 8s)
- Control concurrency — reduce the number of simultaneous requests
```python
import time

import openai

def call_with_retry(client, max_retries=3, **kwargs):
    """Call the chat completions endpoint, retrying on 429 errors."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(**kwargs)
        except openai.RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries; let the caller see the 429
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
```
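The backoff above always computes its own delay, but when the server sends a Retry-After header, that value is the better signal. A minimal sketch of combining the two — the helper name `backoff_delay` is illustrative, not part of any SDK:

```python
def backoff_delay(attempt, retry_after=None):
    """Seconds to wait before retry number `attempt` (0-based).

    Prefer the server-supplied Retry-After value when present;
    otherwise fall back to exponential backoff (1s, 2s, 4s, ...).
    """
    if retry_after is not None:
        return float(retry_after)  # header values may arrive as strings
    return float(2 ** attempt)
```

Pass in the parsed Retry-After header from the 429 response when available; with no header, the delays are 1s, 2s, 4s for attempts 0, 1, 2.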
Increasing your limit
If the default limit isn’t enough, you can:
- Top up with more credits (a higher balance generally means more generous limits)
- Contact support to request a higher limit
The robust approach is to build retry logic into your client rather than assuming requests will never be rate limited.