Rate limits
Limits apply at several layers: request rate per key, concurrent generations per account, a system-wide capacity valve, and optional per-key spend caps. Every 429 response carries a Retry-After header — honoring it is the only back-off logic you need.
Limits at a glance
| Layer | Default limit | Scope | On exceed |
|---|---|---|---|
| Request rate | 120 requests / minute (fixed 60-second window) | Per API key; per-key overrides available for volume accounts | 429 ERR_RATE_LIMIT |
| Concurrent generations | 100 in flight (queued or processing) | Per account; higher limits per contract | 429 ERR_QUOTA_EXCEEDED (scope: account) |
| Capacity valve | System-wide daily capacity | All accounts, image models | 429 ERR_QUOTA_EXCEEDED (reason: pool_capacity_exceeded) |
| Poll interval | 1 poll / 5 seconds | Per API key per creation | 429 ERR_RATE_LIMIT |
| Monthly spend cap | Optional USD amount, set per key | Per API key | 402 ERR_INSUFFICIENT_BALANCE (reason: monthly_budget_exceeded) |
Per-key request rate
Each API key may make 120 requests per minute by default, measured over a fixed 60-second window. Exceeding it returns 429 ERR_RATE_LIMIT with details carrying the limit, the window, and seconds until it resets. Higher per-key limits are available for volume accounts.
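To stay under the per-key limit instead of reacting to 429s, a client can count its own requests per window. The following is a minimal sketch, not part of the API: `FixedWindowThrottle` is a hypothetical helper, and it assumes the window starts at the client's first request, since the server's actual window boundaries are not specified here.

```python
import time


class FixedWindowThrottle:
    """Client-side guard that keeps one key under a fixed-window limit.

    Assumption: the window is aligned to the first request this object sees;
    the server's real window boundaries may differ.
    """

    def __init__(self, limit=120, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.window_start = None
        self.count = 0

    def delay_before_next(self):
        """Return 0.0 and reserve a slot, or the seconds left in a full window."""
        now = self.clock()
        if self.window_start is None or now - self.window_start >= self.window:
            # A new window begins: reset the counter.
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return 0.0
        # Window is full: the caller should sleep this long, then ask again.
        return self.window_start + self.window - now
```

Because the client and server windows may not line up, a throttle like this reduces 429s but does not replace handling them.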
Concurrent generations
An account may have up to 100 generations in flight (queued or processing) at once by default. Exceeding it returns 429 ERR_QUOTA_EXCEEDED with details: {"limit": N, "scope": "account"}. Higher concurrency is available per contract.
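One way to respect the concurrency limit is to cap how many generations a client drives at once. This sketch assumes a hypothetical `run_one` callable that submits a generation and blocks until it is no longer queued or processing; it is an illustration, not an official client. It only bounds this process, so other processes on the same account still count toward the limit.

```python
from concurrent.futures import ThreadPoolExecutor


def run_bounded(jobs, run_one, max_in_flight=100):
    """Run run_one(job) for every job with at most max_in_flight active at once.

    run_one is a caller-supplied (hypothetical) function that submits one
    generation and returns only when that generation has finished.
    Results come back in the same order as jobs.
    """
    # The pool size is the concurrency cap: each worker thread owns one
    # generation from submission until completion.
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        return list(pool.map(run_one, jobs))
```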
Capacity valve
A system-wide daily capacity valve protects the image models under peak load. When it trips, requests return 429 ERR_QUOTA_EXCEEDED with details.reason = "pool_capacity_exceeded" and a short retry-after — these are transient.
Monthly spend cap
Keys can carry an optional monthly spend cap — a USD amount (e.g. a $50.00 monthly budget). A request that would exceed it returns 402 ERR_INSUFFICIENT_BALANCE with details.reason = "monthly_budget_exceeded" and the cap value in USD.
Polling interval
Status polling has its own limit: at most one poll every 5 seconds per creation per key — see Job status.
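A polling loop that waits the full interval between calls stays inside this limit. The sketch below is transport-agnostic: `fetch_status` and `is_terminal` are hypothetical caller-supplied callables, because the status endpoint and its terminal states are described on the Job status page rather than here.

```python
import time


def poll_until_done(fetch_status, is_terminal, interval=5.0, max_polls=120,
                    sleep=time.sleep):
    """Poll one creation, pausing `interval` seconds between polls.

    fetch_status() returns the current status; is_terminal(status) says
    whether polling should stop. Both are supplied by the caller.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if is_terminal(status):
            return status
        sleep(interval)  # never poll the same creation sooner than this
    raise TimeoutError(f"no terminal status after {max_polls} polls")
```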
The 429 responses
All 429s use the standard error envelope and carry a Retry-After header (seconds). The details object identifies which limit tripped:
# Per-key request rate
HTTP/1.1 429 Too Many Requests
Retry-After: 17
{"error": {"code": "ERR_RATE_LIMIT", "message": "…",
"details": {"limit": 120, "window": "1m", "retry_after": 17}}}
# Account concurrency
HTTP/1.1 429 Too Many Requests
Retry-After: 2
{"error": {"code": "ERR_QUOTA_EXCEEDED", "message": "…",
"details": {"limit": 100, "scope": "account"}}}
# Capacity valve
HTTP/1.1 429 Too Many Requests
Retry-After: 5
{"error": {"code": "ERR_QUOTA_EXCEEDED", "message": "…",
  "details": {"reason": "pool_capacity_exceeded", "pool": "…", "retry_after": 5}}}
Handling 429s
Read the Retry-After header, wait at least that many seconds (plus a little jitter so parallel workers don't retry in lockstep), and retry:
import random
import time
import requests

BASE = "https://api.rendergrid.io/api/public/v1"
HEADERS = {"Authorization": "Bearer rg_live_xxx"}

def post_with_backoff(path: str, payload: dict, max_attempts: int = 5):
    """POST to the API, retrying on 429 after the Retry-After delay."""
    for attempt in range(max_attempts):
        resp = requests.post(f"{BASE}{path}", headers=HEADERS, json=payload)
        # Stop on anything other than 429, or when no attempts remain.
        if resp.status_code != 429 or attempt == max_attempts - 1:
            return resp
        # Retry-After is given in seconds; fall back to 2 s if it is missing.
        retry_after = int(resp.headers.get("Retry-After", "2"))
        time.sleep(retry_after + random.uniform(0, 1))  # jitter
    return resp

Combine retries with an idempotency key so retried generation requests are never double-charged.
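When a client needs to log or branch on which limit tripped, the error envelope shown under "The 429 responses" can be mapped to a label. This is a sketch based only on the fields documented on this page. The label names are made up for illustration, `rate_limit` covers both the request rate and the poll interval because both return ERR_RATE_LIMIT, and the spend cap is treated as not retryable on the assumption that a retry cannot succeed until the cap is raised or the month rolls over.

```python
def classify_limit_error(status_code, body):
    """Map an error response to (label, retryable).

    body is the parsed JSON error envelope. The labels are illustrative
    names chosen for this sketch, not values defined by the API.
    """
    error = body.get("error", {})
    code = error.get("code")
    details = error.get("details") or {}

    if status_code == 429 and code == "ERR_RATE_LIMIT":
        return ("rate_limit", True)
    if status_code == 429 and code == "ERR_QUOTA_EXCEEDED":
        if details.get("reason") == "pool_capacity_exceeded":
            return ("capacity_valve", True)
        if details.get("scope") == "account":
            return ("account_concurrency", True)
    if status_code == 402 and details.get("reason") == "monthly_budget_exceeded":
        return ("monthly_spend_cap", False)
    return ("unknown", False)
```

A retry loop could call this before sleeping and stop early whenever the second element is False.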