LLM Rate Limit Calculator

Estimate whether an AI app will hit HTTP 429 rate-limit errors, given its RPM and TPM limits, average input and output tokens per request, retry factor, and number of concurrent users.

RPM vs TPM

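Requests-per-minute (RPM) limits cap the number of requests. Tokens-per-minute (TPM) limits cap the combined input and output tokens. With long outputs, an app can exhaust its TPM budget while still well under its RPM limit, which makes TPM the real bottleneck.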
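As a rough illustration of the comparison, here is a minimal Python sketch. It is not this calculator's actual formula. The names Limits, estimate_usage, and binding_limit are hypothetical, and requests_per_user_per_minute is an assumed extra input. The sketch also assumes the retry factor is a simple multiplier on request count and that every token, input or output, counts against a single TPM limit.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    rpm: int  # requests-per-minute limit (assumed to come from your provider)
    tpm: int  # tokens-per-minute limit, input and output combined


def estimate_usage(concurrent_users, requests_per_user_per_minute,
                   avg_input_tokens, avg_output_tokens, retry_factor=1.0):
    """Return estimated (requests per minute, tokens per minute).

    Assumption: retries multiply the request count, and each retried
    request consumes the same number of tokens as the original.
    """
    requests = concurrent_users * requests_per_user_per_minute * retry_factor
    tokens = requests * (avg_input_tokens + avg_output_tokens)
    return requests, tokens


def binding_limit(limits, requests, tokens):
    """Return the limit with the higher utilization and that utilization.

    A utilization above 1.0 means the estimated load exceeds the limit,
    so 429 responses are likely.
    """
    rpm_utilization = requests / limits.rpm
    tpm_utilization = tokens / limits.tpm
    if tpm_utilization > rpm_utilization:
        return "TPM", tpm_utilization
    return "RPM", rpm_utilization


# Illustrative numbers only, not real provider limits.
limits = Limits(rpm=500, tpm=200_000)
requests, tokens = estimate_usage(
    concurrent_users=50,
    requests_per_user_per_minute=2,
    avg_input_tokens=800,
    avg_output_tokens=1200,
    retry_factor=1.25,
)
name, utilization = binding_limit(limits, requests, tokens)
print(f"{name} is the binding limit at {utilization:.0%} utilization")
```

With these made-up numbers, the app uses only a quarter of its request budget but exceeds its token budget, because each request averages 2,000 tokens.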

Plan before launch

Use this calculator with your expected traffic shape, then read the rate limit guide for queueing, retries, streaming, and output token controls.

Read the RPM vs TPM guide