Free tool

LLM API rate limits cheatsheet

RPM and TPM by provider and tier. Bookmark this. Avoid the 429.

Hit a 429? Here’s where each provider’s tiers sit, what you spend to move up, and how to plan around them. Numbers below come from each provider’s public rate-limit docs, focused on the most-used model per provider. Most providers cap on three axes at once — RPM, TPM, and a daily or spend ceiling — so the actual limit you hit is whichever runs out first.
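To see which axis runs out first for your own traffic, convert every cap to requests per minute and take the smallest. This is a minimal sketch, not any provider's official formula: the `Limits` class, the `binding_limit` helper, and the 1,500-token average request size are assumptions made for illustration. The numbers plugged in are the OpenAI Tier 1 row from the table below.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Limits:
    rpm: int                    # requests per minute
    tpm: int                    # tokens per minute
    rpd: Optional[int] = None   # requests per day, if the tier has a daily cap


def binding_limit(limits: Limits, avg_tokens_per_request: int) -> tuple[str, float]:
    """Return (axis that runs out first, sustained requests per minute)."""
    candidates = {
        "RPM": float(limits.rpm),
        # A token budget supports this many average-sized requests per minute.
        "TPM": limits.tpm / avg_tokens_per_request,
    }
    if limits.rpd is not None:
        # Spread the daily request cap evenly over 24 hours.
        candidates["daily"] = limits.rpd / (24 * 60)
    axis = min(candidates, key=lambda name: candidates[name])
    return axis, candidates[axis]


# Assumed workload: 1,500 tokens per request (prompt + completion).
axis, rate = binding_limit(Limits(rpm=500, tpm=30_000, rpd=200), 1_500)
print(f"{axis} binds at {rate:.2f} requests/min")
```

With those inputs the daily cap is the one that binds for sustained traffic. Drop `rpd` and TPM takes over at 20 requests per minute, far below the headline 500 RPM.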

OpenAI

Showing limits for GPT-4o. o1 / o3 models have lower per-model RPM caps (e.g., o1 starts at 500 RPM on Tier 1). Mini variants get higher TPM.

Tier	Qualification	RPM	TPM	Daily limit
Tier 1	$5 spent + 7 days	500	30,000	200 requests
Tier 2	$50 spent + 7 days	5,000	450,000	—
Tier 3	$100 spent + 7 days	5,000	800,000	—
Tier 4	$250 spent + 14 days	10,000	2,000,000	—
Tier 5	$1,000 spent + 30 days	30,000	30,000,000	—

Anthropic

Showing limits for Claude Sonnet 4.6. Opus has half the TPM at the same tier. Haiku has 2× TPM.

Tier	Qualification	RPM	TPM	Daily limit	Notes
Build Tier 1	Credit card added	50	40,000	—
Build Tier 2	$40 spent + 7 days	1,000	80,000	—
Build Tier 3	$200 spent + 7 days	2,000	160,000	—
Build Tier 4	$400 spent + 14 days	4,000	400,000	—
Scale	Sales contract	—	—	—	Custom limits, negotiated.

Google

Showing limits for Gemini 2.0 Flash. Gemini 1.5 Pro has lower RPM (300 on Tier 1).

Tier	Qualification	RPM	TPM	Daily limit	Notes
Free	Google account	15	1,000,000	1,500 requests	Free-tier data is used for training by default.
Tier 1	Billing enabled	2,000	4,000,000	—
Tier 2	$250 spent + 30 days	10,000	10,000,000	—

Groq

Showing limits for Llama 3.3 70B.

Tier	Qualification	RPM	TPM	Daily limit	Notes
Free	Sign up	30	6,000	14,400 requests
Pay-as-you-go	Billing enabled	1,000	300,000	—

DeepSeek

Showing limits for DeepSeek V3 (deepseek-chat). No published per-tier limits; concurrency-based.

Tier	Qualification	RPM	TPM	Daily limit	Notes
Pay-as-you-go	Billing enabled	—	—	—	DeepSeek doesn't publish RPM/TPM — concurrency-controlled. Expect ~60 concurrent requests.
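When the cap is on concurrency rather than RPM or TPM, the client-side fix is to bound how many requests are in flight at once. Here is a minimal sketch using `asyncio.Semaphore`. `run_bounded` and `fake_call` are hypothetical names, `fake_call` stands in for a real API request, and the limit of 60 is just the rough figure from the row above, not a documented value.

```python
import asyncio

MAX_CONCURRENT = 60  # assumption: the ~60 figure quoted in the table row above


async def run_bounded(factories, limit=MAX_CONCURRENT):
    """Run coroutine factories with at most `limit` in flight.

    Results come back in the same order as `factories`.
    """
    sem = asyncio.Semaphore(limit)

    async def guarded(factory):
        async with sem:          # waits here while `limit` calls are already running
            return await factory()

    return await asyncio.gather(*(guarded(f) for f in factories))


async def fake_call(i):
    # Hypothetical stand-in for a real request to the provider.
    await asyncio.sleep(0.01)
    return i * 2


results = asyncio.run(run_bounded([lambda i=i: fake_call(i) for i in range(200)]))
print(len(results))
```

Passing factories (callables that create the coroutine) instead of coroutines keeps each request from starting until a slot is free.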

Mistral

Showing limits for Mistral Large 2.

Tier	Qualification	RPM	TPM	Daily limit	Notes
Free (Experiment)	Sign up	1	500,000	—	Monthly cap: 1B tokens.
Production	Billing enabled	200	2,000,000	—

xAI

Showing limits for Grok-3.

Tier	Qualification	RPM	TPM	Daily limit	Notes
Pay-as-you-go	Billing enabled	60	240,000	—	Higher tiers negotiated.

How to avoid rate limits in production

Back off exponentially on 429s, and honor the Retry-After header when the provider sends one; most providers use it to tell you how many seconds to wait.
Rotate across multiple API keys (and providers). Tokenwise can do this automatically with a fallback rule.
Set up cross-provider fallback (Anthropic → OpenAI → Gemini) so a single provider’s outage or rate limit doesn’t take you down.
Batch what you can: OpenAI, Anthropic, and Mistral offer batch endpoints at 50% of the cost, with results returned within a 24-hour window.
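The first and third items above can be sketched in a few lines. This is an illustration under stated assumptions, not how Tokenwise or any provider SDK implements it. Each `send` callable is a hypothetical wrapper around your own HTTP call that returns `(status_code, headers, body)`, and the sketch only reads Retry-After when it is given in seconds.

```python
import random
import time


class RateLimited(Exception):
    """Raised when a request is still getting 429s after all retries."""


def retry_delay(attempt, headers, base=1.0, cap=60.0):
    """Seconds to wait before retrying; `attempt` is 0 for the first retry."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))  # the server's hint wins
        except ValueError:
            pass  # Retry-After can also be an HTTP date; fall back to backoff
    # Exponential backoff with full jitter: uniform in [0, min(cap, base * 2^attempt)].
    return random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_backoff(send, max_retries=5, sleep=time.sleep):
    """Call `send()` until it stops returning 429 or retries run out."""
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429:
            return body  # only 429 is handled here; other errors are the caller's job
        if attempt < max_retries:
            sleep(retry_delay(attempt, headers))
    raise RateLimited(f"still rate limited after {max_retries} retries")


def call_with_fallback(senders, **kwargs):
    """Try each provider's `send` in order, moving on when one stays rate limited."""
    for send in senders:
        try:
            return call_with_backoff(send, **kwargs)
        except RateLimited:
            continue
    raise RateLimited("every provider stayed rate limited")
```

A call then looks like `call_with_fallback([send_anthropic, send_openai, send_gemini], max_retries=3)`, where the three `send_*` functions are yours to write. The jitter matters once many workers hit a 429 at the same moment: without it they all retry in lockstep and trip the limit again.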

Source: provider docs, last verified May 24, 2026. Limits change — check the provider’s page for canonical numbers, or pull the same data from /api/llm-prices.json.

Tokenwise fallbacks route around rate limits automatically — try it.