Free tool

LLM pricing comparison

Every major model, per-million-token pricing, context window, modalities — sortable and bookmarkable.

Model	Provider	Input ($/1M tokens)	Output ($/1M tokens)	Input tokens per $1	Context	Modalities	Notes
Llama 3.1 8B (Groq)	Groq	$0.050	$0.080	20.0M	128K	text	Lightning-fast on Groq's LPU hardware.
Gemini 1.5 Flash	Google	$0.075	$0.30	13.3M	1M	text, vision, audio	
Gemini 2.0 Flash	Google	$0.10	$0.40	10.0M	1M	text, vision, audio	
GPT-4o mini	OpenAI	$0.15	$0.60	6.7M	128K	text, vision	
Mistral Small 3	Mistral	$0.20	$0.60	5.0M	32K	text	
DeepSeek V3	DeepSeek	$0.27	$1.10	3.7M	128K	text	Off-peak (UTC 16:30–00:30) is 50% cheaper.
GPT-4.1 mini	OpenAI	$0.40	$1.60	2.5M	1M	text, vision	
DeepSeek R1	DeepSeek	$0.55	$2.19	1.8M	128K	text	Reasoning model; outputs include chain-of-thought.
Llama 3.3 70B (Groq)	Groq	$0.59	$0.79	1.7M	128K	text	Open-weight model. Pricing and speed depend on host.
Claude Haiku 4.5	Anthropic	$0.80	$4	1.3M	200K	text, vision	
o3-mini	OpenAI	$1.10	$4.40	909K	200K	text	
Gemini 1.5 Pro	Google	$1.25	$5	800K	2M	text, vision, audio	2M context, the largest of any production model.
GPT-4.1	OpenAI	$2	$8	500K	1M	text, vision	
Grok-2	xAI	$2	$10	500K	128K	text, vision	
Mistral Large 2	Mistral	$2	$6	500K	128K	text, vision	
GPT-4o	OpenAI	$2.50	$10	400K	128K	text, vision	
Claude Sonnet 4.6	Anthropic	$3	$15	333K	200K	text, vision	
Claude 3.5 Sonnet	Anthropic	$3	$15	333K	200K	text, vision	Older but still popular for cost-stability reasons.
Grok-3	xAI	$3	$15	333K	1M	text, vision	
o1	OpenAI	$15	$60	67K	200K	text	Reasoning tokens billed but hidden from output.
Claude Opus 4.7	Anthropic	$15	$75	67K	1M	text, vision	Cache write 1.25× input price. Extended thinking optional. JSON via prefill or tool-use trick.

Last verified May 24, 2026 · 21 models shown · Data also available as JSON.

Source: provider pricing pages, May 2026. Prices can change — verify on each provider’s site for production use.
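To turn the per-million-token prices into a per-request cost, the arithmetic can be sketched as below. This is a minimal illustration, not part of the tool: the `PRICES` dictionary and both function names are hypothetical, and the three prices are copied from the table as listed on this page. It also assumes the table column with values such as 20.0M and 6.7M is input tokens per $1, which matches 1,000,000 divided by the input price.

```python
# Hypothetical lookup: USD per one million tokens as (input, output),
# copied from the table on this page. Verify before production use.
PRICES = {
    "GPT-4o mini": (0.15, 0.60),
    "DeepSeek V3": (0.27, 1.10),
    "Claude Sonnet 4.6": (3.00, 15.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed per-million prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def tokens_per_dollar(price_per_million: float) -> float:
    """How many tokens $1 buys at a given per-million-token price."""
    return 1_000_000 / price_per_million


# 2,000 input tokens and 500 output tokens on GPT-4o mini.
print(f"{request_cost('GPT-4o mini', 2_000, 500):.6f}")  # 0.000600
```

The sketch ignores discounts and surcharges mentioned in the notes (off-peak pricing, cache writes, hidden reasoning tokens), so treat the result as a baseline estimate.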
