LLM Token Cost Calculator
Paste a prompt, get the token count and monthly cost across every major model. No signup, no email harvest.
| Model | Input / 1M | Output / 1M | Per request | Monthly ↑ |
|---|---|---|---|---|
Llama 3.1 8B (Groq) Groq | $0.050 | $0.080 | <$0.0001 | $0.021 |
Gemini 1.5 Flash Google | $0.075 | $0.30 | <$0.0001 | $0.032 |
Gemini 2.0 Flash Google | $0.10 | $0.40 | <$0.0001 | $0.042 |
GPT-4o mini OpenAI | $0.15 | $0.60 | <$0.0001 | $0.063 |
Mistral Small 3 Mistral | $0.20 | $0.60 | <$0.0001 | $0.084 |
DeepSeek V3 DeepSeek | $0.27 | $1.10 | <$0.0001 | $0.113 |
GPT-4.1 mini OpenAI | $0.40 | $1.60 | <$0.0001 | $0.168 |
DeepSeek R1 DeepSeek | $0.55 | $2.19 | <$0.0001 | $0.231 |
Llama 3.3 70B (Groq) Groq | $0.59 | $0.79 | <$0.0001 | $0.248 |
Claude Haiku 4.5 Anthropic | $0.80 | $4.00 | <$0.0001 | $0.336 |
o3-mini OpenAI | $1.10 | $4.40 | <$0.0001 | $0.462 |
Gemini 1.5 Pro Google | $1.25 | $5.00 | <$0.0001 | $0.525 |
GPT-4.1 OpenAI | $2.00 | $8.00 | <$0.0001 | $0.840 |
Grok-2 xAI | $2.00 | $10.00 | <$0.0001 | $0.840 |
Mistral Large 2 Mistral | $2.00 | $6.00 | <$0.0001 | $0.840 |
GPT-4o OpenAI | $2.50 | $10.00 | $0.0001 | $1.05 |
Claude Sonnet 4.6 Anthropic | $3.00 | $15.00 | $0.0001 | $1.26 |
Claude 3.5 Sonnet Anthropic | $3.00 | $15.00 | $0.0001 | $1.26 |
Grok-3 xAI | $3.00 | $15.00 | $0.0001 | $1.26 |
o1 OpenAI | $15.00 | $60.00 | $0.0006 | $6.30 |
Claude Opus 4.7 Anthropic | $15.00 | $75.00 | $0.0006 | $6.30 |
Token count uses the ceil(chars / 4) heuristic — accurate to within ±10% for English prose across cl100k, o200k, and Claude tokenizers. Code and non-Latin scripts tokenize denser; expect 10–25% more tokens than shown.
Per-request cost = (effective input price × input tokens + output price × output tokens) ÷ 1,000,000. With caching on, effective input price = input × (1 − hit) + cached × hit. Models that don’t support caching (Grok, Mistral, Llama via Groq) use the full input price regardless of the toggle.
Source: provider pricing pages, last verified May 24, 2026. Same data as JSON.
Want this tracked on real production traffic? Try Tokenwise— one line of code, <50 ms overhead, $19/mo.