Cost comparison
// sorted cheapest → most expensive
Pricing changes. Stay on top of it.
Get a short email every time a major model changes price, releases, or gets deprecated. No spam — only when it matters for your bill.
Compare token pricing across Claude, GPT, Gemini and Grok. Set your workload — see real numbers per request, day, and month. No fluff.
A token is roughly 3-4 characters of English text, or about ¾ of a word. The sentence "Hello, how are you?" is about 7 tokens; the exact count depends on the model's tokenizer. Code, non-English languages, and unusual symbols use more tokens per character. As a rough rule: 1,000 tokens ≈ 750 words.
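The rule of thumb above can be turned into a quick estimator. This is a minimal sketch, not a real tokenizer: the function names and the default ratios (4 characters per token, 0.75 words per token) are assumptions taken from the rough figures in the text, and actual counts will differ by model.

```python
def estimate_tokens_from_chars(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (assumed ~4 chars per token)."""
    return round(len(text) / chars_per_token)


def estimate_tokens_from_words(word_count: int, words_per_token: float = 0.75) -> int:
    """Rough token estimate from word count (assumed ~0.75 words per token)."""
    return round(word_count / words_per_token)


if __name__ == "__main__":
    # A 750-word article comes out at about 1,000 tokens under this heuristic.
    print(estimate_tokens_from_words(750))
    # Character-based estimate for a short prompt; treat it as a ballpark only.
    print(estimate_tokens_from_chars("Summarize the following support ticket."))
```

For billing-grade numbers, use the provider's own token counter; a heuristic like this is only good for sizing a workload before you run it.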
Generation is computationally heavier than reading. Input tokens are processed in parallel; output tokens are generated one at a time, sequentially. Across all major providers, output is 3-6x more expensive than input. This is why long-output workloads (writing, code generation) cost much more than long-input ones (summarization, classification).
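To see how the input/output price gap plays out, here is a minimal sketch of the per-request arithmetic. The prices are hypothetical placeholders (2 and 10 USD per million tokens, a 5x ratio), not any provider's real rates, and `request_cost`, the token counts, and the request volume are illustrative assumptions.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD of one request, given prices per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Hypothetical prices in USD per million tokens; output is 5x input here.
INPUT_PRICE = 2.0
OUTPUT_PRICE = 10.0

# Same total token count, opposite shape.
summarization = request_cost(10_000, 500, INPUT_PRICE, OUTPUT_PRICE)  # long input
generation = request_cost(500, 10_000, INPUT_PRICE, OUTPUT_PRICE)     # long output

# Scale one request up to a day and a 30-day month (assumed 1,000 requests/day).
REQUESTS_PER_DAY = 1_000
daily = generation * REQUESTS_PER_DAY
monthly = daily * 30

if __name__ == "__main__":
    print(f"summarization: ${summarization:.4f} per request")
    print(f"generation:    ${generation:.4f} per request")
    print(f"generation:    ${daily:.2f} per day, ${monthly:.2f} per month")
```

With these placeholder prices, both requests move 10,500 tokens, but the long-output one costs about four times as much.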
Only if it's good enough for your task. The 90/10 rule applies: cheap models (Haiku 4.5, GPT-5.4 Mini, Gemini Flash, Grok 4.1 Fast) handle classification, routing, extraction and summarization well. For complex reasoning, coding, or anything user-facing, the price jump to Sonnet 4.6 or GPT-5.4 usually pays for itself in fewer retries and better outputs.
Both can dramatically lower your bill. Prompt caching (system prompts, long documents) cuts cached input cost by up to 90% on Anthropic, OpenAI and Gemini. Batch API gives a flat 50% discount on non-urgent workloads. This calculator shows standard rates. With these enabled, your real cost can be 50-95% lower: 50% from batch alone, and up to 95% on cached input where the two discounts stack (0.5 × 0.1 = 0.05 of the standard rate).
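The effect of both discounts can be sketched as follows. This is an illustration under stated assumptions, not any provider's billing formula: the function name, the placeholder prices, and the 80% cached fraction are hypothetical, the sketch assumes the batch discount applies to the whole request and stacks with caching, and it ignores any surcharge a provider may apply when a cache entry is first written.

```python
def discounted_cost(input_tokens: int, output_tokens: int,
                    input_price_per_m: float, output_price_per_m: float,
                    cached_fraction: float = 0.0, cache_discount: float = 0.90,
                    use_batch: bool = False, batch_discount: float = 0.50) -> float:
    """Cost in USD of one request with optional caching and batch discounts."""
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    # Cached input tokens are billed at (1 - cache_discount) of the input price.
    input_cost = (uncached + cached * (1 - cache_discount)) * input_price_per_m
    output_cost = output_tokens * output_price_per_m
    total = (input_cost + output_cost) / 1_000_000
    if use_batch:
        # Assumption: the batch discount applies to the whole request.
        total *= 1 - batch_discount
    return total


# Hypothetical prices in USD per million tokens.
standard = discounted_cost(10_000, 500, 2.0, 10.0)
cached_only = discounted_cost(10_000, 500, 2.0, 10.0, cached_fraction=0.8)
cached_and_batch = discounted_cost(10_000, 500, 2.0, 10.0,
                                   cached_fraction=0.8, use_batch=True)

if __name__ == "__main__":
    for label, cost in [("standard", standard),
                        ("cached", cached_only),
                        ("cached + batch", cached_and_batch)]:
        saving = 1 - cost / standard
        print(f"{label:15s} ${cost:.4f}  ({saving:.0%} lower)")
```

How much you actually save depends on how much of each prompt is a cache hit and on how much of the bill is output, which caching does not discount.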
Pricing here was last verified in May 2026 from official documentation. Major providers change their lineup every few months — new models, deprecations, occasional price cuts. Subscribe to the email above to get notified when something changes.
We focus on the current production tier from the four major providers: Anthropic, OpenAI, Google, and xAI. Open-source models (Llama, DeepSeek, Mistral) and specialty models (Codex, embeddings) are excluded — they need separate calculators because the cost model differs (self-hosting, batch sizes, hosting markup).