PRICES UPDATED MAY 2026
// AI API cost calculator

How much will your AI app actually cost to run?

Compare token pricing across Claude, GPT, Gemini and Grok. Set your workload — see real numbers per request, day, and month. No fluff.

Cost comparison

// sorted cheapest → most expensive
Questions, answered

What's a token, really?

A token is roughly 3-4 characters of English text, or about ¾ of a word. The sentence "Hello, how are you?" comes to about 6 tokens in common tokenizers; the exact count depends on the model. Code, non-English languages, and unusual symbols use more tokens per character. As a rough rule: 1,000 tokens ≈ 750 words.
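The rules of thumb above can be turned into a quick estimator. This is a minimal sketch: the function names and the 4-characters-per-token default are assumptions for illustration, and it does not replace a provider's real tokenizer, which is the only way to get exact counts.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token count from character length (English prose only).

    chars_per_token=4.0 is the upper end of the 3-4 character rule of thumb;
    code and non-English text will usually run higher than this estimate.
    """
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))


def words_to_tokens(words: int, words_per_token: float = 0.75) -> int:
    """Convert a word count to tokens using the '1 token is about 3/4 of a word' rule."""
    return round(words / words_per_token)


# A 750-word article is roughly 1,000 tokens under this rule.
print(words_to_tokens(750))
```

Short strings with lots of punctuation tend to come out above the estimate, so treat the result as a budgeting figure, not a billing figure.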

Why is output more expensive than input?

Generation is computationally heavier than reading. Input tokens are processed in parallel in a single pass; output tokens are generated one at a time, and each one needs its own forward pass through the model. Across the major providers, an output token typically costs 3-6x as much as an input token. This is why long-output workloads (writing, code generation) cost much more than long-input ones (summarization, classification).
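To see what the input/output split does to a bill, here is a small sketch of the per-request arithmetic. The `Pricing` class, the function names, and the $1.00 / $5.00 per-million-token rates are all placeholders chosen for illustration (a 5x output ratio), not any provider's actual prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-million-token rates in USD. Values used below are placeholders."""
    input_per_mtok: float
    output_per_mtok: float


def request_cost(p: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request in USD at standard (undiscounted) rates."""
    return (input_tokens * p.input_per_mtok
            + output_tokens * p.output_per_mtok) / 1_000_000


def monthly_cost(per_request: float, requests_per_day: int, days: int = 30) -> float:
    """Scale a per-request cost to a month, assuming a flat daily volume."""
    return per_request * requests_per_day * days


# Hypothetical model with a 5x output/input ratio, NOT a real price list.
EXAMPLE = Pricing(input_per_mtok=1.00, output_per_mtok=5.00)

# Same total token count (4,400), opposite shapes.
summarize = request_cost(EXAMPLE, input_tokens=4_000, output_tokens=400)
generate = request_cost(EXAMPLE, input_tokens=400, output_tokens=4_000)

print(f"summarize: ${summarize:.4f}/request")
print(f"generate:  ${generate:.4f}/request")
print(f"generate at 1,000 requests/day: ${monthly_cost(generate, 1_000):.2f}/month")
```

With these placeholder rates the long-output request costs $0.0204 against $0.0060 for the long-input one: 3.4x more for the same number of tokens.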

Should I just use the cheapest model?

Only if it's good enough for your task. The 90/10 rule applies: cheap models (Haiku 4.5, GPT-5.4 Mini, Gemini Flash, Grok 4.1 Fast) handle classification, routing, extraction and summarization well. For complex reasoning, coding, or anything user-facing, the price jump to Sonnet 4.6 or GPT-5.4 usually pays for itself in fewer retries and better outputs.
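The "pays for itself in fewer retries" claim is easy to sanity-check. The sketch below assumes you retry until you get an acceptable answer and that attempts are independent, so the expected number of attempts is 1 / success rate. The costs and success rates are made-up inputs; measure your own on a sample of real tasks.

```python
def expected_cost_per_success(cost_per_attempt: float, success_rate: float) -> float:
    """Expected spend to get one acceptable answer when retrying until success.

    Assumes independent attempts, so expected attempts = 1 / success_rate.
    Ignores latency and the cost of detecting a bad answer.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


# Hypothetical numbers: a cheap model that often fails a hard task
# versus a pricier model that usually gets it right.
cheap = expected_cost_per_success(cost_per_attempt=0.002, success_rate=0.25)
strong = expected_cost_per_success(cost_per_attempt=0.006, success_rate=0.95)

print(f"cheap model:  ${cheap:.5f} per accepted answer")
print(f"strong model: ${strong:.5f} per accepted answer")
```

In this made-up case the model that is 3x pricier per call ends up cheaper per accepted answer. On easy tasks, where the cheap model's success rate is already high, the comparison flips back.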

What about prompt caching and batch discounts?

Both can dramatically lower your bill. Prompt caching (for repeated system prompts or long documents) cuts the cost of cached input tokens by up to 90% on Anthropic, OpenAI and Gemini. Batch APIs give a flat 50% discount on non-urgent workloads. This calculator shows standard rates; with these enabled, your real cost can be 50-95% lower.
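Here is a rough sketch of how the two discounts change a single request's cost. The function and its parameters are illustrative assumptions: it treats the cache discount as a flat 90% off cached input tokens, applies the batch discount to the whole request, assumes the two stack, and ignores cache-write surcharges. Check each provider's docs for how (and whether) they actually combine. The $1.00 / $5.00 rates are placeholders again.

```python
def discounted_cost(
    input_tokens: int,
    output_tokens: int,
    input_per_mtok: float,
    output_per_mtok: float,
    cached_fraction: float = 0.0,   # share of input tokens served from cache
    cache_discount: float = 0.90,   # assumed discount on cached input tokens
    batch_discount: float = 0.0,    # 0.50 when sent through a batch API
) -> float:
    """Cost of one request in USD with optional caching and batch discounts."""
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    input_cost = (fresh + cached * (1 - cache_discount)) * input_per_mtok
    output_cost = output_tokens * output_per_mtok
    return (input_cost + output_cost) * (1 - batch_discount) / 1_000_000


# Hypothetical workload: 10,000 input tokens (90% of it a reused prompt), 500 output.
rates = dict(input_per_mtok=1.00, output_per_mtok=5.00)  # placeholder prices
standard = discounted_cost(10_000, 500, **rates)
with_cache = discounted_cost(10_000, 500, cached_fraction=0.9, **rates)
with_both = discounted_cost(10_000, 500, cached_fraction=0.9, batch_discount=0.5, **rates)

for label, cost in [("standard", standard), ("cache", with_cache), ("cache + batch", with_both)]:
    print(f"{label:14s} ${cost:.4f}  ({1 - cost / standard:.1%} saved)")
```

Caching only touches input tokens, so the savings depend heavily on how input-heavy the workload is. Here output is a small share of the bill, which is why the combined saving lands above 80%.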

How often are these prices updated?

Pricing here was last verified in May 2026 against official provider documentation. Major providers change their lineup every few months: new models, deprecations, occasional price cuts. Subscribe to the email updates above to get notified when something changes.

Why isn't model X listed?

We focus on the current production tier from the four major providers: Anthropic, OpenAI, Google, and xAI. Open-source models (Llama, DeepSeek, Mistral) and specialty models (Codex, embeddings) are excluded — they need separate calculators because the cost model differs (self-hosting, batch sizes, hosting markup).