Calculate and compare API pricing across GPT-4o, Claude, Gemini, Llama, Mistral and more. Know your exact AI costs before you build.
Enter your token usage to see the exact cost for any model
Compare pricing, context window, and speed across all major models
| Model | Provider | Input $/1M | Output $/1M | Context | Speed | Best For |
|---|---|---|---|---|---|---|
Choose your use case and see what you'd pay across all models per month
AI APIs charge per token — roughly ¾ of a word. Most models charge separately for input (your prompt) and output (the response). Output tokens are typically 3–5× more expensive than input tokens.
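The per-token arithmetic above can be sketched in a few lines. The prices here are placeholders for illustration, not quotes from any provider:

```python
def api_cost(input_tokens: int, output_tokens: int,
             input_price_per_1m: float, output_price_per_1m: float) -> float:
    """Dollar cost of one request, given separate per-1M-token prices."""
    return (input_tokens * input_price_per_1m
            + output_tokens * output_price_per_1m) / 1_000_000

# Illustrative prices: $3.00/1M input, $15.00/1M output (note the 5x output premium)
cost = api_cost(input_tokens=1_000, output_tokens=500,
                input_price_per_1m=3.00, output_price_per_1m=15.00)
print(f"${cost:.4f}")  # prints $0.0105
```

Even though only 500 output tokens were generated versus 1,000 input tokens, the output side accounts for most of the bill — which is why output-heavy workloads (long generations, verbose responses) cost more than the raw token counts suggest.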
For high-volume applications, Gemini 1.5 Flash, GPT-4o mini, and Claude 3 Haiku offer the best price-to-performance ratio. For complex reasoning, GPT-4o and Claude 3.5 Sonnet lead on quality.
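To see why the budget tier matters at volume, here is a rough monthly comparison. Both the traffic figures and the per-1M prices are hypothetical placeholders; check each provider's current pricing page for real numbers:

```python
# Assumed monthly traffic (hypothetical)
MONTHLY_INPUT_TOKENS = 50_000_000    # 50M input tokens/month
MONTHLY_OUTPUT_TOKENS = 10_000_000   # 10M output tokens/month

# ($/1M input, $/1M output) — placeholder prices for two tiers
models = {
    "budget-tier":  (0.15, 0.60),    # mini/Flash/Haiku-class pricing
    "premium-tier": (2.50, 10.00),   # frontier-model-class pricing
}

for name, (in_price, out_price) in models.items():
    monthly = (MONTHLY_INPUT_TOKENS * in_price
               + MONTHLY_OUTPUT_TOKENS * out_price) / 1_000_000
    print(f"{name}: ${monthly:,.2f}/month")
# prints:
# budget-tier: $13.50/month
# premium-tier: $225.00/month
```

At these assumed prices the gap is over 15x for identical traffic, which is why tier choice dominates every other cost lever at high volume.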
Use prompt caching to reduce input costs by up to 90%. Batch API calls where possible. Route simple queries to cheaper models (Haiku/Flash) and reserve premium models for complex tasks.
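The routing idea can be sketched as a toy function. The tier names, thresholds, and decision rule here are all illustrative assumptions; a production router would classify queries by task type or a cheap model's own judgment, not prompt length alone:

```python
def route(prompt: str, needs_reasoning: bool) -> str:
    """Toy router: send short, simple queries to the cheap tier,
    reserve the premium tier for complex or long-context work."""
    if needs_reasoning or len(prompt) > 2_000:  # hypothetical length cutoff
        return "premium"   # e.g. GPT-4o / Claude 3.5 Sonnet class
    return "budget"        # e.g. Haiku / Flash / mini class

print(route("Summarize this sentence.", needs_reasoning=False))  # prints budget
print(route("Prove this invariant holds.", needs_reasoning=True))  # prints premium
```

Because most real traffic is simple, even a crude rule like this shifts the bulk of requests to the cheap tier while keeping quality where it counts.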