Cheapest AI API 2026: Lowest Cost LLMs Ranked

Every major AI API ranked by cost per token. Updated June 2026 with the latest pricing from OpenAI, Anthropic, Google, Meta, and Mistral.

Cheapest AI APIs Ranked by Input Price (May 2026)

Prices are per million tokens (1M tokens ≈ 750,000 words). Sorted by input token cost, lowest first.

| # | Model | Provider | Input /1M | Output /1M | Best For |
|---|-------|----------|-----------|------------|----------|
| 1 | Gemini 2.5 Flash-Lite | Google | $0.10 | $0.40 | Ultra budget |
| 2 | GPT-4o mini | OpenAI | $0.15 | $0.60 | High volume |
| 2 | Gemini 2.5 Flash | Google | $0.15 | $0.60 | Long context |
| 3 | Llama 3.1 8B | Meta (hosted) | $0.18 | $0.18 | Simple tasks |
| 4 | Mistral 7B | Mistral | $0.25 | $0.25 | EU data |
| 5 | Claude 3.5 Haiku | Anthropic | $0.80 | $4.00 | Quality budget |
| 6 | Llama 3.1 70B | Meta (hosted) | $0.90 | $0.90 | Open source |
| 7 | Gemini 2.5 Pro | Google | $2.00 | $12.00 | Premium tasks |
| 8 | GPT-4o | OpenAI | $2.50 | $10.00 | Balanced |
| 9 | Claude 3.5 Sonnet | Anthropic | $3.00 | $15.00 | Complex tasks |
| 10 | Claude 3 Opus | Anthropic | $15.00 | $75.00 | Max quality |

Key insight: The price gap between the cheapest and most expensive models is 150x on input tokens ($0.10 vs $15.00). For high-volume applications, choosing the right budget model can cut your API bill by more than 90%.

Top 3 Cheapest AI APIs — Detailed Breakdown

🥇 #1: Gemini 2.5 Flash-Lite — $0.10/M input

Google's most affordable model in 2026. Flash-Lite is designed for simple, repetitive tasks at extreme scale. It handles classification, extraction, summarization, and basic Q&A well. Not suitable for complex multi-step reasoning or nuanced writing.

🥈 #2: GPT-4o mini — $0.15/M input

OpenAI's budget model punches well above its price point. GPT-4o mini delivers strong performance on structured tasks, follows instructions reliably, and integrates seamlessly with OpenAI's tools ecosystem (function calling, Assistants API, fine-tuning).

🥈 #2 (tied): Gemini 2.5 Flash — $0.15/M input

Gemini 2.5 Flash costs the same as GPT-4o mini but has a massive 1M token context window, which makes it the best budget choice for long-document processing. Summarizing a 200-page PDF costs less than $0.05 in input tokens.
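
As a rough check on that figure, the sketch below estimates input cost from a page count. The 500 words per page value is an assumption (real PDFs vary widely), the words-per-token ratio follows the 1M tokens ≈ 750,000 words approximation used above, and `estimate_input_cost` is a hypothetical helper name, not part of any provider's SDK.

```python
WORDS_PER_PAGE = 500      # assumption: a fairly dense page of text
WORDS_PER_TOKEN = 0.75    # approximation: 1M tokens ≈ 750,000 words


def estimate_input_cost(pages: int, price_per_million: float) -> float:
    """Return the estimated input cost in USD for a document of `pages` pages."""
    tokens = pages * WORDS_PER_PAGE / WORDS_PER_TOKEN
    return tokens / 1_000_000 * price_per_million


# 200 pages at Gemini 2.5 Flash's $0.15 per 1M input tokens
cost = estimate_input_cost(200, 0.15)
print(f"${cost:.3f}")
```

Under these assumptions the document comes to roughly 133,000 tokens, so the estimate stays well under the $0.05 ceiling even if the real page density is somewhat higher.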

How Much Can You Save by Switching Models?

Assume a customer support chatbot processing 1 million requests/month with 500 input + 200 output tokens each:

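| Model | Input Cost | Output Cost | Monthly Total | vs GPT-4o |
|-------|------------|-------------|---------------|-----------|
| Gemini 2.5 Flash-Lite | $50 | $80 | $130 | -96% |
| GPT-4o mini | $75 | $120 | $195 | -94% |
| Claude 3.5 Haiku | $400 | $800 | $1,200 | -63% |
| GPT-4o | $1,250 | $2,000 | $3,250 | baseline |
| Claude 3.5 Sonnet | $1,500 | $3,000 | $4,500 | +38% |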
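
The arithmetic behind this comparison can be reproduced with a short script. This is a minimal sketch: the prices are copied from the ranking table, the workload matches the chatbot scenario (1 million requests, 500 input and 200 output tokens each), and the names `PRICES`, `monthly_cost`, and `percent_vs` are hypothetical helpers introduced for illustration.

```python
# Prices in USD per 1M tokens as (input, output), taken from the ranking table
PRICES = {
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "GPT-4o mini": (0.15, 0.60),
    "Claude 3.5 Haiku": (0.80, 4.00),
    "GPT-4o": (2.50, 10.00),
    "Claude 3.5 Sonnet": (3.00, 15.00),
}


def monthly_cost(model, requests, input_tokens, output_tokens):
    """Return (input_cost, output_cost, total) in USD for one month."""
    input_price, output_price = PRICES[model]
    input_cost = requests * input_tokens / 1_000_000 * input_price
    output_cost = requests * output_tokens / 1_000_000 * output_price
    return input_cost, output_cost, input_cost + output_cost


def percent_vs(total, baseline_total):
    """Return the percentage difference of `total` relative to `baseline_total`."""
    return (total - baseline_total) / baseline_total * 100


# Use GPT-4o's monthly total as the comparison baseline
baseline = monthly_cost("GPT-4o", 1_000_000, 500, 200)[2]
for model in PRICES:
    inp, out, total = monthly_cost(model, 1_000_000, 500, 200)
    print(f"{model}: ${inp:,.0f} + ${out:,.0f} = ${total:,.0f} "
          f"({percent_vs(total, baseline):+.0f}% vs GPT-4o)")
```

Changing the request count or the per-request token sizes shows how the ranking shifts for output-heavy workloads, where the output price dominates the bill.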

Should You Use Open-Source Models (Llama, Mistral)?

Meta's Llama and Mistral's open-weight models can be significantly cheaper when self-hosted, but running your own infrastructure adds complexity and fixed costs. Through hosted providers, the pricing above is competitive with the cheapest closed-source options.

Self-hosting makes sense if you process more than 50M tokens/month and have engineering capacity. Below that threshold, hosted APIs are simpler and often cheaper when you factor in GPU costs and maintenance.

Tips for Minimizing AI API Costs

Use our free calculator to estimate exactly what your workload will cost across all major models.
