Choosing an AI provider used to be simple: OpenAI was the default. In 2026, there are four major providers, twenty-plus models, and pricing structures that vary wildly. A million output tokens that cost $30.00 on GPT-5.5 cost $0.28 on DeepSeek V4 Flash. This guide puts every model side by side so you can make an informed decision.
All prices below are per million tokens (MTok) in USD, sourced from official provider documentation as of May 2026.
OpenAI — 6 Models
OpenAI offers the widest model range, from the ultra-cheap Nano to the premium GPT-5.5 Pro. Their Batch API provides 50% off all prices for non-real-time workloads.
| Model | Input | Cached | Output | Context |
|---|---|---|---|---|
| GPT-5.5 | $5.00 | $0.50 | $30.00 | 1M |
| GPT-5.5 Pro | $30.00 | — | $180.00 | 1M |
| GPT-5.4 | $2.50 | $0.25 | $15.00 | 1M |
| GPT-5.4 Mini | $0.75 | $0.075 | $4.50 | 1M |
| GPT-5.4 Nano | $0.20 | $0.02 | $1.25 | 128K |
| GPT-5.4 Pro | $21.00 | — | $168.00 | 1M |
Batch API: 50% off all prices for async workloads (24-hour turnaround). Combine with caching for maximum savings.
GPT-5.5 Pro / GPT-5.4 Pro: Extended thinking models designed for the most complex reasoning tasks. No cache pricing available — these are meant for one-shot difficult problems, not high-volume repeated calls.
GPT-5.4 Nano: The budget option. At $0.20 input, it's 25x cheaper than GPT-5.5. Best for classification, extraction, and simple tasks.
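To see how these prices interact, here is a minimal sketch of per-request cost estimation. The price table and `request_cost` helper are illustrative (not part of any official SDK); the prices are copied from the table above.

```python
# Illustrative price table ($/MTok), copied from the OpenAI table above.
OPENAI_PRICES = {
    "gpt-5.5":      {"input": 5.00, "cached": 0.50, "output": 30.00},
    "gpt-5.4-nano": {"input": 0.20, "cached": 0.02, "output": 1.25},
}

def request_cost(model, input_tok, output_tok, cached_tok=0, batch=False):
    """Estimated USD cost of one request. batch=True applies the 50% Batch API discount."""
    p = OPENAI_PRICES[model]
    cost = ((input_tok - cached_tok) * p["input"]
            + cached_tok * p["cached"]
            + output_tok * p["output"]) / 1_000_000
    return cost * 0.5 if batch else cost

# A 5K-token prompt with a 500-token reply:
print(request_cost("gpt-5.5", 5_000, 500))              # 0.04
print(request_cost("gpt-5.5", 5_000, 500, batch=True))  # 0.02
```

The same request on Nano, `request_cost("gpt-5.4-nano", 5_000, 500)`, comes out to about $0.0016, roughly 25x cheaper.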
Anthropic (Claude) — 6 Models
Anthropic's lineup spans from Haiku (fast and cheap) to Opus (the most capable). They're the only provider that charges a premium for cache writes — you pay 25% more the first time, then 90% less on every subsequent read.
| Model | Input | Cache Write | Cache Read | Output | Context |
|---|---|---|---|---|---|
| Opus 4.7 | $5.00 | $6.25 | $0.50 | $25.00 | 1M |
| Opus 4.6 | $5.00 | $6.25 | $0.50 | $25.00 | 1M |
| Opus 4.5 | $5.00 | $6.25 | $0.50 | $25.00 | 1M |
| Sonnet 4.6 | $3.00 | $3.75 | $0.30 | $15.00 | 1M |
| Sonnet 4.5 | $3.00 | $3.75 | $0.30 | $15.00 | 1M |
| Haiku 4.5 | $1.00 | $1.25 | $0.10 | $5.00 | 200K |
Cache write premium: Anthropic charges 1.25x the normal input price on the first cache write, but only 0.1x on every read after that. Despite the premium, caching pays off as soon as the prefix is reused once: two sends of a Haiku prompt cost 1.25 + 0.10 = $1.35 per MTok of prefix versus $2.00 uncached. The only losing case is a prefix that is written but never read again.
Opus 4.5 vs 4.6 vs 4.7: All three are priced identically. Use the latest (4.7) unless you have a specific reason to use an older version. Opus 4.6 and 4.5 remain available for compatibility.
Sonnet 4.5 vs 4.6: Also identically priced. Sonnet 4.6 is the recommended choice for most applications — it's faster and more capable than 4.5.
Haiku 4.5: Anthropic's budget option at $1/$5. Best for high-volume tasks that need Claude's instruction-following quality without Opus-level reasoning.
Google (Gemini) — 6 Models
Google offers the most aggressive pricing on their older models. Gemini 2.5 Flash-Lite at $0.10 input is the cheapest model from any major provider. Their newer Gemini 3.x models are more capable but priced higher.
| Model | Input | Cached | Output | Context |
|---|---|---|---|---|
| Gemini 3.1 Pro | $2.00 | $0.50 | $12.00 | 1M |
| Gemini 3 Deep Think | $4.00 | $1.00 | $24.00 | 1M |
| Gemini 3 Flash | $0.50 | $0.05 | $3.00 | 1M |
| Gemini 2.5 Pro | $1.25 | $0.125 | $10.00 | 1M |
| Gemini 2.5 Flash | $0.30 | $0.03 | $2.50 | 1M |
| Gemini 2.5 Flash-Lite | $0.10 | $0.01 | $0.40 | 1M |
200K threshold: Gemini 3.1 Pro raises its pricing for prompts over 200K tokens: input doubles from $2.00 to $4.00, and output rises 50%, from $12.00 to $18.00. If your prompts regularly exceed 200K, factor this into your cost calculations.
Gemini 2.5 Pro: Also has a 200K threshold — $1.25 input becomes $2.50, $10.00 output becomes $15.00. Same pattern across Google's Pro-tier models.
Free tier: Google is the only provider that offers a genuine free tier for API access (rate-limited). Good for prototyping and low-volume personal projects.
Gemini 2.5 Flash-Lite: At $0.10/$0.40, this is the cheapest model from any major provider. It's limited compared to Flash, but for simple extraction and classification tasks, it's unbeatable on cost.
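Google's long-context surcharge can be folded into a cost estimate. The sketch below is illustrative, and assumes, per the note above, that the higher rate applies to the entire request once the prompt crosses 200K tokens.

```python
def gemini_31_pro_cost(input_tok, output_tok):
    """Estimated USD cost for one Gemini 3.1 Pro request, with the >200K tier."""
    if input_tok > 200_000:   # long-context tier: $4.00 in / $18.00 out
        in_rate, out_rate = 4.00, 18.00
    else:                     # standard tier: $2.00 in / $12.00 out
        in_rate, out_rate = 2.00, 12.00
    return (input_tok * in_rate + output_tok * out_rate) / 1_000_000

print(gemini_31_pro_cost(100_000, 2_000))  # 0.224
print(gemini_31_pro_cost(300_000, 2_000))  # 1.236
```

Note the jump: tripling the prompt size here more than quintuples the cost, because the whole request moves to the higher tier.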
DeepSeek — 2 Models
DeepSeek's strategy is simple: be the cheapest. Their cache read pricing is 98-99% below normal input prices — the most aggressive in the industry. The V4 Pro model is currently at a 75% promotional discount.
| Model | Input | Cache Hit | Output | Context |
|---|---|---|---|---|
| V4 Flash | $0.14 | $0.0028 | $0.28 | 1M |
| V4 Pro* | $0.435 | $0.0036 | $0.87 | 1M |
* V4 Pro: 75% off until May 31, 2026. Original prices: $1.74 input, $3.48 output. Cache hit price reduced to 1/10 of launch price as of April 26, 2026.
Cache pricing: DeepSeek's cache hit price is 1/50th of the normal input price for V4 Flash, and 1/120th for V4 Pro. For agent applications with repeated context, cached tokens are essentially free.
Promotion risk: V4 Pro's 75% discount ends May 31, 2026. After that, it costs $1.74/$3.48 — still cheap, but 4x more expensive. Budget for the post-promotion price if building production systems.
Thinking mode: Both models support thinking mode for complex reasoning. Thinking tokens are billed at output rates.
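To budget for the promotion ending, a quick arithmetic check of the "4x" claim above, per 1M input plus 1M output tokens:

```python
# DeepSeek V4 Pro: cost of 1M input + 1M output tokens at each price point.
promo_price = 0.435 + 0.87  # during the 75%-off promotion
full_price = 1.74 + 3.48    # after May 31, 2026
print(f"{full_price / promo_price:.0f}x")  # 4x
```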
The Full Picture: All 20 Models Ranked by Input Price
Here's every model from all four providers, ranked cheapest to most expensive on input price:
| # | Model | Provider | Input | Output | Ratio |
|---|---|---|---|---|---|
| 1 | Gemini 2.5 Flash-Lite | Google | $0.10 | $0.40 | 4x |
| 2 | DeepSeek V4 Flash | DeepSeek | $0.14 | $0.28 | 2x |
| 3 | GPT-5.4 Nano | OpenAI | $0.20 | $1.25 | 6.3x |
| 4 | Gemini 2.5 Flash | Google | $0.30 | $2.50 | 8.3x |
| 5 | DeepSeek V4 Pro* | DeepSeek | $0.435 | $0.87 | 2x |
| 6 | Gemini 3 Flash | Google | $0.50 | $3.00 | 6x |
| 7 | GPT-5.4 Mini | OpenAI | $0.75 | $4.50 | 6x |
| 8 | Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 5x |
| 9 | Gemini 2.5 Pro | Google | $1.25 | $10.00 | 8x |
| 10 | Gemini 3.1 Pro | Google | $2.00 | $12.00 | 6x |
| 11 | GPT-5.4 | OpenAI | $2.50 | $15.00 | 6x |
| 12 | Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | 5x |
| 13 | Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | 5x |
| 14 | Gemini 3 Deep Think | Google | $4.00 | $24.00 | 6x |
| 15 | GPT-5.5 | OpenAI | $5.00 | $30.00 | 6x |
| 16 | Claude Opus 4.7 | Anthropic | $5.00 | $25.00 | 5x |
| 17 | Claude Opus 4.6 | Anthropic | $5.00 | $25.00 | 5x |
| 18 | Claude Opus 4.5 | Anthropic | $5.00 | $25.00 | 5x |
| 19 | GPT-5.4 Pro | OpenAI | $21.00 | $168.00 | 8x |
| 20 | GPT-5.5 Pro | OpenAI | $30.00 | $180.00 | 6x |
The Ratio column shows output-to-input price multiplier. DeepSeek has the lowest ratio (2x) — their output tokens are proportionally much cheaper than competitors. This matters for applications that generate long responses.
Head-to-Head: Same Tier, Different Providers
Comparing models at similar capability levels reveals the price gaps:
Frontier Models (Best Capability)
| Model | Input | Output | Monthly Cost* |
|---|---|---|---|
| Gemini 3.1 Pro | $2.00 | $12.00 | $4,800 |
| Claude Opus 4.7 | $5.00 | $25.00 | $11,250 |
| GPT-5.5 | $5.00 | $30.00 | $12,000 |
* Based on 10K requests/day for 30 days, with 5K input tokens and 500 output tokens per request (1,500 MTok input and 150 MTok output per month). The same assumptions apply to the mid-tier and budget tables below.
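For reference, the arithmetic behind these monthly figures can be sketched as follows; the `monthly_cost` helper is illustrative, using the footnote's workload.

```python
# 10K requests/day for 30 days, 5K input + 500 output tokens per request.
REQUESTS = 10_000 * 30
INPUT_MTOK = REQUESTS * 5_000 / 1e6   # 1500 MTok of input per month
OUTPUT_MTOK = REQUESTS * 500 / 1e6    # 150 MTok of output per month

def monthly_cost(input_price, output_price):
    """Monthly USD spend given per-MTok prices."""
    return INPUT_MTOK * input_price + OUTPUT_MTOK * output_price

print(monthly_cost(1.25, 10.00))  # Gemini 2.5 Pro -> 3375.0
print(monthly_cost(2.50, 15.00))  # GPT-5.4 -> 6000.0
```

Plug in any input/output price pair from the tables above to re-derive its row, or swap in your own traffic assumptions.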
Mid-Tier Models (Best Balance)
| Model | Input | Output | Monthly Cost* |
|---|---|---|---|
| Gemini 2.5 Pro | $1.25 | $10.00 | $3,375 |
| GPT-5.4 | $2.50 | $15.00 | $6,000 |
| Claude Sonnet 4.6 | $3.00 | $15.00 | $6,750 |
Budget Models (Cheapest)
| Model | Input | Output | Monthly Cost* |
|---|---|---|---|
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | $210 |
| DeepSeek V4 Flash | $0.14 | $0.28 | $252 |
| GPT-5.4 Nano | $0.20 | $1.25 | $488 |
| Claude Haiku 4.5 | $1.00 | $5.00 | $2,250 |
Cached Token Pricing: Who Saves the Most?
If your application sends the same system prompt or context repeatedly, cache pricing is more important than normal input pricing. Here's how providers compare on cache read rates:
| Provider | Best Cache Rate | Model | Savings vs Normal |
|---|---|---|---|
| DeepSeek | $0.0028 | V4 Flash | 98% |
| DeepSeek | $0.0036 | V4 Pro | 99% |
| Google | $0.01 | 2.5 Flash-Lite | 90% |
| OpenAI | $0.02 | GPT-5.4 Nano | 90% |
| Anthropic | $0.10 | Haiku 4.5 | 90% |
All providers offer roughly 90% savings on cached tokens — except DeepSeek, which offers 98-99%. For high-volume agent applications, DeepSeek's cache pricing makes repeated tokens essentially free.
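A blended input price as a function of cache hit rate makes this concrete. The sketch below is illustrative: the 80% hit rate is an assumed figure, not a measurement, and Anthropic's 1.25x cache write premium is not modeled.

```python
def effective_input_price(normal, cached, hit_rate):
    """Blended $/MTok of input when `hit_rate` of input tokens are cache hits."""
    return normal * (1 - hit_rate) + cached * hit_rate

# Assumed 80% hit rate; prices from the table above.
# (Anthropic's 1.25x cache write premium is not modeled here.)
for name, normal, cached in [
    ("DeepSeek V4 Flash", 0.14, 0.0028),
    ("Gemini 2.5 Flash-Lite", 0.10, 0.01),
    ("GPT-5.4 Nano", 0.20, 0.02),
    ("Claude Haiku 4.5", 1.00, 0.10),
]:
    print(f"{name}: ${effective_input_price(normal, cached, 0.8):.4f}/MTok")
```

At this hit rate, V4 Flash's blended input price drops to roughly $0.03/MTok, undercutting even Flash-Lite's uncached rate.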
Provider Strengths Summary
Choose OpenAI when:
- You need the widest model selection (6 tiers from Nano to Pro)
- Your workload can use the Batch API for 50% savings
- You're already in the OpenAI ecosystem
- You need extended reasoning (GPT-5.5 Pro / GPT-5.4 Pro)
Choose Anthropic when:
- Accuracy and low hallucination are critical
- You're building code generation or developer tools
- You need strong instruction following and agentic capabilities
- You want consistent pricing across model generations
Choose Google when:
- Cost is the primary concern (cheapest models available)
- You need a free tier for prototyping
- You need strong multilingual support
- You want 1M context on budget models (Flash-Lite and Flash)
Choose DeepSeek when:
- You need the absolute lowest cost per token
- Your application has high cache hit rates (agent workflows)
- You're building high-volume, cost-sensitive applications
- You want the lowest output-to-input price ratio (2x vs 5-8x)
Stay Current
AI API pricing changes frequently. Providers drop prices, launch promotions, and release new models every few months. Bookmark this page — we update it regularly as rates change. For the most current pricing, always check the provider's official pricing page:
- OpenAI: openai.com/api/pricing
- Anthropic: anthropic.com/pricing
- Google: ai.google.dev/pricing
- DeepSeek: api-docs.deepseek.com/quick_start/pricing
Pricing sourced from official provider documentation (May 11, 2026). DeepSeek V4 Pro pricing reflects the 75% promotional discount valid until May 31, 2026. Google's Gemini 2.5 Pro and 3.1 Pro have higher pricing for prompts exceeding 200K tokens. All figures reflect standard API pricing without volume discounts.