Choosing an AI provider used to be simple: OpenAI was the default. In 2026, there are four major providers, twenty-plus models, and pricing structures that vary wildly. A prompt that costs $30 on GPT-5.5 might cost $0.28 on DeepSeek V4 Flash. This guide puts every model side by side so you can make an informed decision.

All prices below are per million tokens (MTok) in USD, sourced from official provider documentation as of May 2026.

OpenAI — 6 Models

OpenAI offers the widest model range, from the ultra-cheap Nano to the premium GPT-5.5 Pro. Their Batch API provides 50% off all prices for non-real-time workloads.

Model Input Cached Output Context
GPT-5.5 $5.00 $0.50 $30.00 1M
GPT-5.5 Pro $30.00 — $180.00 1M
GPT-5.4 $2.50 $0.25 $15.00 1M
GPT-5.4 Mini $0.75 $0.075 $4.50 1M
GPT-5.4 Nano $0.20 $0.02 $1.25 128K
GPT-5.4 Pro $21.00 — $168.00 1M
OpenAI Notes

Batch API: 50% off all prices for async workloads (24-hour turnaround). Combine with caching for maximum savings.

GPT-5.5 Pro / GPT-5.4 Pro: Extended thinking models designed for the most complex reasoning tasks. No cache pricing available — these are meant for one-shot difficult problems, not high-volume repeated calls.

GPT-5.4 Nano: The budget option. At $0.20 input, it's 25x cheaper than GPT-5.5. Best for classification, extraction, and simple tasks.
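The note above says the Batch API discount can be combined with caching. Here is a rough sketch of the effective per-MTok input price for GPT-5.4 under that combination; the helper name is illustrative, and it assumes the 50% batch discount applies uniformly to both normal and cached input tokens:

```python
# Effective OpenAI input price per MTok when stacking the Batch API's
# 50% discount with prompt caching. The helper name and the assumption
# that the batch discount applies to cached tokens too are illustrative.

def effective_input_price(base, cached, cache_hit_rate, use_batch=False):
    """Blend normal and cached input prices, then apply the batch discount."""
    blended = cache_hit_rate * cached + (1 - cache_hit_rate) * base
    return blended * 0.5 if use_batch else blended

# GPT-5.4: $2.50 base input, $0.25 cached input (per MTok).
realtime = effective_input_price(2.50, 0.25, cache_hit_rate=0.8)
batched = effective_input_price(2.50, 0.25, cache_hit_rate=0.8, use_batch=True)
print(f"real-time: ${realtime:.3f}/MTok, batched: ${batched:.3f}/MTok")
```

At an 80% cache hit rate, GPT-5.4's effective input price drops from $2.50 to $0.70/MTok, and to $0.35/MTok when batched, which is cheaper than Gemini 3 Flash's list price.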

Anthropic (Claude) — 6 Models

Anthropic's lineup spans from Haiku (fast and cheap) to Opus (the most capable). They're the only provider that charges a premium for cache writes — you pay 25% more the first time, then 90% less on every subsequent read.

Model Input Cache Write Cache Read Output Context
Opus 4.7 $5.00 $6.25 $0.50 $25.00 1M
Opus 4.6 $5.00 $6.25 $0.50 $25.00 1M
Opus 4.5 $5.00 $6.25 $0.50 $25.00 1M
Sonnet 4.6 $3.00 $3.75 $0.30 $15.00 1M
Sonnet 4.5 $3.00 $3.75 $0.30 $15.00 1M
Haiku 4.5 $1.00 $1.25 $0.10 $5.00 200K
Anthropic Notes

Cache write premium: Anthropic charges 1.25x the normal input price on the first cache write. Even so, caching pays for itself from the second use: a prefix sent twice costs 1.25x + 0.10x = 1.35x the base input price with caching, versus 2x without. Only a prefix that is never reused loses money (1.25x vs 1x).
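The break-even arithmetic can be sketched directly from the table's multipliers (1.25x write, 0.10x read); the function names are illustrative:

```python
# Anthropic cache economics: the first use is a cache write at 1.25x the
# normal input price, every later use is a read at 0.10x. Compare against
# paying the full input price each time. Helper names are illustrative.

def cached_cost(input_price, uses):
    """Total input cost with caching: one write, then (uses - 1) reads."""
    return input_price * (1.25 + 0.10 * (uses - 1))

def uncached_cost(input_price, uses):
    return input_price * uses

# Sonnet 4.6 at $3.00/MTok input:
for n in (1, 2, 3):
    print(f"uses={n}: cached ${cached_cost(3.00, n):.2f} "
          f"vs uncached ${uncached_cost(3.00, n):.2f}")
```

For Sonnet 4.6 the single-use penalty is $0.75/MTok, but by the second use caching already costs $4.05 against $6.00 uncached.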

Opus 4.5 vs 4.6 vs 4.7: All three are priced identically. Use the latest (4.7) unless you have a specific reason to use an older version. Opus 4.6 and 4.5 remain available for compatibility.

Sonnet 4.5 vs 4.6: Also identically priced. Sonnet 4.6 is the recommended choice for most applications — it's faster and more capable than 4.5.

Haiku 4.5: Anthropic's budget option at $1/$5. Best for high-volume tasks that need Claude's instruction-following quality without Opus-level reasoning.

Google (Gemini) — 6 Models

Google offers the most aggressive pricing on their older models. Gemini 2.5 Flash-Lite at $0.10 input is the cheapest model from any major provider. Their newer Gemini 3.x models are more capable but priced higher.

Model Input Cached Output Context
Gemini 3.1 Pro $2.00 $0.50 $12.00 1M
Gemini 3 Deep Think $4.00 $1.00 $24.00 1M
Gemini 3 Flash $0.50 $0.05 $3.00 1M
Gemini 2.5 Pro $1.25 $0.125 $10.00 1M
Gemini 2.5 Flash $0.30 $0.03 $2.50 1M
Gemini 2.5 Flash-Lite $0.10 $0.01 $0.40 1M
Google Notes

200K threshold: Gemini 3.1 Pro charges higher rates for prompts over 200K tokens: input doubles from $2.00 to $4.00, and output rises from $12.00 to $18.00. If your prompts regularly exceed 200K, factor this into your cost calculations.

Gemini 2.5 Pro: Also has a 200K threshold — $1.25 input becomes $2.50, $10.00 output becomes $15.00. Same pattern across Google's Pro-tier models.
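A sketch of how the threshold plays out per request, assuming the higher rate applies to the entire request once the prompt exceeds 200K tokens (not just the overflow); the function name and defaults are illustrative:

```python
# Gemini Pro-tier request cost with the 200K prompt threshold. Assumes the
# higher tier applies to the whole request once the prompt crosses 200K
# tokens, rather than only to the excess. Names are illustrative.

MTOK = 1_000_000

def gemini_pro_cost(input_tokens, output_tokens,
                    rates=((2.00, 12.00), (4.00, 18.00)),  # Gemini 3.1 Pro
                    threshold=200_000):
    in_rate, out_rate = rates[1] if input_tokens > threshold else rates[0]
    return (input_tokens * in_rate + output_tokens * out_rate) / MTOK

# Just under vs just over the threshold: the jump is discontinuous.
print(f"${gemini_pro_cost(200_000, 1_000):.6f}")  # low tier
print(f"${gemini_pro_cost(200_001, 1_000):.6f}")  # whole request at high tier
```

One extra prompt token near the boundary roughly doubles the request cost, so chunking long documents to stay under 200K can be worth real money.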

Free tier: Google is the only provider that offers a genuine free tier for API access (rate-limited). Good for prototyping and low-volume personal projects.

Gemini 2.5 Flash-Lite: At $0.10/$0.40, this is the cheapest model from any major provider. It's limited compared to Flash, but for simple extraction and classification tasks, it's unbeatable on cost.

DeepSeek — 2 Models

DeepSeek's strategy is simple: be the cheapest. Their cache read pricing is 98-99% below normal input prices — the most aggressive in the industry. The V4 Pro model is currently at a 75% promotional discount.

Model Input Cache Hit Output Context
V4 Flash $0.14 $0.0028 $0.28 1M
V4 Pro* $0.435 $0.0036 $0.87 1M

* V4 Pro: 75% off until May 31, 2026. Original prices: $1.74 input, $3.48 output. Cache hit price reduced to 1/10 of launch price as of April 26, 2026.

DeepSeek Notes

Cache pricing: DeepSeek's cache hit price is 1/50th of the normal input price for V4 Flash, and 1/120th for V4 Pro. For agent applications with repeated context, cached tokens are essentially free.

Promotion risk: V4 Pro's 75% discount ends May 31, 2026. After that, it costs $1.74/$3.48 — still cheap, but 4x more expensive. Budget for the post-promotion price if building production systems.

Thinking mode: Both models support thinking mode for complex reasoning. Thinking tokens are billed at output rates.
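To see what "essentially free" means in practice, here is a sketch of a multi-turn agent session on V4 Flash, assuming each turn re-sends the full history and that prior turns hit the cache; names and the per-turn token counts are illustrative:

```python
# Cost of a multi-turn agent conversation on DeepSeek V4 Flash, where each
# turn re-sends the full history. Assumes prior turns hit the cache at
# $0.0028/MTok; only new tokens pay the $0.14/MTok miss price. The token
# counts per turn are illustrative assumptions.

MTOK = 1_000_000
MISS, HIT, OUT = 0.14, 0.0028, 0.28  # V4 Flash, $/MTok

def conversation_cost(turns, new_tokens_per_turn=2_000, output_per_turn=500):
    total, history = 0.0, 0
    for _ in range(turns):
        # History is a cache hit; new input tokens are a miss.
        total += (history * HIT + new_tokens_per_turn * MISS
                  + output_per_turn * OUT) / MTOK
        history += new_tokens_per_turn + output_per_turn
    return total

print(f"50-turn agent session: ${conversation_cost(50):.4f}")
```

Under these assumptions a 50-turn session, which accumulates over 3M tokens of re-sent history, costs about three cents, with the cached history contributing less than a cent of that.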

The Full Picture: All 20 Models Ranked by Input Price

Here's every model from all four providers, ranked cheapest to most expensive on input price. Claude Opus 4.5 and 4.6 share Opus 4.7's pricing exactly, so the 20 models collapse to the 18 distinct rows below:

# Model Provider Input Output Ratio
1 Gemini 2.5 Flash-Lite Google $0.10 $0.40 4x
2 DeepSeek V4 Flash DeepSeek $0.14 $0.28 2x
3 GPT-5.4 Nano OpenAI $0.20 $1.25 6.3x
4 Gemini 2.5 Flash Google $0.30 $2.50 8.3x
5 DeepSeek V4 Pro* DeepSeek $0.435 $0.87 2x
6 Gemini 3 Flash Google $0.50 $3.00 6x
7 GPT-5.4 Mini OpenAI $0.75 $4.50 6x
8 Claude Haiku 4.5 Anthropic $1.00 $5.00 5x
9 Gemini 2.5 Pro Google $1.25 $10.00 8x
10 Gemini 3.1 Pro Google $2.00 $12.00 6x
11 GPT-5.4 OpenAI $2.50 $15.00 6x
12 Claude Sonnet 4.6 Anthropic $3.00 $15.00 5x
13 Claude Sonnet 4.5 Anthropic $3.00 $15.00 5x
14 Gemini 3 Deep Think Google $4.00 $24.00 6x
15 GPT-5.5 OpenAI $5.00 $30.00 6x
16 Claude Opus 4.7 Anthropic $5.00 $25.00 5x
17 GPT-5.4 Pro OpenAI $21.00 $168.00 8x
18 GPT-5.5 Pro OpenAI $30.00 $180.00 6x

The Ratio column shows the output-to-input price multiplier. DeepSeek has the lowest ratio (2x) — their output tokens are proportionally much cheaper than competitors'. This matters for applications that generate long responses.
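The ratio's effect grows with output length. A quick sketch comparing per-request cost for a low-ratio and a high-ratio model as output grows (helper name is illustrative; prices are from the table):

```python
# Per-request cost as output length grows, comparing a low-ratio model
# (DeepSeek V4 Flash, 2x) with a high-ratio one (Gemini 2.5 Flash, 8.3x).
# Prices per MTok from the ranking table; helper name is illustrative.

MTOK = 1_000_000

def request_cost(in_price, out_price, in_tokens, out_tokens):
    return (in_tokens * in_price + out_tokens * out_price) / MTOK

for out_tokens in (500, 5_000, 50_000):
    gemini = request_cost(0.30, 2.50, 10_000, out_tokens)    # 2.5 Flash
    deepseek = request_cost(0.14, 0.28, 10_000, out_tokens)  # V4 Flash
    print(f"{out_tokens:>6} out: Gemini ${gemini:.5f} vs DeepSeek ${deepseek:.5f}")
```

With short outputs the two are within a few tenths of a cent per request; at 50K output tokens Gemini 2.5 Flash costs roughly 8x more per request, almost entirely because of the output price.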

Head-to-Head: Same Tier, Different Providers

Comparing models at similar capability levels reveals the price gaps:

Frontier Models (Best Capability)

Model Input Output Monthly Cost*
Gemini 3.1 Pro $2.00 $12.00 $4,800
Claude Opus 4.7 $5.00 $25.00 $11,250
GPT-5.5 $5.00 $30.00 $12,000

* Based on 10K requests/day over a 30-day month, with 5K input tokens and 500 output tokens per request (1,500 MTok input and 150 MTok output per month).
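The monthly figures follow directly from those assumptions: 10K requests/day over 30 days works out to 1,500 MTok of input and 150 MTok of output per month. As a sketch (names are illustrative):

```python
# Monthly cost under the stated assumptions: 10K requests/day for 30 days,
# 5K input tokens and 500 output tokens per request. Names are illustrative.

REQUESTS = 10_000 * 30               # requests per month
IN_MTOK = REQUESTS * 5_000 / 1e6     # 1,500 MTok input per month
OUT_MTOK = REQUESTS * 500 / 1e6      # 150 MTok output per month

def monthly_cost(input_price, output_price):
    return IN_MTOK * input_price + OUT_MTOK * output_price

print(monthly_cost(2.00, 12.00))   # Gemini 3.1 Pro
print(monthly_cost(1.25, 10.00))   # Gemini 2.5 Pro
```

Swap in any input/output price pair from the tables above to reproduce or extend the comparison.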

Mid-Tier Models (Best Balance)

Model Input Output Monthly Cost*
Gemini 2.5 Pro $1.25 $10.00 $3,375
GPT-5.4 $2.50 $15.00 $6,000
Claude Sonnet 4.6 $3.00 $15.00 $6,750

Budget Models (Cheapest)

Model Input Output Monthly Cost*
Gemini 2.5 Flash-Lite $0.10 $0.40 $210
DeepSeek V4 Flash $0.14 $0.28 $252
GPT-5.4 Nano $0.20 $1.25 $488
Claude Haiku 4.5 $1.00 $5.00 $2,250

Cached Token Pricing: Who Saves the Most?

If your application sends the same system prompt or context repeatedly, cache pricing is more important than normal input pricing. Here's how providers compare on cache read rates:

Provider Best Cache Rate Model Savings vs Normal
DeepSeek $0.0028 V4 Flash 98%
DeepSeek $0.0036 V4 Pro 99%
Google $0.01 2.5 Flash-Lite 90%
OpenAI $0.02 GPT-5.4 Nano 90%
Anthropic $0.10 Haiku 4.5 90%

All providers offer roughly 90% savings on cached tokens — except DeepSeek, which offers 98-99%. For high-volume agent applications, DeepSeek's cache pricing makes repeated tokens essentially free.

Provider Strengths Summary

Choose OpenAI when:

  • You need the widest model selection (6 tiers from Nano to Pro)
  • Your workload can use the Batch API for 50% savings
  • You're already in the OpenAI ecosystem
  • You need extended reasoning (GPT-5.5 Pro / GPT-5.4 Pro)

Choose Anthropic when:

  • Accuracy and low hallucination are critical
  • You're building code generation or developer tools
  • You need strong instruction following and agentic capabilities
  • You want consistent pricing across model generations

Choose Google when:

  • Cost is the primary concern (cheapest models available)
  • You need a free tier for prototyping
  • You need strong multilingual support
  • You want 1M context on budget models (Flash-Lite and Flash)

Choose DeepSeek when:

  • You need the absolute lowest cost per token
  • Your application has high cache hit rates (agent workflows)
  • You're building high-volume, cost-sensitive applications
  • You want the lowest output-to-input price ratio (2x vs 5-8x)

Calculate your exact costs for any model

Open Cost Calculator

Stay Current

AI API pricing changes frequently. Providers drop prices, launch promotions, and release new models every few months. Bookmark this page — we update it regularly as rates change. For the most current pricing, always check the provider's official pricing page:

  • OpenAI: openai.com/api/pricing
  • Anthropic: anthropic.com/pricing
  • Google: ai.google.dev/pricing
  • DeepSeek: api-docs.deepseek.com/quick_start/pricing

Pricing sourced from official provider documentation (May 11, 2026). DeepSeek V4 Pro pricing reflects the 75% promotional discount valid until May 31, 2026. Google's Gemini 2.5 Pro and 3.1 Pro have higher pricing for prompts exceeding 200K tokens. All figures reflect standard API pricing without volume discounts.