LLM API Pricing Comparison
Compare pricing across OpenAI, Anthropic (Claude), Google (Gemini), Mistral, Cohere, and Azure OpenAI.
All prices in USD per 1 million tokens | Cached input pricing shown where available | Data current as of November 2025
Key Insights & Special Features
- Cached Input Pricing: OpenAI, Google, and Azure offer reduced rates for cached input tokens (typically 10% of the standard input price). This benefits applications that resend the same context, such as a long system prompt or a shared document, across many requests.
- Context Windows: Google’s Gemini models offer the largest context windows (1M tokens), while most others cap at 200K-256K tokens. Larger windows suit long-document processing and complex reasoning tasks.
- Best Value for Flagship Models: OpenAI’s GPT-5 ($1.25 input / $10 output) and Google’s Gemini 2.5 Pro (identical pricing) undercut Claude Sonnet 4.5 ($3 input / $15 output), which is positioned as a premium option focused on coding.
- Most Cost-Effective Options: Mistral Small ($0.014 input / $0.042 output) and Cohere’s Command R7B ($0.0375 input / $0.15 output) have the lowest per-token costs of the models covered here.
- Batch API Discounts: Azure and Google offer 50% batch API discounts (check their documentation for specific terms). This benefits non-urgent, high-volume workloads.
- Extended Thinking Tokens: Models with reasoning capabilities (such as the o-series and Gemini in thinking mode) may charge for extended thinking tokens separately; verify with provider documentation.
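The arithmetic behind the caching and batch points above can be sketched in a few lines of Python. This is a rough estimate under stated assumptions, not any provider's billing formula: the `Price` class and `request_cost` function are hypothetical names, the 10% cached-input ratio and 50% batch discount are the typical figures quoted above, and the sketch assumes the batch discount applies to the whole bill and stacks with caching, which should be verified against provider documentation. The example uses the Gemini 2.5 Pro prices listed above.

```python
from dataclasses import dataclass

TOKENS_PER_PRICE_UNIT = 1_000_000  # prices are quoted per 1 million tokens


@dataclass(frozen=True)
class Price:
    input_usd: float              # USD per 1M standard input tokens
    output_usd: float             # USD per 1M output tokens
    cached_ratio: float = 0.10    # assumption: cached input costs 10% of standard input
    batch_discount: float = 0.50  # assumption: batch jobs get 50% off the whole bill


def request_cost(price: Price, input_tokens: int, output_tokens: int,
                 cached_tokens: int = 0, batch: bool = False) -> float:
    """Estimate the USD cost of one request.

    cached_tokens is the portion of input_tokens served from the cache.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    cost = (
        fresh_tokens * price.input_usd
        + cached_tokens * price.input_usd * price.cached_ratio
        + output_tokens * price.output_usd
    ) / TOKENS_PER_PRICE_UNIT
    if batch:
        # Assumption: the discount stacks with caching; providers may differ.
        cost *= 1 - price.batch_discount
    return cost


# Gemini 2.5 Pro prices as listed above: $1.25 input / $10 output per 1M tokens.
gemini_25_pro = Price(input_usd=1.25, output_usd=10.00)

# A request with 10,000 input tokens (8,000 of them reusable) and 1,000 output tokens.
no_cache = request_cost(gemini_25_pro, 10_000, 1_000)
with_cache = request_cost(gemini_25_pro, 10_000, 1_000, cached_tokens=8_000)
cached_batch = request_cost(gemini_25_pro, 10_000, 1_000, cached_tokens=8_000, batch=True)

print(f"no cache:       ${no_cache:.5f}")
print(f"with cache:     ${with_cache:.5f}")
print(f"cache + batch:  ${cached_batch:.5f}")
```

In this example output tokens dominate the bill, so caching 80% of the input lowers the per-request cost by 40% rather than by 80%. Workloads with long prompts and short responses see a larger share of the saving.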
Comparison by Use Case
- Coding & Complex Tasks: Claude Sonnet 4.5, GPT-5, or Gemini 2.5 Pro. All three are competitive, so choose based on coding benchmark performance for your workload.
- Cost-Sensitive Production: Cohere Command R ($0.15 input / $0.60 output) or Mistral Medium offers a good balance between capability and cost.
- Large Context Processing: Google Gemini Flash (1M-token context window, $0.30 input / $2.50 output) is hard to beat for long documents.
- Budget Projects: Mistral Small or Cohere Command R7B for the absolute lowest costs with acceptable quality.
- Enterprise/Compliance: Azure OpenAI offers regional deployment options and data residency guarantees; Google and Azure support batch processing for cost reduction.
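To make the use-case trade-offs above concrete, the following Python sketch ranks the models whose prices are quoted in this comparison by estimated monthly cost. The workload (50M input tokens and 10M output tokens per month) is a made-up assumption, `monthly_cost` is a hypothetical helper, and the sketch ignores caching, batch discounts, and quality differences. Mistral Medium is omitted because no price for it is given here.

```python
# (input USD, output USD) per 1M tokens, as quoted in this comparison.
PRICES = {
    "GPT-5": (1.25, 10.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Claude Sonnet 4.5": (3.00, 15.00),
    "Gemini Flash": (0.30, 2.50),
    "Command R": (0.15, 0.60),
    "Command R7B": (0.0375, 0.15),
    "Mistral Small": (0.014, 0.042),
}

# Hypothetical workload, in millions of tokens per month.
INPUT_MTOK = 50
OUTPUT_MTOK = 10


def monthly_cost(input_price: float, output_price: float) -> float:
    """Estimated USD per month at list prices, with no caching or batch discount."""
    return INPUT_MTOK * input_price + OUTPUT_MTOK * output_price


# Sort cheapest first; ties keep the order in which models appear in PRICES.
ranking = sorted(
    ((model, monthly_cost(*prices)) for model, prices in PRICES.items()),
    key=lambda item: item[1],
)

for model, cost in ranking:
    print(f"{model:<20} ${cost:>8.2f}")
```

Changing the input-to-output ratio can reorder the middle of the ranking, since some models charge proportionally more for output than others, so it is worth rerunning the estimate with token counts measured from real traffic.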

