LLM API Pricing Comparison

LLM API Pricing Comparison

Compare pricing across OpenAI, Anthropic (Claude), Google (Gemini), Mistral, Cohere, and Azure OpenAI.

All prices in USD per 1 million tokens | Cached input pricing shown where available | Data current as of November 2025

Quick Cost Calculator

Loading models…
ModelProviderInput CostOutput CostCached InputContext Window
Loading data…

Key Insights & Special Features

  • Cached Input Pricing: OpenAI, Google, and Azure offer reduced rates for cached input tokens (typically 10% of standard input price). This is beneficial for applications that reuse the same context.
  • Context Windows: Google’s Gemini models offer the largest context windows (1M tokens), while most others cap at 200K-256K tokens. Larger windows suit long-document processing and complex reasoning tasks.
  • Best Value for Flagship Models: OpenAI’s GPT-5 ($1.25 input / $10 output) and Google’s Gemini 2.5 Pro (identical pricing) are competitively priced at the high end. Claude Sonnet 4.5 at $3/$15 is premium but focuses on coding excellence.
  • Most Cost-Effective Options: Cohere’s Command R7B ($0.0375 input / $0.15 output) and Mistral’s Small ($0.014 input / $0.042 output) offer the lowest per-token costs for budget-conscious users.
  • Batch API Discounts: Azure and Google offer 50% batch API discounts (check their documentation for specific terms). This benefits non-urgent, high-volume workloads.
  • Extended Thinking Tokens: Models with reasoning capabilities (like o-series and Gemini with thinking mode) may charge for extended thinking tokens separately—verify with provider documentation.

Comparison by Use Case

  • Coding & Complex Tasks: Claude Sonnet 4.5, GPT-5, or Gemini 2.5 Pro. All are competitive but choose based on coding benchmark performance.
  • Cost-Sensitive Production: Cohere Command R ($0.15 input / $0.60 output) or Mistral Medium offer a good balance between capability and cost.
  • Large Context Processing: Google Gemini Flash (1M tokens, $0.30 input / $2.50 output) is hard to beat for long documents.
  • Budget Projects: Mistral Small or Cohere Command R7B for the absolute lowest costs with acceptable quality.
  • Enterprise/Compliance: Azure OpenAI offers regional deployment options and data residency guarantees; Google and Azure support batch processing for cost reduction.

Verified by MonsterInsights