| Provider | Model | Input $/M | Output $/M | Cached $/M | Monthly | Context | Tier |
|---|---|---|---|---|---|---|---|
Cost = (input tokens × input price + output tokens × output price) / 1,000,000. Prices are per million tokens (MTok). For example, Claude Sonnet 4.6 at 1,000 input + 500 output tokens costs $0.003 + $0.0075 = $0.0105 per request.
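The formula above can be sketched as a small helper (function name is illustrative, not part of any provider SDK):

```typescript
// Cost of one request, given token counts and per-MTok prices.
// Prices are in dollars per million tokens, as in the table above.
function requestCost(
  inputTokens: number,
  outputTokens: number,
  inputPricePerMTok: number,
  outputPricePerMTok: number,
): number {
  return (inputTokens * inputPricePerMTok + outputTokens * outputPricePerMTok) / 1_000_000;
}

// Claude Sonnet 4.6 example from above: $3/$15 per MTok.
const cost = requestCost(1000, 500, 3, 15); // ≈ $0.0105
```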
Tokens are the units LLMs process. On average, 1 English word ≈ 1.3 tokens, and 1 character ≈ 0.25 tokens. A 1,000-word document is roughly 1,300 tokens. Code tends to use more tokens per character than natural language.
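The rules of thumb above translate into a rough estimator (these ratios are averages; actual tokenization varies by model and language):

```typescript
// Rough token estimates: ~1.3 tokens per English word, ~0.25 tokens per character.
function tokensFromWords(words: number): number {
  return Math.round(words * 1.3);
}

function tokensFromChars(chars: number): number {
  return Math.round(chars * 0.25);
}

const docTokens = tokensFromWords(1000); // ≈ 1,300 tokens for a 1,000-word document
```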
DeepSeek V3.2 ($0.28/$0.42 per MTok) and Mistral Small 3.2 ($0.06/$0.18) offer the best price-to-quality ratio for code. For complex architecture, Claude Opus 4.6 or GPT-5.4 provide better results at higher cost. Use the Code Assistant preset to compare.
Prompt caching stores stable prompt prefixes (system instructions, few-shot examples) so you don't pay full price to reprocess them on every request. Cached input costs 90% less with Anthropic and Google, and roughly 90% less with OpenAI. If 60% of your input is cacheable, you save ~54% on input costs. Use the Optimizer tab to calculate exact savings.
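The savings arithmetic above can be sketched like this, assuming a flat cached-input discount (the function name is illustrative):

```typescript
// Effective input price per MTok when a share of the prompt is cached.
// `cacheDiscount` is the provider's cached-input discount (e.g. 0.9 for 90% off).
function effectiveInputPrice(
  pricePerMTok: number,
  cachedShare: number,   // fraction of input tokens served from cache (0..1)
  cacheDiscount: number, // discount applied to those cached tokens (0..1)
): number {
  return pricePerMTok * (1 - cachedShare * cacheDiscount);
}

// Example from above: 60% cacheable at a 90% discount saves 54% on input —
// a $3/MTok input price drops to an effective ≈ $1.38/MTok.
const price = effectiveInputPrice(3, 0.6, 0.9);
```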
Batch API lets you send requests in bulk and receive results within 24 hours, at a 50% discount on all tokens. Supported by Anthropic, OpenAI, Google and Amazon. Ideal for analytics, content generation, data processing — any workload that doesn't need real-time responses.
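For a bulk workload, the comparison is straightforward — a sketch assuming the flat 50% batch discount applies to both input and output tokens (function name and traffic figures are illustrative):

```typescript
// Monthly cost for a fixed workload, real-time vs. batch.
function monthlyCost(
  requests: number,
  inputTokens: number,
  outputTokens: number,
  inputPrice: number,  // $/MTok
  outputPrice: number, // $/MTok
  batch = false,
): number {
  const perRequest =
    (inputTokens * inputPrice + outputTokens * outputPrice) / 1_000_000;
  return requests * perRequest * (batch ? 0.5 : 1);
}

// 100k requests/month, 2k input / 1k output tokens at $3/$15 per MTok:
const realtime = monthlyCost(100_000, 2000, 1000, 3, 15);       // ≈ $2,100
const batched  = monthlyCost(100_000, 2000, 1000, 3, 15, true); // ≈ $1,050
```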
Prices are sourced directly from official provider pricing pages (Anthropic, OpenAI, Google, DeepSeek, Mistral, Meta, xAI, Amazon) and updated regularly. Last update: March 2026. All calculations run locally in your browser — no data is sent to any server.
Model routing sends simple queries to a cheap model (e.g. GPT-4.1 Nano at $0.10/MTok) and complex ones to a premium model (e.g. Claude Opus at $5/MTok). A small classifier decides per request. This can save 50–80% with minimal quality loss on mixed workloads.
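A toy version of this router, using the example prices above — the length-based "classifier" is a stand-in for illustration only; real routers use a small classifier model or embeddings:

```typescript
interface Model {
  name: string;
  inputPricePerMTok: number; // dollars per million input tokens
}

const cheap: Model   = { name: "GPT-4.1 Nano", inputPricePerMTok: 0.10 };
const premium: Model = { name: "Claude Opus",  inputPricePerMTok: 5.0 };

// Hypothetical heuristic: treat long prompts as "complex".
function route(prompt: string): Model {
  return prompt.length > 500 ? premium : cheap;
}

// Blended input price if 70% of traffic routes to the cheap model:
// 0.7 × $0.10 + 0.3 × $5.00 ≈ $1.57/MTok vs. $5 premium-only — roughly a 69% saving.
const blended = 0.7 * cheap.inputPricePerMTok + 0.3 * premium.inputPricePerMTok;
```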
Frontier models (Claude Opus 4.6, GPT-5.4, Gemini 2.5 Pro, Grok 4) deliver the best quality for complex tasks. Optimal models (Claude Sonnet 4.6, GPT-5, Mistral Large 3) balance cost and quality. Budget models (Haiku, GPT-5 Mini, Flash, DeepSeek) are best for high-volume, simpler tasks where speed matters more than nuance.