Claude API Pricing Breakdown 2025
Here’s a clearer breakdown of the API usage pricing for each of the Anthropic Claude models you listed—no tables, just straightforward, easy-to-read prose:
Claude Sonnet 4
- Input tokens: $3 per million
- Output tokens: $15 per million
- Discounts available: Up to 90% off input with prompt caching, and up to 50% off with batch processing; the sketch below shows how those discounts translate into per-million rates. (custom.typingmind.com, Reddit, Anthropic)
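As a rough illustration (my own arithmetic, not an official rate card), here is how those maximum discounts map onto the list prices above; real savings depend on cache hit rates and on cache-write surcharges, which are not modeled here:

```python
# Illustrative arithmetic only: Claude Sonnet 4 list prices from above,
# with the advertised maximum discounts applied. Actual bills depend on how
# much of a prompt is actually served from cache and on cache-write costs.
SONNET4_INPUT = 3.00    # USD per million input tokens
SONNET4_OUTPUT = 15.00  # USD per million output tokens

cached_input_rate = SONNET4_INPUT * (1 - 0.90)   # up to 90% off cached input -> $0.30
batch_input_rate = SONNET4_INPUT * (1 - 0.50)    # 50% off with batch processing -> $1.50
batch_output_rate = SONNET4_OUTPUT * (1 - 0.50)  # 50% off with batch processing -> $7.50

print(f"Cached input:  ${cached_input_rate:.2f} per million tokens")
print(f"Batch input:   ${batch_input_rate:.2f} per million tokens")
print(f"Batch output:  ${batch_output_rate:.2f} per million tokens")
```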
Claude 3.5 Sonnet (now deprecated)
- Input tokens: $3 per million
- Output tokens: $15 per million
- Batch and cache rates: Same structure as the other Sonnet models: $1.50 per million for batch input, $7.50 per million for batch output, and $0.30 per million for cache reads (writing to the 5-minute cache is billed at a premium over the base input rate). (Anthropic Docs)
Claude 3.7 Sonnet
- Base pricing: Identical to 3.5: $3 per million input, $15 per million output, including when the hybrid extended "thinking" mode is enabled (thinking tokens are billed as output tokens). (Reddit, Anthropic Docs)
Claude Opus 4
- Input tokens: $15 per million
- Output tokens: $75 per million
- Batch and cache discounts: Batch input is $7.50 per million; batch output is $37.50 per million; cache reads cost $1.50 per million. A rough cost comparison against Sonnet 4 follows below. (Anthropic, Amazon Web Services, Inc., Anthropic Docs, Wikipedia)
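To make the premium concrete, here is a hedged back-of-the-envelope comparison on a hypothetical workload of 10 million input and 2 million output tokens; the volumes are invented for illustration, and only the per-million rates come from the listings above:

```python
# Hypothetical monthly workload: 10M input tokens and 2M output tokens.
# Only the per-million rates are taken from the price lists above.
def workload_cost(input_mtok, output_mtok, in_rate, out_rate):
    """Undiscounted cost in USD, with volumes given in millions of tokens."""
    return input_mtok * in_rate + output_mtok * out_rate

sonnet4 = workload_cost(10, 2, in_rate=3.00, out_rate=15.00)   # $60
opus4 = workload_cost(10, 2, in_rate=15.00, out_rate=75.00)    # $300

print(f"Sonnet 4: ${sonnet4:.2f}   Opus 4: ${opus4:.2f}   ({opus4 / sonnet4:.0f}x)")
```

On this made-up mix the Opus premium works out to 5x, which is simply the ratio of the list prices; the gap shrinks only if Opus lets you use meaningfully fewer tokens for the same result.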
Quick Summary
- All Sonnet variants (3.5, 3.7, 4): $3 per million input / $15 per million output, with deeper discounts for batch and caching.
- Opus 4: Substantially higher rates, at $15 / $75 per million, but optimized for deep reasoning, long tasks, and higher performance.
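If you want to plug in your own volumes, a minimal sketch like the following collects the list prices above in one place; the model keys are informal labels rather than official API model IDs, and batch or cache discounts are deliberately left out:

```python
# List prices from this breakdown, in USD per million tokens.
# Keys are informal labels, not official API model IDs; batch and cache
# discounts are ignored here.
RATES = {
    "sonnet-3.5": {"input": 3.00, "output": 15.00},
    "sonnet-3.7": {"input": 3.00, "output": 15.00},
    "sonnet-4": {"input": 3.00, "output": 15.00},
    "opus-4": {"input": 15.00, "output": 75.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Undiscounted cost in USD for raw token counts."""
    r = RATES[model]
    return (input_tokens / 1_000_000) * r["input"] + (output_tokens / 1_000_000) * r["output"]

# Example: 500k input tokens and 100k output tokens on Sonnet 4 -> $3.00.
print(f"${estimate_cost('sonnet-4', 500_000, 100_000):.2f}")
```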
Additional Insights
- Model evolution: Claude 3.5 Sonnet set new benchmarks in coding ability when it launched in June 2024, but the pricing remained unchanged with 3.7 in February 2025 and again with Sonnet 4 in May 2025, despite performance improvements. (Business Insider, Anthropic, Anthropic Docs, Wikipedia)
- Use case alignment: If your workload is chat-focused or high-volume, Sonnet models offer outstanding value-per-performance. For deeply complex tasks or prolonged agentic workflows, Opus or a hybrid strategy that mixes both may be more efficient.
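As a purely illustrative sketch of such a hybrid strategy (the threshold and the needs_deep_reasoning flag are invented for this example, and the returned names are placeholders rather than exact API model identifiers), a simple router might default to Sonnet and escalate to Opus only when a request looks worth the higher rate:

```python
# Toy routing heuristic, not an Anthropic-recommended policy: send routine,
# high-volume requests to Sonnet 4 and reserve Opus 4 for work flagged as
# long-horizon or agentic. The 100k-token threshold is arbitrary.
def pick_model(prompt_tokens: int, needs_deep_reasoning: bool) -> str:
    if needs_deep_reasoning or prompt_tokens > 100_000:
        return "claude-opus-4"    # $15 / $75 per million, used only where it earns its keep
    return "claude-sonnet-4"      # $3 / $15 per million covers chat and high-volume work

print(pick_model(2_000, needs_deep_reasoning=False))  # -> claude-sonnet-4
```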
Let me know if you’d like cost examples based on specific token volumes, comparisons with caching, or guidance on when to choose which model—happy to walk you through it!