Claude API Pricing Breakdown 2025
Here’s a clearer breakdown of the API usage pricing for each of the Anthropic Claude models you listed—no tables, just straightforward, easy-to-read prose:
Claude Sonnet 4
- Input tokens: $3 per million
- Output tokens: $15 per million
- Discounts available: Up to 90% off input with prompt caching, and up to 50% off with batch processing; the sketch below shows how those discounts translate into per-million rates. (custom.typingmind.com, Reddit, Anthropic)
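As a rough illustration (my own arithmetic, not an official rate card), here is how those maximum discounts map onto the list prices above; real savings depend on cache hit rates and on cache-write surcharges, which are not modeled here:

```python
# Illustrative arithmetic only: Claude Sonnet 4 list prices from above,
# with the advertised maximum discounts applied. Actual bills depend on how
# much of a prompt is actually served from cache and on cache-write costs.
SONNET4_INPUT = 3.00    # USD per million input tokens
SONNET4_OUTPUT = 15.00  # USD per million output tokens

cached_input_rate = SONNET4_INPUT * (1 - 0.90)   # up to 90% off cached input -> $0.30
batch_input_rate = SONNET4_INPUT * (1 - 0.50)    # 50% off with batch processing -> $1.50
batch_output_rate = SONNET4_OUTPUT * (1 - 0.50)  # 50% off with batch processing -> $7.50

print(f"Cached input:  ${cached_input_rate:.2f} per million tokens")
print(f"Batch input:   ${batch_input_rate:.2f} per million tokens")
print(f"Batch output:  ${batch_output_rate:.2f} per million tokens")
```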
Claude 3.5 Sonnet (now deprecated)
- Input tokens: $3 per million
- Output tokens: $15 per million
- Batch and cache rates: Same structure as the other Sonnet models: $1.50 per million for batch input, $7.50 per million for batch output, and $0.30 per million for cache reads (writing to the 5-minute cache is billed at a premium over the base input rate). (Anthropic Docs)
Claude 3.7 Sonnet
- Base pricing: Identical to 3.5: $3 per million input, $15 per million output, including when the hybrid extended "thinking" mode is enabled (thinking tokens are billed as output tokens). (Reddit, Anthropic Docs)
Claude Opus 4
- Input tokens: $15 per million
- Output tokens: $75 per million
- Batch and cache discounts: Batch input is $7.50 per million; batch output is $37.50 per million; cache reads cost $1.50 per million. A rough cost comparison against Sonnet 4 follows below. (Anthropic, Amazon Web Services, Inc., Anthropic Docs, Wikipedia)
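To make the premium concrete, here is a hedged back-of-the-envelope comparison on a hypothetical workload of 10 million input and 2 million output tokens; the volumes are invented for illustration, and only the per-million rates come from the listings above:

```python
# Hypothetical monthly workload: 10M input tokens and 2M output tokens.
# Only the per-million rates are taken from the price lists above.
def workload_cost(input_mtok, output_mtok, in_rate, out_rate):
    """Undiscounted cost in USD, with volumes given in millions of tokens."""
    return input_mtok * in_rate + output_mtok * out_rate

sonnet4 = workload_cost(10, 2, in_rate=3.00, out_rate=15.00)   # $60
opus4 = workload_cost(10, 2, in_rate=15.00, out_rate=75.00)    # $300

print(f"Sonnet 4: ${sonnet4:.2f}   Opus 4: ${opus4:.2f}   ({opus4 / sonnet4:.0f}x)")
```

On this made-up mix the Opus premium works out to 5x, which is simply the ratio of the list prices; the gap shrinks only if Opus lets you use meaningfully fewer tokens for the same result.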
Quick Summary
- All Sonnet variants (3.5, 3.7, 4): $3 per million input / $15 per million output, with deeper discounts for batch and caching.
- Opus 4: Substantially higher rates, at $15 / $75 per million, but optimized for deep reasoning, long tasks, and higher performance.
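If you want to plug in your own volumes, a minimal sketch like the following collects the list prices above in one place; the model keys are informal labels rather than official API model IDs, and batch or cache discounts are deliberately left out:

```python
# List prices from this breakdown, in USD per million tokens.
# Keys are informal labels, not official API model IDs; batch and cache
# discounts are ignored here.
RATES = {
    "sonnet-3.5": {"input": 3.00, "output": 15.00},
    "sonnet-3.7": {"input": 3.00, "output": 15.00},
    "sonnet-4": {"input": 3.00, "output": 15.00},
    "opus-4": {"input": 15.00, "output": 75.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Undiscounted cost in USD for raw token counts."""
    r = RATES[model]
    return (input_tokens / 1_000_000) * r["input"] + (output_tokens / 1_000_000) * r["output"]

# Example: 500k input tokens and 100k output tokens on Sonnet 4 -> $3.00.
print(f"${estimate_cost('sonnet-4', 500_000, 100_000):.2f}")
```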
Additional Insights
- Model evolution: Claude 3.5 Sonnet set new benchmarks in coding ability when it launched in June 2024, but the pricing remained unchanged with 3.7 in February 2025 and again with Sonnet 4 in May 2025, despite performance improvements. (Business Insider, Anthropic, Anthropic Docs, Wikipedia)
- Use case alignment: If your workload is chat-focused or high-volume, Sonnet models offer outstanding value-per-performance. For deeply complex tasks or prolonged agentic workflows, Opus or a hybrid strategy that mixes both may be more efficient.
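As a purely illustrative sketch of such a hybrid strategy (the threshold and the needs_deep_reasoning flag are invented for this example, and the returned names are placeholders rather than exact API model identifiers), a simple router might default to Sonnet and escalate to Opus only when a request looks worth the higher rate:

```python
# Toy routing heuristic, not an Anthropic-recommended policy: send routine,
# high-volume requests to Sonnet 4 and reserve Opus 4 for work flagged as
# long-horizon or agentic. The 100k-token threshold is arbitrary.
def pick_model(prompt_tokens: int, needs_deep_reasoning: bool) -> str:
    if needs_deep_reasoning or prompt_tokens > 100_000:
        return "claude-opus-4"    # $15 / $75 per million, used only where it earns its keep
    return "claude-sonnet-4"      # $3 / $15 per million covers chat and high-volume work

print(pick_model(2_000, needs_deep_reasoning=False))  # -> claude-sonnet-4
```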
Let me know if you’d like cost examples based on specific token volumes, comparisons with caching, or guidance on when to choose which model—happy to walk you through it!