Clear Sessions Save Tokens Always | Generated by AI
Question: For OpenCode / Claude Code / Codex CLI, should we create a new session or run /clear after finishing an unrelated task to reduce token usage, or does prompt caching (as used by Gemini/Claude) handle this automatically so we don’t need to worry about it?
Answer:
The Short Answer
Yes, you should /clear or start a new session between unrelated tasks. Prompt caching does help, but it does NOT eliminate the cost of carrying stale context from a previous task — it just makes re-reading that stale context cheaper, not free.
How the Context Window Actually Works
Each message you send in a session re-sends the entire conversation history to the model. This is true for Claude Code, OpenCode (using Claude/Gemini backends), and Codex CLI alike.
As the conversation advances, each user message and assistant response accumulates in the context window. Previous turns are preserved in full, so the context grows linearly with turn count.
So a 50-turn session sends significantly more tokens per message than a 5-turn session — even if the model appears to “remember” everything instantly.
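To make the growth concrete, here is a rough sketch. The token counts are invented for illustration (an assumed 2,000-token system prompt and 500 tokens per turn); the point is that because each request re-sends the whole history, cumulative input tokens grow roughly quadratically with turn count:

```python
# Illustrative only: assumed token sizes, not real measurements.
def total_input_tokens(turns, tokens_per_turn=500, system_prompt=2000):
    """Sum the input tokens sent across a session, assuming each
    request carries the system prompt plus all prior turns."""
    total = 0
    history = 0
    for _ in range(turns):
        total += system_prompt + history  # full context re-sent every turn
        history += tokens_per_turn        # this turn is appended to history
    return total

print(total_input_tokens(5))   # 5-turn session:  15,000 input tokens
print(total_input_tokens(50))  # 50-turn session: 712,500 input tokens
```

Under these assumptions, the 50-turn session sends roughly 47x the input tokens of the 5-turn session, even though it has only 10x the turns.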
What Prompt Caching Actually Does
Prompt caching (used by Anthropic’s Claude and Google’s Gemini) reduces the cost of re-reading already-seen content, but it does not eliminate it:
- Prompt caching works because Claude Code frequently re-reads the same context: files don't change between requests, system prompts are static, and conversation history grows incrementally. The cache captures all of this.
- In real-world data tracking 100M tokens, 84% of input tokens were served from Anthropic's prompt cache.
- However, cache-read tokens count fully against your usage quota. Every message re-sends the complete instruction set regardless of whether it changed; a 100-message session costs roughly 1.5M cache-read tokens from instructions alone.
So the cache reduces cost per token (to about 10% of standard price), but the tokens still accumulate and still count. A long session with irrelevant prior context wastes quota on every subsequent message.
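A back-of-the-envelope sketch of what stale context costs even with caching. The per-token price below is an assumption (a Sonnet-class $3/M input rate), while the 10% cache-read discount and 84% hit rate come from the figures above:

```python
# Assumed pricing; substitute your model's actual rates.
STANDARD_PRICE = 3.00 / 1_000_000     # $ per input token (assumed)
CACHE_READ_PRICE = STANDARD_PRICE * 0.10  # cache reads at ~10% of standard

def session_cost(messages, context_tokens, cache_hit_rate=0.84):
    """Rough cost of re-sending a fixed block of context on every
    message, using the 84% cache-hit rate reported above."""
    cached = context_tokens * cache_hit_rate
    fresh = context_tokens - cached
    per_message = cached * CACHE_READ_PRICE + fresh * STANDARD_PRICE
    return messages * per_message

# Carrying 20k tokens of stale context across 100 messages:
print(f"${session_cost(100, 20_000):.2f}")  # -> $1.46
```

Caching cuts the bill substantially, but the stale 20k tokens still cost real money on every single message; /clear would drop that cost to zero.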
Why /clear or New Session Matters for Unrelated Tasks
When you switch to a completely different task within the same session:
- All prior tool calls, file reads, and conversation turns are still in context.
- Every new message you send re-sends all that irrelevant history.
- The model may get confused or “distracted” by stale context (context rot).
- You pay cache-read tokens for all that irrelevant content repeatedly.
Token costs scale with context size: the more context the model processes, the more tokens you use. Use /clear to start fresh when switching to unrelated work, because stale context wastes tokens on every subsequent message.
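The saving from clearing can be sketched with the same kind of toy arithmetic (500 tokens per turn is an assumption). Running two unrelated 20-turn tasks back-to-back in one session makes every turn of the second task re-send the first task's entire history:

```python
# Illustrative only: assumed 500 tokens per turn, history re-sent each turn.
def tokens_sent(turns, start_history=0, per_turn=500):
    """Return (total input tokens sent, final history size) for a run
    of `turns` turns starting with `start_history` tokens of context."""
    total, history = 0, start_history
    for _ in range(turns):
        total += history      # full history re-sent with this request
        history += per_turn   # this turn is appended
    return total, history

# One long session: task B drags along all of task A's context.
a_total, a_history = tokens_sent(20)
b_total, _ = tokens_sent(20, start_history=a_history)
combined = a_total + b_total          # 390,000 tokens

# /clear between tasks: task B starts from an empty context.
cleared = tokens_sent(20)[0] * 2      # 190,000 tokens

print(combined, cleared)
```

Under these assumptions, clearing between the two tasks roughly halves the input tokens sent, before even counting the context-rot benefit.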
Practical Guidance Per Tool
Claude Code
- Use /clear between unrelated tasks and when switching to completely different work.
- Use /compact (not /clear) when you are still on the same task but the session is getting long, or when you notice Claude losing track: it summarizes history while preserving intent.
- Claude Code also optimizes costs automatically through prompt caching (which discounts repeated content like system prompts) and auto-compaction (which summarizes conversation history when approaching context limits).
Codex CLI (OpenAI)
- Use the /compact command when conversations get long, to summarize history and free up tokens.
- Use /fork to branch a session for a different approach while preserving the original.
- For truly unrelated work, starting fresh with a new session is still recommended.
OpenCode
- OpenCode is newer; a provider-agnostic /compact command and auto-compaction support are proposed features, not yet universally available across all backends.
- Starting a new session between unrelated tasks is the safest approach for now.
The “One Session Per Logical Task” Rule
One session per logical task works best. One bug fix, one feature, one refactor. Don’t try to fix three bugs and add two features in one conversation.
Between sessions, use a CLAUDE.md file to persist important project context: A CLAUDE.md file with key architecture decisions, file locations, and conventions means Claude spends fewer tokens rediscovering your project structure each time.
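As a sketch, a minimal CLAUDE.md along these lines might look like the following (project name, paths, and conventions are all hypothetical):

```markdown
# CLAUDE.md (hypothetical example)

## Architecture
- REST API in src/api/ (FastAPI); background jobs in src/workers/
- Postgres schema in src/models/; migrations managed with Alembic

## Conventions
- Tests live next to source files as test_*.py; run with `pytest -q`
- Configuration comes from environment variables, never hardcoded

## Key files
- src/api/routes.py: route registration
- src/settings.py: central configuration
```

A file like this lets each fresh session start with the essentials already in context instead of re-reading the codebase to rediscover them.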
Summary Table
| Scenario | Recommendation |
|---|---|
| Same task, long session | /compact — summarize, keep going |
| Switching to unrelated task | /clear or new session |
| Related follow-up task | Keep session OR use /compact first |
| Resuming old work later | /resume with CLAUDE.md for context |
| Very long codebase analysis | New session per logical unit |
Bottom line: Prompt caching is a cost multiplier (cheaper reads), not a “free context” pass. Irrelevant prior context always costs you — clearing it between unrelated tasks is one of the highest-leverage optimizations available.
References:
- Claude Code Cost Management Docs
- Anthropic Context Windows Docs
- 7 Ways to Cut Claude Code Token Usage – DEV Community
- Claude Code Token Usage: Real Data From 100M Tokens
- Claude Code Context Window Optimization
- OpenCode /compact Feature Discussion
- Codex CLI Session Resume Guide