Hermes Iteration Budget Explanation | Generated by AI

2026.05

Here’s how the iteration budget works in Hermes:

Iteration Budget — How It Works

The core class lives in agent/iteration_budget.py: a thread-safe consume/refund counter (~62 lines). Each AIAgent instance (parent or subagent) holds its own.
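To make the consume/refund idea concrete, here is a minimal sketch of what such a counter could look like. It is not the Hermes source: the class name, the `max_total` parameter, and the method bodies are assumptions for illustration. Only the consume/refund/remaining vocabulary comes from the description above.

```python
import threading


class IterationBudget:
    """Hypothetical thread-safe consume/refund counter (illustrative only)."""

    def __init__(self, max_total: int) -> None:
        self.max_total = max_total      # assumed name for the iteration cap
        self._used = 0
        self._lock = threading.Lock()   # guards every read and write of _used

    def consume(self) -> bool:
        """Deduct one iteration. Returns False if the budget is already spent."""
        with self._lock:
            if self._used >= self.max_total:
                return False
            self._used += 1
            return True

    def refund(self) -> None:
        """Give one iteration back, never dropping below zero used."""
        with self._lock:
            if self._used > 0:
                self._used -= 1

    @property
    def remaining(self) -> int:
        """Iterations still available."""
        with self._lock:
            return max(0, self.max_total - self._used)
```

The lock matters because the check and the increment in `consume()` must happen as one step. Without it, two threads could both see one iteration left and both take it.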

Defaults:

The loop in agent/conversation_loop.py line 796:

while (api_call_count < agent.max_iterations
       and agent.iteration_budget.remaining > 0) \
      or agent._budget_grace_call:

Each iteration = one API call (which may include multiple tool calls in that turn).

Budget flow per iteration:

  1. Check interrupt — if user sent /stop or new message, break immediately
  2. Grace call check — if _budget_grace_call is set, consume the flag and let this one last iteration run (then exit regardless)
  3. consume() — tries to deduct 1 from remaining. If already at 0, prints the ⚠️ warning you saw and breaks
  4. Run API call — model responds, tools execute
  5. Refund on certain conditions:
    • execute_code iterations get refunded (lines 3403, 3883): programmatic tool calls don’t eat your budget
    • Context compression restarts get refunded (line 3403): retrying with compressed context doesn’t count
    • Ollama context-too-small errors get refunded (line 1100)
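The five steps above can be sketched as a small simulation. This is a simplified model under stated assumptions, not the code from agent/conversation_loop.py: `run_loop`, the `turns` list, the `"refundable"` label, and passing the grace flag in as a parameter are all hypothetical. The source does not say how `_budget_grace_call` gets set, so the sketch only shows what happens once it is.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Minimal single-threaded stand-in for the budget counter."""
    max_total: int
    used: int = 0

    @property
    def remaining(self) -> int:
        return self.max_total - self.used

    def consume(self) -> bool:
        if self.used >= self.max_total:
            return False
        self.used += 1
        return True

    def refund(self) -> None:
        self.used = max(0, self.used - 1)


def run_loop(turns, max_iterations, grace_call=False, interrupted=lambda: False):
    """Simulate the loop; each entry in `turns` stands for one API call.

    A turn is "normal" or "refundable" (for example an execute_code turn).
    Returns (api_call_count, budget.remaining).
    """
    budget = Budget(max_iterations)
    api_call_count = 0
    pending = list(turns)

    while (api_call_count < max_iterations and budget.remaining > 0) or grace_call:
        if not pending:              # nothing left to do: the task is finished
            break
        if interrupted():            # step 1: user sent /stop or a new message
            break
        if grace_call:               # step 2: spend the flag, allow one last turn
            grace_call = False
        elif not budget.consume():   # step 3: deduct 1, stop if already at 0
            break
        kind = pending.pop(0)        # step 4: the API call and its tool calls
        api_call_count += 1
        if kind == "refundable":     # step 5: hand the iteration back
            budget.refund()

    return api_call_count, budget.remaining
```

For example, `run_loop(["normal", "refundable", "normal"], max_iterations=5)` makes three calls but leaves three iterations in the budget, because the middle turn was refunded. With `max_iterations=0` and `grace_call=True`, exactly one turn runs before the loop exits.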

When budget is fully exhausted (line 4332-4349):

The message you saw:

⚠ Iteration budget reached (60/60) — response may be incomplete

This means the agent hit max_iterations=60 (either configured in your config.yaml or set by the caller). The response was the model’s summary attempt after budget exhaustion.

Key design decisions:

