Ollama Memory Usage Explained | Generated by AI

2026.03

Question: Why does ollama run qwen3.5:9b show only 159 MB memory usage, which seems very small?

Answer:

The 159 MB figure you're seeing is not the memory actually used by the model. It is most likely the resident memory (RES) of the main Ollama server process alone, or a partial reading from a tool such as Activity Monitor, top, or htop, none of which accounts for how modern LLM runners handle memory.
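One way to check whether a single-process reading is hiding memory is to add up the resident size of every Ollama-related process rather than looking at one row. The sketch below is a minimal illustration, not an official tool: the function names are hypothetical, it assumes `ps` reports RSS in KiB (as it does on macOS and Linux), and it matches processes simply by the substring "ollama". On Apple Silicon even this sum can undercount, because GPU-accessible and memory-mapped pages are not always attributed to a process's RSS.

```python
import subprocess


def read_ps_output() -> str:
    """Return the raw table printed by `ps` (columns: PID, RSS in KiB, full command)."""
    result = subprocess.run(
        ["ps", "-axo", "pid,rss,command"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def sum_ollama_rss_mib(ps_output: str) -> float:
    """Sum RSS over every process whose command line mentions 'ollama', in MiB."""
    total_kib = 0
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # pid, rss, command (command may contain spaces)
        if len(parts) < 3 or not parts[1].isdigit():
            continue  # ignore malformed rows
        if "ollama" in parts[2].lower():
            total_kib += int(parts[1])
    return total_kib / 1024


# Usage (run while a model is loaded):
#   print(f"{sum_ollama_rss_mib(read_ps_output()):.0f} MiB across all ollama processes")
```

If the total is much larger than the figure for the one process you were watching, the small number was simply the wrong row.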

On Apple Silicon Macs (M2/M3), Ollama uses memory-mapped files and unified memory very efficiently. This means:

Typical Real Memory Usage for qwen3.5:9b on Mac M2/M3 (March 2026)
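As a rough sanity check on the order of magnitude, the weights alone can be estimated as parameters times bits per weight. This is a back-of-the-envelope sketch under stated assumptions: the 9 billion parameter count is inferred from the "9b" tag, the bits-per-weight values are illustrative guesses (the source does not say which quantization this model tag uses), and the KV cache and runtime buffers are not included.

```python
def estimate_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


N_PARAMS = 9e9  # assumption: taken from the "9b" in the model tag

# Illustrative quantization levels only; the real value depends on the model file.
for bits in (4.5, 8.0, 16.0):
    print(f"{bits:>4} bits/weight -> {estimate_weight_gib(N_PARAMS, bits):.1f} GiB")
```

Even at the most aggressive setting shown, the estimate is several GiB, which is far above 159 MB.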

Why It Looks So Small at First

In short: 159 MB is normal for the idle server process. The model uses several GB when active, which is expected and efficient on Apple Silicon. If you're seeing poor performance or errors, check total system memory pressure rather than the size of a single process.
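To see what Ollama itself reports for a loaded model, you can ask the server instead of the process table. The sketch below queries the `/api/ps` endpoint, which lists running models with `size` and `size_vram` fields in bytes. It assumes the server is listening on the default local address `http://localhost:11434`. The helper names `fetch_running_models` and `summarize` are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: default local Ollama address


def fetch_running_models(base_url: str = OLLAMA_URL) -> dict:
    """Fetch the list of currently loaded models from a running Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/ps", timeout=5) as resp:
        return json.load(resp)


def summarize(ps_response: dict) -> list[str]:
    """Turn an /api/ps response into one human-readable line per loaded model."""
    lines = []
    for model in ps_response.get("models", []):
        size_gib = model.get("size", 0) / 2**30
        vram_gib = model.get("size_vram", 0) / 2**30
        name = model.get("name", "?")
        lines.append(f"{name}: {size_gib:.1f} GiB loaded, {vram_gib:.1f} GiB on GPU")
    return lines


# Usage (requires a running server with a model loaded):
#   for line in summarize(fetch_running_models()):
#       print(line)
```

The `ollama ps` command prints similar information in the terminal. An empty list means no model is currently loaded, which is also when the server process looks smallest.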
