Understanding LLM Temperature Parameter Basics | Generated by AI
In LLM APIs (such as OpenAI, GitHub Copilot, and Anthropic), temperature is a parameter that controls how random or deterministic the model’s output is.
What Temperature Does
- Low temperature (close to 0): The model becomes effectively deterministic, almost always choosing the most likely next token. Example: If you ask for a definition, it will give the most standard/expected answer every time.
- High temperature (closer to 1 or above): The model samples more freely, exploring less likely tokens. This increases variety, creativity, and unpredictability. Example: For brainstorming, storytelling, or generating code variations, a higher temperature may work better (a minimal sketch follows this list).
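To see what that means mechanically, here is a minimal Python sketch of the standard temperature-sampling idea (scale logits by 1/temperature, apply softmax, then sample). The token logits are invented for illustration; this is not any vendor's actual implementation:

```python
# Minimal sketch of how temperature reshapes next-token probabilities before
# sampling. Illustrative only: the token logits below are invented.
import math
import random

def sample_next_token(logits: dict, temperature: float) -> str:
    """Scale logits by 1/temperature, apply softmax, then sample one token."""
    if temperature <= 0:
        # Temperature 0 is usually treated as greedy decoding: take the argmax.
        return max(logits, key=logits.get)
    scaled = {tok: lg / temperature for tok, lg in logits.items()}
    max_lg = max(scaled.values())  # subtract max for numerical stability
    exps = {tok: math.exp(lg - max_lg) for tok, lg in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}
    return random.choices(list(probs), weights=list(probs.values()))[0]

# Hypothetical logits for the token after "The sky is"
logits = {"blue": 4.0, "clear": 2.5, "falling": 0.5}
print(sample_next_token(logits, temperature=0.1))  # almost always "blue"
print(sample_next_token(logits, temperature=1.2))  # "clear"/"falling" appear more often
```

At low temperature the scaled distribution concentrates on the top token; at higher temperature the tail tokens get meaningful probability, which is where the extra variety comes from.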
What “0.7” Means
- temperature = 0.7 is a moderate balance.
- It allows some randomness and diversity.
- The model will not be as rigid as 0, but not as “wild” as 1.2.
- This is why 0.7 is often used as the default in many APIs, including Copilot-like systems: it gives creative but still reliable results.
How to Use It
In most APIs, you pass it as a parameter in the request. For example:
OpenAI API (chat completion):
```json
{
  "model": "gpt-4.1",
  "messages": [
    {"role": "user", "content": "Write a short story about a dragon and a robot."}
  ],
  "temperature": 0.7
}
```
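Equivalently in Python, a minimal sketch using the official openai client library (this assumes the openai package is installed and OPENAI_API_KEY is set in the environment; the model name is taken from the JSON above):

```python
# Minimal sketch: passing temperature through the official openai Python client.
# Assumes `pip install openai` and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "user", "content": "Write a short story about a dragon and a robot."}
    ],
    temperature=0.7,  # moderate randomness; lower for determinism, higher for variety
)
print(response.choices[0].message.content)
```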
GitHub Copilot API:
The parameter works similarly. Copilot uses temperature internally to decide between more literal completions and more exploratory suggestions. Most IDE integrations set it around 0.7, but some contexts (like code completion) may use lower values for stability.
✅ Rule of thumb:
- 0–0.3 → Deterministic, factual, predictable. Good for math, API docs, config, unit tests.
- 0.4–0.7 → Balanced, natural writing, standard coding. Good default for most cases.
- 0.8–1.2 → Creative, exploratory, risk-taking. Good for brainstorming, stories, poetry.
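As a rough illustration only, you could encode these guideline ranges in your own code, for example with a small helper that picks a default temperature per task type (the category names and exact values here are assumptions, not part of any API):

```python
# Illustrative only: map the rule-of-thumb ranges above to default temperatures.
# The task categories and values are assumptions, not defined by any API.
TASK_TEMPERATURES = {
    "factual": 0.2,   # math, API docs, config, unit tests
    "general": 0.7,   # natural writing, standard coding
    "creative": 1.0,  # brainstorming, stories, poetry
}

def temperature_for(task: str) -> float:
    """Return a suggested temperature, defaulting to the balanced 0.7."""
    return TASK_TEMPERATURES.get(task, 0.7)

print(temperature_for("factual"))   # 0.2
print(temperature_for("creative"))  # 1.0
```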
Do you want me to also show you how temperature interacts with top_p (nucleus sampling), since both are often available in Copilot/LLM APIs?