Understanding LLM Temperature Parameter Basics
In LLM APIs (OpenAI, GitHub Copilot, Anthropic, and others), temperature is a sampling parameter that controls how random or deterministic the model's output is.
What Temperature Does
- Low temperature (close to 0): The model becomes nearly deterministic, strongly favoring the most likely next token (at exactly 0, most APIs reduce to greedy decoding). Example: if you ask for a definition, you will get the most standard/expected answer almost every time.
- High temperature (closer to 1 or above): The model samples more freely, exploring less likely tokens. This increases variety, creativity, and unpredictability. Example: for brainstorming, storytelling, or generating code variations, a higher temperature may be better.
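To make this concrete, here is a minimal sketch in plain Python (standard library only; the tokens and logits are made up for illustration, not real model output) of how temperature rescales logits before sampling:

import math
import random

def sample_with_temperature(logits, temperature):
    # Divide logits by the temperature: values below 1 sharpen the
    # distribution, values above 1 flatten it.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one token index according to the resulting probabilities.
    return random.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical next-token candidates and logits, for illustration only.
tokens = ["the", "a", "dragon", "robot"]
logits = [2.0, 1.5, 0.5, 0.1]

for t in (0.2, 0.7, 1.2):
    counts = [0, 0, 0, 0]
    for _ in range(1000):
        counts[sample_with_temperature(logits, t)] += 1
    print(f"temperature={t}: {dict(zip(tokens, counts))}")

Running this, nearly every draw at 0.2 lands on "the", while at 1.2 the rarer tokens show up noticeably more often.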
What “0.7” Means
- temperature = 0.7 is a moderate balance.
- It allows some randomness and diversity.
- The model will not be as rigid as 0, but not as "wild" as 1.2.
- This is why 0.7 is often used as the default in many APIs, including Copilot-like systems: it gives creative but still reliable results.
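A quick way to see this balance is to send the same prompt several times at different temperatures and compare the outputs. This sketch assumes the official openai Python package (v1+) and an OPENAI_API_KEY environment variable; the model name matches the request example in the next section:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = "Give me a one-sentence tagline for a coffee shop."

for temp in (0.0, 0.7):
    print(f"--- temperature={temp} ---")
    for _ in range(3):
        resp = client.chat.completions.create(
            model="gpt-4.1",
            messages=[{"role": "user", "content": prompt}],
            temperature=temp,
        )
        print(resp.choices[0].message.content)

At 0.0 the three completions are usually identical (though providers do not guarantee bit-for-bit determinism), while at 0.7 they typically differ in wording and ideas.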
How to Use It
In most APIs, you pass it as a parameter in the request. For example:
OpenAI API (chat completion):
{
  "model": "gpt-4.1",
  "messages": [
    {"role": "user", "content": "Write a short story about a dragon and a robot."}
  ],
  "temperature": 0.7
}
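For reference, here is the same request made directly from Python with the requests library, posting that JSON body to OpenAI's chat completions endpoint (a sketch, not the official client; assumes OPENAI_API_KEY is set):

import os
import requests

payload = {
    "model": "gpt-4.1",
    "messages": [
        {"role": "user", "content": "Write a short story about a dragon and a robot."}
    ],
    "temperature": 0.7,
}

resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json=payload,  # requests serializes this and sets Content-Type: application/json
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])

This is essentially the HTTP call that the openai client shown earlier makes under the hood.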
GitHub Copilot API:
The parameter works similarly: Copilot uses temperature internally to trade off literal completions against more exploratory suggestions. Exact values are not publicly documented, but chat-style interactions are commonly described as using moderate values around 0.7, while inline code completion tends toward lower values for stability.
✅ Rule of thumb:
- 0–0.3 → Deterministic, factual, predictable. Good for math, API docs, config, unit tests.
- 0.4–0.7 → Balanced, natural writing, standard coding. Good default for most cases.
- 0.8–1.2 → Creative, exploratory, risk-taking. Good for brainstorming, stories, poetry.
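If you want that rule of thumb in code, a toy helper might look like this (the category names and values simply restate the table above; they are illustrative defaults to tune for your own use case):

# Rough temperature presets mirroring the rule of thumb above.
TEMPERATURE_PRESETS = {
    "factual": 0.2,    # math, API docs, config, unit tests
    "balanced": 0.7,   # natural writing, standard coding
    "creative": 1.0,   # brainstorming, stories, poetry
}

def pick_temperature(task_kind: str) -> float:
    # Fall back to the balanced default for unknown task kinds.
    return TEMPERATURE_PRESETS.get(task_kind, 0.7)

print(pick_temperature("factual"))  # 0.2
print(pick_temperature("poetry"))   # 0.7 (unknown kind -> balanced default)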
Do you want me to also show you how temperature interacts with top_p (nucleus sampling), since both are often available in Copilot/LLM APIs?