
Temperature, Top-P, and Sampling Parameters

Sampling parameters control the randomness and creativity of Claude's outputs. Understanding them lets you tune behavior precisely for your use case.

Temperature

Temperature rescales the probability distribution over candidate next tokens: lower values concentrate probability on the most likely tokens (more deterministic output), while higher values flatten the distribution (more diverse output).

import anthropic

client = anthropic.Anthropic()

prompt = "Complete this sentence: The best way to learn programming is..."

# Near-deterministic — minimal variation run to run (good for structured output, code, facts)
deterministic = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=128,
    temperature=0.0,
    messages=[{"role": "user", "content": prompt}],
)

# Balanced — some variation (good for most conversational use)
balanced = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=128,
    temperature=0.5,
    messages=[{"role": "user", "content": prompt}],
)

# Creative — high variation (good for brainstorming, creative writing)
creative = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=128,
    temperature=1.0,
    messages=[{"role": "user", "content": prompt}],
)
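
To see what "rescales the distribution" means numerically, here is a minimal, illustrative softmax-with-temperature sketch. It is pure Python with made-up logits; the API applies this internally, so you never compute it yourself.

import math

def softmax_with_temperature(logits, temperature):
    """Convert raw logits to probabilities, scaled by temperature."""
    scaled = [l / temperature for l in logits]          # temperature 0 is handled by the API as greedy decoding
    exps = [math.exp(s - max(scaled)) for s in scaled]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5, 0.1]  # hypothetical scores for four candidate tokens

for t in (0.2, 0.5, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])

# At t=0.2 nearly all probability mass sits on the top token; at t=1.0 the distribution flattens out.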

Top-P (Nucleus Sampling)

Top-P limits sampling to the smallest set of tokens whose cumulative probability exceeds P. Use it instead of temperature, not alongside it.

# Top-P of 0.9 considers tokens covering 90% of probability mass
response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=512,
    top_p=0.9,
    messages=[{"role": "user", "content": "Write a creative product tagline for a new AI assistant."}],
)
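
To make the "smallest set whose cumulative probability exceeds P" rule concrete, here is an illustrative sketch on a toy distribution. The probabilities are invented and the selection happens inside the API; this only shows which tokens would survive the cutoff.

def nucleus(probs, top_p):
    """Return the smallest set of tokens whose cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append(token)
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

toy_probs = {"great": 0.50, "good": 0.30, "fine": 0.15, "okay": 0.04, "meh": 0.01}
print(nucleus(toy_probs, 0.9))  # ['great', 'good', 'fine'] — these cover 95% of the mass
print(nucleus(toy_probs, 0.5))  # ['great'] — the top token alone already reaches the cutoff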
Recommended starting points by use case:

Use Case            | Temperature | Top-P
--------------------|-------------|--------
Code generation     | 0.0 – 0.2   | Default
Factual Q&A         | 0.0 – 0.3   | Default
Summarization       | 0.3 – 0.5   | Default
Conversational chat | 0.5 – 0.7   | Default
Creative writing    | 0.8 – 1.0   | 0.95
Brainstorming       | 0.9 – 1.0   | Default

For creative writing, treat the top_p value as an alternative to raising temperature; per the note above, adjust one parameter or the other, not both.
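
If you find yourself reusing these starting points, one option is to centralize them in a small helper. This is a sketch, not part of the SDK; the preset names and values are assumptions you would tune to your own results.

# Hypothetical helper: encode the starting points from the table above.
SAMPLING_PRESETS = {
    "code": {"temperature": 0.1},
    "factual_qa": {"temperature": 0.2},
    "summarization": {"temperature": 0.4},
    "chat": {"temperature": 0.6},
    "creative": {"temperature": 0.9},
    "brainstorm": {"temperature": 1.0},
}

def create_message(client, prompt, use_case="chat", **overrides):
    """Send a message using the preset for the given use case, with optional overrides."""
    params = {**SAMPLING_PRESETS[use_case], **overrides}
    return client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
        **params,
    )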

JavaScript Example

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

// Code generation — deterministic
const codeResponse = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 1024,
  temperature: 0.1,
  messages: [
    {
      role: "user",
      content: "Write a Python function to merge two sorted lists.",
    },
  ],
});

// Marketing copy — creative
const copyResponse = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 512,
  temperature: 0.9,
  messages: [
    {
      role: "user",
      content: "Write 5 different taglines for a productivity app.",
    },
  ],
});

When in doubt, start at temperature=0.3 for task-oriented work and adjust based on observed output diversity.
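
One rough way to observe output diversity is to send the same prompt several times at a given temperature and count how many distinct responses come back. The sketch below assumes the client and prompt from the Python examples above; it is a quick probe, not a rigorous measurement.

# Rough diversity probe: same prompt, several samples, count distinct outputs.
def sample_n(client, prompt, temperature, n=5):
    outputs = []
    for _ in range(n):
        response = client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=128,
            temperature=temperature,
            messages=[{"role": "user", "content": prompt}],
        )
        outputs.append(response.content[0].text)
    return outputs

for t in (0.0, 0.3, 0.8):
    unique = len(set(sample_n(client, prompt, t)))
    print(f"temperature={t}: {unique} distinct responses out of 5")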