Reasoning Effort

Overview

Reasoning models like GPT-5, GPT-5-mini, and GPT-5-nano are LLMs trained with reinforcement learning to perform complex problem-solving. They "think before they answer" — producing internal chains of thought (reasoning tokens) before generating the visible response. These models excel at coding, scientific reasoning, and multi-step planning tasks.

The reasoning_effort parameter controls how many reasoning tokens the model generates internally, letting you tune the trade-off between response speed/cost and solution quality.

Available values:

| Value | Description |
|---|---|
| `low` | Favors speed and economical token usage |
| `medium` | Balances speed and reasoning accuracy (default for reasoning models) |
| `high` | Prioritizes complete, thorough reasoning over speed |

GPT-5.2 additionally supports none (no internal reasoning, lowest latency) and xhigh (maximum reasoning depth).
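
As a minimal sketch of how the parameter might be passed when calling the OpenAI Responses API directly rather than through PydanticAI, the helper below builds the keyword arguments for `client.responses.create`. The names `build_request`, `BASE_EFFORTS`, and `EXTRA_EFFORTS` are hypothetical and not part of any SDK; the allowed values simply mirror the list above, so adjust them to whatever your target model accepts.

```python
# Illustrative only: effort values per model, copied from the text above.
BASE_EFFORTS = ("low", "medium", "high")
EXTRA_EFFORTS = {"gpt-5.2": ("none", "xhigh")}


def build_request(model: str, prompt: str, effort: str = "medium") -> dict:
    """Return keyword arguments for a Responses API call with a reasoning effort."""
    allowed = BASE_EFFORTS + EXTRA_EFFORTS.get(model, ())
    if effort not in allowed:
        raise ValueError(f"{effort!r} is not supported for {model}; choose from {allowed}")
    # The Responses API takes the effort inside a nested "reasoning" object.
    return {"model": model, "input": prompt, "reasoning": {"effort": effort}}


kwargs = build_request("gpt-5.2", "How many primes are below 100?", effort="xhigh")
print(kwargs["reasoning"])

# With the official SDK installed and OPENAI_API_KEY set, the call would be:
#   from openai import OpenAI
#   response = OpenAI().responses.create(**kwargs)
```

Validating the value locally is a design choice for the sketch: it surfaces a typo before a billable request is made.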

For full details, see the official OpenAI documentation.

How Reasoning Tokens Work

When a reasoning model processes a request, it generates reasoning tokens internally before producing the final answer. Key characteristics:

Not visible in the API response, but still occupy context window space
Billed as output tokens; check OpenAI pricing for reasoning token costs
Token count is reported as reasoning_tokens under usage.output_tokens_details in the response
Can range from hundreds to tens of thousands of tokens per request depending on problem complexity and effort level
Discarded after the final answer is generated (not cached between turns)
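
To make the billing point concrete, here is a small sketch that computes what fraction of the output tokens went to hidden reasoning. It assumes the usage payload is available as a plain dict with a `reasoning_tokens` field nested under `output_tokens_details`; the helper name `reasoning_share` and the sample numbers are made up for illustration.

```python
def reasoning_share(usage: dict) -> float:
    """Fraction of billed output tokens that were spent on hidden reasoning."""
    output_tokens = usage.get("output_tokens", 0)
    details = usage.get("output_tokens_details") or {}
    reasoning_tokens = details.get("reasoning_tokens", 0)
    if output_tokens == 0:
        return 0.0
    return reasoning_tokens / output_tokens


# Hypothetical usage payload; the numbers are invented for this example.
sample_usage = {
    "input_tokens": 75,
    "output_tokens": 1186,
    "output_tokens_details": {"reasoning_tokens": 1024},
}

print(f"{reasoning_share(sample_usage):.0%} of output tokens were reasoning")
```

Logging this ratio per request is one way to see how much each effort level actually costs on your own workload.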

Context Budget

Reserve at least 25,000 tokens for reasoning and outputs when experimenting. Use max_output_tokens to cap costs. If a response hits the limit before producing visible output, its status will be incomplete, with max_output_tokens given as the reason in incomplete_details.
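
The sketch below shows one way to detect a response that ran out of budget and choose a larger limit for a retry. It assumes the response is available as a plain dict and that the Responses API signals the cutoff through `status` set to `"incomplete"` with `incomplete_details.reason` set to `"max_output_tokens"`. The helper names and the doubling strategy with a 100,000-token ceiling are hypothetical choices for illustration.

```python
def ran_out_of_budget(response: dict) -> bool:
    """True when the response stopped because it reached max_output_tokens."""
    details = response.get("incomplete_details") or {}
    return (
        response.get("status") == "incomplete"
        and details.get("reason") == "max_output_tokens"
    )


def next_budget(current: int, ceiling: int = 100_000) -> int:
    """Double the output budget for a retry, capped at an assumed ceiling."""
    return min(current * 2, ceiling)


# Hypothetical truncated response: all tokens went to reasoning, none to output.
truncated = {
    "status": "incomplete",
    "incomplete_details": {"reason": "max_output_tokens"},
    "output_text": "",
}

if ran_out_of_budget(truncated):
    print("Retry with max_output_tokens =", next_budget(25_000))
```

Capping the retry budget keeps a hard problem from silently multiplying costs.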

Code Example

This example from the repo compares reasoning effort levels using PydanticAI with the OpenAI Responses API:

import asyncio

from pydantic import BaseModel, Field
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModelSettings, ReasoningEffort


class Solution(BaseModel):
    """Structured output the agent must return."""

    answer: str = Field(description="Solution to the problem")


async def test_reasoning_effort():
    """Compare reasoning effort levels on a complex problem."""
    problem = """
    A farmer has chickens and rabbits. There are 35 heads and 94 legs total.
    How many chickens and how many rabbits are there?
    """

    efforts: list[ReasoningEffort] = ["low", "medium", "high"]

    for effort in efforts:
        # The "openai-responses:" prefix selects the Responses API model, which is
        # the model that OpenAIResponsesModelSettings (including the summary) applies to.
        agent: Agent[None, Solution] = Agent(
            "openai-responses:gpt-5.2",
            output_type=Solution,
            model_settings=OpenAIResponsesModelSettings(
                openai_reasoning_effort=effort,
                openai_reasoning_summary="detailed",
            ),
        )

        # Same prompt for every effort level so the answers are comparable.
        result = await agent.run(f"Solve this problem:\n{problem}")

        print(f"Reasoning Effort: {effort}")
        print(f"Answer: {result.output.answer}")


if __name__ == "__main__":
    asyncio.run(test_reasoning_effort())

Key settings in OpenAIResponsesModelSettings:

openai_reasoning_effort — controls reasoning depth ("low", "medium", "high")
openai_reasoning_summary — set to "detailed" to get a summary of the model's internal reasoning chain

Example Output

Problem: A farmer has chickens and rabbits. 35 heads, 94 legs. How many of each?

Low Reasoning:

Answer: 23 chickens, 12 rabbits
Reasoning: Using x + y = 35 and 2x + 4y = 94, solving gives x = 23, y = 12

High Reasoning:

Answer: 23 chickens, 12 rabbits
Reasoning: Let x = chickens, y = rabbits.
Constraint 1: x + y = 35 (total heads)
Constraint 2: 2x + 4y = 94 (total legs)
From Constraint 1: x = 35 - y
Substitute into Constraint 2: 2(35 - y) + 4y = 94
Simplify: 70 - 2y + 4y = 94
Solve: 2y = 24, y = 12
Therefore: x = 35 - 12 = 23
Verification: 23 + 12 = 35 ✓, 2(23) + 4(12) = 46 + 48 = 94 ✓
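
The answer in both runs can be checked independently of the model. The following is a minimal sketch that solves the same two equations directly; the function name `solve_heads_and_legs` is made up for this example.

```python
def solve_heads_and_legs(heads: int, legs: int) -> tuple[int, int]:
    """Solve x + y = heads and 2x + 4y = legs for (chickens, rabbits)."""
    # Subtracting 2 * (x + y) from the legs equation leaves 2y = legs - 2 * heads.
    rabbits, remainder = divmod(legs - 2 * heads, 2)
    chickens = heads - rabbits
    if remainder or rabbits < 0 or chickens < 0:
        raise ValueError("no non-negative whole-number solution")
    return chickens, rabbits


print(solve_heads_and_legs(35, 94))  # (23, 12)
```

A deterministic check like this is useful when comparing effort levels, since it separates "the answer changed" from "the explanation got longer".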

Running

cd reasoning_effort
uv run reasoning_demo.py

Use Cases

High Reasoning Effort

Best for tasks where accuracy matters more than speed:

Mathematical proofs and multi-step logic problems
Complex code debugging and optimization
Strategic planning (TSP, scheduling, resource allocation)
Scientific reasoning and analysis

Medium Reasoning Effort

Good default for general-purpose tasks:

Data analysis and interpretation
Code generation
Technical explanations

Low Reasoning Effort

Use when speed and cost matter more than depth:

Simple calculations and factual queries
Classification and categorization
Quick lookups and straightforward assistance
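
The guidance in the three lists above could be encoded as a simple lookup so that callers do not hard-code effort strings. This is only a sketch: the task labels, the `EFFORT_BY_TASK` table, and the `pick_effort` helper are hypothetical, and the mapping is a starting point to tune against your own latency and quality measurements.

```python
# Hypothetical task labels mapped to the effort levels suggested above.
EFFORT_BY_TASK = {
    "proof": "high",
    "debugging": "high",
    "planning": "high",
    "scientific_analysis": "high",
    "data_analysis": "medium",
    "code_generation": "medium",
    "explanation": "medium",
    "simple_calculation": "low",
    "classification": "low",
    "lookup": "low",
}


def pick_effort(task_kind: str, default: str = "medium") -> str:
    """Return the suggested effort for a task, falling back to the default."""
    return EFFORT_BY_TASK.get(task_kind, default)


for kind in ("debugging", "classification", "something_else"):
    print(kind, "->", pick_effort(kind))
```

Falling back to `medium` for unknown task kinds matches its role as the general-purpose default.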

Prompting Tips for Reasoning Models

Reasoning models work differently from standard GPT models. OpenAI recommends treating them like an experienced colleague rather than giving explicit step-by-step instructions:

Keep prompts high-level — describe the goal, not the individual steps. The model's internal reasoning handles the decomposition.
Avoid chain-of-thought prompts — phrases like "think step by step" are unnecessary and can actually degrade performance since the model already reasons internally.
Use delimiters for clarity — triple quotes, XML tags, or markdown headers help the model parse complex inputs.
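
As an illustration of these tips, the sketch below assembles a prompt that states only the goal and wraps the input material in an XML-style tag. The `build_prompt` helper and the default tag name `document` are assumptions made for this example, not a required format.

```python
def build_prompt(goal: str, material: str, tag: str = "document") -> str:
    """Combine a high-level goal with clearly delimited input material."""
    # No "think step by step" instruction: the model reasons internally.
    return f"{goal.strip()}\n\n<{tag}>\n{material.strip()}\n</{tag}>"


prompt = build_prompt(
    "Identify the scheduling conflicts in this plan and propose a fix.",
    "Team A deploys Monday 09:00. Team B freezes the same service Monday 08:00-12:00.",
)
print(prompt)
```

Keeping the goal and the material in separate, labeled parts makes it easy to swap in a different input without touching the instruction.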