Prompts & Retry API

This page documents the exported prompt builders, the static prompt/tool description constants, and the auto-retry helpers.

Prompt Builders

get_subagent_system_prompt

subagents_pydantic_ai.get_subagent_system_prompt(configs, include_dual_mode=True)

Generate system prompt section describing available subagents.

Parameters:

  • configs (list[SubAgentConfig], required): List of subagent configurations.
  • include_dual_mode (bool, default True): Whether to include dual-mode execution explanation.

Returns:

  • str: Formatted system prompt section.

Example
Python
configs = [
    SubAgentConfig(
        name="researcher",
        description="Researches topics",
        instructions="...",
    ),
]
prompt = get_subagent_system_prompt(configs)
Source code in src/subagents_pydantic_ai/prompts.py
Python
def get_subagent_system_prompt(
    configs: list[SubAgentConfig],
    include_dual_mode: bool = True,
) -> str:
    """Generate system prompt section describing available subagents.

    Args:
        configs: List of subagent configurations.
        include_dual_mode: Whether to include dual-mode execution explanation.

    Returns:
        Formatted system prompt section.

    Example:
        ```python
        configs = [
            SubAgentConfig(
                name="researcher",
                description="Researches topics",
                instructions="...",
            ),
        ]
        prompt = get_subagent_system_prompt(configs)
        ```
    """
    lines = ["## Available Subagents", ""]
    lines.append("Use the `task` tool to delegate work to these subagents:")
    lines.append("")

    for config in configs:
        name = config["name"]
        description = config["description"]
        lines.append(f"- **{name}**: {description}")

        # Add hint if agent cannot ask questions
        if config.get("can_ask_questions") is False:
            lines[-1] += " *(cannot ask clarifying questions)*"

    return "\n".join(lines)
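
To make the shape of the generated section concrete, the following is a minimal sketch that mirrors the loop in the listing above. It uses plain dicts as stand-ins for SubAgentConfig so that it runs without the library installed, and `render_subagent_section` is a hypothetical name used only for this illustration, not part of the package.

```python
def render_subagent_section(configs: list[dict]) -> str:
    """Stand-in mirroring the body of get_subagent_system_prompt listed above."""
    lines = ["## Available Subagents", ""]
    lines.append("Use the `task` tool to delegate work to these subagents:")
    lines.append("")
    for config in configs:
        line = f"- **{config['name']}**: {config['description']}"
        # Only an explicit False adds the hint; a missing key does not.
        if config.get("can_ask_questions") is False:
            line += " *(cannot ask clarifying questions)*"
        lines.append(line)
    return "\n".join(lines)


configs = [
    {"name": "researcher", "description": "Researches topics"},
    {"name": "formatter", "description": "Formats reports", "can_ask_questions": False},
]
section = render_subagent_section(configs)
print(section)
```

Each subagent becomes one bullet, and the "cannot ask clarifying questions" hint appears only for configs that set can_ask_questions to False explicitly.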

get_task_instructions_prompt

subagents_pydantic_ai.get_task_instructions_prompt(task_description, can_ask_questions=True, max_questions=None)

Generate the task instructions for a subagent.

Parameters:

  • task_description (str, required): The task to perform.
  • can_ask_questions (bool, default True): Whether the subagent can ask the parent questions.
  • max_questions (int | None, default None): Maximum number of questions allowed.

Returns:

  • str: Formatted task instructions.

Source code in src/subagents_pydantic_ai/prompts.py
Python
def get_task_instructions_prompt(
    task_description: str,
    can_ask_questions: bool = True,
    max_questions: int | None = None,
) -> str:
    """Generate the task instructions for a subagent.

    Args:
        task_description: The task to perform.
        can_ask_questions: Whether the subagent can ask the parent questions.
        max_questions: Maximum number of questions allowed.

    Returns:
        Formatted task instructions.
    """
    lines = ["## Your Task", "", task_description, ""]

    if can_ask_questions:
        lines.append("## Asking Questions")
        lines.append("If you need clarification, use the `ask_parent` tool.")
        if max_questions is not None:
            lines.append(f"You may ask up to {max_questions} questions.")
        lines.append("Keep questions specific and essential.")
    else:
        lines.append("## Note")
        lines.append("Complete this task using your best judgment.")
        lines.append("You cannot ask the parent for clarification.")

    return "\n".join(lines)
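
The two branches are easiest to compare side by side. The sketch below copies the logic of the listing above into a hypothetical stand-in, `render_task_instructions`, so it can run on its own. It is an illustration of the documented behaviour, not the library's own code path.

```python
def render_task_instructions(task_description, can_ask_questions=True, max_questions=None):
    """Stand-in mirroring get_task_instructions_prompt as listed above."""
    lines = ["## Your Task", "", task_description, ""]
    if can_ask_questions:
        lines.append("## Asking Questions")
        lines.append("If you need clarification, use the `ask_parent` tool.")
        if max_questions is not None:
            lines.append(f"You may ask up to {max_questions} questions.")
        lines.append("Keep questions specific and essential.")
    else:
        lines.append("## Note")
        lines.append("Complete this task using your best judgment.")
        lines.append("You cannot ask the parent for clarification.")
    return "\n".join(lines)


task = "Summarise the changelog"
limited = render_task_instructions(task, max_questions=2)
autonomous = render_task_instructions(task, can_ask_questions=False)
# The question limit is only read inside the can_ask_questions branch.
ignored_limit = render_task_instructions(task, can_ask_questions=False, max_questions=2)
```

Judging by the listing, max_questions has no effect once can_ask_questions is False, so `ignored_limit` and `autonomous` are the same string.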

Prompt & Description Constants

These string constants are exported for inspection and overriding. The *_DESCRIPTION constants are the default model-facing tool descriptions used by create_subagent_toolset (override them per-tool via its descriptions argument).

subagents_pydantic_ai.SUBAGENT_SYSTEM_PROMPT = 'You are a specialized subagent working on a delegated task.\n\n## Your Role\nYou have been spawned by a parent agent to handle a specific task. Focus entirely\non completing the assigned task to the best of your ability.\n\n## Communication\n- If you need clarification, use the `ask_parent` tool to ask the parent agent\n- Keep questions specific and actionable\n- Do not ask unnecessary questions - use your judgment when possible\n- If you cannot complete a task, explain why clearly\n\n## Task Completion\n- Complete the task thoroughly before returning\n- Provide clear, structured results\n- If the task cannot be completed, explain what was attempted and why it failed\n' module-attribute

subagents_pydantic_ai.DUAL_MODE_SYSTEM_PROMPT = '## Subagent Execution Modes\n\nYou can delegate tasks to subagents in two modes:\n\n### Sync Mode (Default)\n- Use for simple, quick tasks\n- Use when you need the result immediately\n- Use when the task requires back-and-forth communication\n- The task runs and you wait for the result\n\n### Async Mode (Background)\n- Use for complex, long-running tasks\n- Use when you can continue with other work while waiting\n- Use for tasks that can run independently\n- Returns a task handle immediately - check status later\n' module-attribute

subagents_pydantic_ai.DEFAULT_GENERAL_PURPOSE_DESCRIPTION = 'A general-purpose agent for a wide variety of tasks.\nUse this when no specialized subagent matches the task requirements.\nCapable of research, analysis, writing, and problem-solving.' module-attribute

subagents_pydantic_ai.TASK_TOOL_DESCRIPTION = 'Delegate a task to a specialized subagent. The subagent runs independently with its own context and tools, and returns a result when done.\n\n## When to use\n- Complex multi-step tasks that can run independently from your main work\n- Research or exploration tasks (e.g., "find all usages of function X", "understand how module Y works") — delegate so you can continue other work\n- Multiple independent subtasks that can run in parallel — launch several subagents simultaneously for maximum efficiency\n- Tasks that require deep focus on a single area while you handle the big picture\n\n## When NOT to use\n- Trivial tasks you can do faster yourself (single file read, simple grep)\n- Tasks that require your full conversation context — subagents don\'t share your message history\n- Tasks that need back-and-forth with the user — subagents work autonomously\n\n## Usage notes\n- **Be specific**: Subagents don\'t share your context. Include all necessary details in the description: file paths, function names, expected behavior, constraints. The more specific, the better the result.\n- **Launch in parallel**: When you have multiple independent tasks, call `task()` multiple times in a single response. They run concurrently.\n- **Synthesize results**: When subagents return, combine and analyze their results before presenting to the user. Don\'t just relay raw output.\n- **Choose the right subagent**: Match the subagent_type to the task. Use "general-purpose" when no specialized subagent fits.\n\n## Execution modes\n- **"sync"** (default): Blocks until the subagent completes. Use for quick tasks or when you need the result immediately.\n- **"async"**: Returns a task handle immediately. Use for long-running tasks where you can continue other work. 
Check results with `check_task()` or wait with `wait_tasks()`.\n- **"auto"**: Automatically picks sync or async based on task complexity.\n\nReturns:\n- In sync mode: The subagent\'s response as a string.\n- In async mode: A task handle with task_id for status checking.\n' module-attribute

subagents_pydantic_ai.CHECK_TASK_DESCRIPTION = "Check the status of a background (async) task and get its result if completed.\n\nUse this after launching async tasks to see if they're done. Returns the task status (running, completed, failed, waiting_for_answer) and the result if available." module-attribute

subagents_pydantic_ai.ANSWER_SUBAGENT_DESCRIPTION = 'Answer a question from a background subagent that is waiting for clarification.\n\nWhen a task has status WAITING_FOR_ANSWER, the subagent needs information from you before it can continue. Provide a clear, specific answer.' module-attribute

subagents_pydantic_ai.LIST_ACTIVE_TASKS_DESCRIPTION = 'List all currently active background tasks with their status.\n\nUse this to see what async tasks are running and their current state.' module-attribute

subagents_pydantic_ai.WAIT_TASKS_DESCRIPTION = 'Wait for one or more background tasks to finish before continuing.\n\nA task is "finished" when it is completed, failed, or cancelled.\n\n## Modes\n\n- **mode="all"** (default): block until every task in `task_ids` is finished, or the timeout is reached. Use when you genuinely need every result together before the next step (e.g. final synthesis across all subagents).\n- **mode="any"**: return as soon as ONE task finishes. Use when the subagents are independent and you can start acting on each finisher immediately — this avoids stalling on the slowest task. After reacting to the finisher, call `wait_tasks` again on the remaining ids (or use `check_task`) to handle the rest.\n\n## When to prefer `mode="any"`\n\nWhen you\'ve dispatched several async tasks in parallel and any individual result is independently useful (e.g. routing decisions, progressive synthesis, fan-out research). Reactive orchestration is almost always faster than waiting on the slowest agent.\n\n## Output\n\nThe result lists every requested task with its current state and a header showing `mode`, `<finished>/<total> finished`, and how many are still running. Unfinished tasks remain in the background — you can keep working or wait on them again later.' module-attribute

subagents_pydantic_ai.SOFT_CANCEL_TASK_DESCRIPTION = 'Request cooperative cancellation of a background task. The subagent will be notified and can clean up before stopping. Use this for graceful cancellation.' module-attribute

subagents_pydantic_ai.HARD_CANCEL_TASK_DESCRIPTION = "Immediately cancel a background task. The task will be forcefully stopped. Use only when soft cancellation doesn't work or immediate stopping is required." module-attribute

Retry

See Auto-Retry for a conceptual overview.

RetryConfig

subagents_pydantic_ai.RetryConfig dataclass

Resolved retry policy for a subagent run.

Attributes:

  • max_retries (int): Number of additional attempts after the first failure. Defaults to 3 so subagents are resilient to flaky model gateways/networks out of the box. Set it to 0 to disable retrying entirely (the legacy agent.run() opt-out path).
  • initial_delay (float): Seconds to wait before the first retry.
  • max_delay (float): Upper bound for the backoff delay, in seconds.
  • backoff_multiplier (float): Factor by which the delay is multiplied on each successive attempt.
  • jitter (bool): When True, the delay is randomised in [0, computed_delay] (full jitter) to avoid a thundering herd across many concurrent subagents.
  • retry_on (RetryPredicate | None): Predicate deciding whether an exception is transient. None uses is_transient_error.

Source code in src/subagents_pydantic_ai/retry.py
Python
@dataclass(frozen=True)
class RetryConfig:
    """Resolved retry policy for a subagent run.

    Attributes:
        max_retries: Number of *additional* attempts after the first
            failure. Defaults to `3` so subagents are resilient to
            flaky model gateways/networks out of the box. Set `0` to
            disable retrying entirely (the legacy `agent.run()`
            opt-out path).
        initial_delay: Seconds to wait before the first retry.
        max_delay: Upper bound for the backoff delay, in seconds.
        backoff_multiplier: The delay is multiplied by this each attempt.
        jitter: When `True`, the delay is randomised in
            `[0, computed_delay]` (full jitter) to avoid a thundering
            herd across many concurrent subagents.
        retry_on: Predicate deciding whether an exception is transient.
            `None` uses :func:`is_transient_error`.
    """

    max_retries: int = 3
    initial_delay: float = 1.0
    max_delay: float = 30.0
    backoff_multiplier: float = 2.0
    jitter: bool = True
    retry_on: RetryPredicate | None = None

    @classmethod
    def from_config(cls, config: SubAgentConfig) -> RetryConfig:
        """Build a :class:`RetryConfig` from a :class:`SubAgentConfig`.

        Missing keys fall back to the dataclass defaults, so a config
        without any `retry_*` keys yields the default policy (3
        retries with exponential backoff).
        """
        return cls(
            max_retries=config.get("max_retries", 3),
            initial_delay=config.get("retry_initial_delay", 1.0),
            max_delay=config.get("retry_max_delay", 30.0),
            backoff_multiplier=config.get("retry_backoff_multiplier", 2.0),
            jitter=config.get("retry_jitter", True),
            retry_on=config.get("retry_on"),
        )

    def should_retry(self, exc: BaseException) -> bool:
        """Return whether *exc* is retryable under this policy."""
        predicate = self.retry_on or is_transient_error
        return predicate(exc)

from_config(config) classmethod

Build a RetryConfig from a SubAgentConfig.

Missing keys fall back to the dataclass defaults, so a config without any retry_* keys yields the default policy (3 retries with exponential backoff).

Source code in src/subagents_pydantic_ai/retry.py
Python
@classmethod
def from_config(cls, config: SubAgentConfig) -> RetryConfig:
    """Build a :class:`RetryConfig` from a :class:`SubAgentConfig`.

    Missing keys fall back to the dataclass defaults, so a config
    without any `retry_*` keys yields the default policy (3
    retries with exponential backoff).
    """
    return cls(
        max_retries=config.get("max_retries", 3),
        initial_delay=config.get("retry_initial_delay", 1.0),
        max_delay=config.get("retry_max_delay", 30.0),
        backoff_multiplier=config.get("retry_backoff_multiplier", 2.0),
        jitter=config.get("retry_jitter", True),
        retry_on=config.get("retry_on"),
    )
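
The fallback behaviour can be checked with a small stand-in. The sketch below assumes a plain dict in place of SubAgentConfig and a hypothetical `RetryPolicy` dataclass that copies the fields and defaults listed above (retry_on is left out for brevity).

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class RetryPolicy:
    """Stand-in for RetryConfig with the same numeric fields and defaults."""

    max_retries: int = 3
    initial_delay: float = 1.0
    max_delay: float = 30.0
    backoff_multiplier: float = 2.0
    jitter: bool = True

    @classmethod
    def from_config(cls, config: dict[str, Any]) -> "RetryPolicy":
        # Same key names as the listing: max_retries is unprefixed,
        # every other key carries the retry_ prefix.
        return cls(
            max_retries=config.get("max_retries", 3),
            initial_delay=config.get("retry_initial_delay", 1.0),
            max_delay=config.get("retry_max_delay", 30.0),
            backoff_multiplier=config.get("retry_backoff_multiplier", 2.0),
            jitter=config.get("retry_jitter", True),
        )


default_policy = RetryPolicy.from_config({"name": "researcher"})
tuned_policy = RetryPolicy.from_config(
    {"name": "researcher", "max_retries": 5, "retry_max_delay": 10.0, "retry_jitter": False}
)
```

A config with no retry keys yields the default policy, while `tuned_policy` overrides only the keys it names and keeps the default initial_delay and backoff_multiplier.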

should_retry(exc)

Return whether exc is retryable under this policy.

Source code in src/subagents_pydantic_ai/retry.py
Python
def should_retry(self, exc: BaseException) -> bool:
    """Return whether *exc* is retryable under this policy."""
    predicate = self.retry_on or is_transient_error
    return predicate(exc)
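
One consequence of the `self.retry_on or is_transient_error` fallback is that a custom predicate replaces the default check instead of extending it. The sketch below illustrates this with hypothetical stand-ins: `RetryPolicy` for RetryConfig, and `default_predicate` for is_transient_error (simplified here to "connection errors only").

```python
from dataclasses import dataclass
from typing import Callable, Optional


def default_predicate(exc: BaseException) -> bool:
    """Hypothetical stand-in for is_transient_error."""
    return isinstance(exc, ConnectionError)


@dataclass(frozen=True)
class RetryPolicy:
    retry_on: Optional[Callable[[BaseException], bool]] = None

    def should_retry(self, exc: BaseException) -> bool:
        # Same fallback as the listing: no predicate means "use the default".
        predicate = self.retry_on or default_predicate
        return predicate(exc)


default_policy = RetryPolicy()
timeouts_only = RetryPolicy(retry_on=lambda exc: isinstance(exc, TimeoutError))

default_retries_reset = default_policy.should_retry(ConnectionResetError())
custom_retries_reset = timeouts_only.should_retry(ConnectionResetError())
custom_retries_timeout = timeouts_only.should_retry(TimeoutError())
```

A predicate that should keep the default behaviour and add cases on top would need to call the default check itself, for example `lambda exc: default_predicate(exc) or isinstance(exc, TimeoutError)`.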

run_with_retry

subagents_pydantic_ai.run_with_retry(agent, user_prompt, *, run_kwargs, retry, on_retry=None, sleep=asyncio.sleep, event_stream_handler=None, cancel_check=None, inject_messages=None) async

Run agent with auto-retry on transient errors.

When retry.max_retries <= 0 this is exactly agent.run(...) — the legacy path, unchanged. Otherwise the agent is driven via agent.iter() so that, on a transient failure, the accumulated message history from the failed attempt is replayed via message_history on the next attempt and the subagent resumes instead of restarting from scratch.

Parameters:

  • agent (Any, required): The pydantic-ai Agent to run.
  • user_prompt (str | None, required): Initial prompt. After a retry that captured history it is set to None because the prompt is already replayed inside message_history.
  • run_kwargs (dict[str, Any], required): Extra kwargs forwarded to agent.run/agent.iter (deps, toolsets, ...). A caller-supplied message_history is honoured as the starting history.
  • retry (RetryConfig, required): The resolved retry policy.
  • on_retry (OnRetryCallback | None, default None): Optional callback invoked before each retry sleep with (attempt, exc, delay). May be sync or async.
  • sleep (Callable[[float], Awaitable[None]], default asyncio.sleep): Async sleep function, injectable for tests.
  • event_stream_handler (Any | None, default None): Optional override for the agent's configured event_stream_handler. When None the agent's own handler (agent.event_stream_handler) is used, so streaming to a platform (e.g. tool-call/reasoning events to Kafka) keeps working across retries, matching agent.run() semantics.
  • cancel_check (Callable[[], bool] | None, default None): Optional callable polled between graph nodes for cooperative (soft) cancellation. When it returns True the run stops at the next node boundary by raising asyncio.CancelledError. Only honoured on the retry-driven path (max_retries > 0); the legacy agent.run() fast path (max_retries <= 0) does not expose node boundaries, so soft cancel is best-effort there.
  • inject_messages (Callable[[], Awaitable[list[str]]] | None, default None): Optional async callable awaited before each model request; its returned strings are appended to that request as user instructions (unprompted parent -> child steering). Like cancel_check, only honoured on the retry-driven path (max_retries > 0); the legacy agent.run() fast path does not expose node boundaries, so steering messages stay queued there.

Returns:

  • Any: The AgentRunResult of the first successful attempt.

Raises:

  • Exception: The last exception when retries are exhausted or the error is not transient. asyncio.CancelledError is a BaseException and is never caught here, so cooperative/hard task cancellation propagates unchanged.

Source code in src/subagents_pydantic_ai/retry.py
Python
async def run_with_retry(
    agent: Any,
    user_prompt: str | None,
    *,
    run_kwargs: dict[str, Any],
    retry: RetryConfig,
    on_retry: OnRetryCallback | None = None,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
    event_stream_handler: Any | None = None,
    cancel_check: Callable[[], bool] | None = None,
    inject_messages: Callable[[], Awaitable[list[str]]] | None = None,
) -> Any:
    """Run *agent* with auto-retry on transient errors.

    When `retry.max_retries <= 0` this is exactly `agent.run(...)` —
    the legacy path, unchanged. Otherwise the agent is driven via
    `agent.iter()` so that, on a transient failure, the accumulated
    message history from the failed attempt is replayed via
    `message_history` on the next attempt and the subagent resumes
    instead of restarting from scratch.

    Args:
        agent: The pydantic-ai `Agent` to run.
        user_prompt: Initial prompt. After a retry that captured history
            it is set to `None` because the prompt is already replayed
            inside `message_history`.
        run_kwargs: Extra kwargs forwarded to `agent.run`/`agent.iter`
            (`deps`, `toolsets`, ...). A caller-supplied
            `message_history` is honoured as the starting history.
        retry: The resolved retry policy.
        on_retry: Optional callback invoked before each retry sleep with
            `(attempt, exc, delay)`. May be sync or async.
        sleep: Async sleep function, injectable for tests.
        event_stream_handler: Optional override for the agent's configured
            `event_stream_handler`. When `None` the agent's own handler
            (`agent.event_stream_handler`) is used, so streaming to a
            platform (e.g. tool-call/reasoning events to Kafka) keeps working
            across retries — matching `agent.run()` semantics.
        cancel_check: Optional callable polled between graph nodes for
            cooperative (soft) cancellation. When it returns `True` the run
            stops at the next node boundary by raising
            `asyncio.CancelledError`. Only honoured on the retry-driven path
            (`max_retries > 0`); the legacy `agent.run()` fast path
            (`max_retries <= 0`) does not expose node boundaries, so soft
            cancel is best-effort there.
        inject_messages: Optional async callable awaited before each model
            request; its returned strings are appended to that request as
            user instructions (unprompted parent -> child steering). Like
            `cancel_check`, only honoured on the retry-driven path
            (`max_retries > 0`); the legacy `agent.run()` fast path does not
            expose node boundaries, so steering messages stay queued there.

    Returns:
        The `AgentRunResult` of the first successful attempt.

    Raises:
        Exception: The last exception when retries are exhausted or the error
            is not transient. `asyncio.CancelledError` is a `BaseException`
            and is never caught here, so cooperative/hard task cancellation
            propagates unchanged.
    """
    # An explicit handler overrides the agent's; otherwise inherit the agent's
    # own, exactly as agent.run() does (event_stream_handler or self.…).
    handler = event_stream_handler or getattr(agent, "event_stream_handler", None)

    if retry.max_retries <= 0:
        # Fast path: agent.run() already drives streaming and honours the
        # agent's handler. Only forward an explicit override.
        if event_stream_handler is not None:
            run_kwargs = {**run_kwargs, "event_stream_handler": event_stream_handler}
        return await agent.run(user_prompt, **run_kwargs)

    message_history = run_kwargs.pop("message_history", None)
    prompt = user_prompt
    attempt = 0
    while True:
        run = None
        try:
            async with agent.iter(prompt, message_history=message_history, **run_kwargs) as run:
                await _drive_run(agent, run, handler, cancel_check, inject_messages)
            return run.result
        except Exception as exc:
            if attempt >= retry.max_retries or not retry.should_retry(exc):
                raise
            attempt += 1
            # Resume from wherever the failed attempt got to. `run` is
            # None only if `agent.iter()` failed before yielding.
            if run is not None:
                accumulated = run.all_messages()
                if accumulated:
                    message_history = accumulated
                    prompt = None
            delay = compute_backoff_delay(attempt, retry)
            if on_retry is not None:
                maybe_coro = on_retry(attempt, exc, delay)
                if asyncio.iscoroutine(maybe_coro):
                    await maybe_coro
            await sleep(delay)
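
The control flow of the loop above (attempt counting, the sync-or-async on_retry callback, and the injectable sleep) can be exercised without a model. The sketch below is a simplified stand-in: `call_with_retry` is a hypothetical helper that retries any async callable, treats ConnectionError as the only transient error, takes its delays from a fixed list, and omits the message-history resume that the real function performs through agent.iter().

```python
import asyncio


async def call_with_retry(operation, *, max_retries, delays, on_retry=None, sleep=asyncio.sleep):
    """Simplified retry loop modelled on the listing above (no history resume)."""
    attempt = 0
    while True:
        try:
            return await operation()
        except Exception as exc:  # CancelledError is a BaseException and escapes
            if attempt >= max_retries or not isinstance(exc, ConnectionError):
                raise
            attempt += 1
            delay = delays[attempt - 1]
            if on_retry is not None:
                maybe_coro = on_retry(attempt, exc, delay)
                if asyncio.iscoroutine(maybe_coro):
                    await maybe_coro  # async callbacks are awaited, sync ones are done
            await sleep(delay)


calls = {"count": 0}
events = []
slept = []


async def flaky():
    # Fails twice with a transient error, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionResetError("gateway dropped the connection")
    return "done"


async def fake_sleep(seconds):
    slept.append(seconds)  # record the delay instead of actually waiting


async def record_retry(attempt, exc, delay):
    events.append((attempt, type(exc).__name__, delay))


result = asyncio.run(
    call_with_retry(flaky, max_retries=3, delays=[1.0, 2.0, 4.0],
                    on_retry=record_retry, sleep=fake_sleep)
)
```

Injecting `fake_sleep` keeps the run instantaneous while still recording the delays, which is the purpose the sleep parameter is documented for. Going by the signature above, a real call would presumably look like `await run_with_retry(agent, prompt, run_kwargs={}, retry=RetryConfig(max_retries=2), on_retry=record_retry)`.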

is_transient_error

subagents_pydantic_ai.is_transient_error(exc)

Return True if exc looks like a transient networking failure.

Treated as transient (worth retrying):

  • ModelHTTPError with a 408/409/425/429/5xx status code — gateway hiccups, rate limits or upstream overload, typical with proxies such as LiteLLM.
  • ModelAPIError that is not an HTTP error — connection resets, read timeouts and other transport-level problems surfaced by the model client.

Everything else (auth/4xx, UnexpectedModelBehavior, UsageLimitExceeded, UserError, validation errors, task cancellation, ...) is treated as non-transient and is not retried.

Source code in src/subagents_pydantic_ai/retry.py
Python
def is_transient_error(exc: BaseException) -> bool:
    """Return `True` if *exc* looks like a transient networking failure.

    Treated as transient (worth retrying):

    - `ModelHTTPError` with a 408/409/425/429/5xx status code — gateway
      hiccups, rate limits or upstream overload, typical with proxies
      such as LiteLLM.
    - `ModelAPIError` that is *not* an HTTP error — connection resets,
      read timeouts and other transport-level problems surfaced by the
      model client.

    Everything else (auth/4xx, `UnexpectedModelBehavior`,
    `UsageLimitExceeded`, `UserError`, validation errors, task
    cancellation, ...) is treated as non-transient and is not retried.
    """
    if isinstance(exc, ModelHTTPError):
        return exc.status_code in _TRANSIENT_STATUS_CODES
    # A bare ModelAPIError (no HTTP status) is a transport/connection
    # error from the model client — safe to retry.
    return isinstance(exc, ModelAPIError)
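
The classification rules can be tried out with stand-in exception types. In the sketch below the two exception classes are local stand-ins for the pydantic-ai exceptions of the same names, ModelHTTPError is assumed to subclass ModelAPIError, and the contents of the status-code set are an assumption reconstructed from the docstring (408/409/425/429/5xx), since _TRANSIENT_STATUS_CODES itself is not shown on this page.

```python
class ModelAPIError(Exception):
    """Local stand-in for the pydantic-ai exception of the same name."""


class ModelHTTPError(ModelAPIError):
    """Local stand-in: an API error that carries an HTTP status code."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


# Assumed contents, reconstructed from the docstring above.
TRANSIENT_STATUS_CODES = frozenset({408, 409, 425, 429, *range(500, 600)})


def looks_transient(exc: BaseException) -> bool:
    # The HTTP check must come first: with the subclass relationship assumed
    # here, a 401 would otherwise pass the broader ModelAPIError test.
    if isinstance(exc, ModelHTTPError):
        return exc.status_code in TRANSIENT_STATUS_CODES
    return isinstance(exc, ModelAPIError)


verdicts = {
    "429": looks_transient(ModelHTTPError(429)),
    "503": looks_transient(ModelHTTPError(503)),
    "401": looks_transient(ModelHTTPError(401)),
    "transport": looks_transient(ModelAPIError("connection reset")),
    "other": looks_transient(ValueError("bad input")),
}
```

Rate limits, server errors and bare transport errors come out as retryable, while an auth failure and an unrelated exception do not.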

compute_backoff_delay

subagents_pydantic_ai.compute_backoff_delay(attempt, cfg, rng=random.uniform)

Compute the delay (seconds) before retry attempt (1-based).

Exponential backoff (initial_delay * multiplier ** (attempt - 1)) capped at cfg.max_delay. With cfg.jitter the result is randomised in [0, delay] (full jitter). rng is injectable for deterministic tests.

Source code in src/subagents_pydantic_ai/retry.py
Python
def compute_backoff_delay(
    attempt: int,
    cfg: RetryConfig,
    rng: Callable[[float, float], float] = random.uniform,
) -> float:
    """Compute the delay (seconds) before retry *attempt* (1-based).

    Exponential backoff (`initial_delay * multiplier ** (attempt - 1)`)
    capped at `cfg.max_delay`. With `cfg.jitter` the result is
    randomised in `[0, delay]` (full jitter). `rng` is injectable for
    deterministic tests.
    """
    base = cfg.initial_delay * (cfg.backoff_multiplier ** (attempt - 1))
    delay = min(base, cfg.max_delay)
    if cfg.jitter:
        delay = rng(0.0, delay)
    return delay
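
As a worked example of the formula, the sketch below reproduces the computation with a hypothetical `BackoffSettings` dataclass standing in for RetryConfig (only the four fields the formula reads) and prints the schedule for the documented defaults with jitter switched off.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class BackoffSettings:
    """Stand-in carrying the RetryConfig fields the formula reads."""

    initial_delay: float = 1.0
    max_delay: float = 30.0
    backoff_multiplier: float = 2.0
    jitter: bool = True


def backoff_delay(
    attempt: int,
    cfg: BackoffSettings,
    rng: Callable[[float, float], float] = random.uniform,
) -> float:
    # Exponential growth from initial_delay, capped at max_delay.
    base = cfg.initial_delay * (cfg.backoff_multiplier ** (attempt - 1))
    delay = min(base, cfg.max_delay)
    if cfg.jitter:
        delay = rng(0.0, delay)  # full jitter: anywhere in [0, delay]
    return delay


no_jitter = BackoffSettings(jitter=False)
schedule = [backoff_delay(n, no_jitter) for n in range(1, 8)]
print(schedule)  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]

# A deterministic rng makes the jittered path testable: always pick the midpoint.
midpoint = backoff_delay(3, BackoffSettings(), rng=lambda low, high: (low + high) / 2)
```

The cap takes effect from the sixth attempt, where the uncapped value would be 32 seconds. With the default of three retries the waits are at most 1, 2 and 4 seconds, so a run spends no more than 7 seconds sleeping in total, and less on average when jitter is enabled.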