Context Management for Pydantic AI

Automatic conversation summarization and context management for Pydantic AI agents.


Context Management for Pydantic AI helps your agents handle long conversations without exceeding model context limits. Choose between intelligent LLM summarization or fast sliding window trimming.

  • Intelligent Summarization

    LLM-powered compression that preserves key information

  • Sliding Window

    Zero-cost message trimming for maximum speed

  • Safe Cutoff

    Never breaks tool call/response pairs

  • Flexible Configuration

    Message, token, or fraction-based triggers
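The safe-cutoff guarantee can be illustrated with a standalone sketch. This is plain Python showing the idea, not the library's internals, and the message shapes used here are assumptions: when trimming, the cutoff index walks backwards until the kept window no longer starts on an orphaned tool response.

```python
# Illustrative sketch only: a trimmer that never splits a tool call from
# its response. The dict-based message shape is an assumption for this
# example, not the library's actual message types.

def safe_cutoff(messages: list[dict], keep: int) -> int:
    """Return a cutoff index keeping at least `keep` trailing messages,
    widened so the kept window never starts on an orphaned tool response."""
    cutoff = max(0, len(messages) - keep)
    # Walk the cutoff back while the first kept message is a tool response,
    # so its matching tool call stays in the window too.
    while cutoff > 0 and messages[cutoff].get("role") == "tool":
        cutoff -= 1
    return cutoff

history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "tool_call": "search"},
    {"role": "tool", "content": "results"},
    {"role": "assistant", "content": "answer"},
]
idx = safe_cutoff(history, keep=2)
print(history[idx:])  # window widened to keep the tool call/response pair
```

A naive `messages[-keep:]` slice here would start the window on the tool response and break the pair; the walk-back keeps it intact.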

The recommended way to add context management is ContextManagerCapability:

Python
from pydantic_ai import Agent
from pydantic_ai_summarization import ContextManagerCapability

agent = Agent(
    "openai:gpt-4.1",
    capabilities=[ContextManagerCapability(max_tokens=100_000)],
)

Combine it with LimitWarnerCapability to get a warning before limits are hit:

Python
from pydantic_ai_summarization import ContextManagerCapability, LimitWarnerCapability

agent = Agent(
    "openai:gpt-4.1",
    capabilities=[
        LimitWarnerCapability(max_iterations=40, max_context_tokens=100_000),
        ContextManagerCapability(max_tokens=100_000),
    ],
)
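The limit-warning pattern itself is simple, sketched below in plain Python. This is a conceptual illustration with hypothetical names (`check_limits`, `threshold`), not the library's implementation: once usage crosses a fraction of either budget, a warning is emitted before the hard limit is reached.

```python
# Conceptual sketch of a limit warner (hypothetical helper, not the
# library's implementation): warn once usage crosses a fraction of a budget.
import warnings

def check_limits(iteration: int, context_tokens: int,
                 max_iterations: int = 40,
                 max_context_tokens: int = 100_000,
                 threshold: float = 0.8) -> list[str]:
    """Return (and emit) warning messages once usage crosses `threshold`
    of the iteration or context-token budget."""
    notes = []
    if iteration >= threshold * max_iterations:
        notes.append(f"iterations at {iteration}/{max_iterations}")
    if context_tokens >= threshold * max_context_tokens:
        notes.append(f"context at {context_tokens}/{max_context_tokens} tokens")
    for note in notes:
        warnings.warn(note)
    return notes

check_limits(35, 50_000)  # warns about iterations only: 35 >= 0.8 * 40
```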

Available Options

Option                    Type        LLM Cost         Best For
ContextManagerCapability  Capability  Per compression  Production apps (recommended)
SummarizationCapability   Capability  High             Quality-focused apps
SlidingWindowCapability   Capability  Zero             Speed/cost-focused apps
LimitWarnerCapability     Capability  Zero             Warning before limits hit
SummarizationProcessor    Processor   High             Standalone use
SlidingWindowProcessor    Processor   Zero             Standalone use
LimitWarnerProcessor      Processor   Zero             Standalone use

Alternative: Processor API

Python
from pydantic_ai import Agent
from pydantic_ai_summarization import create_summarization_processor

processor = create_summarization_processor(
    trigger=("tokens", 100_000),
    keep=("messages", 20),
)

agent = Agent(
    "openai:gpt-4o",
    history_processors=[processor],
)

result = await agent.run("Hello!")

Zero-Cost Sliding Window

Simply discards old messages — no LLM calls:

Python
from pydantic_ai import Agent
from pydantic_ai_summarization import create_sliding_window_processor

processor = create_sliding_window_processor(
    trigger=("messages", 100),
    keep=("messages", 50),
)

agent = Agent(
    "openai:gpt-4o",
    history_processors=[processor],
)

result = await agent.run("Hello!")
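The trigger/keep behaviour above amounts to one rule, sketched here in plain Python as a hypothetical helper (not the library's code): once the history reaches the trigger count, only the most recent messages are kept.

```python
# Conceptual sketch of sliding-window trimming (hypothetical helper,
# not the library's implementation).

def sliding_window(messages: list, trigger: int = 100, keep: int = 50) -> list:
    """Once the history reaches `trigger` messages, keep only the last `keep`."""
    if len(messages) >= trigger:
        return messages[-keep:]
    return messages  # under the trigger: history passes through untouched

history = [f"msg {i}" for i in range(120)]
trimmed = sliding_window(history)
print(len(trimmed))  # 50 once the 100-message trigger fires
```

Because this is pure list slicing with no LLM call, the behaviour is deterministic and effectively free, which is why the sliding window suits high-volume or cost-sensitive workloads.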

Choosing a Processor

Use SummarizationProcessor when:

  • Context quality is critical
  • You need to preserve key information from long conversations
  • LLM cost is acceptable for your use case

Use SlidingWindowProcessor when:

  • Speed and cost are priorities
  • Recent context is most important
  • You're running many parallel conversations
  • You want deterministic, predictable behavior

Related Packages

Package                Description
Pydantic Deep Agents   Full agent framework (uses this library)
pydantic-ai-backend    File storage and Docker sandbox
pydantic-ai-todo       Task planning toolset
subagents-pydantic-ai  Multi-agent orchestration
pydantic-ai            The foundation — agent framework by Pydantic

Next Steps