Types API

pydantic_ai_summarization.types

Type definitions for summarization-pydantic-ai.

TokenCounter = Callable[[Sequence[ModelMessage]], int] | Callable[[Sequence[ModelMessage]], Awaitable[int]] module-attribute

Function type that counts tokens in a sequence of messages.

Supports both synchronous and asynchronous callables. When an async callable is provided, the middleware will await the result.

Example

```python
from collections.abc import Sequence

from pydantic_ai.messages import ModelMessage

from pydantic_ai_summarization import SummarizationProcessor

# Sync counter (simple, fast): rough heuristic of ~4 characters per token
def my_token_counter(messages: Sequence[ModelMessage]) -> int:
    return sum(len(str(msg)) for msg in messages) // 4

# Async counter (using pydantic-ai's model-based counting)
async def model_token_counter(messages: Sequence[ModelMessage]) -> int:
    from pydantic_ai import models
    model = models.infer_model("openai:gpt-4.1")
    usage = await model.count_tokens(list(messages), None, None)
    return usage.request_tokens or 0

processor = SummarizationProcessor(
    model="openai:gpt-4.1",
    token_counter=my_token_counter,
)
```

ContextSize = ContextFraction | ContextTokens | ContextMessages module-attribute

Union type for all context size specifications.

Can be:

- ("fraction", float) - fraction of max_input_tokens (requires max_input_tokens)
- ("tokens", int) - absolute token count
- ("messages", int) - message count

Examples:

```python
# Trigger at 80% of context window
trigger: ContextSize = ("fraction", 0.8)

# Trigger at 100k tokens
trigger: ContextSize = ("tokens", 100000)

# Trigger at 50 messages
trigger: ContextSize = ("messages", 50)
```

ContextFraction = tuple[Literal['fraction'], float] module-attribute

Context size specified as a fraction of max_input_tokens.

Example: ("fraction", 0.8) means 80% of max_input_tokens.

ContextTokens = tuple[Literal['tokens'], int] module-attribute

Context size specified as an absolute token count.

Example: ("tokens", 100000) means 100,000 tokens.

ContextMessages = tuple[Literal['messages'], int] module-attribute

Context size specified as a message count.

Example: ("messages", 50) means 50 messages.

WarningOn = Literal['iterations', 'context_window', 'total_tokens'] module-attribute

Warning categories supported by LimitWarnerProcessor.

Can be:

- "iterations" - warn as request count approaches the configured maximum
- "context_window" - warn as the current message history approaches the configured context budget
- "total_tokens" - warn as cumulative run token usage approaches the configured maximum

Example
```python
warn_on: list[WarningOn] = ["iterations", "context_window"]
```
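For illustration, a processor could compare each enabled category's usage fraction against a warning threshold. This is a sketch only; the `crossed` helper, its `usage` mapping, and the 0.8 default are assumptions, not the LimitWarnerProcessor API:

```python
from typing import Literal

WarningOn = Literal["iterations", "context_window", "total_tokens"]

def crossed(
    warn_on: list[WarningOn],
    usage: dict[WarningOn, float],
    threshold: float = 0.8,
) -> list[WarningOn]:
    """Return enabled categories whose usage fraction meets the threshold.

    `usage` maps each category to its current fraction of the configured
    limit, e.g. 45 of 50 iterations -> 0.9.
    """
    return [w for w in warn_on if usage.get(w, 0.0) >= threshold]
```

Only categories that are both listed in warn_on and at or above the threshold would trigger a warning; a category omitted from warn_on is ignored no matter how high its usage.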