Types API

pydantic_ai_summarization.types

Type definitions for summarization-pydantic-ai.

TokenCounter = Callable[[Sequence[ModelMessage]], int] | Callable[[Sequence[ModelMessage]], Awaitable[int]] module-attribute

Function type that counts tokens in a sequence of messages.

Supports both synchronous and asynchronous callables. When an async callable is provided, the middleware will await the result.

Example

```python
from collections.abc import Sequence

from pydantic_ai.messages import ModelMessage

from pydantic_ai_summarization import SummarizationProcessor

# Sync counter (simple, fast): rough heuristic of ~4 characters per token
def my_token_counter(messages: Sequence[ModelMessage]) -> int:
    return sum(len(str(msg)) for msg in messages) // 4

# Async counter (using pydantic-ai's model-based counting)
async def model_token_counter(messages: Sequence[ModelMessage]) -> int:
    from pydantic_ai import models
    model = models.infer_model("openai:gpt-4.1")
    usage = await model.count_tokens(list(messages), None, None)
    return usage.request_tokens or 0

processor = SummarizationProcessor(
    model="openai:gpt-4.1",
    token_counter=my_token_counter,
)
```

ContextSize = ContextFraction | ContextTokens | ContextMessages module-attribute

Union type for all context size specifications.

Can be:

- ("fraction", float) - fraction of max_input_tokens (requires max_input_tokens)
- ("tokens", int) - absolute token count
- ("messages", int) - message count

Examples:

```python
# Trigger at 80% of context window
trigger: ContextSize = ("fraction", 0.8)

# Trigger at 100k tokens
trigger: ContextSize = ("tokens", 100000)

# Trigger at 50 messages
trigger: ContextSize = ("messages", 50)
```

ContextFraction = tuple[Literal['fraction'], float] module-attribute

Context size specified as a fraction of max_input_tokens.

Example: ("fraction", 0.8) means 80% of max_input_tokens.

ContextTokens = tuple[Literal['tokens'], int] module-attribute

Context size specified as an absolute token count.

Example: ("tokens", 100000) means 100,000 tokens.

ContextMessages = tuple[Literal['messages'], int] module-attribute

Context size specified as a message count.

Example: ("messages", 50) means 50 messages.

WarningOn = Literal['iterations', 'context_window', 'total_tokens'] module-attribute

Warning categories supported by LimitWarnerProcessor.

Can be:

- "iterations" - warn as request count approaches the configured maximum
- "context_window" - warn as the current message history approaches the configured context budget
- "total_tokens" - warn as cumulative run token usage approaches the configured maximum

Example
```python
warn_on: list[WarningOn] = ["iterations", "context_window"]
```
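For illustration, a processor could compare each enabled category's usage fraction against a warning threshold. This is a sketch only; the `crossed` helper, its `usage` mapping, and the 0.8 default are assumptions, not the LimitWarnerProcessor API:

```python
from typing import Literal

WarningOn = Literal["iterations", "context_window", "total_tokens"]

def crossed(
    warn_on: list[WarningOn],
    usage: dict[WarningOn, float],
    threshold: float = 0.8,
) -> list[WarningOn]:
    """Return enabled categories whose usage fraction meets the threshold.

    `usage` maps each category to its current fraction of the configured
    limit, e.g. 45 of 50 iterations -> 0.9.
    """
    return [w for w in warn_on if usage.get(w, 0.0) >= threshold]
```

Only categories that are both listed in warn_on and at or above the threshold would trigger a warning; a category omitted from warn_on is ignored no matter how high its usage.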