Types API¶
pydantic_ai_summarization.types
¶
Type definitions for summarization-pydantic-ai.
TokenCounter = Callable[[Sequence[ModelMessage]], int] | Callable[[Sequence[ModelMessage]], Awaitable[int]]
module-attribute
¶
Function type that counts tokens in a sequence of messages.
Supports both synchronous and asynchronous callables. When an async
callable is provided, the middleware will await the result.
Example

```python
from collections.abc import Sequence

from pydantic_ai.messages import ModelMessage

from pydantic_ai_summarization import SummarizationProcessor

# Sync counter (simple, fast): rough estimate of ~4 characters per token
def my_token_counter(messages: Sequence[ModelMessage]) -> int:
    return sum(len(str(msg)) for msg in messages) // 4

# Async counter (using pydantic-ai's model-based counting)
async def model_token_counter(messages: Sequence[ModelMessage]) -> int:
    from pydantic_ai import models

    model = models.infer_model("openai:gpt-4.1")
    usage = await model.count_tokens(list(messages), None, None)
    return usage.request_tokens or 0

processor = SummarizationProcessor(
    model="openai:gpt-4.1",
    token_counter=my_token_counter,
)
```
ContextSize = ContextFraction | ContextTokens | ContextMessages
module-attribute
¶
Union type for all context size specifications.
Can be:
- ("fraction", float) - fraction of max_input_tokens (requires max_input_tokens)
- ("tokens", int) - absolute token count
- ("messages", int) - message count
Examples: ("fraction", 0.8), ("tokens", 100000), ("messages", 50)
ContextFraction = tuple[Literal['fraction'], float]
module-attribute
¶
Context size specified as a fraction of max_input_tokens.
Example: ("fraction", 0.8) means 80% of max_input_tokens.
ContextTokens = tuple[Literal['tokens'], int]
module-attribute
¶
Context size specified as an absolute token count.
Example: ("tokens", 100000) means 100,000 tokens.
ContextMessages = tuple[Literal['messages'], int]
module-attribute
¶
Context size specified as a message count.
Example: ("messages", 50) means 50 messages.
WarningOn = Literal['iterations', 'context_window', 'total_tokens']
module-attribute
¶
Warning categories supported by LimitWarnerProcessor.
Can be:
- "iterations" - warn as request count approaches the configured maximum
- "context_window" - warn as the current message history approaches the
configured context budget
- "total_tokens" - warn as cumulative run token usage approaches the configured maximum
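Because WarningOn is a Literal type, its members can be enumerated at runtime with typing.get_args, which is useful for validating user-supplied category names before constructing a processor. The helper below is illustrative only and not part of the library's API:

```python
from typing import Literal, get_args

# Mirrors pydantic_ai_summarization.types.WarningOn for a self-contained sketch.
WarningOn = Literal["iterations", "context_window", "total_tokens"]

def validate_warning_categories(categories: list[str]) -> list[str]:
    """Raise ValueError if any category is not a valid WarningOn member."""
    valid = set(get_args(WarningOn))
    unknown = [c for c in categories if c not in valid]
    if unknown:
        raise ValueError(f"Unknown warning categories: {unknown}")
    return categories
```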