Triggers¶
Triggers define when summarization should occur.
Trigger Types¶
Message-Based¶
Trigger when message count exceeds a threshold:
from pydantic_ai_summarization import SummarizationProcessor

processor = SummarizationProcessor(
    model="openai:gpt-4.1",
    trigger=("messages", 50),  # Trigger at 50+ messages
    ...
)
Token-Based¶
Trigger when token count exceeds a threshold:
processor = SummarizationProcessor(
    model="openai:gpt-4.1",
    trigger=("tokens", 100000),  # Trigger at 100k+ tokens
    ...
)
Fraction-Based¶
Trigger at a percentage of the model's context window:
processor = SummarizationProcessor(
    model="openai:gpt-4.1",
    trigger=("fraction", 0.8),  # Trigger at 80% capacity
    max_input_tokens=128000,  # Context window size the fraction is measured against
    ...
)
Required Parameter
Fraction-based triggers require max_input_tokens to be set.
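As a rough illustration of the arithmetic, a fraction trigger can be read as an absolute token threshold derived from max_input_tokens. The helper name fraction_to_tokens is hypothetical, and truncating to a whole number is an assumption; the library may compute this differently.

```python
def fraction_to_tokens(fraction: float, max_input_tokens: int) -> int:
    """Convert a fraction of the context window into a token count.

    Hypothetical helper for illustration; not part of the library API.
    """
    if max_input_tokens <= 0:
        raise ValueError("max_input_tokens must be greater than 0")
    # Assumption: the result is truncated to a whole number of tokens.
    return int(max_input_tokens * fraction)


# Values from the fraction-based example: 80% of a 128,000-token window.
threshold = fraction_to_tokens(0.8, 128000)
print(threshold)
```

Under these assumptions, the example trigger corresponds to a threshold of 102,400 tokens.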
Multiple Triggers¶
Combine multiple triggers with OR logic:
processor = SummarizationProcessor(
    model="openai:gpt-4.1",
    trigger=[
        ("messages", 50),  # OR
        ("tokens", 100000),  # OR
        ("fraction", 0.8),
    ],
    max_input_tokens=128000,
    ...
)
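The OR semantics can be sketched in plain Python as follows. This is not the library's implementation: the function name any_trigger_met is hypothetical, and treating each threshold as inclusive (>=) is an assumption.

```python
from typing import List, Optional, Tuple

Trigger = Tuple[str, float]  # ("messages" | "tokens" | "fraction", value)


def any_trigger_met(
    triggers: List[Trigger],
    message_count: int,
    token_count: int,
    max_input_tokens: Optional[int] = None,
) -> bool:
    """Return True as soon as one trigger condition holds (OR logic)."""
    for kind, value in triggers:
        if kind == "messages":
            met = message_count >= value  # assumption: inclusive threshold
        elif kind == "tokens":
            met = token_count >= value
        elif kind == "fraction":
            if max_input_tokens is None or max_input_tokens <= 0:
                raise ValueError("fraction triggers require max_input_tokens > 0")
            met = token_count >= int(max_input_tokens * value)
        else:
            raise ValueError(f"unknown trigger kind: {kind!r}")
        if met:
            return True
    return False


triggers = [("messages", 50), ("tokens", 100000), ("fraction", 0.8)]

# No threshold reached: 10 messages and 5,000 tokens.
short_chat = any_trigger_met(triggers, 10, 5000, max_input_tokens=128000)

# Only the message threshold is reached; that alone is enough.
long_chat = any_trigger_met(triggers, 60, 5000, max_input_tokens=128000)
```

Because the conditions are OR-ed, the most easily reached threshold decides when summarization starts.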
Keep Configuration¶
The keep parameter uses the same format as triggers:
# Keep last 20 messages
keep=("messages", 20)
# Keep last 10k tokens
keep=("tokens", 10000)
# Keep last 20% of context
keep=("fraction", 0.2)
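To make the token-based form concrete, here is a minimal sketch of what keeping the last N tokens could mean: walk the history from the newest message backwards and stop once the budget would be exceeded. The function keep_tail_by_tokens and the four-characters-per-token counter are assumptions for illustration, not the library's behavior.

```python
from typing import Callable, List


def keep_tail_by_tokens(
    messages: List[str],
    max_tokens: int,
    count_tokens: Callable[[str], int],
) -> List[str]:
    """Keep the newest messages whose combined size fits within max_tokens."""
    kept: List[str] = []
    total = 0
    for message in reversed(messages):
        cost = count_tokens(message)
        if total + cost > max_tokens:
            break  # adding this older message would exceed the budget
        kept.append(message)
        total += cost
    kept.reverse()  # restore chronological order
    return kept


def rough_count(text: str) -> int:
    # Crude stand-in for a tokenizer: roughly four characters per token.
    return len(text) // 4


history = ["a" * 40, "b" * 40, "c" * 40]  # about 10 tokens each
recent = keep_tail_by_tokens(history, max_tokens=25, count_tokens=rough_count)
```

With a 25-token budget, only the two newest messages fit in this sketch.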
max_input_tokens is also required for fraction-based keep
The max_input_tokens requirement is not limited to fraction triggers. Using
keep=("fraction", ...) also requires max_input_tokens — validation rejects any
fraction-based trigger or keep value when max_input_tokens is unset (and it must
be greater than 0).
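The rule described here can be sketched as a small check. The function name check_fraction_settings and the error message are hypothetical; the library's own validation may be structured differently.

```python
from typing import List, Optional, Tuple, Union

ContextSize = Tuple[str, float]


def check_fraction_settings(
    trigger: Union[ContextSize, List[ContextSize], None],
    keep: Optional[ContextSize],
    max_input_tokens: Optional[int],
) -> None:
    """Raise ValueError if a fraction value is used without max_input_tokens."""
    if trigger is None:
        candidates: List[ContextSize] = []
    elif isinstance(trigger, list):
        candidates = list(trigger)
    else:
        candidates = [trigger]
    if keep is not None:
        candidates.append(keep)

    uses_fraction = any(kind == "fraction" for kind, _ in candidates)
    if uses_fraction and (max_input_tokens is None or max_input_tokens <= 0):
        raise ValueError(
            "fraction-based trigger/keep requires max_input_tokens > 0"
        )


# A message-based trigger with a fraction-based keep, but no max_input_tokens.
try:
    check_fraction_settings(("messages", 50), ("fraction", 0.2), None)
    rejected = False
except ValueError:
    rejected = True
```

In this sketch, the same configuration passes once max_input_tokens is set to a positive value.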
Preserving the Head (Sliding Window)¶
SlidingWindowProcessor (and SlidingWindowCapability) support an optional keep_head
parameter alongside keep. While keep retains messages from the tail, keep_head retains
messages from the start of the conversation, which is useful for preserving a system
prompt or initial instructions that should always survive trimming:
from pydantic_ai_summarization import SlidingWindowProcessor

processor = SlidingWindowProcessor(
    trigger=("messages", 100),
    keep=("messages", 50),  # keep last 50 messages
    keep_head=("messages", 1),  # always preserve the first message (system prompt)
)
keep_head accepts the same ("messages", n), ("tokens", n), or ("fraction", f) forms.
A fraction-based keep_head likewise requires max_input_tokens.
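For the message-count case, the combined effect of keep and keep_head can be pictured as two slices of the history. The sketch below uses a hypothetical function trim_with_head on a plain list; how the library handles overlapping head and tail ranges is an assumption here.

```python
from typing import List


def trim_with_head(messages: List[str], keep_tail: int, keep_head: int = 0) -> List[str]:
    """Keep the first keep_head and the last keep_tail messages."""
    # Assumption: if head and tail together cover everything, nothing is dropped.
    if keep_head + keep_tail >= len(messages):
        return list(messages)
    head = messages[:keep_head]
    tail = messages[len(messages) - keep_tail:] if keep_tail > 0 else []
    return head + tail


history = [f"m{i}" for i in range(100)]  # m0 stands in for the system prompt
trimmed = trim_with_head(history, keep_tail=50, keep_head=1)
```

Here the first message survives together with the last 50, and the 49 messages in between are dropped.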
Common Configurations¶
Conservative (Long Conversations)¶
processor = SummarizationProcessor(
    model="openai:gpt-4.1",
    trigger=("tokens", 150000),
    keep=("messages", 30),
)
Aggressive (Short Context Models)¶
processor = SummarizationProcessor(
    model="openai:gpt-4.1",
    trigger=("tokens", 30000),
    keep=("messages", 10),
)