# Retrieval
memv uses hybrid search combining vector similarity and BM25 text search, merged with Reciprocal Rank Fusion.
## How It Works

```mermaid
flowchart LR
    Q[Query] --> EMB[Embed Query]
    Q --> TXT[Query Text]
    EMB --> VS[Vector Search<br>knowledge + episodes]
    TXT --> BM[BM25 Search<br>knowledge + episodes]
    VS & BM --> RRF[RRF Fusion<br>k=60]
    RRF --> R[RetrievalResult]
```
## Vector Search

The query is embedded using your `EmbeddingClient` and compared against stored embeddings via sqlite-vec. This captures semantic similarity: "Where does the user work?" matches "Employed at Anthropic" even though they share no keywords.
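A minimal sketch of what a sqlite-vec KNN query looks like. The schema here (`knowledge_vec`, a 4-dimension `embedding` column) is illustrative, not memv's actual table layout:

```python
import sqlite3

import sqlite_vec  # pip install sqlite-vec
from sqlite_vec import serialize_float32

db = sqlite3.connect(":memory:")
db.enable_load_extension(True)
sqlite_vec.load(db)
db.enable_load_extension(False)

# Hypothetical schema: one embedding per knowledge row.
db.execute("CREATE VIRTUAL TABLE knowledge_vec USING vec0(embedding float[4])")
db.execute(
    "INSERT INTO knowledge_vec(rowid, embedding) VALUES (1, ?)",
    [serialize_float32([0.1, 0.2, 0.3, 0.4])],
)

# KNN query: MATCH takes the serialized query vector, k bounds the
# result count, and rows come back ordered by distance (closest first).
rows = db.execute(
    """
    SELECT rowid, distance
    FROM knowledge_vec
    WHERE embedding MATCH ? AND k = 10
    ORDER BY distance
    """,
    [serialize_float32([0.1, 0.2, 0.3, 0.4])],
).fetchall()
```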
## BM25 Text Search
The query is also run through FTS5 full-text search. This catches exact keyword matches that vector search might miss — proper nouns, technical terms, specific numbers.
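A sketch of the FTS5 side; the `knowledge_fts` table is again illustrative:

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Hypothetical schema: an FTS5 index over the statement text.
db.execute("CREATE VIRTUAL TABLE knowledge_fts USING fts5(statement)")
db.execute("INSERT INTO knowledge_fts(statement) VALUES ('Employed at Anthropic')")

# FTS5's bm25() returns a rank where *lower* is better, so sort ascending.
rows = db.execute(
    """
    SELECT rowid, bm25(knowledge_fts) AS score
    FROM knowledge_fts
    WHERE knowledge_fts MATCH ?
    ORDER BY score
    LIMIT 10
    """,
    ["Anthropic"],
).fetchall()
```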
## Reciprocal Rank Fusion

Results from both searches are merged using RRF. Each item's fused score is the sum of its reciprocal ranks across the two result lists:

`score(d) = 1 / (k + rank_vec(d)) + 1 / (k + rank_bm25(d))`
With k=60, this balances both signals without either dominating. Items that rank well in both searches score highest.
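A self-contained sketch of the fusion step, mirroring the formula above (the real implementation fuses ranked IDs from both searches):

```python
def rrf_fuse(vector_ids: list[str], bm25_ids: list[str], k: int = 60) -> list[str]:
    """Merge two ranked ID lists with Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for ranking in (vector_ids, bm25_ids):
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# An item ranked well in both lists beats one that tops a single list:
print(rrf_fuse(["a", "b", "c"], ["c", "a", "b"]))  # ['a', 'c', 'b']
```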
## Tuning

### `vector_weight`

The `vector_weight` parameter (0.0 to 1.0) controls the balance:
```python
# Semantic-heavy (good for vague queries)
result = await memory.retrieve(query, user_id=uid, vector_weight=0.8)

# Keyword-heavy (good for specific terms)
result = await memory.retrieve(query, user_id=uid, vector_weight=0.2)

# Balanced (default)
result = await memory.retrieve(query, user_id=uid, vector_weight=0.5)
```
### `top_k`

Controls how many results to return per category (knowledge and episodes). For example, with an illustrative value:
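```python
# Up to 10 knowledge items and 10 episodes per query
result = await memory.retrieve(query, user_id=uid, top_k=10)
```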
### `include_episodes`
By default, retrieval searches both knowledge and episodes. Disable episode retrieval if you only want extracted facts:
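```python
# Search extracted knowledge only; skip raw episodes
result = await memory.retrieve(query, user_id=uid, include_episodes=False)
```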
## Output Formatting

`RetrievalResult` provides two formatting options:

### `to_prompt()`
Groups knowledge by source episode — designed for LLM context injection:
```markdown
## Relevant Context

### User's Work Background

_The user discussed their job at Anthropic during an initial introduction._

Key facts:

- The user works at Anthropic as a researcher
- Their focus area is AI safety, specifically interpretability

### Additional Facts

- The user prefers Python for data analysis
```
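A typical use is prepending the block to the model's context. The prompt scaffolding here is illustrative:

```python
result = await memory.retrieve(query, user_id=uid)

system_prompt = "You are a helpful assistant.\n\n" + result.to_prompt()
```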
### `as_text()`
Simple newline-separated statements:
```text
The user works at Anthropic as a researcher
Their focus area is AI safety, specifically interpretability
The user prefers Python for data analysis
```
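This form suits plain-text pipelines, e.g. logging or splitting into individual statements:

```python
result = await memory.retrieve(query, user_id=uid)

for statement in result.as_text().splitlines():
    print(statement)
```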
## Embedding Cache

An optional LRU cache with TTL reduces embedding API calls for repeated or similar queries.
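memv's cache configuration isn't shown in this section, so the sketch below only illustrates the mechanism: a standalone LRU-with-TTL cache that a caller checks before hitting the embedding API. All names here are hypothetical:

```python
import time
from collections import OrderedDict


class EmbeddingCache:
    """LRU cache with TTL for query embeddings (illustrative, not memv's API)."""

    def __init__(self, max_size: int = 1024, ttl_seconds: float = 3600.0):
        self._entries: OrderedDict[str, tuple[float, list[float]]] = OrderedDict()
        self._max_size = max_size
        self._ttl = ttl_seconds

    def get(self, query: str) -> list[float] | None:
        entry = self._entries.get(query)
        if entry is None:
            return None
        ts, embedding = entry
        if time.monotonic() - ts > self._ttl:
            del self._entries[query]  # expired
            return None
        self._entries.move_to_end(query)  # mark as recently used
        return embedding

    def put(self, query: str, embedding: list[float]) -> None:
        self._entries[query] = (time.monotonic(), embedding)
        self._entries.move_to_end(query)
        if len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # evict least recently used


# Usage: check the cache before calling the embedding API.
cache = EmbeddingCache(max_size=512, ttl_seconds=600.0)
if (embedding := cache.get("Where does the user work?")) is None:
    embedding = [0.1, 0.2, 0.3]  # stand-in for an EmbeddingClient call
    cache.put("Where does the user work?", embedding)
```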