File Uploads

Upload files for agent processing. The agent can analyze, search, and work with uploaded files using built-in file tools.

Overview

pydantic-deep supports two ways to upload files:

  1. run_with_files() - Helper function that uploads files and runs the agent in one call
  2. deps.upload_file() - Direct upload to dependencies for more control

Uploaded files are:

  • Stored in the backend (StateBackend, FilesystemBackend, etc.)
  • Visible to the agent in the system prompt
  • Accessible via file tools (read_file, grep, glob, execute)

Quick Start

Using run_with_files()

The simplest way to process files:

import asyncio
from pydantic_deep import create_deep_agent, DeepAgentDeps, run_with_files
from pydantic_deep.backends import StateBackend

async def main():
    agent = create_deep_agent()
    deps = DeepAgentDeps(backend=StateBackend())

    # Upload and process files in one call
    with open("data.csv", "rb") as f:
        result = await run_with_files(
            agent,
            "Analyze this data and find trends",
            deps,
            files=[("data.csv", f.read())],
        )

    print(result)

asyncio.run(main())

Using deps.upload_file()

For more control over the upload process:

import asyncio
from pydantic_deep import create_deep_agent, DeepAgentDeps
from pydantic_deep.backends import StateBackend

async def main():
    agent = create_deep_agent()
    deps = DeepAgentDeps(backend=StateBackend())

    # Upload files separately (content is always bytes)
    deps.upload_file("config.json", b'{"debug": true}')
    csv_bytes = b"date,value\n2024-01-15,42\n"
    deps.upload_file("data.csv", csv_bytes)

    # Run the agent - it sees the uploaded files in its system prompt
    result = await agent.run("Summarize the config and data", deps=deps)
    print(result)

asyncio.run(main())
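upload_file() returns the path where the file was stored (see the API Reference below), so a prompt can reference an upload explicitly. A short sketch, reusing the agent and deps from the example above:

# upload_file() returns the backend path of the stored file
path = deps.upload_file("notes.txt", b"hello world\n")
result = await agent.run(f"Summarize {path}", deps=deps)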

How It Works

File Storage

When you upload a file:

  1. Content is written to the backend at /uploads/<filename>
  2. Metadata is tracked in the deps.uploads dict
  3. The agent sees the file info in its dynamic system prompt

deps.upload_file("sales.csv", csv_bytes)

# File stored at: /uploads/sales.csv
# Metadata tracked:
print(deps.uploads)
# {'/uploads/sales.csv': {'name': 'sales.csv', 'path': '/uploads/sales.csv', 'size': 1024, 'line_count': 50}}

System Prompt

The agent sees uploaded files in its context:

## Uploaded Files

Files uploaded by the user:
- `/uploads/sales.csv` (1.0 KB, 50 lines)
- `/uploads/config.json` (128 B, 5 lines)

Use `read_file`, `grep`, `glob` or `execute` to work with these files.
For large files, use `offset` and `limit` in `read_file`.

Agent Tools

The agent can use these tools to work with uploaded files:

| Tool        | Usage                                                      |
|-------------|------------------------------------------------------------|
| `read_file` | Read file content (with `offset`/`limit` for large files)  |
| `grep`      | Search for patterns in files                               |
| `glob`      | Find files by pattern                                      |
| `execute`   | Run scripts that process files (with DockerSandbox)        |
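
The agent chooses these tools on its own; a prompt that describes a search task will typically trigger grep followed by read_file on the matches. A hedged sketch (the log content is illustrative):

deps.upload_file("app.log", b"INFO started\nERROR disk full\nINFO done\n")

# A search-oriented prompt usually leads the agent to grep first,
# then read_file around the matching lines.
result = await agent.run(
    "Find all ERROR lines in /uploads/app.log and explain them",
    deps=deps,
)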

Custom Upload Directory

By default, files are uploaded to /uploads/. You can customize this:

# run_with_files with custom directory
result = await run_with_files(
    agent,
    "Process configs",
    deps,
    files=[("app.json", config_bytes)],
    upload_dir="/configs",  # Files go to /configs/
)

# Direct upload with custom directory
deps.upload_file("db.json", data, upload_dir="/data")
# Stored at: /data/db.json

Multiple Files

Upload multiple files at once:

# q1_data, q2_data, q3_data hold CSV content as bytes
files = [
    ("sales_q1.csv", q1_data),
    ("sales_q2.csv", q2_data),
    ("sales_q3.csv", q3_data),
]

result = await run_with_files(
    agent,
    "Compare sales across all quarters",
    deps,
    files=files,
)
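
If the quarterly files live on disk, the files list can be built with the standard library. A minimal sketch, assuming a hypothetical reports/ directory:

from pathlib import Path

# Build (filename, bytes) tuples, the shape run_with_files expects
files = [
    (p.name, p.read_bytes())
    for p in sorted(Path("reports").glob("sales_q*.csv"))
]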

Binary Files

Binary files (images, PDFs, etc.) can be uploaded, but support for them is limited:

# Binary file upload
deps.upload_file("image.png", png_bytes)

# line_count will be None for binary files
print(deps.uploads["/uploads/image.png"]["line_count"])  # None

Note

Binary files are stored but text-based analysis is limited. For full binary processing, consider using DockerSandbox with appropriate tools.
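
Because line_count is None only for binary uploads, the tracked metadata can be used to separate text from binary files before prompting. A minimal sketch over deps.uploads:

# Partition uploads by the line_count field of their metadata
text_paths = [m["path"] for m in deps.uploads.values() if m["line_count"] is not None]
binary_paths = [m["path"] for m in deps.uploads.values() if m["line_count"] is None]
print(f"text: {text_paths}\nbinary: {binary_paths}")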

Large Files

For large files, the agent should use pagination:

deps.upload_file("large_log.txt", log_bytes)  # 100,000 lines

# Agent will see:
# - `/uploads/large_log.txt` (5.2 MB, 100000 lines)

# Agent can then:
# 1. read_file("/uploads/large_log.txt", limit=100)  # First 100 lines
# 2. read_file("/uploads/large_log.txt", offset=100, limit=100)  # Next 100
# 3. grep("ERROR", "/uploads/large_log.txt")  # Search for patterns

Subagent Access

Uploaded files are shared with subagents:

deps.upload_file("data.csv", csv_bytes)

# Main agent can delegate to subagent
# Subagent will have access to /uploads/data.csv
result = await agent.run(
    "Delegate data analysis to the data-analyst subagent",
    deps=deps,
)

Complete Example

"""Full file uploads workflow example."""

import asyncio
from pydantic import BaseModel
from pydantic_deep import (
    create_deep_agent,
    DeepAgentDeps,
    StateBackend,
    run_with_files,
)


class AnalysisResult(BaseModel):
    """Structured analysis result."""
    summary: str
    total_records: int
    insights: list[str]


async def main():
    # Create agent with structured output
    agent = create_deep_agent(
        model="openai:gpt-4.1",
        output_type=AnalysisResult,
        instructions="""
        You are a data analyst. When analyzing files:
        1. Read the file to understand structure
        2. Perform analysis
        3. Return structured insights
        """,
    )

    deps = DeepAgentDeps(backend=StateBackend())

    # Sample data
    sales_data = b"""date,product,quantity,revenue
2024-01-15,Widget A,50,2500
2024-01-16,Widget B,30,1800
2024-01-17,Widget A,75,3750
2024-01-18,Widget C,20,1600
2024-01-19,Widget B,45,2700
"""

    # Run with file upload
    result = await run_with_files(
        agent,
        "Analyze the sales data: identify top product, total revenue, and trends",
        deps,
        files=[("sales.csv", sales_data)],
    )

    # Type-safe access to structured result
    print(f"Summary: {result.summary}")
    print(f"Total records: {result.total_records}")
    print("Insights:")
    for insight in result.insights:
        print(f"  - {insight}")


if __name__ == "__main__":
    asyncio.run(main())

API Reference

run_with_files()

async def run_with_files(
    agent: Agent[DeepAgentDeps, OutputT],
    query: str,
    deps: DeepAgentDeps,
    files: list[tuple[str, bytes]] | None = None,
    *,
    upload_dir: str = "/uploads",
) -> OutputT:
    """Run agent with file uploads.

    Args:
        agent: The agent to run.
        query: The user query/prompt.
        deps: Agent dependencies.
        files: List of (filename, content) tuples to upload.
        upload_dir: Directory to store uploads.

    Returns:
        Agent output (type depends on agent's output_type).
    """

deps.upload_file()

def upload_file(
    self,
    name: str,
    content: bytes,
    *,
    upload_dir: str = "/uploads",
) -> str:
    """Upload a file to the backend and track it.

    Args:
        name: Original filename (e.g., "sales.csv").
        content: File content as bytes.
        upload_dir: Directory to store uploads.

    Returns:
        The path where the file was stored.
    """

UploadedFile

class UploadedFile(TypedDict):
    """Metadata for an uploaded file."""
    name: str        # Original filename
    path: str        # Path in backend (e.g., /uploads/sales.csv)
    size: int        # Size in bytes
    line_count: int | None  # Number of lines (None for binary)
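
Because UploadedFile is a TypedDict, entries in deps.uploads can be read with plain key access. A brief sketch:

meta = deps.uploads["/uploads/sales.csv"]
print(f"{meta['name']}: {meta['size']} bytes, {meta['line_count']} lines")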

Next Steps