Tools & Actions

Tools are Python functions that agents can execute. Agent SDK makes it easy to turn any function into a tool using decorators.

Creating Tools

Basic Tool

Use @tool_message to define a status message that will be shown when the tool runs.

from agent_sdk import tool_message

@tool_message("Calculating {a} + {b}...")
def add(a: int, b: int) -> int:
    """
    Adds two numbers.

    The docstring is very important: the LLM reads it to understand how to use the tool.
    """
    return a + b
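
To make the placeholder behavior concrete, here is a minimal sketch of how a decorator of this kind could fill `{a}` and `{b}` from the call arguments. It is not the SDK's implementation: `tool_message_sketch` and the `status_template` attribute are hypothetical names, and printing the message is an assumption about how the status is displayed.

```python
import functools
import inspect


def tool_message_sketch(template: str):
    """Hypothetical stand-in for a status-message decorator (not the SDK's code)."""
    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)  # keeps the name and docstring the LLM relies on
        def wrapper(*args, **kwargs):
            # Map positional and keyword arguments onto parameter names.
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            # Fill the {placeholders} and show the status before running the tool.
            print(template.format(**bound.arguments))
            return func(*args, **kwargs)

        # Assumption: the template is kept on the function for later inspection.
        wrapper.status_template = template
        return wrapper
    return decorator


@tool_message_sketch("Calculating {a} + {b}...")
def add(a: int, b: int) -> int:
    """Adds two numbers."""
    return a + b
```

In this sketch, calling `add(2, 3)` prints `Calculating 2 + 3...` and then returns `5`. Using `functools.wraps` matters here because the wrapped function keeps its original docstring.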

Approval Required

Use @approval_required to require human approval before the tool runs. The HumanInTheLoop middleware must be active in the Runner for this to take effect.

from agent_sdk import tool_message, approval_required

@approval_required
@tool_message("DELETING file: {path}")
def delete_file(path: str) -> str:
    """Deletes a file permanently."""
    # ... logic ...
    return "File deleted"
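
The approval flow can be pictured with a small sketch. This is an illustration under stated assumptions, not how the SDK or its HumanInTheLoop middleware is implemented: `approval_required_sketch`, `run_tool`, the `requires_approval` attribute, and the `approve` callback are all hypothetical.

```python
def approval_required_sketch(func):
    """Hypothetical marker decorator: flags a tool as needing human approval."""
    func.requires_approval = True
    return func


def run_tool(func, approve, **kwargs):
    """Hypothetical runner step: consult `approve` before running flagged tools.

    `approve` receives the tool name and its arguments and returns True or False.
    """
    if getattr(func, "requires_approval", False):
        if not approve(func.__name__, kwargs):
            return "Denied by reviewer"
    return func(**kwargs)


@approval_required_sketch
def delete_file(path: str) -> str:
    """Deletes a file permanently (stubbed here: nothing is actually deleted)."""
    return f"File deleted: {path}"
```

With this sketch, `run_tool(delete_file, lambda name, args: False, path="notes.txt")` returns `"Denied by reviewer"` without calling the tool, while an approver that returns `True` lets the call through.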

Standard Tools

The SDK comes with several built-in tools in agent_sdk.tools:

  • web_search(query): Searches the web (DuckDuckGo).
  • wikipedia_search(query): Searches Wikipedia.
  • read_file(path): Reads a file securely.
  • list_directory(path): Lists folder contents.
  • run_python_code(code): Executes code in a sandbox.

Advanced RLM & Data Analysis Tools (v1.0.0)

For interacting with complex data, large documents, and multimodal inputs, the SDK provides the following advanced tools:

  • recursive_document_analysis(file_path, question): An alternative to RAG for very large documents. It uses a "Rolling State" (Map-Reduce) algorithm that reads the document chunk by chunk and carries a running state forward, so the context window never has to hold the whole file while the agent still follows the overall plot or structure.
  • search_inside_document(file_path, query): Semantically searches inside a large document. It creates a temporary, on-the-fly ChromaDB in memory, embeds the document, and returns only the top 3 most relevant paragraphs to the agent.
  • analyze_media(file_path, question): Gives the agent the ability to "see". It passes an image or video to a Vision model (like Gemini 1.5 Pro) and returns the textual analysis to the agent.
  • save_document_to_memory(file_path): Allows the agent to autonomously take a document it just read or created and index it permanently into the ChromaRAG database for future recall.
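
For example, to give an agent two of these tools under custom names:

from agent_sdk import Agent  # assumes Agent is exported at the package top level
from agent_sdk.tools import recursive_document_analysis, analyze_media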

agent = Agent(
    name="Researcher",
    model="gpt-4o",
    tools={
        "read_long_book": recursive_document_analysis,
        "watch_video": analyze_media
    }
)
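
The "Rolling State" idea described for recursive_document_analysis can be sketched as a loop that processes one chunk at a time and carries a small state forward. This is a minimal illustration, not the SDK's algorithm: the names `iter_chunks`, `rolling_state_analysis`, and `count_mentions` are hypothetical, fixed-size character chunks are an assumption, and the toy updater stands in for what would presumably be an LLM call.

```python
from typing import Callable, Iterator


def iter_chunks(text: str, chunk_size: int) -> Iterator[str]:
    """Yield consecutive fixed-size character chunks of the text."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def rolling_state_analysis(
    text: str,
    question: str,
    update_state: Callable[[str, str, str], str],
    chunk_size: int = 2000,
) -> str:
    """Fold the document into one running state, one chunk at a time.

    Only the current chunk and the compact state are held at any step,
    so the working context stays bounded regardless of document length.
    """
    state = ""
    for chunk in iter_chunks(text, chunk_size):
        state = update_state(state, chunk, question)
    return state


def count_mentions(state: str, chunk: str, question: str) -> str:
    """Toy updater: keeps a running count of how often `question` appears.

    A real updater would ask a model to revise its notes given the new chunk.
    Note that naive chunking can split a match across a chunk boundary.
    """
    return str(int(state or 0) + chunk.count(question))
```

Swapping `count_mentions` for a function that prompts a model with the previous state, the new chunk, and the question gives the general shape of the technique.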

Sandboxing

For executing code generated by the agent, use the run_python_code tool. It automatically selects the best available sandbox:

  1. DockerSandbox: (Preferred) Runs code in a Docker container for maximum isolation.
  2. LocalSandbox: Runs code in a separate process with restricted globals.

To enable the sandbox tool:

from agent_sdk import Agent  # assumes Agent is exported at the package top level
from agent_sdk.tools import run_python_code

agent = Agent(
    name="Coder",
    model="gpt-4o",
    tools={"run_python_code": run_python_code}
)
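
The sandbox fallback order described above could look roughly like the following. This is a sketch under assumptions, not the SDK's selection logic: `choose_sandbox` and `run_in_subprocess` are hypothetical helpers, and checking for the `docker` executable on PATH is only one possible availability test.

```python
import shutil
import subprocess
import sys


def choose_sandbox(which=shutil.which) -> str:
    """Return "docker" if a docker executable is on PATH, otherwise "local".

    Finding the CLI does not prove the Docker daemon is running; a fuller
    check would also try to contact the daemon.
    """
    return "docker" if which("docker") else "local"


def run_in_subprocess(code: str, timeout: float = 5.0) -> str:
    """Run code in a separate Python process and return its stdout.

    This only illustrates process separation with a timeout. It does not
    restrict globals, files, or network access, so it is not real isolation.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout
```

Passing `which` as a parameter makes the selection easy to exercise without Docker installed, for example `choose_sandbox(lambda name: None)` returns `"local"`.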