Claw field notebook
last updated 2026-05-15 edit on GitHub colophon
OpenAI / Agents SDK / ASDK.2 · 3 min read

Core concepts — agents, handoffs, tools, guardrails

The nine primitives the OpenAI Agents SDK ships — agents, sandbox agents, handoffs, agents-as-tools, tools, guardrails, human-in-the-loop, sessions, tracing — explained with the minimum code to feel each one.

The ten primitives at a glance#

The README counts nine because Handoffs and Agents-as-tools are listed as one combined item there; below they’re split into two rows since they’re used as distinct patterns. Total: ten:

#PrimitiveOne-liner
1AgentAn LLM with instructions + (optional) tools, guardrails, and handoffs.
2Sandbox AgentAn agent that has a container/workspace it can read and edit. New in v0.14.0.
3HandoffSpecialist agent takes over the rest of the turn.
4Agents as toolsManager keeps control; specialists are called as tools.
5ToolsFunctions, MCP servers, or hosted tools the agent can call.
6GuardrailsInput + output validation hooks.
7Human in the loopPause for human approval mid-run.
8SessionsAutomatic conversation memory.
9TracingBuilt-in spans for every run, tool call, handoff.
10Realtime AgentsVoice agents using gpt-realtime-2 with full agent features.

The rest of this page walks through each with the minimum code to feel it. (Realtime Agents is voice-specific and covered separately in the realtime quickstart — this page focuses on the text-shaped primitives.)

1. Agent#

An Agent is the building block. Three required-ish fields: name, instructions, and optionally a model.

from agents import Agent

agent = Agent(
    name="History Tutor",
    instructions="You answer history questions clearly and concisely.",
)

Run it via the Runner:

from agents import Runner

result = await Runner.run(agent, "When did the Roman Empire fall?")
print(result.final_output)

result.final_output is the final text the agent produced. result.to_input_list() is the conversation as a list you can feed back in for another turn.

2. Sandbox Agent#

A SandboxAgent is an Agent that has access to a configurable container/workspace — a filesystem the agent can read, write, and run commands against. Useful for long-horizon tasks (multi-file edits, file inspection, applying patches).

Sandbox Agents were added in v0.14.0. You declare a Manifest (what’s in the workspace at start) and a SandboxRunConfig (which sandbox client to use — local Unix, Docker, remote).

from agents import Runner
from agents.run import RunConfig
from agents.sandbox import Manifest, SandboxAgent, SandboxRunConfig
from agents.sandbox.entries import GitRepo
from agents.sandbox.sandboxes import UnixLocalSandboxClient

agent = SandboxAgent(
    name="Workspace Assistant",
    instructions="Inspect the sandbox workspace before answering.",
    default_manifest=Manifest(
        entries={"repo": GitRepo(repo="openai/openai-agents-python", ref="main")},
    ),
)

result = Runner.run_sync(
    agent,
    "Inspect the repo README and summarise what this project does.",
    run_config=RunConfig(sandbox=SandboxRunConfig(client=UnixLocalSandboxClient())),
)
print(result.final_output)

This is the SDK’s answer to “I need an agent that touches files.” Different shape from a plain Agent — different Runner config, different tools available.

3. Handoff#

A handoff lets a specialist take over the rest of the turn. The triage agent decides who’s best for the question; that agent inherits the conversation and produces the final answer.

math_tutor = Agent(name="Math Tutor", instructions="...")
history_tutor = Agent(name="History Tutor", instructions="...")

triage = Agent(
    name="Triage",
    instructions="Route the user to the right tutor.",
    handoffs=[math_tutor, history_tutor],
)

result = await Runner.run(triage, "Solve x^2 + 5x + 6 = 0")

The triage agent runs, decides math, hands off, and the math tutor produces result.final_output. The trace shows both steps.

Use handoffs when: the specialist should own the conversation from that point. The triage doesn’t need to weigh the specialist’s output.

4. Agents as tools#

A manager-pattern alternative — the orchestrator stays in control and calls specialists as tools. The orchestrator gets each specialist’s output back and decides what to do with it.

from agents import Agent

researcher = Agent(name="Researcher", instructions="Find facts.")
writer = Agent(name="Writer", instructions="Write a paragraph.")

orchestrator = Agent(
    name="Manager",
    instructions="Research the topic, then ask the writer to draft.",
    tools=[
        researcher.as_tool(tool_name="research", tool_description="Find facts."),
        writer.as_tool(tool_name="write", tool_description="Draft a paragraph."),
    ],
)

Use agents-as-tools when: the orchestrator should weigh, combine, or revise the specialists’ outputs.

Handoffs vs agents-as-tools is the most-asked question on the SDK. Rule of thumb — if the specialist owns the answer, handoff; if the manager owns the answer, tool.

5. Tools#

Three flavours:

  • Function tools — your own Python/TS functions, exposed via @function_tool:

    from agents import function_tool
    
    @function_tool
    def get_weather(city: str) -> str:
        """Get current weather for a city."""
        return f"{city}: 18°C, partly cloudy"
  • MCP tools — any Model Context Protocol server slots in directly. Filesystem, git, Slack, database — same wiring as Claude or other MCP-aware clients.

  • OpenAI-hosted tools — web search, file search, code interpreter, image generation. These run on OpenAI’s side and don’t need your infrastructure. (Note: ComputerTool and ApplyPatchTool are local-runtime tools — they’re in the SDK, but they run in your environment with a Computer / AsyncComputer implementation you provide. See the tools docs for the full split.)

Function tools are typed (Python type hints become the JSON schema the model sees). The docstring becomes the description.

6. Guardrails#

Guardrails are pre- and post-validation hooks. Reject input that’s out of scope, sanitise output, or fail the run if something dangerous appears.

from agents import GuardrailFunctionOutput, input_guardrail, RunContextWrapper

@input_guardrail
async def topic_check(ctx: RunContextWrapper, agent: Agent, user_input: str):
    if "homework" not in user_input.lower():
        return GuardrailFunctionOutput(
            output_info="Off-topic input rejected",
            tripwire_triggered=True,
        )
    return GuardrailFunctionOutput(output_info=None, tripwire_triggered=False)

When tripwire_triggered=True, the SDK raises an exception that ends the run cleanly.

7. Human in the loop#

The SDK ships built-in support for pausing a run, asking a human, and resuming. The full pattern is in the docs; the shape:

result = await Runner.run(agent, "Delete the temp files.")

if result.interruptions:
    state = result.to_state()                # capture resumable snapshot
    for interruption in result.interruptions:
        # Surface the question to a human; on their answer:
        state.approve(interruption)          # or state.reject(interruption)
    # Resume by passing the state back in (not a new string)
    result = await Runner.run(agent, state)

The key idea: a run that needs approval returns with result.interruptions populated; you turn that into a state snapshot, mutate it (approve / reject), and feed it back to Runner.run.

8. Sessions#

Sessions persist conversation history across Runner.run calls so you don’t pass it around manually.

from agents import SQLiteSession

session = SQLiteSession("user-123-conversation")
result = await Runner.run(agent, "Hello", session=session)
# ... later, same session ...
result2 = await Runner.run(agent, "What did I just say?", session=session)

Other backends: Session (in-memory), Redis (optional install: pip install 'openai-agents[redis]'), SQLAlchemy. Or implement your own backend.

9. Tracing#

Every Runner.run call generates a trace — a tree of spans showing the agent run, every model call, every tool call, every handoff, every guardrail check. Traces export to the OpenAI dashboard by default; you can also point them at Langfuse, Logfire, or your own OpenTelemetry collector.

You don’t enable tracing — it’s on by default. You disable it with RunConfig(tracing_disabled=True) if you need to.

What to do next#

Sources