Most build an AI agent tutorials hand you a wall of framework boilerplate before the agent does anything. The OpenAI Agents SDK goes the other way. It gives you four small pieces — an agent, a runner, tools, and handoffs — and hides the loop that ties them together. You write mostly plain Python.
This tutorial builds up in one file at a time. You start with a single agent, give it a tool, force it to return validated data, route work between specialists with handoffs, and finally add memory. Every snippet runs as-is on the current SDK. By the end you will have a working multi-agent support bot and a clear sense of when this SDK is the right call over heavier options.
- Comfortable with Python functions and basic classes
- New to
async/await? Read the async & await primer first - An OpenAI API key and a virtual env — see the API key setup primer and the venv primer
- The OpenAI Agents SDK is small on purpose:
Agent,Runner, tools, and handoffs cover most real apps. - Tools are just decorated Python functions.
@function_toolturns a function into a tool, and its docstring becomes the spec the model reads. output_typegives you validated data, not a string you have to parse — a Pydantic model in, a typed object out.- Handoffs route a turn to a specialist; sessions give the agent memory across runs. Together they make a real support bot.
What the OpenAI Agents SDK is (and when to use it)#
The OpenAI Agents SDK is a lightweight Python framework for building agents — the production successor to OpenAI’s earlier Swarm experiment. An agent here is a model plus instructions, tools, and optional handoffs. The SDK runs the agentic loop for you, so you focus on the pieces, not the plumbing.
It competes with LangGraph, CrewAI, Google’s ADK, and Pydantic AI. Its pitch is fewer abstractions. There is no graph to wire and no chain to assemble. Reach for it when you want the shortest path from idea to a running agent. For a wider view of the field, see the best AI agent frameworks in 2026 and LangGraph vs CrewAI vs AutoGen. New to agents entirely? Start with what AI agents are.
Setup: install and set your key#
Create a virtual environment, install the package, and export your key. The package name is openai-agents; the import name is agents.
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install openai-agents
export OPENAI_API_KEY=sk-... # Windows PowerShell: $env:OPENAI_API_KEY="sk-..."The SDK reads OPENAI_API_KEY from the environment. If you skip this step, the first run fails with an auth error — not a code bug. Keep the key in a .env file or your shell profile, never in the source.
Your first agent#
An agent needs a name and instructions. The Runner executes it and returns a result whose final_output holds the answer.
import asyncio
from agents import Agent, Runner
agent = Agent(
name="Assistant",
instructions="You are a concise, helpful assistant.",
)
async def main():
result = await Runner.run(agent, "Explain what an AI agent is in two sentences.")
print(result.final_output)
if __name__ == "__main__":
asyncio.run(main())That is a complete program. The SDK handles the model call and the loop; you handle the intent.
Working in a script or notebook and don’t want an event loop? Use the synchronous wrapper: result = Runner.run_sync(agent, "your question"). Same behavior, no asyncio.
Give it a tool#
Agents get useful when they can act. The @function_tool decorator turns any Python function into a tool. The SDK reads the function’s type hints and docstring to build the schema the model sees.
import asyncio
from agents import Agent, Runner, function_tool
@function_tool
def get_weather(city: str) -> str:
"""Return the current weather for a city.
Args:
city: The city name, for example "Paris".
"""
# In real code, call a weather API here.
return f"It's 22°C and sunny in {city}."
agent = Agent(
name="Weather Assistant",
instructions="Answer weather questions. Use get_weather for live conditions.",
tools=[get_weather],
)
async def main():
result = await Runner.run(agent, "What's the weather in Paris right now?")
print(result.final_output)
if __name__ == "__main__":
asyncio.run(main())The model decides when to call the tool, runs it, reads the result, and writes the reply. You never call the tool yourself — you just make it available and describe it well.
The docstring is not documentation here — it is the tool’s specification. The model uses it to decide whether and how to call the tool. A vague docstring produces wrong or skipped calls, so treat that first line as a prompt.
Structured output you can trust#
A string reply is fine for chat. It is a problem when the next step is code. Set output_type to a Pydantic model and the agent returns a validated object instead of prose.
import asyncio
from pydantic import BaseModel
from agents import Agent, Runner
class Ticket(BaseModel):
category: str
priority: str
summary: str
triage = Agent(
name="Ticket Classifier",
instructions="Classify the support message into a structured ticket.",
output_type=Ticket,
)
async def main():
result = await Runner.run(triage, "My invoice is wrong and I was charged twice!")
ticket = result.final_output # a validated Ticket instance
print(ticket.category, "|", ticket.priority)
if __name__ == "__main__":
asyncio.run(main())Now result.final_output is a Ticket, not text. Structured output is the single biggest reliability upgrade when an agent feeds a real system. New to Pydantic? The BaseModel primer covers what you need, and Pydantic AI takes the type-safe idea further.
Multiple agents: handoffs#
One agent rarely does everything well. Handoffs let a router hand the conversation to a specialist. You list the specialists in handoffs=[...], and handoff_description tells the router when to pick each one.
import asyncio
from agents import Agent, Runner
billing_agent = Agent(
name="Billing Agent",
handoff_description="Handles billing, invoices, and refunds.",
instructions="Resolve billing and refund questions clearly.",
)
tech_agent = Agent(
name="Tech Support",
handoff_description="Handles bugs, errors, and how-to questions.",
instructions="Help with technical problems, step by step.",
)
triage_agent = Agent(
name="Triage",
instructions="Route each customer message to the right specialist.",
handoffs=[billing_agent, tech_agent],
)
async def main():
result = await Runner.run(triage_agent, "I was double-charged for my subscription.")
print(result.final_output)
print("Handled by:", result.last_agent.name)
if __name__ == "__main__":
asyncio.run(main())The SDK exposes each handoff to the model as a transfer_to_<agent> tool, so the router “calls” a specialist the same way it calls any tool. A handoff transfers control; the specialist owns the answer.
Handoffs are not the only pattern. If you want the router to stay in charge and treat specialists as helpers, use agents as tools instead: orchestrator = Agent(..., tools=[billing_agent.as_tool(...), tech_agent.as_tool(...)]). Handoff = transfer control; as_tool = keep control.
Add memory with sessions#
By default each run is stateless. Wrap it in a session and the agent remembers earlier turns. SQLiteSession persists the conversation to a file, so memory survives restarts.
import asyncio
from agents import Agent, Runner, SQLiteSession
agent = Agent(name="Assistant", instructions="Reply concisely.")
session = SQLiteSession("customer_42", "conversations.db")
async def main():
await Runner.run(agent, "My name is Priya.", session=session)
result = await Runner.run(agent, "What's my name?", session=session)
print(result.final_output) # remembers "Priya" across runs
if __name__ == "__main__":
asyncio.run(main())Pass the same session object into each Runner.run, and the SDK loads and saves history for you. Sessions turn a stateless call into a real conversation with almost no extra code.
Common mistakes#
These are the traps that cost beginners the most time.
- Forgetting the API key. The first run dies on auth, not logic. Set
OPENAI_API_KEYbefore anything else. - Writing vague tool docstrings. The model routes on the docstring. “Gets data” earns wrong calls; be specific about what the tool does and its arguments.
- Fighting async.
Runner.runis a coroutine —awaitit, or useRunner.run_sync. Calling it without either returns a coroutine object, not a result. - Confusing handoffs with tools. A handoff transfers the turn;
as_toolkeeps the router in control. Pick based on who should own the final answer. - Ignoring
result.last_agent. In a multi-agent run, that field tells you which specialist actually replied — essential for logging and evaluation.
Summary#
You went from an empty file to a multi-agent support bot with memory. The OpenAI Agents SDK earned its keep by staying out of the way: an Agent holds the config, the Runner drives the loop, @function_tool adds abilities, output_type makes replies trustworthy, and handoffs plus sessions turn it into something you would actually ship. The heavy frameworks add power when you need explicit control — but for most agents, this much SDK is enough.
- New to agents? Start with What Are AI Agents? A Complete 2026 Guide.
- Comparing tools? Read the best AI agent frameworks in 2026 and the LangGraph tutorial to feel the difference.
- Going to production? Add evals and connect real tools with an MCP server.
Built something with this? Tell me which pattern you reached for first — a single tool-using agent, or the triage-and-handoff setup — and I’ll point you at the sharp edges to watch.

