InfoWok
Intermediate

OpenAI Agents SDK Tutorial: Build an Agent in Python (2026)

A hands-on 2026 tutorial for the OpenAI Agents SDK in Python. You build up from a one-file agent to tools, validated structured output, multi-agent handoffs, and persistent memory with sessions — every snippet runnable, plus the mistakes that trip people up first.

SK
Sukhveer Kaur
Published July 4, 2026
6 min read
Title card reading 'OpenAI Agents SDK Tutorial' — a 2026 Python guide to building an AI agent with tools, handoffs and memory using the OpenAI Agents SDKAI Engineering
AGENTS · GETTING STARTED
On this page +

Most build an AI agent tutorials hand you a wall of framework boilerplate before the agent does anything. The OpenAI Agents SDK goes the other way. It gives you four small pieces — an agent, a runner, tools, and handoffs — and hides the loop that ties them together. You write mostly plain Python.

This tutorial builds up in one file at a time. You start with a single agent, give it a tool, force it to return validated data, route work between specialists with handoffs, and finally add memory. Every snippet runs as-is on the current SDK. By the end you will have a working multi-agent support bot and a clear sense of when this SDK is the right call over heavier options.

🟡 Intermediate⏱️ 20 minStack: Python 3.10+, openai-agents
Before you start
🎯 Key takeaways
  • The OpenAI Agents SDK is small on purpose: Agent, Runner, tools, and handoffs cover most real apps.
  • Tools are just decorated Python functions. @function_tool turns a function into a tool, and its docstring becomes the spec the model reads.
  • output_type gives you validated data, not a string you have to parse — a Pydantic model in, a typed object out.
  • Handoffs route a turn to a specialist; sessions give the agent memory across runs. Together they make a real support bot.

What the OpenAI Agents SDK is (and when to use it)#

The OpenAI Agents SDK is a lightweight Python framework for building agents — the production successor to OpenAI’s earlier Swarm experiment. An agent here is a model plus instructions, tools, and optional handoffs. The SDK runs the agentic loop for you, so you focus on the pieces, not the plumbing.

It competes with LangGraph, CrewAI, Google’s ADK, and Pydantic AI. Its pitch is fewer abstractions. There is no graph to wire and no chain to assemble. Reach for it when you want the shortest path from idea to a running agent. For a wider view of the field, see the best AI agent frameworks in 2026 and LangGraph vs CrewAI vs AutoGen. New to agents entirely? Start with what AI agents are.

Setup: install and set your key#

Create a virtual environment, install the package, and export your key. The package name is openai-agents; the import name is agents.

bash
python -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\activate
pip install openai-agents
export OPENAI_API_KEY=sk-...      # Windows PowerShell: $env:OPENAI_API_KEY="sk-..."
⚠️ Warning

The SDK reads OPENAI_API_KEY from the environment. If you skip this step, the first run fails with an auth error — not a code bug. Keep the key in a .env file or your shell profile, never in the source.

Your first agent#

An agent needs a name and instructions. The Runner executes it and returns a result whose final_output holds the answer.

python
import asyncio
from agents import Agent, Runner
 
agent = Agent(
    name="Assistant",
    instructions="You are a concise, helpful assistant.",
)
 
async def main():
    result = await Runner.run(agent, "Explain what an AI agent is in two sentences.")
    print(result.final_output)
 
if __name__ == "__main__":
    asyncio.run(main())

That is a complete program. The SDK handles the model call and the loop; you handle the intent.

💡 Tip

Working in a script or notebook and don’t want an event loop? Use the synchronous wrapper: result = Runner.run_sync(agent, "your question"). Same behavior, no asyncio.

Give it a tool#

Agents get useful when they can act. The @function_tool decorator turns any Python function into a tool. The SDK reads the function’s type hints and docstring to build the schema the model sees.

python
import asyncio
from agents import Agent, Runner, function_tool
 
@function_tool
def get_weather(city: str) -> str:
    """Return the current weather for a city.
 
    Args:
        city: The city name, for example "Paris".
    """
    # In real code, call a weather API here.
    return f"It's 22°C and sunny in {city}."
 
agent = Agent(
    name="Weather Assistant",
    instructions="Answer weather questions. Use get_weather for live conditions.",
    tools=[get_weather],
)
 
async def main():
    result = await Runner.run(agent, "What's the weather in Paris right now?")
    print(result.final_output)
 
if __name__ == "__main__":
    asyncio.run(main())

The model decides when to call the tool, runs it, reads the result, and writes the reply. You never call the tool yourself — you just make it available and describe it well.

🔑 Key point

The docstring is not documentation here — it is the tool’s specification. The model uses it to decide whether and how to call the tool. A vague docstring produces wrong or skipped calls, so treat that first line as a prompt.

Structured output you can trust#

A string reply is fine for chat. It is a problem when the next step is code. Set output_type to a Pydantic model and the agent returns a validated object instead of prose.

python
import asyncio
from pydantic import BaseModel
from agents import Agent, Runner
 
class Ticket(BaseModel):
    category: str
    priority: str
    summary: str
 
triage = Agent(
    name="Ticket Classifier",
    instructions="Classify the support message into a structured ticket.",
    output_type=Ticket,
)
 
async def main():
    result = await Runner.run(triage, "My invoice is wrong and I was charged twice!")
    ticket = result.final_output          # a validated Ticket instance
    print(ticket.category, "|", ticket.priority)
 
if __name__ == "__main__":
    asyncio.run(main())

Now result.final_output is a Ticket, not text. Structured output is the single biggest reliability upgrade when an agent feeds a real system. New to Pydantic? The BaseModel primer covers what you need, and Pydantic AI takes the type-safe idea further.

Multiple agents: handoffs#

One agent rarely does everything well. Handoffs let a router hand the conversation to a specialist. You list the specialists in handoffs=[...], and handoff_description tells the router when to pick each one.

python
import asyncio
from agents import Agent, Runner
 
billing_agent = Agent(
    name="Billing Agent",
    handoff_description="Handles billing, invoices, and refunds.",
    instructions="Resolve billing and refund questions clearly.",
)
 
tech_agent = Agent(
    name="Tech Support",
    handoff_description="Handles bugs, errors, and how-to questions.",
    instructions="Help with technical problems, step by step.",
)
 
triage_agent = Agent(
    name="Triage",
    instructions="Route each customer message to the right specialist.",
    handoffs=[billing_agent, tech_agent],
)
 
async def main():
    result = await Runner.run(triage_agent, "I was double-charged for my subscription.")
    print(result.final_output)
    print("Handled by:", result.last_agent.name)
 
if __name__ == "__main__":
    asyncio.run(main())

The SDK exposes each handoff to the model as a transfer_to_<agent> tool, so the router “calls” a specialist the same way it calls any tool. A handoff transfers control; the specialist owns the answer.

📌 Note

Handoffs are not the only pattern. If you want the router to stay in charge and treat specialists as helpers, use agents as tools instead: orchestrator = Agent(..., tools=[billing_agent.as_tool(...), tech_agent.as_tool(...)]). Handoff = transfer control; as_tool = keep control.

Add memory with sessions#

By default each run is stateless. Wrap it in a session and the agent remembers earlier turns. SQLiteSession persists the conversation to a file, so memory survives restarts.

python
import asyncio
from agents import Agent, Runner, SQLiteSession
 
agent = Agent(name="Assistant", instructions="Reply concisely.")
session = SQLiteSession("customer_42", "conversations.db")
 
async def main():
    await Runner.run(agent, "My name is Priya.", session=session)
    result = await Runner.run(agent, "What's my name?", session=session)
    print(result.final_output)   # remembers "Priya" across runs
 
if __name__ == "__main__":
    asyncio.run(main())

Pass the same session object into each Runner.run, and the SDK loads and saves history for you. Sessions turn a stateless call into a real conversation with almost no extra code.

Common mistakes#

These are the traps that cost beginners the most time.

  • Forgetting the API key. The first run dies on auth, not logic. Set OPENAI_API_KEY before anything else.
  • Writing vague tool docstrings. The model routes on the docstring. “Gets data” earns wrong calls; be specific about what the tool does and its arguments.
  • Fighting async. Runner.run is a coroutine — await it, or use Runner.run_sync. Calling it without either returns a coroutine object, not a result.
  • Confusing handoffs with tools. A handoff transfers the turn; as_tool keeps the router in control. Pick based on who should own the final answer.
  • Ignoring result.last_agent. In a multi-agent run, that field tells you which specialist actually replied — essential for logging and evaluation.

Summary#

You went from an empty file to a multi-agent support bot with memory. The OpenAI Agents SDK earned its keep by staying out of the way: an Agent holds the config, the Runner drives the loop, @function_tool adds abilities, output_type makes replies trustworthy, and handoffs plus sessions turn it into something you would actually ship. The heavy frameworks add power when you need explicit control — but for most agents, this much SDK is enough.

🧭 Where to go from here

Built something with this? Tell me which pattern you reached for first — a single tool-using agent, or the triage-and-handoff setup — and I’ll point you at the sharp edges to watch.

Frequently asked questions

Is the OpenAI Agents SDK free to use? +
The SDK itself is open-source and free. You pay only for the model calls it makes — the OpenAI API usage behind each run. You can also point it at non-OpenAI models through the LiteLLM extension, so "free SDK, metered model" is the right mental model.
How is the OpenAI Agents SDK different from LangGraph or LangChain? +
The Agents SDK is deliberately small: an Agent, a Runner, tools, and handoffs. It hides the agent loop and lets you write mostly plain Python. LangGraph gives you an explicit graph you control node by node. Reach for the SDK when you want the shortest path to a working agent; reach for LangGraph when you need fine-grained control over state and branching.
Do I have to use async/await? +
Runner.run is async, but the SDK ships Runner.run_sync for scripts and notebooks where you do not want an event loop. The logic is identical; only the calling style changes.
Can I use it with models other than OpenAI's? +
Yes. The SDK supports other providers through the LiteLLM extension, so you can run the same Agent and Runner code against Anthropic, Google, or local models with a different model string.

References

  1. OpenAI Agents SDK — Quickstart (official docs)
  2. OpenAI Agents SDK — Handoffs (official docs)
  3. OpenAI Agents SDK — Sessions (official docs)
  4. openai/openai-agents-python (GitHub)
Written by
Sukhveer Kaur
Sukhveer KaurSoftware Developer & AI Engineer

Sukhveer is a software developer specialising in AI systems and backend engineering. She has hands-on experience designing agentic AI applications, working with large language model pipelines, autonomous agent frameworks, and cloud-native services in Java and Python. At InfoWok, she bridges the gap between cutting-edge AI research and practical implementation — helping developers understand and apply emerging technologies through clear, experience-backed writing.

New AI engineering guides, the day they ship

Real Python, production depth. No digest spam.

Comments