A 2026 survey of more than 17,000 public MCP servers found that 41% ship with zero authentication. If you built yours by following a ten-minute tutorial, it’s probably in that 41% — open to anyone who finds the URL. Let’s fix that.
By the end of this guide, you’ll build an MCP server in Python that an AI agent can actually call in production: real tools, OAuth 2.1 on every request, and a streamable-HTTP transport you can put behind a real domain. We’ll use FastMCP, the library that crossed roughly 4 million daily downloads in 2026 and is now the default way to build these servers.
MCP (Model Context Protocol — an open standard that lets AI agents call your tools through one consistent interface) is everywhere this year. Most tutorials stop at “hello world.” This one takes you to the part that matters: shipping it without leaving the door open.
What “Production-Ready” Actually Means for an MCP Server
Before any code, you need to know what separates a demo server from one you’d trust on the open internet. The gap is almost entirely about transport and authentication — the two things toy tutorials skip.
Most beginner guides run your server over stdio (standard input/output: the server talks to one client on the same machine through a pipe). Stdio is perfect for local development and useless in production: it can’t accept a remote connection, so no deployed agent can reach it. Production servers use streamable-HTTP instead, a single HTTP endpoint that works behind load balancers, proxies, and TLS. It replaced the older HTTP+SSE transport in the 2025 MCP spec, and switching to it is a one-line change you’ll make later.
Here’s the full shape of what we’re building:
The diagram shows the one piece tutorials leave out: the OAuth 2.1 gate sitting in front of your tools. Every agent request carries a token; the server validates it before a single tool runs. Without that gate, your tools are a public API with no lock.
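The ordering that matters here can be pictured in a few lines of plain Python. This is a conceptual sketch, not FastMCP’s code: `gate`, `VALID_TOKENS`, and `disk_usage_stub` are hypothetical names, and a real server verifies a signed JWT rather than looking a token up in a set. It only shows that the token check comes first and the tool never runs when the check fails.

```python
# Conceptual sketch only: hypothetical names, no real JWT verification.
from typing import Callable, Dict, Tuple

VALID_TOKENS = {"demo-token"}  # stand-in for real signature and claim checks


def gate(headers: Dict[str, str], run_tool: Callable[[], dict]) -> Tuple[int, dict]:
    """Return (status, body); call run_tool only if the bearer token is accepted."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or token not in VALID_TOKENS:
        return 401, {"error": "unauthorized"}  # the tool code never executes
    return 200, run_tool()


calls = []  # records every time the stub tool actually runs


def disk_usage_stub() -> dict:
    calls.append("ran")
    return {"path": "/"}


print(gate({}, disk_usage_stub))
print(gate({"Authorization": "Bearer demo-token"}, disk_usage_stub))
```

The `calls` list is there so you can confirm the stub is untouched on a rejected request, which is the property the real auth layer has to guarantee.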
Build the MCP Server with FastMCP
With the goal clear, the build itself is short — that’s the appeal of FastMCP. First, the setup checklist:
- Python 3.11+ and a virtual environment
- pip install fastmcp (this guide used FastMCP 3.x, current in June 2026)
- An OAuth provider for later: any issuer that exposes a JWKS endpoint (Auth0, WorkOS, Keycloak, your own)
- A terminal to run the server and a test client
If you’ve never touched MCP at all, read What Are AI Agents? Complete Guide for Developers (2026) first — it covers why agents need tools in the first place.
The four stages ahead are short, and each one builds on the last:
That sequence is the whole post. Step 1 and the deploy step are quick; the auth step in the middle is the one that actually keeps your server safe.
Now the server. A FastMCP tool is just a typed Python function with a decorator. The type hints aren’t optional polish — they’re how the agent knows what arguments to send.
```python
# server.py
import shutil

from fastmcp import FastMCP

mcp = FastMCP(name="Ops Server")


@mcp.tool
def disk_usage(path: str = "/") -> dict:
    """Return disk usage stats for a path on the host."""
    total, used, free = shutil.disk_usage(path)
    gb = 1024 ** 3
    return {
        "path": path,
        "total_gb": round(total / gb, 1),
        "used_gb": round(used / gb, 1),
        "free_gb": round(free / gb, 1),
    }


if __name__ == "__main__":
    mcp.run()  # stdio by default: fine for local dev, not production
```
That’s a working server. Run python server.py and an MCP client on the same machine can call disk_usage. The docstring becomes the tool’s description the agent reads, and the return type tells the agent it’s getting structured data back.
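To see why the hints matter, here is a rough illustration of how a function signature and docstring can be turned into an argument schema. It is a simplified sketch using only the standard library, not FastMCP’s actual implementation; `describe_tool`, `JSON_TYPES`, and the `untyped` function are hypothetical.

```python
# Simplified illustration: not how FastMCP builds schemas internally.
import inspect
import shutil

JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func) -> dict:
    """Build a small schema-like dict from a function's signature and docstring."""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # An unannotated parameter leaves nothing useful to tell the agent.
        properties[name] = {"type": JSON_TYPES.get(param.annotation, "unknown")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "properties": properties,
        "required": required,
    }


def disk_usage(path: str = "/") -> dict:
    """Return disk usage stats for a path on the host."""
    total, used, free = shutil.disk_usage(path)
    return {"path": path, "total": total, "used": used, "free": free}


def untyped(path):
    return path


print(describe_tool(disk_usage))
print(describe_tool(untyped))
```

The typed function yields a string-typed, optional `path` with a description; the untyped one yields an argument of unknown type and no description, which is the situation where an agent has to guess.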
MCP has two more primitives worth knowing. Resources expose read-only data the agent can pull in as context, and prompts are reusable instruction templates. Both are one decorator each:
```python
# server.py (continued)
@mcp.resource("config://limits")
def limits() -> str:
    """Read-only config the agent can load as context."""
    return "max_disk_pct=90\nalert_email=ops@example.com"


@mcp.prompt
def triage(host: str) -> str:
    return f"Check {host} for disk pressure. Summarise the risk in 3 lines."
```
Tools do things, resources hand over data, prompts shape behaviour. I keep all three in one file until a server grows past a dozen tools — then I split them. Right now, this server works, but it’s wide open. Let’s close it.
Add OAuth 2.1 Auth — the Part Tutorials Skip
This is the section that moves you out of that 41%. The goal is simple: no valid token, no tool call. FastMCP makes the server reject unauthenticated requests with a 401 before any of your code runs.
The 2026 MCP spec standardised on OAuth 2.1 with token audience validation — meaning your server only accepts tokens minted specifically for it, and never forwards them to a downstream API. FastMCP ships a JWTVerifier that does exactly this against your identity provider’s public keys:
```python
# server.py (auth added)
from fastmcp import FastMCP
from fastmcp.server.auth.providers.jwt import JWTVerifier

auth = JWTVerifier(
    jwks_uri="https://auth.example.com/.well-known/jwks.json",
    issuer="https://auth.example.com",
    audience="ops-mcp-server",  # this server's unique identifier
)

mcp = FastMCP(name="Ops Server", auth=auth)
```
That’s the whole change. The JWTVerifier fetches your provider’s public keys from the JWKS URL (a standard endpoint that publishes the keys used to sign tokens) and checks every incoming request. A token with the wrong issuer or audience is rejected. You write zero verification logic.
The first time I wired this up, I expected an afternoon of OAuth pain. It took about fifteen minutes, because the hard parts — key rotation, signature checks, audience matching — live inside the verifier.
Common mistake: setting audience to your auth provider’s URL instead of this server’s identifier. The audience is the resource being protected, not who issued the token. Get it wrong and every valid-looking token gets a confusing 401.
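For intuition about what gets rejected, the checks on a token’s claims look roughly like this. It is a sketch, not the JWTVerifier source: `check_claims` is a hypothetical helper that works on already-decoded claims and deliberately skips signature verification, which is the part you should leave to the library.

```python
# Sketch of claim checks on an already-decoded token payload.
# Hypothetical helper; signature verification is intentionally omitted.
import time
from typing import Optional


def check_claims(claims: dict, issuer: str, audience: str,
                 now: Optional[float] = None) -> Optional[str]:
    """Return a rejection reason, or None if the claims are acceptable."""
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        return "wrong issuer"
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]  # aud may be a string or a list
    if audience not in audiences:
        return "wrong audience"
    if claims.get("exp", 0) <= now:
        return "expired"
    return None


# A token minted for this server by the provider.
token_claims = {
    "iss": "https://auth.example.com",
    "aud": "ops-mcp-server",
    "exp": time.time() + 60,
}

# Correct server config: audience is this server's identifier.
print(check_claims(token_claims, "https://auth.example.com", "ops-mcp-server"))

# Misconfigured server: audience set to the provider URL, so a good token fails.
print(check_claims(token_claims, "https://auth.example.com", "https://auth.example.com"))
```

The second call shows the misconfiguration from the server’s side: nothing is wrong with the token, yet it is refused because the server is expecting the wrong audience.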
Deploy It and Test the Token Is Enforced
A locked server still has to be reachable. Two changes take you from local stdio to a deployed, authenticated endpoint. First, switch the transport:
```python
if __name__ == "__main__":
    mcp.run(transport="http", host="0.0.0.0", port=8000)
```
transport="http" is FastMCP’s streamable-HTTP mode. Binding to 0.0.0.0 lets the container accept outside connections — which is exactly why the auth from the last step is non-negotiable. Put this behind a reverse proxy with TLS (a managed host or an Nginx/Caddy layer), and load your secrets from environment variables, never from code.
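One way to keep those values out of the source is to read them from the environment at startup. The variable names below (`MCP_JWKS_URI`, `MCP_ISSUER`, `MCP_AUDIENCE`, `MCP_PORT`) and the `Settings` class are assumptions for illustration, not names FastMCP looks for on its own.

```python
# Sketch: load auth settings from environment variables (names are hypothetical).
import os
from dataclasses import dataclass

REQUIRED = ("MCP_JWKS_URI", "MCP_ISSUER", "MCP_AUDIENCE")


@dataclass(frozen=True)
class Settings:
    jwks_uri: str
    issuer: str
    audience: str
    port: int


def load_settings(env=None) -> Settings:
    """Read settings from env, failing fast if a required auth value is missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise RuntimeError(f"missing required settings: {', '.join(missing)}")
    return Settings(
        jwks_uri=env["MCP_JWKS_URI"],
        issuer=env["MCP_ISSUER"],
        audience=env["MCP_AUDIENCE"],
        port=int(env.get("MCP_PORT", "8000")),
    )


# Demo with an explicit mapping; in a deployment you would call load_settings().
demo_env = {
    "MCP_JWKS_URI": "https://auth.example.com/.well-known/jwks.json",
    "MCP_ISSUER": "https://auth.example.com",
    "MCP_AUDIENCE": "ops-mcp-server",
}
print(load_settings(demo_env))
```

The resulting fields can stand in for the literals passed to JWTVerifier and mcp.run. Failing fast means a deploy with a missing audience stops at startup instead of running with broken auth.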
Now prove the lock works. A short client checks both paths — a valid token gets in, anything else is refused:
```python
# test_client.py
import asyncio

from fastmcp import Client


async def main():
    async with Client(
        "https://ops.example.com/mcp",
        auth="paste-a-valid-jwt-here",
    ) as client:
        tools = await client.list_tools()
        print("tools:", [t.name for t in tools])
        result = await client.call_tool("disk_usage", {"path": "/"})
        print("result:", result.data)


asyncio.run(main())
```
With a good token, you’ll see the tool list and a real result. Run the same client with a junk token and the call fails with 401 Unauthorized before disk_usage ever executes — that single failed request is the proof your server is no longer in the open 41%. I always keep this negative test; it’s the fastest way to catch a misconfigured audience after a deploy.
If you’re deploying alongside a running agent, the patterns in Build an Agentic AI App in Python: FastAPI, Docker & Deploy (Part 2) — containers, environment secrets, and a managed host — apply directly to this server too.
Common Mistakes That Keep Servers Insecure
Most broken production servers fail in the same handful of ways. I’ve hit several of these myself; here’s the short list worth checking before you ship:
- Stdio in production. It can’t take a remote connection. If a deployed agent can’t reach your server, this is usually why — switch to streamable-HTTP.
- Binding 0.0.0.0 with no auth. Auditors keep finding servers exposed on every interface with the lock left off. Add the JWTVerifier before you expose the port, not after.
- Missing type hints. An untyped tool gives the agent no schema, so it guesses arguments and calls fail intermittently. Type every parameter.
- Secrets in tool descriptions. Docstrings are sent to the model. One audited server stored an admin API key in a tool description — never put credentials anywhere the agent can read.
- Token passthrough. Don’t forward the agent’s token to a downstream API. The spec forbids it; validate the audience and stop there.
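The secrets-in-descriptions mistake is cheap to screen for before shipping. The following is a rough sketch: `find_suspicious_docstrings` and its patterns are hypothetical and far from exhaustive, so treat a clean result as a hint rather than a guarantee.

```python
# Heuristic pre-ship check for credentials in tool docstrings (hypothetical helper).
import inspect
import re

SECRET_PATTERNS = [
    # key=value or key: value assignments for common credential names
    re.compile(r"(?i)\b(api[_-]?key|secret|password)\b\s*[:=]\s*\S+"),
    # long opaque strings with a common "sk-" style prefix
    re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),
]


def find_suspicious_docstrings(funcs) -> list:
    """Return the names of functions whose docstring matches a secret pattern."""
    flagged = []
    for func in funcs:
        doc = inspect.getdoc(func) or ""
        if any(pattern.search(doc) for pattern in SECRET_PATTERNS):
            flagged.append(func.__name__)
    return flagged


def disk_usage(path: str = "/") -> dict:
    """Return disk usage stats for a path on the host."""
    return {"path": path}


def admin_report() -> str:
    """Fetch the admin report. Use api_key=abc123 when calling upstream."""
    return "report"


print(find_suspicious_docstrings([disk_usage, admin_report]))
```

Run over the functions you register as tools, a check like this can sit in CI next to the negative auth test.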
What to Build Next
You now have a real MCP server: typed tools, OAuth on every call, and a transport you can deploy. The obvious next step is to plug it into an actual agent.
I’d wire this server into a multi-agent system next, because that’s where a locked toolset earns its keep — several agents sharing one trusted tool layer. An authenticated MCP server is the cleanest way to give an agent real-world powers without handing out raw API keys.
From there, the highest-value upgrade is per-tool scopes. Right now a valid token unlocks every tool; a read-only agent and a write-access agent get identical power. Attaching a required scope to each tool — read for disk_usage, a stricter write scope for anything that deletes or spends — means one leaked token can’t do everything. Add rate limiting in front of any tool that touches money or sends email, and you’ve covered the failures that actually cause incidents. Each of these is a small change on top of the foundation you just built.
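The idea behind per-tool scopes can be sketched without any framework. The decorator below is hypothetical: `require_scope`, `ScopeError`, and passing the scopes as the first argument are illustration choices, not FastMCP’s API. In a real server the scopes would come from the validated token.

```python
# Sketch of per-tool scope checks; names are hypothetical, not FastMCP's API.
import functools


class ScopeError(PermissionError):
    """Raised when a token lacks the scope a tool requires."""


def require_scope(scope: str):
    """Decorator: the wrapped tool runs only if `scope` is in token_scopes."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(token_scopes, *args, **kwargs):
            if scope not in token_scopes:
                raise ScopeError(f"{func.__name__} requires scope '{scope}'")
            return func(*args, **kwargs)
        return wrapper
    return decorator


@require_scope("read")
def disk_usage(path: str = "/") -> dict:
    return {"path": path}


@require_scope("write")
def delete_file(path: str) -> str:
    return f"would delete {path}"  # no real deletion in this sketch


read_only = {"read"}
print(disk_usage(read_only, "/"))
try:
    delete_file(read_only, "/tmp/x")
except ScopeError as exc:
    print("refused:", exc)
```

With this shape, a leaked read-only token can still call disk_usage but is refused by anything marked write.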
Conclusion
You’ve built what most tutorials don’t: an MCP server that’s safe to deploy. The recipe is short — typed tools with FastMCP, a JWTVerifier for OAuth 2.1, transport="http" for streamable-HTTP, and a negative test that proves the token is enforced. That’s the difference between a demo and something an agent can use in production.
What’s the first tool you’d expose to an agent through MCP — and would you trust it on the open internet yet? Tell me in the comments.
Read next: Build an Agentic AI App in Python: Multi-Agent Systems (Part 3) — the agent side that consumes a server like this one.
Related: What Are AI Agents? Complete Guide for Developers (2026) · Official docs: FastMCP and the Model Context Protocol spec.






