# Claude Code MCP with Local Models: A Private Agent Stack

> Claude Code MCP on a free local stack: add servers with one command, pick tool-capable Ollama models, set scopes right, and dodge the privacy trap.

*Source: https://www.infowok.com/claude-code-mcp-local-models/ · Sukhveer Kaur · Published July 5, 2026*

---

> **Local AI, Zero Cost — Part 6.** [Part 4](/run-claude-code-for-free-2026/) freed Claude Code from the subscription, and [Part 5](/connect-claude-code-to-any-model/) wired it to the backend of your choice. This part adds the last piece: Claude Code MCP servers, so your free local stack can reach databases, browsers and docs — without quietly giving up the privacy you built it for.

Your local Claude Code setup can already read files, run commands and edit code. What it can't do is query your Postgres database, fetch a live doc page, or remember decisions between sessions. That's what MCP adds, and the setup is one command.

The interesting part is doing it on a *local* stack. Tool calling is exactly where small models are weakest, tool definitions eat the context you barely have, and one careless server choice can undo the whole privacy story. This guide covers the one command, the three scopes, the models that can actually drive tools — and the trap.

<KeyTakeaways>

- **MCP lives in the client, not the model:** Claude Code manages the servers and executes the calls, so every backend from Part 5 — Ollama included — can use MCP tools.
- **Local models need help:** pick tool-tagged models (14B+ or a strong MoE like a GLM flash variant), keep 64k+ context, and add one server at a time — tool definitions eat context fast.
- **Privacy is per-server, not per-model:** a local model with a remote MCP server still sends your data out. Fully private means local model *and* local servers.

</KeyTakeaways>

<Prerequisites>

- A working Claude Code setup on your chosen backend — [Part 5](/connect-claude-code-to-any-model/) if you haven't wired one yet
- Node.js or Python installed (most reference servers launch via `npx` or `uvx`)
- Five minutes of MCP background if it's new to you: [What Is an MCP Server?](/what-is-mcp-server-complete-guide-2026/)

</Prerequisites>

## What MCP Adds to an Already-Agentic CLI

Claude Code ships with serious built-in tools — file reads, edits, shell, search. MCP doesn't replace those; it adds doors to systems the CLI can't see. A database server turns "check the schema" into a real query. A fetch server pulls live documentation. A memory server keeps project decisions across sessions. Each door is a small program Claude Code launches and talks to on your behalf.

**Bottom line: built-in tools make Claude Code an agent in your repo; MCP servers make it an agent in your whole environment.**

## Claude Code MCP Setup: One Command, Three Scopes

Adding a server is a single command ([official docs](https://code.claude.com/docs/en/mcp)). The reference memory server makes a perfect first test — no accounts, no keys, fully local:

```bash
claude mcp add memory -- npx -y @modelcontextprotocol/server-memory
```

Inside a session, `/mcp` shows what's connected. The flag that matters is `--scope`:

- **`local`** (default) — registered for you, in this project only.
- **`project`** — written to `.mcp.json` at the repo root, committed to git, shared with the team.
- **`user`** — available to you in every project.

When the same server is defined twice, local beats project, which beats user — handy for overriding a team default with your own variant.

A good second door is the official fetch server, which lets the model pull live pages — the server runs locally, though the pages it fetches are ordinary web traffic:

```bash
claude mcp add fetch -- uvx mcp-server-fetch
```

One honest warning for the database door: skip the old reference SQLite server. It's been [moved to the archived repo](https://github.com/modelcontextprotocol/servers-archived) and carries a publicly disclosed, unpatched SQL-injection flaw. Pick a maintained community SQLite or Postgres server instead — or [build your own](/build-mcp-server-python-production/), which for an internal database is the safest door anyway.

<Callout type="key">

Claude Code MCP configuration is entirely client-side. The CLI launches the servers, hands their tool list to the model, and executes whatever the model calls. That's why the whole [Part 5](/connect-claude-code-to-any-model/) backend menu — Ollama, DeepSeek, GLM, Kimi — works with MCP unchanged. The servers never know which model is driving.

</Callout>

![Claude Code MCP tool call flow: the CLI sends the prompt and tool list to the local model, gets a tool call back, runs the MCP server itself, and returns the result](./claude-code-mcp-call-flow.svg)

## The Local-Model Catch: Tools Cost Context and Skill

On Anthropic's models, you can pile on MCP servers carelessly. On a local stack you can't, for two reasons.

**Tool definitions eat context.** Every server you add injects its tool descriptions into the context window. On a 200k-token cloud model that's noise; on a local model where [Ollama's own guidance](https://docs.ollama.com/integrations/claude-code) is 64k minimum for repository work, three chatty servers can crowd out your actual code.

**Emitting tool calls is a trained skill.** The model must produce exactly structured calls, every time. Community testing keeps finding the same line: 7–8B models produce inconsistent tool calls, while 14B-class models and mixture-of-experts coder variants handle them reliably. Stick to [models tagged for tools](https://ollama.com/search?c=tools) — Llama 3.1 instruct, Qwen coder variants, or a GLM flash model that fits in 16 GB of RAM. [Part 1](/best-local-llm-for-your-laptop/) maps what your hardware can hold.

<Callout type="warning">

Add one server, test it in a session, then add the next. If the model starts narrating tool use instead of doing it, or calls the wrong tool entirely, you've hit its ceiling — remove a server or step up a model size before blaming MCP.

</Callout>

Two commands make that experiment loop reversible:

```bash
claude mcp list            # every configured server + its connection status
claude mcp remove fetch    # take a server back out
```

**Bottom line: on a local stack, treat every MCP server as a context expense that must earn its keep.**

## The Privacy Trap: Local Model ≠ Local Stack

Here's the part that undoes careless setups. Running the model locally protects the *conversation* — your prompts and code never leave the machine. But an MCP server is its own actor. A remote server (GitHub's hosted MCP, a SaaS connector) receives every query, issue, or page the model asks it for, under that service's data terms, no matter where the model runs.

- **Local model + local servers** (memory, a local database, filesystem) — fully private. Nothing leaves.
- **Local model + remote server** — the conversation stays home; the *tool traffic* doesn't.
- **Cloud backend + any server** — you already accepted the provider's terms in [Part 4](/run-claude-code-for-free-2026/); MCP adds the server's terms on top.

If you connect GitHub's hosted server — our [GitHub MCP tutorial](/github-mcp-server-claude-tutorial-2026/) covers it — do it knowingly: it's enormously useful, and it's remote. And when no server exists for your internal system, [build one in Python](/build-mcp-server-python-production/); a server you wrote and run locally is the most private integration there is.

**Bottom line: privacy is decided per MCP server, not by where the model runs.**

## The Stack at a Glance

| Layer | Fully private choice | What changes if you go remote |
| --- | --- | --- |
| Model | Ollama on localhost | Prompts + code go to the provider |
| Claude Code | always local (it's your CLI) | — |
| MCP servers | memory, local DB, self-built | Tool traffic goes to each remote service |
| Scope | `local` for experiments | `project` shares config with your team via git |

![Local agent stack: Claude Code and local MCP servers inside your machine; a remote MCP server crosses the boundary](./claude-code-mcp-local-stack.svg)

Six parts in, the arc is complete: a model your laptop can run, a coding assistant, private RAG, a free Claude Code, any backend you like — and now Claude Code MCP tools that reach the rest of your environment on your terms. That's a full agent stack with a $0 line item, and you know exactly where every byte goes.

<NextSteps>

- **Start here:** add the memory server with the one-liner above, restart, and test recall across sessions.
- **Level up:** connect your repos with the [GitHub MCP server](/github-mcp-server-claude-tutorial-2026/) — knowing it's the remote row of the table.
- **Go deeper:** write your own local server for an internal system with our [Python MCP server tutorial](/build-mcp-server-python-production/).

</NextSteps>
