InfoWok

Connect Claude Code to Any Model: 6 Exact Configs (2026)

Connect Claude Code to Ollama, DeepSeek, GLM, Kimi, NVIDIA NIM or OpenRouter: copy-paste env vars, settings.json blocks, model mapping and quick fixes.

Sukhveer Kaur
Published July 5, 2026
5 min read

Local AI, Zero Cost — Part 5. Part 4 separated the viral hype from the truth: the CLI is free, the Claude models aren’t, and one URL decides who answers. This part is the hands-on half — how to connect Claude Code to each backend, with exact copy-paste configs.

Part 4 told you which backend fits your budget and privacy needs. This tutorial wires it up. Every config below comes from the provider’s own docs, not a video description box. You get the env vars, the settings.json blocks that make them stick, the model-mapping variables nobody mentions, and the proxy step that NVIDIA NIM and OpenRouter truly require. If you want to connect Claude Code to a new engine without guesswork, this is the page to keep open.

Fifteen minutes from now, claude in your terminal will be talking to whichever engine you picked — and you’ll know how to switch back.

🎯 Key takeaways
  • Three variables do all the work: ANTHROPIC_BASE_URL (where requests go), ANTHROPIC_AUTH_TOKEN (the provider’s key), and ANTHROPIC_MODEL (which model answers) — set them, open a fresh terminal, done.
  • Four backends connect directly (Ollama, DeepSeek, GLM, Kimi expose Anthropic-style endpoints); NVIDIA NIM and OpenRouter speak the OpenAI style, so they need a small local translator proxy.
  • Put the config in ~/.claude/settings.json to survive new terminals — and in a per-project .claude/settings.json to give each repo its own backend.
🟢 Beginner · ⏱️ 15–20 min · Stack: Claude Code CLI + one of: Ollama, DeepSeek, Z.ai GLM, Moonshot Kimi, NVIDIA NIM, OpenRouter
Before you start
  • Claude Code installed (curl -fsSL https://claude.ai/install.sh | bash) — no subscription needed for third-party backends
  • An account/API key for the provider you chose — Part 4 compares them on cost, privacy and quality
  • Comfort editing a JSON file and pasting terminal commands

The Three Variables That Do Everything

Claude Code reads its destination from environment variables at launch. That’s the entire mechanism — no forks, no plugins, no patched binaries.

  • ANTHROPIC_BASE_URL: the server that receives every request. Default: Anthropic. Change it, change the engine.
  • ANTHROPIC_AUTH_TOKEN: the credential sent to that server. Third-party endpoints read this one. Also set ANTHROPIC_API_KEY="" so a leftover Anthropic key can’t interfere.
  • ANTHROPIC_MODEL: which of the provider’s models answers. Finer mapping exists too: ANTHROPIC_DEFAULT_OPUS_MODEL, ANTHROPIC_DEFAULT_SONNET_MODEL and ANTHROPIC_DEFAULT_HAIKU_MODEL translate Claude Code’s internal tier names into the provider’s model IDs, and CLAUDE_CODE_SUBAGENT_MODEL picks a cheaper model for background subagents.

Bottom line: if you understand these three variables, every provider section below is just different values.
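
As a minimal sketch of how the three variables fit together, the exports below use placeholder values. The URL, key and model ID are hypothetical stand-ins rather than a real provider; each provider section supplies the real ones.

```shell
# Placeholder values (assumptions for illustration only): substitute your
# provider's real endpoint, API key and model ID from the sections below.
export ANTHROPIC_BASE_URL="https://provider.example/anthropic"  # where requests go
export ANTHROPIC_AUTH_TOKEN="your-provider-api-key"             # the provider's key
export ANTHROPIC_API_KEY=""                                     # blank out any leftover Anthropic key
export ANTHROPIC_MODEL="provider-model-id"                      # which model answers

# Start Claude Code from this same shell so it inherits the values:
#   claude
```

Because these are plain exports, they apply only to the shell that ran them and to programs started from it.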

🔑 Key point

Environment variables load when Claude Code starts. Edits never reach an already-open session — close it and open a fresh terminal after every config change. This one habit prevents most “it’s not working” moments.
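
One way to confirm what a fresh terminal will hand to Claude Code is to print the relevant variables before launching. The helper below is a sketch; the name show_backend is hypothetical, and it reports only whether the token is set so the key never lands in your scrollback.

```shell
# show_backend: print the backend settings exported in the current shell.
# Hypothetical helper name; the token is reported as set/unset, never echoed.
show_backend() {
  echo "base_url=${ANTHROPIC_BASE_URL:-unset (Anthropic default)}"
  echo "model=${ANTHROPIC_MODEL:-unset (default)}"
  if [ -n "${ANTHROPIC_AUTH_TOKEN:-}" ]; then
    echo "auth_token=set"
  else
    echo "auth_token=unset"
  fi
}
```

Run show_backend right before claude. It sees shell variables only, so values that come from a settings.json env block will not appear here.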

Make It Permanent: settings.json

Exports die with the terminal. For a setup you’ll keep, put the same values in ~/.claude/settings.json — Claude Code applies its env block to every session (settings docs):

json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "your-provider-api-key",
    "ANTHROPIC_API_KEY": ""
  }
}

Claude Code also reads a per-project .claude/settings.json, and project values win. That means one repo can run on GLM while your client work stays on Anthropic — no juggling exports.

Bottom line: shell exports are for trying a backend; settings.json is for keeping it.
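
For the per-project case, a sketch like the following writes the same env block into a repo's .claude/settings.json. The function name write_project_settings is hypothetical, the key is a placeholder, and the function overwrites any existing file at that path, so treat it as a starting point rather than a finished tool.

```shell
# write_project_settings: create <dir>/.claude/settings.json with an env block.
# Hypothetical helper; it OVERWRITES an existing settings.json in that folder.
write_project_settings() {
  project_dir="${1:-.}"
  mkdir -p "$project_dir/.claude" || return 1
  cat > "$project_dir/.claude/settings.json" <<'EOF'
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "your-provider-api-key",
    "ANTHROPIC_API_KEY": ""
  }
}
EOF
}
```

Call it as write_project_settings /path/to/repo, then replace the placeholder key. A real key in a project file should stay out of version control.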

Connect Claude Code to Each Provider

Pick your backend and paste. Each block is the provider’s own published config.

Ollama: local, $0, private

Since v0.14, Ollama speaks the Anthropic API natively on port 11434. Easiest path:

bash
ollama launch claude

Manual equivalent, if you want control:

bash
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=""
export ANTHROPIC_BASE_URL=http://localhost:11434
claude --model qwen3.5

Any tool-capable model you have already pulled works after --model. For real repositories, raise the context window to 64k tokens or more in Ollama’s settings, because agentic sessions consume context quickly. Part 1 helps you pick a model your RAM can hold.
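
Before launching, it can help to confirm the model is actually on disk. This is a small sketch built on the standard ollama list and ollama pull commands; the function name ollama_has_model is hypothetical.

```shell
# ollama_has_model: succeed if the named model appears in `ollama list`.
# Hypothetical helper name; it matches the start of the NAME column.
ollama_has_model() {
  model="$1"
  if ! command -v ollama >/dev/null 2>&1; then
    echo "ollama not found on PATH" >&2
    return 1
  fi
  ollama list | grep -q "^${model}"
}

# Example: pull the model only when it is missing.
#   ollama_has_model qwen3.5 || ollama pull qwen3.5
```

The match is a prefix check, so qwen3.5 also matches tagged names such as qwen3.5:latest.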

DeepSeek: pay-per-token, cents per session

DeepSeek hosts an Anthropic-compatible endpoint and documents the full Claude Code setup, including tier mapping:

bash
export ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic
export ANTHROPIC_AUTH_TOKEN=<your-deepseek-api-key>
export ANTHROPIC_MODEL="deepseek-v4-pro[1m]"
export ANTHROPIC_DEFAULT_HAIKU_MODEL=deepseek-v4-flash
export CLAUDE_CODE_SUBAGENT_MODEL=deepseek-v4-flash

The last two lines are the money-savers. Quick internal calls and subagents run on the flash model; the pro model handles your actual prompts. DeepSeek also auto-maps Claude tier names (opus-prefixed → v4-pro, haiku/sonnet-prefixed → v4-flash) if you skip the mapping vars.
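
If you would rather not rely on the automatic mapping, a sketch of the explicit version follows. It reuses the model IDs from the block above and mirrors the mapping described in the text (opus to pro, sonnet and haiku to flash). Using the pro ID without the [1m] suffix for the opus tier is an assumption here, not something the text specifies.

```shell
# Explicit tier mapping, mirroring the auto-mapping described above.
# Assumption: the opus tier uses the plain pro ID without the [1m] suffix.
export ANTHROPIC_DEFAULT_OPUS_MODEL=deepseek-v4-pro
export ANTHROPIC_DEFAULT_SONNET_MODEL=deepseek-v4-flash
export ANTHROPIC_DEFAULT_HAIKU_MODEL=deepseek-v4-flash
export CLAUDE_CODE_SUBAGENT_MODEL=deepseek-v4-flash
```

Spelling the mapping out makes the cost split visible in your config, and lets you move the sonnet tier to the pro model later by changing one line.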

GLM (Z.ai): the coding-plan favorite

Z.ai’s GLM Coding Plan is built to plug into Claude Code. The official docs configure it through settings.json:

json
{
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "your-zai-api-key",
    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic"
  }
}

Prefer automation? npx @z_ai/coding-helper walks through the same setup interactively. Your plan’s GLM model answers by default — no model variable needed to start.

Kimi (Moonshot): long-context specialist

Moonshot exposes an Anthropic-style endpoint and documents Claude Code support directly:

bash
export ANTHROPIC_BASE_URL=https://api.moonshot.ai/anthropic
export ANTHROPIC_AUTH_TOKEN=<your-moonshot-api-key>
export ANTHROPIC_MODEL=kimi-k2.7-code

Using the China platform instead? Swap the base URL to https://api.moonshot.cn/anthropic.
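
If you move between the two platforms, the URL choice can be kept in one place. The sketch below is one possible way to do that; the function kimi_base_url and the variable KIMI_REGION are hypothetical names, and only the two URLs come from the text above.

```shell
# kimi_base_url: print the Moonshot endpoint for a region ("cn" or anything else).
# Hypothetical helper; KIMI_REGION is an illustrative variable, not a Moonshot setting.
kimi_base_url() {
  case "${1:-global}" in
    cn) echo "https://api.moonshot.cn/anthropic" ;;
    *)  echo "https://api.moonshot.ai/anthropic" ;;
  esac
}

export ANTHROPIC_BASE_URL="$(kimi_base_url "${KIMI_REGION:-global}")"
```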

NVIDIA NIM and OpenRouter: free tiers, via a proxy

Here’s the step the viral videos blur past: these two serve the OpenAI-style API, so Claude Code can’t talk to them directly. The bridge is a small translator running on your machine. It shows Claude Code an Anthropic-style endpoint and forwards each request in the provider’s format. The open-source free-claude-code proxy is built for exactly this (it also covers LM Studio and llama.cpp):

  1. Clone and configure the proxy — set the provider (NIM or OpenRouter) and your API key in its config file.
  2. Start it — it prints a localhost address.
  3. Point Claude Code at the proxy:
bash
export ANTHROPIC_BASE_URL=http://localhost:<the-proxy-port>
export ANTHROPIC_API_KEY=""
claude

Model choice lives in the proxy’s config, not in Claude Code. The proxy maps Claude’s tier names to whichever NIM or OpenRouter model you set. Our OpenRouter review covers which :free models are worth mapping.
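
Since Claude Code only sees the proxy, a quick check that something is listening on the proxy port can save a confusing first session. This is a generic sketch using curl, not a feature of the proxy itself; the function name proxy_is_up and the port in the example are hypothetical.

```shell
# proxy_is_up: succeed if anything answers HTTP on localhost:<port>.
# curl exits 0 for any HTTP response (even a 404), so this checks only that
# a server is listening, not that it is the right one.
proxy_is_up() {
  port="$1"
  curl --silent --output /dev/null --max-time 2 "http://localhost:${port}/"
}

# Example with a hypothetical port; use the one your proxy prints.
#   proxy_is_up 8082 && export ANTHROPIC_BASE_URL="http://localhost:8082"
```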

⚠️ Warning

A proxy adds a moving part. If sessions stall or tools misbehave, check the proxy’s console output first — rate-limit errors from the free tier show up there, not in Claude Code.

When Something Breaks

  • Config edits seem ignored — the session was already open. Close every Claude Code window and start a fresh terminal. Variables load at launch.
  • 401 / authentication errors — the key sits in the wrong variable. Third-party endpoints want ANTHROPIC_AUTH_TOKEN; keep ANTHROPIC_API_KEY="".
  • The model narrates edits instead of making them — a tool-calling gap. Switch to a coder or agentic variant of the provider’s lineup. Plain chat models can’t drive Claude Code’s file tools reliably.
  • Truncated or amnesiac sessions on local models — context window too small. Set 64k+ in Ollama for repository work.
  • Model-not-found errors — you passed a Claude name to an endpoint that doesn’t know it. Set ANTHROPIC_MODEL (and the mapping trio) to the provider’s own model IDs.
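
The authentication item in the list above can be turned into a quick self-check. The sketch below inspects only the current shell's variables; the function name check_backend_env is hypothetical.

```shell
# check_backend_env: flag the key-in-the-wrong-variable mistakes described above.
# Hypothetical helper; it looks at shell variables only, not settings.json.
check_backend_env() {
  # Nothing to check when no third-party endpoint is configured.
  [ -n "${ANTHROPIC_BASE_URL:-}" ] || return 0
  if [ -n "${ANTHROPIC_API_KEY:-}" ] && [ -z "${ANTHROPIC_AUTH_TOKEN:-}" ]; then
    echo "key is in ANTHROPIC_API_KEY; move it to ANTHROPIC_AUTH_TOKEN"
    return 1
  fi
  if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is non-empty; set it to \"\" to avoid a conflict"
    return 1
  fi
  return 0
}
```

A silent, successful return means the two key variables match the pattern this tutorial uses. It does not prove the key itself is valid.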
💡 Tip

Keep one tiny shell script per backend (glm.sh, ollama.sh, anthropic.sh) that exports the right variables and runs claude. Switching engines becomes a one-word decision instead of a config-editing session.
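
The same idea also works as shell functions in your shell startup file instead of separate script files. The sketch below shows a GLM variant; the function name run_glm and the ZAI_API_KEY variable are hypothetical, and the subshell keeps the exports from leaking into the rest of your terminal session.

```shell
# run_glm: launch Claude Code against Z.ai without changing the parent shell.
# Hypothetical helper; ZAI_API_KEY is an illustrative variable holding your key.
run_glm() {
  (
    export ANTHROPIC_BASE_URL="https://api.z.ai/api/anthropic"
    export ANTHROPIC_AUTH_TOKEN="${ZAI_API_KEY:?set ZAI_API_KEY first}"
    export ANTHROPIC_API_KEY=""
    claude "$@"
  )
}
```

Because the exports live inside the parentheses, running plain claude afterwards in the same terminal still uses whatever that terminal had before.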

Every Config at a Glance

Backend | ANTHROPIC_BASE_URL | Auth token | Model setting
Ollama | http://localhost:11434 | ollama | claude --model <name>
DeepSeek | https://api.deepseek.com/anthropic | DeepSeek key | deepseek-v4-pro[1m] + flash mapping
GLM (Z.ai) | https://api.z.ai/api/anthropic | Z.ai key | plan default
Kimi | https://api.moonshot.ai/anthropic | Moonshot key | kimi-k2.7-code
NIM / OpenRouter | local proxy address | key lives in proxy | mapped in proxy config
Back to Anthropic | unset everything | Claude login | /model in-session

That last row matters. To connect Claude Code back to the real Claude models, remove the env block (or unset the variables) and open a new terminal. You’re home. Nothing is permanent, nothing is patched — which is exactly why this whole ecosystem works.
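
For the shell side of that reset, a sketch like the one below clears every variable this tutorial sets. The function name reset_backend is hypothetical.

```shell
# reset_backend: clear every backend variable used in this tutorial.
# Hypothetical helper; it affects the current shell only. If you normally
# authenticate with a real Anthropic API key, export it again afterwards.
reset_backend() {
  unset ANTHROPIC_BASE_URL ANTHROPIC_AUTH_TOKEN ANTHROPIC_API_KEY ANTHROPIC_MODEL
  unset ANTHROPIC_DEFAULT_OPUS_MODEL ANTHROPIC_DEFAULT_SONNET_MODEL ANTHROPIC_DEFAULT_HAIKU_MODEL
  unset CLAUDE_CODE_SUBAGENT_MODEL
}
```

This does not touch settings.json, so an env block in the user-level or project file still needs to be removed by hand.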

🧭 Where to go from here
  • Start simple: wire up Ollama first. It’s free and private, and a mistake costs nothing while you learn the switch mechanics.
  • Choosing between backends? Part 4 compares cost, privacy and quality so the config you paste is the right one.
  • Local model underpowered? Part 1 matches models to your RAM, and Part 2 gives the same local stack a VS Code home.
  • Next in the series: give the stack tools beyond the repo — Part 6 adds MCP servers without breaking the privacy story.

Frequently asked questions

Do I need a separate Claude Code install for each provider?
No. One install works for everything. The backend is chosen by environment variables at launch, so switching providers is just changing ANTHROPIC_BASE_URL and the auth token, then opening a fresh terminal.
Does my API key go in ANTHROPIC_API_KEY or ANTHROPIC_AUTH_TOKEN?
Third-party endpoints read ANTHROPIC_AUTH_TOKEN. Put the provider's key there and set ANTHROPIC_API_KEY to an empty string so a leftover Anthropic key can't conflict. Ollama uses the literal token "ollama" since localhost needs no real key.
How do I switch back to the real Claude models?
Unset ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN and any model overrides (or remove the env block from settings.json), open a new terminal, and Claude Code goes back to your Anthropic login or API key as if nothing happened.
Why does my swapped-in model describe edits instead of making them?
That's a tool-calling gap. Claude Code drives everything through tool calls, so the backend model must support them well. Pick a model the provider markets for agentic coding, such as a coder or tool-capable variant, rather than a plain chat model.
Can different projects use different backends?
Yes. Claude Code also reads a per-project .claude/settings.json, so one repo can pin GLM while another stays on Anthropic. Project settings win over your user-level file.

References

  1. Ollama — Claude Code integration, official docs
  2. DeepSeek API — Integrate with Claude Code, official guide
  3. Z.AI developer docs — Claude Code setup
  4. Kimi API Platform — agent support (Claude Code)
  5. free-claude-code — open-source translation proxy (GitHub)
  6. Claude Code — settings documentation
Written by Sukhveer Kaur, Software Developer & AI Engineer

Sukhveer is a software developer specialising in AI systems and backend engineering. She has hands-on experience designing agentic AI applications, working with large language model pipelines, autonomous agent frameworks, and cloud-native services in Java and Python. At InfoWok, she bridges the gap between cutting-edge AI research and practical implementation — helping developers understand and apply emerging technologies through clear, experience-backed writing.
