Connect Claude Code to Any Model: 6 Exact Configs (2026)

Connect Claude Code to Ollama, DeepSeek, GLM, Kimi, NVIDIA NIM or OpenRouter: copy-paste env vars, settings.json blocks, model mapping and quick fixes.

Sukhveer Kaur

Published July 5, 2026

5 min read


Local AI, Zero Cost — Part 5. Part 4 separated the viral hype from the truth: the CLI is free, the Claude models aren’t, and one URL decides who answers. This part is the hands-on half — how to connect Claude Code to each backend, with exact copy-paste configs.

Part 4 told you which backend fits your budget and privacy needs. This tutorial wires it up. Every config below comes from the provider’s own docs, not a video description box. You get the env vars, the settings.json blocks that make them stick, the model-mapping variables nobody mentions, and the proxy step that NVIDIA NIM and OpenRouter truly require. If you want to connect Claude Code to a new engine without guesswork, this is the page to keep open.

Fifteen minutes from now, claude in your terminal will be talking to whichever engine you picked — and you’ll know how to switch back.

🎯 Key takeaways

Three variables do all the work: ANTHROPIC_BASE_URL (where requests go), ANTHROPIC_AUTH_TOKEN (the provider’s key), and ANTHROPIC_MODEL (which model answers) — set them, open a fresh terminal, done.
Four backends connect directly (Ollama, DeepSeek, GLM, Kimi expose Anthropic-style endpoints); NVIDIA NIM and OpenRouter speak the OpenAI style, so they need a small local translator proxy.
Put the config in ~/.claude/settings.json to survive new terminals — and in a per-project .claude/settings.json to give each repo its own backend.

🟢 Beginner · ⏱️ 15–20 min · Stack: Claude Code CLI + one of: Ollama, DeepSeek, Z.ai GLM, Moonshot Kimi, NVIDIA NIM, OpenRouter

✅ Before you start

Claude Code installed (curl -fsSL https://claude.ai/install.sh | bash) — no subscription needed for third-party backends
An account/API key for the provider you chose — Part 4 compares them on cost, privacy and quality
Comfort editing a JSON file and pasting terminal commands

The Three Variables That Do Everything

Claude Code reads its destination from environment variables at launch. That’s the entire mechanism — no forks, no plugins, no patched binaries.

ANTHROPIC_BASE_URL: the server that receives every request. Default: Anthropic. Change it, change the engine.
ANTHROPIC_AUTH_TOKEN: the credential sent to that server. Third-party endpoints read this one. Also set ANTHROPIC_API_KEY="" so a leftover Anthropic key can’t interfere.
ANTHROPIC_MODEL: which of the provider’s models answers. Finer mapping exists too. ANTHROPIC_DEFAULT_OPUS_MODEL, ANTHROPIC_DEFAULT_SONNET_MODEL and ANTHROPIC_DEFAULT_HAIKU_MODEL translate Claude Code’s internal tier names into provider model IDs, and CLAUDE_CODE_SUBAGENT_MODEL picks a cheaper model for background subagents.

Bottom line: if you understand these three variables, every provider section below is just different values.
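Before launching, it can help to see exactly what a new session would inherit. The sketch below is an illustration only: `show_backend` is a hypothetical helper name, and it prints the three variables with the token masked so the output is safe to paste into a bug report.

```shell
# Print the routing variables a freshly launched `claude` would inherit.
# show_backend is a hypothetical helper, not part of Claude Code.
show_backend() {
  printf 'ANTHROPIC_BASE_URL=%s\n' "${ANTHROPIC_BASE_URL:-<unset, Anthropic default>}"
  if [ -n "${ANTHROPIC_AUTH_TOKEN:-}" ]; then
    # Show only the length so the secret never reaches the terminal log.
    printf 'ANTHROPIC_AUTH_TOKEN=<set, %s chars>\n' "${#ANTHROPIC_AUTH_TOKEN}"
  else
    printf 'ANTHROPIC_AUTH_TOKEN=<unset>\n'
  fi
  printf 'ANTHROPIC_MODEL=%s\n' "${ANTHROPIC_MODEL:-<unset, default model>}"
}
```

Run `show_backend` in the same terminal you are about to start `claude` from. It only reflects shell exports, so values that come from a settings.json file will not appear here.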

🔑 Key point

Environment variables load when Claude Code starts. Edits never reach an already-open session — close it and open a fresh terminal after every config change. This one habit prevents most “it’s not working” moments.

Make It Permanent: settings.json

Exports die with the terminal. For a setup you’ll keep, put the same values in ~/.claude/settings.json — Claude Code applies its env block to every session (settings docs):

json

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "your-provider-api-key",
    "ANTHROPIC_API_KEY": ""
  }
}

Claude Code also reads a per-project .claude/settings.json, and project values win. That means one repo can run on GLM while your client work stays on Anthropic — no juggling exports.

Bottom line: shell exports are for trying a backend; settings.json is for keeping it.
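For the per-project case, a small helper can create the file without clobbering one that already exists. This is a minimal sketch under stated assumptions: `write_project_settings` is a hypothetical name, and it assumes the URL and key contain no quotes or backslashes, since it does no JSON escaping.

```shell
# Create <project>/.claude/settings.json with an env block.
# Usage: write_project_settings <project-dir> <base-url> <auth-token>
# Hypothetical helper; values are written as-is, with no JSON escaping.
write_project_settings() {
  local dir="$1" url="$2" token="$3"
  local file="$dir/.claude/settings.json"
  # Refuse to overwrite: an existing file may hold settings worth keeping.
  if [ -e "$file" ]; then
    echo "refusing to overwrite $file; merge the env block by hand" >&2
    return 1
  fi
  mkdir -p "$dir/.claude" || return 1
  cat > "$file" <<EOF
{
  "env": {
    "ANTHROPIC_BASE_URL": "$url",
    "ANTHROPIC_AUTH_TOKEN": "$token",
    "ANTHROPIC_API_KEY": ""
  }
}
EOF
}
```

A call such as `write_project_settings . https://api.z.ai/api/anthropic "your-provider-api-key"` writes the same block shown above into the current repo. Because the file then holds a key, keep it out of version control if the repo is shared.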

Connect Claude Code to Each Provider

Pick your backend and paste. Each block is the provider’s own published config.

Ollama: local, $0, private

Since v0.14, Ollama speaks the Anthropic API natively on port 11434. Easiest path:

bash

ollama launch claude

Manual equivalent, if you want control:

bash

export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=""
export ANTHROPIC_BASE_URL=http://localhost:11434
claude --model qwen3.5

Any tool-capable model you’ve pulled works after --model. For real repositories, raise the context window to 64k+ in Ollama’s settings — agentic sessions eat context. Part 1 helps you pick a model your RAM can hold.
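A quick preflight avoids pointing Claude Code at a server that is not running. The sketch below assumes `curl` is installed and uses Ollama's `/api/tags` endpoint, which lists locally pulled models; `ollama_ready` is a hypothetical helper name.

```shell
# Return success only if an Ollama server answers at the given base URL.
# ollama_ready is a hypothetical helper; it assumes curl is on the PATH.
ollama_ready() {
  local base="${1:-http://localhost:11434}"
  # --fail turns HTTP errors into a non-zero exit; --max-time bounds the wait.
  curl --silent --fail --max-time 3 "$base/api/tags" > /dev/null 2>&1
}

if ollama_ready; then
  echo "Ollama is answering; run: claude --model <name>"
else
  echo "Ollama is not answering on port 11434; start it first" >&2
fi
```

This checks reachability only. It does not confirm that the model you pass to `--model` has been pulled or that it supports tool calls.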

DeepSeek: pay-per-token, cents per session

DeepSeek hosts an Anthropic-compatible endpoint and documents the full Claude Code setup, including tier mapping:

bash

export ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic
export ANTHROPIC_AUTH_TOKEN=<your-deepseek-api-key>
export ANTHROPIC_MODEL="deepseek-v4-pro[1m]"
export ANTHROPIC_DEFAULT_HAIKU_MODEL=deepseek-v4-flash
export CLAUDE_CODE_SUBAGENT_MODEL=deepseek-v4-flash

The last two lines are the money-savers. Quick internal calls and subagents run on the flash model; the pro model handles your actual prompts. DeepSeek also auto-maps Claude tier names (opus-prefixed → v4-pro, haiku/sonnet-prefixed → v4-flash) if you skip the mapping vars.

GLM (Z.ai): the coding-plan favorite

Z.ai’s GLM Coding Plan is built to plug into Claude Code. The official docs configure it through settings.json:

json

{
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "your-zai-api-key",
    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic"
  }
}

Prefer automation? npx @z_ai/coding-helper walks through the same setup interactively. Your plan’s GLM model answers by default — no model variable needed to start.

Kimi (Moonshot): long-context specialist

Moonshot exposes an Anthropic-style endpoint and documents Claude Code support directly:

bash

export ANTHROPIC_BASE_URL=https://api.moonshot.ai/anthropic
export ANTHROPIC_AUTH_TOKEN=<your-moonshot-api-key>
export ANTHROPIC_MODEL=kimi-k2.7-code

Using the China platform instead? Swap the base URL to https://api.moonshot.cn/anthropic.

NVIDIA NIM and OpenRouter: free tiers, via a proxy

Here’s the step the viral videos blur past: these two serve the OpenAI-style API, so Claude Code can’t talk to them directly. The bridge is a small translator running on your machine. It shows Claude Code an Anthropic-style endpoint and forwards each request in the provider’s format. The open-source free-claude-code proxy is built for exactly this (it also covers LM Studio and llama.cpp):

Clone and configure the proxy — set the provider (NIM or OpenRouter) and your API key in its config file.
Start it — it prints a localhost address.
Point Claude Code at the proxy:

bash

export ANTHROPIC_BASE_URL=http://localhost:<the-proxy-port>
export ANTHROPIC_API_KEY=""
claude

Model choice lives in the proxy’s config, not in Claude Code. The proxy maps Claude’s tier names to whichever NIM or OpenRouter model you set. Our OpenRouter review covers which :free models are worth mapping.

⚠️ Warning

A proxy adds a moving part. If sessions stall or tools misbehave, check the proxy’s console output first — rate-limit errors from the free tier show up there, not in Claude Code.

When Something Breaks

Config edits seem ignored — the session was already open. Close every Claude Code window and start a fresh terminal. Variables load at launch.
401 / authentication errors — the key sits in the wrong variable. Third-party endpoints want ANTHROPIC_AUTH_TOKEN; keep ANTHROPIC_API_KEY="".
The model narrates edits instead of making them — a tool-calling gap. Switch to a coder or agentic variant of the provider’s lineup. Plain chat models can’t drive Claude Code’s file tools reliably.
Truncated or amnesiac sessions on local models — context window too small. Set 64k+ in Ollama for repository work.
Model-not-found errors — you passed a Claude name to an endpoint that doesn’t know it. Set ANTHROPIC_MODEL (and the mapping trio) to the provider’s own model IDs.
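Several of the failures above can be spotted from the environment alone, before a session starts. The following is a rough sketch, not an official diagnostic: `check_backend_env` is a hypothetical name, and the `claude-*` pattern is only a heuristic for "a Claude model name was left in place".

```shell
# Flag common misconfigurations when a custom base URL is in use.
# Prints one line per problem; returns non-zero if any were found.
# check_backend_env is a hypothetical helper, not part of Claude Code.
check_backend_env() {
  local problems=0
  # No override means the Anthropic default, so there is nothing to check.
  [ -n "${ANTHROPIC_BASE_URL:-}" ] || return 0
  case "$ANTHROPIC_BASE_URL" in
    https://*)
      # Hosted endpoints need a key; local servers and proxies may not.
      if [ -z "${ANTHROPIC_AUTH_TOKEN:-}" ]; then
        echo "remote base URL but ANTHROPIC_AUTH_TOKEN is empty (expect 401s)"
        problems=$((problems + 1))
      fi
      ;;
  esac
  if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is non-empty and may interfere; set it to \"\""
    problems=$((problems + 1))
  fi
  case "${ANTHROPIC_MODEL:-}" in
    claude-*)
      echo "ANTHROPIC_MODEL looks like a Claude name; use the provider's model ID"
      problems=$((problems + 1))
      ;;
  esac
  [ "$problems" -eq 0 ]
}
```

Like any environment check, it sees shell exports only. Values set through settings.json, tool-calling gaps and context-window limits still have to be diagnosed inside a session.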

💡 Tip

Keep one tiny shell script per backend (glm.sh, ollama.sh, anthropic.sh) that exports the right variables and runs claude. Switching engines becomes a one-word decision instead of a config-editing session.
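The same idea also fits in a single function, which some readers may find easier to maintain than several files. This is a sketch under stated assumptions: `use_backend` is a hypothetical name, and `DEEPSEEK_API_KEY`, `ZAI_API_KEY` and `MOONSHOT_API_KEY` are hypothetical variables standing in for wherever you keep each provider's key. The URLs and model IDs are the ones from the provider sections above.

```shell
# Export the variables for one backend in the current shell.
# Usage: use_backend <anthropic|ollama|deepseek|glm|kimi>
# The *_API_KEY variables read below are hypothetical names for your own keys.
use_backend() {
  case "${1:-}" in
    anthropic|ollama|deepseek|glm|kimi) ;;
    *) echo "usage: use_backend <anthropic|ollama|deepseek|glm|kimi>" >&2
       return 2 ;;
  esac
  # Clear the previous backend's values so nothing leaks across switches.
  unset ANTHROPIC_BASE_URL ANTHROPIC_AUTH_TOKEN ANTHROPIC_MODEL \
        ANTHROPIC_DEFAULT_HAIKU_MODEL CLAUDE_CODE_SUBAGENT_MODEL
  case "$1" in
    anthropic)
      return 0 ;;  # no overrides: requests go to Anthropic again
    ollama)
      export ANTHROPIC_BASE_URL="http://localhost:11434"
      export ANTHROPIC_AUTH_TOKEN="ollama" ;;
    deepseek)
      export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
      export ANTHROPIC_AUTH_TOKEN="${DEEPSEEK_API_KEY:-}"
      export ANTHROPIC_MODEL="deepseek-v4-pro[1m]"
      export ANTHROPIC_DEFAULT_HAIKU_MODEL="deepseek-v4-flash"
      export CLAUDE_CODE_SUBAGENT_MODEL="deepseek-v4-flash" ;;
    glm)
      export ANTHROPIC_BASE_URL="https://api.z.ai/api/anthropic"
      export ANTHROPIC_AUTH_TOKEN="${ZAI_API_KEY:-}" ;;
    kimi)
      export ANTHROPIC_BASE_URL="https://api.moonshot.ai/anthropic"
      export ANTHROPIC_AUTH_TOKEN="${MOONSHOT_API_KEY:-}"
      export ANTHROPIC_MODEL="kimi-k2.7-code" ;;
  esac
  # Third-party endpoints read the auth token; blank the API key variable.
  export ANTHROPIC_API_KEY=""
}
```

Running it inside a subshell, as in `( use_backend glm && claude )`, keeps the exports from outliving the session, so your normal terminal stays on whatever it used before. If you authenticate to Anthropic with an API key rather than a login, note that the function blanks `ANTHROPIC_API_KEY` and does not restore it.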

Every Config at a Glance

| Backend | ANTHROPIC_BASE_URL | Auth token | Model setting |
| --- | --- | --- | --- |
| Ollama | `http://localhost:11434` | `ollama` | `claude --model <name>` |
| DeepSeek | `https://api.deepseek.com/anthropic` | DeepSeek key | `deepseek-v4-pro[1m]` + flash mapping |
| GLM (Z.ai) | `https://api.z.ai/api/anthropic` | Z.ai key | plan default |
| Kimi | `https://api.moonshot.ai/anthropic` | Moonshot key | `kimi-k2.7-code` |
| NIM / OpenRouter | local proxy address | key lives in proxy | mapped in proxy config |
| Back to Anthropic | unset everything | Claude login | `/model` in-session |

That last row matters. To connect Claude Code back to the real Claude models, remove the env block (or unset the variables) and open a new terminal. You’re home. Nothing is permanent, nothing is patched — which is exactly why this whole ecosystem works.

🧭 Where to go from here

Start simple: wire up Ollama first. It’s free, private, and low-stakes for learning the switch mechanics.
Choosing between backends? Part 4 compares cost, privacy and quality so the config you paste is the right one.
Local model underpowered? Part 1 matches models to your RAM, and Part 2 gives the same local stack a VS Code home.
Next in the series: give the stack tools beyond the repo — Part 6 adds MCP servers without breaking the privacy story.

Frequently asked questions

Do I need a separate Claude Code install for each provider?

No. One install works for everything. The backend is chosen by environment variables at launch, so switching providers is just changing ANTHROPIC_BASE_URL and the auth token, then opening a fresh terminal.

Does my API key go in ANTHROPIC_API_KEY or ANTHROPIC_AUTH_TOKEN?

Third-party endpoints read ANTHROPIC_AUTH_TOKEN. Put the provider's key there and set ANTHROPIC_API_KEY to an empty string so a leftover Anthropic key can't conflict. Ollama uses the literal token "ollama" since localhost needs no real key.

How do I switch back to the real Claude models?

Unset ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN and any model overrides (or remove the env block from settings.json), open a new terminal, and Claude Code goes back to your Anthropic login or API key as if nothing happened.

Why does my swapped-in model describe edits instead of making them?

That's a tool-calling gap. Claude Code drives everything through tool calls, so the backend model must support them well. Pick a model the provider markets for agentic coding — a coder or tool-capable variant — rather than a plain chat model.

Can different projects use different backends?

Yes. Claude Code also reads a per-project .claude/settings.json, so one repo can pin GLM while another stays on Anthropic. Project settings win over your user-level file.

Written by

Sukhveer Kaur, Software Developer & AI Engineer

Sukhveer is a software developer specialising in AI systems and backend engineering. She has hands-on experience designing agentic AI applications, working with large language model pipelines, autonomous agent frameworks, and cloud-native services in Java and Python. At InfoWok, she bridges the gap between cutting-edge AI research and practical implementation — helping developers understand and apply emerging technologies through clear, experience-backed writing.
