You run your agent, walk away to grab coffee, and come back to a wall of red: GraphRecursionError: Recursion limit of 25 reached without hitting a stop condition. Or worse — no error at all, just an agent quietly calling the same tool forty times and burning through your token budget. Either way, you’ve got a LangGraph agent looping without end, and raising the recursion limit hasn’t made it stop.
This guide is the diagnosis-first cure. We’ll read the error for what it really means and walk the four reasons agents get stuck. Then we fix the true cause — the missing stop condition — instead of papering over it with a bigger number. Every fix is a few lines you can drop into a graph you already have.
- You’ve built at least one LangGraph agent — nodes, edges, and a compiled graph. New to it? Start with LangGraph Tutorial: Build Your First Agent.
- Comfortable reading Python with type hints and decorators
- You know what a tool call is (the model asking to run a function). If not, what AI agents really are sets it up.
- The recursion limit is a smoke alarm, not the fire. Hitting it means your graph never reached a stop condition — raising the number just delays the crash.
- Four causes cover almost every loop: a router that never returns END, a tool result with no “done” signal, a hand-wired cycle with no exit, and a model that can’t see its own tool output.
- The durable fixes are small: route to END correctly, give tools a clear terminal signal, add a loop guard, and keep recursion_limit only as a backstop.
What “Looping” Actually Looks Like
Before fixing anything, read the error properly — it’s telling you more than it seems. LangGraph runs your graph in super-steps, and when the number of steps passes the default limit of 25, it raises GraphRecursionError. A ReAct-style agent spends roughly two steps per round — one to call the model, one to run the tool it asked for — so the default stops you at about a dozen tool calls.
That number is deliberately low. It exists so a runaway agent fails fast instead of running up a bill. So the limit firing is not the bug; it’s the symptom that your graph has no working way to stop. The official docs put it plainly: if you weren’t expecting many iterations, “you likely have a cycle — check your logic for infinite loops.”
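The "about a dozen" figure follows from simple arithmetic, sketched below. It assumes the plain ReAct shape described above: each round costs one model step plus one tool step, and the run needs one last model step to give its answer. `max_tool_rounds` is a hypothetical helper for illustration, not a LangGraph function, and graphs with extra nodes will spend more steps per round.

```python
def max_tool_rounds(recursion_limit: int) -> int:
    """Tool rounds that fit in a step budget for a simple model/tools loop.

    Assumes two steps per round (model, then tools) plus one final
    model step that produces the answer.
    """
    if recursion_limit < 1:
        return 0
    return (recursion_limit - 1) // 2


for limit in (25, 50, 100):
    print(limit, max_tool_rounds(limit))
```

Under these assumptions the default of 25 leaves room for 12 tool rounds, which is plenty for most single-question agents.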
A common quick fix is recursion_limit: 100, and on a genuinely complex graph that's correct. But if the agent is *stuck*, a higher limit just means it runs 100 steps instead of 25 before failing: same bug, bigger bill. Diagnose first; raise the limit only once you know the run is legitimately long.
Bottom line: the recursion limit measures whether your agent can stop, not whether your task is too big.
Why Your LangGraph Agent Loops: The Four Usual Suspects
Almost every LangGraph agent looping problem I’ve debugged traces to one of four causes. Print your message history first — for m in result["messages"]: m.pretty_print() — and the culprit usually announces itself.
- The router never returns END. Your conditional edge always points back to the model. The graph has no exit, so it cycles until the limit fires. This is the single most common cause.
- The tool result has no “done” signal. The model calls a tool, gets back something ambiguous, decides the tool is still relevant, and calls it again. Without a clear success or failure marker, it never concludes the job is finished.
- You hand-wired a cycle with no exit. A literal a → b → a in your edges with no conditional branch out. The docs’ canonical example of this bug is exactly two nodes pointing at each other.
- The model can’t see its own tool output. If your state overwrites the message list instead of appending to it, the tool result never reaches the model, so it re-asks the same thing, forever. This is a missing add_messages reducer.
Bottom line: a loop is a missing stop condition — find which of these four removed it, and the fix follows directly.
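In a long history the pattern can be hard to spot by eye, so it may help to tally tool calls by name and arguments. The sketch below is a hypothetical helper, not part of LangGraph. It assumes you have already collected the tool calls as plain dicts with `name` and `args` keys while walking `result["messages"]`; the sample data is made up.

```python
import json
from collections import Counter


def tally_tool_calls(tool_calls: list[dict]) -> Counter:
    """Count identical (name, args) pairs across a run's tool calls."""
    return Counter(
        (call["name"], json.dumps(call["args"], sort_keys=True))
        for call in tool_calls
    )


# Stand-in data; a real run would gather these from each message's tool_calls.
calls = [
    {"name": "get_weather", "args": {"city": "Paris"}},
    {"name": "get_weather", "args": {"city": "Paris"}},
    {"name": "get_weather", "args": {"city": "paris, france"}},
]

tally = tally_tool_calls(calls)
for (name, args), count in tally.most_common():
    print(f"{count}x {name} {args}")
```

A count above one for identical arguments points at a stubborn repeat, while many near-identical variations of the same call suggest the tool result is too ambiguous for the model to conclude.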
Reproduce the Loop in 20 Lines (No API Key)
Here’s the most useful debugging move of all: you can reproduce the loop with no LLM at all, because the bug lives in the graph’s wiring, not the model. This graph cycles model → tools → model with no way out, and the node bodies are deliberately trivial:
```python
from typing import TypedDict

from langgraph.graph import StateGraph, START


class State(TypedDict):
    steps: int


def model(state: State) -> dict:
    return {"steps": state["steps"] + 1}


def tools(state: State) -> dict:
    return {"steps": state["steps"]}


builder = StateGraph(State)
builder.add_node("model", model)
builder.add_node("tools", tools)
builder.add_edge(START, "model")
builder.add_edge("model", "tools")
builder.add_edge("tools", "model")  # cycle with no exit
graph = builder.compile()

graph.invoke({"steps": 0}, {"recursion_limit": 6})
# GraphRecursionError: Recursion limit of 6 reached without hitting a stop condition
```
Run it and it fails in about six steps — the exact production error, on demand. The cure is one edge: swap the unconditional model → tools for a conditional edge that can reach END.
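```python
from langgraph.graph import END


def should_continue(state: State) -> str:
    # A real agent checks for tool calls; here we stop after three rounds.
    return END if state["steps"] >= 3 else "tools"


# Rebuild the graph with the same nodes; only model's outgoing edge changes.
builder = StateGraph(State)
builder.add_node("model", model)
builder.add_node("tools", tools)
builder.add_edge(START, "model")
builder.add_conditional_edges("model", should_continue)  # can now reach END
builder.add_edge("tools", "model")
graph = builder.compile()

graph.invoke({"steps": 0}, {"recursion_limit": 6})  # -> {'steps': 3}, no error
```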
Bottom line: a loop you can reproduce without an API key is a routing bug — and routing is exactly what the next four fixes repair.
Fix 1: Let the Router Reach END
This fixes the most common loop. A routing function must be able to return END, not just the name of your tools node. Here’s the antipattern — a should_continue that can only ever loop:
```python
# BROKEN: this router never lets the graph finish
def should_continue(state: State) -> str:
    return "tools"  # always routes back; there is no exit
```
The fix is to check whether the model actually asked for a tool, and route to END when it didn’t:
```python
from langgraph.graph import END


def should_continue(state: State) -> str:
    last_message = state["messages"][-1]
    # No tool calls? The model gave its final answer, so stop.
    if not last_message.tool_calls:
        return END
    return "tools"
```
Better still, don’t hand-write this at all. LangGraph ships tools_condition, which already routes to END when the last message has no tool calls — the exact logic above, tested and maintained for you:
```python
from langgraph.prebuilt import tools_condition

builder.add_conditional_edges("model", tools_condition)
builder.add_edge("tools", "model")
```
If you’re on the prebuilt create_agent, this wiring is already done — which is why prebuilt agents rarely loop on routing alone. Reach for tools_condition before you write your own router.
Bottom line: every cycle in your graph needs at least one conditional edge that can reach END.
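That rule can be checked mechanically before you ever run the agent. The sketch below is a hypothetical, framework-free helper: you describe your graph as a dict mapping each node to every target it could route to (including every value a conditional edge might return), and it reports nodes that can never reach END. The `END` constant here is a stand-in string, and the check only covers wiring; it cannot tell whether a router actually returns END at runtime.

```python
END = "__end__"  # stand-in for langgraph.graph.END in this sketch


def nodes_without_exit(edges: dict[str, set[str]]) -> set[str]:
    """Return nodes from which END can never be reached."""
    # Build reversed edges so we can walk backwards from END.
    reverse: dict[str, set[str]] = {}
    nodes = set(edges)
    for source, targets in edges.items():
        nodes |= targets
        for target in targets:
            reverse.setdefault(target, set()).add(source)

    can_exit, frontier = {END}, [END]
    while frontier:
        node = frontier.pop()
        for source in reverse.get(node, ()):
            if source not in can_exit:
                can_exit.add(source)
                frontier.append(source)
    return nodes - can_exit


broken = {"model": {"tools"}, "tools": {"model"}}
fixed = {"model": {"tools", END}, "tools": {"model"}}

print(nodes_without_exit(broken))  # both nodes are trapped in the cycle
print(nodes_without_exit(fixed))   # empty set: every node can reach END
```

An empty result means every node has some path to END, which is necessary for a graph to stop but not sufficient on its own.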
Fix 2: Make Tool Results Say “Done”
Routing can be perfect and the agent will still loop if the model never believes the job is finished. The model decides whether to call a tool again by reading the last result — so that result has to carry a clear terminal signal. Compare these two return values:
```python
from langchain.tools import tool


# AMBIGUOUS: the model can't tell if this means "done" or "try again"
@tool
def book_table(restaurant: str, time: str) -> str:
    """Book a table."""
    return f"{restaurant} {time}"


# CLEAR: the result names success and tells the model to stop
@tool
def book_table(restaurant: str, time: str) -> str:
    """Book a table at a restaurant ONCE. Do not call again after a SUCCESS."""
    confirmation = _reserve(restaurant, time)  # _reserve: your own booking call, not shown
    return f"SUCCESS — booked {restaurant} at {time}, confirmation {confirmation}. Task complete."
```
The second version reduced tool calls from double digits to two in a customer-support agent reported on the LangChain issue tracker — because the model now has new information that changes its reasoning, instead of the same vague string it already saw. Two rules carry most of the weight here:
- Name the outcome in the return value. Lead with SUCCESS — or ERROR — so the terminal state is unmistakable, not implied.
- Tell the model what to do on failure, in the docstring. “If this returns an error, stop and report it to the user” is the line most tools omit, and it’s exactly what stops a retry spiral.
Return failures as a result string, such as "ERROR — city not found, do not retry", rather than letting an exception bubble up into another identical attempt.
Bottom line: the model loops on ambiguity, so make every tool result say plainly whether the task is finished.
Fix 3: Add a Loop Guard for the Stubborn Cases
Some models will re-call a tool with identical arguments even when the result was clear. For those, add a guard that detects repetition and ends the run. The cheapest version checks the last two tool calls and stops if they’re the same.
```python
import json

from langgraph.graph import END


def _signature(message) -> str:
    """A stable fingerprint of a message's tool calls: names + arguments."""
    return json.dumps(
        [(c["name"], c["args"]) for c in message.tool_calls],
        sort_keys=True,
    )


def should_continue(state: State) -> str:
    messages = state["messages"]
    last = messages[-1]
    if not last.tool_calls:
        return END
    # Find the previous message that also made tool calls.
    prior = next(
        (m for m in reversed(messages[:-1]) if getattr(m, "tool_calls", None)),
        None,
    )
    if prior is not None and _signature(prior) == _signature(last):
        return END  # same call twice in a row: break the loop
    return "tools"
```
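If stopping on the very first repeat is too strict (some tools are legitimately retried once), a variation is to count consecutive identical calls and stop at a threshold. The sketch below uses stand-in message objects so it runs without a model; `trailing_repeats` and the `ai` helper are hypothetical names, and the only assumption about real messages is that they expose a `tool_calls` list of dicts with `name` and `args`.

```python
import json
from types import SimpleNamespace


def call_signature(message) -> str:
    """Fingerprint a message's tool calls; empty list if it made none."""
    calls = getattr(message, "tool_calls", None) or []
    return json.dumps([(c["name"], c["args"]) for c in calls], sort_keys=True)


def trailing_repeats(messages) -> int:
    """How many times in a row the latest tool call has been made."""
    with_calls = [m for m in messages if getattr(m, "tool_calls", None)]
    if not with_calls:
        return 0
    latest = call_signature(with_calls[-1])
    count = 0
    for message in reversed(with_calls):
        if call_signature(message) != latest:
            break
        count += 1
    return count


def ai(name, **args):
    """Stand-in for a model message that requests one tool call."""
    return SimpleNamespace(tool_calls=[{"name": name, "args": args}])


tool_result = SimpleNamespace(content="...")  # stand-in tool message
history = [
    ai("search", q="a"), tool_result,
    ai("search", q="b"), tool_result,
    ai("search", q="b"), tool_result,
    ai("search", q="b"),
]
print(trailing_repeats(history))
```

Inside a router you could then return END once `trailing_repeats(state["messages"])` reaches whatever threshold suits the tool, for example 3.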
For a more graceful exit, LangGraph exposes a managed value called RemainingSteps. Add it to your state and you can stop before the hard error fires, returning a partial answer instead of a crash:
```python
from typing import Annotated, TypedDict

from langgraph.graph import END
from langgraph.graph.message import add_messages
from langgraph.managed import RemainingSteps


class State(TypedDict):
    messages: Annotated[list, add_messages]
    remaining_steps: RemainingSteps


def should_continue(state: State) -> str:
    if state["remaining_steps"] <= 2:
        return END  # bail out gracefully with what we have
    last = state["messages"][-1]
    return "tools" if last.tool_calls else END
```
Bottom line: a five-line guard turns an opaque GraphRecursionError into a clean, intentional stop.
Fix 4: Keep recursion_limit as a Backstop
Now — and only now — set the limit. recursion_limit is your seatbelt for the runs that are genuinely long or the loops your guards miss, not your primary fix. Raise it for legitimately complex graphs, and catch the error so a failure is graceful instead of a stack trace:
```python
from langgraph.errors import GraphRecursionError

try:
    result = graph.invoke(inputs, {"recursion_limit": 50})
except GraphRecursionError:
    # The agent didn't converge; degrade gracefully instead of crashing.
    result = {
        "messages": [
            {
                "role": "assistant",
                "content": "I couldn't complete that within the step budget.",
            }
        ]
    }
```
Pair the step budget with a wall-clock timeout for production agents, so a slow loop can’t hang a request even if it stays under the step count. The two limits guard different failure modes — steps bound the work, time bounds the latency.
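For async callers, one way to add the wall-clock bound is `asyncio.wait_for` around the agent call. This is a sketch under assumptions: `run_with_deadline` is a hypothetical helper, and `slow_agent` is a stand-in coroutine playing the role of something like `graph.ainvoke(inputs, {"recursion_limit": 50})`.

```python
import asyncio


async def run_with_deadline(make_call, seconds: float, fallback):
    """Await make_call(), but return fallback if it exceeds `seconds`."""
    try:
        return await asyncio.wait_for(make_call(), timeout=seconds)
    except asyncio.TimeoutError:
        return fallback


async def slow_agent():
    # Stand-in for the real agent call; sleeps to simulate a slow loop.
    await asyncio.sleep(0.2)
    return {"messages": ["done"]}


fallback = {"messages": ["I ran out of time on that request."]}

result = asyncio.run(run_with_deadline(slow_agent, 0.01, fallback))
print(result is fallback)
```

Note that `asyncio.wait_for` cancels the awaited task at the deadline, but blocking synchronous work inside a node is not interrupted by cancellation, so long-running tools may still need timeouts of their own.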
Check your installed version with pip show langgraph once you've ruled out your own graph.
Bottom line: treat the recursion limit as the last line of defense, wrapped in a try/except, not the place you fix loops.
Quick Recap
The whole diagnosis-to-fix path, in one table:
| Symptom in the message log | Cause | Fix |
|---|---|---|
| Model → tools → model, forever, no final answer | Router never returns END | Use tools_condition (Fix 1) |
| Same tool, slightly reworded each time | Tool result has no “done” signal | Lead the return with SUCCESS — / ERROR — (Fix 2) |
| Identical tool + identical args, repeating | Stubborn model | Loop guard or RemainingSteps (Fix 3) |
| Tool result never appears in history | Missing add_messages reducer | Add the reducer to state (Fix 1/diagnose) |
| Long but varied, genuine progress | Not a loop — task is big | Raise recursion_limit, catch the error (Fix 4) |
Conclusion
A looping LangGraph agent is almost never “too complex” — it’s an agent that was never given a way to stop. Once you read GraphRecursionError as “there’s no working exit” rather than “I need a bigger number,” the fix is usually a few lines. Let the router reach END, make tool results announce when they’re done, guard the stubborn repeats, and keep the recursion limit as a seatbelt behind it all.
Start with the message log on your next loop. The pattern you see — same call, vague result, or no END — points straight at which fix you need. You’ll spend a minute on the cure instead of an afternoon on the symptom.
Have you hit a loop that none of these four explained? Tell me what the message history looked like in the comments — the weird ones are the most instructive.
