The Model Is Not Your Agent: Why AI Systems Fail at the Architecture Level

DEV.to author Gursharan Singh just dropped Part 2 of his "AI Agents in Practice" series, and it's doing something refreshing: arguing that your AI agent isn't broken because you picked the wrong model. It's broken because nobody built the right system around it. If you've been watching teams chase frontier models as a fix for unreliable agents, this one's going to sting, in the best way.

The article opens with a contrast that says everything. Same request: "I'd like to cancel order #4471 and get a refund." In Part 1's broken system, an agent confidently refunded Priya even though her order had already shipped. Here's what a properly built system does instead: it reads the actual order status first, sees "shipped," checks the cancellation procedure (which requires the order to be unshipped), refuses the cancel action, and responds with alternatives: start a return when the order arrives, or connect her to a human. Then it stops and waits.
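The "read the status before acting" behavior can be sketched in a few lines. This is a minimal illustration, not Singh's code: the `Order` type, the in-memory `ORDERS` store, the action names, and the rule that only unshipped orders can be cancelled are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    status: str  # e.g. "processing" or "shipped"


# Hypothetical in-memory store standing in for a real order system.
ORDERS = {"4471": Order(order_id="4471", status="shipped")}


def handle_cancel_request(order_id: str) -> dict:
    """Read the order first, then decide whether cancelling is allowed."""
    order = ORDERS.get(order_id)
    if order is None:
        # Nothing to act on: hand off rather than guess.
        return {"action": "handoff_to_human", "alternatives": []}
    if order.status == "shipped":
        # Assumed procedure rule: only unshipped orders can be cancelled.
        return {
            "action": "refuse_cancel",
            "alternatives": ["start_return_on_arrival", "handoff_to_human"],
        }
    return {"action": "cancel_and_refund", "alternatives": []}


decision = handle_cancel_request("4471")
print(decision["action"])  # refuse_cancel
```

The point of the sketch is ordering: the refund path is unreachable until the real order status has been read and checked against the rule.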

The Five-Step Loop That Defines an Agent

The core insight: agents aren't smart models. They're control loops that run a model multiple times, carry state across turns, and use tools to act in the world. Singh breaks the loop into five recognizable steps: observe, decide, act, check, repeat. On each iteration the model picks which step happens next. That's the move. Not a fixed script. The model decides within boundaries the system defines. For contrast: a workflow runs steps the developer wrote in advance; an agent decides each step at runtime. Same parts, different wiring.
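Here is one way the loop could look in code, as a rough sketch rather than a reference implementation. The `decide` callable stands in for a model call, and `scripted_decide`, the two tools, and the step format are all hypothetical names invented for this example.

```python
from typing import Callable


# Hypothetical tools; a real system would call an order service here.
def get_order_status(order_id: str) -> str:
    return {"4471": "shipped"}.get(order_id, "processing")


def cancel_order(order_id: str) -> str:
    return "cancelled"


TOOLS: dict[str, Callable[..., str]] = {
    "get_order_status": get_order_status,
    "cancel_order": cancel_order,
}


def scripted_decide(state: dict) -> dict:
    """Stand-in for the model: picks the next step from the current state."""
    observations = state["observations"]
    if not observations:
        return {"type": "tool", "tool": "get_order_status",
                "arguments": {"order_id": "4471"}}
    if observations[0]["result"] == "shipped":
        return {"type": "respond",
                "message": "Order 4471 has already shipped, so it can't be "
                           "cancelled. I can start a return or connect you "
                           "to a human."}
    if len(observations) == 1:
        return {"type": "tool", "tool": "cancel_order",
                "arguments": {"order_id": "4471"}}
    return {"type": "respond", "message": "Order 4471 is cancelled."}


def run_agent(decide, tools, request: str, max_steps: int = 5):
    state = {"request": request, "observations": []}  # observe: state so far
    for _ in range(max_steps):                        # repeat, with a hard cap
        step = decide(state)                          # decide
        if step["type"] == "respond":
            return step["message"], state             # stopping condition
        tool = tools.get(step["tool"])
        if tool is None:                              # boundary set by the system
            result = "error: tool not allowed"
        else:
            result = tool(**step["arguments"])        # act
        # check: the result is recorded so the next decision can see it
        state["observations"].append({"tool": step["tool"], "result": result})
    return "Step limit reached; handing off to a human.", state


reply, final_state = run_agent(
    scripted_decide, TOOLS, "I'd like to cancel order #4471 and get a refund."
)
print(reply)
```

The model chooses each step, but the system owns the tool list, the step cap, and the exit path. Swapping `scripted_decide` for a real model call would leave `run_agent` unchanged.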

Three Primitives Agents Actually Compose

Singh identifies three practical primitives that agents don't need to invent—they just compose them. MCP (Model Context Protocol) handles acting—standardized tool calls for querying databases, calling APIs, sending emails. RAG handles knowing—retrieval bringing outside knowledge like company policies or eligibility rules into the agent's context when needed. Skills handle reusable procedures—markdown files naming steps, failure modes, and approval rules that get loaded when relevant instead of stuffed into a growing system prompt every turn. The agent decides which primitive applies right now. Often just one. Sometimes none.
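To make the division of labor concrete, here is a toy sketch of the three primitives behind one dispatcher. Nothing here uses a real MCP client, retriever, or skill loader: `call_tool`, `retrieve`, `load_skill`, and the sample policy and skill text are hypothetical stand-ins, and the keyword match is a placeholder for actual retrieval.

```python
# Assumed sample data, invented for this example.
POLICIES = {
    "cancellation": "Orders can be cancelled only before they ship.",
    "returns": "Returns can be started once the order arrives.",
}
SKILLS = {
    "cancel_order": "1. Read order status.\n2. Refuse if shipped.\n"
                    "3. Offer a return or a human handoff.",
}


def call_tool(name: str, arguments: dict) -> str:
    """Acting: stand-in for a standardized tool call (MCP's role)."""
    if name == "get_order_status":
        return {"4471": "shipped"}.get(arguments["order_id"], "unknown")
    raise ValueError(f"unknown tool: {name}")


def retrieve(query: str) -> list[str]:
    """Knowing: stand-in for RAG, using a naive keyword match."""
    return [text for topic, text in POLICIES.items() if topic in query.lower()]


def load_skill(name: str) -> str:
    """Procedures: load a skill only when it is relevant."""
    return SKILLS[name]


def apply_primitive(choice: dict):
    """Route one decision to a primitive. Often just one. Sometimes none."""
    kind = choice.get("primitive")
    if kind == "tool":
        return call_tool(choice["name"], choice["arguments"])
    if kind == "knowledge":
        return retrieve(choice["query"])
    if kind == "skill":
        return load_skill(choice["name"])
    return None  # no primitive needed for this turn


print(apply_primitive({"primitive": "tool", "name": "get_order_status",
                       "arguments": {"order_id": "4471"}}))
print(apply_primitive({"primitive": "knowledge",
                       "query": "What is the cancellation policy?"}))
```

Because the skill text is loaded on demand, it stays out of the system prompt until the agent decides it applies.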

Manual ReAct vs Native Tool Calling

Here's where things get spicy for builders. Manual ReAct treats the model's output as text that your code parses with regex. That's great for demos and brittle in production: when capitalization or phrasing drifts, the pattern misses it. Every behavior rule lives in the prompt as English, competing with every other instruction. Native tool calling is the production move: tools are defined as structured schemas passed as API parameters instead of prose, and the model returns a structured object your app reads directly, such as {"tool": "cancel_order", "arguments": {"order_id": "4471"}}. No regex. Format rules disappear from the prompt. Policy enforcement moves to structural boundaries rather than buried English sentences.
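The brittleness is easy to reproduce. The sketch below contrasts the two styles under stated assumptions: the `Action: tool[arg]` text format, the `ACTION_PATTERN` regex, and the `TOOL_SCHEMAS` table are invented for illustration, and a JSON string stands in for the structured object that a provider SDK would normally hand back already parsed.

```python
import json
import re

# Manual ReAct style: the app hopes the model keeps to this exact text format.
ACTION_PATTERN = re.compile(r"Action: (\w+)\[(\w+)\]")


def parse_react(text: str):
    match = ACTION_PATTERN.search(text)
    return (match.group(1), match.group(2)) if match else None


# Native style: each tool declares its argument names up front (hypothetical schema).
TOOL_SCHEMAS = {"cancel_order": {"order_id"}}


def parse_tool_call(payload: str):
    call = json.loads(payload)
    name, arguments = call["tool"], call["arguments"]
    if name not in TOOL_SCHEMAS:
        raise ValueError(f"unknown tool: {name}")
    if set(arguments) != TOOL_SCHEMAS[name]:
        raise ValueError(f"unexpected arguments for {name}: {sorted(arguments)}")
    return name, arguments


print(parse_react("Action: cancel_order[4471]"))    # ('cancel_order', '4471')
print(parse_react("action: Cancel Order [4471]"))   # None: drift breaks the regex
print(parse_tool_call(
    '{"tool": "cancel_order", "arguments": {"order_id": "4471"}}'
))  # ('cancel_order', {'order_id': '4471'})
```

In the regex version a formatting drift fails silently with `None`. In the structured version an unknown tool or a wrong argument raises at the boundary, which is where a policy check can also live.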

The Line That Actually Defines an Agent

Singh draws a clean distinction worth internalizing: chatbots are reply-only with no tools or control loop; workflows have a controller deciding next steps while the model does work inside predefined steps; agents let the model decide each step at runtime. The failure mode difference matters—chatbots make things up when they don't know, workflows miss edge cases their branching didn't anticipate, and agents make confident-and-wrong decisions when boundaries aren't properly designed. "The line is not 'smart vs dumb.' The line is who decides what happens next—and how much room the system gives the model to be wrong."

Key Takeaways

An agent is a control loop with tools, knowledge, and a stopping condition: observe → decide → act → check → repeat. The model chooses the step; the system gives it room and limits.
Agents compose MCP for acting, RAG for knowing, and Skills for following reusable procedures—deciding when to use which primitive is the central agent move.
What makes something an agent isn't how smart the model is. It's what the system lets the model decide—and whether you've built boundaries that prevent confident mistakes.

The Bottom Line

This article should've been required reading six months ago for every team that shipped a "smart" customer service bot and spent Q1 firefighting edge cases. The AI agent discourse keeps chasing bigger models when the real bottleneck is architectural. Part 3 promises to open up the loop and treat state, stopping conditions, and context as production engineering problems, which, based on this installment's quality, might finally get teams to look beyond the model card.
