Agentic AI

Bottom line
Agentic AI is a system where a large language model plans multi-step actions, calls external tools, observes the results, and iterates until a goal is reached. Unlike a single-shot chatbot, the model itself acts as the controller, deciding what to do next on each loop rather than following hardcoded steps.

What agentic AI actually is

Agentic AI is a system in which a large language model is given a goal, a set of tools, and the freedom to decide what to do next on every step. The model writes a plan, picks a tool, reads the result, and then decides whether to call another tool, revise the plan, or return a final answer. The LLM is not just a reply generator. It is the controller.

That is the line that separates an agent from a regular chatbot. A chatbot answers the message in front of it. An agent owns a goal across many turns, many tool calls, and sometimes many sub-agents. Concretely, an agent loop looks like this: receive a goal, generate a thought, pick a tool, observe the tool's output, decide whether the goal is met, and repeat.

The pattern only works because modern LLMs got reliable at structured function calling. Once a model can emit a clean JSON tool call and read the response back into its context, you can wrap that exchange in a loop and let the model drive. Everything else in the agentic stack is plumbing around that loop: budget caps, memory, retries, sub-agent handoff, and guardrails.
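A minimal sketch of that single exchange, assuming the model emits a JSON object with `name` and `arguments` keys. The tool `get_order_status`, the registry `TOOLS`, and the order ID are hypothetical stand-ins, not part of any specific vendor API:

```python
import json

# Hypothetical tool: a real implementation would query an order system.
def get_order_status(order_id: str) -> dict:
    return {"order_id": order_id, "status": "shipped"}

# Registry mapping tool names the model may emit to Python callables.
TOOLS = {"get_order_status": get_order_status}

def run_tool_call(raw: str) -> str:
    """Parse one JSON tool call from the model and return the result as JSON text."""
    call = json.loads(raw)                # the model's structured output
    func = TOOLS[call["name"]]            # look up the requested tool
    result = func(**call["arguments"])    # execute it with the model's arguments
    return json.dumps(result)             # text to append to the model's context

# Assumed shape of the model's output for one tool call.
model_output = '{"name": "get_order_status", "arguments": {"order_id": "A-1001"}}'
observation = run_tool_call(model_output)
print(observation)  # {"order_id": "A-1001", "status": "shipped"}
```

Wrapping `run_tool_call` in a loop that feeds each observation back to the model is what turns this exchange into an agent.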

The think-act-observe loop (ReAct pattern)

The dominant pattern in production agents is ReAct, introduced by Yao et al. in the 2022 paper "ReAct: Synergizing Reasoning and Acting in Language Models" (arXiv 2210.03629). ReAct interleaves natural-language reasoning ("Thought:") with tool calls ("Action:") and tool results ("Observation:") inside the model's own output stream. The model literally writes out its reasoning before each action and then reads the observation before deciding the next step.

Yao et al. reported that interleaving reasoning and acting outperformed action-only baselines on HotpotQA, FEVER, ALFWorld, and WebShop, and that on the question-answering tasks the strongest results came from combining ReAct with chain-of-thought prompting. The result was influential enough that almost every modern agent framework is some variant of it.

Today the loop looks roughly like this:

  1. The user submits a goal.
  2. The model emits a thought, then a tool call.
  3. The runtime executes the tool and writes the result back into the conversation as an observation.
  4. The model reads the observation, emits another thought, and either calls another tool or returns a final answer.
  5. Throughout, the runtime enforces budget limits: max steps, max tokens, max wall-clock time.
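The steps above can be sketched as a small runtime. This is an illustration under stated assumptions, not any framework's implementation: `scripted_model` stands in for a real LLM call, `search_kb` is a hypothetical tool, and the `Action: name[argument]` text format with `Finish[...]` as the terminating action is one assumed convention. Only the step cap is enforced here; token and wall-clock limits would sit in the same place.

```python
import re

def search_kb(query: str) -> str:
    """Hypothetical tool: returns a canned knowledge-base snippet."""
    return "Refunds are available within 30 days of purchase."

TOOLS = {"search_kb": search_kb}

# Assumed output format: "Action: tool_name[argument]"; "Finish[...]" ends the run.
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*)\]")

def scripted_model(transcript: str) -> str:
    """Stand-in for an LLM call: acts once, then answers from the observation."""
    if "Observation:" not in transcript:
        return ("Thought: I should check the refund policy.\n"
                "Action: search_kb[refund policy]")
    return ("Thought: The observation answers the question.\n"
            "Action: Finish[Refunds are available within 30 days.]")

def react_loop(goal, model, max_steps=5):
    transcript = f"Goal: {goal}\n"                 # step 1: the user's goal
    for _ in range(max_steps):                     # step 5: step budget
        output = model(transcript)                 # step 2: thought + action
        transcript += output + "\n"
        match = ACTION_RE.search(output)
        if match is None:
            transcript += "Observation: could not parse an action.\n"
            continue
        name, arg = match.groups()
        if name == "Finish":                       # step 4: final answer
            return arg
        tool = TOOLS.get(name)
        observation = tool(arg) if tool else f"unknown tool '{name}'"
        transcript += f"Observation: {observation}\n"   # step 3: write result back
    return "Stopped: step budget exhausted."

print(react_loop("Can I get a refund?", scripted_model))
```

Because the transcript is the only state, swapping `scripted_model` for a real model call leaves the loop unchanged.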

Frameworks differ mainly in how much structure they impose around this loop. LangGraph (now the de facto Python and TypeScript standard in 2026) models the agent as an explicit state graph with checkpointers, interrupts, and human-in-the-loop nodes. The Anthropic Claude Agent SDK (renamed from the Claude Code SDK in September 2025) gives the agent a computer, a file system, and a subprocess runner, then lets the model drive. The OpenAI Agents SDK, CrewAI, and LlamaIndex agents all sit on similar primitives with different ergonomics.

Why agentic AI matters for AI chatbots

For a customer-support chatbot, agentic AI unlocks tasks that a single LLM call cannot finish on its own. A subscription cancellation might require looking up the customer, checking their plan, calling the billing API, confirming the action with the user, and writing back a confirmation. That is a five-step task. A normal chatbot answers a question. An agent finishes a job.

The cost is real. Each step in an agent loop is a full LLM call with the entire growing conversation in context. A ten-step agent therefore costs at least ten times as much as a single-call chatbot in money and latency, and usually more: because each turn re-sends all prior context, cumulative input tokens grow roughly quadratically with the number of steps. This is also where the ugly failure modes live.
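A quick back-of-the-envelope calculation makes the growth concrete. The numbers are illustrative assumptions only: a 1,000-token starting prompt and 500 tokens added to the context per step (thought, tool call, and observation combined).

```python
def cumulative_input_tokens(steps: int, base: int = 1000, added_per_step: int = 500) -> int:
    """Total input tokens billed across all steps when every call re-sends the full context."""
    total = 0
    context = base
    for _ in range(steps):
        total += context            # this call sends the whole context so far
        context += added_per_step   # the context grows before the next call
    return total

# Closed form: steps * base + added_per_step * steps * (steps - 1) / 2
for n in (1, 5, 10, 20):
    print(n, cumulative_input_tokens(n))
# 1 1000
# 5 10000
# 10 32500
# 20 115000
```

Under these assumptions a ten-step run sends 32.5 times the input tokens of a single call, and doubling the run to twenty steps multiplies the total by about 3.5 rather than 2.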

Common failure modes:

  • Tool-call loops. The model picks the same tool over and over because each observation does not actually move the goal forward. Without a max-steps cap, the agent burns the budget.
  • Hallucinated tool names. The model invents a function that does not exist. Strict JSON schemas and AI guardrails at the runtime layer catch these before they execute.
  • Runaway spending. Without a token cap or a cost ceiling per session, a confused agent can spend more on one conversation than a month of normal chat.
  • Prompt drift. Long agent loops dilute the original system prompt, and the model starts ignoring its guardrails. Periodic re-injection of the system prompt helps counter this.
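Two of these failure modes, hallucinated tool names and tool-call loops, can be caught with simple runtime checks before anything executes. The sketch below is a minimal illustration: the tool names, the `TOOL_SCHEMAS` mapping, and the three-call window are all hypothetical choices, and a production system would likely validate against full JSON schemas.

```python
import json
from typing import Optional

# Hypothetical registry: tool name -> set of required argument names.
TOOL_SCHEMAS = {
    "lookup_customer": {"email"},
    "cancel_subscription": {"customer_id"},
}

def validate_call(call: dict) -> Optional[str]:
    """Return an error message for a bad call, or None if it may run."""
    name = call.get("name")
    if name not in TOOL_SCHEMAS:
        return f"unknown tool: {name}"          # hallucinated tool name
    missing = TOOL_SCHEMAS[name] - set(call.get("arguments", {}))
    if missing:
        return f"missing arguments: {sorted(missing)}"
    return None

def is_repeat(call: dict, history: list, window: int = 3) -> bool:
    """Record the call; return True if the last `window` calls were identical."""
    history.append(json.dumps(call, sort_keys=True))
    recent = history[-window:]
    return len(recent) == window and len(set(recent)) == 1

print(validate_call({"name": "refund_everything", "arguments": {}}))
# unknown tool: refund_everything
```

Returning the error string to the model as an observation, rather than raising, gives it a chance to correct itself within the remaining budget.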

Mitigations that actually work in production: hard budget caps (max steps, max tokens, max dollars per session), structured handoff between specialist sub-agents instead of one mega-agent, replayable state via a checkpointer so failed runs can be debugged, and human-in-the-loop interrupts for any irreversible action.
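As a rough sketch of the first and last mitigations, a per-session budget object and an approval gate might look like the following. The class names, default caps, and the list of irreversible tools are assumptions chosen for illustration; real limits depend on pricing and risk tolerance.

```python
from dataclasses import dataclass

class BudgetExceeded(Exception):
    """Raised when a session crosses any of its caps."""

@dataclass
class SessionBudget:
    max_steps: int = 10
    max_tokens: int = 50_000
    max_dollars: float = 0.50
    steps: int = 0
    tokens: int = 0
    dollars: float = 0.0

    def charge(self, tokens: int, dollars: float) -> None:
        """Record one LLM call and stop the run if any cap is exceeded."""
        self.steps += 1
        self.tokens += tokens
        self.dollars += dollars
        if self.steps > self.max_steps:
            raise BudgetExceeded("max steps exceeded")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded("max tokens exceeded")
        if self.dollars > self.max_dollars:
            raise BudgetExceeded("max dollars exceeded")

# Hypothetical set of tools that must pause for a human before running.
IRREVERSIBLE_TOOLS = {"cancel_subscription", "issue_refund"}

def requires_approval(tool_name: str) -> bool:
    return tool_name in IRREVERSIBLE_TOOLS
```

The agent loop would call `budget.charge(...)` after every model call and check `requires_approval(...)` before executing a tool, pausing the run until a person confirms.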

When to use an agent vs. a workflow

Here is the decision the ChatRaj team makes constantly. Workflows are explicit code that orchestrates LLM calls. Agents are LLMs that orchestrate themselves. Workflows are cheaper, more predictable, easier to audit, and simpler to test. Agents are more flexible and handle the long tail, but they cost more.

A workflow looks like: classify the question, route to one of three retrieval pipelines, call the LLM once with the retrieved context, return the answer. You wrote every arrow. The LLM does not get to invent new steps. This covers 90 percent of customer-support traffic.
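In code, that workflow is an ordinary function with a fixed path. The sketch below uses stubs throughout: the keyword classifier, the three canned retrievers, and `call_llm` are hypothetical placeholders for a real classifier, real retrieval pipelines, and a real model call.

```python
def classify(question: str) -> str:
    """Hypothetical keyword classifier; a real one might be an LLM or a trained model."""
    q = question.lower()
    if "refund" in q or "invoice" in q:
        return "billing"
    if "error" in q or "crash" in q:
        return "technical"
    return "general"

# One retrieval pipeline per route; each returns canned context here.
RETRIEVERS = {
    "billing": lambda q: "Billing policy: refunds within 30 days.",
    "technical": lambda q: "Troubleshooting guide: restart the widget.",
    "general": lambda q: "Product overview: features and plans.",
}

def call_llm(question: str, context: str) -> str:
    """Stand-in for the single LLM call made with the retrieved context."""
    return f"Answer to '{question}' using context: {context}"

def answer(question: str) -> str:
    route = classify(question)               # arrow 1: classify
    context = RETRIEVERS[route](question)    # arrow 2: route and retrieve
    return call_llm(question, context)       # arrow 3: one LLM call

print(answer("Can I get a refund?"))
```

Every branch is visible in the source, which is what makes this shape easy to test: each route can be asserted on directly without running a model.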

An agent looks like: here is the goal, here are twelve tools, figure it out. You did not write the path. The LLM picks it on the fly. This is the right shape for triage where you do not know in advance whether the question is a refund, a how-to, or a bug report. It is also the right shape for multi-step lookups across systems and for complex onboarding flows where the next step genuinely depends on the previous answer.

ChatRaj's stance: most chatbots do not need an agent. A clean RAG workflow with strong prompt engineering, a guarded system prompt, and a single function-calling step covers the common cases at a fraction of the cost. Agentic flows earn their keep on triage routing, multi-system lookups, and onboarding. Reach for the agent when the path through the conversation is genuinely unknown until the user speaks. Reach for the workflow when you already know the five steps.

FAQ

Common Agentic AI questions

What is the difference between an agent and a chatbot?

A chatbot answers the message in front of it. An agent owns a goal across many turns, plans a sequence of tool calls, observes results, and iterates until the goal is reached or a budget cap stops it.
