How AI Agents Work: Inside the Core Architecture

Series Part 2 | Level: Beginner | 2026-06-03 — We open the hood on AI agents: agentic loop, ReAct, Chain of Thought, Tree of Thoughts, tool use, and memory systems — all explained from first principles.
▶ Table of Contents (click to expand)
  1. Quick Recap from Part 1
  2. The Three-Beat Rhythm of Every AI Agent
  3. The Agentic Loop — The Heartbeat That Never Stops
  4. Think, Act, Observe — The ReAct Framework
  5. The Reasoning Engine — Chain of Thought and Tree of Thoughts
  6. Tool Use — A 5-Step Process Under the Hood
  7. Memory Systems — How AI Agents Remember
  8. A Real Example — Paris Cycling Trip Planner
  9. Key Takeaways

Quick Recap from Part 1

In Part 1, we answered "What is an AI agent?" — and found that unlike a chatbot that simply responds, an AI agent plans, uses tools, checks results, and drives toward a goal on its own.

Now in Part 2, we go deeper: what's actually happening inside that loop? Let's open the hood and look at the engine.

AI agent internal architecture overview — Brain (LLM), Perception, and Action as three core components

The Three-Beat Rhythm of Every AI Agent

What is the Perception → Decision → Action cycle?

Every AI agent runs on a three-stage cycle — repeated continuously until the task is done.

Stage Name Role
👁️ Perception Collects data from the outside world — text, images, API responses, sensor inputs.
🧠 Decision The LLM combines collected data with domain knowledge to determine what to do next.
Action Executes the chosen step, marks it complete, and moves on to the next.

The Agentic Loop — The Heartbeat That Never Stops

What exactly is an agentic loop?

The agentic loop is the structural backbone of how any AI agent operates. Defined simply: "an LLM that uses tools and repeats a loop based on feedback from the environment."

In code, it looks like this:

memory = [user task]
while llm_should_continue(memory):
    action = llm_get_next_action(memory)   # LLM decides next action
    observations = execute_action(action)   # Execute the action
    memory += [action, observations]        # Accumulate results in memory

What happens inside each loop iteration?

At every iteration, the agent references its entire accumulated memory to decide the next move. It starts with just the user's task, then grows:

[task → action1 → observation1 → action2 → observation2 → ...]

As someone who has spent 25 years managing enterprise networks, this structure feels familiar — it mirrors how a router forwards packets. Incoming packet (input), decide next hop (action), check the response (observation), repeat. The underlying logic is surprisingly universal.

According to Anthropic's guide, controlling the loop is critical: set checkpoints for human feedback, define maximum iteration counts, and build in stopping conditions to prevent runaway loops.

Agentic loop cycle diagram — showing memory accumulation across iterations

Think, Act, Observe — The ReAct Framework

What is the ReAct framework?

ReAct stands for "Reason + Act." It's a framework that alternates between reasoning traces and task-specific actions, treating thinking and doing as complementary — not separate — processes.

The three-step cycle:

  1. Thought: The model analyzes the current situation and writes out its reasoning in natural language. Planning, tracking, updating, handling exceptions — all here.
  2. Action: A concrete step — calling a search API, querying Wikipedia, running a tool.
  3. Observation: Receiving the result of the action, which feeds directly into the next Thought.

The cycle repeats until the agent reaches a final answer.

Does ReAct actually work?

Benchmark Result
ALFWorld +34% over prior methods
WebShop +10% over prior methods
HotPotQA Hallucination and error propagation significantly reduced via Wikipedia API access

The Reasoning Engine — Chain of Thought and Tree of Thoughts

CoT vs ToT — What's the difference?

Aspect Chain of Thought (CoT) Tree of Thoughts (ToT)
Approach Single-track step-by-step reasoning Multiple parallel paths explored simultaneously
Backtracking Not available Available — switches paths when stuck
Best for Sequential, step-solvable problems Complex tasks requiring search and strategy
Game of 24 success rate 4% 74%
💡 Key insight: A 54-billion parameter model given just 8 CoT examples achieved state-of-the-art results on a math benchmark. Better reasoning strategy often beats a bigger model.
Chain of Thought vs Tree of Thoughts comparison diagram — single path vs branching exploration

Tool Use — A 5-Step Process Under the Hood

What happens when an AI agent calls a tool?

Tool use (also called Function Calling) is a structured 5-step conversation between the model and the application.

Step Name Direction What happens
1 Tool definition delivery App → Model App sends available tools list — name, description, parameters.
2 Call decision Model (internal) Model decides which tool to invoke and with what arguments.
3 Call receipt Model → App App receives the tool name and arguments from the model.
4 Logic execution App (internal) App actually runs the function.
5 Result return App → Model Result sent back to model, which generates the final response.

Having run a team-internal LLM service using vLLM on two NVIDIA GPUs, I can say firsthand: complex tasks mean multiple cycles of this loop — each tool result feeding the next decision.


Memory Systems — How AI Agents Remember

What are the three types of agent memory?

AI agent memory is modeled after human memory and comes in three distinct tiers.

Memory Type Role Constraint / Capacity
Sensory Memory Learns embeddings from raw inputs — analogous to raw visual and auditory signals before conscious processing. Input embedding layer
Short-term Memory Active context: conversation history, recent tool outputs, current task state. Hard limit — context window size
Long-term Memory External vector DBs or knowledge graphs. FAISS, HNSW, ScaNN enable fast retrieval at scale. Virtually unlimited storage

How does memory work inside the agentic loop?

Memory is cumulative. Starting from the user's initial task, each action and its observed result gets appended in order. The agent reads this full history every time it needs to decide the next step — which is why long-running agents can become progressively smarter about a specific task as the loop continues.

Three-tier agent memory system — Sensory, Short-term, and Long-term memory comparison

A Real Example — Paris Cycling Trip Planner

Can you show a multi-step task in action?

Here's a concrete example from Hugging Face's Smolagents framework. The user asks: "Plan a one-day cycling tour of Paris."

[Step 1]

  • Thought: I need travel durations. I'll use the get_travel_duration tool.
  • Action: get_travel_duration("Eiffel Tower", "Notre-Dame Cathedral", "bicycling")
  • Observation: "25 minutes"

[Step 2]

  • Thought: Next segment.
  • Action: get_travel_duration("Notre-Dame Cathedral", "Montmartre", "bicycling")
  • Observation: "40 minutes"

[Subsequent steps]

  • Additional destinations added, optimal route assembled.

[Final answer]

  • Eiffel Tower → Notre-Dame (25 min) → Montmartre (40 min) → Luxembourg Gardens → Louvre Museum

What makes this example significant?

The route wasn't predetermined. The agent decided each next step in real time based on what it observed. No hardcoded workflow — just an LLM dynamically adjusting its path based on live results.

That's the difference between a script and an agent.


Key Takeaways

Everything from this post, at a glance.

Concept Core idea Evidence
Perception → Decision → Action The three-stage cycle governing every agent interaction
Agentic Loop Accumulates memory across iterations until goal is reached or stop condition fires
ReAct Think → Act → Observe, repeated ALFWorld +34%
CoT vs ToT Sequential reasoning vs. parallel path exploration 4%74%
Tool Use (5 steps) Define → Decide → Receive → Execute → Return
Memory (3 tiers) Sensory → Short-term (context window) → Long-term (vector DB)
Next, we'll cover the types of AI agents — from simple reactive agents to full multi-agent systems. We'll look at which types fit which situations, and why "a team of agents" sometimes outperforms a single powerful one. If you've ever wondered whether AI agents work better alone or together, Part 3 has your answer.

👤 Author: 20eung (Network Engineer / Self-taught AI Coding Experimenter)

🔗 GitHub Portfolio | isthe.info Blog

📅 First Published: 2026-06-03 | 🔄 Last Updated: 2026-06-03