Building Agents (Tool-Use & ReAct)

The ReAct Loop Over Mock Tools

Strip away the marketing and an "AI agent" is a loop. The model does not magically take actions; your code does. The pattern is ReAct: Reason about what to do, take an Action by calling a tool, Observe the result, then loop until the task is done. The model only ever produces text; the loop around it turns that text into real tool calls and feeds the results back.

The reasoning step is the LLM, and it is non-deterministic, so we cannot unit-test it. But everything around it is deterministic and is exactly what breaks in production: the dispatch, the observation handling, the stop condition. So we test the loop with the model's job replaced by a scripted plan - a fixed list of (tool, args) actions. That is how you actually unit-test an agent: pin the plan, assert the trace.

You will build run_agent(plan, tools, max_steps). plan is a list of steps like ("lookup", {"country": "France"}). tools is a dict mapping a tool name to a Python function that takes the args dict and returns an observation. For each step you call the matching tool and append a (tool, args, observation) triple to a trace - the audit log of what the agent did.
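To make the two inputs concrete, here is a minimal sketch of a mock registry and a scripted plan. The tool names lookup and calc, the CAPITALS table, and the keys inside each args dict are illustrative assumptions, not something the lesson specifies.

```python
# Hypothetical mock data: a tiny table standing in for a real API.
CAPITALS = {"France": "Paris", "Japan": "Tokyo"}

def lookup(args):
    """Mock tool: return the capital of args["country"], or "unknown"."""
    return CAPITALS.get(args["country"], "unknown")

def calc(args):
    """Mock tool: add the two numbers in args["a"] and args["b"]."""
    return args["a"] + args["b"]

# The registry maps each tool name to a function that takes the args dict.
tools = {"lookup": lookup, "calc": calc}

# A scripted plan: the model's job, replaced by a fixed list of steps.
plan = [
    ("lookup", {"country": "France"}),
    ("calc", {"a": 2, "b": 3}),
    ("finish", {}),
]

# Dispatch a single step by hand, the same way the loop does.
action, args = plan[0]
print(tools[action](args))  # Paris
```

Because every tool takes one dict and returns one value, the loop never needs to know what a tool does, only its name. The loop that consumes these inputs follows.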

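def run_agent(plan, tools, max_steps=10):
    trace = []  # audit log of (action, args, observation) triples
    for step in plan:
        # Guard against a runaway agent: stop once the trace is full.
        if len(trace) >= max_steps:
            break
        action = step[0]
        # A step with no args (missing or None) gets an empty dict.
        args = step[1] if len(step) > 1 and step[1] is not None else {}
        if action == "finish":
            break  # a signal, not a tool call, so it is not recorded
        observation = tools[action](args)
        trace.append((action, args, observation))
    return trace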

The rules:

For each step, call tools[action](args) and record (action, args, observation) in order.
A "finish" action ends the loop immediately and is not recorded in the trace (it is a signal, not a tool call).
Stop once the trace already holds max_steps entries, even if the plan has more steps - that guard stops a runaway agent.
If a step has no args, treat it as an empty dict.
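Each rule above can be pinned down as an assertion on the trace. The sketch below repeats a version of the loop so that it runs on its own; the echo and ping tools are hypothetical mocks, and reading "no args" as a missing or None second element is an assumption.

```python
def run_agent(plan, tools, max_steps=10):
    """Copy of the lesson's loop, repeated so this snippet runs alone."""
    trace = []
    for step in plan:
        if len(trace) >= max_steps:
            break
        action = step[0]
        # Assumption: "no args" means a missing or None second element.
        args = step[1] if len(step) > 1 and step[1] is not None else {}
        if action == "finish":
            break
        trace.append((action, args, tools[action](args)))
    return trace

# Hypothetical mock tools used only for these checks.
tools = {
    "echo": lambda args: dict(args),
    "ping": lambda args: "pong",
}

# Rule 1: steps are recorded in order as (action, args, observation).
trace = run_agent([("echo", {"x": 1}), ("ping", {})], tools)
assert trace == [("echo", {"x": 1}, {"x": 1}), ("ping", {}, "pong")]

# Rule 2: "finish" stops the loop and is not recorded.
trace = run_agent([("ping", {}), ("finish", {}), ("ping", {})], tools)
assert trace == [("ping", {}, "pong")]

# Rule 3: max_steps caps the trace even when the plan is longer.
trace = run_agent([("ping", {})] * 5, tools, max_steps=2)
assert len(trace) == 2

# Rule 4: a step with no args is called with an empty dict.
trace = run_agent([("ping",), ("echo", None)], tools)
assert trace == [("ping", {}, "pong"), ("echo", {}, {})]

print("all rules hold")
```

Since the plan is fixed, each assertion compares the whole trace, which is the "pin the plan, assert the trace" approach in miniature.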

In a real app the LLM would emit each next action as JSON and the same loop would run it, so you could swap the scripted plan for live model output without touching the loop. The on-device tutor can drive this very loop live, but grading only ever looks at the deterministic trace. Press Run to watch a two-step plan execute: look up a capital, then do some arithmetic, then finish.
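As a rough illustration of that swap, the sketch below turns JSON action strings into the same (tool, args) steps the loop already consumes. The lesson does not specify a JSON shape, so the "tool" and "args" keys, the parse_action helper, and the sample strings are all assumptions.

```python
import json

def parse_action(raw):
    """Turn one JSON action string into a (tool, args) plan step.

    Assumed shape (not specified by the lesson):
    {"tool": "lookup", "args": {"country": "France"}}
    """
    message = json.loads(raw)
    # A missing or null "args" becomes an empty dict, matching the rules.
    return (message["tool"], message.get("args") or {})

# Stand-in for live model output: a list of hypothetical JSON strings.
model_output = [
    '{"tool": "lookup", "args": {"country": "France"}}',
    '{"tool": "calc", "args": {"a": 2, "b": 3}}',
    '{"tool": "finish"}',
]

plan = [parse_action(raw) for raw in model_output]
print(plan)
```

The resulting plan has the same shape as a scripted one, so it can be passed to run_agent unchanged. This sketch does not handle malformed JSON or unknown tool names.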

Your turn

Write run_agent(plan, tools, max_steps=10) that executes a scripted plan of (tool, args) steps against a mock tool registry. For each step, call tools[action](args) and append (action, args, observation) to a trace list. A "finish" action stops the loop and is not recorded; stop once the trace holds max_steps entries. Return the trace.
