Syllabus Lesson 172 of 239 · Building Agents (Tool-Use & ReAct)

Planning, Reflection & Replanning

A bare ReAct loop that picks one action at a time is fine for a single tool call, but real tasks have several steps: research a topic, then summarize; book a flight, then a hotel. A capable agent does not improvise one action at a time - it makes a plan, works through it, and reflects after each step to decide whether to keep going, change course, or stop. That plan-act-reflect cycle is what lets an agent finish a long task without wandering.

1. Decompose the goal into a plan

Planning means turning a fuzzy goal into an ordered list of subtasks. In production the model writes this plan; here we make it deterministic and keyword-driven so it is testable. Same idea, pinned output:

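def decompose(goal):
    """Turn a goal string into an ordered list of subtasks (keyword-driven)."""
    g = goal.lower()  # case-insensitive substring matching
    if "research" in g or "report" in g:
        return ["gather sources", "extract facts", "write summary"]
    if "trip" in g or "travel" in g:
        return ["pick dates", "book flights", "book hotel"]
    # No keyword matched: fall back to a generic three-step plan.
    return ["understand goal", "do the work", "review result"]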

2. Reflect on each step's result

After a subtask runs, the agent looks at the outcome and makes a control decision. We give it a tiny result dict {"ok": bool, "last": bool} (did the step succeed, and is it the final step) and three possible decisions:

  • "replan" - the step failed. Whatever we tried did not work, so the current plan is suspect.
  • "done" - the step succeeded and it was the last one. The goal is met.
  • "continue" - it succeeded but there is more to do. Move to the next subtask.

Order matters: a failure means "replan" even on the last step (a broken final step is not "done").
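
The three decisions above can be sketched as a small function. This is one minimal way to write it, assuming the result dict always carries both "ok" and "last"; the failure check comes first so that a failed final step is never reported as "done".

```python
def reflect(step_result):
    """Map a step result {"ok": bool, "last": bool} to a control decision."""
    # Check failure first: a broken step means "replan", even if it was the last one.
    if not step_result["ok"]:
        return "replan"
    # The step succeeded; stop only if nothing is left to do.
    if step_result["last"]:
        return "done"
    return "continue"


print(reflect({"ok": True, "last": False}))   # continue
print(reflect({"ok": True, "last": True}))    # done
print(reflect({"ok": False, "last": True}))   # replan
```

Swapping the first two checks would return "done" for a failed last step, which is exactly the bug the ordering rule guards against.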

3. Replan once on failure

The controller ties it together. It walks the plan, calls execute(subtask) for each step, marks whether that step was the last one in the current plan, and reflects. On "continue" it advances to the next subtask; on "done" it stops successfully; on "replan" it swaps in a fresh recovery plan and restarts from that plan's first step - but only up to max_replans times, so a hopeless task cannot loop forever (the loop-guard discipline from earlier). If a step fails after the replans are used up, the controller stops with "failed".

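def replan(goal, failed_step):
    """Build a simple recovery plan: retry the failed step, then wrap up."""
    # goal is unused here; a model-written replan would take it into account.
    return ["retry: " + failed_step, "finish up"]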

execute is a mock here - a function you pass in that returns {"ok": ...} for a subtask - standing in for the real tool calls so the controller is fully deterministic and testable. In a live agent the same controller drives real tools and a model-written plan without changing its logic. Press Run to watch a plan succeed, then watch one fail a step and recover on the replan.
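
To make the controller concrete, here is one possible sketch of run_plan, not the only valid implementation. It repeats decompose, reflect, and replan so the block runs on its own. The mocks flaky_execute and always_fail are hypothetical stand-ins for real tools, named here only for illustration.

```python
def decompose(goal):
    g = goal.lower()
    if "research" in g or "report" in g:
        return ["gather sources", "extract facts", "write summary"]
    if "trip" in g or "travel" in g:
        return ["pick dates", "book flights", "book hotel"]
    return ["understand goal", "do the work", "review result"]


def reflect(step_result):
    if not step_result["ok"]:
        return "replan"
    return "done" if step_result["last"] else "continue"


def replan(goal, failed_step):
    return ["retry: " + failed_step, "finish up"]


def run_plan(goal, execute, max_replans=1):
    plan = decompose(goal)
    trace, replans, i = [], 0, 0
    while i < len(plan):
        subtask = plan[i]
        ok = bool(execute(subtask)["ok"])
        # The controller, not execute, decides whether this is the last step.
        decision = reflect({"ok": ok, "last": i == len(plan) - 1})
        trace.append((subtask, ok, decision))
        if decision == "done":
            return {"status": "done", "trace": trace, "replans": replans}
        if decision == "replan":
            if replans >= max_replans:
                # Loop guard: the replan budget is spent, so give up.
                return {"status": "failed", "trace": trace, "replans": replans}
            replans += 1
            plan = replan(goal, subtask)
            i = 0  # restart from the first step of the recovery plan
            continue
        i += 1
    # Only reachable if a plan is empty, which decompose/replan never return.
    return {"status": "failed", "trace": trace, "replans": replans}


def flaky_execute(subtask):
    # Hypothetical mock: only the original "extract facts" step fails;
    # the "retry: extract facts" step succeeds.
    return {"ok": subtask != "extract facts"}


def always_fail(subtask):
    # Hypothetical mock: every step fails, to exercise the loop guard.
    return {"ok": False}


recovered = run_plan("research agents", flaky_execute)
print(recovered["status"], recovered["replans"])
for row in recovered["trace"]:
    print(row)

hopeless = run_plan("anything", always_fail)
print(hopeless["status"], hopeless["replans"])
```

In the first run the trace records the failed "extract facts" step with decision "replan", followed by the two recovery steps. In the second run the retry fails too, and with max_replans=1 the controller stops instead of replanning again.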

Your turn

Build a planning controller with four functions:

  • decompose(goal) - returns the ordered subtask list: research/report -> ["gather sources", "extract facts", "write summary"]; trip/travel -> ["pick dates", "book flights", "book hotel"]; else -> ["understand goal", "do the work", "review result"].
  • reflect(step_result) - takes {"ok", "last"} and returns "replan" if not ok, else "done" if last, else "continue".
  • replan(goal, failed_step) - returns ["retry: <failed_step>", "finish up"].
  • run_plan(goal, execute, max_replans=1) - walks the plan, calls execute(subtask) (which returns {"ok": bool}), reflects (you set last), records a (subtask, ok, decision) trace, replans up to max_replans times on failure (restarting from the first step of the recovery plan), and returns {"status": "done"|"failed", "trace": [...], "replans": int}.
