
Agents and Planning • Lesson 3

World Models for Agents

20-minute lesson

Learning objectives

  • Connect world models to planning
  • Understand why agents need internal structure
  • See where simulation helps

What Is It?

Agent systems act over time. They read context, call tools, change external state, wait for results, and decide what to do next. That means an agent needs some model of its task environment; it cannot be just a one-shot answer generator.

A world model for agents can be lightweight. It might track goals, tool affordances, dependencies, hidden assumptions, external system state, and the likely consequences of actions. Without that internal structure, the agent becomes forgetful, myopic, or unsafe.

This is why modern agent design increasingly includes planners, memory systems, simulators, and environment-specific state trackers around or inside language models.

How It Actually Works

For practical software agents, the world model often looks like a structured task model.

Component          Example
State tracker      Current repo branch, failing tests, open PR, user goal
Tool model         What each API or command does and what it can break
Transition model   If I edit file X, test Y will likely change
Uncertainty model  What I do not know yet and how to verify it
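
In code, such a task model can be as simple as a structured record. Here is a minimal sketch in Python; every field name and example value is illustrative, not a standard API:

from dataclasses import dataclass, field

@dataclass
class TaskModel:
    # State tracker: what is currently true about the task environment.
    state: dict = field(default_factory=dict)
    # Tool model: what each tool does and what it can break.
    tools: dict = field(default_factory=dict)
    # Transition model: predicted effects of actions on state.
    transitions: dict = field(default_factory=dict)
    # Uncertainty model: open questions and how to resolve them.
    unknowns: list = field(default_factory=list)

model = TaskModel(
    state={"branch": "fix-tests", "failing_tests": ["test_y"], "goal": "green CI"},
    tools={"run_tests": "read-only", "git_push": "mutates remote"},
    transitions={"edit file_x": ["test_y will likely change"]},
    unknowns=["does test_y depend on file_x?"],
)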

1. Maintain state

The agent needs memory of what has already happened and what remains true. This can live in external memory, latent context, or explicit planning data structures.
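
As a hedged sketch (the trace format here is invented for illustration), explicit state maintenance can be as plain as an append-only log plus a snapshot of what is currently true:

# Append-only execution trace plus a mutable snapshot of current state.
trace = []
state = {"branch": "main", "failing_tests": None}

def record(action, observation, updates):
    # Log what happened, then fold the result into current state.
    trace.append({"action": action, "observation": observation})
    state.update(updates)

record("run_tests", "2 failures", {"failing_tests": ["test_a", "test_b"]})
record("git checkout -b fix", "ok", {"branch": "fix"})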

2. Predict consequences

Before taking an action, a strong agent estimates likely outcomes: success paths, failure modes, required follow-ups.
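
One lightweight way to sketch this is a transition lookup that falls back to "verify first" when an action's effect is unknown. The table below is a made-up example, not a real tool specification:

# Estimate likely outcomes before acting; unknown actions demand verification.
transitions = {
    "edit file_x": ["test_y likely changes", "lint may fail"],
    "run_tests": ["no external state changes"],
}

def predict(action):
    # Return predicted consequences, or flag the action as unverified.
    return transitions.get(action, ["unknown effect: verify before acting"])

print(predict("edit file_x"))    # known, safe to reason about
print(predict("drop_database"))  # unknown, should trigger verification first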

3. Use simulated branches

Even shallow lookahead is valuable:

Action A -> likely fast but risky
Action B -> slower but safer
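
A toy scoring sketch makes the comparison concrete; the weights and outcome estimates are assumptions for illustration only:

# Shallow one-step lookahead: score each branch, then pick the best.
branches = {
    "action_a": {"expected_minutes": 2, "failure_risk": 0.4},  # fast but risky
    "action_b": {"expected_minutes": 5, "failure_risk": 0.1},  # slower but safer
}

def cost(outcome, risk_weight=20):
    # Lower is better: time cost plus a heavy penalty for likely failure.
    return outcome["expected_minutes"] + risk_weight * outcome["failure_risk"]

best = min(branches, key=lambda name: cost(branches[name]))
print(best)  # action_b wins once risk is weighted in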

4. Reconcile model with observations

After tool results arrive, the agent updates its state. This is the same perception-prediction loop world models use more broadly.
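
The update can be sketched as a simple reconciliation step, trusting observation over prediction and queueing follow-ups for surprises (all values here are illustrative):

# Reconcile predicted state with what the tool actually reported.
state = {"failing_tests": ["test_y"]}
predicted = {"failing_tests": ["test_y"]}
observed = {"failing_tests": ["test_y", "test_z"]}  # tool result disagrees

unknowns = []
if observed != predicted:
    state.update(observed)                            # observation wins
    unknowns.append("why did test_z start failing?")  # schedule verification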

The practical point is that agency forces modelling. Once a system must do multi-step work in an external environment, internal state and predicted transitions become unavoidable.

The Jargon Decoded

  • Agent loop: Repeated cycle of observe, decide, act, and update.
  • Affordance: What actions a tool or environment makes possible.
  • State tracker: Structured representation of current task reality.
  • Lookahead: Evaluating likely future consequences before acting.
  • Execution trace: Record of actions and observations across an agent run.
  • Belief update: Revising internal state based on new evidence.
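
Putting the vocabulary together, a deliberately toy observe-decide-act-update loop might look like the following; every stub here is invented for illustration:

# Minimal agent loop: observe state, decide, act, update beliefs, repeat.
def agent_loop(goal, state, max_steps=5):
    for _ in range(max_steps):
        if state["done"]:                                          # observe: goal met?
            break
        action = "work" if state["progress"] < goal else "finish"  # decide
        observation = {                                            # act (stubbed tool)
            "progress": state["progress"] + 1,
            "done": action == "finish",
        }
        state.update(observation)                                  # belief update
    return state

print(agent_loop(goal=3, state={"progress": 0, "done": False}))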

Why This Matters

This matters because many real agent failures are world-model failures in disguise: losing track of state, mispredicting tool effects, or failing to update after contradictory evidence.

What This Unlocks

Better agent world models mean fewer loops, better recovery from error, safer action selection, and more credible autonomy in coding, operations, and research workflows.

What Still Breaks

Most current agents still rely heavily on prompt context and brittle heuristics. Persistent state, uncertainty handling, and intervention-sensitive planning are still immature in production agent stacks.

Sources

  • Yann LeCun, Yoshua Bengio, and Geoffrey Hinton, Deep Learning (Nature, 2015)
  • David Ha and Jürgen Schmidhuber, World Models (2018)
  • Julian Schrittwieser et al., Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model (MuZero, Nature, 2020)
  • Danijar Hafner et al., Dream to Control and the later Dreamer lineage (2020 onward)
  • Yann LeCun, A Path Towards Autonomous Machine Intelligence (2022)

Checkpoint questions

  • Why do agents care about internal models?
  • What kinds of planning depend on richer representations?

Exercise

Describe one agent workflow that would improve if it had a stronger internal world model.

Memory recall

Quick quiz

Use retrieval, not rereading. Answer from memory, then check the feedback.

1. Why do agents care about internal models?

2. What kinds of planning depend on richer representations?

3. What is the practical benefit of simulation for agents?
