Lecture 2: Intelligent Agents

AIMA Chapter 2 — 1 hour

Learning Objectives

  • Define agents and environments

  • Understand rationality and performance measures

  • Classify task environments by key properties

  • Describe agent architectures: reflex, model-based, goal-based, utility-based, learning

Agents and Environments

  • Agent: Entity that perceives and acts

  • Environment: Everything outside the agent

  • Percept: Agent’s perceptual input at any instant

  • Percept sequence: Complete history of percepts

The Agent-Environment Loop

Figure: the agent–environment loop

  • Agent receives percepts, chooses actions

  • Environment changes in response to actions

  • Agent’s next percept depends on the new state
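The loop above can be sketched in a few lines of Python. The `Environment`/`Agent` classes and the `percept`, `step`, and `act` method names below are illustrative, not from AIMA's code:

```python
class Environment:
    """A toy one-dimensional world: the state is a single position."""
    def __init__(self, position):
        self.position = position

    def percept(self):
        # The agent's perceptual input at this instant.
        return self.position

    def step(self, action):
        # The environment changes in response to the action.
        self.position += action

class Agent:
    def act(self, percept):
        # Choose an action based on the current percept: move toward 0.
        return -1 if percept > 0 else (1 if percept < 0 else 0)

env = Environment(position=3)
agent = Agent()
for _ in range(5):
    p = env.percept()   # agent receives a percept
    a = agent.act(p)    # agent chooses an action
    env.step(a)         # environment changes; next percept reflects new state

print(env.position)  # → 0
```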

Good Behavior: Rationality

  • Rational agent: For each possible percept sequence, selects action expected to maximize performance measure

  • Performance measure: Criterion for success (e.g., score, safety)

  • Omniscience: Knowing actual outcome of actions — we assume agents don’t have this

Rationality vs. Omniscience

  • Rational ≠ omniscient

  • Rational agent acts on available information

  • Learning: Improve from experience

  • Autonomy: Operate correctly without constant human intervention

Specifying the Task Environment

PEAS framework:

  • Performance measure

  • Environment

  • Actuators

  • Sensors

Example: Autonomous taxi

  • P: Safe, fast, legal, comfortable

  • E: Streets, traffic, pedestrians

  • A: Steering, accelerator, brake, display

  • S: Cameras, sonar, GPS, odometer

Properties of Task Environments

Property                        | Key question
--------------------------------|-----------------------------------------------------
Fully vs. partially observable  | Can the agent sense the complete state?
Single vs. multi-agent          | Are other agents acting in the environment?
Deterministic vs. stochastic    | Is the next state fully determined by state + action?
Episodic vs. sequential         | Does the current choice affect future decisions?
Static vs. dynamic              | Does the environment change while the agent deliberates?
Discrete vs. continuous         | Are states, percepts, and actions finite?
Known vs. unknown               | Is the transition model known to the agent?

Fully vs. Partially Observable

  • Fully observable: Sensors give access to complete state

  • Partially observable: Noisy/incomplete sensors (e.g., poker, medical diagnosis)

  • Unobservable: No sensors — agent must act blindly

Single vs. Multi-Agent

  • Single-agent: Only one agent (e.g., crossword puzzle)

  • Multi-agent: Other agents (competitive or cooperative)

  • Competitive: Zero-sum games

  • Cooperative: Team goals

The Structure of Agents

  • Agent program: Implementation of the agent function

  • Agent function: Maps percept sequences to actions

  • Agent = architecture + program
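In principle, the agent function could be implemented as a literal lookup table over percept sequences (AIMA's table-driven agent). A sketch of that idea, with a toy table invented for illustration:

```python
def make_table_driven_agent(table):
    percepts = []  # the complete percept sequence so far

    def agent(percept):
        percepts.append(percept)
        # Look up the action for the entire percept history.
        return table.get(tuple(percepts))

    return agent

# Toy table mapping percept sequences to actions (invented example).
table = {
    ("dirty",): "suck",
    ("clean",): "move",
    ("clean", "dirty"): "suck",
}

agent = make_table_driven_agent(table)
print(agent("clean"))  # → move
print(agent("dirty"))  # → suck
```

The table grows exponentially with the length of the percept sequence, which is why the architectures below replace it with compact programs.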

Simple Reflex Agents

  • Select action based on current percept only

  • Condition–action rules: if percept then action

  • No memory of past percepts

  • Limitation: Cannot handle partial observability
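For AIMA's two-square vacuum world, a simple reflex agent reduces to a handful of condition–action rules on the current percept alone (a sketch; the percept is assumed to be a `(location, status)` pair):

```python
def reflex_vacuum_agent(percept):
    location, status = percept
    # Condition-action rules: if <condition> then <action>.
    if status == "Dirty":
        return "Suck"
    elif location == "A":
        return "Right"
    else:  # location == "B"
        return "Left"

print(reflex_vacuum_agent(("A", "Dirty")))  # → Suck
print(reflex_vacuum_agent(("A", "Clean")))  # → Right
```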

Model-Based Reflex Agents

  • Maintain internal state to track world

  • Model: How world evolves, effect of actions

  • Update state from: (1) how world changes, (2) how actions affect world

  • Can handle partially observable environments
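A minimal sketch of the model-based idea, again in the vacuum world: the agent's sensor reports only the current square, so it keeps an internal record of which squares it believes are clean (the state-tracking scheme below is illustrative):

```python
def make_model_based_vacuum_agent():
    cleaned = set()  # internal state: squares believed clean

    def agent(percept):
        location, status = percept
        # (1) Update state from how the world is...
        if status == "Clean":
            cleaned.add(location)
        if status == "Dirty":
            # (2) ...and from the predicted effect of our own action:
            # after sucking, this square will be clean.
            cleaned.add(location)
            return "Suck"
        if cleaned >= {"A", "B"}:
            return "NoOp"  # the model says everything is clean
        return "Right" if location == "A" else "Left"

    return agent

agent = make_model_based_vacuum_agent()
print(agent(("A", "Dirty")))  # → Suck
print(agent(("B", "Clean")))  # → NoOp (both squares now believed clean)
```

Unlike the simple reflex agent, this one can stop once its internal state says the job is done, even though no single percept reveals that fact.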

Goal-Based Agents

  • Have goals — desirable states

  • Consider future: “What if I do action A?”

  • Search and planning to achieve goals

  • More flexible than reflex agents
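The "what if I do action A?" question is answered by search: simulate actions against a model instead of executing them. A sketch using breadth-first search over a toy state graph (the map and goal are invented for illustration):

```python
from collections import deque

def plan(start, goal, successors):
    """Return a list of actions leading from start to goal, or None."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:
            return actions
        for action, next_state in successors(state):
            # "What if I do action A?" -- simulate, don't execute.
            if next_state not in visited:
                visited.add(next_state)
                frontier.append((next_state, actions + [action]))
    return None

# Toy map: rooms connected by doors.
edges = {"hall": [("east", "kitchen"), ("south", "study")],
         "kitchen": [("south", "garden")],
         "study": [("east", "garden")],
         "garden": []}

print(plan("hall", "garden", lambda s: edges[s]))  # → ['east', 'south']
```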

Utility-Based Agents

  • Utility function: Maps states to real numbers (degree of happiness)

  • Handles trade-offs (e.g., fast vs. safe)

  • Handles uncertainty (expected utility)

  • Generalizes goal-based agents: a goal is equivalent to a binary utility (high in goal states, low elsewhere)
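Under uncertainty, the rational choice maximizes expected utility: each action's utilities are weighted by the probabilities of its outcomes. A sketch of the fast-vs.-safe trade-off, with invented probabilities and utilities:

```python
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

# Invented numbers: a fast route with a small crash risk vs. a slow sure one.
actions = {
    "highway":    [(0.9, 100), (0.1, -500)],  # EU = 0.9*100 + 0.1*(-500) = 40
    "back_roads": [(1.0, 60)],                # EU = 60
}

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # → back_roads
```

The riskier action has the higher payoff in the best case, yet the expected-utility calculation prefers the certain, moderate outcome.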

Learning Agents

  • Performance element: Selects actions (like previous agents)

  • Learning element: Modifies performance element

  • Critic: Feedback on how well agent is doing

  • Problem generator: Suggests exploratory actions
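The four-component decomposition can be sketched in miniature. Everything below is invented for illustration: the performance element has one tunable parameter, the critic scores each action, and the learning element nudges the parameter after mistakes:

```python
import random

random.seed(0)
threshold = 0.0  # the performance element's single parameter

def performance_element(percept):
    # Selects actions, like the agents above.
    return "act" if percept > threshold else "wait"

def critic(percept, action):
    # Feedback: acting is deemed correct exactly when percept > 0.5.
    return 1.0 if (action == "act") == (percept > 0.5) else -1.0

def learning_element(percept, reward):
    # Modifies the performance element after negative feedback.
    global threshold
    if reward < 0:
        threshold += 0.5 * (percept - threshold)

for _ in range(1000):
    p = random.random()  # random percepts stand in for a problem generator
    a = performance_element(p)
    learning_element(p, critic(p, a))

print(f"learned threshold: {threshold:.2f}")  # drifts toward 0.5
```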

Summary: Agent Types

Agent type         | Key feature
-------------------|------------------------------------
Simple reflex      | Current percept → action
Model-based reflex | Internal state + world model
Goal-based         | Goals + search/planning
Utility-based      | Utility function + expected utility
Learning           | Improves from experience

Questions?

Next lecture: Solving Problems by Searching (Chapter 3)