Lecture 4: Search in Complex Environments
Why Complex Environments Matter
Many problems have huge or continuous state spaces
Actions can be stochastic; we may not see the full state
Online settings: we discover the world while acting
Local search and approximation often beat systematic search
Learning Objectives
Apply local search: hill-climbing, simulated annealing, genetic algorithms
Handle continuous and nondeterministic search spaces
Search in partially observable environments
Design online search agents for unknown environments
Local Search
State space: Set of configurations
Objective: Find goal state or maximize objective function
Local search: Keep single current state, move to neighbors
Complete-state formulation: Every state assigns values to all variables; search moves between complete configurations
Hill-Climbing Search
Move to neighbor with highest value
Greedy: No backtracking
Problems: Local maxima, plateaus, ridges
Variants: Stochastic HC, first-choice HC, random-restart HC
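Greedy hill-climbing can be sketched in a few lines. This is an illustrative toy (the objective, neighbor function, and names are assumptions, not from the slides):

```python
def hill_climb(start, neighbors, value, max_steps=1000):
    """Greedy local search: repeatedly move to the best neighbor.

    Stops at a local maximum: no neighbor improves on the current state."""
    current = start
    for _ in range(max_steps):
        candidates = neighbors(current)
        if not candidates:
            break
        best = max(candidates, key=value)
        if value(best) <= value(current):
            break  # local maximum (or plateau): greedy search cannot escape
        current = best
    return current

# Toy landscape: maximize f(x) = -(x - 7)^2 over the integers,
# with neighbors x - 1 and x + 1. This landscape is unimodal,
# so greedy climbing reaches the global maximum x = 7.
f = lambda x: -(x - 7) ** 2
result = hill_climb(0, lambda x: [x - 1, x + 1], f)  # 7
```

On a landscape with several peaks, the same code would stop at whichever local maximum is uphill from `start`; that is exactly what random-restart hill climbing works around.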
Simulated Annealing
Allow downhill moves with probability decreasing over time
Temperature T: High → explore, Low → exploit
Schedule: T decreases (e.g., T = T × 0.95)
Convergence: Reaches the global optimum with probability approaching 1 if T decreases slowly enough
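A minimal simulated-annealing sketch, with a geometric cooling schedule as above. The landscape, neighbor function, and best-state tracking are illustrative choices, not from the slides:

```python
import math
import random

def simulated_annealing(start, neighbor, value, T0=10.0, cooling=0.95, steps=2000):
    """Accept uphill moves always; accept downhill moves with prob. exp(delta / T)."""
    random.seed(0)                          # fixed seed so the run is repeatable
    current, best, T = start, start, T0
    for _ in range(steps):
        if T < 1e-9:
            break                           # effectively frozen: pure hill climbing now
        nxt = neighbor(current)
        delta = value(nxt) - value(current)
        if delta > 0 or random.random() < math.exp(delta / T):
            current = nxt
        if value(current) > value(best):
            best = current                  # track the best state seen (a common practical variant)
        T *= cooling                        # geometric schedule: T <- 0.95 T
    return best

# Toy problem: maximize f(x) = -(x - 7)^2 over the integers 0..20,
# proposing random +/-1 moves.
f = lambda x: -(x - 7) ** 2
step = lambda x: min(20, max(0, x + random.choice([-1, 1])))
result = simulated_annealing(0, step, f)
```

High T early on lets the walk wander (explore); as T shrinks, downhill acceptances become rare and the search settles into a peak (exploit).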
Local Beam Search
Keep k states (not just one)
Generate all successors, keep best k
Stochastic beam search: Probabilistic selection
Shares information across parallel searches
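The "shared pool of successors" idea can be sketched as follows (toy objective and neighbor function assumed for illustration):

```python
def local_beam_search(starts, neighbors, value, k=3, max_steps=100):
    """Keep the k best states; expand all of them, keep the best k of the pool."""
    beam = sorted(starts, key=value, reverse=True)[:k]
    for _ in range(max_steps):
        pool = set(beam)
        for s in beam:
            pool.update(neighbors(s))       # successors of ALL beam states compete
        new_beam = sorted(pool, key=value, reverse=True)[:k]
        if set(new_beam) == set(beam):
            break                           # no improvement across the whole beam
        beam = new_beam
    return beam[0]

# Three parallel starts; the ones in poor regions are quickly abandoned
# because the pool is ranked jointly. f(x) = -(x - 7)^2 peaks at x = 7.
f = lambda x: -(x - 7) ** 2
best = local_beam_search([0, 20, 40], lambda x: [x - 1, x + 1], f)  # 7
```

This joint ranking is what distinguishes beam search from k independent hill climbers: states starting at 20 and 40 are dropped once successors near 7 dominate the pool.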
Evolutionary Algorithms
Population: Set of individuals (states)
Fitness: Objective function
Selection: Fitter individuals more likely to reproduce
Crossover: Combine two parents
Mutation: Random change
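The pieces above fit together as in this tiny genetic algorithm. Everything problem-specific here is assumed for illustration: the OneMax fitness (count of 1 bits), tournament selection, and the parameter values:

```python
import random

def genetic_algorithm(pop_size=20, length=12, generations=60, p_mut=0.1):
    """Tiny GA for OneMax: maximize the number of 1 bits in a bitstring."""
    random.seed(1)                         # repeatable run (illustrative choice)
    fitness = sum
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        def select():                      # selection: tournament of size 2
            a, b = random.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        elite = max(pop, key=fitness)      # elitism: carry the best individual over
        children = [elite]
        while len(children) < pop_size:
            p1, p2 = select(), select()
            cut = random.randrange(1, length)                # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [bit ^ (random.random() < p_mut) for bit in child]  # mutation
            children.append(child)
        pop = children
    return max(pop, key=fitness)

best = genetic_algorithm()
```

Selection pressure plus crossover drives the population toward the all-ones string; mutation keeps supplying the bits that crossover alone cannot create.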
Local Search in Continuous Spaces
Gradient ascent/descent: Step along ∇f to maximize (or along −∇f to minimize)
Discretization: Grid or random sampling
Constraint satisfaction: Lagrange multipliers, penalty methods
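For a differentiable objective, the gradient step is one line. A minimal sketch on an assumed one-dimensional example:

```python
def gradient_ascent(x0, grad, alpha=0.1, steps=200):
    """Follow the gradient to a local maximum: x <- x + alpha * grad(x)."""
    x = x0
    for _ in range(steps):
        x = x + alpha * grad(x)   # alpha is the step size (learning rate)
    return x

# f(x) = -(x - 3)^2 has gradient f'(x) = -2(x - 3) and its maximum at x = 3.
x_star = gradient_ascent(0.0, lambda x: -2 * (x - 3))
```

With this step size the error shrinks by a constant factor each iteration; too large an `alpha` would overshoot and diverge, which is why step-size schedules matter in practice.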
Nondeterministic Actions
Erratic vacuum: Suck on a dirty square may also clean the adjacent square; on a clean square it may deposit dirt
AND-OR search trees: OR nodes (agent chooses an action), AND nodes (nature chooses the outcome; the plan must cover all of them)
Contingency planning: Plan for possible outcomes
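AND-OR search can be sketched as mutually recursive OR and AND steps. The problem encoding (a dict of functions) and the toy transition model are assumptions for illustration:

```python
def or_search(state, problem, path):
    """OR node: the agent needs ONE action whose plan covers all outcomes."""
    if problem['goal'](state):
        return []                       # empty plan: already at a goal state
    if state in path:
        return None                     # cycle on this path: fail
    for action in problem['actions'](state):
        subplans = and_search(problem['results'](state, action), problem, [state] + path)
        if subplans is not None:
            return [action, subplans]   # contingent plan: [action, {outcome: subplan}]
    return None

def and_search(states, problem, path):
    """AND node: nature picks the outcome, so EVERY outcome needs a subplan."""
    plans = {}
    for s in states:
        plan = or_search(s, problem, path)
        if plan is None:
            return None
        plans[s] = plan
    return plans

# Toy nondeterministic problem: from state 1, 'go' erratically lands in
# state 2 or 3; from state 2, 'go' reliably reaches the goal state 3.
problem = {
    'goal': lambda s: s == 3,
    'actions': lambda s: ['go'],
    'results': lambda s, a: {1: {2, 3}, 2: {3}}[s],
}
plan = or_search(1, problem, [])
```

The returned nested structure is exactly a contingency plan: "do `go`; if you land in 3, stop; if you land in 2, do `go` again".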
Partially Observable Environments
Belief state: Set of possible actual states
Belief-state space: Often exponentially large
Sensorless (conformant) planning: No observations
Contingent planning: Use observations when available
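In the sensorless case, the prediction step is just "apply the action in every state we might be in". A minimal sketch on an assumed two-cell world:

```python
def belief_result(belief, action, result):
    """Sensorless (conformant) prediction: the new belief state is the set
    of results of applying the action in every possible current state."""
    return frozenset(result(s, action) for s in belief)

# Two-cell world, locations only (toy model): Right always ends in B,
# Left always ends in A, regardless of where the agent started.
def move(loc, action):
    return 'B' if action == 'Right' else 'A'

b0 = frozenset({'A', 'B'})              # initial ignorance: could be in A or B
b1 = belief_result(b0, 'Right', move)   # the agent now knows it is in B
```

This shows why sensorless plans can still work: an action can shrink the belief state (here from two states to one) even though nothing is ever observed.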
Online Search
Offline: Know full problem before acting
Online: Discover state while acting
Competitive ratio: Total cost of the online path / cost of the optimal offline solution; can be unbounded (e.g., with irreversible actions or dead ends)
Online Search Agents
LRTA* (Learning Real-Time A*): Update h as we go
Exploration vs. exploitation: Try new states vs. follow known path
Learning: Remember visited states and costs
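A stripped-down LRTA*-style agent, assuming unit step costs and the toy corridor world defined below for illustration:

```python
def lrta_star(start, goal, neighbors, h, max_steps=1000):
    """Act online, learning better heuristic estimates H(s) for states we leave."""
    H, s, path = {}, start, [start]
    for _ in range(max_steps):
        if s == goal:
            return path
        cost = lambda n: 1 + H.get(n, h(n))   # step cost + estimated cost-to-go
        best = min(neighbors(s), key=cost)
        H[s] = cost(best)                     # learning update before moving on
        s = best
        path.append(s)
    return path

# Unknown corridor 0..5 with goal 5 and an uninformed heuristic h = 0.
nbrs = lambda n: [m for m in (n - 1, n + 1) if 0 <= m <= 5]
trip = lrta_star(0, 5, nbrs, lambda n: 0)   # walks 0, 1, 2, 3, 4, 5
```

The `H` updates are what make backtracking out of dead ends finite: a state revisited many times accumulates a higher estimate, so the agent eventually prefers unexplored neighbors (exploration) over retreading known ground (exploitation).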
Summary
Local search: Hill-climbing, simulated annealing, genetic algorithms
Continuous: Gradient descent, discretization
Nondeterministic: AND-OR trees, contingency plans
Partial observability: Belief states
Online: LRTA*, exploration
References
Russell & Norvig, AIMA 4e, Ch. 4
Chapter PDF: chapters/chapter-04.pdf
aima-python: search4e.ipynb