Lecture 5: Adversarial Search and Games
Why Games Matter¶
Clean testbed for search and learning
Superhuman play: chess, Go, poker
Same ideas: search, evaluation, learning
Applications: negotiation, security, design
Learning Objectives¶
Define two-player zero-sum games
Implement minimax and alpha-beta pruning
Design evaluation functions
Understand Monte Carlo Tree Search
Handle stochastic and partially observable games
Game Theory Basics¶
Players: MAX (us), MIN (opponent)
Zero-sum: One wins, one loses (or sum of payoffs = 0)
Perfect information: Both see full state
Deterministic: No chance (e.g., chess, checkers)
Game Tree¶
Initial state: Board position
Actions: Legal moves
Terminal states: Game over
Utility: +1 (MAX wins), -1 (MIN wins), 0 (draw)
Minimax Algorithm¶
MAX chooses move that maximizes utility
MIN chooses move that minimizes utility
Minimax value: Best achievable outcome against optimal play
Minimax decision: Move with highest minimax value
Minimax Algorithm (Pseudocode)¶
function MINIMAX(state) returns action
  return argmax_a MIN-VALUE(RESULT(state, a))

function MAX-VALUE(state) returns utility
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← -∞
  for each a in ACTIONS(state):
    v ← max(v, MIN-VALUE(RESULT(state, a)))
  return v

function MIN-VALUE(state) returns utility
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for each a in ACTIONS(state):
    v ← min(v, MAX-VALUE(RESULT(state, a)))
  return v
Alpha-Beta Pruning¶
Pruning: Skip branches that don’t affect final decision
α: Best value MAX can guarantee so far (lower bound on MAX's outcome)
β: Best value MIN can guarantee so far (upper bound on MAX's outcome)
Cut: When α ≥ β, prune the remaining siblings — neither player will let play reach them
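The cut condition can be made concrete with a minimal Python sketch. It assumes a toy tree representation that is ours, not the lecture's: a node is either a terminal utility (a number) or a list of child nodes, with MAX and MIN alternating by level.

```python
import math

def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Minimax with alpha-beta pruning over a toy game tree."""
    if isinstance(node, (int, float)):      # terminal state: return its utility
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, v)
            if alpha >= beta:               # beta cut: MIN will never allow this branch
                break
        return v
    else:
        v = math.inf
        for child in node:
            v = min(v, alphabeta(child, alpha, beta, True))
            beta = min(beta, v)
            if alpha >= beta:               # alpha cut: MAX already has something better
                break
        return v

# The classic AIMA example tree: its minimax value is 3, and the
# second subtree is cut after its first leaf (2).
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree))  # -> 3
```

Note the pruned branches never change the returned value: alpha-beta computes the same minimax decision, only faster.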
Alpha-Beta: Move Ordering¶
Best-first: Try best moves first
Killer heuristic: Moves that caused cuts before
Transposition table: Cache evaluated positions
Good ordering → more pruning → faster search
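Move ordering itself is just a sort by a cheap heuristic before recursing. A tiny sketch, with hypothetical move labels and scores of our own choosing (a real engine would score captures, checks, and killer moves from its tables):

```python
def order_moves(moves, score):
    """Try moves with higher heuristic scores first; better ordering
    lets alpha-beta establish tight bounds early and cut more branches."""
    return sorted(moves, key=score, reverse=True)

# Hypothetical priorities: captures before checks before quiet moves.
moves = ["quiet", "capture", "check"]
score = {"capture": 2, "check": 1, "quiet": 0}.get
print(order_moves(moves, score))  # -> ['capture', 'check', 'quiet']
```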
Evaluation Functions¶
Cutoff: Don’t search to terminal state
Eval(s): Estimate utility of non-terminal state
Requirements: Must be fast, correlate with winning chances
Example (chess): Material + piece-square tables + mobility
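The material term alone is easy to sketch. This uses the standard pawn-unit piece values; the board encoding (uppercase = White, lowercase = Black) is our simplification, and a real evaluation would add piece-square tables and mobility on top:

```python
# Standard chess material values in pawn units (king omitted).
MATERIAL = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def material_eval(pieces):
    """Evaluate from MAX's (White's) view: sum White material,
    subtract Black's.  `pieces` is any iterable of piece letters."""
    score = 0
    for piece in pieces:
        sign = 1 if piece.isupper() else -1
        score += sign * MATERIAL.get(piece.upper(), 0)
    return score

# White queen + pawn vs Black rook: 9 + 1 - 5 = 5
print(material_eval("QPr"))  # -> 5
```

This satisfies both requirements above: it is O(pieces) fast, and material balance correlates strongly with winning chances.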
Cutting Off Search¶
Depth limit: Fixed depth (e.g., 4 ply)
Iterative deepening: Increase depth until time runs out
Quiescence: Extend search until a “quiet” position (no pending captures) before applying Eval
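Iterative deepening can be sketched as a simple wrapper: redo the search one ply deeper until the clock runs out, keeping the answer from the last completed depth. The `search(root, depth)` callback is an assumption of this sketch (any fixed-depth searcher fits), and for simplicity it does not abort a search that is already in progress when time expires:

```python
import time

def iterative_deepening(root, search, time_limit=0.1):
    """Deepen one ply at a time until the deadline; return the move
    from the last fully completed depth.  `search(root, d)` is assumed
    to return the best move found with a depth-d search."""
    deadline = time.monotonic() + time_limit
    best, depth = None, 0
    while time.monotonic() < deadline:
        depth += 1
        best = search(root, depth)   # each pass redoes shallower work,
                                     # but deeper plies dominate the cost
    return best, depth
```

Because each level costs roughly a branching-factor multiple of the previous one, the repeated shallow searches add little overhead.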
Monte Carlo Tree Search (MCTS)¶
Selection: Traverse tree using UCB
Expansion: Add new node
Simulation: Random playout to end
Backpropagation: Update node statistics
No evaluation function needed!
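The selection step uses the UCB1 formula to balance exploiting a child's average value against exploring under-visited children. A minimal sketch (the `(visits, total_value)` child statistics layout is our own convention):

```python
import math

def ucb1(child_visits, child_value, parent_visits, c=1.4):
    """UCB1 score for MCTS selection: average value plus an
    exploration bonus.  c ≈ sqrt(2) is a common default."""
    if child_visits == 0:
        return math.inf              # always try unvisited children first
    exploit = child_value / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore

def select(children, parent_visits):
    """Pick the child index with the highest UCB1 score.
    `children` is a list of (visits, total_value) pairs."""
    return max(range(len(children)),
               key=lambda i: ucb1(children[i][0], children[i][1],
                                  parent_visits))

# Child 1 has the better average (0.6 vs 0.5), but child 0 is far
# less explored, so its exploration bonus wins the tie-off here.
stats = [(10, 5.0), (100, 60.0)]
print(select(stats, 110))  # -> 0
```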
MCTS: Advantages¶
Asymmetric: Focuses on promising branches
Anytime: Can stop anytime, return best move
Works for: Go, general game playing
AlphaGo: MCTS + neural networks
Stochastic Games¶
Chance nodes: Nature’s turn (dice, cards)
Expectiminimax: Expectation over chance outcomes
Evaluation: Must account for expected value
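Expectiminimax adds one node type to minimax: at a chance node, take the probability-weighted average of the children instead of a max or min. A sketch over a toy tagged-tuple tree (the representation is ours):

```python
def expectiminimax(node):
    """A node is a terminal number, or a tuple (kind, children) with
    kind in {'max', 'min', 'chance'}.  Chance children are
    (probability, subtree) pairs."""
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == 'max':
        return max(expectiminimax(c) for c in children)
    if kind == 'min':
        return min(expectiminimax(c) for c in children)
    # chance node: expectation over nature's outcomes
    return sum(p * expectiminimax(c) for p, c in children)

# A fair coin flip between branches worth 4 and 0 has expected
# value 2, so MAX prefers it to a sure payoff of 1.
tree = ('max', [('chance', [(0.5, 4), (0.5, 0)]), 1])
print(expectiminimax(tree))  # -> 2.0
```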
Partially Observable Games¶
Kriegspiel: Chess where you don’t see opponent’s pieces
Belief state: Set of possible board states
Card games: Hidden information
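Belief-state updates are just filtering: keep only the candidate states consistent with each new observation. A minimal sketch, with a hypothetical card scenario and predicate of our own invention:

```python
def update_belief(belief, observation, consistent):
    """Filter the belief state (a set of candidate true states) to
    those consistent with a new observation."""
    return {s for s in belief if consistent(s, observation)}

# Hypothetical example: the opponent's hidden card is one of three,
# and an observation rules out the ace.
belief = {"A", "K", "Q"}
no_ace = lambda state, obs: not (obs == "no_ace" and state == "A")
print(update_belief(belief, "no_ace", no_ace))  # a set containing 'K' and 'Q'
```

As observations accumulate, the belief state shrinks toward the true state; play then optimizes over the whole set rather than a single board.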
Summary¶
Minimax: Optimal play in deterministic games
Alpha-beta: Prune branches, same result, faster
Evaluation: Heuristic for cutoff, must be fast
MCTS: Simulation-based, no eval needed
Stochastic/partial: Expectiminimax, belief states
References¶
Russell & Norvig, AIMA 4e, Ch. 5
Chapter PDF: chapters/chapter-05.pdf
aima-python: games4e.ipynb
Questions?¶
Next lecture: Constraint Satisfaction Problems (Chapter 6)