Lecture 20: Learning Probabilistic Models
Learning Objectives¶
Learn Bayesian network parameters (ML, Bayesian)
Apply EM for hidden variables
Learn HMM parameters
Learn network structure
Maximum Likelihood¶
Data: D = {x¹,...,xᵐ}, assumed i.i.d.
Likelihood: L(θ) = P(D|θ) = ∏ⱼ P(xʲ|θ)
ML estimate: θ* = argmax_θ L(θ); in practice maximize log L(θ)
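As a minimal sketch, consider ML estimation of a single Bernoulli parameter, in the spirit of AIMA's cherry/lime candy example; the data values below are assumed for illustration:

```python
import numpy as np

# Hypothetical i.i.d. Bernoulli data: 1 = cherry, 0 = lime.
data = np.array([1, 1, 0, 1, 0, 1, 1, 1, 0, 1])

# log L(theta) = n1*log(theta) + n0*log(1 - theta);
# setting the derivative to zero gives the closed-form MLE n1 / m.
theta_ml = data.sum() / len(data)
print(f"theta* = {theta_ml:.2f}")   # 0.70
```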
Bayesian Parameter Learning¶
Prior: P(θ)
Posterior: P(θ|D) ∝ P(D|θ) P(θ)
Predict: P(x|D) = ∫ P(x|θ) P(θ|D) dθ
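For a Bernoulli likelihood with a conjugate Beta prior, both the posterior and the predictive integral have closed forms; a sketch with assumed hyperparameters and the counts from the example above:

```python
# Beta(a, b) prior over the Bernoulli parameter theta (a, b assumed).
a, b = 2.0, 2.0          # weak prior centered at 0.5
n1, n0 = 7, 3            # observed counts of 1s and 0s

# Posterior: P(theta | D) = Beta(a + n1, b + n0).
# Predictive: P(x=1 | D) = integral of theta * posterior = posterior mean.
post_a, post_b = a + n1, b + n0
p_next = post_a / (post_a + post_b)
print(f"P(x=1 | D) = {p_next:.3f}")   # 0.643
```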
EM Algorithm¶
Hidden variables: Z unobserved
E-step: compute P(Z|X,θ) under the current parameters θ
M-step: θ ← argmax_θ′ E_{Z|X,θ}[log P(X,Z|θ′)]
Convergence: likelihood never decreases, so EM reaches a local optimum
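The E/M alternation can be written as a small generic loop; this is an illustrative skeleton (the names e_step, m_step and the convergence test are assumptions, not a library API):

```python
import numpy as np

def em(x, theta0, e_step, m_step, max_iter=100, tol=1e-8):
    """Generic EM skeleton.

    e_step(x, theta) -> responsibilities P(Z | X, theta)
    m_step(x, resp)  -> theta maximizing E_Z[log P(X, Z | theta)]
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        resp = e_step(x, theta)          # E-step: posterior over hidden Z
        new_theta = m_step(x, resp)      # M-step: expected complete-data ML
        if np.max(np.abs(new_theta - theta)) < tol:
            break                        # parameters settled: local optimum
        theta = new_theta
    return theta
```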
EM: Mixture of Gaussians¶
Components: K Gaussians
Hidden: which component generated each point
E: soft assignment of each point to components (responsibilities)
M: update mixing weights, means, and covariances from the responsibilities
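A compact 1-D, two-component instance of these updates; the synthetic data and initialization are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D data drawn from two Gaussians.
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])

K = 2
w = np.full(K, 1.0 / K)                 # mixing weights
mu = rng.choice(x, K, replace=False)    # initial means from the data
var = np.full(K, x.var())               # initial variances

def normal_pdf(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

for _ in range(100):
    # E-step: responsibilities r[i, k] = P(z_i = k | x_i, theta)
    r = w * normal_pdf(x[:, None], mu, var)
    r /= r.sum(axis=1, keepdims=True)

    # M-step: closed-form updates from the weighted sufficient statistics
    nk = r.sum(axis=0)
    w = nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk

print("means:", mu, "variances:", var, "weights:", w)
```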
Learning HMMs¶
Baum-Welch: EM for HMM
Parameters: transition model A, emission (sensor) model B, initial distribution π
E: forward-backward yields expected state occupancies and transitions
M: re-estimate A, B, π from the expected counts
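A minimal sketch of Baum-Welch for a single discrete observation sequence, using a scaled forward-backward pass to avoid underflow; the random initialization and the function signature are illustrative, not a standard API:

```python
import numpy as np

def baum_welch(obs, n_states, n_symbols, n_iter=50, seed=0):
    """Estimate pi, A, B from one discrete observation sequence.

    A[i, j] = P(state j at t+1 | state i at t)
    B[i, k] = P(symbol k | state i)
    pi[i]   = P(initial state i)
    """
    obs = np.asarray(obs)
    rng = np.random.default_rng(seed)
    T = len(obs)
    A = rng.random((n_states, n_states)); A /= A.sum(axis=1, keepdims=True)
    B = rng.random((n_states, n_symbols)); B /= B.sum(axis=1, keepdims=True)
    pi = np.full(n_states, 1.0 / n_states)

    for _ in range(n_iter):
        # E-step: scaled forward pass
        alpha = np.zeros((T, n_states))
        c = np.zeros(T)                    # per-step scaling factors
        alpha[0] = pi * B[:, obs[0]]
        c[0] = alpha[0].sum(); alpha[0] /= c[0]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
            c[t] = alpha[t].sum(); alpha[t] /= c[t]

        # E-step: scaled backward pass
        beta = np.zeros((T, n_states))
        beta[-1] = 1.0
        for t in range(T - 2, -1, -1):
            beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / c[t + 1]

        # Expected state occupancies (gamma) and transitions (xi)
        gamma = alpha * beta
        gamma /= gamma.sum(axis=1, keepdims=True)
        xi = (alpha[:-1, :, None] * A[None, :, :] *
              (B[:, obs[1:]].T * beta[1:])[:, None, :])
        xi /= xi.sum(axis=(1, 2), keepdims=True)

        # M-step: re-estimate parameters from expected counts
        pi = gamma[0]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        for k in range(n_symbols):
            B[:, k] = gamma[obs == k].sum(axis=0)
        B /= gamma.sum(axis=0)[:, None]
    return pi, A, B
```

For example, `baum_welch([0, 1, 0, 2, 1, 0, 0, 2], n_states=2, n_symbols=3)` returns the re-estimated (π, A, B).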
Summary¶
ML: Maximize likelihood
Bayesian: Posterior over parameters
EM: Hidden variables
HMM: Baum-Welch
References¶
Russell, S. & Norvig, P., Artificial Intelligence: A Modern Approach (AIMA), 4th ed., Ch. 20
Chapter PDF: chapters/chapter-20.pdf