Lecture 21: Deep Learning
Learning Objectives¶
Understand feedforward networks
Apply backpropagation and gradient descent
Build convolutional and recurrent networks
Use regularization: dropout, weight decay
Apply transfer learning
Neural Networks¶
Perceptron: Linear + threshold
MLP: Multiple layers, nonlinear activation
Universal approximation: One hidden layer with enough units can approximate any continuous function on a compact domain
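A minimal NumPy sketch of a two-layer MLP forward pass; the layer sizes and random weights are illustrative, not taken from the text:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def mlp_forward(x, W1, b1, W2, b2):
    """Two-layer MLP: hidden ReLU layer, then a linear output layer."""
    h = relu(W1 @ x + b1)    # hidden activations
    return W2 @ h + b2       # output scores (logits)

rng = np.random.default_rng(0)
x = rng.normal(size=3)                         # 3 input features
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # 4 hidden units
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)  # 2 outputs
print(mlp_forward(x, W1, b1, W2, b2))
```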
Backpropagation¶
Forward pass: Compute activations
Backward pass: Chain rule for gradients
Update: Gradient descent on weights
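A sketch of one training step for the same two-layer network, assuming a squared-error loss and a hand-picked learning rate; the gradients follow directly from the chain rule:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def train_step(x, y, W1, b1, W2, b2, lr=0.01):
    # Forward pass: compute and cache intermediate activations.
    z1 = W1 @ x + b1
    h = relu(z1)
    y_hat = W2 @ h + b2
    loss = 0.5 * np.sum((y_hat - y) ** 2)

    # Backward pass: apply the chain rule from output back to input.
    d_yhat = y_hat - y           # dL/dy_hat for squared error
    dW2 = np.outer(d_yhat, h)
    db2 = d_yhat
    dh = W2.T @ d_yhat
    dz1 = dh * (z1 > 0)          # ReLU passes gradient only where z1 > 0
    dW1 = np.outer(dz1, x)
    db1 = dz1

    # Update: gradient descent on the weights (in place).
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
    return loss
```

Repeating this step on the same example should drive the loss toward zero; a real training loop iterates over mini-batches instead.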
Activation Functions¶
ReLU: max(0, x)
Sigmoid, tanh: Saturate for large |x|, so gradients vanish
Softmax: Output layer for classification
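The activations above in NumPy; subtracting the max inside softmax is a standard numerical-stability trick and does not change the result:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    e = np.exp(z - np.max(z))   # shift by the max for numerical stability
    return e / e.sum()          # normalize to a probability distribution

z = np.array([2.0, 1.0, -1.0])
print(relu(z), sigmoid(z), np.tanh(z), softmax(z))
```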
Convolutional Networks¶
Convolution: Local receptive fields
Pooling: Downsampling
Architecture: Conv → Pool → ... → FC
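A single-channel sketch of both operations in NumPy; this is "valid" cross-correlation (what deep-learning libraries typically call convolution), written with loops for clarity rather than speed:

```python
import numpy as np

def conv2d(image, kernel):
    """Slide one kernel over every local patch (a local receptive field)."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

def max_pool(x, size=2):
    """Non-overlapping max pooling: downsample by keeping local maxima."""
    H, W = x.shape
    x = x[:H - H % size, :W - W % size]      # trim to a multiple of size
    return x.reshape(H // size, size, W // size, size).max(axis=(1, 3))

img = np.arange(36, dtype=float).reshape(6, 6)
edge = np.array([[1.0, -1.0]])               # tiny horizontal-gradient filter
print(max_pool(conv2d(img, edge)).shape)     # (6, 5) pooled down to (3, 2)
```

Note that the same kernel weights are reused at every position; that reuse is the parameter sharing discussed next.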
CNNs: Key Ideas¶
Parameter sharing: The same filter is applied at every position
Translation equivariance: Shifting the input shifts the feature map; pooling adds approximate invariance
Hierarchy: Low-level features (edges) feed into high-level ones (parts, objects)
Recurrent Networks¶
Sequence: Process one step at a time
Hidden state: Carries information across time steps
LSTM: Long short-term memory; gates control what the state keeps, forgets, and emits
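A minimal vanilla-RNN recurrence in NumPy (sizes and weight scaling are illustrative); an LSTM replaces this single tanh update with gated updates to a cell state:

```python
import numpy as np

def rnn_step(h, x, Wh, Wx, b):
    """One time step: new hidden state from the old state and the input."""
    return np.tanh(Wh @ h + Wx @ x + b)

rng = np.random.default_rng(0)
hidden, n_in = 4, 3
Wh = 0.1 * rng.normal(size=(hidden, hidden))
Wx = 0.1 * rng.normal(size=(hidden, n_in))
b = np.zeros(hidden)

h = np.zeros(hidden)                   # initial hidden state
for x in rng.normal(size=(5, n_in)):   # a length-5 input sequence
    h = rnn_step(h, x, Wh, Wx, b)      # state carries information forward
print(h)
```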
Regularization¶
Weight decay: L2 penalty
Dropout: Randomly zero activations during training
Batch norm: Normalize activations over each mini-batch
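A sketch of inverted dropout in NumPy, the common variant that rescales at training time so nothing changes at test time; p is the drop probability:

```python
import numpy as np

def dropout(h, p=0.5, training=True, rng=np.random.default_rng(0)):
    """Zero each activation with probability p during training, then
    rescale by 1/(1-p) so expected activations match test time."""
    if not training:
        return h                        # identity at test time
    mask = rng.random(h.shape) >= p     # keep each unit with prob 1 - p
    return h * mask / (1.0 - p)

print(dropout(np.ones(8)))
```

Weight decay, by contrast, needs no mask: the L2 penalty simply adds a term proportional to W to each weight gradient.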
Unsupervised Learning¶
Autoencoders: Learn a compact code by reconstructing the input
GANs: Generator vs. discriminator, trained adversarially
VAEs: Variational autoencoders; encode inputs as distributions over a latent space
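A minimal autoencoder sketch, assuming PyTorch is available; the layer sizes and MSE reconstruction loss are illustrative choices:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Encoder compresses the input; decoder reconstructs it."""
    def __init__(self, dim=784, hidden=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(hidden, dim), nn.Sigmoid())

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
x = torch.rand(16, 784)                # a fake batch of flattened images
loss = nn.MSELoss()(model(x), x)       # reconstruction error
loss.backward()                        # train by backprop as usual
```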
Transfer Learning¶
Pretrain: On a large dataset
Fine-tune: On target task
Feature extraction: Freeze early layers
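A feature-extraction sketch using torchvision (the weights argument assumes torchvision 0.13+); the 10-class head is a hypothetical target task:

```python
import torch.nn as nn
from torchvision import models

# Pretrained on ImageNet; freeze the backbone for feature extraction.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer; only this new head will be trained.
model.fc = nn.Linear(model.fc.in_features, 10)
```

For fine-tuning rather than feature extraction, unfreeze some or all backbone layers and train them with a smaller learning rate.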
Summary¶
Backprop: Gradient computation
CNN: Convolution, pooling
RNN/LSTM: Sequences
Regularization: Dropout, etc.
References¶
Russell & Norvig, AIMA 4e, Ch. 21
Chapter PDF: chapters/chapter-21.pdf
aima-python: neural_nets.ipynb, deep_learning4e.py