Lecture 24: Deep Learning for Natural Language Processing
Learning Objectives
Use word embeddings (Word2Vec, GloVe)
Apply RNNs and LSTMs for NLP
Understand sequence-to-sequence models
Master the Transformer architecture
Apply pretrained models (BERT, GPT)
Word Embeddings
Word2Vec: skip-gram (predict context words from the center word) and CBOW (predict the center word from its context); see the sketch below
GloVe: global vectors fit to corpus-wide co-occurrence statistics
Contextual embeddings: ELMo and BERT give a word a different vector in each context
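A minimal skip-gram training sketch using gensim, assuming the library is installed; the toy corpus and hyperparameters are illustrative only.

```python
# Minimal sketch: training skip-gram embeddings with gensim
# (toy corpus and hyperparameters chosen for illustration).
from gensim.models import Word2Vec

corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "pets"],
]

model = Word2Vec(
    sentences=corpus,
    vector_size=50,   # embedding dimensionality
    window=2,         # context window size
    sg=1,             # 1 = skip-gram; 0 would train CBOW instead
    min_count=1,      # keep every word in this tiny corpus
)

vec = model.wv["cat"]                  # 50-dim embedding for "cat"
print(model.wv.most_similar("cat"))    # nearest neighbors by cosine similarity
```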
RNNs for NLP
Language modeling: estimate P(wₜ|w₁,...,wₜ₋₁), one token at a time (sketched below)
Sequence classification: sentiment analysis, topic labeling, and similar tasks
Vanishing gradients: plain RNNs lose long-range information; LSTM gating mitigates this
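A minimal sketch of an LSTM language model in PyTorch; the vocabulary size and layer dimensions are assumptions for illustration. The model emits a distribution over the next token at every position, matching P(wₜ|w₁,...,wₜ₋₁).

```python
# Sketch of an LSTM language model (hypothetical vocab size and dims).
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer ids
        h, _ = self.lstm(self.embed(tokens))   # (batch, seq_len, hidden_dim)
        return self.out(h)                     # logits over the next token

model = LSTMLanguageModel()
tokens = torch.randint(0, 10_000, (4, 20))     # fake batch of token ids
logits = model(tokens)                         # (4, 20, 10000)
# Cross-entropy against the tokens shifted left trains P(w_t | w_1..w_{t-1}).
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, 10_000), tokens[:, 1:].reshape(-1)
)
```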
Sequence-to-Sequence
Encoder-decoder: encode the input sequence into a representation, then decode the output sequence from it (sketched below)
Machine translation: source sentence → target sentence
Attention: lets the decoder focus on the relevant parts of the input at each step
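A minimal encoder-decoder sketch in PyTorch, without attention; vocabulary sizes and dimensions are illustrative assumptions, and training would use teacher forcing on shifted target ids.

```python
# Sketch of a plain encoder-decoder (no attention; dims are assumptions).
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab=8_000, tgt_vocab=8_000, dim=256):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.LSTM(dim, dim, batch_first=True)
        self.decoder = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src, tgt):
        # Encode the source; its final state initializes the decoder.
        _, state = self.encoder(self.src_embed(src))
        dec, _ = self.decoder(self.tgt_embed(tgt), state)
        return self.out(dec)   # logits for each target position

model = Seq2Seq()
src = torch.randint(0, 8_000, (2, 12))   # source token ids
tgt = torch.randint(0, 8_000, (2, 10))   # shifted target ids (teacher forcing)
logits = model(src, tgt)                 # (2, 10, 8000)
```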
Attention
Query, Key, Value: Attention(Q, K, V) = softmax(QK^T/√d_k) V, where queries Q are matched against keys K to weight the values V
Attend: every position can attend to every other position
Interpretability: the attention weights show where the model is looking (see the sketch below)
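A direct sketch of scaled dot-product attention in PyTorch, computing the formula above; the tensor shapes are illustrative.

```python
# Scaled dot-product attention, matching softmax(QK^T/√d_k) V.
import math
import torch

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (batch, seq_len, d_k)
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)  # (batch, q_len, k_len)
    weights = torch.softmax(scores, dim=-1)            # each row sums to 1
    return weights @ V, weights                        # output and attn map

Q = torch.randn(1, 5, 64)
K = torch.randn(1, 5, 64)
V = torch.randn(1, 5, 64)
out, w = scaled_dot_product_attention(Q, K, V)   # out: (1, 5, 64)
```

Returning the weights alongside the output is what makes attention inspectable: `w[0]` is a 5×5 map of where each position looked.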
Transformer
Self-attention: no recurrence; each layer relates all positions directly
Multi-head: several attention heads run in parallel, each in its own subspace
Position encoding: injects order information, since self-attention alone is permutation-invariant
Parallel: all positions are processed at once during training, unlike an RNN (see the sketch below)
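A sketch of a Transformer encoder over embedded tokens, using PyTorch's built-in encoder layer plus sinusoidal positional encodings; the model width, head count, and depth are assumptions for illustration.

```python
# Transformer encoder sketch: positional encoding + self-attention layers.
import torch
import torch.nn as nn

def positional_encoding(seq_len, dim):
    # Sinusoidal encodings: sin on even dims, cos on odd dims.
    pos = torch.arange(seq_len).unsqueeze(1).float()
    i = torch.arange(0, dim, 2).float()
    angles = pos / (10_000 ** (i / dim))
    pe = torch.zeros(seq_len, dim)
    pe[:, 0::2] = torch.sin(angles)
    pe[:, 1::2] = torch.cos(angles)
    return pe

x = torch.randn(2, 16, 128)               # (batch, seq_len, d_model)
x = x + positional_encoding(16, 128)       # inject position info

layer = nn.TransformerEncoderLayer(
    d_model=128, nhead=8, batch_first=True  # 8 attention heads
)
encoder = nn.TransformerEncoder(layer, num_layers=2)
out = encoder(x)   # (2, 16, 128): all positions processed in parallel
```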
Pretraining
Masked LM: predict masked-out tokens using context from both directions (BERT)
Causal LM: predict the next token, left to right (GPT)
Fine-tuning: adapt the pretrained model to downstream tasks with comparatively little labeled data (usage sketch below)
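A sketch of using both pretraining styles via the Hugging Face transformers library, assuming it and the model weights are available; the prompts are illustrative.

```python
# Masked vs. causal LM via Hugging Face pipelines (assumes `transformers`
# is installed and can download bert-base-uncased / gpt2 weights).
from transformers import pipeline

# Masked LM (BERT-style): fill in a masked token using both directions.
fill = pipeline("fill-mask", model="bert-base-uncased")
print(fill("Paris is the [MASK] of France.")[0]["token_str"])

# Causal LM (GPT-style): continue the text left to right.
generate = pipeline("text-generation", model="gpt2")
print(generate("Deep learning for NLP", max_new_tokens=20)[0]["generated_text"])
```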
Summary
Embeddings: from static (Word2Vec, GloVe) to contextual (ELMo, BERT)
RNN/LSTM: sequential processing, one token at a time
Transformer: self-attention, fully parallel
Pretrain + fine-tune: the modern NLP paradigm
References
Russell & Norvig, AIMA 4e, Ch. 24
Chapter PDF: chapters/chapter-24.pdf