Lessons
1
Sequences and the memory problem
2
Vanilla RNNs: the hidden state equation
3
Backpropagation through time
4
Vanishing gradients in sequences
5
LSTMs: cell state and gates
6
GRUs: a simpler gating mechanism
7
Sequence-to-sequence models
8
Why transformers replaced RNNs