Hyperparameter | Values tested |
---|---|
Number of LSTM layers | 1, 2, 3, 4, 5 |
Number of units in each LSTM layer | 16, 32, 64 |
Constant learning rate | 0.1, 0.01, 0.001, 0.0001 |
Learning rate decay (decay rate, initial learning rate) | (0.5, 0.005), (0.75, 0.001) |
Step-wise learning rate decay (initial learning rate) | Halved every 5 epochs (0.01) |
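
For illustration, the search space in the table can be written down as plain Python data, together with a minimal sketch of the step-wise schedule. The names `LR_OPTIONS`, `step_lr`, and `full_grid` are hypothetical, and two points are assumptions rather than facts from the table: epochs are counted from 0, and every combination of values is enumerated (the table does not say whether a full grid was searched). The functional form of the decay-rate schedule is not specified, so only its parameters are recorded.

```python
from itertools import product

# Values copied from the table.
NUM_LAYERS = [1, 2, 3, 4, 5]
UNITS_PER_LAYER = [16, 32, 64]

# Each learning-rate option is a (label, parameters) pair; the labels are hypothetical.
LR_OPTIONS = (
    [("constant", {"lr": lr}) for lr in (0.1, 0.01, 0.001, 0.0001)]
    + [
        ("decay", {"decay_rate": rate, "initial_lr": lr0})
        for rate, lr0 in ((0.5, 0.005), (0.75, 0.001))
    ]
    + [("step", {"initial_lr": 0.01, "factor": 0.5, "every": 5})]
)


def step_lr(epoch, initial_lr=0.01, factor=0.5, every=5):
    """Return the learning rate for `epoch`, halving it every `every` epochs.

    Assumes epochs are numbered from 0, so epochs 0-4 use the initial rate.
    """
    return initial_lr * factor ** (epoch // every)


def full_grid():
    """Enumerate every (layers, units, lr_option) combination.

    Assumption: an exhaustive grid; the table does not state the search strategy.
    """
    return list(product(NUM_LAYERS, UNITS_PER_LAYER, LR_OPTIONS))
```

Under these assumptions the grid contains 5 × 3 × 7 combinations, and `step_lr` keeps the rate at 0.01 for the first five epochs before halving it.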