Common Optimizers
SGD       | The stochastic gradient descent optimizer.
RMSprop   | The RMSprop optimizer [1].
Adagrad   | The Adagrad optimizer [1].
Adafactor | The Adafactor optimizer.
AdaDelta  | The AdaDelta optimizer with a learning rate [1].
Adam      | The Adam optimizer [1].
AdamW     | The AdamW optimizer [1].
Adamax    | The Adamax optimizer, a variant of Adam based on the infinity norm [1].
Lion      | The Lion optimizer [1].
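To make the listed update rules concrete, the sketch below implements the textbook SGD and Adam updates in plain NumPy. It is illustrative only, not the library's own implementation; the function names (sgd_step, adam_step) and hyperparameter names (lr, beta1, beta2, eps) are conventional choices assumed for this example.

```python
import numpy as np

def sgd_step(param, grad, lr=0.01):
    """One vanilla stochastic gradient descent step: p <- p - lr * g."""
    return param - lr * grad

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step, tracking running first (m) and second (v) moment estimates."""
    m = beta1 * m + (1 - beta1) * grad        # update biased first moment
    v = beta2 * v + (1 - beta2) * grad ** 2   # update biased second moment
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy usage: minimize f(p) = ||p||^2, whose gradient is 2 * p.
p = np.array([1.0, -2.0])
m = np.zeros_like(p)
v = np.zeros_like(p)
for t in range(1, 101):
    g = 2 * p
    p, m, v = adam_step(p, g, m, v, t)
print(p)  # approaches [0, 0]
```

The other optimizers in the table follow the same pattern of a per-parameter state update plus a parameter step; they differ mainly in which gradient statistics they accumulate and how the step size is scaled.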