optimizer_adamax.Rd
Adamax optimizer from Section 7 of the Adam paper. It is a variant of Adam based on the infinity norm.
optimizer_adamax(lr = 0.002, beta_1 = 0.9, beta_2 = 0.999, epsilon = NULL, decay = 0, clipnorm = NULL, clipvalue = NULL)
lr | float >= 0. Learning rate. |
---|---|
beta_1 | The exponential decay rate for the 1st moment estimates. float, 0 < beta < 1. Generally close to 1. |
beta_2 | The exponential decay rate for the 2nd moment estimates. float, 0 < beta < 1. Generally close to 1. |
epsilon | float >= 0. Fuzz factor. If |
decay | float >= 0. Learning rate decay over each update. |
clipnorm | Gradients will be clipped when their L2 norm exceeds this value. |
clipvalue | Gradients will be clipped when their absolute value exceeds this value. |
Other optimizers: optimizer_adadelta
,
optimizer_adagrad
,
optimizer_adam
,
optimizer_nadam
,
optimizer_rmsprop
,
optimizer_sgd