Adam optimizer

optimizer_adam(lr = 0.001, beta_1 = 0.9, beta_2 = 0.999,
  epsilon = NULL, decay = 0, amsgrad = FALSE, clipnorm = NULL,
  clipvalue = NULL)

Arguments

lr	float >= 0. Learning rate.
beta_1	The exponential decay rate for the 1st moment estimates. float, 0 < beta < 1. Generally close to 1.
beta_2	The exponential decay rate for the 2nd moment estimates. float, 0 < beta < 1. Generally close to 1.
epsilon	float >= 0. Fuzz factor: a small constant added to the denominator of the update for numerical stability. If `NULL`, defaults to `k_epsilon()`.
decay	float >= 0. Learning rate decay applied at each update.
amsgrad	Whether to apply the AMSGrad variant of this algorithm from the paper "On the Convergence of Adam and Beyond".
clipnorm	Gradients will be clipped when their L2 norm exceeds this value.
clipvalue	Gradients will be clipped when their absolute value exceeds this value.
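
How the arguments above interact is easier to see in code. The following is a minimal base R sketch of a single Adam update, not the code keras actually runs. `adam_step` and the `state` list are hypothetical names introduced for illustration, the `epsilon` default of 1e-7 assumes the usual value of `k_epsilon()`, and the form of the `decay` schedule is an assumption modeled on the classic Keras implementation. Gradient clipping (`clipnorm`, `clipvalue`) is left out.

```r
# One Adam update for a numeric parameter vector, written in base R.
# `adam_step` and `state` are hypothetical names used only in this sketch.
adam_step <- function(param, grad, state, lr = 0.001, beta_1 = 0.9,
                      beta_2 = 0.999, epsilon = 1e-7, decay = 0,
                      amsgrad = FALSE) {
  t <- state$t + 1

  # Assumed decay schedule: shrink lr with the number of updates done so far.
  lr_decayed <- lr / (1 + decay * state$t)
  # Fold the bias correction for both moment estimates into the step size.
  lr_t <- lr_decayed * sqrt(1 - beta_2^t) / (1 - beta_1^t)

  # Exponential moving averages of the gradient and the squared gradient.
  m <- beta_1 * state$m + (1 - beta_1) * grad
  v <- beta_2 * state$v + (1 - beta_2) * grad^2

  if (amsgrad) {
    # AMSGrad keeps the element-wise maximum of all second-moment estimates.
    v_hat <- pmax(state$v_hat, v)
    denom <- sqrt(v_hat) + epsilon
  } else {
    v_hat <- state$v_hat
    denom <- sqrt(v) + epsilon
  }

  list(param = param - lr_t * m / denom,
       state = list(t = t, m = m, v = v, v_hat = v_hat))
}

# Example: minimise f(x) = sum(x^2), whose gradient is 2 * x.
x <- c(1, -2)
state <- list(t = 0, m = c(0, 0), v = c(0, 0), v_hat = c(0, 0))
for (i in 1:2000) {
  out <- adam_step(x, 2 * x, state, lr = 0.01)
  x <- out$param
  state <- out$state
}
```

On the first update the bias correction makes each parameter move by roughly `lr` in the direction opposite to its gradient, regardless of the gradient's magnitude, which is why `lr` acts as an approximate bound on the per-step change.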

Default parameters follow those provided in the original paper, "Adam: A Method for Stochastic Optimization" (Kingma and Ba).
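
Examples

A short usage sketch, assuming the keras R package and a working TensorFlow backend are installed. The model architecture, loss, and the non-default `amsgrad` and `clipnorm` values are arbitrary choices for illustration, not recommendations.

```r
library(keras)

# A small regression model; the layer sizes are arbitrary.
model <- keras_model_sequential() %>%
  layer_dense(units = 16, activation = "relu", input_shape = c(8)) %>%
  layer_dense(units = 1)

# Adam with the default learning rate and moment decay rates,
# the AMSGrad variant enabled, and gradients clipped to an L2 norm of 1.
model %>% compile(
  optimizer = optimizer_adam(lr = 0.001, beta_1 = 0.9, beta_2 = 0.999,
                             amsgrad = TRUE, clipnorm = 1),
  loss = "mse"
)
```

Calling `optimizer_adam()` with no arguments uses the defaults shown in the signature above.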