4.1.7. Optimizers
4.1.7.1. Base Class
-
class Optimizer: primitiv::mixins::Nonmovable<Optimizer>
Abstract class for parameter optimizers.
Subclassed by primitiv::optimizers::AdaDelta, primitiv::optimizers::AdaGrad, primitiv::optimizers::Adam, primitiv::optimizers::MomentumSGD, primitiv::optimizers::RMSProp, primitiv::optimizers::SGD
Public Functions
-
void load(const std::string &path)
Loads configurations from a file.
- Parameters
path: Path of the optimizer parameter file.
-
void save(const std::string &path) const
Saves the current configurations to a file.
- Parameters
path: Path of the file that will store optimizer parameters.
-
std::uint32_t get_epoch() const
Retrieves the current epoch.
- Return
- Current epoch.
-
void set_epoch(std::uint32_t epoch)
Sets the current epoch.
- Parameters
epoch: New epoch.
-
float get_learning_rate_scaling() const
Retrieves the current learning rate scaling factor.
- Return
- The scaling factor.
-
void set_learning_rate_scaling(float scale)
Sets the learning rate scaling factor.
- Remark
- Negative values cannot be set.
- Parameters
scale: New scaling factor.
-
float get_weight_decay() const
Retrieves the current L2 decay strength.
- Return
- Current L2 decay strength.
-
void set_weight_decay(float strength)
Sets the L2 decay strength.
- Remark
- Negative values cannot be set.
- Parameters
strength: New L2 decay strength, or 0 to disable L2 decay.
-
float get_gradient_clipping() const
Retrieves the current gradient clipping threshold.
- Return
- Current gradient clipping threshold.
-
void set_gradient_clipping(float threshold)
Sets the gradient clipping threshold.
- Remark
- Negative values cannot be set.
- Parameters
threshold: New clipping threshold, or 0 to disable gradient clipping.
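To illustrate what the decay strength and the clipping threshold control, here is a minimal sketch of one common way to apply them to a gradient before an update. The helper `preprocess_gradient` and the use of plain `std::vector<float>` are assumptions made for this example; primitiv's internal handling may differ in order and detail.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of primitiv: applies L2 decay and then
// norm-based clipping to one parameter's gradient, in place.
inline void preprocess_gradient(const std::vector<float> &value,
                                std::vector<float> &grad,
                                float l2_strength, float clip_threshold) {
  assert(value.size() == grad.size());
  assert(l2_strength >= 0.0f && clip_threshold >= 0.0f);

  // L2 decay: add strength * value to the gradient; 0 disables it.
  if (l2_strength > 0.0f) {
    for (std::size_t i = 0; i < grad.size(); ++i) {
      grad[i] += l2_strength * value[i];
    }
  }

  // Clipping: rescale so the gradient's L2 norm is at most the threshold;
  // 0 disables it.
  if (clip_threshold > 0.0f) {
    float squared_sum = 0.0f;
    for (float g : grad) squared_sum += g * g;
    const float norm = std::sqrt(squared_sum);
    if (norm > clip_threshold) {
      const float scale = clip_threshold / norm;
      for (float &g : grad) g *= scale;
    }
  }
}
```

With a threshold of 1, a gradient of (3, 4) has norm 5 and is rescaled to (0.6, 0.8); a threshold of 0 leaves it untouched.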
-
void add()
Does nothing. This function serves as the sentinel that terminates the other, specialized add() overloads.
-
template <typename T, typename... Args>
void add(T &model_or_param, Args&... args)
Registers multiple parameters and models.
This function behaves like a sequence of add() calls that take the arguments in the same order. For example, each of the following lines should have the same effect, except when an exception is thrown:
add(a, b, c, d);
add(a, b); add(c, d);
add(a); add(b); add(c); add(d);
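The sentinel-plus-variadic pattern described above can be sketched in isolation. `ToyRegistry` is a hypothetical stand-in that records names instead of parameters or models; it only demonstrates how an empty overload ends the recursion and is not primitiv's implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an optimizer's registration interface.
class ToyRegistry {
public:
  // Sentinel overload: ends the recursion when no arguments remain.
  void add() {}

  // Registers the first argument, then recurses on the remaining ones,
  // so arguments are registered in the order they were passed.
  template <typename T, typename... Args>
  void add(T &first, Args &...rest) {
    add_one(first);
    add(rest...);
  }

  const std::vector<std::string> &names() const { return names_; }

private:
  void add_one(const std::string &name) {
    assert(!name.empty());  // this toy example rejects empty names
    names_.push_back(name);
  }

  std::vector<std::string> names_;
};
```

Because each call peels off exactly one argument, `add(a, b, c, d)` and `add(a, b); add(c, d);` leave the registry in the same state.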
-
void reset_gradients()
Resets all gradients of registered parameters.
-
void update()
Updates parameter values.
-
virtual void get_configs(std::unordered_map<std::string, std::uint32_t> &uint_configs, std::unordered_map<std::string, float> &float_configs) const
Gathers configuration values.
- Parameters
uint_configs: Configurations with std::uint32_t type.
float_configs: Configurations with float type.
-
virtual void set_configs(const std::unordered_map<std::string, std::uint32_t> &uint_configs, const std::unordered_map<std::string, float> &float_configs)
Sets configuration values.
- Parameters
uint_configs: Configurations with std::uint32_t type.
float_configs: Configurations with float type.
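A minimal sketch of how a class might implement this pair of functions, using the same map types as the signatures above. `ToyOptimizer`, its members, and the key names ("epoch", "eta", "momentum") are hypothetical; the keys primitiv actually uses are not given in this section.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical optimizer-like type with one integer and two float settings.
struct ToyOptimizer {
  using UintMap = std::unordered_map<std::string, std::uint32_t>;
  using FloatMap = std::unordered_map<std::string, float>;

  // Writes every setting into the caller's maps under a string key.
  void get_configs(UintMap &uint_configs, FloatMap &float_configs) const {
    uint_configs["epoch"] = epoch;
    float_configs["eta"] = eta;
    float_configs["momentum"] = momentum;
  }

  // Restores settings from the maps; at() throws std::out_of_range
  // if an expected key is missing.
  void set_configs(const UintMap &uint_configs, const FloatMap &float_configs) {
    epoch = uint_configs.at("epoch");
    eta = float_configs.at("eta");
    momentum = float_configs.at("momentum");
  }

  std::uint32_t epoch = 0;
  float eta = 0.01f;
  float momentum = 0.9f;
};

// Copies all settings from src to dst by passing them through the two maps.
inline void copy_configs(const ToyOptimizer &src, ToyOptimizer &dst) {
  ToyOptimizer::UintMap uint_configs;
  ToyOptimizer::FloatMap float_configs;
  src.get_configs(uint_configs, float_configs);
  assert(uint_configs.size() == 1 && float_configs.size() == 2);
  dst.set_configs(uint_configs, float_configs);
}
```

Splitting the values by type into two flat maps keeps the interface independent of any particular file format.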
4.1.7.2. Inherited Classes
-
class MomentumSGD: public primitiv::Optimizer
Stochastic gradient descent with momentum.
Public Functions
-
MomentumSGD(float eta = 0.01, float momentum = 0.9)
Creates a new MomentumSGD object.
- Parameters
eta: Learning rate.
momentum: Decay factor of the momentum.
-
float eta() const
Returns the hyperparameter eta.
- Return
- The value of eta.
-
float momentum() const
Returns the hyperparameter momentum.
- Return
- The value of momentum.
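For orientation, a minimal sketch of a classical momentum update that uses the two hyperparameters above. The function name, the state struct, and the exact form of the rule are assumptions; this section does not state the formula primitiv applies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-parameter state; not primitiv's internal layout.
struct MomentumState {
  std::vector<float> velocity;  // running update direction, starts at zero
};

// One step of: v <- momentum * v - eta * g;  w <- w + v.
inline void momentum_sgd_step(std::vector<float> &w, const std::vector<float> &g,
                              MomentumState &s,
                              float eta = 0.01f, float momentum = 0.9f) {
  assert(w.size() == g.size());
  s.velocity.resize(w.size(), 0.0f);  // lazily allocate zero-initialized state
  for (std::size_t i = 0; i < w.size(); ++i) {
    s.velocity[i] = momentum * s.velocity[i] - eta * g[i];
    w[i] += s.velocity[i];
  }
}
```

With a constant gradient the velocity grows from step to step, which is why the second step moves the weight further than the first.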
-
-
class AdaGrad: public primitiv::Optimizer
AdaGrad optimizer.
Public Functions
-
AdaGrad(float eta = 0.001, float eps = 1e-8)
Creates a new AdaGrad object.
- Parameters
eta: Learning rate.
eps: Bias of power.
-
float eta() const
Returns the hyperparameter eta.
- Return
- The value of eta.
-
float eps() const
Returns the hyperparameter eps.
- Return
- The value of eps.
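The roles of eta and eps can be seen in a minimal sketch of the textbook AdaGrad rule. The names below are hypothetical, and placing eps outside the square root is an assumption; primitiv may place it differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-parameter state; not primitiv's internal layout.
struct AdaGradState {
  std::vector<float> power;  // running sum of squared gradients
};

// One step of: m <- m + g^2;  w <- w - eta * g / (sqrt(m) + eps).
inline void adagrad_step(std::vector<float> &w, const std::vector<float> &g,
                         AdaGradState &s,
                         float eta = 0.001f, float eps = 1e-8f) {
  assert(w.size() == g.size());
  s.power.resize(w.size(), 0.0f);
  for (std::size_t i = 0; i < w.size(); ++i) {
    s.power[i] += g[i] * g[i];
    w[i] -= eta * g[i] / (std::sqrt(s.power[i]) + eps);
  }
}
```

Since the accumulated sum never shrinks, the effective step size for each element decreases over time; eps only guards against division by zero.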
-
-
class RMSProp: public primitiv::Optimizer
RMSProp optimizer.
Public Functions
-
RMSProp(float eta = 0.01, float alpha = 0.9, float eps = 1e-8)
Creates a new RMSProp object.
- Parameters
eta: Learning rate.
alpha: Decay factor of moment.
eps: Bias of power.
-
float eta() const
Returns the hyperparameter eta.
- Return
- The value of eta.
-
float alpha() const
Returns the hyperparameter alpha.
- Return
- The value of alpha.
-
float eps() const
Returns the hyperparameter eps.
- Return
- The value of eps.
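A minimal sketch of the commonly cited RMSProp rule, showing where alpha enters as the decay factor. The names are hypothetical and the exact formula is an assumption rather than something this section specifies.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-parameter state; not primitiv's internal layout.
struct RMSPropState {
  std::vector<float> power;  // decayed average of squared gradients
};

// One step of: m <- alpha * m + (1 - alpha) * g^2;
//              w <- w - eta * g / (sqrt(m) + eps).
inline void rmsprop_step(std::vector<float> &w, const std::vector<float> &g,
                         RMSPropState &s,
                         float eta = 0.01f, float alpha = 0.9f, float eps = 1e-8f) {
  assert(w.size() == g.size());
  s.power.resize(w.size(), 0.0f);
  for (std::size_t i = 0; i < w.size(); ++i) {
    s.power[i] = alpha * s.power[i] + (1.0f - alpha) * g[i] * g[i];
    w[i] -= eta * g[i] / (std::sqrt(s.power[i]) + eps);
  }
}
```

Unlike a plain running sum, the decayed average lets old gradients fade, so the step size does not shrink monotonically.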
-
-
class AdaDelta: public primitiv::Optimizer
AdaDelta optimizer. https://arxiv.org/abs/1212.5701
Public Functions
-
AdaDelta(float rho = 0.95, float eps = 1e-6)
Creates a new AdaDelta object.
- Parameters
rho: Decay factor of RMS operation.
eps: Bias of RMS values.
-
float rho() const
Returns the hyperparameter rho.
- Return
- The value of rho.
-
float eps() const
Returns the hyperparameter eps.
- Return
- The value of eps.
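The update rule from the AdaDelta paper linked above can be sketched as follows. The function and struct names are hypothetical, and whether primitiv follows the paper term by term is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-parameter state; not primitiv's internal layout.
struct AdaDeltaState {
  std::vector<float> grad_power;    // decayed average of squared gradients
  std::vector<float> update_power;  // decayed average of squared updates
};

// One step following the paper:
//   E[g^2]  <- rho * E[g^2] + (1 - rho) * g^2
//   dx      <- sqrt((E[dx^2] + eps) / (E[g^2] + eps)) * g
//   E[dx^2] <- rho * E[dx^2] + (1 - rho) * dx^2
//   w       <- w - dx
inline void adadelta_step(std::vector<float> &w, const std::vector<float> &g,
                          AdaDeltaState &s,
                          float rho = 0.95f, float eps = 1e-6f) {
  assert(w.size() == g.size());
  s.grad_power.resize(w.size(), 0.0f);
  s.update_power.resize(w.size(), 0.0f);
  for (std::size_t i = 0; i < w.size(); ++i) {
    s.grad_power[i] = rho * s.grad_power[i] + (1.0f - rho) * g[i] * g[i];
    const float dx =
        std::sqrt((s.update_power[i] + eps) / (s.grad_power[i] + eps)) * g[i];
    s.update_power[i] = rho * s.update_power[i] + (1.0f - rho) * dx * dx;
    w[i] -= dx;
  }
}
```

No learning rate appears in the rule, which matches the constructor above taking only rho and eps.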
-
-
class Adam: public primitiv::Optimizer
Adam optimizer. https://arxiv.org/abs/1412.6980
Public Functions
-
Adam(float alpha = 0.001, float beta1 = 0.9, float beta2 = 0.999, float eps = 1e-8)
Creates a new Adam object.
- Parameters
alpha: Learning rate.
beta1: Decay factor of momentum history.
beta2: Decay factor of power history.
eps: Bias of power.
-
float alpha() const
Returns the hyperparameter alpha.
- Return
- The value of alpha.
-
float beta1() const
Returns the hyperparameter beta1.
- Return
- The value of beta1.
-
float beta2() const
Returns the hyperparameter beta2.
- Return
- The value of beta2.
-
float eps() const
Returns the hyperparameter eps.
- Return
- The value of eps.
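A minimal sketch of the bias-corrected update from the Adam paper linked above, using the four hyperparameters of the constructor. The names are hypothetical, and the step counter kept in the state struct is an assumption made for this example; it is not a statement about how primitiv tracks time.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-parameter state; not primitiv's internal layout.
struct AdamState {
  std::vector<float> m1;   // decayed average of gradients (momentum history)
  std::vector<float> m2;   // decayed average of squared gradients (power history)
  std::uint32_t step = 0;  // number of updates applied so far
};

// One step following the paper, including bias correction of both averages.
inline void adam_step(std::vector<float> &w, const std::vector<float> &g,
                      AdamState &s,
                      float alpha = 0.001f, float beta1 = 0.9f,
                      float beta2 = 0.999f, float eps = 1e-8f) {
  assert(w.size() == g.size());
  s.m1.resize(w.size(), 0.0f);
  s.m2.resize(w.size(), 0.0f);
  ++s.step;
  // Correction terms compensate for the zero-initialized averages.
  const float c1 = 1.0f - std::pow(beta1, static_cast<float>(s.step));
  const float c2 = 1.0f - std::pow(beta2, static_cast<float>(s.step));
  for (std::size_t i = 0; i < w.size(); ++i) {
    s.m1[i] = beta1 * s.m1[i] + (1.0f - beta1) * g[i];
    s.m2[i] = beta2 * s.m2[i] + (1.0f - beta2) * g[i] * g[i];
    const float m_hat = s.m1[i] / c1;
    const float v_hat = s.m2[i] / c2;
    w[i] -= alpha * m_hat / (std::sqrt(v_hat) + eps);
  }
}
```

On the first step the bias correction cancels the decay factors, so the update magnitude is approximately alpha regardless of the gradient's scale.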
-