4.1.7. Optimizers

4.1.7.1. Base Class

class Optimizer : primitiv::mixins::Nonmovable<Optimizer>

    Abstract class for parameter optimizers.

    Subclassed by primitiv::optimizers::AdaDelta, primitiv::optimizers::AdaGrad, primitiv::optimizers::Adam, primitiv::optimizers::MomentumSGD, primitiv::optimizers::RMSProp, primitiv::optimizers::SGD

    Public Functions
    void load(const std::string &path)
        Loads configurations from a file.
        Parameters:
            path: Path of the optimizer parameter file.

    void save(const std::string &path) const
        Saves current configurations to a file.
        Parameters:
            path: Path of the file that will store optimizer parameters.
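    A round-trip sketch using the SGD subclass listed above; the file name is
    hypothetical:

        optimizers::SGD opt(0.1);
        // ... register parameters, train for a while ...
        opt.save("optimizer.cfg");   // write current configurations to disk

        optimizers::SGD opt2(0.1);
        opt2.load("optimizer.cfg");  // restore the saved configurations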
    std::uint32_t get_epoch() const
        Retrieves current epoch.
        Return:
            Current epoch.

    void set_epoch(std::uint32_t epoch)
        Sets current epoch.
        Parameters:
            epoch: New epoch.
    float get_learning_rate_scaling() const
        Retrieves the current learning rate scaling factor.
        Return:
            The scaling factor.

    void set_learning_rate_scaling(float scale)
        Sets the learning rate scaling factor.
        Remark:
            Negative values cannot be set.
        Parameters:
            scale: New scaling factor.
    float get_weight_decay() const
        Retrieves the current L2 decay strength.
        Return:
            Current L2 decay strength.

    void set_weight_decay(float strength)
        Sets the L2 decay strength.
        Remark:
            Negative values cannot be set.
        Parameters:
            strength: New L2 decay strength, or 0 to disable L2 decay.
    float get_gradient_clipping() const
        Retrieves the current gradient clipping threshold.
        Return:
            Current gradient clipping threshold.

    void set_gradient_clipping(float threshold)
        Sets the gradient clipping threshold.
        Remark:
            Negative values cannot be set.
        Parameters:
            threshold: New clipping threshold, or 0 to disable gradient clipping.
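    These three setters (learning rate scaling, weight decay, gradient
    clipping) apply to every subclass. A sketch with illustrative values, not
    recommendations:

        optimizers::Adam opt;
        opt.set_learning_rate_scaling(0.5f);  // scale every update by 0.5
        opt.set_weight_decay(1e-4f);          // enable L2 decay; 0 disables it
        opt.set_gradient_clipping(5.0f);      // clip gradients at 5; 0 disables clipping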
    void add()
        Does nothing. This function is used as the sentinel of other specialized functions.

    template <typename T, typename... Args>
    void add(T &model_or_param, Args&... args)
        Registers multiple parameters and models.
        This function behaves similarly to multiple add() calls with the same
        order of arguments. E.g., the following lines should behave the same
        way (except in the case of exceptions):

            add(a, b, c, d);
            add(a, b); add(c, d);
            add(a); add(b); add(c); add(d);
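    A sketch of registering trainables in one call; the device setup, shapes,
    and initializers here are illustrative:

        devices::Naive dev;
        Device::set_default(dev);

        Parameter w({8, 4}, initializers::XavierUniform());
        Parameter b({8}, initializers::Constant(0));

        optimizers::SGD opt(0.1);
        opt.add(w, b);  // behaves like opt.add(w); opt.add(b);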
    void reset_gradients()
        Resets all gradients of registered parameters.

    void update()
        Updates parameter values.
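    These two calls bracket a typical training step. A sketch of the per-step
    order, assuming `opt` has its parameters registered and `loss` is the
    graph Node holding the current loss value:

        opt.reset_gradients();  // clear gradients left over from the previous step
        loss.backward();        // backpropagate: accumulate fresh gradients
        opt.update();           // apply this optimizer's update rule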
    virtual void get_configs(std::unordered_map<std::string, std::uint32_t> &uint_configs, std::unordered_map<std::string, float> &float_configs) const
        Gathers configuration values.
        Parameters:
            uint_configs: Configurations with std::uint32_t type.
            float_configs: Configurations with float type.

    virtual void set_configs(const std::unordered_map<std::string, std::uint32_t> &uint_configs, const std::unordered_map<std::string, float> &float_configs)
        Sets configuration values.
        Parameters:
            uint_configs: Configurations with std::uint32_t type.
            float_configs: Configurations with float type.
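    A sketch of round-tripping settings through these maps; `src` and `dst`
    are assumed to be optimizers of the same concrete type, since the key
    names are subclass-specific:

        std::unordered_map<std::string, std::uint32_t> uint_configs;
        std::unordered_map<std::string, float> float_configs;
        src.get_configs(uint_configs, float_configs);  // export current values
        dst.set_configs(uint_configs, float_configs);  // import them elsewhere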
4.1.7.2. Inherited Classes
class MomentumSGD : public primitiv::Optimizer

    Stochastic gradient descent with momentum.

    Public Functions

    MomentumSGD(float eta = 0.01, float momentum = 0.9)
        Creates a new MomentumSGD object.
        Parameters:
            eta: Learning rate.
            momentum: Decay factor of the momentum.

    float eta() const
        Returns the hyperparameter eta.
        Return:
            The value of eta.

    float momentum() const
        Returns the hyperparameter momentum.
        Return:
            The value of momentum.
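    A construction sketch. The conventional momentum update this family of
    optimizers implements is v <- momentum * v - eta * g, then w <- w + v
    (a standard formulation; the library's exact bookkeeping may differ):

        optimizers::MomentumSGD opt(0.01f, 0.9f);  // eta, momentum
        // opt.eta() == 0.01f, opt.momentum() == 0.9f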
class AdaGrad : public primitiv::Optimizer

    AdaGrad optimizer.

    Public Functions

    AdaGrad(float eta = 0.001, float eps = 1e-8)
        Creates a new AdaGrad object.
        Parameters:
            eta: Learning rate.
            eps: Bias of power.

    float eta() const
        Returns the hyperparameter eta.
        Return:
            The value of eta.

    float eps() const
        Returns the hyperparameter eps.
        Return:
            The value of eps.
class RMSProp : public primitiv::Optimizer

    RMSProp optimizer.

    Public Functions

    RMSProp(float eta = 0.01, float alpha = 0.9, float eps = 1e-8)
        Creates a new RMSProp object.
        Parameters:
            eta: Learning rate.
            alpha: Decay factor of moment.
            eps: Bias of power.

    float eta() const
        Returns the hyperparameter eta.
        Return:
            The value of eta.

    float alpha() const
        Returns the hyperparameter alpha.
        Return:
            The value of alpha.

    float eps() const
        Returns the hyperparameter eps.
        Return:
            The value of eps.
class AdaDelta : public primitiv::Optimizer

    AdaDelta optimizer. https://arxiv.org/abs/1212.5701

    Public Functions

    AdaDelta(float rho = 0.95, float eps = 1e-6)
        Creates a new AdaDelta object.
        Parameters:
            rho: Decay factor of RMS operation.
            eps: Bias of RMS values.

    float rho() const
        Returns the hyperparameter rho.
        Return:
            The value of rho.

    float eps() const
        Returns the hyperparameter eps.
        Return:
            The value of eps.
class Adam : public primitiv::Optimizer

    Adam optimizer. https://arxiv.org/abs/1412.6980

    Public Functions

    Adam(float alpha = 0.001, float beta1 = 0.9, float beta2 = 0.999, float eps = 1e-8)
        Creates a new Adam object.
        Parameters:
            alpha: Learning rate.
            beta1: Decay factor of momentum history.
            beta2: Decay factor of power history.
            eps: Bias of power.

    float alpha() const
        Returns the hyperparameter alpha.
        Return:
            The value of alpha.

    float beta1() const
        Returns the hyperparameter beta1.
        Return:
            The value of beta1.

    float beta2() const
        Returns the hyperparameter beta2.
        Return:
            The value of beta2.

    float eps() const
        Returns the hyperparameter eps.
        Return:
            The value of eps.
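All inherited classes share the Optimizer interface, so switching algorithms
is a one-line change. A sketch constructing each with the default
hyperparameters documented above (`w` is an assumed registered Parameter):

    optimizers::MomentumSGD momentum_sgd;  // eta = 0.01, momentum = 0.9
    optimizers::AdaGrad adagrad;           // eta = 0.001, eps = 1e-8
    optimizers::RMSProp rmsprop;           // eta = 0.01, alpha = 0.9, eps = 1e-8
    optimizers::AdaDelta adadelta;         // rho = 0.95, eps = 1e-6
    optimizers::Adam adam;                 // alpha = 0.001, beta1 = 0.9, beta2 = 0.999, eps = 1e-8

    Optimizer &opt = adam;  // train through the common base-class interface
    opt.add(w);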