optimizers⚓︎
Gradient acceleration.
AdaMax
⚓︎
    A variant of the Adam optimizer based on the infinity norm (Kingma & Ba, 2014).
Adam
⚓︎
    A class implementing the Adam optimizer for gradient-based optimization (Kingma & Ba, 2014).
The Adam update equation for the control x using gradient g, iteration t, and small constants ε is given by:
m_t = β1 * m_{t-1} + (1 - β1) * g
v_t = β2 * v_{t-1} + (1 - β2) * g^2
m_t_hat = m_t / (1 - β1^t)
v_t_hat = v_t / (1 - β2^t)
x_{t+1} = x_t - α * m_t_hat / (sqrt(v_t_hat) + ε)
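As a minimal plain-Python sketch of these update equations (illustrative only, not the class's implementation; the function name and scalar signature are assumptions for demonstration):

```python
import math

def adam_step(x, g, m, v, t, alpha=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step on a scalar control x with gradient g; t is 1-based."""
    m = beta1 * m + (1 - beta1) * g          # first moment estimate m_t
    v = beta2 * v + (1 - beta2) * g ** 2     # second moment estimate v_t
    m_hat = m / (1 - beta1 ** t)             # bias-corrected m_t_hat
    v_hat = v / (1 - beta2 ** t)             # bias-corrected v_t_hat
    x = x - alpha * m_hat / (math.sqrt(v_hat) + eps)
    return x, m, v
```

The bias-correction terms matter most in early iterations, when m and v are still close to their zero initialization.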
Attributes:
| Name | Type | Description | 
|---|---|---|
| step_size | float | The initial step size provided during initialization. | 
| beta1 | float | The exponential decay rate for the first moment estimates. | 
| beta2 | float | The exponential decay rate for the second moment estimates. | 
| vel1 | 1-D array_like | First moment estimate. | 
| vel2 | 1-D array_like | Second moment estimate. | 
| eps | float | Small constant to prevent division by zero. | 
| _step_size | float | Private attribute for temporarily modifying step size. | 
| temp_vel1 | 1-D array_like | Temporary first moment estimate. | 
| temp_vel2 | 1-D array_like | Temporary second moment estimate. | 
Methods:
| Name | Description | 
|---|---|
| apply_update | Apply an Adam update to the control parameter. | 
| apply_backtracking | Apply backtracking by reducing step size temporarily. | 
| restore_parameters | Restore the original step size. | 
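The backtracking and restore methods follow a simple pattern: shrink a working copy of the step size when an update is rejected, then restore the original afterwards. A minimal sketch of that pattern (the class name and the reduction factor of 0.5 are assumptions, not the library's actual values):

```python
class StepController:
    """Illustrative sketch of the backtracking/restore pattern."""

    def __init__(self, step_size):
        self.step_size = step_size        # original value, kept for restore
        self._step_size = step_size       # working value used in updates

    def apply_backtracking(self):
        self._step_size *= 0.5            # shrink step after a rejected update (factor assumed)

    def restore_parameters(self):
        self._step_size = self.step_size  # undo the temporary reduction
```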
__init__(step_size, beta1=0.9, beta2=0.999)
⚓︎
    A class implementing the Adam optimizer for gradient-based optimization. The Adam update equation for the control x using gradient g, iteration t, and small constants ε is given by:
m_t = β1 * m_{t-1} + (1 - β1) * g
v_t = β2 * v_{t-1} + (1 - β2) * g^2
m_t_hat = m_t / (1 - β1^t)
v_t_hat = v_t / (1 - β2^t)
x_{t+1} = x_t - α * m_t_hat / (sqrt(v_t_hat) + ε)
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| step_size | float | The step size (learning rate) for the optimization. | required | 
| beta1 | float | The exponential decay rate for the first moment estimates (default is 0.9). | 0.9 | 
| beta2 | float | The exponential decay rate for the second moment estimates (default is 0.999). | 0.999 | 
apply_backtracking()
⚓︎
    Apply backtracking by reducing step size temporarily.
apply_update(control, gradient, **kwargs)
⚓︎
    Apply an Adam update to the control parameter.
Note
This is the steepest descent update: x_new = x_old - x_step.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| control | array_like | The current value of the parameter being optimized. | required | 
| gradient | array_like | The gradient of the objective function with respect to the control parameter. | required | 
| **kwargs | dict | Additional keyword arguments, including 'iter' for the current iteration. | {} | 
Returns:
| Type | Description | 
|---|---|
| tuple (new_control, temp_velocity) | The new value of the control parameter after the update, and the current state step. | 
restore_parameters()
⚓︎
    Restore the original step size.
GradientAscent
⚓︎
    A class for performing gradient ascent optimization with momentum and backtracking. The gradient descent update equation with momentum, for control x, gradient g, step size α, and momentum factor β, is given by:
v_t = β * v_{t-1} + α * g
x_{t+1} = x_t - v_t
Attributes:
| Name | Type | Description | 
|---|---|---|
| step_size | float | The initial step size provided during initialization. | 
| momentum | float | The initial momentum factor provided during initialization. | 
| velocity | array_like | Current velocity of the optimization process. | 
| temp_velocity | array_like | Temporary velocity. | 
| _step_size | float | Private attribute for temporarily modifying step size. | 
| _momentum | float | Private attribute for temporarily modifying momentum. | 
Methods:
| Name | Description | 
|---|---|
| apply_update | Apply a gradient update to the control parameter. | 
| apply_backtracking | Apply backtracking by reducing step size and momentum temporarily. | 
| restore_parameters | Restore the original step size and momentum values. | 
__init__(step_size, momentum)
⚓︎
    Initialize the optimizer with the given step size and momentum factor.
apply_backtracking()
⚓︎
    Apply backtracking by reducing step size and momentum temporarily.
apply_smc_update(control, gradient, **kwargs)
⚓︎
    Apply a gradient update to the control parameter.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| control | array_like | The current value of the parameter being optimized. | required | 
| gradient | array_like | The gradient of the objective function with respect to the control parameter. | required | 
| **kwargs | dict | Additional keyword arguments. | {} | 
Returns:
| Name | Type | Description | 
|---|---|---|
| new_control | ndarray | The new value of the control parameter after the update. | 
apply_update(control, gradient, **kwargs)
⚓︎
    Apply a gradient update to the control parameter.
Note
This is the steepest descent update: x_new = x_old - x_step.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| control | array_like | The current value of the parameter being optimized. | required | 
| gradient | array_like | The gradient of the objective function with respect to the control parameter. | required | 
| **kwargs | dict | Additional keyword arguments. | {} | 
Returns:
| Type | Description | 
|---|---|
| tuple (new_control, temp_velocity) | The new value of the control parameter after the update, and the current state step. | 
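A plain-Python sketch of this momentum update (illustrative; the function name is assumed):

```python
def momentum_step(x, g, velocity, step_size=0.1, momentum=0.9):
    """One steepest-descent step with momentum: x_new = x_old - x_step."""
    velocity = momentum * velocity + step_size * g   # accumulate gradient history
    return x - velocity, velocity                    # new control and current step
```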
restore_parameters()
⚓︎
    Restore the original step size and momentum value.
Steihaug
⚓︎
    A class implementing the Steihaug conjugate-gradient trust region optimizer. This code is based on the minfx optimisation library, https://gna.org/projects/minfx
__init__(maxiter=1000000.0, epsilon=1e-08, delta_max=100000.0, delta0=1.0)
⚓︎
    Page 75 from 'Numerical Optimization' by Jorge Nocedal and Stephen J. Wright (1999). The CG-Steihaug algorithm is:
- Given epsilon > 0
- p0 = 0, r0 = g, d0 = -r0
- if ||r0|| < epsilon: return p = p0
- for j = 0, 1, 2, ...:
    - if djT.B.dj <= 0:
        - Find tau such that p = pj + tau.dj minimises m(p) in (4.9) and satisfies ||p|| = delta
        - return p
    - aj = rjT.rj / djT.B.dj
    - pj+1 = pj + aj.dj
    - if ||pj+1|| >= delta:
        - Find tau >= 0 such that p = pj + tau.dj satisfies ||p|| = delta
        - return p
    - rj+1 = rj + aj.B.dj
    - if ||rj+1|| < epsilon.||r0||: return p = pj+1
    - bj+1 = rj+1T.rj+1 / rjT.rj
    - dj+1 = -rj+1 + bj+1.dj
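The loop above can be sketched in NumPy as follows. This is a simplified illustration, not the class's implementation; in the negative-curvature case it takes the positive boundary root rather than comparing both roots of the model m(p):

```python
import numpy as np

def boundary_tau(p, d, delta):
    """Positive root of ||p + tau.d|| = delta (quadratic in tau)."""
    a = d @ d
    b = 2 * (p @ d)
    c = p @ p - delta ** 2
    return (-b + np.sqrt(b ** 2 - 4 * a * c)) / (2 * a)

def cg_steihaug(B, g, delta, epsilon=1e-8, maxiter=100):
    """Approximately minimise m(p) = g.p + 0.5 p.B.p subject to ||p|| <= delta."""
    p = np.zeros_like(g)
    r = g.copy()
    d = -r
    if np.linalg.norm(r) < epsilon:
        return p
    for _ in range(maxiter):
        dBd = d @ B @ d
        if dBd <= 0:                                  # negative curvature:
            return p + boundary_tau(p, d, delta) * d  # step to the boundary
        a = (r @ r) / dBd
        p_next = p + a * d
        if np.linalg.norm(p_next) >= delta:           # leaving the region:
            return p + boundary_tau(p, d, delta) * d  # clip to the boundary
        r_next = r + a * (B @ d)
        if np.linalg.norm(r_next) < epsilon * np.linalg.norm(g):
            return p_next                             # residual converged
        b = (r_next @ r_next) / (r @ r)
        d = -r_next + b * d                           # new conjugate direction
        p, r = p_next, r_next
    return p
```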
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| maxiter | float | Maximum number of iterations. | 1000000.0 | 
| epsilon | float | Tolerance for iterations. | 1e-08 | 
| delta_max | float | Maximum trust region size. | 100000.0 | 
| delta0 | float | Initial trust region size. | 1.0 | 
apply_backtracking()
⚓︎
    Apply backtracking by reducing step size temporarily.
apply_update(xk, dfk, **kwargs)
⚓︎
    Apply a Steihaug update to the control vector.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| xk | array_like | The current value of the parameter being optimized. | required | 
| dfk | array_like | The gradient of the objective function with respect to the control parameter. | required | 
| **kwargs | dict | Additional keyword arguments, including the hessian of the objective function with respect to the control parameter. | {} | 
Returns:
| Type | Description | 
|---|---|
| tuple (new_control, step) | The new value of the control parameter after the update, and the current state step. | 
get_tau(pj, dj)
⚓︎
    Function to find tau such that p = pj + tau.dj, and ||p|| = delta.
restore_parameters()
⚓︎
    Restore the original step size.