IpyoptV1
- class zfit.minimize.IpyoptV1(tol=None, maxcor=None, verbosity=None, hessian=None, options=None, maxiter=None, criterion=None, strategy=None, name='IpyoptV1')
Bases: zfit.minimizers.baseminimizer.BaseMinimizer
Ipopt is a gradient-based minimizer that performs large scale nonlinear optimization of continuous systems.
This implementation uses the IPyOpt wrapper.
Ipopt (Interior Point Optimizer, pronounced “Eye-Pea-Opt”) is an open source software package for large-scale nonlinear optimization. It can be used to solve general nonlinear programming problems. It is written in Fortran and C and is released under the EPL (formerly CPL). IPOPT implements a primal-dual interior point method and uses line searches based on filter methods (Fletcher and Leyffer).
IPOPT is part of the COIN-OR project.
- Parameters
tol (float | None) – Termination value for the convergence/stopping criterion of the algorithm in order to determine if the minimum has been found. Defaults to 1e-3.
maxcor (int | None) – Maximum number of history entries to keep when using a quasi-Newton update formula such as BFGS. It is the number of gradients to “remember” from previous optimization steps: increasing it increases the memory requirements but may speed up convergence.
verbosity (int | None) – Verbosity of the minimizer. Has to be between 0 and 10. The verbosity has the following meaning:
- A value of 0 means quiet, with no output.
- Above 0 and up to 5, information that is good to know is printed without flooding the user, corresponding to an “INFO” level.
- A value above 5 prints considerably more and is used mainly for debugging purposes.
- Setting the verbosity to 10 prints every evaluation of the loss function and gradient.
Some minimizers offer additional output, which is also distributed as above but may duplicate certain printed values.
hessian (str | None) – Determines which hessian matrix to use during the minimization. One of the following options is possible:
- ’bfgs’: BFGS quasi-Newton update formula for the limited-memory approximation (update with skipping)
- ’sr1’: SR1 quasi-Newton update formula for the limited-memory approximation (doesn’t work too well)
- ’exact’: the minimizer internally uses an exact calculation of the hessian using a numerical method.
- ’zfit’: use the exact hessian provided by the loss (either the automatic gradient or the numerical gradient computed inside the loss). This tends to be slow compared to the approximations and is usually not necessary.
options (dict[str, object] | None) – Additional possible options for the minimizer. All options can be seen by using the following command in the shell:
    ipopt --print_options
A selection of parameters is presented here:
- alpha_red_factor: between 0 and 1, default 0.5
Fractional reduction of the trial step size in the backtracking line search. At every step of the backtracking line search, the trial step size is reduced by this factor.
- accept_after_max_steps: -1 to +inf, default -1
Accept a trial point after at most this number of steps, even if it does not satisfy the line search conditions.
- watchdog_shortened_iter_trigger: 0 to +inf, default 10
Number of shortened iterations that trigger the watchdog. If the number of successive iterations in which the backtracking line search did not accept the first trial point exceeds this number, the watchdog procedure is activated. Choosing “0” here disables the watchdog procedure.
- watchdog_trial_iter_max: 1 to +inf, default 3
Maximum number of watchdog iterations. This option determines the number of trial iterations allowed before the watchdog procedure is aborted and the algorithm returns to the stored point.
- linear_solver: default “mumps”
Linear solver used for step computations. Determines which linear algebra package is to be used for the solution of the augmented linear system (for obtaining the search directions). Note that the code must have been compiled with the linear solver you want to choose. Depending on your Ipopt installation, not all options are available. Possible values:
  - ma27 [use the Harwell routine MA27]
  - ma57 [use the Harwell routine MA57]
  - ma77 [use the Harwell routine HSL_MA77]
  - ma86 [use the Harwell routine HSL_MA86]
  - ma97 [use the Harwell routine HSL_MA97]
  - pardiso [use the Pardiso package]
  - wsmp [use WSMP package]
  - mumps [use MUMPS package]
  - custom [use custom linear solver]
- mumps_pivtol: ONLY FOR MUMPS
Pivot tolerance for the linear solver MUMPS. A smaller number pivots for sparsity, a larger number pivots for stability. This option is only available if Ipopt has been compiled with MUMPS.
- mehrotra_algorithm: default “no”
Indicates whether to use Mehrotra’s algorithm. If set to yes, Ipopt runs as Mehrotra’s predictor-corrector algorithm. This usually works very well for LPs and convex QPs. It automatically disables the line search, chooses the (unglobalized) adaptive mu strategy with the “probing” oracle, and uses “corrector_type=affine” without any safeguards; you should not set any of those options explicitly in addition. Also, unless otherwise specified, the values of “bound_push”, “bound_frac”, and “bound_mult_init_val” are set more aggressively, and “alpha_for_y=bound_mult” is set. Possible values:
  - no [Do the usual Ipopt algorithm.]
  - yes [Do Mehrotra’s predictor-corrector algorithm.]
- fast_step_computation: default “no”
Indicates if the linear system should be solved quickly. If set to yes, the algorithm assumes that the linear system that is solved to obtain the search direction is solved sufficiently well. In that case, no residuals are computed, and the computation of the search direction is a little faster. Possible values:
  - no [Verify solution of linear system by computing residuals.]
  - yes [Trust that linear systems are solved well.]
maxiter (int | str | None) – Approximate number of iterations. This corresponds to roughly the maximum number of evaluations of the value, gradient or hessian.
criterion (ConvergenceCriterion | None) – Criterion of the minimum. This is an estimated measure for the distance to the minimum and can include the relative or absolute changes of the parameters, function value, gradients and more. If the value of the criterion is smaller than loss.errordef * tol, the algorithm stops and it is assumed that the minimum has been found.
strategy (ZfitStrategy | None) – A class of type ZfitStrategy that takes no input arguments in the init. Determines the behavior of the minimizer in certain situations, most notably when encountering NaNs. It can also implement a callback function.
name (str | None) – Human-readable name of the minimizer.
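The alpha_red_factor and accept_after_max_steps options listed above both act on the backtracking line search. The following is a minimal pure-Python sketch of that idea, not Ipopt's implementation: Ipopt uses a filter line search, whereas this sketch swaps in the simpler Armijo sufficient-decrease test. The function name and the c1 and min_alpha arguments are assumptions made for illustration, as is the reading that accept_after_max_steps counts step reductions.

```python
def backtracking_line_search(f, x, grad, direction, alpha_red_factor=0.5,
                             accept_after_max_steps=-1, c1=1e-4,
                             min_alpha=1e-12):
    """Shrink the trial step until f decreases enough (Armijo condition).

    Returns (accepted step size, number of step reductions).
    c1 and min_alpha are illustrative, they are not Ipopt options.
    """
    fx = f(x)
    # Directional derivative of f along the search direction.
    slope = sum(g * d for g, d in zip(grad, direction))
    alpha, n_reductions = 1.0, 0
    while alpha >= min_alpha:
        trial = [xi + alpha * di for xi, di in zip(x, direction)]
        if f(trial) <= fx + c1 * alpha * slope:
            return alpha, n_reductions  # sufficient decrease reached
        if 0 <= accept_after_max_steps <= n_reductions:
            return alpha, n_reductions  # forced acceptance, -1 disables this
        alpha *= alpha_red_factor  # reduce the trial step by the factor
        n_reductions += 1
    raise RuntimeError("no acceptable step found")


def sum_of_squares(v):
    return sum(vi ** 2 for vi in v)


# Steepest-descent direction for f(x) = x^2 at x = 1.
step, reductions = backtracking_line_search(
    sum_of_squares, x=[1.0], grad=[2.0], direction=[-2.0]
)
```

A smaller alpha_red_factor shrinks the step faster and needs fewer trial evaluations, at the price of possibly accepting a shorter step than necessary.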
- create_criterion(loss=None, params=None)
Create a criterion instance for the given loss and parameters.
- Parameters
loss (ZfitLoss | None) – Loss that is used for the criterion. Can be None if called inside _minimize.
params (ztyping.ParametersType | None) – Parameters that will be associated with the loss in this order. Can be None if called within _minimize.
- Return type
ConvergenceCriterion
- Returns
ConvergenceCriterion to check if the function converged.
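The stopping rule that the criterion feeds into can be written down in a few lines. This is a sketch of the comparison described for the criterion and tol parameters, not zfit's ConvergenceCriterion class; the function name has_converged is hypothetical, and the default tol of 1e-3 is the one stated for this minimizer.

```python
def has_converged(criterion_value, errordef, tol=1e-3):
    """Return True once the criterion value falls below errordef * tol."""
    return criterion_value < errordef * tol


# With errordef = 0.5 (the example value used on this page) the
# threshold is 0.5 * 1e-3 = 5e-4.
below = has_converged(4e-4, errordef=0.5)
above = has_converged(6e-4, errordef=0.5)
```

Scaling the tolerance by errordef keeps the same tol meaningful for losses with different conventions for a one-sigma change.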
- create_evaluator(loss=None, params=None, strategy=None)
Make a loss evaluator using the strategy and more from the minimizer.
Convenience factory for the loss evaluator. This wraps the loss to return a numpy array, to catch NaNs, stop on maxiter and evaluate the gradient and hessian without the need to specify the order every time.
- Parameters
loss (ZfitLoss | None) – Loss to be wrapped. Can be None if called inside _minimize.
params (ztyping.ParametersType | None) – Parameters that will be associated with the loss in this order. Can be None if called within _minimize.
strategy (ZfitStrategy | None) – Instance of a Strategy that will be used during the evaluation.
- Returns
The evaluator that wraps the Loss and Strategy with the current parameters.
- Return type
LossEval
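To make the role of such an evaluator concrete, here is a toy stand-in that counts evaluations, stops once a budget is used up, and hands NaN values to a callback. It is a sketch under stated assumptions and not zfit's LossEval: the names SimpleLossEval, MaxIterReached and on_nan are hypothetical, and it returns a plain float instead of a numpy array.

```python
import math


class MaxIterReached(RuntimeError):
    """Raised when the evaluation budget is exhausted (hypothetical name)."""


class SimpleLossEval:
    """Toy loss wrapper; illustrates the idea only, not zfit's LossEval."""

    def __init__(self, func, maxiter, on_nan=None):
        self.func = func
        self.maxiter = maxiter
        self.on_nan = on_nan  # plays the role of a strategy's NaN handling
        self.n_eval = 0

    def value(self, x):
        if self.n_eval >= self.maxiter:
            raise MaxIterReached(f"more than {self.maxiter} evaluations")
        self.n_eval += 1
        val = float(self.func(x))
        if math.isnan(val):
            if self.on_nan is None:
                raise ValueError(f"loss is NaN at {x!r}")
            val = float(self.on_nan(x))  # e.g. replace NaN by a large penalty
        return val


evaluator = SimpleLossEval(lambda x: sum(v * v for v in x), maxiter=2)
first = evaluator.value([1.0, 2.0])
```

Keeping the counter and the NaN handling in the wrapper means the minimizer itself only ever sees a plain function of the parameter values.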
- minimize(loss, params=None, init=None)
Fully minimize the loss with respect to params, optionally using information from init.
The minimizer changes the parameter values in order to minimize the loss function until the convergence criterion value is less than the tolerance. This is a stateless function that can take a FitResult in order to initialize the minimization.
- Parameters
loss (ZfitLoss | Callable) – Loss to be minimized until convergence is reached. Usually a ZfitLoss.
- The loss can also be a simple callable that takes an array as argument and has an attribute errordef. It can be any arbitrary function, for example
    def loss(x):
        return -x ** 2

    loss.errordef = 0.5  # as an example
    minimizer.minimize(loss, [2, 5])
  If TensorFlow is not used inside the function, make sure to set zfit.run.set_graph_mode(False) and zfit.run.set_autograd_mode(False).
- A FitResult can be provided as the only argument to the method, in which case the loss as well as the parameters to be minimized are taken from it. This makes it easy to chain minimization algorithms.
params (ztyping.ParamsTypeOpt | None) –
The parameters with respect to which to minimize the loss. If None, the parameters will be taken from the loss.
In order to fix the parameter values to a specific value (and thereby make them independent of their current value), a dictionary mapping a parameter to a value can be given.
If loss is a callable, params can also be (instead of Parameters):
- an array of initial values
- for more control, a dict with the keys:
  - value (required): array-like initial values.
  - name: list of unique names of the parameters.
  - lower: array-like lower limits of the parameters.
  - upper: array-like upper limits of the parameters.
  - step_size: array-like initial step size of the parameters (approximately the expected uncertainty).
This will internally create a single parameter for each value, which can be accessed in the FitResult via params. Repeated calls can therefore (in the current implementation) cause a memory increase. The recommended way is to re-use parameters (taken from the FitResult attribute params).
init (ZfitResult | None) –
A result of a previous minimization that provides auxiliary information such as the starting point for the parameters, the approximation of the covariance and more. Which information is used can depend on the specific minimizer implementation.
In general, the assumption is that the loss provided is similar enough to the one provided in init.
What is assumed to be close:
- the parameters at the minimum of loss will be close to the parameter values at the minimum of init.
- Covariance matrix, or in general the shape, of init to the loss at its minimum.
What is explicitly not assumed to be the same:
- absolute value of the loss function. If init has a function value at minimum x of fmin, it is not assumed that loss will have the same/similar value at x.
- parameters that are used in the minimization may differ in order or which are fixed.
- Return type
- Returns
The fit result containing all information about the minimization.
Examples
Restarting a minimization from a previous result makes it possible to combine a more global search algorithm with a high tolerance and an additional local minimization that polishes the minimum found.
    result_approx = minimizer_global.minimize(loss, params)
    result = minimizer_local.minimize(result_approx)
For a simple usage with a callable only, the parameters can be given as an array of initial values.
    import numpy as np

    def func(x):
        return np.log(np.sum(x ** 2))

    func.errordef = 0.5
    params = [1.1, 3.5, 8.35]  # initial values
    result = minimizer.minimize(func, params)
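For more control over a callable loss, the dict form of params described under Parameters can be used instead of a bare array. The sketch below builds such a dict with the documented keys and checks it for consistency before use. The helper check_params_dict is hypothetical and not part of zfit, and the names, limits and step sizes are made-up illustration values.

```python
def check_params_dict(params):
    """Validate the dict form of params (hypothetical helper, not in zfit)."""
    values = list(params["value"])  # "value" is the only required key
    n = len(values)
    for key in ("name", "lower", "upper", "step_size"):
        if key in params and len(params[key]) != n:
            raise ValueError(f"'{key}' needs {n} entries, got {len(params[key])}")
    if "name" in params and len(set(params["name"])) != n:
        raise ValueError("parameter names must be unique")
    lower = params.get("lower", [float("-inf")] * n)
    upper = params.get("upper", [float("inf")] * n)
    for lo, val, up in zip(lower, values, upper):
        if not lo <= val <= up:
            raise ValueError(f"initial value {val} outside [{lo}, {up}]")
    return n


params = {
    "value": [1.1, 3.5, 8.35],      # initial values (required)
    "name": ["a", "b", "c"],        # unique names
    "lower": [0.0, 0.0, 0.0],       # lower limits
    "upper": [10.0, 10.0, 10.0],    # upper limits
    "step_size": [0.1, 0.1, 0.1],   # roughly the expected uncertainty
}
n_params = check_params_dict(params)
```

The validated dict would then take the place of the array in the call above, that is minimizer.minimize(func, params).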