Toy studies

Having a model, it can be convenient to do sensitivity studies and checks of the fit by doing a “toy study”: sampling from the model and fitting to the generated sample. The fitted values and the spread characterize whether the fit is biased or not. The difference to the “actual” value divided by the uncertainty (the pulls) should follow a standard Gaussian distribution

import matplotlib.pyplot as plt
import numpy as np
import progressbar
import tensorflow as tf
import zfit
from zfit import z
/home/jonas/Documents/physics/software/zfit_project/zfit_repo/zfit/__init__.py:35: UserWarning: TensorFlow warnings are by default suppressed by zfit. In order to show them, set the environment variable ZFIT_DISABLE_TF_WARNINGS=0. In order to suppress the TensorFlow warnings AND this warning, set ZFIT_DISABLE_TF_WARNINGS=1.
  warnings.warn("TensorFlow warnings are by default suppressed by zfit."

We gonna build a simple model, just a Gaussian. But, given the well defined workflow of zfit, model can be exchanged by any complicated composition or custom model.

obs = zfit.Space('x', (-5, 5))

sigma = zfit.Parameter('sigma', 1, 0.1, 10)
mu = zfit.Parameter('mu', 0, -1, 1)
model = zfit.pdf.Gauss(obs=obs, mu=mu, sigma=sigma)

Instead of using sample as before, we will first build our loss with a more efficient Data, a “sampler”, created by create_sampler. This has like sample the arguments for limits and the number of samples, but also supports fixed_params, which is true by default. This means that whenever this object is resampled, it will be resampled with the parameter values that it had when we created the sampler.

sampler = model.create_sampler(n=3000, fixed_params=True)

So far, no sampling happened yet. But first, we build our whole chain, just using our sampler as data.

nll = zfit.loss.UnbinnedNLL(model, sampler)

from zfit.minimize import \
    DefaultToyStrategy  # this stategy does not raise an error with NaNs but returns a non-converged `FitResult`

minimizer = zfit.minimize.Minuit(strategy=DefaultToyStrategy(), verbosity=0, tol=1e-3, use_minuit_grad=True)
/home/jonas/Documents/physics/software/zfit_project/zfit_repo/zfit/minimizers/baseminimizer.py:200: ChangedFeatureWarning: The behavior of this functionality recently changed.To turn this warning off, use `zfit.settings.changed_warnings.strategies_in_minimizers.` = False`  or 'all' with `zfit.settings.changed_warnings.all = False
A strategy should now be a class, not an instance. The minimizer will at the beginning of the minimization create an instance that can be stateful during the minimization and will be stored in the FitResult.
  warn_changed_feature(message="A strategy should now be a class, not an instance. The minimizer will"
fit_results = []
ntoys = 20
params = nll.get_params()

with progressbar.ProgressBar(max_value=ntoys) as bar:

    while len(fit_results) < ntoys:

        # Generate toys
        sampler.resample()  # this is where the sampling happens

        # Randomise initial values. They can put the pdf in an unphysical region, making it negative at points.
        # This will produce NaNs in the log of the NLL. Therefore, we randomize until we got no NaNs anymore.
        for param in params:
            param.randomize()  # or smarter, use `set_value` for your own method

# The following can be used if the loss may returns NaNs, to test. Repeat in a while loop until it matches
#            try:
#                is_nan = np.isnan(zfit.run(nll.value()))
#            except tf.errors.InvalidArgumentError:  # NaNs produced, check_numerics raises this error
#                # print("nan error, try again")  # try again
#                is_nan = True
#            else:
#                break

        # Minimise the NLL
        result = minimizer.minimize(nll)

        if result.converged:
            # Calculate uncertainties
            result.hesse()
            fit_results.append(result)
            bar.update(len(fit_results))
100% (20 of 20) |########################| Elapsed Time: 0:00:00 Time:  0:00:00
print(fit_results[:10])
[<zfit.minimizers.fitresult.FitResult object at 0x7fcfb4724070>, <zfit.minimizers.fitresult.FitResult object at 0x7fcfb47249a0>, <zfit.minimizers.fitresult.FitResult object at 0x7fcfb4724430>, <zfit.minimizers.fitresult.FitResult object at 0x7fcfb4724940>, <zfit.minimizers.fitresult.FitResult object at 0x7fcfb4717070>, <zfit.minimizers.fitresult.FitResult object at 0x7fcfb46c2340>, <zfit.minimizers.fitresult.FitResult object at 0x7fcfb46b9d00>, <zfit.minimizers.fitresult.FitResult object at 0x7fcfb472f220>, <zfit.minimizers.fitresult.FitResult object at 0x7fcfb472f190>, <zfit.minimizers.fitresult.FitResult object at 0x7fcfb470c6d0>]

Evaluate results

From here on, we can use the fit_results to compare against the true value, make plots, etc.