Toy studies

Having a model, it can be convenient to do sensitivity studies and checks of the fit by doing a “toy study”: sampling from the model and fitting to the generated sample. The fitted values and the spread characterize whether the fit is biased or not. The difference to the “actual” value divided by the uncertainty (the pulls) should follow a standard Gaussian distribution

import matplotlib.pyplot as plt
import numpy as np
import progressbar
import tensorflow as tf
import zfit
from zfit import z
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-e7acd80f0a61> in <module>
      1 import matplotlib.pyplot as plt
      2 import numpy as np
----> 3 import progressbar
      4 import tensorflow as tf
      5 import zfit

ModuleNotFoundError: No module named 'progressbar'

We gonna build a simple model, just a Gaussian. But, given the well defined workflow of zfit, model can be exchanged by any complicated composition or custom model.

obs = zfit.Space('x', (-5, 5))

sigma = zfit.Parameter('sigma', 1, 0.1, 10)
mu = zfit.Parameter('mu', 0, -1, 1)
model = zfit.pdf.Gauss(obs=obs, mu=mu, sigma=sigma)

Instead of using sample as before, we will first build our loss with a more efficient Data, a “sampler”, created by create_sampler. This has like sample the arguments for limits and the number of samples, but also supports fixed_params, which is true by default. This means that whenever this object is resampled, it will be resampled with the parameter values that it had when we created the sampler.

sampler = model.create_sampler(n=3000, fixed_params=True)

So far, no sampling happened yet. But first, we build our whole chain, just using our sampler as data.

nll = zfit.loss.UnbinnedNLL(model, sampler)

from zfit.minimize import \
    DefaultToyStrategy  # this stategy does not raise an error with NaNs but returns a non-converged `FitResult`

minimizer = zfit.minimize.Minuit(strategy=DefaultToyStrategy(), verbosity=0, tol=1e-3, use_minuit_grad=True)
/home/jonas/Documents/physics/software/zfit_project/zfit_repo/zfit/minimizers/baseminimizer.py:200: ChangedFeatureWarning: The behavior of this functionality recently changed.To turn this warning off, use `zfit.settings.changed_warnings.strategies_in_minimizers.` = False`  or 'all' with `zfit.settings.changed_warnings.all = False
A strategy should now be a class, not an instance. The minimizer will at the beginning of the minimization create an instance that can be stateful during the minimization and will be stored in the FitResult.
  warn_changed_feature(message="A strategy should now be a class, not an instance. The minimizer will"
fit_results = []
ntoys = 20
params = nll.get_params()

with progressbar.ProgressBar(max_value=ntoys) as bar:

    while len(fit_results) < ntoys:

        # Generate toys
        sampler.resample()  # this is where the sampling happens

        # Randomise initial values. They can put the pdf in an unphysical region, making it negative at points.
        # This will produce NaNs in the log of the NLL. Therefore, we randomize until we got no NaNs anymore.
        for param in params:
            param.randomize()  # or smarter, use `set_value` for your own method

# The following can be used if the loss may returns NaNs, to test. Repeat in a while loop until it matches
#            try:
#                is_nan = np.isnan(zfit.run(nll.value()))
#            except tf.errors.InvalidArgumentError:  # NaNs produced, check_numerics raises this error
#                # print("nan error, try again")  # try again
#                is_nan = True
#            else:
#                break

        # Minimise the NLL
        result = minimizer.minimize(nll)

        if result.converged:
            # Calculate uncertainties
            result.hesse()
            fit_results.append(result)
            bar.update(len(fit_results))
100% (20 of 20) |########################| Elapsed Time: 0:00:01 Time:  0:00:01
print(fit_results[:10])
[<zfit.minimizers.fitresult.FitResult object at 0x7f6af04313a0>, <zfit.minimizers.fitresult.FitResult object at 0x7f6af0431ee0>, <zfit.minimizers.fitresult.FitResult object at 0x7f6af0431a30>, <zfit.minimizers.fitresult.FitResult object at 0x7f6af0431190>, <zfit.minimizers.fitresult.FitResult object at 0x7f6af0431d30>, <zfit.minimizers.fitresult.FitResult object at 0x7f6af0451130>, <zfit.minimizers.fitresult.FitResult object at 0x7f6af0451070>, <zfit.minimizers.fitresult.FitResult object at 0x7f6af044acd0>, <zfit.minimizers.fitresult.FitResult object at 0x7f6af044adf0>, <zfit.minimizers.fitresult.FitResult object at 0x7f6af0482eb0>]

Evaluate results

From here on, we can use the fit_results to compare against the true value, make plots, etc.