This tutorial is about handling multiple dimensions when creating a custom PDF.
The differences from the one-dimensional case are marginal since the ordering is handled automatically. It is, on the other hand, crucial to understand the concept of a Space, most notably the distinction between observables (obs) and axes.
A user (someone who instantiates the PDF) only knows and handles observables. The relative order does not matter: if a dataset has observables a and b and a pdf has observables b and a, the data will be reordered automatically. Inside a PDF, on the other hand, we do not care about observables at all but only about the ordering of the data, that is, the axes. So any data tensor, and any limits for integration, normalization etc., inside the PDF is order based and uses axes.
When passing the observables to the init of the PDF (as a user), each observable is automatically assigned to an axis corresponding to the order of the observables. The crucial point is therefore to communicate to the user which axis corresponds to what. The naming of the observables is completely up to the user, but the order of the observables depends on the pdf. Therefore, the correspondence of each axis to its meaning has to be stated in the docs.
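The automatic reordering described above can be pictured with a small plain-Python sketch. It does not use zfit; the helper `reorder_to_axes` and the sample values are hypothetical and only illustrate how columns labelled by observables end up in the axis order that a PDF expects.

```python
# Plain-Python sketch of obs-based reordering (not zfit's implementation).
# `reorder_to_axes` is a hypothetical helper introduced for illustration.

def reorder_to_axes(rows, data_obs, pdf_obs):
    """Return rows with their columns arranged in the order given by pdf_obs."""
    if set(data_obs) != set(pdf_obs):
        raise ValueError("data and pdf must have the same observables")
    # Axis i of the pdf corresponds to pdf_obs[i]; look up that column in the data.
    column_of = {name: index for index, name in enumerate(data_obs)}
    order = [column_of[name] for name in pdf_obs]
    return [tuple(row[i] for i in order) for row in rows]


data_obs = ("a", "b")  # order in which the user stored the data
pdf_obs = ("b", "a")   # order in which the user passed obs to the pdf
rows = [(1.0, 10.0), (2.0, 20.0)]

reordered = reorder_to_axes(rows, data_obs, pdf_obs)
print(reordered)  # axis 0 now holds "b", axis 1 holds "a"
```

Inside the PDF only the position matters: axis 0 is whatever observable the user listed first.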
import numpy as np
import zfit
from zfit import z
Axes, not obs
Since we create a pdf here, we can now completely forget about observables. We can assume that all the data is axes based (order based). We simply need to write down what each axis means.
An example pdf is implemented below. It calculates the length of a vector shifted by some number (dummy example).
class AbsVectorShifted(zfit.pdf.ZPDF):
    _N_OBS = 3  # dimension, can be omitted
    _PARAMS = ['xshift', 'yshift']  # the name of the parameters

    def _unnormalized_pdf(self, x):
        x1, x2, x3 = x.unstack_x()  # returns a list with the columns: do x1, x2, x3 = z.unstack_x(x) for 3D
        xshift = self.params['xshift']
        yshift = self.params['yshift']
        x1 = x1 + xshift
        x2 = x2 + yshift
        return z.sqrt(z.square(x1) + z.square(x2) + z.square(x3))  # dummy calculations
Done. Now we can use our pdf already!
xobs = zfit.Space('xobs', (-3, 3))
yobs = zfit.Space('yobs', (-2, 2))
zobs = zfit.Space('z', (-1, 1))
obs = xobs * yobs * zobs

data_np = np.random.random(size=(1000, 3))
data = zfit.data.Data.from_numpy(array=data_np, obs=obs)  # obs is automatically used as limits here.
Create two parameters and an instance of your own pdf.
xshift = zfit.Parameter("xshift", 1.)
yshift = zfit.Parameter("yshift", 2.)
abs_vector = AbsVectorShifted(obs=obs, xshift=xshift, yshift=yshift)
probs = abs_vector.pdf(data)
Estimated integral error ( 9.25696860849032e-05 ) larger than tolerance ( 3e-06 ), which is maybe not enough (but maybe it's also fine). You can (best solution) implement an anatytical integral (see examples in repo) or manually set a higher number on the PDF with 'update_integration_options' and increase the 'max_draws' (or adjust 'tol'). If partial integration is chosen, this can lead to large memory consumption.This is a new warning checking the integral accuracy. It may warns too often as it is Work In Progress. If you have any observation on it, please tell us about it: https://github.com/zfit/zfit/issues/new/chooseTo suppress this warning, use zfit.settings.set_verbosity(-1).
probs_np = zfit.run(probs)
print(probs_np[:20])
[0.02056515 0.02217449 0.0226145 0.01880655 0.02133733 0.02022859 0.02424278 0.02254634 0.02114786 0.02579556 0.0165406 0.02339943 0.02067344 0.0179118 0.02192312 0.023857 0.02131328 0.02174356 0.01799521 0.0228408 ]
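The values returned by pdf are the output of _unnormalized_pdf divided by its integral over the limits of obs. As a rough cross-check that does not rely on zfit, the following sketch evaluates the same dummy formula in plain Python and normalizes it with a simple midpoint rule. The helper names, the grid size and the test point are assumptions made for this illustration; zfit uses its own numerical integration.

```python
import math

XSHIFT, YSHIFT = 1.0, 2.0                         # parameter values used above
LIMITS = ((-3.0, 3.0), (-2.0, 2.0), (-1.0, 1.0))  # limits of xobs, yobs, z


def unnormalized(x1, x2, x3):
    """Same dummy formula as AbsVectorShifted._unnormalized_pdf."""
    return math.sqrt((x1 + XSHIFT) ** 2 + (x2 + YSHIFT) ** 2 + x3 ** 2)


def midpoint_integral(func, limits, n=20):
    """Integrate func over a 3D box using n midpoints per axis."""
    grids, cell_volume = [], 1.0
    for lower, upper in limits:
        step = (upper - lower) / n
        grids.append([lower + (i + 0.5) * step for i in range(n)])
        cell_volume *= step
    total = sum(func(a, b, c) for a in grids[0] for b in grids[1] for c in grids[2])
    return total * cell_volume


norm = midpoint_integral(unnormalized, LIMITS)  # numerical normalization
prob = unnormalized(0.5, 0.5, 0.5) / norm       # normalized value at one point
print(norm, prob)
```

Since the random data lie in [0, 1) on every axis, the value for the test point should be of the same order as the probabilities printed above.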
We could improve our PDF by registering an integral. This requires a few steps:
define our integral as a function in python
define in which space our integral is valid, e.g. whether it is an integral over all axes or only a partial one, and whether any limit is valid or only special ones (e.g. from -inf to inf)
register the integral and say if it supports additional things (e.g. norm_range)
Let’s start defining the function. This takes, for an integral over all axes, three parameters:
limits: the actual limits the integral is over
params: the parameters of the model (which may be needed)
model: the model (pdf/func) itself
We need to calculate the integral and return (currently) a scalar.
def abs_vector_integral_from_any_to_any(limits, params, model):
    lower, upper = limits.limits
    # write your integral here
    return 42.  # dummy integral, must be a scalar!
Now let’s define the limits. We want to allow an integral over the whole space in three dimensions. This may look cumbersome but is straightforward (and done only once):
limits_to_integrate = (((zfit.Space.ANY_LOWER, zfit.Space.ANY_LOWER, zfit.Space.ANY_LOWER),),
                       ((zfit.Space.ANY_UPPER, zfit.Space.ANY_UPPER, zfit.Space.ANY_UPPER),))
Now we need the axes we will integrate over.
axes_to_integrate = (0, 1, 2) # implies this is over all axes of the pdf
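To see how this nesting is read, here is a sketch with concrete numbers in place of the ANY_LOWER and ANY_UPPER placeholders. It uses plain tuples only; the values are the limits of the observables defined earlier, and the variable names are chosen for this illustration.

```python
# Same nesting as limits_to_integrate: (lowers, uppers), each holding one
# tuple per limit, and each such tuple holding one value per axis.
concrete_limits = (((-3.0, -2.0, -1.0),), ((3.0, 2.0, 1.0),))
axes = (0, 1, 2)

lowers, uppers = concrete_limits
lower, upper = lowers[0], uppers[0]  # there is exactly one limit here

# Width of the integration range along each axis.
widths = {axis: upper[axis] - lower[axis] for axis in axes}
volume = 1.0
for axis in axes:
    volume *= widths[axis]

print(widths)  # {0: 6.0, 1: 4.0, 2: 2.0}
print(volume)  # 48.0
```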
Now we create our space and register the integral. In order to change the precedence of integrals (e.g. because some are very simple and return a single number, so these special cases should be considered first), a priority argument can be given. Also, if the integral supports multiple limits or norm range calculation, this can be specified here. Otherwise, this is handled automatically and the integral never receives multiple limits or a norm range (that’s why we don’t have them in the API of the integral function).
limits = zfit.Space(axes=axes_to_integrate, limits=limits_to_integrate)
AbsVectorShifted.register_analytic_integral(func=abs_vector_integral_from_any_to_any, limits=limits,
                                            priority=51,
                                            supports_norm_range=False,  # False by default, but could be set to
                                            supports_multiple_limits=False)  # True. False -> autohandled
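How a priority could decide between several registered integrals can be sketched as follows. This is not zfit's internal logic: `IntegralRegistry`, its methods and the rule that the larger number wins are assumptions made purely for illustration.

```python
# Hypothetical registry; names and behavior are assumptions for illustration.
class IntegralRegistry:
    def __init__(self):
        self._entries = []  # list of (priority, func) pairs

    def register(self, func, priority=50):
        self._entries.append((priority, func))

    def best(self):
        """Return the registered function with the highest priority."""
        if not self._entries:
            raise LookupError("no integral registered")
        return max(self._entries, key=lambda entry: entry[0])[1]


def general_integral(limits):
    lower, upper = limits
    return upper - lower  # placeholder calculation


def constant_integral(limits):
    return 42.0  # simple special case that should be tried first


registry = IntegralRegistry()
registry.register(general_integral, priority=50)
registry.register(constant_integral, priority=51)

chosen = registry.best()
print(chosen.__name__)  # constant_integral
```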
Advanced Custom PDF
Subclass BasePDF. The _unnormalized_pdf method has to be overridden and, in addition, the __init__.
Any of the public main methods (pdf, integrate, partial_integrate etc.) can always be overridden by implementing the function with a leading underscore, e.g. implement _pdf to directly control pdf. Raising a NotImplementedError will restore the default behavior.
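The fallback via NotImplementedError can be pictured with the following sketch. The classes are hypothetical and are not zfit's BasePDF; they only show the general pattern of a public method that tries the underscore version first and falls back to a default path.

```python
# Hypothetical classes illustrating the override-with-fallback pattern.
class SketchBase:
    def pdf(self, x):
        """Public method: try the user's _pdf, otherwise use the default path."""
        try:
            self.last_path = "override"
            return self._pdf(x)
        except NotImplementedError:
            self.last_path = "default"
            return self._unnormalized_pdf(x) / self._norm()

    def _pdf(self, x):
        raise NotImplementedError  # nothing overridden by default


class AbsSketch(SketchBase):
    def _unnormalized_pdf(self, x):
        return abs(x)

    def _norm(self):
        return 1.0  # integral of |x| over (-1, 1)

    def _pdf(self, x):
        if x < 0:
            # Decide at run time that the shortcut does not apply.
            raise NotImplementedError
        return x  # direct, already normalized result for x >= 0


model = AbsSketch()
print(model.pdf(0.5), model.last_path)   # 0.5 override
print(model.pdf(-0.5), model.last_path)  # 0.5 default
```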