{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Multidimensional PDFs\n",
"\n",
"This tutorial is about handling multiple dimensions when creating a custom PDF.\n",
"\n",
"The differences are marginal since the ordering is handled automatically. It is on the other hand crucial to understand the concept of a `Space`, most notably `obs` and `axes`.\n",
"\n",
"A user (1someone who instantiates the PDF) only knows and handles observables. The relative order does not matter, if a data has observables a and b and a pdf has observables b and a, the data will be reordered automatically. Inside a PDF on the other hand, we do not care at all about observables but only about the ordering of the data, the *axis*. So any data tensor we have, and limits for integration, normalization etc. **inside** the PDF is order based and uses *axes*.\n",
"\n",
"When passing the observables to the init of the PDF (as a user), each observable is automatically assigned to an axis corresponding to the order of the observable. The crucial point is therefore to communicate to the user which *axis* corresponds to what. The naming of the observables is completely up to the user, but the order of the observables depends on the pdf. Therefore, the correspondance of each axis to it's meaning has to be stated in the docs."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import zfit\n",
"import zfit.z.numpy as znp\n",
"from zfit import z"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Axes, not obs\n",
"\n",
"Since we create a pdf here, we now can completely forget about observables. We can assume that all the data is axes based (order based).We simply need to write down what each axis means.\n",
"\n",
"An example pdf is implemented below. It calculates the lenght of a vector shifted by some number (dummy example)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class AbsVectorShifted(zfit.pdf.ZPDF):\n",
" _N_OBS = 3 # dimension, can be omitted\n",
" _PARAMS = ['xshift', 'yshift'] # the name of the parameters\n",
"\n",
" @zfit.supports()\n",
" def _unnormalized_pdf(self, x, params):\n",
" x0 = x[0]\n",
" x1 = x[1]\n",
" x2 = x[2]\n",
" # alternatively, we could use the following line to get the same result\n",
" # x0, x1, x2 = z.unstack_x(x) # returns a list with the columns: do x1, x2, x3 = z.unstack_x(x) for 3D\n",
" xshift = params['xshift']\n",
" yshift = params['yshift']\n",
" x0 = x0 + xshift\n",
" x1 = x1 + yshift\n",
" return znp.sqrt(znp.square(x0) + x1 ** 2 + znp.power(x2, 2)) # dummy calculations, all are equivalent"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Done. Now we can use our pdf already!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"xobs = zfit.Space('xobs', (-3, 3))\n",
"yobs = zfit.Space('yobs', (-2, 2))\n",
"zobs = zfit.Space('z', (-1, 1))\n",
"obs = xobs * yobs * zobs\n",
"\n",
"data_np = np.random.random(size=(1000, 3))\n",
"data = zfit.data.Data.from_numpy(array=data_np, obs=obs) # obs is automatically used as limits here."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create two parameters and an instance of your own pdf"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"xshift = zfit.Parameter(\"xshift\", 1.)\n",
"yshift = zfit.Parameter(\"yshift\", 2.)\n",
"abs_vector = AbsVectorShifted(obs=obs, xshift=xshift, yshift=yshift)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"probs = abs_vector.pdf(data)\n",
"print(probs[:20])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We could improve our PDF by registering an integral. This requires a few steps:\n",
" - define our integral as a function in python\n",
" - define in which space our integral is valid, e.g. whether it is an integral over all axis or only partial and whether any limit is valid or only special (e.g. from -inf to inf)\n",
" - register the integral and say if it supports additional things (e.g. norm_range)\n",
"\n",
"Let's start defining the function. This takes, for an integral over all axes, three parameters:\n",
" - limits: the actual limits the integral is over\n",
" - params: the parameters of the model (which _may_ be needed)\n",
" - model: the model (pdf/func) itself\n",
"\n",
"we need to calculate the integral and return (currently) a scalar."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def abs_vector_integral_from_any_to_any(limits, params, model):\n",
" lower, upper = limits.v1.limits\n",
" # write your integral here\n",
" return 42. # dummy integral, must be a scalar!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's define the limits. We want to allow an integral over whole space in three dims, this may looks cumbersome but is straightforward (and done only once):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"limit0 = zfit.Space(axes=0, lower=zfit.Space.ANY_LOWER, upper=zfit.Space.ANY_UPPER)\n",
"limit1 = zfit.Space(axes=1, lower=zfit.Space.ANY_LOWER, upper=zfit.Space.ANY_UPPER)\n",
"limit2 = zfit.Space(axes=2, lower=zfit.Space.ANY_LOWER, upper=zfit.Space.ANY_UPPER)\n",
"limits = limit0 * limit1 * limit2 # creates the 3D limits\n",
"print(limits)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we create our space and register the integral. In order to change precedency of integrals (e.g. because some are very simple and return a single number, so this special cases should be regarded first), a priority argument can be given. Also if the integral supports multiple limits or norm range calculation, this can be specified here. Otherwise, this is automatically handled and the integral never gets multiple limits resp a norm range (that's why we don't have it in the API of the integral function)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"AbsVectorShifted.register_analytic_integral(func=abs_vector_integral_from_any_to_any, limits=limits,\n",
" priority=51,\n",
" supports_norm_range=False, # False by default, but could be set to\n",
" supports_multiple_limits=False) # True. False -> autohandled"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### n-dimensional"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Advanced Custom PDF\n",
"\n",
"Subclass BasePDF. The `_unnormalized_pdf` has to be overriden and, in addition, the `__init__`.\n",
"\n",
"Any of the public main methods (`pdf`, `integrate`, `partial_integrate` etc.) can **always** be overriden by implementing the function with a leading underscore, e.g. implement `_pdf` to directly controls `pdf`, the API is the same as the public function without the name. In case, during execution of your own method, it is found to be a bad idea to have overridden the default methods, throwing a `NotImplementedError` will restore the default behavior."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# TOBEDONE"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
}
},
"nbformat": 4,
"nbformat_minor": 4
}