Data

class zfit.data.Data(dataset, obs=None, name=None, weights=None, iterator_feed_dict=None, dtype=None)[source]

Bases: zfit.util.cache.GraphCachable, zfit.core.interfaces.ZfitUnbinnedData, zfit.core.dimension.BaseDimensional, zfit.core.baseobject.BaseObject, zfit.core.tensorlike.OverloadableMixin

Create a data holder from a dataset used to feed into models.

Parameters
  • dataset (Union[DatasetV2, LightDataset]) – A dataset storing the actual values

  • obs (Union[str, Iterable[str], Space, None]) – Observables in which the data is defined

  • name (Optional[str]) – Name of the Data

  • iterator_feed_dict (Optional[Dict]) –

  • dtype (Optional[DType]) – The DType of the return value. Defaults to the zfit default (usually float64).

set_weights(weights)[source]

Set (temporarily) the weights of the dataset.

Parameters

weights (Union[Tensor, None, ndarray]) –

classmethod from_root(path, treepath, branches=None, branches_alias=None, weights=None, name=None, dtype=None, root_dir_options=None)[source]

Create a Data from a ROOT file. The arguments are passed directly to uproot.

Parameters
  • path (str) –

  • treepath (str) –

  • branches (Optional[List[str]]) –

  • branches_alias (Optional[Dict]) – A mapping from the branches (as keys) to the actual observables (as values). This allows the observable names to differ from the branch names in the file.

  • weights (Union[Tensor, None, ndarray, str]) – Weights of the data. Has to be 1-D and match the shape of the data (nevents). Can be a column of the ROOT file by using a string corresponding to a column.

  • name (Optional[str]) –

  • root_dir_options

Returns

Return type

zfit.Data
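The branches_alias mapping described above can be pictured with a small standalone sketch. It does not call zfit or uproot. The branch names, the helper resolve_obs, and the fallback to the branch name when no alias is given are assumptions made for illustration, not the library's implementation.

```python
# Hypothetical branch names as they might appear in a ROOT tree.
branches = ["B_M", "B_PT"]

# Mapping from branch name (key) to observable name (value),
# in the form described for `branches_alias`.
branches_alias = {"B_M": "mass", "B_PT": "pt"}


def resolve_obs(branches, branches_alias=None):
    """Return one observable name per branch.

    Assumption: a branch without an alias keeps its own name.
    """
    branches_alias = branches_alias or {}
    return [branches_alias.get(branch, branch) for branch in branches]


obs = resolve_obs(branches, branches_alias)
print(obs)  # ['mass', 'pt']
```

With such a mapping, the observables seen by a model can stay stable even if the branch names differ between files.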

classmethod from_pandas(df, obs=None, weights=None, name=None, dtype=None)[source]

Create a Data from a pandas DataFrame. If obs is None, the columns of the DataFrame are used as obs.

Parameters
  • df (DataFrame) –

  • weights (Union[Tensor, None, ndarray]) – Weights of the data. Has to be 1-D and match the shape of the data (nevents).

  • obs (Union[str, Iterable[str], Space, None]) –

  • name (Optional[str]) –
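The requirement that weights be 1-D and match the number of events can be checked before constructing a Data. The following is a minimal sketch in plain Python. The helper validate_weights and the sample values are hypothetical and are not part of zfit.

```python
def validate_weights(n_events, weights):
    """Return the weights as a flat list of floats, or None if none are given.

    Raises ValueError if the weights are nested (not 1-D) or if their
    length differs from the number of events.
    """
    if weights is None:
        return None
    weights = list(weights)
    # A nested entry (list, tuple, ...) means the weights are not 1-D.
    if any(hasattr(w, "__len__") for w in weights):
        raise ValueError("weights must be 1-D")
    if len(weights) != n_events:
        raise ValueError(f"expected {n_events} weights, got {len(weights)}")
    return [float(w) for w in weights]


# Three hypothetical events with one observable each.
events = [(5.2,), (5.3,), (5.1,)]
weights = validate_weights(len(events), [1.0, 0.5, 2.0])
print(weights)  # [1.0, 0.5, 2.0]
```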

classmethod from_numpy(obs, array, weights=None, name=None, dtype=None)[source]

Create Data from a np.array.

Parameters
  • obs (Union[str, Iterable[str], Space]) –

  • array (ndarray) –

  • name (Optional[str]) –

Return type

Data

classmethod from_tensor(obs, tensor, weights=None, name=None, dtype=None)[source]

Create a Data from a tf.Tensor. Value simply returns the tensor (in the right order).

Parameters
  • obs (Union[str, Iterable[str], Space]) –

  • tensor (Tensor) –

  • name (Optional[str]) –

Returns:

Return type

Data

to_pandas(obs=None)[source]

Create a pd.DataFrame from obs as columns and return it.

Parameters

obs (Union[str, Iterable[str], Space, None]) – The observables to use as columns. If None, all observables are used.

Returns

A pd.DataFrame with the selected observables as columns.

unstack_x(obs=None, always_list=False)[source]

Return the unstacked data: a list of tensors or a single Tensor.

Parameters
  • obs (Union[str, Iterable[str], Space, None]) – The observables to return

  • always_list (bool) – If True, always return a list, even if it contains only one element

Returns

List(tf.Tensor)
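The effect of obs and always_list can be illustrated without TensorFlow. The sketch below uses plain Python lists in place of tensors and handles only string observables, not Space objects. The function unstack_columns and the sample data are hypothetical, and the code mimics the documented behaviour rather than reproducing zfit's implementation.

```python
def unstack_columns(rows, obs_order, obs=None, always_list=False):
    """Split row-wise events into one column per requested observable.

    rows: list of events, each a tuple ordered like `obs_order`.
    obs: a name, a list of names, or None for all observables.
    """
    if obs is None:
        obs = obs_order
    elif isinstance(obs, str):
        obs = [obs]
    columns = [[row[obs_order.index(name)] for row in rows] for name in obs]
    # A single column is returned bare unless a list is explicitly requested.
    if len(columns) == 1 and not always_list:
        return columns[0]
    return columns


rows = [(5.2, 1.1), (5.3, 0.9)]
obs_order = ["mass", "pt"]

print(unstack_columns(rows, obs_order))                              # [[5.2, 5.3], [1.1, 0.9]]
print(unstack_columns(rows, obs_order, obs="pt"))                    # [1.1, 0.9]
print(unstack_columns(rows, obs_order, obs="pt", always_list=True))  # [[1.1, 0.9]]
```

Passing always_list=True is convenient when the caller unpacks or iterates over the result and should not have to special-case a single observable.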

add_cache_deps(cache_deps, allow_non_cachable=True)

Add dependencies that render the cache invalid if they change.

Parameters
  • cache_deps (Union[ForwardRef, Iterable[ForwardRef]]) –

  • allow_non_cachable (bool) – If True, allow cache_dependents to be non-cachables. If False, any cache_dependents that is not a ZfitCachable will raise an error.

Raises

TypeError – if one of the cache_dependents is not a ZfitCachable _and_ allow_non_cachable is False.

property name: str

The name of the object.

Return type

str

register_cacher(cacher)

Register a cacher that caches values produced by this instance; a dependent.

Parameters

cacher (Union[ForwardRef, Iterable[ForwardRef]]) –

reset_cache_self()

Clear the cache of self and all dependent cachers.
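The interplay of add_cache_deps, register_cacher and reset_cache_self can be pictured with a toy model. The class Cachable below is a hypothetical stand-in written for illustration. It follows the documented method names but is not zfit's GraphCachable, and it assumes that the dependencies form no cycles.

```python
class Cachable:
    """Toy object whose cache is cleared when one of its dependencies resets."""

    def __init__(self, name):
        self.name = name
        self._cache = {}
        self._cachers = []  # dependents that cache values derived from this object

    def register_cacher(self, cacher):
        """Register `cacher` as a dependent of this instance."""
        if cacher not in self._cachers:
            self._cachers.append(cacher)

    def add_cache_deps(self, cache_deps, allow_non_cachable=True):
        """Declare that this object's cache depends on `cache_deps`."""
        if not isinstance(cache_deps, (list, tuple)):
            cache_deps = [cache_deps]
        for dep in cache_deps:
            if isinstance(dep, Cachable):
                dep.register_cacher(self)
            elif not allow_non_cachable:
                raise TypeError(f"{dep!r} is not a Cachable")

    def reset_cache_self(self):
        """Clear the cache of this object and of all dependent cachers."""
        self._cache.clear()
        for cacher in self._cachers:
            cacher.reset_cache_self()


data = Cachable("data")
model = Cachable("model")
model.add_cache_deps(data)        # the model's cache now depends on the data
model._cache["integral"] = 1.0    # pretend the model cached a value
data.reset_cache_self()           # a change in the data invalidates dependents
print(model._cache)  # {}
```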