data

class zfit.core.data.Data(dataset: Union[tensorflow.python.data.ops.dataset_ops.DatasetV2, LightDataset], obs: Union[str, Iterable[str], zfit.Space] = None, name: str = None, weights=None, iterator_feed_dict: Dict[KT, VT] = None, dtype: tensorflow.python.framework.dtypes.DType = None)[source]

Bases: zfit.util.cache.GraphCachable, zfit.core.interfaces.ZfitData, zfit.core.dimension.BaseDimensional, zfit.core.baseobject.BaseObject

Create a data holder from a dataset used to feed into models.

Parameters:
  • dataset – A dataset storing the actual values.
  • obs – Observables where the data is defined in.
  • name – Name of the Data.
  • weights –
  • iterator_feed_dict –
  • dtype –
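
A Data object is typically built with one of the from_* classmethods below rather than from a raw dataset. A minimal sketch of the usual workflow (observable names and parameter values are illustrative only):

    import numpy as np
    import zfit

    # observable space the data lives in
    obs = zfit.Space("x", limits=(-5, 5))

    # build the data holder from a numpy array of shape (nevents, n_obs)
    data = zfit.Data.from_numpy(obs=obs, array=np.random.normal(size=(1000, 1)))

    # feed it into a model/loss
    mu = zfit.Parameter("mu", 0.0)
    sigma = zfit.Parameter("sigma", 1.0, 0.1, 10.0)
    gauss = zfit.pdf.Gauss(mu=mu, sigma=sigma, obs=obs)
    nll = zfit.loss.UnbinnedNLL(model=gauss, data=data)
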
BATCH_SIZE = 1000000
add_cache_deps(cache_deps: Union[zfit.core.interfaces.ZfitCachable, Iterable[zfit.core.interfaces.ZfitCachable]], allow_non_cachable: bool = True)

Add dependencies that render the cache invalid if they change.

Parameters:
  • cache_deps (ZfitGraphCachable) –
  • allow_non_cachable (bool) – If True, allow cache_deps to be non-cachables. If False, any cache_deps that is not a ZfitCachable will raise an error.
Raises:

TypeError – if one of the cache_deps is not a ZfitCachable _and_ allow_non_cachable is False.

axes

Return the axes, the integer-based identifiers (indices) of the coordinate system.

convert_sort_space(obs: Union[str, Iterable[str], zfit.Space] = None, axes: Union[int, Iterable[int]] = None, limits: Union[zfit.core.interfaces.ZfitLimit, tensorflow.python.framework.ops.Tensor, numpy.ndarray, Iterable[float], float, Tuple[float], List[float], bool, None] = None) → Optional[zfit.core.space.Space][source]

Convert the inputs (using obs or axes, if given) to a Space and sort it according to this instance's obs.

Parameters:
  • obs –
  • axes –
  • limits –

Returns:

copy(deep: bool = False, name: str = None, **overwrite_params) → zfit.core.interfaces.ZfitObject
data_range
dtype
classmethod from_numpy(obs: Union[str, Iterable[str], zfit.Space], array: numpy.ndarray, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None)[source]

Create Data from a np.array.

Parameters:
  • obs (Union[str, Iterable[str], zfit.Space]) –
  • array (numpy.ndarray) –
  • name (str) –
Returns:

Return type:

zfit.Data
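
A short sketch of typical usage (observable names and shapes are illustrative); obs can be given as strings or as a zfit.Space:

    import numpy as np
    import zfit

    array = np.stack([np.random.normal(size=500),
                      np.random.uniform(0, 10, size=500)], axis=-1)   # shape (nevents, n_obs)
    weights = np.random.uniform(0.5, 1.5, size=500)                   # one weight per event
    data = zfit.Data.from_numpy(obs=["x", "y"], array=array, weights=weights)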

classmethod from_pandas(df: pandas.core.frame.DataFrame, obs: Union[str, Iterable[str], zfit.Space] = None, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None)[source]

Create a Data from a pandas DataFrame. If obs is None, columns are used as obs.

Parameters:
  • df (pandas.DataFrame) –
  • weights (tf.Tensor, np.ndarray, None) – Weights of the data. Must be 1-D and match the shape of the data (nevents).
  • obs (zfit.Space) –
  • name (str) –
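
A minimal sketch; here the DataFrame columns double as the observable names:

    import numpy as np
    import pandas as pd
    import zfit

    df = pd.DataFrame({"x": np.random.normal(size=1000),
                       "y": np.random.exponential(size=1000)})
    data = zfit.Data.from_pandas(df)                        # obs taken from the columns: "x", "y"
    weights = np.ones(len(df))
    data_w = zfit.Data.from_pandas(df, weights=weights)     # same, with per-event weights
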
classmethod from_root(path: str, treepath: str, branches: List[str] = None, branches_alias: Dict[KT, VT] = None, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray, str] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None, root_dir_options=None) → zfit.core.data.Data[source]

Create a Data from a ROOT file. Arguments are passed to uproot.

Parameters:
  • path (str) –
  • treepath (str) –
  • branches (List[str]) –
  • branches_alias (dict) – A mapping from the branches (as keys) to the actual observables (as values). This allows using observable names different from the branch names in the file.
  • weights (tf.Tensor, np.ndarray, str, None) – Weights of the data. Must be 1-D and match the shape of the data (nevents). Can also be a column of the ROOT file, given as a string with the column name.
  • name (str) –
  • root_dir_options –
Returns:

Return type:

zfit.Data
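
A hedged sketch; the file, tree, branch and weight-column names below are purely illustrative:

    import zfit

    data = zfit.Data.from_root(
        path="data.root",                 # hypothetical ROOT file
        treepath="events",                # hypothetical tree name
        branches=["mass", "pt"],          # branches to read
        branches_alias={"mass": "m"},     # branch "mass" appears as observable "m"
        weights="evt_weight",             # take weights from this branch (string column name)
    )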

classmethod from_root_iter(path, treepath, branches=None, entrysteps=None, name=None, **kwargs)[source]
classmethod from_tensor(obs: Union[str, Iterable[str], zfit.Space], tensor: tensorflow.python.framework.ops.Tensor, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None) → zfit.core.data.Data[source]

Create a Data from a tf.Tensor. value() simply returns the tensor (in the right order).

Parameters:
  • obs (Union[str, List[str]]) –
  • tensor (tf.Tensor) –
  • name (str) –
Returns:

Return type:

zfit.core.Data
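
A minimal sketch:

    import tensorflow as tf
    import zfit

    tensor = tf.random.normal(shape=(1000, 2), dtype=tf.float64)
    data = zfit.Data.from_tensor(obs=["x", "y"], tensor=tensor)
    values = data.value()     # returns the tensor, columns ordered like obs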

graph_caching_methods = []
has_weights
instances = <_weakrefset.WeakSet object>
n_events
n_obs

Return the number of observables, the dimensionality. Corresponds to the last dimension.

name

The name of the object.

nevents
numpy()[source]
obs

Return the observables, string identifier for the coordinate system.

register_cacher(cacher: Union[zfit.core.interfaces.ZfitCachable, Iterable[zfit.core.interfaces.ZfitCachable]])

Register a cacher that caches values produced by this instance; a dependent.

Parameters: cacher –
reset_cache(reseter: zfit.util.cache.ZfitGraphCachable)
reset_cache_self()

Clear the cache of self and all dependent cachers.

set_data_range(data_range)[source]
set_weights(weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray])[source]

Set (temporarily) the weights of the dataset.

Parameters: weights (tf.Tensor, np.ndarray, None) –
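
For example, assuming data holds 1000 events:

    import numpy as np

    weights = np.random.uniform(0.5, 1.5, size=1000)   # one weight per event
    data.set_weights(weights)
    assert data.has_weights
    data.set_weights(None)                              # weights can also be set to None again
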
sort_by_axes(axes: Union[int, Iterable[int]], allow_superset: bool = True)[source]
sort_by_obs(obs: Union[str, Iterable[str], zfit.Space], allow_superset: bool = False)[source]
space
to_pandas(obs: Union[str, Iterable[str], zfit.Space] = None)[source]

Create a pd.DataFrame from obs as columns and return it.

Parameters: obs – The observables to use as columns. If None, all observables are used.

Returns:
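
For example, assuming "x" is one of the observables of data:

    df = data.to_pandas()           # all observables as columns
    df_x = data.to_pandas(obs="x")  # only the "x" column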

unstack_x(obs: Union[str, Iterable[str], zfit.Space] = None, always_list: bool = False)[source]

Return the unstacked data: a list of tensors or a single Tensor.

Parameters:
  • obs – Which observables to return.
  • always_list (bool) – If True, always return a list (also if length 1)
Returns:

List(tf.Tensor)
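
For example, for a two-dimensional data object with obs ("x", "y"):

    x, y = data.unstack_x()                       # one tensor per observable
    tensors = data.unstack_x(always_list=True)    # always a list, even for a single obs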

value(obs: Union[str, Iterable[str], zfit.Space] = None)[source]
weights
class zfit.core.data.LightDataset(tensor)[source]

Bases: object

batch(batch_size)[source]
classmethod from_tensor(tensor)[source]
value()[source]
class zfit.core.data.SampleData(dataset: Union[tensorflow.python.data.ops.dataset_ops.DatasetV2, LightDataset], sample_holder: tensorflow.python.framework.ops.Tensor, obs: Union[str, Iterable[str], zfit.Space] = None, weights=None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = tf.float64)[source]

Bases: zfit.core.data.Data

BATCH_SIZE = 1000000
add_cache_deps(cache_deps: Union[zfit.core.interfaces.ZfitCachable, Iterable[zfit.core.interfaces.ZfitCachable]], allow_non_cachable: bool = True)

Add dependencies that render the cache invalid if they change.

Parameters:
  • cache_deps (ZfitGraphCachable) –
  • allow_non_cachable (bool) – If True, allow cache_deps to be non-cachables. If False, any cache_deps that is not a ZfitCachable will raise an error.
Raises:

TypeError – if one of the cache_deps is not a ZfitCachable _and_ allow_non_cachable is False.

axes

Return the axes, the integer-based identifiers (indices) of the coordinate system.

convert_sort_space(obs: Union[str, Iterable[str], zfit.Space] = None, axes: Union[int, Iterable[int]] = None, limits: Union[zfit.core.interfaces.ZfitLimit, tensorflow.python.framework.ops.Tensor, numpy.ndarray, Iterable[float], float, Tuple[float], List[float], bool, None] = None) → Optional[zfit.core.space.Space]

Convert the inputs (using obs or axes, if given) to a Space and sort it according to this instance's obs.

Parameters:
  • obs –
  • axes –
  • limits –

Returns:

copy(deep: bool = False, name: str = None, **overwrite_params) → zfit.core.interfaces.ZfitObject
data_range
dtype
classmethod from_numpy(obs: Union[str, Iterable[str], zfit.Space], array: numpy.ndarray, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None)

Create Data from a np.array.

Parameters:
  • obs (Union[str, Iterable[str], zfit.Space]) –
  • array (numpy.ndarray) –
  • name (str) –
Returns:

Return type:

zfit.Data

classmethod from_pandas(df: pandas.core.frame.DataFrame, obs: Union[str, Iterable[str], zfit.Space] = None, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None)

Create a Data from a pandas DataFrame. If obs is None, columns are used as obs.

Parameters:
  • df (pandas.DataFrame) –
  • weights (tf.Tensor, np.ndarray, None) – Weights of the data. Must be 1-D and match the shape of the data (nevents).
  • obs (zfit.Space) –
  • name (str) –
classmethod from_root(path: str, treepath: str, branches: List[str] = None, branches_alias: Dict[KT, VT] = None, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray, str] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None, root_dir_options=None) → zfit.core.data.Data

Create a Data from a ROOT file. Arguments are passed to uproot.

Parameters:
  • path (str) –
  • treepath (str) –
  • branches (List[str]) –
  • branches_alias (dict) – A mapping from the branches (as keys) to the actual observables (as values). This allows using observable names different from the branch names in the file.
  • weights (tf.Tensor, np.ndarray, str, None) – Weights of the data. Must be 1-D and match the shape of the data (nevents). Can also be a column of the ROOT file, given as a string with the column name.
  • name (str) –
  • root_dir_options –
Returns:

Return type:

zfit.Data

classmethod from_root_iter(path, treepath, branches=None, entrysteps=None, name=None, **kwargs)
classmethod from_sample(sample: tensorflow.python.framework.ops.Tensor, obs: Union[str, Iterable[str], zfit.Space], name: str = None, weights=None)[source]
classmethod from_tensor(obs: Union[str, Iterable[str], zfit.Space], tensor: tensorflow.python.framework.ops.Tensor, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None) → zfit.core.data.Data

Create a Data from a tf.Tensor. value() simply returns the tensor (in the right order).

Parameters:
  • obs (Union[str, List[str]]) –
  • tensor (tf.Tensor) –
  • name (str) –
Returns:

Return type:

zfit.core.Data

classmethod get_cache_counting()[source]
graph_caching_methods = []
has_weights
instances = <_weakrefset.WeakSet object>
n_events
n_obs

Return the number of observables, the dimensionality. Corresponds to the last dimension.

name

The name of the object.

nevents
numpy()
obs

Return the observables, string identifier for the coordinate system.

register_cacher(cacher: Union[zfit.core.interfaces.ZfitCachable, Iterable[zfit.core.interfaces.ZfitCachable]])

Register a cacher that caches values produced by this instance; a dependent.

Parameters: cacher –
reset_cache(reseter: zfit.util.cache.ZfitGraphCachable)
reset_cache_self()

Clear the cache of self and all dependent cachers.

set_data_range(data_range)
set_weights(weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray])

Set (temporarily) the weights of the dataset.

Parameters: weights (tf.Tensor, np.ndarray, None) –
sort_by_axes(axes: Union[int, Iterable[int]], allow_superset: bool = True)
sort_by_obs(obs: Union[str, Iterable[str], zfit.Space], allow_superset: bool = False)
space
to_pandas(obs: Union[str, Iterable[str], zfit.Space] = None)

Create a pd.DataFrame from obs as columns and return it.

Parameters: obs – The observables to use as columns. If None, all observables are used.

Returns:

unstack_x(obs: Union[str, Iterable[str], zfit.Space] = None, always_list: bool = False)

Return the unstacked data: a list of tensors or a single Tensor.

Parameters:
  • obs – Which observables to return.
  • always_list (bool) – If True, always return a list (also if length 1)
Returns:

List(tf.Tensor)

value(obs: Union[str, Iterable[str], zfit.Space] = None)
weights
class zfit.core.data.Sampler(dataset: zfit.core.data.LightDataset, sample_func: Callable, sample_holder: tensorflow.python.ops.variables.Variable, n: Union[int, float, complex, tensorflow.python.framework.ops.Tensor, Callable], weights=None, fixed_params: Dict[zfit.Parameter, Union[int, float, complex, tensorflow.python.framework.ops.Tensor]] = None, obs: Union[str, Iterable[str], zfit.Space] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = tf.float64)[source]

Bases: zfit.core.data.Data

BATCH_SIZE = 1000000
add_cache_deps(cache_deps: Union[zfit.core.interfaces.ZfitCachable, Iterable[zfit.core.interfaces.ZfitCachable]], allow_non_cachable: bool = True)

Add dependencies that render the cache invalid if they change.

Parameters:
  • cache_deps (ZfitGraphCachable) –
  • allow_non_cachable (bool) – If True, allow cache_deps to be non-cachables. If False, any cache_deps that is not a ZfitCachable will raise an error.
Raises:

TypeError – if one of the cache_deps is not a ZfitCachable _and_ allow_non_cachable is False.

axes

Return the axes, the integer-based identifiers (indices) of the coordinate system.

convert_sort_space(obs: Union[str, Iterable[str], zfit.Space] = None, axes: Union[int, Iterable[int]] = None, limits: Union[zfit.core.interfaces.ZfitLimit, tensorflow.python.framework.ops.Tensor, numpy.ndarray, Iterable[float], float, Tuple[float], List[float], bool, None] = None) → Optional[zfit.core.space.Space]

Convert the inputs (using obs or axes, if given) to a Space and sort it according to this instance's obs.

Parameters:
  • obs –
  • axes –
  • limits –

Returns:

copy(deep: bool = False, name: str = None, **overwrite_params) → zfit.core.interfaces.ZfitObject
data_range
dtype
classmethod from_numpy(obs: Union[str, Iterable[str], zfit.Space], array: numpy.ndarray, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None)

Create Data from a np.array.

Parameters:
  • obs (Union[str, Iterable[str], zfit.Space]) –
  • array (numpy.ndarray) –
  • name (str) –
Returns:

Return type:

zfit.Data

classmethod from_pandas(df: pandas.core.frame.DataFrame, obs: Union[str, Iterable[str], zfit.Space] = None, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None)

Create a Data from a pandas DataFrame. If obs is None, columns are used as obs.

Parameters:
  • df (pandas.DataFrame) –
  • weights (tf.Tensor, np.ndarray, None) – Weights of the data. Must be 1-D and match the shape of the data (nevents).
  • obs (zfit.Space) –
  • name (str) –
classmethod from_root(path: str, treepath: str, branches: List[str] = None, branches_alias: Dict[KT, VT] = None, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray, str] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None, root_dir_options=None) → zfit.core.data.Data

Create a Data from a ROOT file. Arguments are passed to uproot.

Parameters:
  • path (str) –
  • treepath (str) –
  • branches (List[str]) –
  • branches_alias (dict) – A mapping from the branches (as keys) to the actual observables (as values). This allows using observable names different from the branch names in the file.
  • weights (tf.Tensor, np.ndarray, str, None) – Weights of the data. Must be 1-D and match the shape of the data (nevents). Can also be a column of the ROOT file, given as a string with the column name.
  • name (str) –
  • root_dir_options –
Returns:

Return type:

zfit.Data

classmethod from_root_iter(path, treepath, branches=None, entrysteps=None, name=None, **kwargs)
classmethod from_sample(sample_func: Callable, n: Union[int, float, complex, tensorflow.python.framework.ops.Tensor], obs: Union[str, Iterable[str], zfit.Space], fixed_params=None, name: str = None, weights=None, dtype=None)[source]
classmethod from_tensor(obs: Union[str, Iterable[str], zfit.Space], tensor: tensorflow.python.framework.ops.Tensor, weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray] = None, name: str = None, dtype: tensorflow.python.framework.dtypes.DType = None) → zfit.core.data.Data

Create a Data from a tf.Tensor. value() simply returns the tensor (in the right order).

Parameters:
  • obs (Union[str, List[str]]) –
  • tensor (tf.Tensor) –
  • name (str) –
Returns:

Return type:

zfit.core.Data

classmethod get_cache_counting()[source]
graph_caching_methods = []
has_weights
instances = <_weakrefset.WeakSet object>
n_events
n_obs

Return the number of observables, the dimensionality. Corresponds to the last dimension.

n_samples
name

The name of the object.

nevents
numpy()
obs

Return the observables, string identifier for the coordinate system.

register_cacher(cacher: Union[zfit.core.interfaces.ZfitCachable, Iterable[zfit.core.interfaces.ZfitCachable]])

Register a cacher that caches values produced by this instance; a dependent.

Parameters: cacher –
resample(param_values: Mapping[KT, VT_co] = None, n: Union[int, tensorflow.python.framework.ops.Tensor] = None)[source]

Update the sample by drawing a new one. This affects any object that already uses this data.

All params that are not in the attribute fixed_params will use their current value for the creation of the new sample. A value can also be overwritten for a single sampling by providing a param_values mapping from Parameter to the temporary value.

Parameters:
  • param_values (Dict) – a mapping from Parameter to a value; the Parameter takes this value for the current sampling only.
  • n (int, tf.Tensor) – the number of samples to produce. If the Sampler was created with anything other than a number or a tf.Tensor for n, this cannot be used.
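
A minimal sketch using a sampler created from a PDF (the Gauss model and values here are illustrative):

    import zfit

    obs = zfit.Space("x", limits=(-5, 5))
    mu = zfit.Parameter("mu", 0.0)
    sigma = zfit.Parameter("sigma", 1.0)
    gauss = zfit.pdf.Gauss(mu=mu, sigma=sigma, obs=obs)

    sampler = gauss.create_sampler(n=1000)     # parameters are fixed to their current values
    sampler.resample()                         # draw a fresh sample of 1000 events
    sampler.resample(param_values={mu: 0.5})   # use mu=0.5 for this sampling only
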
reset_cache(reseter: zfit.util.cache.ZfitGraphCachable)
reset_cache_self()

Clear the cache of self and all dependent cachers.

set_data_range(data_range)
set_weights(weights: Union[tensorflow.python.framework.ops.Tensor, None, numpy.ndarray])

Set (temporarily) the weights of the dataset.

Parameters: weights (tf.Tensor, np.ndarray, None) –
sort_by_axes(axes: Union[int, Iterable[int]], allow_superset: bool = True)
sort_by_obs(obs: Union[str, Iterable[str], zfit.Space], allow_superset: bool = False)
space
to_pandas(obs: Union[str, Iterable[str], zfit.Space] = None)

Create a pd.DataFrame from obs as columns and return it.

Parameters: obs – The observables to use as columns. If None, all observables are used.

Returns:

unstack_x(obs: Union[str, Iterable[str], zfit.Space] = None, always_list: bool = False)

Return the unstacked data: a list of tensors or a single Tensor.

Parameters:
  • obs – Which observables to return.
  • always_list (bool) – If True, always return a list (also if length 1)
Returns:

List(tf.Tensor)

value(obs: Union[str, Iterable[str], zfit.Space] = None)
weights
zfit.core.data.feed_function(data, feed_val)
zfit.core.data.feed_function_for_partial_run(data)
zfit.core.data.fetch_function(data)