Data
- class zfit.data.Data(dataset, obs=None, name=None, weights=None, dtype=None, use_hash=None)
Bases: ZfitUnbinnedData, BaseDimensional, BaseObject, GraphCachable, SerializableMixin, ZfitSerializable
Create a data holder from a dataset, used to feed into models.
- Parameters:
dataset (DatasetV2 | LightDataset) – A dataset storing the actual values.
obs (Union[str, Iterable[str], Space]) – Observables in which the data is defined.
name (str) – Name of the Data.
weights – Weights of the data.
dtype (DType) – The DType of the return value. Defaults to the zfit default (usually float64).
use_hash (bool) – Whether to use a hash for caching.
- property weights
Get the weights of the data.
- set_weights(weights: ztyping.WeightsInputType)
Set (temporarily) the weights of the dataset. (deprecated)
Deprecated: This function is deprecated and will be removed in a future version. Do not set the weights on a dataset; create a new one instead.
- classmethod from_pandas(df, obs=None, weights=None, name=None, dtype=None, use_hash=None)
Create a Data from a pandas DataFrame. If obs is None, columns are used as obs.
- Parameters:
df (DataFrame) – pandas DataFrame that contains the data. If obs is None, columns are used as obs. Can be a superset of obs.
obs (Union[str, Iterable[str], Space]) – obs to use for the data. obs have to be columns in the data frame. If None, columns are used as obs.
weights (Union[Tensor, None, ndarray, str]) – Weights of the data. Has to be 1-D and match the shape of the data (nevents), or a string that is a column in the dataframe. By default, looks for a column "", i.e. an empty string.
name (str) – Name of the data.
dtype (DType) – dtype of the data.
use_hash (bool) – If True, a hash of the data is created and is used to identify it in caching.
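As an illustration of the parameters above, the following sketch builds a small weighted dataset from a DataFrame. The helper name `make_weighted_data`, the column names and the values are made up for this example; only the `from_pandas` call itself follows the signature documented here.

```python
def make_weighted_data():
    """Build a small weighted zfit.Data from a DataFrame (all values are made up)."""
    # Third-party imports are kept local: zfit pulls in TensorFlow, which is slow to import.
    import pandas as pd
    import zfit

    df = pd.DataFrame(
        {
            "x": [0.1, 0.5, 0.9, 1.3],
            "y": [2.0, 2.5, 3.0, 3.5],
            "w": [1.0, 0.5, 2.0, 1.0],        # per-event weights, one per row
            "unused": [7.0, 8.0, 9.0, 10.0],  # extra column: df may be a superset of obs
        }
    )
    # obs selects the columns to use; weights names the column that holds the weights.
    return zfit.Data.from_pandas(df, obs=["x", "y"], weights="w")
```

Calling `make_weighted_data()` should give a `Data` whose `value()` has one row per DataFrame row and one column per entry in `obs`, with the `w` column available through the `weights` property.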
- classmethod from_root(cls, path, treepath, obs=None, *, weights=None, obs_alias=None, name=None, dtype=None, root_dir_options=None, use_hash=None, branches=None, branches_alias=None)
Create a Data from a ROOT file. Arguments are passed to uproot directly. (deprecated arguments)
Deprecated: The argument branches is deprecated and will be removed in a future version. Use obs instead.
Deprecated: The argument branches_alias is deprecated and will be removed in a future version. Use obs_alias instead and make sure to invert the logic: it is a mapping from the observable name to the actual branch name.
- Parameters:
path (str) – Path to the ROOT file.
treepath (str) – Name of the tree in the ROOT file.
obs (ZfitSpace) – Observables of the data. These are also the columns of the data if no obs_alias is given.
weights (Union[Tensor, None, ndarray, str]) – Weights of the data. Has to be 1-D and match the shape of the data (nevents). Can be a column of the ROOT file, given as a string naming that column.
obs_alias (Mapping[str, str]) – A mapping from the obs (as keys) to the actual branches (as values) in the ROOT file. This allows observable names that differ from the branch names in the file.
name (str) – Name of the data.
root_dir_options –
- Returns:
A Data object containing the unbinned data.
- Return type:
zfit.Data
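The deprecation note for branches_alias asks callers to invert the mapping when moving to obs_alias. A minimal sketch of that inversion in plain Python follows. It assumes, based on the wording of the note, that the old mapping went from branch name to observable name; the helper `invert_branches_alias` and the branch names are hypothetical and not part of zfit.

```python
def invert_branches_alias(branches_alias):
    """Turn {branch_name: observable_name} into {observable_name: branch_name}."""
    obs_alias = {}
    for branch, obs in branches_alias.items():
        if obs in obs_alias:
            # Two branches mapped to the same observable cannot be inverted unambiguously.
            raise ValueError(f"observable {obs!r} is mapped from more than one branch")
        obs_alias[obs] = branch
    return obs_alias


# Hypothetical legacy mapping: keys are branch names in the file, values are observable names.
legacy_branches_alias = {"B_M": "mass", "B_PT": "pt"}
obs_alias = invert_branches_alias(legacy_branches_alias)
print(obs_alias)
```

The resulting dictionary has the observable names as keys and the branch names as values, which is the direction described for obs_alias above.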
- classmethod from_numpy(obs, array, weights=None, name=None, dtype=None, use_hash=None)
Create a Data from a np.array.
- Parameters:
obs (Union[str, Iterable[str], Space]) – Observables of the data. They will be matched to the data in the same order.
array (ndarray) – Numpy array containing the data.
weights (Union[Tensor, None, ndarray]) – Weights of the data. Has to be 1-D and match the shape of the data (nevents).
name (str) – Name of the data.
dtype (DType) – dtype of the data.
use_hash – If True, a hash of the data is created and is used to identify it in caching.
- Returns:
A Data object containing the unbinned data.
- Return type:
zfit.Data
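The shape requirements above can be sketched as follows: the array carries one column per observable, and the weights are 1-D with one entry per event. The helper name `make_data_from_array`, the observable names and the random values are assumptions made for this example.

```python
def make_data_from_array(nevents=1000, seed=0):
    """Build a weighted two-observable zfit.Data from random numbers."""
    # Third-party imports are kept local: zfit pulls in TensorFlow, which is slow to import.
    import numpy as np
    import zfit

    rng = np.random.default_rng(seed)
    array = rng.normal(size=(nevents, 2))          # shape (nevents, nobs), columns follow obs order
    weights = rng.uniform(0.5, 1.5, size=nevents)  # 1-D, one weight per event
    return zfit.Data.from_numpy(obs=["x", "y"], array=array, weights=weights)
```

The first array column is matched to "x" and the second to "y", because the observables are matched to the data in the order given.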
- classmethod from_tensor(obs, tensor, weights=None, name=None, dtype=None, use_hash=None)
Create a Data from a tf.Tensor. Value simply returns the tensor (in the right order).
- Parameters:
obs (Union[str, Iterable[str], Space]) – Observables of the data. They will be matched to the data in the same order.
tensor (Tensor) – Tensor containing the data.
weights (Union[Tensor, None, ndarray]) – Weights of the data. Has to be 1-D and match the shape of the data (nevents).
name (str) – Name of the data.
- Returns:
A Data object containing the unbinned data.
- Return type:
zfit.Data
- with_obs(obs)
Create a new Data with a subset of the data using the obs.
- Parameters:
obs – Observables to return. Has to be a subset of the original observables.
- Returns:
A new Data object containing the subset of the data.
- Return type:
zfit.Data
- to_pandas(obs=None, weightsname=None)
Create a pd.DataFrame with obs as columns and return it.
- Parameters:
- Returns:
A pd.DataFrame containing the data and the weights (if present).
- Return type:
pd.DataFrame
- unstack_x(obs=None, always_list=None)
Return the unstacked data: a list of tensors or a single Tensor.
- value(obs=None)
Return the data as a numpy-like object in obs order.
- Parameters:
obs (Union[str, Iterable[str], Space]) – Observables to return. If None, all observables are returned. Can be a subset of the original observables. If a string is given, a 1-D array is returned with shape (nevents,). If a list of strings or a zfit.Space is given, a 2-D array is returned with shape (nevents, nobs).
Returns:
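The shape rule described for obs can be illustrated with a plain-Python stand-in. This is not zfit code: `select_values` and the dictionary of columns are hypothetical, and nested lists stand in for the numpy-like object that value returns.

```python
def select_values(columns, obs=None):
    """Mimic the documented shape rule on a dict mapping observable name to a list of values."""
    if obs is None:
        obs = list(columns)           # all observables, in stored order
    if isinstance(obs, str):
        return list(columns[obs])     # single name: 1-D, "shape" (nevents,)
    names = list(obs)
    nevents = len(next(iter(columns.values()))) if columns else 0
    # Several names: 2-D, "shape" (nevents, nobs), columns in the requested order.
    return [[columns[name][i] for name in names] for i in range(nevents)]


columns = {"x": [1.0, 2.0, 3.0], "y": [10.0, 20.0, 30.0]}
one_d = select_values(columns, "y")
two_d = select_values(columns, ["y", "x"])
print(one_d)
print(two_d)
```

Note that the column order of the 2-D result follows the requested observables, not the order in which the data was stored.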
- add_cache_deps(cache_deps, allow_non_cachable=True)
Add dependencies that render the cache invalid if they change.
- Parameters:
cache_deps (ztyping.CacherOrCachersType) –
allow_non_cachable (bool) – If True, allow cache_dependents to be non-cachables. If False, any cache_dependents that is not a ZfitGraphCachable will raise an error.
- Raises:
TypeError – if one of the cache_dependents is not a ZfitGraphCachable and allow_non_cachable is False.
- classmethod from_asdf(asdf_obj)
Load an object from an asdf file.
- classmethod from_dict(dict_)
Creates an object from a dictionary structure as generated by to_dict.
- Parameters:
dict_ – Dictionary structure.
- Returns:
The deserialized object.
- classmethod from_json(cls, json)
Load an object from a json string.
- classmethod get_repr()
Abstract representation of the object for serialization.
This representation knows how to serialize and deserialize the object and is used by the to_json, from_json, to_dict and from_dict methods.
- Returns:
The representation of the object.
- Return type:
pydantic.BaseModel
- register_cacher(cacher)
Register a cacher that caches values produced by this instance; a dependent.
- Parameters:
cacher (ztyping.CacherOrCachersType) –
- reset_cache_self()
Clear the cache of self and all dependent cachers.
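The relationship between add_cache_deps, register_cacher and reset_cache_self can be sketched with a toy class. This is a conceptual model only: `ToyCachable` is hypothetical, is not zfit's GraphCachable, omits the allow_non_cachable check, and assumes the dependencies form no cycles.

```python
class ToyCachable:
    """Toy stand-in for a cachable object with dependents."""

    def __init__(self, name):
        self.name = name
        self._cache = {}
        self._cachers = []  # dependents whose caches rely on this instance

    def register_cacher(self, cacher):
        """Register a dependent that caches values produced by this instance."""
        if cacher not in self._cachers:
            self._cachers.append(cacher)

    def add_cache_deps(self, cache_deps):
        """Declare objects whose change invalidates this instance's cache."""
        for dep in cache_deps:
            dep.register_cacher(self)

    def reset_cache_self(self):
        """Clear the own cache, then the cache of every dependent cacher."""
        self._cache.clear()
        for cacher in self._cachers:
            cacher.reset_cache_self()


data = ToyCachable("data")
loss = ToyCachable("loss")
loss.add_cache_deps([data])   # loss depends on data
loss._cache["value"] = 1.23   # pretend loss cached a result computed from data
data.reset_cache_self()       # data changed, so its dependents are cleared too
print(loss._cache)
```

In this model, adding a dependency is the same as registering oneself as a cacher on the dependency, which is why resetting `data` also empties the cache of `loss`.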
- to_asdf()
Convert the object to an asdf file.
- to_dict()
Convert the object to a nested dictionary structure.
- Returns:
The dictionary structure.
- Return type: