Data#
- class zfit.data.Data(dataset, obs=None, name=None, weights=None, dtype=None, use_hash=None)[source]#
Bases: ZfitUnbinnedData, BaseDimensional, BaseObject, GraphCachable, SerializableMixin, ZfitSerializable
Create a data holder from a dataset used to feed into models.
- Parameters:
dataset (DatasetV2 | LightDataset) – A dataset storing the actual values.
obs (Union[str, Iterable[str], Space]) – Observables in which the data is defined.
name (str) – Name of the Data.
weights – Weights of the data.
dtype (DType) – The DType of the return value. Defaults to the zfit default (usually float64).
use_hash (bool) – Whether to use a hash for caching.
- property weights#
Get the weights of the data.
- set_weights(weights: ztyping.WeightsInputType)[source]#
Set (temporarily) the weights of the dataset. (deprecated)
Deprecated: THIS FUNCTION IS DEPRECATED. It will be removed in a future version. Instructions for updating: Do not set the weights on a data set, create a new one instead.
- classmethod from_pandas(df, obs=None, weights=None, name=None, dtype=None, use_hash=None)[source]#
Create a Data from a pandas DataFrame. If obs is None, columns are used as obs.
- Parameters:
df (DataFrame) – pandas DataFrame that contains the data. If obs is None, columns are used as obs. Can be a superset of obs.
obs (Union[str, Iterable[str], Space]) – obs to use for the data. obs have to be the columns in the data frame. If None, columns are used as obs.
weights (Union[Tensor, None, ndarray, str]) – Weights of the data. Has to be 1-D and match the shape of the data (nevents) or a string that is a column in the dataframe. By default, looks for a column "", i.e. an empty string.
name (str) – Name of the data.
dtype (DType) – dtype of the data.
use_hash (bool) – If True, a hash of the data is created and is used to identify it in caching.
- classmethod from_root(cls, path, treepath, obs=None, *, weights=None, obs_alias=None, name=None, dtype=None, root_dir_options=None, use_hash=None, branches=None, branches_alias=None)[source]#
Create a Data from a ROOT file. Arguments are passed to uproot directly. (deprecated arguments)
Deprecated: SOME ARGUMENTS ARE DEPRECATED: (branches). They will be removed in a future version. Instructions for updating: Use obs instead.
Deprecated: SOME ARGUMENTS ARE DEPRECATED: (branches_alias). They will be removed in a future version. Instructions for updating: Use obs_alias instead and make sure to invert the logic! I.e. it’s a mapping from the observable name to the actual branch name.
- Parameters:
path (str) – Path to the root file.
treepath (str) – Name of the tree in the root file.
obs (ZfitSpace) – Observables of the data. This will also be the columns of the data if no obs_alias is given.
weights (Union[Tensor, None, ndarray, str]) – Weights of the data. Has to be 1-D and match the shape of the data (nevents). Can be a column of the ROOT file by using a string corresponding to a column.
obs_alias (Mapping[str, str]) – A mapping from the obs (as keys) to the actual branches (as values) in the root file. This allows observable names that differ from the branch names in the file.
name (str) – Name of the data.
root_dir_options –
- Returns:
A Data object containing the unbinned data.
- Return type:
zfit.Data
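The deprecation note for branches_alias asks callers to invert the old mapping. A minimal sketch of that inversion in plain Python follows. It assumes the old mapping went from branch name to observable name, as the note implies. The helper `invert_branches_alias` and the branch names are hypothetical and not part of zfit.

```python
def invert_branches_alias(branches_alias):
    """Convert a branch -> observable mapping into observable -> branch."""
    obs_alias = {}
    for branch, observable in branches_alias.items():
        if observable in obs_alias:
            # Two branches mapped to one observable cannot be inverted uniquely.
            raise ValueError(f"observable {observable!r} is aliased more than once")
        obs_alias[observable] = branch
    return obs_alias


# Hypothetical old-style mapping: branch name in the file -> observable name.
old_branches_alias = {"B_M": "mass", "B_PT": "pt"}
obs_alias = invert_branches_alias(old_branches_alias)
```

The result has observable names as keys and branch names as values, which is the form documented for obs_alias.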
- classmethod from_numpy(obs, array, weights=None, name=None, dtype=None, use_hash=None)[source]#
Create Data from a np.array.
- Parameters:
obs (Union[str, Iterable[str], Space]) – Observables of the data. They will be matched to the data in the same order.
array (ndarray) – Numpy array containing the data.
weights (Union[Tensor, None, ndarray]) – Weights of the data. Has to be 1-D and match the shape of the data (nevents).
name (str) – Name of the data.
dtype (DType) – dtype of the data.
use_hash – If True, a hash of the data is created and is used to identify it in caching.
- Returns:
A Data object containing the unbinned data.
- Return type:
zfit.Data
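The constraint on weights (1-D, one entry per event) can be checked before the data is constructed. A minimal sketch in plain Python follows. `check_weights` is a hypothetical helper, not part of zfit, and plain lists stand in for arrays.

```python
def check_weights(rows, weights):
    """Validate that weights are 1-D with one entry per event.

    rows:    list of events (each event is a list of observable values).
    weights: None, or a flat sequence of numbers of length nevents.
    """
    if weights is None:
        return None  # unweighted data
    weights = list(weights)
    if any(isinstance(w, (list, tuple)) for w in weights):
        raise ValueError("weights must be 1-D")
    if len(weights) != len(rows):
        raise ValueError(f"got {len(weights)} weights for {len(rows)} events")
    return [float(w) for w in weights]


events = [[0.1], [0.5], [0.9]]  # nevents = 3, one observable
weights = check_weights(events, [1.0, 2.0, 1.0])
```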
- classmethod from_tensor(obs, tensor, weights=None, name=None, dtype=None, use_hash=None)[source]#
Create a Data from a tf.Tensor. Value simply returns the tensor (in the right order).
- Parameters:
obs (Union[str, Iterable[str], Space]) – Observables of the data. They will be matched to the data in the same order.
tensor (Tensor) – Tensor containing the data.
weights (Union[Tensor, None, ndarray]) – Weights of the data. Has to be 1-D and match the shape of the data (nevents).
name (str) – Name of the data.
- Returns:
A Data object containing the unbinned data.
- Return type:
zfit.Data
- with_obs(obs)[source]#
Create a new Data with a subset of the data using the obs.
- Parameters:
obs – Observables to return. Has to be a subset of the original observables.
- Returns:
A new Data object containing the subset of the data.
- Return type:
zfit.Data
- to_pandas(obs=None, weightsname=None)[source]#
Create a pd.DataFrame from obs as columns and return it.
- Parameters:
- Returns:
A pd.DataFrame containing the data and the weights (if present).
- Return type:
pd.DataFrame
- unstack_x(obs=None, always_list=None)[source]#
Return the unstacked data: a list of tensors or a single Tensor.
- value(obs=None)[source]#
Return the data as a numpy-like object in obs order.
- Parameters:
obs (Union[str, Iterable[str], Space]) – Observables to return. If None, all observables are returned. Can be a subset of the original observables. If a string is given, a 1-D array is returned with shape (nevents,). If a list of strings or a zfit.Space is given, a 2-D array is returned with shape (nevents, nobs).
- Returns:
The data as a numpy-like object in obs order.
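The shape rule for the obs argument of value can be illustrated without zfit. The sketch below mimics it on a list of event rows. `select_obs` is a hypothetical helper, not a zfit function, and plain lists stand in for the numpy-like return value.

```python
def select_obs(rows, data_obs, obs=None):
    """Return the events restricted to, and ordered by, `obs`.

    rows:     list of events, each a list with one value per entry of data_obs.
    data_obs: names of the observables, in the column order of rows.
    obs:      None (all observables), a single name, or a list of names.
    """
    if obs is None:
        obs = list(data_obs)
    if isinstance(obs, str):
        # A single name yields a 1-D result of length nevents.
        idx = data_obs.index(obs)
        return [row[idx] for row in rows]
    # A list of names yields a 2-D result of shape (nevents, nobs).
    indices = [data_obs.index(name) for name in obs]
    return [[row[i] for i in indices] for row in rows]


events = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]  # 3 events, observables x and y
only_y = select_obs(events, ["x", "y"], "y")          # 1-D, one value per event
swapped = select_obs(events, ["x", "y"], ["y", "x"])  # 2-D, columns reordered
```

Passing a string drops the observable axis, while passing a one-element list keeps it, which mirrors the (nevents,) versus (nevents, nobs) distinction described above.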
- add_cache_deps(cache_deps, allow_non_cachable=True)#
Add dependencies that render the cache invalid if they change.
- classmethod from_asdf(asdf_obj)#
Load an object from an asdf file.
- classmethod from_dict(dict_)#
Creates an object from a dictionary structure as generated by to_dict.
- Parameters:
dict_ – Dictionary structure.
- Returns:
The deserialized object.
- classmethod from_json(cls, json)#
Load an object from a json string.
- classmethod get_repr()#
Abstract representation of the object for serialization.
This object knows how to serialize and deserialize the object and is used by the to_json, from_json, to_dict and from_dict methods.
- Returns:
The representation of the object.
- Return type:
pydantic.BaseModel
- register_cacher(cacher)#
Register a cacher that caches values produced by this instance; a dependent.
- Parameters:
cacher (ztyping.CacherOrCachersType) –
- reset_cache_self()#
Clear the cache of self and all dependent cachers.
- to_asdf()#
Convert the object to an asdf file.
- to_dict()#
Convert the object to a nested dictionary structure.
- Returns:
The dictionary structure.
- Return type: