Data
- class zfit.data.Data(dataset, obs=None, name=None, weights=None, dtype=None, use_hash=None)
Bases: ZfitUnbinnedData, BaseDimensional, BaseObject, GraphCachable, SerializableMixin, ZfitSerializable
Create a data holder from a dataset used to feed into models.
- Parameters
dataset (DatasetV2 | LightDataset) – A dataset storing the actual values
obs (Union[str, Iterable[str], Space, None]) – Observables where the data is defined in
weights – Weights of the data
dtype (Optional[DType]) – The DType of the return value. Defaults to the zfit default (usually float64).
use_hash (Optional[bool]) – Whether to use a hash for caching
- property weights
Get the weights of the data.
- set_weights(weights: ztyping.WeightsInputType)
Set (temporarily) the weights of the dataset. (deprecated)
Deprecated: this function is deprecated and will be removed in a future version. Do not set the weights on a dataset; create a new one instead.
- classmethod from_pandas(df, obs=None, weights=None, name=None, dtype=None, use_hash=None)
Create a Data from a pandas DataFrame. If obs is None, columns are used as obs.
- Parameters
df (DataFrame) – pandas DataFrame that contains the data. If obs is None, columns are used as obs. Can be a superset of obs.
obs (Union[str, Iterable[str], Space, None]) – obs to use for the data. obs have to be the columns in the data frame. If None, columns are used as obs.
weights (Union[Tensor, None, ndarray, str]) – Weights of the data. Has to be 1-D and match the shape of the data (nevents) or a string that is a column in the dataframe. By default, looks for a column "", i.e. an empty string.
dtype (Optional[DType]) – dtype of the data
use_hash (Optional[bool]) – If True, a hash of the data is created and is used to identify it in caching.
- classmethod from_root(path, treepath, obs=None, *, weights=None, obs_alias=None, name=None, dtype=None, root_dir_options=None, use_hash=None, branches=None, branches_alias=None)
Create a Data from a ROOT file. Arguments are passed to uproot directly. (deprecated arguments)
Deprecated: the argument branches is deprecated and will be removed in a future version. Use obs instead.
Deprecated: the argument branches_alias is deprecated and will be removed in a future version. Use obs_alias instead and make sure to invert the logic, i.e. it is a mapping from the observable name to the actual branch name.
- Parameters
path (str) – Path to the ROOT file.
treepath (str) – Name of the tree in the ROOT file.
obs (Optional[ZfitSpace]) – Observables of the data. These are also used as the columns of the data if no obs_alias is given.
weights (Union[Tensor, None, ndarray, str]) – Weights of the data. Has to be 1-D and match the shape of the data (nevents). Can be a column of the ROOT file by using a string corresponding to a column.
obs_alias (Optional[Mapping[str, str]]) – A mapping from the obs (as keys) to the actual branches (as values) in the ROOT file. This allows observable names that differ from the branch names in the file.
root_dir_options –
- Returns
A Data object containing the unbinned data.
- Return type
zfit.Data
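The deprecation note for branches_alias asks for the mapping to be inverted when moving to obs_alias. The following is a minimal sketch in plain Python (it calls neither zfit nor uproot), assuming the old mapping went from branch name to observable name, as the note implies. The helper invert_branches_alias and the branch and observable names are hypothetical and only serve as illustration.

```python
def invert_branches_alias(branches_alias):
    """Turn a {branch name: observable name} mapping into {observable name: branch name}.

    Hypothetical helper for illustration; it is not part of zfit.
    """
    obs_alias = {}
    for branch, obs in branches_alias.items():
        if obs in obs_alias:
            # Two branches mapped to the same observable cannot be inverted unambiguously.
            raise ValueError(f"Observable {obs!r} is assigned to more than one branch.")
        obs_alias[obs] = branch
    return obs_alias


# Made-up names: the branch "B_M" in the file is exposed as the observable "mass".
old_branches_alias = {"B_M": "mass", "B_PT": "pt"}
obs_alias = invert_branches_alias(old_branches_alias)
print(obs_alias)
```

The resulting dictionary has the observables as keys and the branches as values, which is the orientation described for obs_alias above.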
- classmethod from_numpy(obs, array, weights=None, name=None, dtype=None, use_hash=None)
Create Data from a np.array.
- Parameters
obs (Union[str, Iterable[str], Space]) – Observables of the data. They will be matched to the data in the same order.
array (ndarray) – Numpy array containing the data.
weights (Union[Tensor, None, ndarray]) – Weights of the data. Has to be 1-D and match the shape of the data (nevents).
dtype (Optional[DType]) – dtype of the data.
use_hash – If True, a hash of the data is created and is used to identify it in caching.
- Returns
A Data object containing the unbinned data.
- Return type
zfit.Data
- classmethod from_tensor(obs, tensor, weights=None, name=None, dtype=None, use_hash=None)
Create a Data from a tf.Tensor. Value simply returns the tensor (in the right order).
- Returns
A Data object containing the unbinned data.
- Return type
zfit.Data
- with_obs(obs)
Create a new Data with a subset of the data using the obs.
- Parameters
obs – Observables to return. Has to be a subset of the original observables.
- Returns
A new Data object containing the subset of the data.
- Return type
zfit.Data
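To illustrate what selecting a subset of observables means for the stored values, here is a plain-Python sketch. It is not zfit's implementation: it assumes that events are stored row-wise and that the result follows the order of the requested observables. The function select_obs and the sample values are hypothetical.

```python
def select_obs(rows, data_obs, obs):
    """Return the columns named in `obs` from row-wise data with columns `data_obs`.

    Hypothetical helper for illustration; it is not part of zfit.
    """
    if isinstance(obs, str):
        obs = (obs,)
    # The requested observables have to be a subset of the original ones.
    missing = [ob for ob in obs if ob not in data_obs]
    if missing:
        raise ValueError(f"{missing} not in the original observables {tuple(data_obs)}")
    # Column positions in the requested order (assumption: the output follows `obs`).
    indices = [data_obs.index(ob) for ob in obs]
    return [tuple(row[i] for i in indices) for row in rows]


data_obs = ("x", "y", "z")
rows = [(1.0, 10.0, 100.0), (2.0, 20.0, 200.0)]
subset = select_obs(rows, data_obs, ("z", "x"))
print(subset)
```

Requesting an observable that is not among the original ones raises an error in this sketch, mirroring the subset requirement stated above.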
- to_pandas(obs=None, weightsname=None)
Create a pd.DataFrame from obs as columns and return it.
- Returns
A pd.DataFrame containing the data and the weights (if present).
- Return type
pd.DataFrame
- unstack_x(obs=None, always_list=None)
Return the unstacked data: a list of tensors or a single Tensor.
- value(obs=None)
Return the data as a numpy-like object in obs order.
- Parameters
obs (Union[str, Iterable[str], Space, None]) – Observables to return. If None, all observables are returned. Can be a subset of the original observables. If a string is given, a 1-D array is returned with shape (nevents,). If a list of strings or a zfit.Space is given, a 2-D array is returned with shape (nevents, nobs).
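The shape rule described for obs can be summarized in a small plain-Python sketch. The helper expected_value_shape is hypothetical and not part of zfit, a list or tuple of names stands in for a zfit.Space, and the 2-D shape for obs=None is an assumption, since the text above only states that all observables are returned.

```python
def expected_value_shape(nevents, data_obs, obs=None):
    """Shape of the returned array according to the rule described for `obs`.

    Hypothetical helper for illustration; it is not part of zfit.
    """
    if obs is None:
        # Assumption: all observables are returned as a 2-D array.
        return (nevents, len(data_obs))
    if isinstance(obs, str):
        # A single observable given as a string yields a 1-D array.
        return (nevents,)
    # Several observables (or a single one wrapped in a list) yield a 2-D array.
    return (nevents, len(tuple(obs)))


data_obs = ("x", "y", "z")
print(expected_value_shape(1000, data_obs))
print(expected_value_shape(1000, data_obs, "x"))
print(expected_value_shape(1000, data_obs, ["x", "z"]))
```

Note the difference between passing "x" and ["x"]: under the stated rule, the former gives a 1-D result and the latter a 2-D result with a single column.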
- add_cache_deps(cache_deps, allow_non_cachable=True)
Add dependencies that render the cache invalid if they change.
- classmethod from_asdf(asdf_obj)
Load an object from an asdf file.
- classmethod from_dict(dict_)
Creates an object from a dictionary structure as generated by to_dict.
- Parameters
dict_ – Dictionary structure.
- Returns
The deserialized object.
- classmethod from_json(json)
Load an object from a json string.
- classmethod get_repr()
Abstract representation of the object for serialization.
This object knows how to serialize and deserialize the object and is used by the to_json, from_json, to_dict and from_dict methods.
- Returns
The representation of the object.
- Return type
pydantic.BaseModel
- register_cacher(cacher)
Register a cacher that caches values produced by this instance; a dependent.
- Parameters
cacher (ztyping.CacherOrCachersType) –
- reset_cache_self()
Clear the cache of self and all dependent cachers.
- to_asdf()
Convert the object to an asdf file.
- to_dict()
Convert the object to a nested dictionary structure.
- Returns
The dictionary structure.
- Return type