Belle II Software  release-08-01-10
PriorDataLoader Class Reference

Public Member Functions

def __init__ (self, str path, str key, list particlelist, list labels)
 
def __getitem__ (self, index)
 
def __len__ (self)
 
torch.tensor get_split (self, float n_test=0.1)
 

Public Attributes

 x
 The tensor of features.
 
 y
 The tensor of labels.
 

Detailed Description

Dataloader for PID prior probability training.

Attributes:
    x (np.array): Array of feature data: all polynomial combinations up to second order of cos(theta), momentum and transverse momentum.
    y (np.array): Array containing the label-encoded PDG values.

Definition at line 26 of file priorDataLoaderAndModel.py.
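
As an illustration of the feature layout described above, the following pure-Python sketch builds the second-order features for a single particle. The helper name `second_order_features` is hypothetical and not part of the Belle II software. The term ordering is assumed to follow that of scikit-learn's `PolynomialFeatures(2, include_bias=False)`: the three first-order terms, then every product of two terms.

```python
import math
from itertools import combinations_with_replacement


def second_order_features(cos_theta: float, momentum: float) -> list:
    """Return the second-order feature row for one particle (hypothetical helper)."""
    # Transverse momentum: pT = p * sin(theta), with theta = arccos(cos_theta).
    pt = math.sin(math.acos(cos_theta)) * momentum
    base = [cos_theta, momentum, pt]
    # First-order terms, followed by all products of two base terms
    # (squares included), with no constant bias term.
    features = list(base)
    for i, j in combinations_with_replacement(range(len(base)), 2):
        features.append(base[i] * base[j])
    return features


print(second_order_features(0.0, 2.0))
```

Three base quantities give 3 first-order and 6 second-order terms, so each feature row has 9 entries.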

Constructor & Destructor Documentation

◆ __init__()

def __init__ (   self,
str  path,
str  key,
list  particlelist,
list  labels 
)
Initialize the dataloader for PID prior training.

Parameters:
    path (str): Path to the root file containing the data.
    key (str): Key (i.e. path) of the tree within the root file.
    particlelist (list(int)): List of particle PDG values for which the model has to be trained.
    labels (list(str)): Labels of pandas columns containing cos(theta), momentum and PDG values (in this order).

Definition at line 36 of file priorDataLoaderAndModel.py.

   36     def __init__(self, path: str, key: str, particlelist: list, labels: list):
   37         """
   38         Initialize the dataloader for PID prior training.
   39
   40         Parameters:
   41             path (str): Path to the root file containing the data.
   42             key (str): Key (i.e. path) of the tree within the root file.
   43             particlelist (list(int)): List of particle PDG values for which the model has to be trained.
   44             labels (str): Labels of pandas columns containing cos(theta), momentum and PDG values (in this order).
   45
   46         """
   47         data = ur.open(path)
   48         data = data[key].pandas.df(labels)
   49         df = data.dropna().reset_index(drop=True)
   50         df.loc[:, labels[2]] = df.loc[:, labels[2]].abs()
   51         droplist = np.setdiff1d(np.unique(df[labels[2]].values), particlelist)
   52         for i in droplist:
   53             df = df.drop(df.loc[df[labels[2]] == i].index).reset_index(drop=True)
   54         x = df.values[:, 0:2]
   55         x = np.hstack((x, (np.sin(np.arccos(x[:, 0])) * x[:, 1]).reshape(-1, 1)))
   56         pol = PolynomialFeatures(2, include_bias=False)
   57         x = pol.fit_transform(x)
   58
   59         self.x = x.astype("float32")
   60         y = df.values[:, 2]
   61         le = LabelEncoder()
   62         y = le.fit_transform(y)
   63
   64         self.y = y.astype("int64")
   65
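
The label handling in the constructor (absolute PDG values, removal of particles not in `particlelist`, integer encoding) can be sketched in pure Python as follows. The helper `encode_pdg_labels` and the sample PDG codes are hypothetical; the sketch mimics `LabelEncoder.fit_transform` by mapping the sorted unique values to consecutive integers.

```python
def encode_pdg_labels(pdg_values: list, particlelist: list) -> list:
    """Filter by |PDG| and map the kept codes to 0..n-1 (hypothetical helper)."""
    # Take absolute values so particles and antiparticles share a class,
    # and drop every entry whose |PDG| is not in particlelist.
    kept = [abs(v) for v in pdg_values if abs(v) in particlelist]
    # Sorted unique codes define the class indices, as LabelEncoder does.
    classes = sorted(set(kept))
    index = {code: i for i, code in enumerate(classes)}
    return [index[code] for code in kept]


# Electron, muon, pion and kaon are requested; proton (2212) and photon (22) are dropped.
print(encode_pdg_labels([11, -13, 211, -211, 321, 2212, 22], [11, 13, 211, 321]))
```

Because the encoding is fitted on the values present in the data, a particle listed in `particlelist` but absent from the file receives no class index, and the indices of the remaining classes shift accordingly.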

Member Function Documentation

◆ __getitem__()

def __getitem__ (   self,
  index 
)
Function to get feature and label tensors at the given index location.

Parameters:
    index (int): The index of required tensors.

Returns:
    Tensors of features and labels at the given index.

Definition at line 66 of file priorDataLoaderAndModel.py.

◆ __len__()

def __len__ (   self)
Function to obtain the length of the dataset.

Parameters:
    None.

Returns:
    Number of feature sets.

Definition at line 78 of file priorDataLoaderAndModel.py.
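
Together, `__getitem__` and `__len__` provide integer indexing plus a length, which is the interface PyTorch expects from a map-style dataset. A minimal sketch of that interface, using a hypothetical stand-in class `ToyPriorData` that holds plain lists rather than tensors:

```python
class ToyPriorData:
    """Hypothetical stand-in holding plain lists instead of tensors."""

    def __init__(self, x: list, y: list):
        if len(x) != len(y):
            raise ValueError("x and y must have the same number of rows")
        self.x = x
        self.y = y

    def __getitem__(self, index: int):
        # One sample: its feature row and its encoded label.
        return self.x[index], self.y[index]

    def __len__(self) -> int:
        # Number of samples (feature rows).
        return len(self.x)


data = ToyPriorData([[0.1, 1.0], [0.5, 2.0], [0.9, 0.3]], [0, 2, 1])
features, label = data[1]
```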

◆ get_split()

torch.tensor get_split (   self,
float   n_test = 0.1 
)
Split the input data into training and validation sets.

Parameters:
    n_test (float): Ratio of the number of particles in the validation set to that in the training set.

Returns:
    A randomly split data set with the ratio given by 'n_test'.

Definition at line 90 of file priorDataLoaderAndModel.py.
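
A random split of this kind can be sketched with the standard library alone. The function `split_indices` and its `seed` parameter are hypothetical. The sketch assumes that `n_test` is applied to the total number of samples; this page does not show how the actual implementation computes the set sizes.

```python
import random


def split_indices(n_samples: int, n_test: float = 0.1, seed: int = 0):
    """Randomly partition range(n_samples) into training and validation indices."""
    # Assumption: n_test is applied to the total number of samples.
    n_val = round(n_test * n_samples)
    indices = list(range(n_samples))
    # A seeded generator makes the shuffle reproducible.
    random.Random(seed).shuffle(indices)
    return indices[n_val:], indices[:n_val]


train_idx, val_idx = split_indices(50, n_test=0.1)
```

Every index ends up in exactly one of the two lists, so no sample is shared between the training and validation sets.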


The documentation for this class was generated from the following file: