Belle II Software development
GraphDataSet Class Reference

Public Member Functions

 __init__ (self, root, n_files=None, samples=None, features=[], edge_features=[], global_features=[], normalize=None, **kwargs)
 
 processed_file_names (self)
 
 process (self)
 

Public Attributes

 root = Path(root)
 Root path.
 
 normalize = normalize
 Normalize.
 
 n_files = n_files
 Number of files.
 
 node_features = features
 Node features.
 
 edge_features = edge_features
 Edge features.
 
 global_features = global_features
 Global features.
 
 samples = samples
 Samples.
 
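 data
 Data, loaded together with slices: self.data, self.slices = torch.load(self.processed_paths[0]).
 
 slices
 Slices, loaded together with data: self.data, self.slices = torch.load(self.processed_paths[0]).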
 
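 x
 Temporary attribute, deleted at the end of process().
 
 y
 Temporary attribute, deleted at the end of process().
 
 avail_samples
 Temporary attribute, deleted at the end of process().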
 

Detailed Description

Dataset handler for converting Belle II data to a PyTorch Geometric ``InMemoryDataset``.

The ROOT format expects the tree in every file to be named ``Tree``,
and all node features to have the format ``feat_FEATNAME``.

.. note:: This expects the files under root to have the structure ``root/**/<file_name>.root``
    where the root path is different for train and val.
    The ``**/`` is to handle subdirectories, e.g. ``sub00``.
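
The directory layout described in the note can be illustrated with a small standard-library sketch. How ``GraphDataSet`` actually discovers its input files is not shown on this page; ``find_root_files`` is a hypothetical helper, and both the sorting and the way ``n_files`` truncates the list are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def find_root_files(root, n_files=None):
    """Hypothetical helper: collect files matching ``root/**/<file_name>.root``."""
    # ``**/`` matches the root directory itself and subdirectories at any depth.
    files = sorted(Path(root).glob("**/*.root"))
    # Assumption: ``n_files`` simply keeps the first few files of the sorted list.
    return files if n_files is None else files[:n_files]


# Build a throwaway layout with two subdirectories and one non-ROOT file.
with tempfile.TemporaryDirectory() as tmp:
    train = Path(tmp, "train")
    for rel in ("sub00/a.root", "sub01/b.root", "c.root", "sub00/notes.txt"):
        path = train / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()

    all_files = [f.relative_to(train).as_posix() for f in find_root_files(train)]
    first_two = [f.relative_to(train).as_posix() for f in find_root_files(train, n_files=2)]

print(all_files)
print(first_two)
```

Using a separate ``root`` for training and validation, as the note requires, keeps the two file sets and their ``processed`` directories apart.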

Args:
    root (str): Path to ROOT files.
    n_files (int): Load only ``n_files`` files.
    samples (int): Load only ``samples`` events.
    features (list): List of node feature names.
    edge_features (list): List of edge feature names.
    global_features (list): List of global feature names.
    normalize (bool): Whether to normalize input features.

Definition at line 258 of file geometric_datasets.py.

Constructor & Destructor Documentation

◆ __init__()

__init__ ( self,
root,
n_files = None,
samples = None,
features = [],
edge_features = [],
global_features = [],
normalize = None,
** kwargs )
Initialization.

Definition at line 279 of file geometric_datasets.py.

    def __init__(
        self,
        root,
        n_files=None,
        samples=None,
        features=[],
        edge_features=[],
        global_features=[],
        normalize=None,
        **kwargs
    ):
        """
        Initialization.
        """
        assert isinstance(
            features, list
        ), f'Argument "features" must be a list and not {type(features)}'
        assert len(features) > 0, "You need to use at least one node feature"

        # Root path
        self.root = Path(root)
        # Normalize
        self.normalize = normalize
        # Number of files
        self.n_files = n_files
        # Node features
        self.node_features = features
        # Edge features
        self.edge_features = edge_features
        # Global features
        self.global_features = global_features
        # Samples
        self.samples = samples

        # Delete processed files, in case
        file_path = Path(self.root, "processed")
        files = list(file_path.glob("*.pt"))
        for f in files:
            f.unlink(missing_ok=True)

        # Needs to be called after having assigned all attributes
        super().__init__(root, None, None, None)

        # Data and Slices
        self.data, self.slices = torch.load(self.processed_paths[0])
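
The cleanup step in the constructor removes every ``*.pt`` file under ``root/processed`` before the parent class is initialized, which presumably forces the data to be reprocessed on each instantiation. A minimal standalone sketch of that step, using only the standard library; ``clear_processed`` is a hypothetical name and not part of the class.

```python
import tempfile
from pathlib import Path


def clear_processed(root):
    """Hypothetical helper mirroring the cleanup loop in ``__init__``."""
    removed = []
    for f in Path(root, "processed").glob("*.pt"):
        # missing_ok=True (Python 3.8+) tolerates a file that has already vanished.
        f.unlink(missing_ok=True)
        removed.append(f.name)
    return sorted(removed)


with tempfile.TemporaryDirectory() as tmp:
    processed = Path(tmp, "processed")
    processed.mkdir()
    (processed / "processed_data.pt").touch()
    (processed / "stale.pt").touch()
    (processed / "keep.txt").touch()

    removed = clear_processed(tmp)
    # Only files with the .pt suffix are deleted; everything else is left alone.
    remaining = sorted(p.name for p in processed.iterdir())

print(removed)
print(remaining)
```

Because the glob pattern is ``*.pt`` rather than the single file name, any other ``.pt`` file placed in ``root/processed`` is deleted as well.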

Member Function Documentation

◆ process()

process ( self)
Processes the data to create graph objects and stores them in ``root/processed/processed_data.pt``
where the root path is different for train and val.

Called internally by PyTorch.

Definition at line 334 of file geometric_datasets.py.

    def process(self):
        """
        Processes the data to create graph objects and stores them in ``root/processed/processed_data.pt``
        where the root path is different for train and val.

        Called internally by PyTorch.
        """
        num_samples = _preload(self)
        data_list = [_process_graph(self, i) for i in range(num_samples)]
        data, slices = self.collate(data_list)
        torch.save((data, slices), self.processed_paths[0])

        # Delete attributes
        del self.x, self.y, self.avail_samples, data_list, data, slices
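
The ``(data, slices)`` pair saved by ``process()`` can be pictured as one concatenated store plus the boundaries of each graph inside it. The following is a deliberately simplified pure-Python analogue for a single attribute, not PyTorch Geometric's ``collate`` implementation; ``collate_lists`` and ``get_item`` are hypothetical names.

```python
def collate_lists(graphs):
    """Concatenate per-graph node lists and record where each graph starts and ends."""
    flat, slices = [], [0]
    for nodes in graphs:
        flat.extend(nodes)
        # Each entry marks the end of one graph and the start of the next.
        slices.append(len(flat))
    return flat, slices


def get_item(flat, slices, i):
    """Recover graph ``i`` from the concatenated store."""
    return flat[slices[i]:slices[i + 1]]


# Three toy "graphs" with 3, 1 and 2 nodes respectively.
graphs = [[0.1, 0.2, 0.3], [1.5], [2.0, 2.5]]
flat, slices = collate_lists(graphs)

print(slices)
print(get_item(flat, slices, 2))
```

Storing one flat container and a list of offsets, instead of many small objects, is what allows the whole dataset to be saved to and loaded from a single file.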

◆ processed_file_names()

processed_file_names ( self)
Name of processed file.

Definition at line 328 of file geometric_datasets.py.

    def processed_file_names(self):
        """
        Name of processed file.
        """
        return ["processed_data.pt"]
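
The file name returned here is combined with the ``processed`` subdirectory of ``root`` to give the location stated in the ``process()`` documentation, ``root/processed/processed_data.pt``. A sketch of that path construction with ``pathlib``; ``processed_paths_for`` is a hypothetical helper, not part of the class or of PyTorch Geometric, and ``data/train`` is a made-up root path.

```python
from pathlib import Path


def processed_paths_for(root, processed_file_names):
    """Hypothetical helper: join root, the ``processed`` directory and each file name."""
    return [Path(root, "processed", name) for name in processed_file_names]


paths = processed_paths_for("data/train", ["processed_data.pt"])
print(paths[0].as_posix())
```

Since the file name is fixed, train and validation data only end up in different files because their ``root`` paths differ.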

Member Data Documentation

◆ avail_samples

avail_samples

Temporary attribute, deleted at the end of process().

Definition at line 347 of file geometric_datasets.py.

◆ data

data

Data, loaded together with slices: self.data, self.slices = torch.load(self.processed_paths[0]).

Definition at line 325 of file geometric_datasets.py.

◆ edge_features

edge_features = edge_features

Edge features.

Definition at line 309 of file geometric_datasets.py.

◆ global_features

global_features = global_features

Global features.

Definition at line 311 of file geometric_datasets.py.

◆ n_files

n_files = n_files

Number of files.

Definition at line 305 of file geometric_datasets.py.

◆ node_features

node_features = features

Node features.

Definition at line 307 of file geometric_datasets.py.

◆ normalize

normalize = normalize

Normalize.

Definition at line 302 of file geometric_datasets.py.

◆ root

root = Path(root)

Root path.

Definition at line 299 of file geometric_datasets.py.

◆ samples

samples = samples

Samples.

Definition at line 313 of file geometric_datasets.py.

◆ slices

slices

Slices, loaded together with data: self.data, self.slices = torch.load(self.processed_paths[0]).

Definition at line 325 of file geometric_datasets.py.

◆ x

x

Temporary attribute, deleted at the end of process().

Definition at line 347 of file geometric_datasets.py.

◆ y

y

Temporary attribute, deleted at the end of process().

Definition at line 347 of file geometric_datasets.py.


The documentation for this class was generated from the following file:
geometric_datasets.py