Belle II Software development
TrackQEEvaluationBaseTask Class Reference
Inherited by CDCTrackQEEvaluationTask, RecoTrackQEEvaluationTask and VXDTrackQEEvaluationTask.

Public Member Functions

TrackQETeacherBaseTask teacher_task (self)
 
Basf2PathTask data_collection_task (self)
 
def task_acronym (self)
 
def requires (self)
 
def output (self)
 
def run (self)
 

Static Public Attributes

b2luigi git_hash
 Use git hash / release of basf2 version as additional luigi parameter.
 
b2luigi n_events_testing = b2luigi.IntParameter()
 Number of events to generate for the test data set.
 
b2luigi n_events_training = b2luigi.IntParameter()
 Number of events to generate for the training data set.
 
b2luigi experiment_number = b2luigi.IntParameter()
 Experiment number of the conditions database, which e.g. defines the simulation geometry.
 
b2luigi process_type
 Define which kind of process shall be used.
 
b2luigi training_target
 Feature/variable to use as truth label in the quality estimator MVA classifier.
 
b2luigi exclude_variables
 List of collected variables to not use in the training of the QE MVA classifier.
 
b2luigi fast_bdt_option
 Hyperparameter options for the FastBDT algorithm.
 

Detailed Description

Base class for evaluating a quality estimator with ``basf2_mva_evaluate.py`` on a
separate test data set.

Evaluation tasks for VXD, CDC and combined QE can inherit from it.

Definition at line 1749 of file combined_quality_estimator_teacher.py.
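
The inheritance pattern described above can be sketched without b2luigi. This is only an illustration under stated assumptions: `EvaluationBaseSketch`, `DummyTeacher`, `DummyDataCollection` and `DummyCDCEvaluation` are hypothetical stand-ins, not classes from the basf2 sources. The sketch shows a subclass overriding the three properties that the base class leaves unimplemented.

```python
class EvaluationBaseSketch:
    """Hypothetical stand-in for TrackQEEvaluationBaseTask (no b2luigi needed)."""

    @property
    def teacher_task(self):
        raise NotImplementedError("Evaluation Tasks must define a teacher task to require")

    @property
    def data_collection_task(self):
        raise NotImplementedError("Evaluation Tasks must define a data collection task to require")

    @property
    def task_acronym(self):
        raise NotImplementedError("Evaluation Tasks must define a task acronym.")


class DummyTeacher:
    """Hypothetical placeholder for a concrete teacher task class."""


class DummyDataCollection:
    """Hypothetical placeholder for a concrete data collection task class."""


class DummyCDCEvaluation(EvaluationBaseSketch):
    """Hypothetical subclass that fills in the three required properties."""

    @property
    def teacher_task(self):
        # Return the class itself; the base class instantiates it in requires()
        return DummyTeacher

    @property
    def data_collection_task(self):
        return DummyDataCollection

    @property
    def task_acronym(self):
        return "cdc"
```

Accessing any of the three properties on the base class raises `NotImplementedError`, so a missing override surfaces as soon as `requires()`, `output()` or `run()` touches it.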

Member Function Documentation

◆ data_collection_task()

Basf2PathTask data_collection_task (   self)
Property defining the specific ``DataCollectionTask`` to require.  Must
be implemented by the inheriting specific teacher task class.

Definition at line 1811 of file combined_quality_estimator_teacher.py.

1811     def data_collection_task(self) -> Basf2PathTask:
1812         """
1813         Property defining the specific ``DataCollectionTask`` to require. Must
1814         be implemented by the inheriting specific teacher task class.
1815         """
1816         raise NotImplementedError(
1817             "Evaluation Tasks must define a data collection task to require "
1818         )
1819

◆ output()

def output (   self)
Generate list of output files that the task should produce.
The task is considered finished if and only if the outputs all exist.

Definition at line 1858 of file combined_quality_estimator_teacher.py.

1858     def output(self):
1859         """
1860         Generate list of output files that the task should produce.
1861         The task is considered finished if and only if the outputs all exist.
1862         """
1863         weightfile_details = create_fbdt_option_string(self.fast_bdt_option)
1864         evaluation_pdf_output = self.teacher_task.weightfile_identifier_basename + weightfile_details + ".pdf"
1865         yield self.add_to_output(evaluation_pdf_output)
1866
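
The output name is the teacher task's weight file basename, followed by a string derived from `fast_bdt_option`, followed by `.pdf`. A minimal sketch of that concatenation is given here. `option_string_sketch` is a hypothetical stand-in for `create_fbdt_option_string`, whose real format is not shown on this page and may differ, and the example values are made up.

```python
def option_string_sketch(fast_bdt_option):
    """Hypothetical stand-in for create_fbdt_option_string; the real format may differ."""
    return "".join(f"_{value}" for value in fast_bdt_option)


def evaluation_pdf_name(weightfile_identifier_basename, fast_bdt_option):
    """Concatenate basename, option string and extension as output() does."""
    return weightfile_identifier_basename + option_string_sketch(fast_bdt_option) + ".pdf"


# Made-up basename and options, for illustration only
example_name = evaluation_pdf_name("cdc_qe", [200, 8, 3])
```

Because the option string is part of the file name, evaluations with different FastBDT options produce distinct output files and do not mark each other as finished.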

◆ requires()

def requires (   self)
Generate list of luigi Tasks that this Task depends on.

Reimplemented in RecoTrackQEEvaluationTask.

Definition at line 1829 of file combined_quality_estimator_teacher.py.

1829     def requires(self):
1830         """
1831         Generate list of luigi Tasks that this Task depends on.
1832         """
1833         yield self.teacher_task(
1834             n_events_training=self.n_events_training,
1835             experiment_number=self.experiment_number,
1836             process_type=self.process_type,
1837             training_target=self.training_target,
1838             exclude_variables=self.exclude_variables,
1839             fast_bdt_option=self.fast_bdt_option,
1840         )
1841         if 'USEREC' in self.process_type:
1842             if 'USERECBB' in self.process_type:
1843                 process = 'BBBAR'
1844             elif 'USERECEE' in self.process_type:
1845                 process = 'BHABHA'
1846             yield CheckExistingFile(
1847                 filename='datafiles/qe_records_N' + str(self.n_events_testing) + '_' + process + '_test_' +
1848                 self.task_acronym + '.root'
1849             )
1850         else:
1851             yield self.data_collection_task(
1852                 num_processes=MasterTask.num_processes,
1853                 n_events=self.n_events_testing,
1854                 experiment_number=self.experiment_number,
1855                 random_seed=self.process_type + '_test',
1856             )
1857
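
For the `USEREC` process types, the required test file name is assembled from the number of test events, the process name and the task acronym. The helper below is a sketch of that naming rule as a standalone function. The function name is hypothetical, and raising `ValueError` for other `USEREC` values is a choice made for this sketch, since the listing assigns `process` only for `USERECBB` and `USERECEE`.

```python
def existing_test_records_file(process_type, n_events_testing, task_acronym):
    """Build the records file name expected for USERECBB / USERECEE process types."""
    if "USERECBB" in process_type:
        process = "BBBAR"
    elif "USERECEE" in process_type:
        process = "BHABHA"
    else:
        # Assumption of this sketch: fail loudly instead of leaving `process` unset
        raise ValueError(f"No existing-file naming rule for process type {process_type!r}")
    return f"datafiles/qe_records_N{n_events_testing}_{process}_test_{task_acronym}.root"


# Made-up parameter values, for illustration only
example_file = existing_test_records_file("USERECBB", 1000, "cdc")
```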

◆ run()

def run (   self)
Run ``basf2_mva_evaluate.py`` subprocess to evaluate QE MVA.

The MVA weight file created from training on the training data set is
evaluated on separate test data.

Definition at line 1868 of file combined_quality_estimator_teacher.py.

1868     def run(self):
1869         """
1870         Run ``basf2_mva_evaluate.py`` subprocess to evaluate QE MVA.
1871
1872         The MVA weight file created from training on the training data set is
1873         evaluated on separate test data.
1874         """
1875         weightfile_details = create_fbdt_option_string(self.fast_bdt_option)
1876         evaluation_pdf_output_basename = self.teacher_task.weightfile_identifier_basename + weightfile_details + ".pdf"
1877
1878         evaluation_pdf_output_path = self.get_output_file_name(evaluation_pdf_output_basename)
1879
1880         if 'USEREC' in self.process_type:
1881             if 'USERECBB' in self.process_type:
1882                 process = 'BBBAR'
1883             elif 'USERECEE' in self.process_type:
1884                 process = 'BHABHA'
1885             datafiles = 'datafiles/qe_records_N' + str(self.n_events_testing) + '_' + \
1886                 process + '_test_' + self.task_acronym + '.root'
1887         else:
1888             datafiles = self.get_input_file_names(
1889                 self.data_collection_task.get_records_file_name(
1890                     self.data_collection_task,
1891                     n_events=self.n_events_testing,
1892                     random_seed=self.process_type + '_test_' +
1893                     self.task_acronym))[0]
1894         cmd = [
1895             "basf2_mva_evaluate.py",
1896             "--identifiers",
1897             self.get_input_file_names(
1898                 self.teacher_task.get_weightfile_xml_identifier(
1899                     self.teacher_task,
1900                     fast_bdt_option=self.fast_bdt_option))[0],
1901             "--datafiles",
1902             datafiles,
1903             "--treename",
1904             self.teacher_task.tree_name,
1905             "--outputfile",
1906             evaluation_pdf_output_path,
1907         ]
1908
1909         # Prepare log files
1910         log_file_dir = get_log_file_dir(self)
1911         # check if directory already exists, if not, create it. I think this is necessary as this task does not
1912         # inherit properly from b2luigi and thus does not do it automatically??
1913         try:
1914             os.makedirs(log_file_dir, exist_ok=True)
1915             # the following should be unnecessary as exist_ok=True should ensure that no FileExistsError is raised. I
1916             # might ask about a permission error...
1917         except FileExistsError:
1918             print('Directory ' + log_file_dir + ' already exists.')
1919         stderr_log_file_path = log_file_dir + "stderr"
1920         stdout_log_file_path = log_file_dir + "stdout"
1921         with open(stdout_log_file_path, "w") as stdout_file:
1922             stdout_file.write(f'stdout output of the command:\n{" ".join(cmd)}\n\n')
1923         if os.path.exists(stderr_log_file_path):
1924             # remove stderr file if it already exists b/c in the following it will be opened in appending mode
1925             os.remove(stderr_log_file_path)
1926
1927         # Run evaluation via subprocess and write output into logfiles
1928         with open(stdout_log_file_path, "a") as stdout_file:
1929             with open(stderr_log_file_path, "a") as stderr_file:
1930                 try:
1931                     subprocess.run(cmd, check=True, stdout=stdout_file, stderr=stderr_file)
1932                 except subprocess.CalledProcessError as err:
1933                     stderr_file.write(f"Evaluation failed with error:\n{err}")
1934                     raise err
1935
1936
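
The logging pattern of `run()` (write a header with the command line, then redirect the subprocess output into `stdout` and `stderr` log files) can be tried in isolation. The sketch below is not the basf2 implementation: `run_with_logs` is a hypothetical helper, and any command list can be passed in place of the `basf2_mva_evaluate.py` call.

```python
import os
import subprocess


def run_with_logs(cmd, log_file_dir):
    """Run cmd and capture its stdout and stderr in files inside log_file_dir."""
    os.makedirs(log_file_dir, exist_ok=True)
    stdout_path = os.path.join(log_file_dir, "stdout")
    stderr_path = os.path.join(log_file_dir, "stderr")
    # Mode "w" truncates logs from earlier runs, so no explicit removal is needed
    with open(stdout_path, "w") as stdout_file, open(stderr_path, "w") as stderr_file:
        stdout_file.write(f'stdout output of the command:\n{" ".join(cmd)}\n\n')
        # Flush so the header lands in the file before the child process writes
        stdout_file.flush()
        try:
            subprocess.run(cmd, check=True, stdout=stdout_file, stderr=stderr_file)
        except subprocess.CalledProcessError as err:
            stderr_file.write(f"Evaluation failed with error:\n{err}")
            raise
    return stdout_path, stderr_path
```

A call such as `run_with_logs([sys.executable, "-c", "print('evaluated')"], "logs")` writes the header and the child's output to `logs/stdout`. With `check=True`, a non-zero exit status raises `subprocess.CalledProcessError`, which is re-raised after being logged.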

◆ task_acronym()

def task_acronym (   self)
Acronym to distinguish between cdc, vxd and rec(o) MVA

Definition at line 1821 of file combined_quality_estimator_teacher.py.

1821     def task_acronym(self):
1822         """
1823         Acronym to distinguish between cdc, vxd and rec(o) MVA
1824         """
1825         raise NotImplementedError(
1826             "Evaluation Tasks must define a task acronym."
1827         )
1828

◆ teacher_task()

TrackQETeacherBaseTask teacher_task (   self)
Property defining specific teacher task to require.

Definition at line 1802 of file combined_quality_estimator_teacher.py.

1802     def teacher_task(self) -> TrackQETeacherBaseTask:
1803         """
1804         Property defining specific teacher task to require.
1805         """
1806         raise NotImplementedError(
1807             "Evaluation Tasks must define a teacher task to require "
1808         )
1809

Member Data Documentation

◆ exclude_variables

b2luigi exclude_variables
static
Initial value:
= b2luigi.ListParameter(
)

List of collected variables to not use in the training of the QE MVA classifier.

These are excluded in addition to variables containing the "truth" substring, which are excluded by default.

Definition at line 1789 of file combined_quality_estimator_teacher.py.
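
The exclusion rule can be illustrated with a small filter. This is a sketch of the described behaviour only: `select_training_variables` and the variable names are hypothetical, and the actual selection is done by the teacher task, whose code is not shown on this page.

```python
def select_training_variables(collected_variables, exclude_variables):
    """Drop variables containing "truth" and variables listed explicitly."""
    excluded = set(exclude_variables)
    return [
        name for name in collected_variables
        if "truth" not in name and name not in excluded
    ]


# Made-up variable names, for illustration only
example_selection = select_training_variables(
    ["pt", "truth", "truth_matched", "n_hits", "chi2"],
    exclude_variables=["chi2"],
)
```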

◆ experiment_number

b2luigi experiment_number = b2luigi.IntParameter()
static

Experiment number of the conditions database, which e.g. defines the simulation geometry.

Definition at line 1772 of file combined_quality_estimator_teacher.py.

◆ fast_bdt_option

b2luigi fast_bdt_option
static
Initial value:
= b2luigi.ListParameter(
)

Hyperparameter options for the FastBDT algorithm.

Definition at line 1795 of file combined_quality_estimator_teacher.py.

◆ git_hash

b2luigi git_hash
static
Initial value:
= b2luigi.Parameter(
)

Use git hash / release of basf2 version as additional luigi parameter.

This parameter is already set in all other tasks that inherit from Basf2Task. For this task, I decided against inheriting from Basf2Task because it already calls a subprocess and therefore does not need a dispatchable process method.

Definition at line 1762 of file combined_quality_estimator_teacher.py.

◆ n_events_testing

b2luigi n_events_testing = b2luigi.IntParameter()
static

Number of events to generate for the test data set.

Definition at line 1768 of file combined_quality_estimator_teacher.py.

◆ n_events_training

b2luigi n_events_training = b2luigi.IntParameter()
static

Number of events to generate for the training data set.

Definition at line 1770 of file combined_quality_estimator_teacher.py.

◆ process_type

b2luigi process_type
static
Initial value:
= b2luigi.Parameter(
)

Define which kind of process shall be used.

Decide between simulating BBBAR, BHABHA, MUMU, YY, DDBAR, UUBAR, SSBAR or CCBAR events, reconstructing DATA or already simulated files (USESIMBB/EE), or running on existing reconstructed files (USERECBB/EE).

Definition at line 1776 of file combined_quality_estimator_teacher.py.
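
The three groups of process types listed above can be told apart with substring checks similar to the `'USEREC' in self.process_type` test used in `requires()` and `run()`. The function below is only a sketch. `classify_process_type` and its return labels are hypothetical, and the exact matching for `USESIM` and `DATA` is an assumption, since that part of the source is not shown here.

```python
SIMULATED_PROCESSES = {"BBBAR", "BHABHA", "MUMU", "YY", "DDBAR", "UUBAR", "SSBAR", "CCBAR"}


def classify_process_type(process_type):
    """Map a process_type string to the kind of workflow it selects."""
    if "USEREC" in process_type:
        return "use existing reconstructed files"
    if "USESIM" in process_type:
        return "reconstruct already simulated files"
    if process_type == "DATA":
        return "reconstruct data"
    if process_type in SIMULATED_PROCESSES:
        return "simulate and reconstruct"
    raise ValueError(f"Unknown process type: {process_type!r}")
```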

◆ training_target

b2luigi training_target
static
Initial value:
= b2luigi.Parameter(
)

Feature/variable to use as truth label in the quality estimator MVA classifier.

Definition at line 1782 of file combined_quality_estimator_teacher.py.


The documentation for this class was generated from the following file: