SectorMap training#
The SectorMap needs to be trained on MC events. During this training the friendship relations between sectors and the allowed ranges for the applied filter variables are learned. The training procedure can be very roughly separated into four steps:
collection of training data: Large MC samples are generated and the relevant data to train the SectorMap are written to disk.
SectorMap training: The actual training of the SectorMap. For this the data created in the previous step is read, and the friendship relations and filter ranges are estimated (“trained”).
finalization of the SectorMap: The SectorMap from the previous step is usable as it is, but we provide SectorMaps with and without SVD time cuts applied. Therefore a SectorMap with the timing cuts removed has to be generated. In addition the two SectorMaps are brought into the correct format to be uploaded to the database.
validation of the SectorMap: Run the tracking validation with the new SectorMap to validate it.
The scripts used for the SectorMap training are collected in tracking/scripts/tracking/secMapTraining/ and tracking/tools/; their usage is outlined in the following sections.
Collection of training data#
For the SectorMap training a large amount of training data needs to be generated. By default we generate 10 million BBbar events and an additional 2 million Bhabha events. Both event types have additional muon tracks mixed in using the ParticleGun event generator. To make the SectorMap more robust against displaced IP positions, the IP position is randomly varied for each generated sub-sample. The concrete setup of the event generators is done by the function tracking.secMapTraining.SectorMapTrainingUtils.add_event_generation. We train on MC track candidates only, so adding beam background or changing the beam background simulation should have no effect on the training; therefore none of the generated samples include beam background.
Please note that a special reconstruction chain is applied for this MC generation. When using the commands described below this chain is set up automatically. If you need to inspect or adjust it, have a look at the function tracking.secMapTraining.SectorMapTrainingUtils.add_simulation_and_reconstruction_modules.
By default 2500 sub-samples of BBbar with 4000 events each and 500 sub-samples of Bhabha with 4000 events each are generated. To generate a single sub-sample you can run the following command:
tracking-vxdtf2-collect-train-data --rndSeed 12345 --eventType BBbar --outputDir output/
There are several command line parameters (in addition to those shown in the example) which can be used to customize the data generation. For most of them the default values should be sufficient. A full list of available parameters can be obtained by running the following command in a shell:
tracking-vxdtf2-collect-train-data --help
Please note that training with PXD data (off by default) has not been maintained for some time, so activating this feature may lead to strange results. In addition, training with PXD data requires adding a new SectorMap-config to the SectorMapBootstrap module by changing the code. If you intend to train a SectorMap with PXD data, please contact the tracking group.
The output files are named after the SectorMap-config used for the training, as implemented in the SectorMapBootstrap module, and the type of dataset generated. At the moment there is only one config implemented, called SVDOnlyDefault. Consequently all output files are named similar to SVDOnlyDefault_BBbar24956SVDOnly.root, where the number corresponds to the random seed used during the generation and the name of the dataset (BBbar or BhaBha) is included. By default the generated events are discarded and only the information needed for the SectorMap training is written out. The full ROOT output can be activated via a command line parameter, but note that this will take more than 1 TB of disk space if a default SectorMap is trained (12 million events).
An additional script is provided which sends all jobs needed to generate the full training sample to a queue. This script assumes that the attached queue is an LSF queue (e.g. at KEKcc). If you don’t have access to an LSF queue you would need to adapt the scripts or cook up your own recipe. The default usage of the script for submitting the jobs is as follows:
tracking-vxdtf2-submitAllCollectionJobs --outputDir OUTPUTDIR
This will submit the default number of jobs (2500 + 500) with the default number of events per job (4000) to a queue called l, resulting in a total of 12 million generated events. The default parameters will work out of the box at KEKcc, except for the output directory, which you should create and set yourself. Running this command with default settings will use on the order of 100 GB of disk space, so make sure to direct the output to a disk with sufficient quota. Internally the tracking-vxdtf2-collect-train-data command is called with the settings used for the default SectorMap training. There are several command line parameters to customize the behaviour of the script (e.g. number of jobs, name of the queue, number of events per job, …). A full list of available options can be obtained by running the following command:
tracking-vxdtf2-submitAllCollectionJobs --help
Note that for SectorMaps applied on real data the displacement of the sensors should be included in the training. By default the MC uses a perfect geometry, which is not the case for the actual Belle II detector. Although the differences are small, they do have an effect on the performance of the trained SectorMap. The default way of including a displaced geometry in the training is to provide a global tag containing the displaced SVD geometry (assuming the PXD is ignored). Both of the above commands (tracking-vxdtf2-submitAllCollectionJobs and tracking-vxdtf2-collect-train-data) provide the option --prependGT, which can be used to give the name of such a global tag; the given global tag will be prepended to the other global tags.
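For example, a single collection job could be started as follows, where displaced_svd_geometry_gt is a placeholder for a global tag that actually contains the displaced SVD geometry:
tracking-vxdtf2-collect-train-data --rndSeed 12345 --eventType BBbar --outputDir output/ --prependGT displaced_svd_geometry_gt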
The script tracking/scripts/tracking/secMapTraining/CreateSensorDisplacements.py can be used to create xml files containing sensor displacements, which in turn can be used to create a geometry with sensors displaced with respect to the ideal geometry. You have to provide global tags which contain the geometry payload and the payloads containing the relevant alignment data (named VXDAlignment*). The payloads can either be provided as a list of global tags, separated by spaces, using the --listOfGT parameter, or as a local database using the --localDB option. Note that the IOVs of the alignment data and the geometry have to match. The experiment number and run number can be specified via command line parameters. Examples are given below:
basf2 CreateSensorDisplacements.py -- --help
basf2 CreateSensorDisplacements.py -- --localDB database.txt --listOfGT globaltag1 globaltag2 globaltag2 --expNum 0 --runNum 0
The first command displays all available command line parameters. Note that the leading -- is needed.
The output will be two xml files containing the displacements for the SVD (SVD-Alignment.xml) and the PXD (PXD-Alignment.xml). The geometry in xml format is stored in svd/data and pxd/data, respectively. These directories already contain the aforementioned files, which should be replaced with the newly created versions.
It is recommended not to modify the original folders svd/data and pxd/data, to avoid messing with your release. Instead you should either create a copy of those folders or work on a branch. Once you have replaced the corresponding files you can use the Geometry module to create the actual payloads. For the Geometry module you need to set the option createPayloads to True, and you should set the components parameter accordingly. If you use a copy of the original xml files you may also need to set the fileName parameter in the Gearbox module. An example for the SVD geometry is given in the script svd/scripts/dbImporters/create_SVDGeometryPar_payload.py. The output will be the geometry payloads, which can be uploaded to the database or used as a local database.
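A minimal steering sketch for this step is given below. It is only an illustration under assumptions: the Gearbox fileName and the components list are placeholders that you need to adapt to your copied xml files, and the script mentioned above remains the authoritative example.
import basf2

# Minimal sketch: create geometry payloads from (modified) xml files.
main = basf2.create_path()

# A single event is enough to trigger the payload creation.
main.add_module("EventInfoSetter", expList=[0], runList=[0], evtNumList=[1])

# Point the Gearbox to the (copied) xml geometry description (placeholder path).
main.add_module("Gearbox", fileName="geometry/Belle2.xml")

# Build the geometry for the SVD only and write out the payloads.
main.add_module("Geometry", components=["SVD"], createPayloads=True)

basf2.process(main)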
SectorMap training#
Given the training data generated in the previous step the SectorMap can now be trained. This is done by the command tracking-vxdtf2-train-SectorMap. You need to specify the input training dataset (see previous step). The usual way to call that command is the following:
tracking-vxdtf2-train-SectorMap --train_sample output/SVDOnlyDefault_*.root
Please note that you can and should use wildcards to specify the training sample. The above example will train on all root files in the directory output whose names start with SVDOnlyDefault. The input training sample is the only non-optional parameter. A full list of available parameters can be obtained by running the command:
tracking-vxdtf2-train-SectorMap --help
Also note that with the default settings the training process takes very long (of the order of 24 hours) and uses a large amount of resources (roughly 32 GB of RAM). From past experience no queue at KEKcc was found which can handle these requirements, so past SectorMaps have been trained interactively on one of the KEKcc worker nodes. If keeping an ssh connection open for 24 hours is not an option, have a look at the Linux command nohup, which lets you run the training in the background and log off without the command being stopped.
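For example, the training could be started in the background like this (the log file name is arbitrary):
nohup tracking-vxdtf2-train-SectorMap --train_sample output/SVDOnlyDefault_*.root > training.log 2>&1 &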
The parameter --threshold is used to prune the SectorMap. It is given in per cent and by default removes 70% of the least used sector connections. The value of 70% was optimized and provided the best results in terms of efficiency and fake rate, so it is advised to keep the default value for this parameter unless you know what you are doing.
A global tag to be prepended can also be specified for the training. At this stage only the numbering of the sensors is read from the geometry.
Note that the command will issue a B2FATAL if the output SectorMap already exists, to prevent accidentally overwriting old SectorMaps. To fix this, either move the old SectorMap or set a different output name using the --secmap parameter.
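For example (the output file name below is just a placeholder):
tracking-vxdtf2-train-SectorMap --train_sample output/SVDOnlyDefault_*.root --secmap myNewSectorMap.root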
The output of this procedure is a SectorMap which can already be used for processing. By default this SectorMap contains cuts on the SVD timing information. How to obtain a SectorMap without timing is shown in the following section.
Finalization of the SectorMap#
By default a SectorMap is always trained including timing information, but tracking usually provides SectorMaps both with and without timing information. The timing information is removed by copying the original SectorMap and setting the ranges of the filters that use time to (-INF, +INF) in the copied SectorMap.
In addition both SectorMaps, with and without timing, need to be brought into a format that is expected by the database.
Both of these tasks are performed by executing the following command:
tracking-vxdtf2-prepare-SectorMap --inputSectorMap NameInputSectorMap.root
The only parameter is the name of the input SectorMap (including its directory). The output will be a new SectorMap with the timing information removed, named after the input SectorMap with the postfix _timingRemoved added. This SectorMap usually uses a bit less disk space (around 16 MB vs. 17 MB).
The local database with both SectorMap payloads will be created in your current folder in a directory named localSectorMapDB. Note that the IOVs specified in the file database.txt contained in that folder are dummy values; these MUST be adjusted before you upload anything to the database. After adjusting the IOVs you can upload the SectorMaps to the database using the b2conditionsdb: Conditions DB interface command.
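As a sketch, assuming the usual b2conditionsdb upload workflow and a placeholder global tag name, the upload could look like this:
b2conditionsdb upload my_staging_globaltag localSectorMapDB/database.txt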
Validation of the SectorMap#
The easiest method to validate a SectorMap is to use the tracking validation scripts tracking/validation/VXDTF2TrackingValidation.py or tracking/validation/VXDTF2TrackingValidationBkg.py, where the latter uses input files with beam background included.
By default SectorMaps are read from the database, so you need to tell those scripts to use the SectorMap you want to test. The easiest way to achieve that is to set the environment variable BELLE2_TESTING_VXDTF2_SECMAP to the file name of the SectorMap you want to test (including its path):
export BELLE2_TESTING_VXDTF2_SECMAP="myPath/MySectorMap.root"
echo $BELLE2_TESTING_VXDTF2_SECMAP
Once this variable is set you can simply run either of the two validation scripts given above and check their output to validate the SectorMap. The validation can be run automatically with either of the following two shell commands (after basf2 is set up):
b2validation -s VXDTF2TrackingValidation.py
b2validation -s VXDTF2TrackingValidationBkg.py
These commands will automatically generate input samples and run the corresponding validation script on them. The output can be found under the results/ directory. Note that if you trained a SectorMap on an altered geometry, you need to make sure the input samples are also generated with that geometry. This means you would need to generate the input samples “by hand”, as the above method uses the default geometry.
Note that this method only works for these specific scripts. If you need to set the SectorMap in your custom code, you need to adjust the settings of the SectorMapBootstrap module. To achieve that, put the following code after the tracking chain has been set up:
import basf2

# Do not read the SectorMap from the conditions database ...
basf2.set_module_parameters(path, "SectorMapBootstrap", ReadSecMapFromDB=False)
# ... but read it from a local file instead.
basf2.set_module_parameters(path, "SectorMapBootstrap", ReadSectorMap=True)
basf2.set_module_parameters(path, "SectorMapBootstrap", SectorMapsInputFile="myPath/mySectorMap.root")
Alternatively one can also prepend a global tag which contains the new SectorMap:
import basf2
basf2.conditions.prepend_globaltag("MyGlobalTagName")