21.5. Smart Background Simulation

The Smart Background project aims to reduce the production time and resources required for directly skimmed background MC campaigns. To this end, a transformer-based neural network predicts the probability that an event will pass a given skim. The prediction is made directly after event generation, before the costly simulation and reconstruction steps (see Fig. 21.1), so that these steps can be skipped for events that the skim would filter out anyway. To keep distributions unbiased after neural network filtering, importance sampling is employed: the neural network output is used as the probability to keep an event, and each kept event is weighted with the inverse of that output.

../../_images/smartbkg_workflow.png

Fig. 21.1 Schematic view of skimmed MC production using Smart Background.
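The importance-sampling idea described above can be illustrated with a toy study that does not depend on basf2. This is only a sketch: the uniform "network output", the toy skim decision, and the names `sample_event` and `run_toy` are hypothetical and are not part of the Smart Background code. It shows that the weighted count of kept events reproduces the unfiltered skim yield even when the prediction is imperfect, while fewer events need to be simulated.

```python
import random


def sample_event(prob, rng):
    """Keep an event with probability `prob`; return its weight (0.0 if dropped)."""
    if rng.random() < prob:
        return 1.0 / prob
    return 0.0


def run_toy(n_events=200_000, seed=1):
    """Compare the true skim yield with the weighted yield after sampling."""
    rng = random.Random(seed)
    n_pass_true = 0      # events passing the skim without any filtering
    weighted_pass = 0.0  # weighted count of kept events passing the skim
    n_simulated = 0      # events that would still be simulated and reconstructed
    for _ in range(n_events):
        # Toy stand-in for the network output, bounded away from zero.
        prob = rng.uniform(0.05, 1.0)
        # Toy skim decision that deliberately differs from `prob`:
        # the inverse-probability weights correct for an imperfect prediction.
        passes_skim = rng.random() < prob ** 2
        n_pass_true += passes_skim
        weight = sample_event(prob, rng)
        if weight > 0.0:
            n_simulated += 1
            if passes_skim:
                weighted_pass += weight
    return n_pass_true, weighted_pass, n_simulated


if __name__ == "__main__":
    true_yield, weighted_yield, simulated = run_toy()
    print(f"true yield: {true_yield}, weighted yield: {weighted_yield:.1f}, "
          f"simulated events: {simulated}")
```

In this toy the two yields agree within statistical fluctuations, and only the sampled events would enter simulation and reconstruction.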

Note

Datasets produced using Smart Background are weighted and must be treated as such when analyzed! The weights are stored in the event extra info as weight_<SkimName>.
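What treating the data as weighted means in practice can be sketched in plain Python. The helper names below are hypothetical, and the weights are assumed to have already been read from the `weight_<SkimName>` event extra info into a list. Yields become sums of weights, their statistical uncertainty follows from the sum of squared weights, and histograms are filled with the weight instead of 1.

```python
import math


def weighted_yield(weights):
    """Return (sum of weights, statistical uncertainty sqrt(sum of w^2))."""
    total = sum(weights)
    uncertainty = math.sqrt(sum(w * w for w in weights))
    return total, uncertainty


def effective_entries(weights):
    """Number of unweighted events with the same relative uncertainty."""
    sum_w = sum(weights)
    sum_w2 = sum(w * w for w in weights)
    return sum_w * sum_w / sum_w2 if sum_w2 > 0 else 0.0


def weighted_histogram(values, weights, edges):
    """Fill a histogram with bin edges `edges`, adding each event's weight."""
    counts = [0.0] * (len(edges) - 1)
    for value, weight in zip(values, weights):
        for i in range(len(counts)):
            if edges[i] <= value < edges[i + 1]:
                counts[i] += weight
                break
    return counts


if __name__ == "__main__":
    weights = [1.0, 2.0, 4.0, 1.25]
    values = [0.1, 0.6, 0.7, 1.5]
    print(weighted_yield(weights))
    print(effective_entries(weights))
    print(weighted_histogram(values, weights, [0.0, 1.0, 2.0]))
```

Most analysis tools offer equivalent weighted fills; the point is only that an unweighted count of the events in a Smart Background dataset is not a meaningful yield.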

Warning

If you are running event generation and skimming in the same steering file, you have to pass roundToMdstPrecision=True to the skim. This is mandatory for the FEI skims (as large discrepancies have been observed there) and recommended for all other skims. If in doubt, check explicitly that your skim produces identical results when run in the same steering file and when run in a separate one.

21.5.1. Usage

To employ this method, we recommend using the skim.smartbkg.add_smartbkg_filtering() convenience function. It should be placed after the event generator but before simulation and reconstruction. As mandatory input it requires the skim object you are running. Any skim derived from skim.core.BaseSkim works, including skim.core.CombinedSkim, as long as the skims used are known to the trained model (see below). The function also needs to know the background type produced (uubar, ddbar, ssbar, ccbar, charged, mixed, taupair). By default this is inferred from the event extra info, which is filled for example by passing eventType to the event generator as below. If it cannot be inferred, you can provide it manually via the event_type argument. A part of your steering file might then look like this (for a full example see skim/examples/SmartBkgExampleSteering.py):

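# Imports assumed by this fragment (module aliases as used below)
import basf2 as b2
import generators as gen
import modularAnalysis as ma
import reconstruction as rec
import simulation as sim
import skim.smartbkg
from skim.WGs.fei import feiHadronic

# Main path; event setup is omitted here, see the full example
path = b2.create_path()

# Add event generator (evtgen for charged events in this example)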
finalstate = "charged"
gen.add_evtgen_generator(finalstate=finalstate, path=path, eventType=finalstate)

# Define skim
fei_skim = feiHadronic(
    analysisGlobaltag=ma.getAnalysisGlobaltag(),
    OutputFileName="your_output_file_name.udst.root",
    roundToMdstPrecision=True
)

# Add SmartBkg filtering by providing the skim
skim.smartbkg.add_smartbkg_filtering(
    skim=fei_skim,
    path=path
)

# Add simulation and reconstruction
sim.add_simulation(path)
rec.add_reconstruction(path)

# Apply the skim
fei_skim(path)

The event weights for the skim (or for each skim separately if you use a skim.core.CombinedSkim) are stored in the event extra info as weight_<SkimName>. If an event is not sampled for a particular skim, the corresponding weight is set to 0. If an event is sampled for none of the provided skims, it is filtered out to an empty path.
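The bookkeeping described above can be sketched as follows. This is not the module's implementation: the function name `sample_for_skims` is hypothetical, and drawing an independent random number per skim is an assumption, since the text does not say how the draws for different skims are correlated. The sketch only shows how the per-skim weights and the keep decision relate to each other.

```python
import random


def sample_for_skims(predictions, rng):
    """Sample one event for several skims.

    predictions: dict mapping skim name -> predicted pass probability in (0, 1].
    Returns (keep_event, extra_info), where extra_info mimics the event extra
    info entries weight_<SkimName>.
    """
    extra_info = {}
    for skim_name, prob in predictions.items():
        # Assumption: one independent draw per skim.
        sampled = rng.random() < prob
        # Inverse probability if sampled for this skim, otherwise 0.
        extra_info[f"weight_{skim_name}"] = 1.0 / prob if sampled else 0.0
    # The event is kept if it is sampled for at least one skim.
    keep_event = any(weight > 0.0 for weight in extra_info.values())
    return keep_event, extra_info


if __name__ == "__main__":
    rng = random.Random(42)
    keep, info = sample_for_skims({"feiHadronic": 0.25, "otherSkim": 0.5}, rng)
    print(keep, info)
```

An event kept because of one skim can therefore still carry a weight of 0 for another skim, which is why each skim has to be analyzed with its own weight.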

We currently provide a pre-trained model via the conditions database that is trained on 51 skims. The supported skim codes are 11180500, 11180600, 11640100, 12160100, 12160200, 12160300, 12160400, 13160200, 13160300, 14120300, 14120600, 14121100, 14140100, 14140101, 14140102, 14140200, 14141000, 14141001, 14141002, 15410300, 15420100, 15440100, 16460200, 17230100, 17230200, 17230400, 17230500, 17230600, 17240100, 17240300, 17240600, 17240700, 17241000, 17241200, 18000000, 18000001, 18020100, 18020200, 18020400, 18130100, 18360100, 18520100, 18520200, 18520400, 18520500, 18530200, 18570600, 18570700, 19120100, 19130201, 19130300.

For studies you may want to disable filtering and look at the model output. This is possible by setting the debug_mode argument of skim.smartbkg.add_smartbkg_filtering() to True. This will disable filtering and reweighting, and instead the model outputs will be saved to the event extra info as SmartBKG_Prediction_<SkimName>. An example of how to write out the model predictions as well as the skim flags is provided under skim/examples/SmartBkgDebugMode.py.
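One possible study with debug-mode output is a calibration check that compares the model prediction with the fraction of events that actually pass the skim. The sketch below assumes that the predictions and skim flags have already been read into Python lists and that the predictions lie in [0, 1]. The function name `calibration_table` is hypothetical, and whether the stored prediction is directly the sampling probability is not stated here, so treat the comparison as illustrative.

```python
def calibration_table(predictions, skim_flags, n_bins=5):
    """Return per-bin tuples (n_events, mean prediction, observed pass fraction).

    Bins are equal-width in the prediction; empty bins yield (0, None, None).
    """
    # Per bin: event count, sum of predictions, number of events passing the skim.
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for pred, flag in zip(predictions, skim_flags):
        index = min(int(pred * n_bins), n_bins - 1)  # pred == 1.0 goes to the last bin
        bins[index][0] += 1
        bins[index][1] += pred
        bins[index][2] += 1 if flag else 0
    table = []
    for count, pred_sum, n_pass in bins:
        if count == 0:
            table.append((0, None, None))
        else:
            table.append((count, pred_sum / count, n_pass / count))
    return table


if __name__ == "__main__":
    preds = [0.1, 0.1, 0.9, 0.9]
    flags = [0, 0, 1, 0]
    for row in calibration_table(preds, flags, n_bins=2):
        print(row)
```

Bins in which the observed pass fraction is far above the mean prediction indicate events that would receive large weights in a filtered production.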

For greater customisability you may also use the SmartBackground module directly. As mandatory input it requires the LFN codes and names of all skims used, and it can also be put into debug mode. The module performs the reweighting but does no filtering on its own. Instead, its return value is 1 if an event is sampled for at least one skim and 0 otherwise, which can be used to set up the filtering.

The code for the entire Smart Background project including the model setup, training script and data preparation can be found on GitLab.

21.5.2. Module documentation

skim.smartbkg.add_smartbkg_filtering(skim, path, empty_path=None, debug_mode=False, payload_weights='SmartBackgroundWeights_default', payload_config='SmartBackgroundConfig_default', event_type=None, activation_params=None)

Adds event preselection based on the SmartBkg neural network. Should be used only for directly skimmed MC productions. Must be added to the path after generator.add_abc_generator but before simulation.add_simulation. Given one or multiple skims, the model predicts the probability of each event passing each of the skims. Events are then sampled for each skim according to this probability. An event weight is stored for each skim in the event extra info as ‘weight_<SkimName>’, either as the inverse probability if the event is sampled for that skim, or 0 otherwise. If an event is sampled for none of the provided skims, it is rejected to the empty_path. Use case is the reduction of simulation time for directly skimmed MC productions.

Parameters:
  • skim (skim.core.BaseSkim or skim.core.CombinedSkim) – instance of the used skim

  • path (basf2.Path) – main path containing the generator modules; events that are kept continue on this path

  • empty_path (basf2.Path or None) – path to which rejected events are diverted (a new empty path is created if None)

  • debug_mode (bool) – enables debug mode (events are never rejected, instead the neural network prediction is written to the event extra info as ‘SmartBKG_Prediction_<SkimName>’)

  • payload_weights (str) – name of the payload storing neural network weights in ONNX format

  • payload_config (str) – name of the payload storing the SmartBackgroundConfig object

  • event_type (str or None) – type of events that are generated, allowed values are ‘charged’, ‘mixed’, ‘uubar’, ‘ddbar’, ‘ssbar’, ‘ccbar’, ‘taupair’; if None, automatically determined from the event extra info

  • activation_params (tuple(float, float) or None) – custom parameters (a, b) for the activation function (useful for testing/validation); if None, prefitted values for the chosen skims are used

SmartBackground

Event preselector based on the SmartBkg neural network. Should be used only for directly skimmed MC productions. Must be added to the path after the MC generator module but before simulation. Given one or multiple skims, the model predicts the probability of each event passing each of the skims. Events are then sampled for each skim according to this probability. An event weight is stored for each skim in the event extra info as ‘weight_<SkimName>’, either as the inverse probability if the event is sampled for that skim, or 0 otherwise. If an event is sampled for none of the provided skims, the module returns 0, otherwise 1. Use case is the reduction of simulation time for directly skimmed MC productions.

Package:

skim

Library:

libskim_modules.so

Required Parameters:
  • skimCodes (list(int))

    Skim LFN codes

  • skimNames (list(str))

    Skim names

Parameters:
  • activationOverrideParams (list(float), default=[0.5, 0.0])

    Parameters (a, b) of the activation function (clipped exponential)

  • debugMode (bool, default=False)

    Debug mode execution (in debug mode, always returns 1 and neural network predictions are saved to the event extra info as ‘SmartBKG_Prediction_<SkimName>’)

  • eventType (str, default='charged')

    Event type (charged, mixed, uubar, ccbar, ddbar, ssbar, taupair)

  • overrideActivation (bool, default=False)

    Override parameters (a, b) of the activation function (clipped exponential)

  • overrideEventType (bool, default=False)

    Override automatically determined event type

  • payloadConfig (str, default='SmartBackgroundConfig_default')

    Name of payload storing SmartBackgroundConfig object

  • payloadWeights (str, default='SmartBackgroundWeights_default')

    Name of payload storing neural network weights in ONNX format