20.4. Information for skim experts#
Tip
The functions and tools documented here are intended for skim liaisons and developers. If you are only interested in the selection criteria, then this section is probably not relevant for you.
20.4.1. Writing a skim#
In the skim package, skims are defined via the BaseSkim class. The skim package is organised around this for the following reasons:
this keeps the package organised, with every skim being defined in a predictable way,
this allows the skims to be located by standard helper tools such as b2skim-run and b2skim-stats-print, and
skims must be combined with other skims to reduce the number of grid job submissions, and the CombinedSkim class is written to combine objects of type BaseSkim.
To write a new skim, please follow these steps:
Start by defining a class which inherits from BaseSkim and give it the name of your skim. Put the class in an appropriate skim module for your working group. For example, the skim DarkSinglePhoton belongs in skim/scripts/skim/dark.py, and begins with the following definition:
class DarkSinglePhoton(BaseSkim):
    # docstring here explaining reconstructed decay modes and applied cuts.
[Mandatory] Tell us about your skim by setting the following attributes:
__description__: one-line summary describing the purpose of your skim.
__category__: a list of category keywords.
__authors__: list of skim authors.
__contact__: the name and contact email of the skim liaison responsible for this skim.
BaseSkim requires you to set these attributes in each subclass. Once these are set, we can add a lovely auto-generated header to the documentation of the skim by using the fancy_skim_header decorator.
@fancy_skim_header
class DarkSinglePhoton(BaseSkim):
    # docstring here describing your skim, and explaining cuts.
This header will appear as a “Note” block at the top of your skim class on Sphinx, and will also appear at the top of the help function in an interactive Python session:
>>> from skim.WGs.foo import MySkim
>>> help(MySkim)
Tip
If your skim does not define __description__, __category__, __authors__, __contact__, or build_lists, then you will see an error message like:
TypeError: Can't instantiate abstract class SinglePhotonDark with abstract methods __authors__
This can be fixed by defining these required attributes and methods.
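The error quoted above is the usual Python behaviour when an abstract base class is instantiated before all of its abstract members are defined. The sketch below reproduces that behaviour with a small stand-in class. The names SketchBaseSkim, IncompleteSkim, CompleteSkim and try_instantiate are hypothetical and exist only for this illustration; the real BaseSkim may enforce its requirements differently.

```python
from abc import ABC, abstractmethod


class SketchBaseSkim(ABC):
    """Stand-in for BaseSkim (assumption: not the real implementation)."""

    @property
    @abstractmethod
    def __description__(self):
        """One-line summary of the skim."""

    @property
    @abstractmethod
    def __authors__(self):
        """List of skim authors."""

    @abstractmethod
    def build_lists(self, path):
        """Return the particle list names built by the skim."""


class IncompleteSkim(SketchBaseSkim):
    # __authors__ is deliberately left undefined here.
    __description__ = "Example skim"

    def build_lists(self, path):
        return []


class CompleteSkim(IncompleteSkim):
    # Defining the missing attribute makes the class instantiable.
    __authors__ = ["A. Author"]


def try_instantiate(cls):
    """Return an instance, or the TypeError message if cls is still abstract."""
    try:
        return cls()
    except TypeError as err:
        return str(err)
```

Calling try_instantiate(IncompleteSkim) returns an error message that names __authors__, while try_instantiate(CompleteSkim) returns an instance. The exact wording of the message depends on the Python version.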
If you require any standard lists to be loaded for your skim, override the method load_standard_lists. This will be run before build_lists and additional_setup.
This step is separated into its own function so that the CombinedSkim class can do special handling of these functions to avoid accidentally loading a standard list twice when combining skims.
If any further setup is required, then override the additional_setup method.
[Mandatory] Define all cuts by overriding build_lists. This function is expected to return the list of particle lists reconstructed by the skim.
Changed in version release-06-00-00: Previously, this function was expected to set the attribute BaseSkim.SkimLists. This is now handled internally by BaseSkim, and BaseSkim.build_lists is expected to return the list of particle list names.
Skims can crash on the grid if the log files are too large. If any module is producing too much output, then override the attribute NoisyModules as a list of such modules, and their output will be set to print only error-level messages.
By default, the skim test file is a neutral \(B\) pair sample with beam background. If your skim has a retention rate of close to zero for this sample type, you may wish to override the attribute TestSampleProcess. This should be a label of a generic MC type, e.g. "ccbar", "charged", or "eemumu". This attribute is passed to skim.utils.testfiles.get_test_file, which retrieves a suitable test file, available in the property TestFiles.
[Mandatory] Add your skim to the registry, with an appropriate skim code (see Skim Registry).
With all of these steps followed, you will now be able to run your skim using the skim command line tools. To make sure that your skim does what you expect, and is feasible to put into production, please also complete the following steps:
Test your skim! The primary point of skims is to be run on the grid, so you want to be sure that the retention rate and processing time are low enough to make this process smooth.
The skim package contains a set of tools to make this straightforward for you. See Testing skim performance for more details.
Define validation histograms for your skim by overriding the method BaseSkim.validation_histograms, and running b2skim-generate-validation to auto-generate a steering file in the skim validation directory. The validation_histograms method should not be long: it should simply use the particle lists that have been created by build_lists to plot one or two key variables. If possible, do not do any further reconstruction or particle list loading here. Below is an example of what a typical method ought to contain.
def validation_histograms(self, path):
    # The validation package is not part of the light releases, so this import
    # must be made inside this function rather than at the top of the file.
    from validation_tools.metadata import create_validation_histograms

    # Combine B+ particle lists for a single histogram (assuming self.SkimLists only
    # has B+ particle lists). Not necessary if only one particle list is created.
    ma.copyLists(f"B+:{self}_validation", self.SkimLists, path=path)

    create_validation_histograms(
        rootfile=f"{self}_validation.root",
        particlelist=f"B+:{self}_validation",
        variables_1d=[
            ("deltaE", 20, -0.5, 0.5, "#Delta E", __liaison__,
             "$\\Delta E$ distribution of reconstructed $B^{+}$ candidates",
             "Peak around 0", "#Delta E [GeV]", "B^{+} candidates"),
            # Include "shifter" flag to have this plot shown to shifters
            ("Mbc", 20, 5.2, 5.3, "M_{bc}", __liaison__,
             "$M_{\\rm bc}$ distribution of reconstructed $B^{+}$ candidates",
             "Peak around 5.28", "M_{bc} [GeV]", "B^{+} candidates", "shifter")],
    )
See also
Documentation of create_validation_histograms for explanation of the expected arguments. Options to pay particular attention to:
Passing the “shifter” flag in metaoptions, which will allow the plot to be shown to shifters when they check validation.belle2.org.
Adding a contact email address with the contact option, preferably the contact email of your working group’s skim liaison. If this is set, then the B2Bot will know where to send polite emails in case the validation comparison fails.
20.4.2. Building skim lists in a steering file#
Calling an instance of a skim class will run the particle list loaders, setup function, list builder function, and uDST output function. So a minimal skim steering file might consist of the following:
import basf2 as b2
import modularAnalysis as ma
from skim.WGs.foo import MySkim
path = b2.Path()
ma.inputMdstList([], path=path)
skim = MySkim()
skim(path) # __call__ method loads standard lists, creates skim lists, and saves to uDST
b2.process(path)
After skim(path) has been called, the skim list names are stored in the Python list skim.SkimLists.
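The order of operations performed when a skim object is called can be pictured with a small stand-in class. This is only a sketch of the sequence described in this section (list loaders, setup, list builder, uDST output), not the actual BaseSkim code. SketchSkim and output_udst are hypothetical names, and a plain Python list stands in for a basf2 path.

```python
class SketchSkim:
    """Stand-in showing the call order only (not the real BaseSkim)."""

    def __init__(self, udstOutput=True):
        self.udstOutput = udstOutput
        self.SkimLists = []

    def load_standard_lists(self, path):
        path.append("load_standard_lists")

    def additional_setup(self, path):
        path.append("additional_setup")

    def build_lists(self, path):
        path.append("build_lists")
        # The builder returns the names of the lists it created.
        return ["B+:example"]

    def output_udst(self, path):  # hypothetical method name
        path.append("output_udst")

    def __call__(self, path):
        self.load_standard_lists(path)
        self.additional_setup(path)
        # The returned list names are stored on the skim object.
        self.SkimLists = self.build_lists(path)
        if self.udstOutput:
            self.output_udst(path)


path = []  # a plain list stands in for a basf2 path
skim = SketchSkim()
skim(path)

quiet_path = []
quiet_skim = SketchSkim(udstOutput=False)
quiet_skim(quiet_path)
```

After the call, path records the four steps in order and skim.SkimLists holds the list names returned by build_lists. With udstOutput=False the output step is skipped.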
Warning
There is a subtle but important technicality here: if BaseSkim.skim_event_cuts has been called, then the skim lists are not built for all events on the path, but they are built for all events on a conditional path. A side-effect of this is that no post-skim path can be safely defined for the CombinedSkim class (since a combined skim of five skims may have up to five conditional paths).
After a skim has been added to the path, the attribute BaseSkim.postskim_path contains a safe path for adding subsequent modules to (e.g. performing further reconstruction using the skim candidates). However, the final call to basf2.process must be passed the original (main) path.
skim = MySkim()
skim(path)
# Add subsequent modules to skim.postskim_path
ma.variablesToNtuple(skim.SkimLists[0], ["pt", "E"], path=skim.postskim_path)
# Process full path
b2.process(path)
The above code snippet will produce both uDST and ntuple output. To only build the skim lists without writing to uDST, pass the configuration parameter udstOutput=False during initialisation of the skim object:
skim = MySkim(udstOutput=False)
skim(path)
Disabling uDST output may be useful to you if you want to do any of the following:
print the statistics of the skim without producing any output files,
build the skim lists and perform further reconstruction or fitting on the skim candidates before writing the ROOT output,
go directly from unskimmed MDST to analysis ntuples in a single steering file (but please consider first using the centrally-produced skimmed uDSTs), or
use the skim flag to build the skim lists and write an event-level ntuple with information about which events pass the skim.
Tip
The tool b2skim-generate can be used to generate simple skim steering files like the example above. The tool b2skim-run is a standalone tool for running skims. b2skim-run is preferable for quickly testing a skim during skim development. b2skim-generate should be used as a starting point if you are doing anything more complicated than simply running a skim on an MDST file to produce a uDST file.
Skim flags#
When a skim is added to the path, an entry is added to the event extra info to indicate whether an event passes the skim or not. This flag is of the form eventExtraInfo(passes_<SKIMNAME>) (aliased to passes_<SKIMNAME> for convenience), and the flag name is stored in the property BaseSkim.flag.
In the example below, we build the skim lists, skip the uDST output, and write an ntuple containing the skim flag and other event-level variables:
skim = MySkim(udstOutput=False)
skim(path)
ma.variablesToNtuple("", [skim.flag, "nTracks"], path=path)
b2.process(path)
Skim flags can also be used in combined skims, with the individual flags being available in the list CombinedSkim.flags. In the example below, we run three skims in a combined skim, disable the uDST output, and then save the three skim flags to an ntuple.
skim = CombinedSkim(
SkimA(),
SkimB(),
SkimC(),
udstOutput=False,
)
skim(path)
ma.variablesToNtuple("", skim.flags + ["nTracks"], path=path)
b2.process(path)
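The naming scheme of the skim flags can be written out explicitly. The helper below is a hypothetical illustration of the passes_<SKIMNAME> convention described in this section; it does not call basf2, and skim_flag_names is not a function of the skim package.

```python
def skim_flag_names(skim_name):
    """Return (alias, full variable name) for the flag of the named skim."""
    alias = f"passes_{skim_name}"
    return alias, f"eventExtraInfo({alias})"


# Flag aliases for the three skims of a combined skim, plus an event-level variable.
combined_aliases = [skim_flag_names(name)[0] for name in ("SkimA", "SkimB", "SkimC")]
ntuple_variables = combined_aliases + ["nTracks"]
```

In a real steering file there is no need to build these strings by hand, since the names are available from BaseSkim.flag and CombinedSkim.flags.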
Tip
Skim flags are guaranteed to work on the main path (the variable path in the above examples). However, any other modules attempting to access the skim lists should be added to the postskim_path.
See also
Skim flags are implemented using two basf2 modules, which are documented in skim-utils-flags.
20.4.3. Running a skim#
In the skim package, there are command-line tools available for running skims, documented below. These take a skim name as a command line argument, and run the code defined in the corresponding subclass of BaseSkim.
b2skim-run: Run a skim#
Tip
This tool completely supplants the <SkimName>_Skim_Standalone.py steering files from previous versions of basf2. The standalone/ and combined/ directories no longer exist in the skim package from version-05-00-00 onwards.
General steering file for running skims.
usage: b2skim-run [-h] {single,combined,module} ...
Required Arguments
- action
Possible choices: single, combined, module
Run just one skim, or multiple skims at once.
Sub-commands:#
single#
Run a single skim.
b2skim-run single [-h] [-o Output uDST location] [-n MaxInputEvents]
[-i InputFileList [InputFileList ...]] [--data]
[--analysis-globaltag AnalysisGlobaltag]
[--pid-globaltag PIDGlobaltag]
Skim
Required Arguments
- Skim
Possible choices: Random, SystematicsTracking, Resonance, SystematicsRadMuMu, SystematicsEELL, SystematicsRadEE, SystematicsLambda, SystematicsPhiGamma, SystematicsFourLeptonFromHLTFlag, SystematicsRadMuMuFromHLTFlag, SystematicsJpsi, SystematicsKshort, SystematicsBhabha, SystematicsCombinedHadronic, SystematicsCombinedLowMulti, SystematicsDstar, PRsemileptonicUntagged, LeptonicUntagged, SLUntagged, B0toDstarl_Kpi_Kpipi0_Kpipipi, BtoDl_and_ROE_e_or_mu_or_lowmult, feiHadronicB0, feiHadronicBplus, feiSLB0, feiSLB0_RDstar, feiSLBplus, feiHadronic, feiSL, BtoXgamma, BtoXll, BtoXll_LFV, B0TwoBody, FourLepton, RadiativeDilepton, TDCPV_ccs, TDCPV_qqs, TDCPV_dilepton, BtoD0h_Kspi0, BtoD0h_Kspipipi0, BtoDstarpipipi0_D0pi_Kpi, B0toDpi_Kpipi, B0toDpi_Kspi, B0toDstarPi_D0pi_Kpi, B0toDstarPi_D0pi_Kpipipi_Kpipi0, B0toDrho_Kpipi, B0toDrho_Kspi, B0toDstarRho_D0pi_Kpi, B0toDstarRho_D0pi_Kpipipi_Kpipi0, BptoD0etapi_Kpi, BptoD0pipi0_Kpi, BtoD0h_hh, BtoD0h_Kpi, B0toDstarpipi0_D0pi_Kpi, BtoD0h_Kpipipi_Kpipi0, B0toDstaretapi_D0pi_Kpi, BtoD0h_Kshh, BtoD0rho_Kpi, BtoD0rho_Kpipipi_Kpipi0, B0toDD_Kpipi_Kspi, B0toDstarD, B0toD0Kpipi0_pi0, B0toDs1D, B0toDDs0star, B0toDomegapi_Kpipi_pipipi0, B0toDomegapi_Kspi_pipipi0, BtoD0pi_Kpiomega_pipipi0, InclusiveLambda, BottomoniumEtabExclusive, BottomoniumUpsilon, InclusiveUpsilon, CharmoniumPsi, XToD0_D0ToHpJm, XToD0_D0ToNeutrals, DstToD0Pi_D0ToRare, XToDp_DpToKsHp, XToDp_DpToHpHmJp, LambdacTopHpJm, DstToD0Pi_D0ToGeneric, LambdacToSHpJm, XicpTopHpJm, XicToXimPipPim, Xic0ToLHpJm, XicpToLKsHp, DstToD0Pi_D0ToHpJm, DstToD0Pi_D0ToHpJmPi0, DstToD0Pi_D0ToKsOmega, DstToD0Pi_D0ToHpJmEta, DstToD0Pi_D0ToNeutrals, DstToD0Pi_D0ToHpJmKs, EarlyData_DstToD0Pi_D0ToHpJmPi0, EarlyData_DstToD0Pi_D0ToHpHmPi0, DstToDpPi0_DpToHpPi0, DpToHpPi0, DpToKsHp, DstToD0Pi_D0ToHpHmHpJm, DstToD0Pi_D0ToVGamma, DpToPipEpEm, DpToPipMupMum, DpToPipKpKm, DpToHpOmega, DspToHpOmega, DpToEtaHp, SinglePhotonDark, GammaGammaControlKLMDark, ALP3Gamma, EGammaControlDark, InelasticDarkMatter, 
RadBhabhaV0Control, TauLFV, DimuonPlusMissingEnergy, ElectronMuonPlusMissingEnergy, DielectronPlusMissingEnergy, LFVZpVisible, BtoKplusLLP, TauGeneric, TauThrust, TauKshort, TwoTrackLeptonsForLuminosity, LowMassTwoTrack, SingleTagPseudoScalar, InelasticDarkMatterWithDarkHiggs, AA2uuuu, DimuonPlusVisibleDarkHiggs, DielectronPlusVisibleDarkHiggs, LowMassOneTrack, BtoPi0Pi0, BtoPi0Eta, BtoHadTracks, BtoHad1Pi0, BtoHad3Tracks1Pi0, BtoRhopRhom, BtoEtapKstp
Skim to run.
Optional Arguments
- -o, --output-udst-name
Location of output uDST file.
- -n, --max-input-events
Maximum number of input events to process.
- -i, --input-file-list
Input file list
- --data
Pass this flag if intending to run this skim on data, so that MC quantities are not saved in the output.
- --analysis-globaltag
Analysis globaltag to be passed to the skims.
- --pid-globaltag
PID globaltag to be passed to the skims.
combined#
Run several skims as a combined steering file.
b2skim-run combined [-h] [-n MaxInputEvents]
[-i InputFileList [InputFileList ...]] [--data]
[--analysis-globaltag AnalysisGlobaltag]
[--pid-globaltag PIDGlobaltag]
Skim [Skim ...]
Required Arguments
- Skim
Possible choices: Random, SystematicsTracking, Resonance, SystematicsRadMuMu, SystematicsEELL, SystematicsRadEE, SystematicsLambda, SystematicsPhiGamma, SystematicsFourLeptonFromHLTFlag, SystematicsRadMuMuFromHLTFlag, SystematicsJpsi, SystematicsKshort, SystematicsBhabha, SystematicsCombinedHadronic, SystematicsCombinedLowMulti, SystematicsDstar, PRsemileptonicUntagged, LeptonicUntagged, SLUntagged, B0toDstarl_Kpi_Kpipi0_Kpipipi, BtoDl_and_ROE_e_or_mu_or_lowmult, feiHadronicB0, feiHadronicBplus, feiSLB0, feiSLB0_RDstar, feiSLBplus, feiHadronic, feiSL, BtoXgamma, BtoXll, BtoXll_LFV, B0TwoBody, FourLepton, RadiativeDilepton, TDCPV_ccs, TDCPV_qqs, TDCPV_dilepton, BtoD0h_Kspi0, BtoD0h_Kspipipi0, BtoDstarpipipi0_D0pi_Kpi, B0toDpi_Kpipi, B0toDpi_Kspi, B0toDstarPi_D0pi_Kpi, B0toDstarPi_D0pi_Kpipipi_Kpipi0, B0toDrho_Kpipi, B0toDrho_Kspi, B0toDstarRho_D0pi_Kpi, B0toDstarRho_D0pi_Kpipipi_Kpipi0, BptoD0etapi_Kpi, BptoD0pipi0_Kpi, BtoD0h_hh, BtoD0h_Kpi, B0toDstarpipi0_D0pi_Kpi, BtoD0h_Kpipipi_Kpipi0, B0toDstaretapi_D0pi_Kpi, BtoD0h_Kshh, BtoD0rho_Kpi, BtoD0rho_Kpipipi_Kpipi0, B0toDD_Kpipi_Kspi, B0toDstarD, B0toD0Kpipi0_pi0, B0toDs1D, B0toDDs0star, B0toDomegapi_Kpipi_pipipi0, B0toDomegapi_Kspi_pipipi0, BtoD0pi_Kpiomega_pipipi0, InclusiveLambda, BottomoniumEtabExclusive, BottomoniumUpsilon, InclusiveUpsilon, CharmoniumPsi, XToD0_D0ToHpJm, XToD0_D0ToNeutrals, DstToD0Pi_D0ToRare, XToDp_DpToKsHp, XToDp_DpToHpHmJp, LambdacTopHpJm, DstToD0Pi_D0ToGeneric, LambdacToSHpJm, XicpTopHpJm, XicToXimPipPim, Xic0ToLHpJm, XicpToLKsHp, DstToD0Pi_D0ToHpJm, DstToD0Pi_D0ToHpJmPi0, DstToD0Pi_D0ToKsOmega, DstToD0Pi_D0ToHpJmEta, DstToD0Pi_D0ToNeutrals, DstToD0Pi_D0ToHpJmKs, EarlyData_DstToD0Pi_D0ToHpJmPi0, EarlyData_DstToD0Pi_D0ToHpHmPi0, DstToDpPi0_DpToHpPi0, DpToHpPi0, DpToKsHp, DstToD0Pi_D0ToHpHmHpJm, DstToD0Pi_D0ToVGamma, DpToPipEpEm, DpToPipMupMum, DpToPipKpKm, DpToHpOmega, DspToHpOmega, DpToEtaHp, SinglePhotonDark, GammaGammaControlKLMDark, ALP3Gamma, EGammaControlDark, InelasticDarkMatter, 
RadBhabhaV0Control, TauLFV, DimuonPlusMissingEnergy, ElectronMuonPlusMissingEnergy, DielectronPlusMissingEnergy, LFVZpVisible, BtoKplusLLP, TauGeneric, TauThrust, TauKshort, TwoTrackLeptonsForLuminosity, LowMassTwoTrack, SingleTagPseudoScalar, InelasticDarkMatterWithDarkHiggs, AA2uuuu, DimuonPlusVisibleDarkHiggs, DielectronPlusVisibleDarkHiggs, LowMassOneTrack, BtoPi0Pi0, BtoPi0Eta, BtoHadTracks, BtoHad1Pi0, BtoHad3Tracks1Pi0, BtoRhopRhom, BtoEtapKstp
List of skims to run as a combined skim.
Optional Arguments
- -n, --max-input-events
Maximum number of input events to process.
- -i, --input-file-list
Input file list
- --data
Pass this flag if intending to run this skim on data, so that MC quantities are not saved in the output.
- --analysis-globaltag
Analysis globaltag to be passed to the skims.
- --pid-globaltag
PID globaltag to be passed to the skims.
module#
Run all skims in a module.
b2skim-run module [-h] [-n MaxInputEvents]
[-i InputFileList [InputFileList ...]] [--data]
[--analysis-globaltag AnalysisGlobaltag]
[--pid-globaltag PIDGlobaltag]
module
Required Arguments
- module
Possible choices: semileptonic, ewp, charm, btocharm, dark, leptonic, taupair, lowMulti, btocharmless, systematics, quarkonium, fei, tdcpv
Skim module to run all skims for as combined steering file.
Optional Arguments
- -n, --max-input-events
Maximum number of input events to process.
- -i, --input-file-list
Input file list
- --data
Pass this flag if intending to run this skim on data, so that MC quantities are not saved in the output.
- --analysis-globaltag
Analysis globaltag to be passed to the skims.
- --pid-globaltag
PID globaltag to be passed to the skims.
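As an illustration of how the options of b2skim-run single fit together, the sketch below assembles a command line from the flags listed in the usage text above. The helper b2skim_run_single is hypothetical, the file name input.mdst.root is a placeholder, and the code only builds the argument list without running anything.

```python
import shlex


def b2skim_run_single(skim, input_files=(), output=None, max_events=None, data=False):
    """Assemble an argument list for `b2skim-run single` (hypothetical helper)."""
    cmd = ["b2skim-run", "single", skim]
    if output is not None:
        cmd += ["-o", output]
    if max_events is not None:
        cmd += ["-n", str(max_events)]
    if data:
        cmd.append("--data")
    if input_files:
        # -i accepts one or more files, so it is placed last.
        cmd += ["-i", *input_files]
    return cmd


cmd = b2skim_run_single(
    "SinglePhotonDark",
    input_files=["input.mdst.root"],  # placeholder file name
    max_events=1000,
)
print(shlex.join(cmd))
```

The resulting list could be passed to subprocess.run in an environment where basf2 has been set up.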
b2skim-generate: Generate skim steering files#
Tip
This tool is for cases where other tools do not suffice (such as running on the grid, or adding additional modules to the path after adding a skim). If you just want to run a skim on KEKCC, consider using b2skim-run. If you want to test the performance of your skim, consider using the b2skim-stats tools.
Generate skim steering files.
This tool is for if you really need a steering file, and b2skim-run doesn’t cut it (such as if you are testing your skim on the grid).
usage: b2skim-generate [-h] [-o [OutputFilename]] [--data] [--no-stats]
[--skimmed-mdst-output] [--no-user-hints]
[--no-backward-compatibility]
[--local-module LOCAL_MODULE]
[--analysis-globaltag AnalysisGlobaltag]
[--udst-output-name] [--pid-globaltag PIDGlobaltag]
Skim|Module [Skim|Module ...]
Required Arguments
- Skim|Module
Possible choices: Random, SystematicsTracking, Resonance, SystematicsRadMuMu, SystematicsEELL, SystematicsRadEE, SystematicsLambda, SystematicsPhiGamma, SystematicsFourLeptonFromHLTFlag, SystematicsRadMuMuFromHLTFlag, SystematicsJpsi, SystematicsKshort, SystematicsBhabha, SystematicsCombinedHadronic, SystematicsCombinedLowMulti, SystematicsDstar, PRsemileptonicUntagged, LeptonicUntagged, SLUntagged, B0toDstarl_Kpi_Kpipi0_Kpipipi, BtoDl_and_ROE_e_or_mu_or_lowmult, feiHadronicB0, feiHadronicBplus, feiSLB0, feiSLB0_RDstar, feiSLBplus, feiHadronic, feiSL, BtoXgamma, BtoXll, BtoXll_LFV, B0TwoBody, FourLepton, RadiativeDilepton, TDCPV_ccs, TDCPV_qqs, TDCPV_dilepton, BtoD0h_Kspi0, BtoD0h_Kspipipi0, BtoDstarpipipi0_D0pi_Kpi, B0toDpi_Kpipi, B0toDpi_Kspi, B0toDstarPi_D0pi_Kpi, B0toDstarPi_D0pi_Kpipipi_Kpipi0, B0toDrho_Kpipi, B0toDrho_Kspi, B0toDstarRho_D0pi_Kpi, B0toDstarRho_D0pi_Kpipipi_Kpipi0, BptoD0etapi_Kpi, BptoD0pipi0_Kpi, BtoD0h_hh, BtoD0h_Kpi, B0toDstarpipi0_D0pi_Kpi, BtoD0h_Kpipipi_Kpipi0, B0toDstaretapi_D0pi_Kpi, BtoD0h_Kshh, BtoD0rho_Kpi, BtoD0rho_Kpipipi_Kpipi0, B0toDD_Kpipi_Kspi, B0toDstarD, B0toD0Kpipi0_pi0, B0toDs1D, B0toDDs0star, B0toDomegapi_Kpipi_pipipi0, B0toDomegapi_Kspi_pipipi0, BtoD0pi_Kpiomega_pipipi0, InclusiveLambda, BottomoniumEtabExclusive, BottomoniumUpsilon, InclusiveUpsilon, CharmoniumPsi, XToD0_D0ToHpJm, XToD0_D0ToNeutrals, DstToD0Pi_D0ToRare, XToDp_DpToKsHp, XToDp_DpToHpHmJp, LambdacTopHpJm, DstToD0Pi_D0ToGeneric, LambdacToSHpJm, XicpTopHpJm, XicToXimPipPim, Xic0ToLHpJm, XicpToLKsHp, DstToD0Pi_D0ToHpJm, DstToD0Pi_D0ToHpJmPi0, DstToD0Pi_D0ToKsOmega, DstToD0Pi_D0ToHpJmEta, DstToD0Pi_D0ToNeutrals, DstToD0Pi_D0ToHpJmKs, EarlyData_DstToD0Pi_D0ToHpJmPi0, EarlyData_DstToD0Pi_D0ToHpHmPi0, DstToDpPi0_DpToHpPi0, DpToHpPi0, DpToKsHp, DstToD0Pi_D0ToHpHmHpJm, DstToD0Pi_D0ToVGamma, DpToPipEpEm, DpToPipMupMum, DpToPipKpKm, DpToHpOmega, DspToHpOmega, DpToEtaHp, SinglePhotonDark, GammaGammaControlKLMDark, ALP3Gamma, EGammaControlDark, InelasticDarkMatter, 
RadBhabhaV0Control, TauLFV, DimuonPlusMissingEnergy, ElectronMuonPlusMissingEnergy, DielectronPlusMissingEnergy, LFVZpVisible, BtoKplusLLP, TauGeneric, TauThrust, TauKshort, TwoTrackLeptonsForLuminosity, LowMassTwoTrack, SingleTagPseudoScalar, InelasticDarkMatterWithDarkHiggs, AA2uuuu, DimuonPlusVisibleDarkHiggs, DielectronPlusVisibleDarkHiggs, LowMassOneTrack, BtoPi0Pi0, BtoPi0Eta, BtoHadTracks, BtoHad1Pi0, BtoHad3Tracks1Pi0, BtoRhopRhom, BtoEtapKstp, semileptonic, ewp, charm, btocharm, dark, leptonic, taupair, lowMulti, btocharmless, systematics, quarkonium, fei, tdcpv
Skim/s to produce a steering file for. If more than one skim is provided, then a combined steering file is produced. If a module name is passed, the combined steering file will contain all skims in that module.
Optional Arguments
- -o, --output-script-name
Location to output steering file. If flag not given, code is printed to screen. If flag is given with no arguments, writes to a file in the current directory using a default name.
Default: “”
- --data
Pass this flag if intending to run this skim on data, so that MC quantities are not saved in the output.
- --no-stats
If flag passed, print(b2.statistics) will not be included at the end of the steering file.
- --skimmed-mdst-output
If flag passed, save a single MDST containing events which pass at least one of the skims.
- --no-user-hints
If flag passed, the steering file will not include a comment explaining how to add modules to the path after building the skim lists.
- --no-backward-compatibility
If this flag is not passed, the steering file will include additional imports wrapped in a try-except block, in order to work with both release 5 and 6.
- --local-module
[EXPERT FLAG] Name of local module to import skim functions from. Script will fail if skims come from more than one module.
- --analysis-globaltag
Analysis globaltag to be passed to the skims.
Default: “”
- --udst-output-name
Name given to the output udst from skim script.
Default: “”
- --pid-globaltag
PID globaltag to be passed to the skims.
Default: “”
b2skim-generate-validation: Generate skim validation scripts#
Generate skim validation scripts.
usage: b2skim-generate-validation [-h] (--skims SKIM [SKIM ...] | --all)
[-o DIRECTORY] [--in-place]
[--add-analysis-globaltag]
Optional Arguments
- --skims
Possible choices: Random, SystematicsTracking, Resonance, SystematicsRadMuMu, SystematicsEELL, SystematicsRadEE, SystematicsLambda, SystematicsPhiGamma, SystematicsFourLeptonFromHLTFlag, SystematicsRadMuMuFromHLTFlag, SystematicsJpsi, SystematicsKshort, SystematicsBhabha, SystematicsCombinedHadronic, SystematicsCombinedLowMulti, SystematicsDstar, PRsemileptonicUntagged, LeptonicUntagged, SLUntagged, B0toDstarl_Kpi_Kpipi0_Kpipipi, BtoDl_and_ROE_e_or_mu_or_lowmult, feiHadronicB0, feiHadronicBplus, feiSLB0, feiSLB0_RDstar, feiSLBplus, feiHadronic, feiSL, BtoXgamma, BtoXll, BtoXll_LFV, B0TwoBody, FourLepton, RadiativeDilepton, TDCPV_ccs, TDCPV_qqs, TDCPV_dilepton, BtoD0h_Kspi0, BtoD0h_Kspipipi0, BtoDstarpipipi0_D0pi_Kpi, B0toDpi_Kpipi, B0toDpi_Kspi, B0toDstarPi_D0pi_Kpi, B0toDstarPi_D0pi_Kpipipi_Kpipi0, B0toDrho_Kpipi, B0toDrho_Kspi, B0toDstarRho_D0pi_Kpi, B0toDstarRho_D0pi_Kpipipi_Kpipi0, BptoD0etapi_Kpi, BptoD0pipi0_Kpi, BtoD0h_hh, BtoD0h_Kpi, B0toDstarpipi0_D0pi_Kpi, BtoD0h_Kpipipi_Kpipi0, B0toDstaretapi_D0pi_Kpi, BtoD0h_Kshh, BtoD0rho_Kpi, BtoD0rho_Kpipipi_Kpipi0, B0toDD_Kpipi_Kspi, B0toDstarD, B0toD0Kpipi0_pi0, B0toDs1D, B0toDDs0star, B0toDomegapi_Kpipi_pipipi0, B0toDomegapi_Kspi_pipipi0, BtoD0pi_Kpiomega_pipipi0, InclusiveLambda, BottomoniumEtabExclusive, BottomoniumUpsilon, InclusiveUpsilon, CharmoniumPsi, XToD0_D0ToHpJm, XToD0_D0ToNeutrals, DstToD0Pi_D0ToRare, XToDp_DpToKsHp, XToDp_DpToHpHmJp, LambdacTopHpJm, DstToD0Pi_D0ToGeneric, LambdacToSHpJm, XicpTopHpJm, XicToXimPipPim, Xic0ToLHpJm, XicpToLKsHp, DstToD0Pi_D0ToHpJm, DstToD0Pi_D0ToHpJmPi0, DstToD0Pi_D0ToKsOmega, DstToD0Pi_D0ToHpJmEta, DstToD0Pi_D0ToNeutrals, DstToD0Pi_D0ToHpJmKs, EarlyData_DstToD0Pi_D0ToHpJmPi0, EarlyData_DstToD0Pi_D0ToHpHmPi0, DstToDpPi0_DpToHpPi0, DpToHpPi0, DpToKsHp, DstToD0Pi_D0ToHpHmHpJm, DstToD0Pi_D0ToVGamma, DpToPipEpEm, DpToPipMupMum, DpToPipKpKm, DpToHpOmega, DspToHpOmega, DpToEtaHp, SinglePhotonDark, GammaGammaControlKLMDark, ALP3Gamma, EGammaControlDark, InelasticDarkMatter, 
RadBhabhaV0Control, TauLFV, DimuonPlusMissingEnergy, ElectronMuonPlusMissingEnergy, DielectronPlusMissingEnergy, LFVZpVisible, BtoKplusLLP, TauGeneric, TauThrust, TauKshort, TwoTrackLeptonsForLuminosity, LowMassTwoTrack, SingleTagPseudoScalar, InelasticDarkMatterWithDarkHiggs, AA2uuuu, DimuonPlusVisibleDarkHiggs, DielectronPlusVisibleDarkHiggs, LowMassOneTrack, BtoPi0Pi0, BtoPi0Eta, BtoHadTracks, BtoHad1Pi0, BtoHad3Tracks1Pi0, BtoRhopRhom, BtoEtapKstp
Skims to produce a validation file for.
- --all
Delete all existing validation scripts and reproduce validation scripts for all skims with a validation_histograms method defined. This option implies the --in-place flag.
- -o, --output-directory
Directory to output steering file. Defaults to current working directory.
- --in-place
Overwrite scripts in skim/validation/.
- --add-analysis-globaltag
If flag passed, the default analysis globaltag will be passed to the skim constructor.
20.4.4. Skim registry#
All skims must be registered and encoded by the relevant skim liaison. Registering a skim is as simple as adding it to the list in skim/scripts/skim/registry.py as an entry of the form (SkimCode, ParentModule, SkimName).
The skim numbering convention is defined on the Confluence skim page.
- skim.registry.Registry = <skim.registry.SkimRegistryClass object>#
An instance of skim.registry.SkimRegistryClass. Use this in your script to get information from the registry.
>>> from skim.registry import Registry
>>> Registry.encode_skim_name("SinglePhotonDark")
18020100
- class skim.registry.SkimRegistryClass[source]#
Class containing information on all official registered skims. This class also contains helper functions for getting information from the registry. For convenience, an instance of this class is provided: skim.registry.Registry.
The table below lists all registered skims and their skim codes:
Module        Skim name                         Skim code
btocharm      BtoD0h_Kspi0                      14120300
btocharm      BtoD0h_Kspipipi0                  14120400
btocharm      B0toDpi_Kpipi                     14120600
btocharm      B0toDpi_Kspi                      14120601
btocharm      B0toDstarPi_D0pi_Kpi              14120700
btocharm      B0toDstarPi_D0pi_Kpipipi_Kpipi0   14120800
btocharm      B0toDrho_Kpipi                    14121100
btocharm      B0toDrho_Kspi                     14121101
btocharm      B0toDstarRho_D0pi_Kpi             14121200
btocharm      B0toDstarRho_D0pi_Kpipipi_Kpipi0  14121201
btocharm      B0toD0Kpipi0_pi0                  14121300
btocharm      B0toDstarpipi0_D0pi_Kpi           14121400
btocharm      B0toDstaretapi_D0pi_Kpi           14121500
btocharm      BptoD0etapi_Kpi                   14121600
btocharm      BptoD0pipi0_Kpi                   14121601
btocharm      BtoD0h_hh                         14140100
btocharm      BtoD0h_Kpi                        14140101
btocharm      BtoD0h_Kpipipi_Kpipi0             14140102
btocharm      BtoD0h_Kshh                       14140200
btocharm      BtoD0rho_Kpi                      14141000
btocharm      BtoD0rho_Kpipipi_Kpipi0           14141001
btocharm      B0toDD_Kpipi_Kspi                 14141002
btocharm      B0toDstarD                        14141003
btocharm      B0toDomegapi_Kpipi_pipipi0        14141701
btocharm      B0toDomegapi_Kspi_pipipi0         14141702
btocharm      BtoD0pi_Kpiomega_pipipi0          14141703
btocharm      B0toDs1D                          14160200
btocharm      B0toDDs0star                      14160201
btocharm      BtoDstarpipipi0_D0pi_Kpi          14161400
btocharmless  BtoPi0Pi0                         19120100
btocharmless  BtoRhopRhom                       19120400
btocharmless  BtoHadTracks                      19130201
btocharmless  BtoHad1Pi0                        19130300
btocharmless  BtoHad3Tracks1Pi0                 19130310
btocharmless  BtoPi0Eta                         19130600
btocharmless  BtoEtapKstp                       19140500
charm         DpToPipEpEm                       17220100
charm         DpToPipMupMum                     17220200
charm         DpToPipKpKm                       17220300
charm         DpToKsHp                          17222100
charm         XToD0_D0ToHpJm                    17230100
charm         XToD0_D0ToNeutrals                17230200
charm         DstToD0Pi_D0ToRare                17230300
charm         XToDp_DpToKsHp                    17230400
charm         XToDp_DpToHpHmJp                  17230500
charm         LambdacTopHpJm                    17230600
charm         DstToD0Pi_D0ToGeneric             17230700
charm         LambdacToSHpJm                    17230900
charm         XicpTopHpJm                       17231000
charm         XicToXimPipPim                    17231100
charm         Xic0ToLHpJm                       17231200
charm         XicpToLKsHp                       17231300
charm         DpToHpPi0                         17232000
charm         DstToD0Pi_D0ToHpJm                17240100
charm         DstToD0Pi_D0ToHpJmPi0             17240200
charm         DstToD0Pi_D0ToKsOmega             17240400
charm         DstToD0Pi_D0ToHpJmEta             17240500
charm         DstToD0Pi_D0ToNeutrals            17240600
charm         DstToD0Pi_D0ToHpJmKs              17240700
charm         EarlyData_DstToD0Pi_D0ToHpJmPi0   17240800
charm         EarlyData_DstToD0Pi_D0ToHpHmPi0   17240900
charm         DstToDpPi0_DpToHpPi0              17241000
charm         DstToD0Pi_D0ToHpHmHpJm            17241100
charm         DstToD0Pi_D0ToVGamma              17241200
charm         DpToEtaHp                         17241300
charm         DpToHpOmega                       17260100
charm         DspToHpOmega                      17260200
dark          InelasticDarkMatter               18000000
dark          RadBhabhaV0Control                18000001
dark          SinglePhotonDark                  18020100
dark          GammaGammaControlKLMDark          18020200
dark          ALP3Gamma                         18020300
dark          EGammaControlDark                 18020400
dark          InelasticDarkMatterWithDarkHiggs  18020500
dark          DimuonPlusVisibleDarkHiggs        18020600
dark          DielectronPlusVisibleDarkHiggs    18020700
dark          BtoKplusLLP                       18130100
dark          AA2uuuu                           18370100
dark          DimuonPlusMissingEnergy           18520100
dark          ElectronMuonPlusMissingEnergy     18520200
dark          DielectronPlusMissingEnergy       18520300
dark          LFVZpVisible                      18520400
ewp           B0TwoBody                         12120400
ewp           FourLepton                        12120500
ewp           RadiativeDilepton                 12120600
ewp           BtoXgamma                         12160100
ewp           BtoXll                            12160200
ewp           BtoXll_LFV                        12160300
fei           feiHadronicB0                     11180100
fei           feiHadronicBplus                  11180200
fei           feiSLB0                           11180300
fei           feiSLB0_RDstar                    11180301
fei           feiSLBplus                        11180400
fei           feiHadronic                       11180500
fei           feiSL                             11180600
leptonic      LeptonicUntagged                  11130300
lowMulti      LowMassTwoTrack                   18520500
lowMulti      TwoTrackLeptonsForLuminosity      18530100
lowMulti      SingleTagPseudoScalar             18530200
lowMulti      LowMassOneTrack                   18530600
quarkonium    InclusiveLambda                   15410300
quarkonium    BottomoniumEtabExclusive          15420100
quarkonium    BottomoniumUpsilon                15440100
quarkonium    InclusiveUpsilon                  15460400
quarkonium    CharmoniumPsi                     16460200
semileptonic  PRsemileptonicUntagged            11110100
semileptonic  SLUntagged                        11160200
semileptonic  B0toDstarl_Kpi_Kpipi0_Kpipipi     11160201
semileptonic  BtoDl_and_ROE_e_or_mu_or_lowmult  11170100
systematics   Random                            10000000
systematics   SystematicsTracking               10600300
systematics   Resonance                         10600400
systematics   SystematicsRadMuMu                10600500
systematics   SystematicsEELL                   10600600
systematics   SystematicsRadEE                  10600700
systematics   SystematicsFourLeptonFromHLTFlag  10600800
systematics   SystematicsRadMuMuFromHLTFlag     10600900
systematics   SystematicsBhabha                 10601200
systematics   SystematicsCombinedHadronic       10601300
systematics   SystematicsCombinedLowMulti       10601400
systematics   SystematicsDstar                  10601500
systematics   SystematicsJpsi                   10611000
systematics   SystematicsKshort                 10611100
systematics   SystematicsLambda                 10620200
systematics   SystematicsPhiGamma               11640100
taupair       TauLFV                            18360100
taupair       TauGeneric                        18570600
taupair       TauThrust                         18570700
taupair       TauKshort                         18570800
tdcpv         TDCPV_dilepton                    13130300
tdcpv         TDCPV_ccs                         13160200
tdcpv         TDCPV_qqs                         13160300
- property codes#
A list of all registered skim codes.
- decode_skim_code(SkimCode)[source]#
Find the name of the skim which corresponds to the provided skim code.
This is useful to determine the skim script used to produce a specific uDST file, given the 8-digit code name of the file itself.
- Parameters
SkimCode (str) – 8 digit skim code assigned to some skim.
- Returns
Name of the corresponding skim as it appears in the skim registry.
- encode_skim_name(SkimName)[source]#
Find the 8 digit skim code assigned to the skim with the provided name.
- Parameters
SkimName (str) – Name of the corresponding skim as it appears in the skim registry.
- Returns
8 digit skim code assigned to the given skim.
- get_skim_function(SkimName)[source]#
Get the skim class constructor for the given skim.
This is achieved by importing the module listed alongside the skim name in the skim registry.
- Parameters
SkimName (str) – Name of the skim to be found.
- Returns
The class constructor for the given skim.
- get_skim_module(SkimName)[source]#
Retrieve the skim module name from the registry which contains the given skim.
- Parameters
SkimName (str) – Name of the skim as it appears in the skim registry.
- Returns
The name of the skim module which contains the skim.
- get_skims_in_module(SkimModule)[source]#
Retrieve a list of the skims listed in the registry as existing in the given skim module.
- Parameters
SkimModule (str) – The name of the module, e.g. btocharmless (not skim.btocharmless or btocharmless.py).
- Returns
The skims listed in the registry as belonging to SkimModule.
- property modules#
A list of all registered skim modules.
- property names#
A list of all registered skim names.
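The lookups documented above can be pictured with a small standalone sketch. It does not import the skim package: `TOY_REGISTRY` and the helper bodies are hypothetical stand-ins, and only the module, name and code triples are copied from the registry table above.

```python
# Toy stand-in for the skim registry: (module, skim name, 8-digit code).
# Entries are copied from the table above; the data layout is an assumption.
TOY_REGISTRY = [
    ("semileptonic", "PRsemileptonicUntagged", "11110100"),
    ("semileptonic", "SLUntagged", "11160200"),
    ("taupair", "TauLFV", "18360100"),
    ("tdcpv", "TDCPV_ccs", "13160200"),
]


def encode_skim_name(name):
    """Return the 8-digit code registered for a skim name."""
    for _, skim, code in TOY_REGISTRY:
        if skim == name:
            return code
    raise LookupError(f"Unrecognised skim name: {name}")


def decode_skim_code(code):
    """Return the skim name registered for an 8-digit code."""
    for _, skim, registered_code in TOY_REGISTRY:
        if registered_code == code:
            return skim
    raise LookupError(f"Unrecognised skim code: {code}")


def get_skims_in_module(module):
    """Return the skims listed under a module name such as 'semileptonic'."""
    return [skim for mod, skim, _ in TOY_REGISTRY if mod == module]


print(decode_skim_code("11160200"))
print(get_skims_in_module("semileptonic"))
```

In the real package these are methods of the registry object rather than free functions, and the error type raised for an unknown name or code may differ.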
20.4.5. Testing skim performance#
When skims are developed, it is important to test the performance of the skim on data and on a range of background MC samples. Two command-line tools are provided in the skim package to aid in this: b2skim-stats-submit and b2skim-stats-print. They are available in the PATH after setting up the basf2 environment by calling b2setup. The former submits a series of test jobs for a skim on data and MC samples, and the latter uses the output files of the jobs to calculate performance statistics for each sample, including retention rate, CPU time, and uDST size per event. b2skim-stats-print also provides estimates for combined MC samples, where the statistics are weighted by the cross-section of each background process.
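As a rough illustration of the cross-section weighting, the sketch below computes a weighted mean of a per-sample statistic. It assumes a plain weighted average, which may not match the exact calculation in b2skim-stats-print, and the sample labels and numbers are placeholders rather than real cross-sections or retention rates.

```python
def weighted_average(values, cross_sections):
    """Average per-sample statistics, weighting each by its cross-section."""
    total = sum(cross_sections[label] for label in values)
    weighted_sum = sum(values[label] * cross_sections[label] for label in values)
    return weighted_sum / total


# Placeholder numbers for illustration only (not real cross-sections or rates).
cross_sections = {"sampleA": 1.0, "sampleB": 3.0}
retention = {"sampleA": 0.10, "sampleB": 0.02}

combined = weighted_average(retention, cross_sections)
print(f"{combined:.3f}")
```

A sample with a large cross-section dominates the combined estimate, which is why a skim with low retention on the most abundant background can still have a low combined retention.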
First run b2skim-stats-submit, which will submit small skim jobs on test files of MC and data using bsub. For example,
b2skim-stats-submit -s LeptonicUntagged SLUntagged
Monitor your jobs with bjobs -u USERNAME. Once all of the submitted jobs have completed successfully, run b2skim-stats-print.
b2skim-stats-print -s LeptonicUntagged SLUntagged
This will read the output files of the test jobs, and produce tables of statistics in a variety of outputs.
- By default, a subset of the statistics is printed to the screen.
- If the -M flag is provided, a Markdown table will be written to SkimStats.md. This table is in a format that can be copied into the comment fields of pull requests (where BitBucket will format the table nicely for you). Use this flag when asked to produce a table of stats in a pull request.
- If the -C flag is provided, a text file SkimStats.txt is written, in which the statistics are formatted as Confluence wiki markup tables. These tables can be copied directly onto a Confluence page by editing the page, selecting Insert more content from the toolbar, selecting Markup from the drop-down menu, and then pasting the content of the text file into the markup editor which appears. Confluence will then format the tables and headings. The markup editor can also be accessed via ctrl-shift-D (cmd-shift-D).
- If the -J flag is provided, then all statistics produced are printed to a JSON file SkimStats.json, indexed by skim, statistic, and sample label. This file contains extra metadata about when and how the tests were run. This file is to be used by grid production tools.
Tip
To test your own newly-developed skim, make sure you have followed all the instructions in Writing a skim, particularly the instructions regarding the skim registry.
b2skim-stats-submit: Run skim scripts on test samples#
Note
Please run these skim tests on KEKCC, so that the estimates for CPU time are directly comparable to one another.
Submits test jobs for a given set of skims, and saves the output in a format to be read by b2skim-stats-print. One or more standalone or combined skim names must be provided.
usage: b2skim-stats-submit [-h]
(-s skim [skim ...] | -c YAMLFile [CombinedSkim ...])
[--custom-samples Filename [Filename ...] |
--sample-yaml Filename]
[--analysis-globaltag AnalysisGlobaltag]
[--pid-globaltag PIDGlobaltag]
[-n nEventsPerSample] [--dry-run]
[--mc-only | --data-only | --custom-only]
Optional Arguments
- -s, --single
Possible choices: all, Random, SystematicsTracking, Resonance, SystematicsRadMuMu, SystematicsEELL, SystematicsRadEE, SystematicsLambda, SystematicsPhiGamma, SystematicsFourLeptonFromHLTFlag, SystematicsRadMuMuFromHLTFlag, SystematicsJpsi, SystematicsKshort, SystematicsBhabha, SystematicsCombinedHadronic, SystematicsCombinedLowMulti, SystematicsDstar, PRsemileptonicUntagged, LeptonicUntagged, SLUntagged, B0toDstarl_Kpi_Kpipi0_Kpipipi, BtoDl_and_ROE_e_or_mu_or_lowmult, feiHadronicB0, feiHadronicBplus, feiSLB0, feiSLB0_RDstar, feiSLBplus, feiHadronic, feiSL, BtoXgamma, BtoXll, BtoXll_LFV, B0TwoBody, FourLepton, RadiativeDilepton, TDCPV_ccs, TDCPV_qqs, TDCPV_dilepton, BtoD0h_Kspi0, BtoD0h_Kspipipi0, BtoDstarpipipi0_D0pi_Kpi, B0toDpi_Kpipi, B0toDpi_Kspi, B0toDstarPi_D0pi_Kpi, B0toDstarPi_D0pi_Kpipipi_Kpipi0, B0toDrho_Kpipi, B0toDrho_Kspi, B0toDstarRho_D0pi_Kpi, B0toDstarRho_D0pi_Kpipipi_Kpipi0, BptoD0etapi_Kpi, BptoD0pipi0_Kpi, BtoD0h_hh, BtoD0h_Kpi, B0toDstarpipi0_D0pi_Kpi, BtoD0h_Kpipipi_Kpipi0, B0toDstaretapi_D0pi_Kpi, BtoD0h_Kshh, BtoD0rho_Kpi, BtoD0rho_Kpipipi_Kpipi0, B0toDD_Kpipi_Kspi, B0toDstarD, B0toD0Kpipi0_pi0, B0toDs1D, B0toDDs0star, B0toDomegapi_Kpipi_pipipi0, B0toDomegapi_Kspi_pipipi0, BtoD0pi_Kpiomega_pipipi0, InclusiveLambda, BottomoniumEtabExclusive, BottomoniumUpsilon, InclusiveUpsilon, CharmoniumPsi, XToD0_D0ToHpJm, XToD0_D0ToNeutrals, DstToD0Pi_D0ToRare, XToDp_DpToKsHp, XToDp_DpToHpHmJp, LambdacTopHpJm, DstToD0Pi_D0ToGeneric, LambdacToSHpJm, XicpTopHpJm, XicToXimPipPim, Xic0ToLHpJm, XicpToLKsHp, DstToD0Pi_D0ToHpJm, DstToD0Pi_D0ToHpJmPi0, DstToD0Pi_D0ToKsOmega, DstToD0Pi_D0ToHpJmEta, DstToD0Pi_D0ToNeutrals, DstToD0Pi_D0ToHpJmKs, EarlyData_DstToD0Pi_D0ToHpJmPi0, EarlyData_DstToD0Pi_D0ToHpHmPi0, DstToDpPi0_DpToHpPi0, DpToHpPi0, DpToKsHp, DstToD0Pi_D0ToHpHmHpJm, DstToD0Pi_D0ToVGamma, DpToPipEpEm, DpToPipMupMum, DpToPipKpKm, DpToHpOmega, DspToHpOmega, DpToEtaHp, SinglePhotonDark, GammaGammaControlKLMDark, ALP3Gamma, EGammaControlDark, InelasticDarkMatter, 
RadBhabhaV0Control, TauLFV, DimuonPlusMissingEnergy, ElectronMuonPlusMissingEnergy, DielectronPlusMissingEnergy, LFVZpVisible, BtoKplusLLP, TauGeneric, TauThrust, TauKshort, TwoTrackLeptonsForLuminosity, LowMassTwoTrack, SingleTagPseudoScalar, InelasticDarkMatterWithDarkHiggs, AA2uuuu, DimuonPlusVisibleDarkHiggs, DielectronPlusVisibleDarkHiggs, LowMassOneTrack, BtoPi0Pi0, BtoPi0Eta, BtoHadTracks, BtoHad1Pi0, BtoHad3Tracks1Pi0, BtoRhopRhom, BtoEtapKstp
List of individual skims to run.
Default: []
- -c, --combined
List of combined skims to run. This flag expects as its first argument the path to a YAML file defining the combined skims. All remaining arguments are the combined skims to test. The YAML file is simply a mapping of combined skim names to the individual skims comprising them. For example, feiSL: [feiSLB0, feiSLBplus].
- --custom-samples
Filenames of custom samples to test in addition to standard data and MC files.
- --sample-yaml
YAML file containing a list of samples to test on. File must conform to the schema defined in skim/tools/resources/test_samples_schema.json (see examples in /group/belle2/dataprod/MC/SkimTraining/SampleLists). If this argument is not passed, defaults to /group/belle2/dataprod/MC/SkimTraining/SampleLists/TestFiles.yaml.
- --analysis-globaltag
Analysis globaltag to be passed to the skims.
- --pid-globaltag
PID globaltag to be passed to the skims.
- -n
Number of events to run per sample. This input can be any positive number, but the actual number of events run is limited to the size of the test files (~200,000 for MC files and ~20,000 for data files).
Default: 10000
- --dry-run, --dry
Print the submission commands, but don’t run them.
- --mc-only
Test on only MC samples.
- --data-only
Test on only data samples.
- --custom-only
Test on only custom samples.
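When test submissions are scripted, the command line can be assembled programmatically. The following is a minimal sketch that only builds the argument list from options documented above (`-s`, `-n`, `--dry-run`, `--mc-only`); the helper name `build_submit_command` is hypothetical, and nothing is executed.

```python
import shlex


def build_submit_command(skims, n_events=None, dry_run=False, mc_only=False):
    """Assemble a b2skim-stats-submit argument list from documented options."""
    command = ["b2skim-stats-submit", "-s", *skims]
    if n_events is not None:
        command += ["-n", str(n_events)]
    if dry_run:
        command.append("--dry-run")
    if mc_only:
        command.append("--mc-only")
    return command


command = build_submit_command(
    ["LeptonicUntagged", "SLUntagged"], n_events=5000, dry_run=True
)
# shlex.join produces a shell-safe string for logging or copy-pasting.
print(shlex.join(command))
```

Passing the resulting list to `subprocess.run` on KEKCC would submit the jobs; with `--dry-run` the tool only prints the submission commands.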
b2skim-stats-print: Print tables of performance statistics#
Reads the output files of test skim jobs from b2skim-stats-submit and prints tables of performance statistics. One or more single or combined skim names must be provided.
usage: b2skim-stats-print [-h]
(-s skim [skim ...] | -c CombinedSkim [CombinedSkim ...])
[-C [OutputFilename] | -M [OutputFilename] | -J
[OutputFilename]] [--mconly | --dataonly] [-v]
Optional Arguments
- -s, --single
Possible choices: all, Random, SystematicsTracking, Resonance, SystematicsRadMuMu, SystematicsEELL, SystematicsRadEE, SystematicsLambda, SystematicsPhiGamma, SystematicsFourLeptonFromHLTFlag, SystematicsRadMuMuFromHLTFlag, SystematicsJpsi, SystematicsKshort, SystematicsBhabha, SystematicsCombinedHadronic, SystematicsCombinedLowMulti, SystematicsDstar, PRsemileptonicUntagged, LeptonicUntagged, SLUntagged, B0toDstarl_Kpi_Kpipi0_Kpipipi, BtoDl_and_ROE_e_or_mu_or_lowmult, feiHadronicB0, feiHadronicBplus, feiSLB0, feiSLB0_RDstar, feiSLBplus, feiHadronic, feiSL, BtoXgamma, BtoXll, BtoXll_LFV, B0TwoBody, FourLepton, RadiativeDilepton, TDCPV_ccs, TDCPV_qqs, TDCPV_dilepton, BtoD0h_Kspi0, BtoD0h_Kspipipi0, BtoDstarpipipi0_D0pi_Kpi, B0toDpi_Kpipi, B0toDpi_Kspi, B0toDstarPi_D0pi_Kpi, B0toDstarPi_D0pi_Kpipipi_Kpipi0, B0toDrho_Kpipi, B0toDrho_Kspi, B0toDstarRho_D0pi_Kpi, B0toDstarRho_D0pi_Kpipipi_Kpipi0, BptoD0etapi_Kpi, BptoD0pipi0_Kpi, BtoD0h_hh, BtoD0h_Kpi, B0toDstarpipi0_D0pi_Kpi, BtoD0h_Kpipipi_Kpipi0, B0toDstaretapi_D0pi_Kpi, BtoD0h_Kshh, BtoD0rho_Kpi, BtoD0rho_Kpipipi_Kpipi0, B0toDD_Kpipi_Kspi, B0toDstarD, B0toD0Kpipi0_pi0, B0toDs1D, B0toDDs0star, B0toDomegapi_Kpipi_pipipi0, B0toDomegapi_Kspi_pipipi0, BtoD0pi_Kpiomega_pipipi0, InclusiveLambda, BottomoniumEtabExclusive, BottomoniumUpsilon, InclusiveUpsilon, CharmoniumPsi, XToD0_D0ToHpJm, XToD0_D0ToNeutrals, DstToD0Pi_D0ToRare, XToDp_DpToKsHp, XToDp_DpToHpHmJp, LambdacTopHpJm, DstToD0Pi_D0ToGeneric, LambdacToSHpJm, XicpTopHpJm, XicToXimPipPim, Xic0ToLHpJm, XicpToLKsHp, DstToD0Pi_D0ToHpJm, DstToD0Pi_D0ToHpJmPi0, DstToD0Pi_D0ToKsOmega, DstToD0Pi_D0ToHpJmEta, DstToD0Pi_D0ToNeutrals, DstToD0Pi_D0ToHpJmKs, EarlyData_DstToD0Pi_D0ToHpJmPi0, EarlyData_DstToD0Pi_D0ToHpHmPi0, DstToDpPi0_DpToHpPi0, DpToHpPi0, DpToKsHp, DstToD0Pi_D0ToHpHmHpJm, DstToD0Pi_D0ToVGamma, DpToPipEpEm, DpToPipMupMum, DpToPipKpKm, DpToHpOmega, DspToHpOmega, DpToEtaHp, SinglePhotonDark, GammaGammaControlKLMDark, ALP3Gamma, EGammaControlDark, InelasticDarkMatter, 
RadBhabhaV0Control, TauLFV, DimuonPlusMissingEnergy, ElectronMuonPlusMissingEnergy, DielectronPlusMissingEnergy, LFVZpVisible, BtoKplusLLP, TauGeneric, TauThrust, TauKshort, TwoTrackLeptonsForLuminosity, LowMassTwoTrack, SingleTagPseudoScalar, InelasticDarkMatterWithDarkHiggs, AA2uuuu, DimuonPlusVisibleDarkHiggs, DielectronPlusVisibleDarkHiggs, LowMassOneTrack, BtoPi0Pi0, BtoPi0Eta, BtoHadTracks, BtoHad1Pi0, BtoHad3Tracks1Pi0, BtoRhopRhom, BtoEtapKstp
List of individual skims to run.
Default: []
- -c, --combined
List of combined skims to run.
Default: []
- -C, --confluence
Save a wiki markup table to be copied to Confluence.
- -M, --markdown
Save a markdown table in a format that can be copied into pull request comments.
- -J, --json
Save the tables of statistics to a JSON file.
- --mconly
Test on only MC samples.
- --dataonly
Test on only data samples.
- -v, --verbose
Print out extra warning messages when the script cannot calculate a value, but moves on anyway.
Running b2skim-stats tools on custom samples#
By default, these tools will run over a standard list of samples defined in /group/belle2/dataprod/MC/SkimTraining/SampleLists/TestFiles.yaml. If you would like to run these tools over a set of custom samples (e.g. a signal MC sample) in addition to the standard list of files, then simply pass the filenames to b2skim-stats-submit:
b2skim-stats-submit -s SkimA SkimB SkimC --custom-samples /path/to/sample/*.root
# wait for jobs to finish...
b2skim-stats-print -s SkimA SkimB SkimC
Alternatively, you may create a YAML file specifying the list of samples you wish to run on:
Custom:
  - location: /path/to/sample/a.root
    label: "A nice human-readable label"
  - location: /path/to/sample/b.root
    label: "Another nice human-readable label"
  - location: /path/to/sample/c.root
    label: "Yet another nice human-readable label"
Then pass this YAML file to b2skim-stats-submit:
b2skim-stats-submit -s SkimA SkimB SkimC --sample-yaml MyCustomSamples.yaml
# wait for jobs to finish...
b2skim-stats-print -s SkimA SkimB SkimC
If you specify the samples by this second method, then only the samples listed in the YAML file will be tested on.
The JSON schema for the input YAML file is defined in skim/tools/resources/test_samples_schema.json.
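If the custom sample list is generated by a script, the YAML can be written with plain string formatting. The sketch below assumes only the `Custom` key and the `location` and `label` fields shown above; the helper name is hypothetical, and the output is not validated against the JSON schema.

```python
import json


def custom_sample_yaml(samples):
    """Render (location, label) pairs as a 'Custom' block of sample-list YAML."""
    lines = ["Custom:"]
    for location, label in samples:
        lines.append(f"  - location: {location}")
        # json.dumps yields a double-quoted string, which is also valid YAML.
        lines.append(f"    label: {json.dumps(label)}")
    return "\n".join(lines) + "\n"


text = custom_sample_yaml([("/path/to/sample/a.root", "Signal MC, mode A")])
print(text, end="")
```

Writing `text` to a file such as MyCustomSamples.yaml gives something that can be passed to `--sample-yaml`.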
20.4.6. Core skim package API#
The core classes of the skim package are defined in skim.core: BaseSkim and CombinedSkim.
BaseSkim is an abstract base class from which all skims inherit. It defines template functions for a skim, and includes attributes describing the skim metadata.
CombinedSkim is a class for combining BaseSkim objects into a single steering file.
- class skim.core.BaseSkim(*, OutputFileName=None, additionalDataDescription=None, udstOutput=True, validation=False, mc=True, analysisGlobaltag=None, pidGlobaltag=None)[source]#
Base class for skims. Initialises a skim name, and creates template functions required for each skim.
See Writing a skim for information on how to use this to define a new skim.
- ApplyHLTHadronCut = False#
If this property is set to True, then the HLT selection for hlt_hadron will be applied to the skim lists when the skim is added to the path.
- MergeDataStructures = {}#
Dict of str -> function pairs to determine if any special data structures should be merged when combining skims. Currently, this is only used to merge FEI config parameters when running multiple FEI skims at once, so that it can be run just once with all the necessary arguments.
- NoisyModules = None#
List of module types to be silenced. This may be necessary in certain skims in order to keep log file sizes small.
Tip
The elements of this list should be the module type, which is not necessarily the same as the module name. The module type can be inspected in Python via module.type().
See also
This attribute is used by BaseSkim.set_skim_logging.
- SkimLists = []#
List of particle lists reconstructed by the skim. This attribute should only be accessed after running the __call__ method.
- property TestFiles#
Location of test MDST sample. To modify this, set the property BaseSkim.TestSampleProcess, and this function will find an appropriate test sample from the list in /group/belle2/dataprod/MC/SkimTraining/SampleLists/TestFiles.yaml. If no sample can be found, an empty list is returned.
- TestSampleProcess = 'mixed'#
MC process of test file. BaseSkim.TestFiles passes this property to skim.utils.testfiles.get_test_file to retrieve an appropriate file location. Defaults to a \(B^{0}\overline{B^{0}}\) sample.
- additional_setup(path)[source]#
Perform any setup steps necessary before running the skim.
Warning
Standard particle lists should not be loaded in here. This should be done by overriding the method BaseSkim.load_standard_lists. This is crucial for avoiding loading lists twice when combining skims for production.
- Parameters
path (basf2.Path) – Skim path to be processed.
- analysisGlobaltag = None#
Analysis globaltag.
- apply_hlt_hadron_cut_if_required(path)[source]#
Apply the hlt_hadron selection if the property ApplyHLTHadronCut is True.
- Parameters
path (basf2.Path) – Skim path to be processed.
- abstract build_lists(path)[source]#
Create the skim lists to be saved in the output uDST. This function is where the main skim cuts should be applied. This function should return a list of particle list names.
- Parameters
path (basf2.Path) – Skim path to be processed.
Changed in version release-06-00-00: Previously, this function was expected to set the attribute BaseSkim.SkimLists. Now this is handled by BaseSkim, and this function is expected to return the list of particle list names.
- property code#
Eight-digit code assigned to this skim in the registry.
- property flag#
Event-level variable indicating whether an event passes the skim or not. To use the skim flag without writing uDST output, use the argument udstOutput=False when instantiating the skim class.
- initialise_skim_flag(path)[source]#
Add the module skim.utils.flags.InitialiseSkimFlag to the path, which initialises the flag for this skim to zero.
- load_standard_lists(path)[source]#
Load any standard lists. This code will be run before any BaseSkim.additional_setup and BaseSkim.build_lists.
Note
This is separated into its own function so that when skims are combined, any standard lists used by two skims can be loaded just once.
- Parameters
path (basf2.Path) – Skim path to be processed.
- mc = True#
Include Monte Carlo quantities in skim output.
- output_udst(path)[source]#
Write the skim particle lists to an output uDST and print a summary of the skim list statistics.
- Parameters
path (basf2.Path) – Skim path to be processed.
- pidGlobaltag = None#
PID globaltag.
- property postskim_path#
Return the skim path.
If BaseSkim.skim_event_cuts has been run, then the skim lists will only be created on a conditional path, so subsequent modules should be added to the conditional path.
If BaseSkim.skim_event_cuts has not been run, then the main analysis path is returned.
- produce_on_tau_samples = True#
If this property is set to False, then b2skim-prod will not produce data production requests for this skim on taupair MC samples. This decision may be made for one of two reasons:
- The retention rate of the skim on taupair samples is basically zero, so there is no point producing the skim for these samples.
- The retention rate of the skim on taupair samples is too high (>20%), so the production system may struggle to handle the jobs.
- produces_mdst_by_default = False#
Special property for combined systematics skims, which produce MDST output instead of uDST. This property is used by b2skim-prod to set the DataLevel parameter in the DataDescription block for this skim to mdst instead of udst.
- set_skim_logging()[source]#
Turns the log level to ERROR for selected modules to decrease the total size of the skim log files. Additional modules can be silenced by setting the attribute NoisyModules for an individual skim.
- Parameters
path (basf2.Path) – Skim path to be processed.
Warning
This method works by inspecting the modules added to the path, and setting the log level to ERROR. This method should be called after all skim-related modules are added to the path.
- skim_event_cuts(cut, *, path)[source]#
Apply event-level cuts in a skim-safe way.
- Parameters
cut (str) – Event-level cut to be applied.
path (basf2.Path) – Skim path to be processed.
- Returns
Path on which the rest of this skim should be processed. On this path, only events which passed the event-level cut will be processed further.
Tip
If running this function in BaseSkim.additional_setup or BaseSkim.build_lists, redefine the path to the path returned by BaseSkim.skim_event_cuts, e.g.
def build_lists(self, path):
    path = self.skim_event_cuts("nTracks>4", path=path)
    # rest of skim list building...
Note
The motivation for using this function over applyEventCuts is that applyEventCuts completely removes events from processing. If we combine multiple skims in a single steering file (which is done in production), and the first has a set of event-level cuts, then all the remaining skims will never even see those events.
Internally, this function creates a new path, which is only processed for events passing the event-level cut. To avoid issues around particles not being available on the main path (leading to noisy error logs), we need to add the rest of the skim to this path. So this new path is assigned to the attribute BaseSkim._ConditionalPath, and BaseSkim.__call__ will run all remaining methods on this path.
- update_skim_flag(path)[source]#
Add the module skim.utils.flags.UpdateSkimFlag to the path, which updates the flag for this skim.
Warning
If a conditional path has been created before this, then this function must run on the conditional path, since the skim lists are not guaranteed to exist for all events on the main path.
- validation_histograms(path)[source]#
Create validation histograms for the skim.
- Parameters
path (basf2.Path) – Skim path to be processed.
- validation_sample = None#
MDST sample to use for validation histograms. Must be a valid location of a validation dataset (see documentation for basf2.find_file).
- class skim.core.CombinedSkim(*skims, NoisyModules=None, additionalDataDescription=None, udstOutput=None, mdstOutput=False, mdst_kwargs=None, CombinedSkimName='CombinedSkim', OutputFileName=None, mc=None, analysisGlobaltag=None, pidGlobaltag=None)[source]#
Class for creating combined skims which can be run using similar-looking methods to BaseSkim objects.
A steering file which combines skims can be as simple as the following:
import basf2 as b2
import modularAnalysis as ma
from skim.core import CombinedSkim
from skim.WGs.foo import OneSkim, TwoSkim, RedSkim, BlueSkim

path = b2.Path()
ma.inputMdstList([], path=path)
skims = CombinedSkim(OneSkim(), TwoSkim(), RedSkim(), BlueSkim())
skims(path)  # load standard lists, create skim lists, and save to uDST
b2.process(path)
When skims are combined using this class, the BaseSkim.NoisyModules lists of each skim are combined and all silenced.
The heavy-lifting functions BaseSkim.additional_setup, BaseSkim.build_lists and BaseSkim.output_udst are modified to loop over the corresponding functions of each individual skim. The load_standard_lists method is also modified to load all required lists, without accidentally loading a list twice.
Calling an instance of the CombinedSkim class will load all the required particle lists, then run all the setup steps, then the list building functions, and then all the output steps.
- property TestFiles#
Location of test MDST sample. To modify this, set the property BaseSkim.TestSampleProcess, and this function will find an appropriate test sample from the list in /group/belle2/dataprod/MC/SkimTraining/SampleLists/TestFiles.yaml. If no sample can be found, an empty list is returned.
- additional_setup(path)[source]#
Run the BaseSkim.additional_setup function of each skim.
- Parameters
path (basf2.Path) – Skim path to be processed.
- apply_hlt_hadron_cut_if_required(path)[source]#
Run the BaseSkim.apply_hlt_hadron_cut_if_required function for each skim.
- Parameters
path (basf2.Path) – Skim path to be processed.
- build_lists(path)[source]#
Run the BaseSkim.build_lists function of each skim.
- Parameters
path (basf2.Path) – Skim path to be processed.
- property flag#
Event-level variable indicating whether an event passes the combined skim or not.
- property flags#
List of flags for each skim in combined skim.
- initialise_skim_flag(path)[source]#
Add the module skim.utils.flags.InitialiseSkimFlag to the path, to initialise flags for each skim.
- load_standard_lists(path)[source]#
Add all required standard list loading to the path.
Note
To avoid loading standard lists twice, this function creates dummy paths that are passed through load_standard_lists for each skim. These dummy paths are then inspected, and a list of unique module-parameter combinations is added to the main skim path.
- Parameters
path (basf2.Path) – Skim path to be processed.
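The de-duplication of module-parameter combinations described in the note can be sketched with plain data structures. Here each dummy path is modelled as a list of (module type, parameter dict) pairs; the module type `ListLoader` and its `list` parameter are invented for illustration, and the real implementation inspects basf2 module objects instead.

```python
def unique_modules(dummy_paths):
    """Collect unique (module type, parameters) pairs, preserving order."""
    seen = set()
    unique = []
    for dummy_path in dummy_paths:
        for module_type, params in dummy_path:
            # Dicts are not hashable, so use a sorted tuple of items as the key.
            key = (module_type, tuple(sorted(params.items())))
            if key not in seen:
                seen.add(key)
                unique.append((module_type, params))
    return unique


# Hypothetical module types and parameters, for illustration only.
skim_a = [("ListLoader", {"list": "pi+:all"}), ("ListLoader", {"list": "K+:all"})]
skim_b = [("ListLoader", {"list": "pi+:all"}), ("ListLoader", {"list": "gamma:all"})]

merged = unique_modules([skim_a, skim_b])
print([params["list"] for _, params in merged])
```

The list requested by both toy skims appears only once in the merged result, while the first-seen order is kept.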
- merge_data_structures()[source]#
Read the values of BaseSkim.MergeDataStructures and merge data structures accordingly.
For example, if MergeDataStructures has the value {"FEIChannelArgs": _merge_boolean_dicts.__func__}, then _merge_boolean_dicts is run on all input skims with the attribute FEIChannelArgs, and the value of FEIChannelArgs for that skim is set to the result.
In the FEI skims, this is used to merge configs which are passed to a cached function, thus allowing us to apply the FEI once with all the required particles available.
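A merge function of this kind might look like the sketch below. It assumes that merging boolean dicts means taking the logical OR per key, which is a guess at the behaviour of `_merge_boolean_dicts` rather than its actual implementation; `ToySkim` and the dictionary keys are likewise hypothetical.

```python
def merge_boolean_dicts(*dicts):
    """Merge dicts of booleans: a key is True if any input sets it to True."""
    merged = {}
    for d in dicts:
        for key, value in d.items():
            merged[key] = merged.get(key, False) or value
    return merged


class ToySkim:
    """Stand-in for a skim carrying an FEIChannelArgs attribute."""

    def __init__(self, channel_args):
        self.FEIChannelArgs = channel_args


skims = [
    ToySkim({"neutralB": True, "chargedB": False}),
    ToySkim({"neutralB": False, "chargedB": True}),
]

# Merge the attribute across all skims, then hand every skim the same result.
merged = merge_boolean_dicts(*(skim.FEIChannelArgs for skim in skims))
for skim in skims:
    skim.FEIChannelArgs = merged

print(merged)
```

Once every skim holds the same merged configuration, a cached function called with that configuration only needs to do its work once.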
- output_mdst_if_any_flag_passes(*, path, **kwargs)[source]#
Add MDST output to the path if the event passes any of the skim flags. EventExtraInfo is included in the MDST output so that the flags are available in the output.
The CombinedSkimName parameter in the CombinedSkim initialisation is used for the output filename if filename is not included in kwargs.
- Parameters
path (basf2.Path) – Skim path to be processed.
**kwargs – Passed on to mdst.add_mdst_output.
- output_udst(path)[source]#
Run the BaseSkim.output_udst function of each skim.
- Parameters
path (basf2.Path) – Skim path to be processed.
- property produce_on_tau_samples#
Corresponding value of this attribute for each individual skim.
A warning is issued if the individual skims in the combined skim contain a mix of True and False for this property.
- set_skim_logging()[source]#
Run BaseSkim.set_skim_logging for each skim.
- update_skim_flag(path)[source]#
Add the module skim.utils.flags.UpdateSkimFlag to the conditional path of each skim.
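The stage-by-stage ordering that CombinedSkim applies when it is called can be mimicked with toy classes. This sketch covers only the four stages named in the class description; `ToySkim`, `ToyCombinedSkim` and `run_stage` are hypothetical, and the real class performs additional steps such as flag handling and logging setup.

```python
STAGES = ("load_standard_lists", "additional_setup", "build_lists", "output_udst")


class ToySkim:
    """Stand-in for a skim; each stage just records that it ran."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def run_stage(self, stage, path):
        self.log.append((stage, self.name))


class ToyCombinedSkim:
    """Runs each stage for every skim before moving on to the next stage."""

    def __init__(self, *skims):
        self.skims = skims

    def __call__(self, path):
        for stage in STAGES:
            for skim in self.skims:
                skim.run_stage(stage, path)


log = []
combined = ToyCombinedSkim(ToySkim("A", log), ToySkim("B", log))
combined(path=None)  # no real basf2 path is needed for the toy
print(log)
```

The log shows that both skims finish loading standard lists before either runs its setup, and so on, rather than skim A running to completion before skim B starts.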
20.4.7. Utility functions for skim experts#
Skim flag implementation#
Modules required for calculating skim flags. Skim flags track whether an event passes a skim, without the need to directly remove those events from processing.
- class skim.utils.flags.InitialiseSkimFlag(*skims)[source]#
[Module for skim expert usage] Create the EventExtraInfo DataStore object, and set all required flag variables to zero.
Note
Add this module to the path before adding any skims, so that the skim flags are defined in the datastore for all events.
- class skim.utils.flags.UpdateSkimFlag(skim)[source]#
[Module for skim expert usage] Update the skim flag to be 1 if there is at least one candidate in any of the skim lists.
Note
Add this module to the post-skim path of each skim in the combined skim, as the skim lists are only guaranteed to exist on the conditional path (if a conditional path was used).
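The two-step flag logic can be modelled with a dictionary standing in for EventExtraInfo. This is a minimal sketch: the `passes_` key prefix and both function names are assumptions made for the example, not the names used by the real modules.

```python
def initialise_skim_flags(extra_info, skim_names):
    """Set the flag of every skim to zero so it exists for all events."""
    for name in skim_names:
        extra_info[f"passes_{name}"] = 0


def update_skim_flag(extra_info, skim_name, skim_lists):
    """Set the flag to 1 if any skim list holds at least one candidate."""
    if any(len(candidates) > 0 for candidates in skim_lists):
        extra_info[f"passes_{skim_name}"] = 1


extra_info = {}  # plays the role of EventExtraInfo for one event
initialise_skim_flags(extra_info, ["SkimA", "SkimB"])
update_skim_flag(extra_info, "SkimA", skim_lists=[["candidate"], []])
update_skim_flag(extra_info, "SkimB", skim_lists=[[], []])
print(extra_info)
```

Because the flags are initialised for every event first, an event that never reaches a skim's conditional path still carries a well-defined flag of zero.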
Per-cut retention checker#
Provides class for tracking retention rate of each cut in a skim.
- class skim.utils.retention.RetentionCheck(module_name='', module_number=0, particle_lists=None)[source]#
Check the retention rate and the number of candidates for a given set of particle lists.
The module stores its results in the static variable “summary”.
To monitor the effect of every module of an initial path, this module should be added after each module of the path. A function was written (skim.utils.retention.pathWithRetentionCheck) to do it:
>>> path = pathWithRetentionCheck(particle_lists, path)
After the path processing, the result of the RetentionCheck can be printed with
>>> RetentionCheck.print_results()
or plotted with (check the corresponding documentation)
>>> RetentionCheck.plot_retention(...)
and the summary dictionary can be accessed through
>>> RetentionCheck.summary
Authors:
Cyrille Praz, Slavomira Stefkova
- Parameters
- output_override = None#
- classmethod plot_retention(particle_list, plot_title='', save_as=None, module_name_max_length=80)[source]#
Plot the result of the RetentionCheck for a given particle list.
Example of use (to be put after process(path)):
>>> RetentionCheck.plot_retention('B+:semileptonic','skim:feiSLBplus','retention_plots/plot.pdf')
- Parameters
- summary = {}#
- skim.utils.retention.pathWithRetentionCheck(particle_lists, path)[source]#
Return a new path with the module RetentionCheck inserted between each module of a given path.
This allows for checking how the retention rate is modified by each module of the path.
Example of use (to be put just before process(path)):
>>> path = pathWithRetentionCheck(['B+:semileptonic'], path)
Warning: pathWithRetentionCheck(['B+:semileptonic'], path) does not modify path, it only returns a new one.
After the path processing, the result of the RetentionCheck can be printed with
>>> RetentionCheck.print_results()
or plotted with (check the corresponding documentation)
>>> RetentionCheck.plot_retention(...)
and the summary dictionary can be accessed through
>>> RetentionCheck.summary
- Parameters
particle_lists (list(str)) – list of particle list names which will be tracked by RetentionCheck
path (basf2.Path) – initial path (it is not modified, see warning above and example of use)
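The quantity tracked after each module is essentially the fraction of events that still have candidates in the tracked lists. The sketch below shows that arithmetic on made-up counts; the function name, the step names and the numbers are all hypothetical, and the layout of the real `RetentionCheck.summary` dictionary is not reproduced here.

```python
def retention_per_step(n_events_total, n_events_after):
    """Return (step name, fraction of events retained) pairs in path order."""
    return [(name, n / n_events_total) for name, n in n_events_after]


# Hypothetical counts of events with at least one candidate after each module.
steps = [("module_1", 400), ("module_2", 100)]
rates = retention_per_step(1000, steps)

for name, rate in rates:
    print(f"{name}: {rate:.1%}")
```

Comparing consecutive entries shows which module is responsible for the largest drop in retention.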
Skim samples framework#
Tip
This section is probably only of interest to you if you are developing the b2skim tools. The classes defined here are used internally by these tools to parse YAML files and handle sample metadata.
- class skim.utils.testfiles.CustomSample(*, location, label=None, **kwargs)[source]#
- property as_dict#
Sample serialised as a dictionary.
- property encodeable_name#
Identifying string which is safe to be included as a filename component or as a key in the skim stats JSON file.
As a rough naming convention, data samples should start with 'Data-', MC samples with 'MC-', and custom samples with 'Custom-'.
- property printable_name#
Human-readable name for displaying in printed tables.
- class skim.utils.testfiles.DataSample(*, location, processing, experiment, beam_energy='4S', general_skim='all', **kwargs)[source]#
- property as_dict#
Sample serialised as a dictionary.
- property encodeable_name#
Identifying string which is safe to be included as a filename component or as a key in the skim stats JSON file.
As a rough naming convention, data samples should start with 'Data-', MC samples with 'MC-', and custom samples with 'Custom-'.
- property printable_name#
Human-readable name for displaying in printed tables.
- class skim.utils.testfiles.MCSample(*, location, process, campaign, beam_energy='4S', beam_background='BGx1', **kwargs)[source]#
- property as_dict#
Sample serialised as a dictionary.
- property encodeable_name#
Identifying string which is safe to be included as a filename component or as a key in the skim stats JSON file.
As a rough naming convention, data samples should start with 'Data-', MC samples with 'MC-', and custom samples with 'Custom-'.
- property printable_name#
Human-readable name for displaying in printed tables.
- class skim.utils.testfiles.Sample(**kwargs)[source]#
Base class for skim test samples.
- property as_dict#
Sample serialised as a dictionary.
- property encodeable_name#
Identifying string which is safe to be included as a filename component or as a key in the skim stats JSON file.
As a rough naming convention, data samples should start with 'Data-', MC samples with 'MC-', and custom samples with 'Custom-'.
- location = NotImplemented#
Path of the test file.
- property printable_name#
Human-readable name for displaying in printed tables.
- static resolve_path(location)[source]#
Replace '${SampleDirectory}' with Sample.SampleDirectory, and resolve the path.
- Parameters
location (str, pathlib.Path) – Filename to be resolved.
- Returns
Resolved path.
- Return type
- class skim.utils.testfiles.TestSampleList(*, SampleYAML=None, SampleDict=None, SampleList=None)[source]#
Container class for lists of MC, data, and custom samples.
- DefaultSampleYAML = '/group/belle2/dataprod/MC/SkimTraining/SampleLists/TestFiles.yaml'#
- property SampleDict#
- query_data_samples(*, processing=None, experiment=None, beam_energy=None, general_skim=None, exact_match=False, inplace=False)[source]#
Find all data samples matching query.
- Parameters
processing (str) – Data processing campaign number to query.
beam_energy (str) – Beam energy to query.
general_skim (str) – GeneralSkimName to query.
exact_match (bool) – If passed, an error is raised if there is not exactly one matching sample. If there is exactly one matching sample, then the single sample is returned, rather than a list.
inplace (bool) – Replace data samples with the list obtained from query.
- query_mc_samples(*, process=None, campaign=None, beam_energy=None, beam_background=None, exact_match=False, inplace=False)[source]#
Find all MC samples matching query.
- Parameters
process (str) – Simulated MC process to query.
beam_energy (str) – Beam energy to query.
beam_background (str, int) – Nominal beam background to query.
exact_match (bool) – If passed, an error is raised if there is not exactly one matching sample. If there is exactly one matching sample, then the single sample is returned, rather than a list.
inplace (bool) – Replace MC samples with the list obtained from query.
- skim.utils.testfiles.get_test_file(process, *, SampleYAML=None)[source]#
Attempt to find a test sample of the given MC process.
- Parameters
process (str) – Physics process, e.g. mixed, charged, ccbar, eemumu.
SampleYAML (str, pathlib.Path) – Path to a YAML file containing sample specifications.
- Returns
Path to test sample file.
- Return type
- Raises
FileNotFoundError – Raised if no sample can be found.
Miscellaneous utility functions#
Miscellaneous utility functions for skim experts.
- skim.utils.misc.dry_run_steering_file(SteeringFile)[source]#
Check if the steering file at the given path can be run with the --dry-run option.
- skim.utils.misc.fancy_skim_header(SkimClass)[source]#
Decorator to generate a fancy header to skim documentation and prepend it to the docstring. Add this just above the definition of a skim.
Also ensures the documentation of the template functions like BaseSkim.build_lists is not repeated in every skim documentation.
@fancy_skim_header
class MySkimName(BaseSkim):
    # docstring here describing your skim, and explaining cuts.
- skim.utils.misc.get_eventN(filename)[source]#
Retrieve the number of events in a file using b2file-metadata-show.
- skim.utils.misc.get_file_metadata(filename)[source]#
Retrieve the metadata for a file using b2file-metadata-show.
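A minimal sketch of how a helper might wrap b2file-metadata-show is given below. It assumes the tool accepts a --json flag and reports the event count under the key nEvents; both are assumptions to check against your basf2 release. The function names and the injectable run argument are hypothetical, and exist only so the sketch can be exercised without a basf2 environment.

```python
import json
import subprocess


def read_file_metadata(filename, run=subprocess.run):
    """Return file metadata as a dict, parsed from b2file-metadata-show output."""
    # Assumption: the tool accepts --json and prints a JSON document to stdout.
    proc = run(
        ["b2file-metadata-show", "--json", str(filename)],
        stdout=subprocess.PIPE,
        check=True,
    )
    return json.loads(proc.stdout.decode("utf-8"))


def read_event_count(filename, run=subprocess.run):
    """Return the number of events recorded in the file metadata."""
    # Assumption: the event count is stored under the key "nEvents".
    return read_file_metadata(filename, run=run)["nEvents"]


def fake_run(command, **kwargs):
    """Stand-in for subprocess.run that returns canned metadata."""
    payload = json.dumps({"nEvents": 1500}).encode("utf-8")
    return subprocess.CompletedProcess(command, 0, stdout=payload)


event_count = read_event_count("sample.root", run=fake_run)
```

Inside a basf2 environment, dropping the run argument makes the helpers call the real command-line tool.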
- skim.utils.misc.resolve_skim_modules(SkimsOrModules, *, LocalModule=None)[source]#
Produce an ordered list of skims, by expanding any Python skim module names into a list of skims in that module. Also produce a dict of skims grouped by Python module.
- Raises
RuntimeError – Raised if a skim is listed twice.
ValueError – Raised if LocalModule is passed and skims are normally expected from more than one module.
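The expansion and grouping described for resolve_skim_modules can be illustrated with a standalone sketch. The registry dictionary and the function expand_skims are hypothetical stand-ins; the real function consults the skim registry and also handles the LocalModule option, which is omitted here.

```python
# Hypothetical registry mapping Python skim module names to the skims they define.
REGISTRY = {
    "ewp": ["BtoXgamma", "BtoXll", "BtoXll_LFV"],
    "taupair": ["TauLFV", "TauGeneric", "TauThrust"],
}


def expand_skims(skims_or_modules, registry=REGISTRY):
    """Expand module names into skim names and group the skims by module."""
    module_of = {skim: module for module, skims in registry.items() for skim in skims}

    # Keep the order given by the caller, expanding module names in place.
    ordered = []
    for name in skims_or_modules:
        ordered.extend(registry[name] if name in registry else [name])

    duplicates = sorted({skim for skim in ordered if ordered.count(skim) > 1})
    if duplicates:
        raise RuntimeError(f"Skim(s) listed twice: {', '.join(duplicates)}")

    # Unknown skim names raise KeyError in this sketch.
    grouped = {}
    for skim in ordered:
        grouped.setdefault(module_of[skim], []).append(skim)
    return ordered, grouped


ordered, grouped = expand_skims(["ewp", "TauLFV"])
```

Passing both a module and one of its skims, for example ["ewp", "BtoXll"], triggers the RuntimeError because the skim would appear twice after expansion.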
20.4.8. b2skim-prod: Produce grid production requests#
Note
This tool is intended for use by skim production managers, not by skim liaisons.
b2skim-prod is a tool for producing grid production requests in the format required by the production system, and also generating combined steering files.
YAML files are used by this tool to define the LPNs of datasets. Below are examples of valid YAML entries for data and MCri. The tool lpns2yaml.py is provided to create these YAML files from a list of LPNs.
## Example of a YAML file for data:
proc9_exp3r1:
  sampleLabel: proc9_exp3 # This label must match a skim sample in TestFiles.yaml
  LPNPrefix: /belle/Data
  inputReleaseNumber: release-03-02-02
  prodNumber: prod00008530
  inputDBGlobalTag: DB00000654
  procNumber: proc9
  experimentNumber: e0003
  beamEnergy: 4S
  inputDataLevel: mdst
  runNumbers:
    - r02724
    - r02801
    - r02802
proc9_exp3r2:
  sampleLabel: proc9_exp3
  LPNPrefix: /belle/Data
  inputReleaseNumber: release-03-02-02
  # prodNumber, inputDBGlobalTag, experimentNumber, and runNumbers can be integers
  prodNumber: 8530
  inputDBGlobalTag: 654
  procNumber: proc9
  experimentNumber: 3
  beamEnergy: 4S
  inputDataLevel: mdst
  runNumbers:
    - 3237
    - 3238
    - 3239
## Example of a YAML file for MCri:
MC12b_mixed:
  sampleLabel: MC12_mixedBGx1
  LPNPrefix: /belle/MC
  inputReleaseNumber: release-03-01-00
  inputDBGlobalTag: DB00000547
  mcCampaign: MC12b
  prodNumber: prod00007392
  experimentNumber: s00/e1003
  beamEnergy: 4S
  mcType: mixed
  mcBackground: BGx1
  inputDataLevel: mdst
  runNumbers: r00000
MC12b_charged:
  sampleLabel: MC12_chargedBGx1
  LPNPrefix: /belle/MC
  inputReleaseNumber: release-03-01-00
  # inputDBGlobalTag, prodNumber, and runNumber can be integers
  inputDBGlobalTag: 547
  mcCampaign: MC12b
  prodNumber:
    - 7799 # prodNumber can be a list
    - 7802
  experimentNumber: s00/e1003
  beamEnergy: 4S
  mcType: charged
  mcBackground: BGx1
  inputDataLevel: mdst
  runNumbers: 0
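The examples above show the same fields written either as prefixed strings or as bare integers. The sketch below shows one way the integer forms could be mapped onto the string forms. The prefixes and zero-padding widths are inferred from the example entries only, and the names FIELD_FORMATS and normalise_field are hypothetical; the tool itself may perform this conversion differently.

```python
# Prefix and zero-padded width of each field, inferred from the example entries.
FIELD_FORMATS = {
    "prodNumber": ("prod", 8),
    "inputDBGlobalTag": ("DB", 8),
    "experimentNumber": ("e", 4),
    "runNumbers": ("r", 5),
}


def normalise_field(key, value):
    """Convert an integer field value to its prefixed, zero-padded string form."""
    if isinstance(value, list):
        # Lists (e.g. several run numbers) are converted element by element.
        return [normalise_field(key, item) for item in value]
    if isinstance(value, int) and key in FIELD_FORMATS:
        prefix, width = FIELD_FORMATS[key]
        return f"{prefix}{value:0{width}d}"
    # Strings such as 's00/e1003' or 'prod00008530' are left untouched.
    return value


entry = {
    "prodNumber": 8530,
    "inputDBGlobalTag": 654,
    "experimentNumber": 3,
    "runNumbers": [3237, 3238, 3239],
}
normalised = {key: normalise_field(key, value) for key, value in entry.items()}
```

With these assumed formats, the integer entry proc9_exp3r2 would end up with values in the same style as proc9_exp3r1.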
To produce JSON files for a list of combined skims, pass this tool the YAML file and the names of the skims. The other required arguments include the skim campaign, intended release to be used, and base directory of the repository to output the JSON files in.
This tool is designed to work with the SkimStats.json output of b2skim-stats-print (see Testing skim performance). The YAML files can be used to specify which sample statistics are to be used for each dataset, with the keyword sampleLabel; this must match one of the sample labels used by the skim statistics tools. SkimStats.json must be present in the current directory when this tool is run.
Production requests cannot be produced without resource usage estimates, so the pipeline for producing a production request and combined steering file is as follows:
Put together a YAML file defining which single skims comprise each combined skim. For example,
# contents of CombinedSkims.yaml
EWP:
  - BtoXll
  - BtoXll_LFV
  - BtoXgamma
Tau:
  - TauLFV
  - TauGeneric
  - TauThrust
Pass this combined skim definition to b2skim-stats-submit, and produce JSON output of b2skim-stats-print:
$ b2skim-stats-submit -c CombinedSkims.yaml EWP Tau
# wait for LSF jobs to complete...
$ b2skim-stats-print -c EWP Tau -J
The output SkimStats.json can then be passed to b2skim-prod to produce production JSON files for the EWP and Tau combined skims. The tool will also construct steering files for the specified combined skims.
usage: b2skim-prod [-h] -s CombinedSkim [CombinedSkim ...] -o
OUTPUT_BASE_DIRECTORY -c CAMPAIGN -r RELEASE
[--DBGlobalTagRelease DBGLOBALTAGRELEASE]
[-l LOCAL_SKIM_SCRIPT]
[--analysis-globaltag AnalysisGlobaltag]
[--pid-globaltag PIDGlobaltag] [-N LPNS_PER_JSON]
[-rdcamp MCRD_CAMPAIGN] [-b STARTINGBATCHNUMBER]
(--mcri | --mcrd | --data)
sampleRegistryYaml SkimStatsJson
Required Arguments
- sampleRegistryYaml
YAML file defining the samples to produce JSON files for.
- SkimStatsJson
The JSON output file of b2skim-stats-print.
Optional Arguments
- -s, --skims
List of skims to produce request files for.
- -o, --output-base-directory
Base directory for output. This should be the base directory of the B2P/MC or B2P/data repo.
- -c, --campaign
Name of the campaign, e.g. SKIMDATAx1.
- -r, --release
The basf2 release to be used, e.g. release-04-00-03.
- --DBGlobalTagRelease
Release number for which to use the associated DB global tag. If provided, the script will try to find the global tag and raise an error if none is found. Otherwise, a range of global tags will be searched for, and the user will be prompted to select the one they want.
- -l, --local-skim-script
File name of the local skim script to use, if any, e.g. ewp_local.py. Should not include any path before the file name.
- --analysis-globaltag
Analysis globaltag to be passed to the skims.
Default: “”
- --pid-globaltag
PID globaltag to be passed to the skims.
Default: “”
- -N, --lpns-per-json
Restrict number of LPNs in each JSON file to given number.
- -rdcamp, --mcrd-campaign
Give the corresponding campaign for mcrd for the JSON file output name, e.g. b16, b20, p12exp7.
- -b, --starting-batch-number
Starting number to count from for batch label appended to prod names.
Default: 1
- --mcri, --MCri
Produce JSON files for run-independent MC.
- --mcrd, --MCrd
Produce JSON files for run-dependent MC.
- --data, --Data
Produce JSON files for data.
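The interaction between -N and -b can be pictured as splitting the list of LPNs into numbered batches. The following is only a sketch of that idea: the function batch_lpns, the placeholder LPN strings, and the behaviour when no limit is given (a single batch) are assumptions, not a description of how b2skim-prod labels its output.

```python
def batch_lpns(lpns, lpns_per_json=None, starting_batch_number=1):
    """Split LPNs into numbered batches of at most lpns_per_json entries."""
    lpns = list(lpns)
    # Assumption: without a limit, all LPNs go into one batch.
    size = lpns_per_json or max(len(lpns), 1)
    chunks = [lpns[start:start + size] for start in range(0, len(lpns), size)]
    return {
        starting_batch_number + offset: chunk
        for offset, chunk in enumerate(chunks)
    }


# Placeholder strings standing in for real LPNs.
example_lpns = ["lpn_a", "lpn_b", "lpn_c", "lpn_d", "lpn_e"]
batches = batch_lpns(example_lpns, lpns_per_json=2, starting_batch_number=3)
```

Here five LPNs with a limit of two per file give three batches, numbered from the requested starting value.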
Example usage
Produce requests for EWP and feiSLCombined skims on proc9:
$ b2skim-prod Registry_proc9.yaml SkimStats.json -s EWP feiSLCombined --data -c SKIMDATAx1 -r release-04-01-01 -o B2P/data/
Produce requests for EWP on MC13, with one LPN per JSON file:
$ b2skim-prod Registry_MC13.yaml SkimStats.json -N 1 -s EWP --data -c SKIMDATAx1 -r release-04-01-01 -o B2P/data/
Produce requests for EWP on MC13, using local skim module script:
$ b2skim-prod Registry_MC13.yaml SkimStats.json -N 1 -l ewp_local.py -s EWP --data -c SKIMDATAx1 -r release-04-01-01 -o B2P/data/
20.4.9. b2skim-stats-total: Produce summary statistics for skim package#
Tool for producing summary statistics of skims for the distributed computing group. The expected input is the JSON output of b2skim-stats-print.
The values produced by the script are aggregated over all skims (either averaged or summed, depending on the flag passed by the user). For MC samples, this aggregate is printed per sample. For data, the values are first aggregated over all skims for each test data file; those per-file values are then averaged and assigned an uncertainty given by their standard deviation. This is done to account for the wide variation in skim performance when tested on data samples from different runs.
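The two-stage aggregation for data can be sketched as follows. The input layout (test file mapped to per-skim values), the function name aggregate_data_stats, and the numbers are hypothetical, and the sketch uses the sample standard deviation; whether the tool uses the sample or population form is not stated here.

```python
from statistics import mean, stdev


def aggregate_data_stats(stats, combine=sum):
    """Aggregate over skims per test file, then average across files.

    stats maps each test data file to a dict of {skim name: value}.
    combine is sum for totals or statistics.mean for averages.
    """
    per_file = [combine(list(per_skim.values())) for per_skim in stats.values()]
    # Assumption: sample standard deviation; needs at least two test files.
    return mean(per_file), stdev(per_file)


# Hypothetical per-skim values for two test data files.
example_stats = {
    "data_file_1": {"BtoXll": 1.0, "TauLFV": 2.0},
    "data_file_2": {"BtoXll": 2.0, "TauLFV": 3.0},
}

total, total_uncertainty = aggregate_data_stats(example_stats, combine=sum)
average, average_uncertainty = aggregate_data_stats(example_stats, combine=mean)
```

Aggregating within each file first means the quoted uncertainty reflects file-to-file (run-to-run) variation rather than differences between skims.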
usage: b2skim-stats-total [-h] [-j STATSJSON] [-C | -M]
(-r SKIM [SKIM ...] | -f REQUESTED_SKIMS_FILE)
(-a | -t | -b)
Optional Arguments
- -j, --stats-json
JSON file of stats produced by b2skim-stats-print.
- -C, --confluence
If passed, print tables in Confluence-friendly format.
- -M, --markdown
If passed, print tables in Markdown format.
- -r, --requested-skims
Possible choices: Random, SystematicsTracking, Resonance, SystematicsRadMuMu, SystematicsEELL, SystematicsRadEE, SystematicsLambda, SystematicsPhiGamma, SystematicsFourLeptonFromHLTFlag, SystematicsRadMuMuFromHLTFlag, SystematicsJpsi, SystematicsKshort, SystematicsBhabha, SystematicsCombinedHadronic, SystematicsCombinedLowMulti, SystematicsDstar, PRsemileptonicUntagged, LeptonicUntagged, SLUntagged, B0toDstarl_Kpi_Kpipi0_Kpipipi, BtoDl_and_ROE_e_or_mu_or_lowmult, feiHadronicB0, feiHadronicBplus, feiSLB0, feiSLB0_RDstar, feiSLBplus, feiHadronic, feiSL, BtoXgamma, BtoXll, BtoXll_LFV, B0TwoBody, FourLepton, RadiativeDilepton, TDCPV_ccs, TDCPV_qqs, TDCPV_dilepton, BtoD0h_Kspi0, BtoD0h_Kspipipi0, BtoDstarpipipi0_D0pi_Kpi, B0toDpi_Kpipi, B0toDpi_Kspi, B0toDstarPi_D0pi_Kpi, B0toDstarPi_D0pi_Kpipipi_Kpipi0, B0toDrho_Kpipi, B0toDrho_Kspi, B0toDstarRho_D0pi_Kpi, B0toDstarRho_D0pi_Kpipipi_Kpipi0, BptoD0etapi_Kpi, BptoD0pipi0_Kpi, BtoD0h_hh, BtoD0h_Kpi, B0toDstarpipi0_D0pi_Kpi, BtoD0h_Kpipipi_Kpipi0, B0toDstaretapi_D0pi_Kpi, BtoD0h_Kshh, BtoD0rho_Kpi, BtoD0rho_Kpipipi_Kpipi0, B0toDD_Kpipi_Kspi, B0toDstarD, B0toD0Kpipi0_pi0, B0toDs1D, B0toDDs0star, B0toDomegapi_Kpipi_pipipi0, B0toDomegapi_Kspi_pipipi0, BtoD0pi_Kpiomega_pipipi0, InclusiveLambda, BottomoniumEtabExclusive, BottomoniumUpsilon, InclusiveUpsilon, CharmoniumPsi, XToD0_D0ToHpJm, XToD0_D0ToNeutrals, DstToD0Pi_D0ToRare, XToDp_DpToKsHp, XToDp_DpToHpHmJp, LambdacTopHpJm, DstToD0Pi_D0ToGeneric, LambdacToSHpJm, XicpTopHpJm, XicToXimPipPim, Xic0ToLHpJm, XicpToLKsHp, DstToD0Pi_D0ToHpJm, DstToD0Pi_D0ToHpJmPi0, DstToD0Pi_D0ToKsOmega, DstToD0Pi_D0ToHpJmEta, DstToD0Pi_D0ToNeutrals, DstToD0Pi_D0ToHpJmKs, EarlyData_DstToD0Pi_D0ToHpJmPi0, EarlyData_DstToD0Pi_D0ToHpHmPi0, DstToDpPi0_DpToHpPi0, DpToHpPi0, DpToKsHp, DstToD0Pi_D0ToHpHmHpJm, DstToD0Pi_D0ToVGamma, DpToPipEpEm, DpToPipMupMum, DpToPipKpKm, DpToHpOmega, DspToHpOmega, DpToEtaHp, SinglePhotonDark, GammaGammaControlKLMDark, ALP3Gamma, EGammaControlDark, InelasticDarkMatter, 
RadBhabhaV0Control, TauLFV, DimuonPlusMissingEnergy, ElectronMuonPlusMissingEnergy, DielectronPlusMissingEnergy, LFVZpVisible, BtoKplusLLP, TauGeneric, TauThrust, TauKshort, TwoTrackLeptonsForLuminosity, LowMassTwoTrack, SingleTagPseudoScalar, InelasticDarkMatterWithDarkHiggs, AA2uuuu, DimuonPlusVisibleDarkHiggs, DielectronPlusVisibleDarkHiggs, LowMassOneTrack, BtoPi0Pi0, BtoPi0Eta, BtoHadTracks, BtoHad1Pi0, BtoHad3Tracks1Pi0, BtoRhopRhom, BtoEtapKstp
List of all skims that are requested for production. This list will be used to produce resource usage estimates based on the skims that are actually requested.
- -f, --requested-skims-file
File containing list of all skims that are requested for production. This list will be used to produce resource usage estimates based on the skims that are actually requested. The expected format of this file is one skim name per line, with each skim name exactly matching a name in the skim registry.
- -a, --average-tables
Print tables of averages.
- -t, --total-tables
Print tables of totals.
- -b, --both-tables
Print both tables of averages and tables of totals.
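The file format expected by -f (one skim name per line, each matching the registry exactly) can be checked before running the tool. The sketch below works on the file contents as text; the function read_requested_skims, the treatment of blank lines, and the small set of known names are assumptions for illustration.

```python
def read_requested_skims(text, known_skims):
    """Parse one skim name per line and check each against the known names."""
    # Assumption: blank lines and surrounding whitespace are ignored.
    requested = [line.strip() for line in text.splitlines() if line.strip()]
    unknown = [name for name in requested if name not in known_skims]
    if unknown:
        raise ValueError(f"Not in the skim registry: {', '.join(unknown)}")
    return requested


# A few names taken from the list of possible choices above.
known = {"BtoXgamma", "BtoXll", "BtoXll_LFV", "TauLFV", "TauGeneric", "TauThrust"}
file_contents = "BtoXll\nTauLFV\n\nTauThrust\n"

requested_skims = read_requested_skims(file_contents, known)
```

For a file on disk, pathlib.Path(filename).read_text() provides the text argument. Names must match exactly, so "btoxll" would be rejected.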
20.4.10. lpns2yaml.py: Convert lists of LPNs to format expected by b2skim-prod#
lpns2yaml.py is a tool for converting a list of LPNs into the YAML format expected by b2skim-prod. The expected input to lpns2yaml.py is a text file of LPNs, like those which can be downloaded from the dataset searcher.
The test sample labels (under the key sampleLabel) are automatically generated, so please check they all correspond to a label in skim/scripts/TestFiles.yaml after running the script.
usage: lpns2yaml.py [-h] [-o output_filename] (--data | --mcri | --mcrd)
[--bg {BGx0,BGx1}]
input_lpn_list_file
Required Arguments
- input_lpn_list_file
Input file containing list of LPNs (such as that from the dataset searcher).
Optional Arguments
- -o
Output YAML file name. If none given, prints output to screen.
- --data
Flag to indicate the LPNs are for data.
- --mcri
Flag to indicate the LPNs are for run-independent MC.
- --mcrd
Flag to indicate the LPNs are for run-dependent MC.
- --bg
Possible choices: BGx0, BGx1
Beam background level of MC samples. Only required for MC.
Example usage
Convert list of BGx1 MCri LPNs into YAML format and print to screen:
$ lpns2yaml.py my_MCri_LPNs.txt --mcri --bg BGx1
Convert list of BGx1 MCrd LPNs into YAML format and print to screen:
$ lpns2yaml.py my_MCrd_LPNs.txt --mcrd --bg BGx1
Convert list of data LPNs into YAML format and save to file:
$ lpns2yaml.py my_data_LPNs.txt --data -o my_data_LPNs.yaml