Belle II Software  release-08-01-10
ChargedPidMVAWeights Class Reference

Class to contain the payload of MVA weightfiles needed for charged particle identification. More...

#include <ChargedPidMVAWeights.h>


Public Types

enum class  ChargedPidMVATrainingMode : unsigned int {
  c_Classification = 0 ,
  c_Multiclass = 1 ,
  c_ECL_Classification = 2 ,
  c_ECL_Multiclass = 3 ,
  c_PSD_Classification = 4 ,
  c_PSD_Multiclass = 5 ,
  c_ECL_PSD_Classification = 6 ,
  c_ECL_PSD_Multiclass = 7
}
 A (strongly-typed) enumerator identifier for each valid MVA training mode. More...
 

Public Member Functions

 ChargedPidMVAWeights ()
 Default constructor, necessary for ROOT to stream the object.
 
 ChargedPidMVAWeights (const double &energyUnit, const double &angUnit, const std::string &thetaVarName="clusterTheta", bool implictNaNmasking=false)
 Specialized constructor.
 
 ~ChargedPidMVAWeights ()
 Destructor.
 
void setEnergyUnit (const double &unit)
 Set the energy unit to ensure consistency w/ the one used to define the bins grid.
 
void setAngularUnit (const double &unit)
 Set the angular unit to ensure consistency w/ the one used to define the bins grid.
 
void setWeightCategories (const double *clusterThetaBins, const int nClusterThetaBins, const double *pBins, const int nPBins, const double *chargeBins, const int nChargeBins)
 Set the 3D (clusterTheta, p, charge) grid representing the categories for which weightfiles are defined. More...
 
void storeMVAWeights (const int pdg, const std::vector< std::string > &filepaths, const std::vector< std::tuple< double, double, double >> &categoryBinCentres)
 Given a particle mass hypothesis' pdgId, store the list of MVA weight files (one for each category) into the payload. More...
 
void storeMVAWeightsMultiClass (const std::vector< std::string > &filepaths, const std::vector< std::tuple< double, double, double >> &categoryBinCentres)
 For the multi-class mode, store the list of MVA weight files (one for each category) into the payload. More...
 
void storeCuts (const int pdg, const std::vector< std::string > &cutfiles, const std::vector< std::tuple< double, double, double >> &categoryBinCentres)
 Given a particle mass hypothesis' pdgId, store the list of selection cuts (one for each category) into the payload. More...
 
void storeCutsMultiClass (const std::vector< std::string > &cutfiles, const std::vector< std::tuple< double, double, double >> &categoryBinCentres)
 For the multi-class mode, store the list of selection cuts (one for each category) into the payload. More...
 
void storeAliases (const VariablesByAlias &aliases)
 Store the map associating variable aliases to variable names known to VariableManager. More...
 
const TH3F * getWeightCategories () const
 Get the raw pointer to the 3D grid representing the categories for which weightfiles are defined. More...
 
const std::vector< std::string > * getMVAWeights (const int pdg) const
 Given a particle mass hypothesis' pdgId, get the list of (serialized) MVA weightfiles stored in the payload, one for each category. More...
 
const std::vector< std::string > * getMVAWeightsMulticlass () const
 For the multi-class mode, get the list of (serialized) MVA weightfiles stored in the payload, one for each category. More...
 
const std::vector< std::string > * getCuts (const int pdg) const
 Given a particle mass hypothesis' pdgId, get the list of selection cuts stored in the payload, one for each category. More...
 
const std::vector< std::string > * getCutsMulticlass () const
 For the multi-class mode, get the list of selection cuts stored in the payload, one for each category. More...
 
const VariablesByAlias getAliases () const
 Get the map of unique aliases.
 
unsigned int getMVAWeightIdx (const double &theta, const double &p, const double &charge, int &idx_theta, int &idx_p, int &idx_charge) const
 Get the index of the XML weight file, for a given reconstructed triplet (clusterTheta(theta), p, charge). More...
 
unsigned int getMVAWeightIdx (const double &theta, const double &p, const double &charge) const
 Overloaded method, to be used if not interested in knowing the 3D bin coordinates.
 
void dumpPayload (const double &theta, const double &p, const double &charge, const int pdg, bool dump_all=false) const
 Read and dump the payload content from the internal 'matrioska' maps into an XML weightfile for the given set of inputs. More...
 
void dumpPayloadMulticlass (const double &theta, const double &p, const double &charge) const
 Special version for multi-class mode. More...
 
bool isValidPdg (const int pdg) const
 Check if the input pdgId is that of a valid charged particle. More...
 
std::string getThetaVarName () const
 Get the name of the polar angle variable.
 
bool hasImplicitNaNmasking () const
 Check flag for implicit NaN masking.
 

Private Types

typedef std::unordered_map< int, std::vector< std::string > > WeightfilesByParticle
 Typedef.
 
typedef std::map< std::string, std::string > VariablesByAlias
 Typedef.
 

Private Member Functions

int findBin (const double &x, const double &y, const double &z) const
 Find global bin index of the 3D categories histogram for the given (x, y, z) values. More...
 
 ClassDef (ChargedPidMVAWeights, 10)
 2: add energy/angular units. More...
 

Private Attributes

TParameter< double > m_energy_unit
 The energy unit used for defining the bins grid.
 
TParameter< double > m_ang_unit
 The angular unit used for defining the bins grid.
 
std::string m_thetaVarName
 The name of the polar angle variable used in the MVA categorisation. More...
 
bool m_implicitNaNmasking
 Flag to indicate whether the MVA variables have been NaN-masked directly in the weightfiles.
 
std::unique_ptr< TH3F > m_categories
 A 3D histogram whose bins represent the categories for which XML weight files are defined. More...
 
WeightfilesByParticle m_weightfiles
 For each charged particle mass hypothesis' pdgId, this map contains a list of (serialized) Weightfile objects to be stored in the payload. More...
 
WeightfilesByParticle m_cuts
 For each charged particle mass hypothesis' pdgId, this map contains a list of selection cuts to be stored in the payload. More...
 
VariablesByAlias m_aliases
 A map that associates variable aliases used in the MVA training to variable names known to the VariableManager.
 

Detailed Description

Class to contain the payload of MVA weightfiles needed for charged particle identification.

Definition at line 38 of file ChargedPidMVAWeights.h.

Member Enumeration Documentation

◆ ChargedPidMVATrainingMode

enum ChargedPidMVATrainingMode : unsigned int
strong

A (strongly-typed) enumerator identifier for each valid MVA training mode.

Enumerator
c_Classification 

Binary classification.

c_Multiclass 

Multi-class classification.

c_ECL_Classification 

Binary classification, ECL only.

c_ECL_Multiclass 

Multi-class classification, ECL only.

c_PSD_Classification 

Binary classification, including PSD.

c_PSD_Multiclass 

Multi-class classification, including PSD.

c_ECL_PSD_Classification 

Binary classification, ECL only, including PSD.

c_ECL_PSD_Multiclass 

Multi-class classification, ECL only, including PSD.

Definition at line 77 of file ChargedPidMVAWeights.h.

enum class ChargedPidMVATrainingMode : unsigned int {
  c_Classification = 0,
  c_Multiclass = 1,
  c_ECL_Classification = 2,
  c_ECL_Multiclass = 3,
  c_PSD_Classification = 4,
  c_PSD_Multiclass = 5,
  c_ECL_PSD_Classification = 6,
  c_ECL_PSD_Multiclass = 7
};
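
The numeric values listed above happen to follow a bit pattern: bit 0 separates multi-class from binary modes, bit 1 marks the ECL-only modes, and bit 2 marks the modes that include PSD. The following is a minimal sketch of how that observation could be used. The enum is re-declared locally so the example is self-contained, and the helpers `toUInt`, `isMulticlass`, `isECLOnly` and `includesPSD` are hypothetical names that are not part of the class. Whether the bit layout is an intended design or a coincidence of the numbering is not stated in this page.

```cpp
// Local copy of the enumerator values listed above, for illustration only.
enum class ChargedPidMVATrainingMode : unsigned int {
  c_Classification = 0,
  c_Multiclass = 1,
  c_ECL_Classification = 2,
  c_ECL_Multiclass = 3,
  c_PSD_Classification = 4,
  c_PSD_Multiclass = 5,
  c_ECL_PSD_Classification = 6,
  c_ECL_PSD_Multiclass = 7
};

// Hypothetical helper: underlying integer value of a training mode.
constexpr unsigned int toUInt(ChargedPidMVATrainingMode mode)
{
  return static_cast<unsigned int>(mode);
}

// Hypothetical helper: bit 0 is set for every *_Multiclass enumerator.
constexpr bool isMulticlass(ChargedPidMVATrainingMode mode)
{
  return (toUInt(mode) & 1u) != 0;
}

// Hypothetical helper: bit 1 is set for every c_ECL_* enumerator.
constexpr bool isECLOnly(ChargedPidMVATrainingMode mode)
{
  return (toUInt(mode) & 2u) != 0;
}

// Hypothetical helper: bit 2 is set for every enumerator containing PSD.
constexpr bool includesPSD(ChargedPidMVATrainingMode mode)
{
  return (toUInt(mode) & 4u) != 0;
}
```

Because the helpers are `constexpr`, they can be checked at compile time with `static_assert`.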

Member Function Documentation

◆ ClassDef()

ClassDef ( ChargedPidMVAWeights  ,
10   
)
private

Class version history:

10. Add name of polar angle variable used for categorisation, and a boolean flag to check if implicit NaN masking is set in the input data.
9. Add map of variable aliases and original basf2 vars.
8. Use unique_ptr for m_categories.
7. Use double instead of float in tuple.
6. Introduce charge bin in the parametrisation.
5. Remove 2D grid dependence on pdgId, add multi-class support, define enum for valid training modes.
4. Add cuts map.
3. Add overloaded getMVAWeightIdx.
2. Add energy/angular units.
1. First class implementation.

◆ dumpPayload()

void dumpPayload ( const double &  theta,
const double &  p,
const double &  charge,
const int  pdg,
bool  dump_all = false 
) const
inline

Read and dump the payload content from the internal 'matrioska' maps into an XML weightfile for the given set of inputs.

Useful for debugging.

Parameters
theta: the particle polar angle (from the cluster, or from the track if no cluster match) in [rad].
p: the particle momentum (from the track) in [GeV/c].
charge: the particle charge (from the track).
pdg: the particle mass hypothesis' pdgId.
dump_all: dump all information.

Definition at line 396 of file ChargedPidMVAWeights.h.

◆ dumpPayloadMulticlass()

void dumpPayloadMulticlass ( const double &  theta,
const double &  p,
const double &  charge 
) const
inline

Special version for multi-class mode.

Uses the special value of pdg=0 reserved for multi-class mode.

Definition at line 443 of file ChargedPidMVAWeights.h.

◆ findBin()

int findBin ( const double &  x,
const double &  y,
const double &  z 
) const
inline private

Find global bin index of the 3D categories histogram for the given (x, y, z) values.

This method had to be re-implemented because ROOT has no const version of TH1::FindBin().

Parameters
x: value along the x axis.
y: value along the y axis.
z: value along the z axis.
Returns
the global linearised bin index.

Definition at line 488 of file ChargedPidMVAWeights.h.
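
As a rough illustration of what a const bin lookup involves, the sketch below finds the bin on each axis with a binary search and combines the three results into one global index. It follows ROOT's bin numbering convention (bin 0 is underflow, bins 1..N are visible, bin N+1 is overflow). The `Axis` struct and the functions `axisBin` and `globalBin` are hypothetical stand-ins written for this example; they are not the class's actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for one histogram axis: N+1 increasing bin edges.
struct Axis {
  std::vector<double> edges;
  int nBins() const { return static_cast<int>(edges.size()) - 1; }
};

// Bin along one axis: 0 = underflow, 1..N = visible bins, N+1 = overflow.
// Lower bin edges are inclusive, upper bin edges are exclusive.
int axisBin(const Axis& axis, double x)
{
  assert(axis.edges.size() >= 2);
  if (x < axis.edges.front()) return 0;
  if (x >= axis.edges.back()) return axis.nBins() + 1;
  // upper_bound returns the first edge strictly greater than x, so its
  // offset from the first edge is the 1-based visible bin number.
  auto it = std::upper_bound(axis.edges.begin(), axis.edges.end(), x);
  return static_cast<int>(it - axis.edges.begin());
}

// Global linearised bin index, counting the underflow and overflow bins
// of every axis (hence the "+ 2" on the number of visible bins).
int globalBin(const Axis& x, const Axis& y, const Axis& z,
              double vx, double vy, double vz)
{
  const int nx = x.nBins() + 2;
  const int ny = y.nBins() + 2;
  return axisBin(x, vx) + nx * (axisBin(y, vy) + ny * axisBin(z, vz));
}
```

Both functions take their inputs by const reference and modify nothing, which is the property the re-implementation is after.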

◆ getCuts()

const std::vector<std::string>* getCuts ( const int  pdg) const
inline

Given a particle mass hypothesis' pdgId, get the list of selection cuts stored in the payload, one for each category.

Parameters
pdg: the particle mass hypothesis' pdgId.

Definition at line 318 of file ChargedPidMVAWeights.h.

◆ getCutsMulticlass()

const std::vector<std::string>* getCutsMulticlass ( ) const
inline

For the multi-class mode, get the list of selection cuts stored in the payload, one for each category.

Uses the special value of pdg=0 reserved for multi-class mode.

Definition at line 329 of file ChargedPidMVAWeights.h.

◆ getMVAWeightIdx()

unsigned int getMVAWeightIdx ( const double &  theta,
const double &  p,
const double &  charge,
int &  idx_theta,
int &  idx_p,
int &  idx_charge 
) const
inline

Get the index of the XML weight file, for a given reconstructed triplet (clusterTheta(theta), p, charge).

The index is obtained by linearising the 3D m_categories histogram. The same index can be used to look up the correct MVAExpert, Dataset and Cut in the application module, hence we believe it's more useful to return the index rather than a pointer to the weightfile itself. The function also retrieves the 3D bin coordinates.

Parameters
theta: the particle polar angle (from the cluster, or from the track if no cluster match) in [rad].
p: the particle momentum (from the track) in [GeV/c].
charge: the particle charge (from the track).
[out] idx_theta: the index of the 3D bin along the theta (X) axis.
[out] idx_p: the index of the 3D bin along the p (Y) axis.
[out] idx_charge: the index of the 3D bin along the charge (Z) axis.
Returns
the index of the weightfile of interest from the array of weightfiles.

Definition at line 358 of file ChargedPidMVAWeights.h.
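
The idea of linearising the 3D grid into a single vector index can be sketched as follows. This is an illustration only: it assumes 1-based visible bin coordinates, assumes that theta varies fastest, then p, then charge, and uses the hypothetical function names `linearIndex` and `binCoordinates`. The ordering actually used by the class should be checked against the header.

```cpp
#include <cassert>

// Hypothetical: map 1-based (theta, p, charge) bin coordinates to a 0-based
// index into a vector holding one entry per category.
unsigned int linearIndex(int idxTheta, int idxP, int idxCharge,
                         int nTheta, int nP, int nCharge)
{
  assert(idxTheta >= 1 && idxTheta <= nTheta);
  assert(idxP >= 1 && idxP <= nP);
  assert(idxCharge >= 1 && idxCharge <= nCharge);
  // Subtract 1 from each coordinate so the first category gets index 0.
  return static_cast<unsigned int>(
           (idxTheta - 1) + nTheta * ((idxP - 1) + nP * (idxCharge - 1)));
}

// Hypothetical inverse: recover the 1-based bin coordinates from the index.
void binCoordinates(unsigned int idx, int nTheta, int nP,
                    int& idxTheta, int& idxP, int& idxCharge)
{
  const int i = static_cast<int>(idx);
  idxTheta = i % nTheta + 1;
  idxP = (i / nTheta) % nP + 1;
  idxCharge = i / (nTheta * nP) + 1;
}
```

Since the mapping is one-to-one, a single integer is enough to address any per-category container filled in the same order, which is the motivation given above for returning an index.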

◆ getMVAWeights()

const std::vector<std::string>* getMVAWeights ( const int  pdg) const
inline

Given a particle mass hypothesis' pdgId, get the list of (serialized) MVA weightfiles stored in the payload, one for each category.

Parameters
pdg: the particle mass hypothesis' pdgId.

Definition at line 295 of file ChargedPidMVAWeights.h.

◆ getMVAWeightsMulticlass()

const std::vector<std::string>* getMVAWeightsMulticlass ( ) const
inline

For the multi-class mode, get the list of (serialized) MVA weightfiles stored in the payload, one for each category.

Uses the special value of pdg=0 reserved for multi-class mode.

Definition at line 306 of file ChargedPidMVAWeights.h.

◆ getWeightCategories()

const TH3F* getWeightCategories ( ) const
inline

Get the raw pointer to the 3D grid representing the categories for which weightfiles are defined.

Used just to view the stored data.

Definition at line 284 of file ChargedPidMVAWeights.h.

◆ isValidPdg()

bool isValidPdg ( const int  pdg) const
inline

Check if the input pdgId is that of a valid charged particle.

An input value of pdg=0 is considered valid, since it's reserved for multi-class mode.

Definition at line 453 of file ChargedPidMVAWeights.h.
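
A minimal sketch of such a check is given below. It assumes that the valid hypotheses are the six charged stable particles that appear as keys in the m_weightfiles and m_cuts maps on this page (electron, muon, pion, kaon, proton, deuteron), plus the reserved value 0. The PDG codes are the standard ones; the function name `isValidPdgSketch` and the choice to accept only positive codes are assumptions made for this example.

```cpp
// Hypothetical stand-in for the validity check, using hard-coded PDG codes
// instead of the framework's set of charged stable particles.
constexpr bool isValidPdgSketch(int pdg)
{
  // Standard PDG codes: electron, muon, pion, kaon, proton, deuteron.
  constexpr int chargedStable[] = {11, 13, 211, 321, 2212, 1000010020};

  // pdg = 0 is reserved for multi-class mode and is treated as valid.
  if (pdg == 0) return true;

  for (int code : chargedStable) {
    if (code == pdg) return true;
  }
  return false;
}
```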

◆ setWeightCategories()

void setWeightCategories ( const double *  clusterThetaBins,
const int  nClusterThetaBins,
const double *  pBins,
const int  nPBins,
const double *  chargeBins,
const int  nChargeBins 
)
inline

Set the 3D (clusterTheta, p, charge) grid representing the categories for which weightfiles are defined.

Parameters
clusterThetaBins: array of clusterTheta bin edges
nClusterThetaBins: number of clusterTheta bins
pBins: array of p bin edges
nPBins: number of p bins
chargeBins: array of charge bin edges
nChargeBins: number of charge bins

Definition at line 117 of file ChargedPidMVAWeights.h.

◆ storeAliases()

void storeAliases ( const VariablesByAlias &  aliases)
inline

Store the map associating variable aliases to variable names known to VariableManager.

Parameters
aliases: a map of (alias, VM variable) pairs. NB: it is supposed to contain all the aliases for every category.

Definition at line 274 of file ChargedPidMVAWeights.h.

◆ storeCuts()

void storeCuts ( const int  pdg,
const std::vector< std::string > &  cutfiles,
const std::vector< std::tuple< double, double, double >> &  categoryBinCentres 
)
inline

Given a particle mass hypothesis' pdgId, store the list of selection cuts (one for each category) into the payload.

Parameters
pdg: the particle mass hypothesis' pdgId.
cutfiles: a list of text files w/ cut strings, for each (clusterTheta, p, charge) category. The format of the cut must comply with the GeneralCut syntax.
categoryBinCentres: a list of <double, double, double> tuples representing the (clusterTheta, p, charge) bin centres. Used to check consistency of the xml vector indexing w/ the linearised TH3 category map.

Definition at line 213 of file ChargedPidMVAWeights.h.

◆ storeCutsMultiClass()

void storeCutsMultiClass ( const std::vector< std::string > &  cutfiles,
const std::vector< std::tuple< double, double, double >> &  categoryBinCentres 
)
inline

For the multi-class mode, store the list of selection cuts (one for each category) into the payload.

Uses the special value of pdg=0 reserved for multi-class mode.

Parameters
cutfiles: a list of text files w/ cut strings, for each (clusterTheta, p, charge) category. The format of the cut must comply with the GeneralCut syntax.
categoryBinCentres: a list of <double, double, double> tuples representing the (clusterTheta, p, charge) bin centres. Used to check consistency of the xml vector indexing w/ the linearised TH3 category map.

Definition at line 262 of file ChargedPidMVAWeights.h.

◆ storeMVAWeights()

void storeMVAWeights ( const int  pdg,
const std::vector< std::string > &  filepaths,
const std::vector< std::tuple< double, double, double >> &  categoryBinCentres 
)
inline

Given a particle mass hypothesis' pdgId, store the list of MVA weight files (one for each category) into the payload.

Parameters
pdgthe particle mass hypothesis' pdgId.
filepaths: a list of xml (root) file paths for several (clusterTheta, p, charge) categories.
categoryBinCentres: a list of <double, double, double> tuples representing the (clusterTheta, p, charge) bin centres. Used to check consistency of the xml vector indexing w/ the linearised TH3 category map.

Definition at line 138 of file ChargedPidMVAWeights.h.
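
The consistency check mentioned for categoryBinCentres can be pictured like this: for every position i in the file list, look up the category index of the i-th bin centre and require that it equals i. The sketch below performs that check on a simplified grid described by plain vectors of bin edges. `Grid`, `bin0`, `categoryIndex` and `indexingIsConsistent` are illustrative names, and the theta-fastest ordering is an assumption, not something this page confirms.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the 3D category grid: bin edges per axis.
struct Grid {
  std::vector<double> thetaEdges, pEdges, chargeEdges;
};

// 0-based bin along one axis; the value must lie inside the axis range.
static int bin0(const std::vector<double>& edges, double v)
{
  assert(edges.size() >= 2 && v >= edges.front() && v < edges.back());
  return static_cast<int>(
           std::upper_bound(edges.begin(), edges.end(), v) - edges.begin()) - 1;
}

// 0-based category index, assuming theta varies fastest, then p, then charge.
unsigned int categoryIndex(const Grid& g, double theta, double p, double charge)
{
  const int nTheta = static_cast<int>(g.thetaEdges.size()) - 1;
  const int nP = static_cast<int>(g.pEdges.size()) - 1;
  return static_cast<unsigned int>(
           bin0(g.thetaEdges, theta) +
           nTheta * (bin0(g.pEdges, p) + nP * bin0(g.chargeEdges, charge)));
}

// True if the i-th bin centre falls into the i-th category for every i.
bool indexingIsConsistent(
  const Grid& g,
  const std::vector<std::tuple<double, double, double>>& centres)
{
  for (std::size_t i = 0; i < centres.size(); ++i) {
    const auto& c = centres[i];
    if (categoryIndex(g, std::get<0>(c), std::get<1>(c), std::get<2>(c)) != i) {
      return false;
    }
  }
  return true;
}
```

A check of this kind catches file lists supplied in a different order from the grid, which would otherwise pair a weightfile with the wrong category without any visible error.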

◆ storeMVAWeightsMultiClass()

void storeMVAWeightsMultiClass ( const std::vector< std::string > &  filepaths,
const std::vector< std::tuple< double, double, double >> &  categoryBinCentres 
)
inline

For the multi-class mode, store the list of MVA weight files (one for each category) into the payload.

Uses the special value of pdg=0 reserved for multi-class mode.

Parameters
filepaths: a list of xml (root) file paths for several (clusterTheta, p, charge) categories.
categoryBinCentres: a list of <double, double, double> tuples representing the (clusterTheta, p, charge) bin centres. Used to check consistency of the xml vector indexing w/ the linearised TH3 category map.

Definition at line 196 of file ChargedPidMVAWeights.h.

Member Data Documentation

◆ m_categories

std::unique_ptr<TH3F> m_categories
private

A 3D histogram whose bins represent the categories for which XML weight files are defined.

It is used to look up the correct file in the payload, given a reconstructed set of (clusterTheta(theta), p, charge).

Definition at line 533 of file ChargedPidMVAWeights.h.

◆ m_cuts

WeightfilesByParticle m_cuts
private
Initial value:
= {
{ 0, std::vector<std::string>() },
{ Const::electron.getPDGCode(), std::vector<std::string>() },
{ Const::muon.getPDGCode(), std::vector<std::string>() },
{ Const::pion.getPDGCode(), std::vector<std::string>() },
{ Const::kaon.getPDGCode(), std::vector<std::string>() },
{ Const::proton.getPDGCode(), std::vector<std::string>() },
{ Const::deuteron.getPDGCode(), std::vector<std::string>() }
}

For each charged particle mass hypothesis' pdgId, this map contains a list of selection cuts to be stored in the payload.

Each Weightfile (i.e., category) has a corresponding cut. The indexing in each vector must reflect that of the 'linearised' TH3F histogram m_categories.

The dummy pdgId=0 key is reserved for multi-class, where a unique signal hypothesis is not defined.

Definition at line 563 of file ChargedPidMVAWeights.h.

◆ m_thetaVarName

std::string m_thetaVarName
private

The name of the polar angle variable used in the MVA categorisation.

Must be a string that can be parsed by the VariableManager.

Definition at line 525 of file ChargedPidMVAWeights.h.

◆ m_weightfiles

WeightfilesByParticle m_weightfiles
private
Initial value:
= {
{ 0, std::vector<std::string>() },
{ Const::electron.getPDGCode(), std::vector<std::string>() },
{ Const::muon.getPDGCode(), std::vector<std::string>() },
{ Const::pion.getPDGCode(), std::vector<std::string>() },
{ Const::kaon.getPDGCode(), std::vector<std::string>() },
{ Const::proton.getPDGCode(), std::vector<std::string>() },
{ Const::deuteron.getPDGCode(), std::vector<std::string>() }
}

For each charged particle mass hypothesis' pdgId, this map contains a list of (serialized) Weightfile objects to be stored in the payload.

Each weightfile in the list corresponds to a 3D category. The indexing in each vector must reflect that of the 'linearised' TH3F histogram m_categories.

The dummy pdgId=0 key is reserved for multi-class, where a unique signal hypothesis is not defined.

Definition at line 544 of file ChargedPidMVAWeights.h.


The documentation for this class was generated from the following file: