Belle II Software development
ChargedPidMVAWeights Class Reference

Class to contain the payload of MVA weightfiles needed for charged particle identification.

#include <ChargedPidMVAWeights.h>

Public Types

enum class ChargedPidMVATrainingMode : unsigned int {
  c_Classification = 0,
  c_Multiclass = 1,
  c_ECL_Classification = 2,
  c_ECL_Multiclass = 3,
  c_PSD_Classification = 4,
  c_PSD_Multiclass = 5,
  c_ECL_PSD_Classification = 6,
  c_ECL_PSD_Multiclass = 7
}
 A (strongly-typed) enumerator identifier for each valid MVA training mode.
 

Public Member Functions

 ChargedPidMVAWeights ()
 Default constructor, necessary for ROOT to stream the object.
 
 ChargedPidMVAWeights (const double &energyUnit, const double &angUnit, const std::string &thetaVarName="clusterTheta", bool implictNaNmasking=false)
 Specialized constructor.
 
 ~ChargedPidMVAWeights ()
 Destructor.
 
void setEnergyUnit (const double &unit)
 Set the energy unit to ensure consistency w/ the one used to define the bins grid.
 
void setAngularUnit (const double &unit)
 Set the angular unit to ensure consistency w/ the one used to define the bins grid.
 
void setWeightCategories (const double *clusterThetaBins, const int nClusterThetaBins, const double *pBins, const int nPBins, const double *chargeBins, const int nChargeBins)
 Set the 3D (clusterTheta, p, charge) grid representing the categories for which weightfiles are defined.
 
void storeMVAWeights (const int pdg, const std::vector< std::string > &filepaths, const std::vector< std::tuple< double, double, double > > &categoryBinCentres)
 Given a particle mass hypothesis' pdgId, store the list of MVA weight files (one for each category) into the payload.
 
void storeMVAWeightsMultiClass (const std::vector< std::string > &filepaths, const std::vector< std::tuple< double, double, double > > &categoryBinCentres)
 For the multi-class mode, store the list of MVA weight files (one for each category) into the payload.
 
void storeCuts (const int pdg, const std::vector< std::string > &cutfiles, const std::vector< std::tuple< double, double, double > > &categoryBinCentres)
 Given a particle mass hypothesis' pdgId, store the list of selection cuts (one for each category) into the payload.
 
void storeCutsMultiClass (const std::vector< std::string > &cutfiles, const std::vector< std::tuple< double, double, double > > &categoryBinCentres)
 For the multi-class mode, store the list of selection cuts (one for each category) into the payload.
 
void storeAliases (const VariablesByAlias &aliases)
 Store the map associating variable aliases to variable names known to VariableManager.
 
const TH3F * getWeightCategories () const
 Get the raw pointer to the 3D grid representing the categories for which weightfiles are defined.
 
const std::vector< std::string > * getMVAWeights (const int pdg) const
 Given a particle mass hypothesis' pdgId, get the list of (serialized) MVA weightfiles stored in the payload, one for each category.
 
const std::vector< std::string > * getMVAWeightsMulticlass () const
 For the multi-class mode, get the list of (serialized) MVA weightfiles stored in the payload, one for each category.
 
const std::vector< std::string > * getCuts (const int pdg) const
 Given a particle mass hypothesis' pdgId, get the list of selection cuts stored in the payload, one for each category.
 
const std::vector< std::string > * getCutsMulticlass () const
 For the multi-class mode, get the list of selection cuts stored in the payload, one for each category.
 
const VariablesByAlias * getAliases () const
 Get the map of unique aliases.
 
unsigned int getMVAWeightIdx (const double &theta, const double &p, const double &charge, int &idx_theta, int &idx_p, int &idx_charge) const
 Get the index of the XML weight file, for a given reconstructed triplet (clusterTheta(theta), p, charge).
 
unsigned int getMVAWeightIdx (const double &theta, const double &p, const double &charge) const
 Overloaded method, to be used if not interested in knowing the 3D bin coordinates.
 
void dumpPayload (const double &theta, const double &p, const double &charge, const int pdg, bool dump_all=false) const
 Read and dump the payload content from the internal 'matrioska' maps into an XML weightfile for the given set of inputs.
 
void dumpPayloadMulticlass (const double &theta, const double &p, const double &charge) const
 Special version for multi-class mode.
 
bool isValidPdg (const int pdg) const
 Check if the input pdgId is that of a valid charged particle.
 
std::string getThetaVarName () const
 Get the name of the polar angle variable.
 
bool hasImplicitNaNmasking () const
 Check flag for implicit NaN masking.
 

Private Types

typedef std::unordered_map< int, std::vector< std::string > > WeightfilesByParticle
 Typedef.
 
typedef std::map< std::string, std::string > VariablesByAlias
 Typedef.
 

Private Member Functions

int findBin (const double &x, const double &y, const double &z) const
 Find global bin index of the 3D categories histogram for the given (x, y, z) values.
 
 ClassDef (ChargedPidMVAWeights, 10)
 ROOT ClassDef macro (class version 10). Version 2: add energy/angular units.
 

Private Attributes

TParameter< double > m_energy_unit
 The energy unit used for defining the bins grid.
 
TParameter< double > m_ang_unit
 The angular unit used for defining the bins grid.
 
std::string m_thetaVarName
 The name of the polar angle variable used in the MVA categorisation.
 
bool m_implicitNaNmasking
 Flag to indicate whether the MVA variables have been NaN-masked directly in the weightfiles.
 
std::unique_ptr< TH3F > m_categories
 A 3D histogram whose bins represent the categories for which XML weight files are defined.
 
WeightfilesByParticle m_weightfiles
 For each charged particle mass hypothesis' pdgId, this map contains a list of (serialized) Weightfile objects to be stored in the payload.
 
WeightfilesByParticle m_cuts
 For each charged particle mass hypothesis' pdgId, this map contains a list of selection cuts to be stored in the payload.
 
VariablesByAlias m_aliases
 A map that associates variable aliases used in the MVA training to variable names known to the VariableManager.
 

Detailed Description

Class to contain the payload of MVA weightfiles needed for charged particle identification.

Definition at line 38 of file ChargedPidMVAWeights.h.

Member Typedef Documentation

◆ VariablesByAlias

typedef std::map<std::string, std::string> VariablesByAlias
private

Typedef.

Definition at line 41 of file ChargedPidMVAWeights.h.

◆ WeightfilesByParticle

typedef std::unordered_map<int, std::vector<std::string> > WeightfilesByParticle
private

Typedef.

Definition at line 40 of file ChargedPidMVAWeights.h.

Member Enumeration Documentation

◆ ChargedPidMVATrainingMode

enum class ChargedPidMVATrainingMode : unsigned int
strong

A (strongly-typed) enumerator identifier for each valid MVA training mode.

Enumerator
c_Classification 

Binary classification.

c_Multiclass 

Multi-class classification.

c_ECL_Classification 

Binary classification, ECL only.

c_ECL_Multiclass 

Multi-class classification, ECL only.

c_PSD_Classification 

Binary classification, including PSD.

c_PSD_Multiclass 

Multi-class classification, including PSD.

c_ECL_PSD_Classification 

Binary classification, ECL only, including PSD.

c_ECL_PSD_Multiclass 

Multi-class classification, ECL only, including PSD.

Definition at line 77 of file ChargedPidMVAWeights.h.

enum class ChargedPidMVATrainingMode : unsigned int {
  c_Classification = 0,
  c_Multiclass = 1,
  c_ECL_Classification = 2,
  c_ECL_Multiclass = 3,
  c_PSD_Classification = 4,
  c_PSD_Multiclass = 5,
  c_ECL_PSD_Classification = 6,
  c_ECL_PSD_Multiclass = 7
};
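
Judging only from the enumerator values listed in this section, the three low bits appear to act as independent flags: bit 0 for multi-class, bit 1 for ECL only, bit 2 for PSD. This is an observation, not something the class documents. The sketch below re-declares the enum locally and decodes those bits. The helpers `isMulticlass`, `isECLOnly` and `usesPSD` are hypothetical and are not part of ChargedPidMVAWeights.

```cpp
#include <cassert>

// Local copy of the enumerators documented on this page.
enum class ChargedPidMVATrainingMode : unsigned int {
  c_Classification = 0,
  c_Multiclass = 1,
  c_ECL_Classification = 2,
  c_ECL_Multiclass = 3,
  c_PSD_Classification = 4,
  c_PSD_Multiclass = 5,
  c_ECL_PSD_Classification = 6,
  c_ECL_PSD_Multiclass = 7
};

// A strongly-typed enum does not convert implicitly, so cast explicitly.
constexpr unsigned int toUInt(ChargedPidMVATrainingMode mode)
{
  return static_cast<unsigned int>(mode);
}

// Hypothetical helpers, based on the assumed bit layout described above.
constexpr bool isMulticlass(ChargedPidMVATrainingMode mode) { return (toUInt(mode) & 1u) != 0; }
constexpr bool isECLOnly(ChargedPidMVATrainingMode mode) { return (toUInt(mode) & 2u) != 0; }
constexpr bool usesPSD(ChargedPidMVATrainingMode mode) { return (toUInt(mode) & 4u) != 0; }
```

If the enumerator values ever change, helpers like these would silently break, so an explicit switch over the enumerators is the safer choice in production code.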

Constructor & Destructor Documentation

◆ ChargedPidMVAWeights() [1/2]

Default constructor, necessary for ROOT to stream the object.

Definition at line 48 of file ChargedPidMVAWeights.h.

ChargedPidMVAWeights() :
  m_energy_unit("energyUnit", Unit::GeV),
  m_ang_unit("angularUnit", Unit::rad),
  m_thetaVarName("clusterTheta"),
  m_implicitNaNmasking(false)
{};

◆ ChargedPidMVAWeights() [2/2]

ChargedPidMVAWeights ( const double &  energyUnit,
const double &  angUnit,
const std::string &  thetaVarName = "clusterTheta",
bool  implictNaNmasking = false 
)
inline

Specialized constructor.

Definition at line 59 of file ChargedPidMVAWeights.h.

{
  setEnergyUnit(energyUnit);
  setAngularUnit(angUnit);
  m_thetaVarName = thetaVarName;
  m_implicitNaNmasking = implictNaNmasking;
}

◆ ~ChargedPidMVAWeights()

~ChargedPidMVAWeights ( )
inline

Destructor.

Definition at line 72 of file ChargedPidMVAWeights.h.

{};

Member Function Documentation

◆ ClassDef()

ClassDef ( ChargedPidMVAWeights  ,
10   
)
private

Class version history, most recent first:

  10. Add name of polar angle variable used for categorisation, and a boolean flag to check if implicit NaN masking is set in the input data.
  9. Add map of variable aliases and original basf2 vars.
  8. Use unique_ptr for m_categories.
  7. Use double instead of float in tuple.
  6. Introduce charge bin in the parametrisation.
  5. Remove 2D grid dependence on pdgId, add multi-class support, define enum for valid training modes.
  4. Add cuts map.
  3. Add overloaded getMVAWeightIdx.
  2. Add energy/angular units.
  1. First class implementation.

◆ dumpPayload()

void dumpPayload ( const double &  theta,
const double &  p,
const double &  charge,
const int  pdg,
bool  dump_all = false 
) const
inline

Read and dump the payload content from the internal 'matrioska' maps into an XML weightfile for the given set of inputs.

Useful for debugging.

Parameters
theta     the particle polar angle (from the cluster, or from the track if no cluster match) in [rad].
p         the particle momentum (from the track) in [GeV/c].
charge    the particle charge (from the track).
pdg       the particle mass hypothesis' pdgId.
dump_all  dump all information.

Definition at line 395 of file ChargedPidMVAWeights.h.

{

  B2INFO("Dumping payload content for:");
  B2INFO("clusterTheta(theta) = " << theta << " [rad], p = " << p << " [GeV/c], charge = " << charge);

  if (m_categories) {
    std::string filename = "db_payload_chargedpidmva__theta_p_charge_categories.root";
    B2INFO("\tWriting ROOT file w/ TH3F grid that defines categories:" << filename);
    auto f = std::make_unique<TFile>(filename.c_str(), "RECREATE");
    m_categories->Write();
    f->Close();
  } else {
    B2WARNING("\tThe TH3F object that defines categories is a nullptr!");
  }

  for (const auto& [pdgId, weights] : m_weightfiles) {

    if (!dump_all && pdg != pdgId) continue;

    auto idx = getMVAWeightIdx(theta, p, charge);

    auto serialized_weightfile = weights.at(idx);

    std::string filename = "db_payload_chargedpidmva__weightfile_pdg_" + std::to_string(pdgId) +
                           "_glob_bin_" + std::to_string(idx + 1) + ".xml";

    auto cutstr = getCuts(pdgId)->at(idx);

    B2INFO("\tpdgId = " << pdgId);
    B2INFO("\tCut: " << cutstr);
    B2INFO("\tWriting weight file: " << filename);

    std::ofstream weightfile;
    weightfile.open(filename.c_str(), std::ios::out);
    weightfile << serialized_weightfile << std::endl;
    weightfile.close();

  }

};

◆ dumpPayloadMulticlass()

void dumpPayloadMulticlass ( const double &  theta,
const double &  p,
const double &  charge 
) const
inline

Special version for multi-class mode.

Uses the special value of pdg=0 reserved for multi-class mode.

Definition at line 442 of file ChargedPidMVAWeights.h.

{
  dumpPayload(theta, p, charge, 0);
}

◆ findBin()

int findBin ( const double &  x,
const double &  y,
const double &  z 
) const
inline, private

Find global bin index of the 3D categories histogram for the given (x, y, z) values.

This method had to be re-implemented b/c ROOT has no const version of TH1::FindBin() :(

Parameters
x  value along the x axis.
y  value along the y axis.
z  value along the z axis.
Returns
the global linearised bin index.

Definition at line 487 of file ChargedPidMVAWeights.h.

{

  int nbinsx_vis = m_categories->GetXaxis()->GetNbins();
  int nbinsy_vis = m_categories->GetYaxis()->GetNbins();
  int nbinsz_vis = m_categories->GetZaxis()->GetNbins();

  double xx = x;
  double yy = y;
  double zz = z;

  // If x, y, z are outside of the 3D grid (visible) range, set their value to
  // fall in the last (first) bin before (after) overflow (underflow).
  if (x < m_categories->GetXaxis()->GetBinLowEdge(1)) { xx = m_categories->GetXaxis()->GetBinCenter(1); }
  if (x >= m_categories->GetXaxis()->GetBinLowEdge(nbinsx_vis + 1)) { xx = m_categories->GetXaxis()->GetBinCenter(nbinsx_vis); }
  if (y < m_categories->GetYaxis()->GetBinLowEdge(1)) { yy = m_categories->GetYaxis()->GetBinCenter(1); }
  if (y >= m_categories->GetYaxis()->GetBinLowEdge(nbinsy_vis + 1)) { yy = m_categories->GetYaxis()->GetBinCenter(nbinsy_vis); }
  if (z < m_categories->GetZaxis()->GetBinLowEdge(1)) { zz = m_categories->GetZaxis()->GetBinCenter(1); }
  if (z >= m_categories->GetZaxis()->GetBinLowEdge(nbinsz_vis + 1)) { zz = m_categories->GetZaxis()->GetBinCenter(nbinsz_vis); }

  int nbinsx = m_categories->GetXaxis()->GetNbins() + 2;
  int nbinsy = m_categories->GetYaxis()->GetNbins() + 2;

  int j = m_categories->GetXaxis()->FindBin(xx);
  int i = m_categories->GetYaxis()->FindBin(yy);
  int k = m_categories->GetZaxis()->FindBin(zz);

  return j + nbinsx * (i + nbinsy * k);
}
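
The clamping and the global-bin formula used by findBin() can be tried out without ROOT. The following is a minimal sketch under stated assumptions: each axis is modelled as a plain vector of ascending bin edges, and bin numbers follow the convention visible in the listing (0 for underflow, 1 to n for visible bins, n+1 for overflow). The names `axisBin`, `clampedAxisBin` and `globalBin` are hypothetical and do not exist in ROOT or basf2.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Bin number on one axis: 0 = underflow, 1..n = visible bins, n+1 = overflow.
// `edges` holds the n+1 ascending edges of the n visible bins.
int axisBin(const std::vector<double>& edges, double x)
{
  assert(edges.size() >= 2);
  if (x < edges.front()) return 0;
  if (x >= edges.back()) return static_cast<int>(edges.size());
  // The position of the first edge strictly greater than x is the 1-based bin number.
  return static_cast<int>(std::upper_bound(edges.begin(), edges.end(), x) - edges.begin());
}

// Out-of-range values are moved into the first or last visible bin,
// which has the same effect as replacing them with that bin's centre.
int clampedAxisBin(const std::vector<double>& edges, double x)
{
  const int nVisible = static_cast<int>(edges.size()) - 1;
  return std::max(1, std::min(axisBin(edges, x), nVisible));
}

// Global bin index, counting underflow and overflow on the x and y axes.
int globalBin(const std::vector<double>& xEdges, const std::vector<double>& yEdges,
              const std::vector<double>& zEdges, double x, double y, double z)
{
  const int nbinsx = static_cast<int>(xEdges.size()) - 1 + 2;
  const int nbinsy = static_cast<int>(yEdges.size()) - 1 + 2;
  const int j = clampedAxisBin(xEdges, x);
  const int i = clampedAxisBin(yEdges, y);
  const int k = clampedAxisBin(zEdges, z);
  return j + nbinsx * (i + nbinsy * k);
}
```

With 2, 3 and 2 visible bins along x, y and z, a point in bins (1, 2, 2) gets global index 1 + 4 * (2 + 5 * 2) = 49.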

◆ getAliases()

const VariablesByAlias * getAliases ( ) const
inline

Get the map of unique aliases.

Definition at line 337 of file ChargedPidMVAWeights.h.

{
  return &m_aliases;
}

◆ getCuts()

const std::vector< std::string > * getCuts ( const int  pdg) const
inline

Given a particle mass hypothesis' pdgId, get the list of selection cuts stored in the payload, one for each category.

Parameters
pdg  the particle mass hypothesis' pdgId.

Definition at line 317 of file ChargedPidMVAWeights.h.

{
  return &(m_cuts.at(pdg));
}

◆ getCutsMulticlass()

const std::vector< std::string > * getCutsMulticlass ( ) const
inline

For the multi-class mode, get the list of selection cuts stored in the payload, one for each category.

Uses the special value of pdg=0 reserved for multi-class mode.

Definition at line 328 of file ChargedPidMVAWeights.h.

{
  return getCuts(0);
}

◆ getMVAWeightIdx() [1/2]

unsigned int getMVAWeightIdx ( const double &  theta,
const double &  p,
const double &  charge 
) const
inline

Overloaded method, to be used if not interested in knowing the 3D bin coordinates.

Definition at line 379 of file ChargedPidMVAWeights.h.

{
  int idx_theta, idx_p, idx_charge;
  return getMVAWeightIdx(theta, p, charge, idx_theta, idx_p, idx_charge);
}

◆ getMVAWeightIdx() [2/2]

unsigned int getMVAWeightIdx ( const double &  theta,
const double &  p,
const double &  charge,
int &  idx_theta,
int &  idx_p,
int &  idx_charge 
) const
inline

Get the index of the XML weight file, for a given reconstructed triplet (clusterTheta(theta), p, charge).

The index is obtained by linearising the 3D m_categories histogram. The same index can be used to look up the correct MVAExpert, Dataset and Cut in the application module, hence we believe it's more useful to return the index rather than a pointer to the weightfile itself. The function also retrieves the 3D bin coordinates.

Parameters
theta             the particle polar angle (from the cluster, or from the track if no cluster match) in [rad].
p                 the particle momentum (from the track) in [GeV/c].
charge            the particle charge (from the track).
[out] idx_theta   the index of the 3D bin along the theta (X) axis.
[out] idx_p       the index of the 3D bin along the p (Y) axis.
[out] idx_charge  the index of the 3D bin along the charge (Z) axis.
Returns
the index of the weightfile of interest from the array of weightfiles.

Definition at line 357 of file ChargedPidMVAWeights.h.

{

  if (!m_categories) {
    B2FATAL("No (clusterTheta, p, charge) TH3 grid was found in the DB payload. Most likely, you are using a GT w/ an old payload which is no longer compatible with the DB object class implementation. This should not happen! Abort...");
  }

  int nbins_th = m_categories->GetXaxis()->GetNbins(); // nr. of theta (visible) bins, along X.
  int nbins_p = m_categories->GetYaxis()->GetNbins(); // nr. of p (visible) bins, along Y.

  int glob_bin_idx = findBin(theta / m_ang_unit.GetVal(), p / m_energy_unit.GetVal(), charge);
  m_categories->GetBinXYZ(glob_bin_idx, idx_theta, idx_p, idx_charge);

  // The index of the linearised 3D m_categories.
  // The unit offset is b/c ROOT sets global bin idx also for overflows and underflows.
  return (idx_theta - 1) + nbins_th * ((idx_p - 1) + nbins_p * (idx_charge - 1));
}
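
The step from a global bin index (which counts underflow and overflow bins) to the 0-based weightfile index (which counts visible bins only) is plain integer arithmetic. The sketch below reproduces it without ROOT. `unpackGlobalBin` is a hypothetical stand-in that inverts the formula `j + nbinsx * (i + nbinsy * k)` from findBin(). It is not the ROOT implementation of GetBinXYZ. `linearIndex` repeats the return expression of getMVAWeightIdx().

```cpp
#include <cassert>

// 1-based visible bin numbers along the theta (x), p (y) and charge (z) axes.
struct BinXYZ {
  int x;
  int y;
  int z;
};

// Invert globalBin = j + nbinsx * (i + nbinsy * k), where nbinsx and nbinsy
// include the underflow and overflow bins (visible bins + 2).
BinXYZ unpackGlobalBin(int globalBin, int nVisibleX, int nVisibleY)
{
  const int nbinsx = nVisibleX + 2;
  const int nbinsy = nVisibleY + 2;
  return { globalBin % nbinsx, (globalBin / nbinsx) % nbinsy, globalBin / (nbinsx * nbinsy) };
}

// 0-based position in the weightfile vector, counting visible bins only.
// The theta index runs fastest, then p, then charge.
unsigned int linearIndex(int idx_theta, int idx_p, int idx_charge, int nbins_th, int nbins_p)
{
  assert(idx_theta >= 1 && idx_theta <= nbins_th);
  assert(idx_p >= 1 && idx_p <= nbins_p);
  assert(idx_charge >= 1);
  return static_cast<unsigned int>((idx_theta - 1) + nbins_th * ((idx_p - 1) + nbins_p * (idx_charge - 1)));
}
```

For a grid with 2 theta bins and 3 p bins, global bin 49 unpacks to (1, 2, 2), which maps to weightfile index 0 + 2 * (1 + 3 * 1) = 8.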

◆ getMVAWeights()

const std::vector< std::string > * getMVAWeights ( const int  pdg) const
inline

Given a particle mass hypothesis' pdgId, get the list of (serialized) MVA weightfiles stored in the payload, one for each category.

Parameters
pdg  the particle mass hypothesis' pdgId.

Definition at line 295 of file ChargedPidMVAWeights.h.

{
  return &(m_weightfiles.at(pdg));
}
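
Because the lookup uses `std::unordered_map::at`, a pdgId that is not a key of the map makes the getter throw `std::out_of_range` rather than return a null pointer. The sketch below shows that behaviour on a stand-alone map of the same type. `makeExamplePayload`, `lookupWeights` and `lookupThrows` are hypothetical helpers, and the strings are placeholders rather than real serialized weightfiles.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

using WeightfilesByParticle = std::unordered_map<int, std::vector<std::string>>;

// Build a toy payload: key 0 is the multi-class slot, 11 is the electron PDG code.
WeightfilesByParticle makeExamplePayload()
{
  WeightfilesByParticle payload;
  payload[0].push_back("placeholder: multi-class, category 0");
  payload[11].push_back("placeholder: electron, category 0");
  payload[11].push_back("placeholder: electron, category 1");
  return payload;
}

// Same access pattern as the getter: at() throws if the key is absent.
const std::vector<std::string>* lookupWeights(const WeightfilesByParticle& payload, int pdg)
{
  return &(payload.at(pdg));
}

// Returns true if looking up `pdg` throws std::out_of_range.
bool lookupThrows(const WeightfilesByParticle& payload, int pdg)
{
  try {
    lookupWeights(payload, pdg);
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}
```

Callers that cannot guarantee a valid pdgId may want to call isValidPdg() first.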

◆ getMVAWeightsMulticlass()

const std::vector< std::string > * getMVAWeightsMulticlass ( ) const
inline

For the multi-class mode, get the list of (serialized) MVA weightfiles stored in the payload, one for each category.

Uses the special value of pdg=0 reserved for multi-class mode.

Definition at line 306 of file ChargedPidMVAWeights.h.

{
  return getMVAWeights(0);
}

◆ getThetaVarName()

std::string getThetaVarName ( ) const
inline

Get the name of the polar angle variable.

Definition at line 461 of file ChargedPidMVAWeights.h.

{
  return m_thetaVarName;
}

◆ getWeightCategories()

const TH3F * getWeightCategories ( ) const
inline

Get the raw pointer to the 3D grid representing the categories for which weightfiles are defined.

Used just to view the stored data.

Definition at line 284 of file ChargedPidMVAWeights.h.

{
  return m_categories.get();
}

◆ hasImplicitNaNmasking()

bool hasImplicitNaNmasking ( ) const
inline

Check flag for implicit NaN masking.

Definition at line 470 of file ChargedPidMVAWeights.h.

{
  return m_implicitNaNmasking;
}

◆ isValidPdg()

bool isValidPdg ( const int  pdg) const
inline

Check if the input pdgId is that of a valid charged particle.

An input value of pdg=0 is considered valid, since it's reserved for multi-class mode.

Definition at line 452 of file ChargedPidMVAWeights.h.

{
  bool isValid = (Const::chargedStableSet.find(pdg) != Const::invalidParticle) || (pdg == 0);
  return isValid;
}

◆ setAngularUnit()

void setAngularUnit ( const double &  unit)
inline

Set the angular unit to ensure consistency w/ the one used to define the bins grid.

Definition at line 106 of file ChargedPidMVAWeights.h.

{ m_ang_unit.SetVal(unit); }

◆ setEnergyUnit()

void setEnergyUnit ( const double &  unit)
inline

Set the energy unit to ensure consistency w/ the one used to define the bins grid.

Definition at line 100 of file ChargedPidMVAWeights.h.

{ m_energy_unit.SetVal(unit); }
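
The stored units matter because getMVAWeightIdx() divides the incoming theta and p by them before looking up the bin. The sketch below illustrates that division with a unit system in which the standard units equal 1. The constants `kRad`, `kDeg`, `kGeV` and `kMeV` and the helper `toGridUnit` are assumptions made for this example. They are not the basf2 Unit class.

```cpp
#include <cassert>
#include <cmath>

// Assumed unit constants: the standard units (rad, GeV) are 1,
// other units are expressed as multiples of them.
constexpr double kRad = 1.0;
constexpr double kDeg = 3.14159265358979323846 / 180.0;
constexpr double kGeV = 1.0;
constexpr double kMeV = 1e-3;

// Express a value given in standard units in the unit used to define the bin grid.
constexpr double toGridUnit(double valueInStandardUnits, double gridUnit)
{
  return valueInStandardUnits / gridUnit;
}
```

For example, if the grid edges were defined in degrees and MeV, a polar angle of pi/2 rad would be looked up as 90 and a momentum of 0.5 GeV/c as 500.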

◆ setWeightCategories()

void setWeightCategories ( const double *  clusterThetaBins,
const int  nClusterThetaBins,
const double *  pBins,
const int  nPBins,
const double *  chargeBins,
const int  nChargeBins 
)
inline

Set the 3D (clusterTheta, p, charge) grid representing the categories for which weightfiles are defined.

Parameters
clusterThetaBins   array of clusterTheta bin edges
nClusterThetaBins  number of clusterTheta bins
pBins              array of p bin edges
nPBins             number of p bins
chargeBins         array of charge bin edges
nChargeBins        number of charge bins

Definition at line 117 of file ChargedPidMVAWeights.h.

{

  m_categories = std::make_unique<TH3F>("clustertheta_p_charge_binsgrid",
                                        ";ECL cluster #theta;p_{lab};Q",
                                        nClusterThetaBins, clusterThetaBins,
                                        nPBins, pBins,
                                        nChargeBins, chargeBins);
}

◆ storeAliases()

void storeAliases ( const VariablesByAlias &  aliases)
inline

Store the map associating variable aliases to variable names known to VariableManager.

Parameters
aliases  a map of (alias, VM variable) pairs. NB: it is supposed to contain all the aliases for every category.

Definition at line 274 of file ChargedPidMVAWeights.h.

{
  m_aliases = VariablesByAlias(aliases);
}

◆ storeCuts()

void storeCuts ( const int  pdg,
const std::vector< std::string > &  cutfiles,
const std::vector< std::tuple< double, double, double > > &  categoryBinCentres 
)
inline

Given a particle mass hypothesis' pdgId, store the list of selection cuts (one for each category) into the payload.

Parameters
pdg                 the particle mass hypothesis' pdgId.
cutfiles            a list of text files w/ cut strings, one for each (clusterTheta, p, charge) category. The format of the cut must comply with the GeneralCut syntax.
categoryBinCentres  a list of <double, double, double> tuples representing the (clusterTheta, p, charge) bin centres. Used to check consistency of the cut-file vector indexing w/ the linearised TH3 category map.

Definition at line 213 of file ChargedPidMVAWeights.h.

{

  if (!isValidPdg(pdg)) {
    B2FATAL("PDG: " << pdg << " is not that of a valid charged particle! Aborting...");
  }

  unsigned int idx(0);
  for (const auto& cutfile : cutfiles) {

    auto bin_centres_tuple = categoryBinCentres.at(idx);

    auto theta_bin_centre = std::get<0>(bin_centres_tuple);
    auto p_bin_centre = std::get<1>(bin_centres_tuple);
    auto charge_bin_centre = std::get<2>(bin_centres_tuple);

    auto h_idx = getMVAWeightIdx(theta_bin_centre, p_bin_centre, charge_bin_centre);
    if (idx != h_idx) {
      B2FATAL("Cut file:\n" << cutfile << "\nindex in input vector:\n" << idx << "\ndoes not correspond to:\n" << h_idx <<
              "\n, i.e. the linearised index of the 3D bin centered in (clusterTheta, p, charge) = (" << theta_bin_centre << ", " << p_bin_centre
              << ", " <<
              charge_bin_centre <<
              ")\nPlease check how the input cut file list is being filled.");
    }

    std::ifstream ifs(cutfile);
    std::string cut((std::istreambuf_iterator<char>(ifs)), (std::istreambuf_iterator<char>()));

    // Strip all newline characters (not only a trailing one).
    cut.erase(std::remove(cut.begin(), cut.end(), '\n'), cut.end());

    m_cuts[pdg].push_back(cut);

    ++idx;
  }

}
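
The erase-remove call in storeCuts() deletes every newline character in the cut text, so a cut written over several lines is joined with no separator between the lines. The sketch below reproduces the read-and-strip step on an in-memory stream, so it needs no file on disk. `stripNewlines` and `readCutString` are hypothetical helper names, and the cut strings in the usage note are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Remove every '\n' from the text using the erase-remove idiom.
std::string stripNewlines(std::string text)
{
  text.erase(std::remove(text.begin(), text.end(), '\n'), text.end());
  return text;
}

// Read the whole stream into a string, then strip the newlines,
// mirroring what storeCuts() does for each cut file.
std::string readCutString(std::istream& in)
{
  std::string cut((std::istreambuf_iterator<char>(in)), std::istreambuf_iterator<char>());
  return stripNewlines(cut);
}
```

A file containing `p > 0.5` followed by a newline yields `p > 0.5`. A file with `p > 0.5 and` on one line and `clusterE > 0.1` on the next yields `p > 0.5 andclusterE > 0.1`, so keeping each cut on a single line (or ending lines with a space) avoids fused tokens.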

◆ storeCutsMultiClass()

void storeCutsMultiClass ( const std::vector< std::string > &  cutfiles,
const std::vector< std::tuple< double, double, double > > &  categoryBinCentres 
)
inline

For the multi-class mode, store the list of selection cuts (one for each category) into the payload.

Uses the special value of pdg=0 reserved for multi-class mode.

Parameters
cutfiles            a list of text files w/ cut strings, one for each (clusterTheta, p, charge) category. The format of the cut must comply with the GeneralCut syntax.
categoryBinCentres  a list of <double, double, double> tuples representing the (clusterTheta, p, charge) bin centres. Used to check consistency of the cut-file vector indexing w/ the linearised TH3 category map.

Definition at line 262 of file ChargedPidMVAWeights.h.

{
  storeCuts(0, cutfiles, categoryBinCentres);
}

◆ storeMVAWeights()

void storeMVAWeights ( const int  pdg,
const std::vector< std::string > &  filepaths,
const std::vector< std::tuple< double, double, double > > &  categoryBinCentres 
)
inline

Given a particle mass hypothesis' pdgId, store the list of MVA weight files (one for each category) into the payload.

Parameters
pdg                 the particle mass hypothesis' pdgId.
filepaths           a list of xml (root) file paths for several (clusterTheta, p, charge) categories.
categoryBinCentres  a list of <double, double, double> tuples representing the (clusterTheta, p, charge) bin centres. Used to check consistency of the xml vector indexing w/ the linearised TH3 category map.

Definition at line 138 of file ChargedPidMVAWeights.h.

{

  if (!isValidPdg(pdg)) {
    B2FATAL("PDG: " << pdg << " is not that of a valid charged particle! Aborting...");
  }

  unsigned int idx(0);
  for (const auto& path : filepaths) {

    // Index consistency check.
    auto bin_centres_tuple = categoryBinCentres.at(idx);

    auto theta_bin_centre = std::get<0>(bin_centres_tuple);
    auto p_bin_centre = std::get<1>(bin_centres_tuple);
    auto charge_bin_centre = std::get<2>(bin_centres_tuple);

    auto h_idx = getMVAWeightIdx(theta_bin_centre, p_bin_centre, charge_bin_centre);
    if (idx != h_idx) {
      B2FATAL("xml file:\n" << path << "\nindex in input vector:\n" << idx << "\ndoes not correspond to:\n" << h_idx <<
              "\n, i.e. the linearised index of the 3D bin centered in (clusterTheta, p, charge) = (" << theta_bin_centre << ", " << p_bin_centre
              << ", " <<
              charge_bin_centre <<
              ")\nPlease check how the input xml file list is being filled.");
    }

    Belle2::MVA::Weightfile weightfile;
    if (boost::ends_with(path, ".root")) {
      weightfile = Belle2::MVA::Weightfile::loadFromROOTFile(path);
    } else if (boost::ends_with(path, ".xml")) {
      weightfile = Belle2::MVA::Weightfile::loadFromXMLFile(path);
    } else {
      B2WARNING("Unknown file extension for file: " << path << ", fallback to xml...");
      weightfile = Belle2::MVA::Weightfile::loadFromXMLFile(path);
    }

    // Serialize the MVA::Weightfile object into a string for storage in the database,
    // otherwise there are issues w/ dictionary generation for the payload class...
    std::stringstream ss;
    Belle2::MVA::Weightfile::saveToStream(weightfile, ss);
    m_weightfiles[pdg].push_back(ss.str());

    ++idx;
  }

}
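
The consistency check in storeMVAWeights() only passes if entry number idx of categoryBinCentres lies in the bin whose linearised index is idx, that is, with theta running fastest, then p, then charge. One way to build such a list from the bin edges is sketched below. `binCentres` and `makeCategoryBinCentres` are hypothetical helpers, and the sketch assumes the grid is defined in the same units in which the centres are passed, so no unit conversion is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Centres of the bins delimited by ascending `edges`.
std::vector<double> binCentres(const std::vector<double>& edges)
{
  std::vector<double> centres;
  for (std::size_t i = 0; i + 1 < edges.size(); ++i) {
    centres.push_back(0.5 * (edges[i] + edges[i + 1]));
  }
  return centres;
}

// (theta, p, charge) bin centres ordered so that the position in the vector equals
// (idx_theta - 1) + nbins_th * ((idx_p - 1) + nbins_p * (idx_charge - 1)).
std::vector<std::tuple<double, double, double>> makeCategoryBinCentres(
  const std::vector<double>& thetaEdges,
  const std::vector<double>& pEdges,
  const std::vector<double>& chargeEdges)
{
  std::vector<std::tuple<double, double, double>> centres;
  for (double charge : binCentres(chargeEdges)) {   // slowest index
    for (double p : binCentres(pEdges)) {
      for (double theta : binCentres(thetaEdges)) { // fastest index
        centres.emplace_back(theta, p, charge);
      }
    }
  }
  return centres;
}
```

The list of file paths then has to follow the same order, one path per tuple.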

◆ storeMVAWeightsMultiClass()

void storeMVAWeightsMultiClass ( const std::vector< std::string > &  filepaths,
const std::vector< std::tuple< double, double, double > > &  categoryBinCentres 
)
inline

For the multi-class mode, store the list of MVA weight files (one for each category) into the payload.

Uses the special value of pdg=0 reserved for multi-class mode.

Parameters
filepaths           a list of xml (root) file paths for several (clusterTheta, p, charge) categories.
categoryBinCentres  a list of <double, double, double> tuples representing the (clusterTheta, p, charge) bin centres. Used to check consistency of the xml vector indexing w/ the linearised TH3 category map.

Definition at line 196 of file ChargedPidMVAWeights.h.

{
  storeMVAWeights(0, filepaths, categoryBinCentres);
}

Member Data Documentation

◆ m_aliases

VariablesByAlias m_aliases
private

A map that associates variable aliases used in the MVA training to variable names known to the VariableManager.

Definition at line 576 of file ChargedPidMVAWeights.h.

◆ m_ang_unit

TParameter<double> m_ang_unit
private

The angular unit used for defining the bins grid.

Definition at line 522 of file ChargedPidMVAWeights.h.

◆ m_categories

std::unique_ptr<TH3F> m_categories
private

A 3D histogram whose bins represent the categories for which XML weight files are defined.

It is used to lookup the correct file in the payload, given a reconstructed set of (clusterTheta(theta), p, charge).

Definition at line 532 of file ChargedPidMVAWeights.h.

◆ m_cuts

WeightfilesByParticle m_cuts
private
Initial value:
= {
{ 0, std::vector<std::string>() },
{ Const::electron.getPDGCode(), std::vector<std::string>() },
{ Const::muon.getPDGCode(), std::vector<std::string>() },
{ Const::pion.getPDGCode(), std::vector<std::string>() },
{ Const::kaon.getPDGCode(), std::vector<std::string>() },
{ Const::proton.getPDGCode(), std::vector<std::string>() },
{ Const::deuteron.getPDGCode(), std::vector<std::string>() }
}

For each charged particle mass hypothesis' pdgId, this map contains a list of selection cuts to be stored in the payload.

Each Weightfile (i.e., category) has a corresponding cut. The indexing in each vector must match that of the 'linearised' TH3F histogram m_categories.

The dummy pdgId=0 key is reserved for multi-class, where a unique signal hypothesis is not defined.

Definition at line 562 of file ChargedPidMVAWeights.h.

◆ m_energy_unit

TParameter<double> m_energy_unit
private

The energy unit used for defining the bins grid.

Definition at line 521 of file ChargedPidMVAWeights.h.

◆ m_implicitNaNmasking

bool m_implicitNaNmasking
private

Flag to indicate whether the MVA variables have been NaN-masked directly in the weightfiles.

Definition at line 525 of file ChargedPidMVAWeights.h.

◆ m_thetaVarName

std::string m_thetaVarName
private

The name of the polar angle variable used in the MVA categorisation.

Must be a string that can be parsed by the VariableManager.

Definition at line 524 of file ChargedPidMVAWeights.h.

◆ m_weightfiles

WeightfilesByParticle m_weightfiles
private
Initial value:
= {
{ 0, std::vector<std::string>() },
{ Const::electron.getPDGCode(), std::vector<std::string>() },
{ Const::muon.getPDGCode(), std::vector<std::string>() },
{ Const::pion.getPDGCode(), std::vector<std::string>() },
{ Const::kaon.getPDGCode(), std::vector<std::string>() },
{ Const::proton.getPDGCode(), std::vector<std::string>() },
{ Const::deuteron.getPDGCode(), std::vector<std::string>() }
}

For each charged particle mass hypothesis' pdgId, this map contains a list of (serialized) Weightfile objects to be stored in the payload.

Each weightfile in the list corresponds to a 3D category. The indexing in each vector must match that of the 'linearised' TH3F histogram m_categories.

The dummy pdgId=0 key is reserved for multi-class, where a unique signal hypothesis is not defined.

Definition at line 543 of file ChargedPidMVAWeights.h.


The documentation for this class was generated from the following file: