Belle II Software development
RegressionDataSet Class Reference

Dataset needed during the training of a regression method.

#include <Regression.h>

Inheritance diagram for RegressionDataSet:
RegressionDataSet inherits from Dataset.

Public Member Functions

 RegressionDataSet (const GeneralOptions &general_options, Dataset *dataSet, double cutValue)
 Create a new regression data set out of the general options, a pointer to the real dataset and the cut value.
 
unsigned int getNumberOfFeatures () const override
 Return the number of features from the real dataset.
 
unsigned int getNumberOfEvents () const override
 Return the number of events from the real dataset.
 
unsigned int getNumberOfSpectators () const override
 Return the number of spectators from the real dataset.
 
void loadEvent (unsigned int iEvent) override
 Load an event. Sets all internal variables, then sets the isSignal variable according to the cut value.
 
std::vector< float > getFeature (unsigned int iFeature) override
 Return a specific feature from the real dataset.
 
std::vector< float > getSpectator (unsigned int iSpectator) override
 Return a specific spectator from the real dataset.
 
std::vector< float > getWeights () override
 Return the weights from the real dataset.
 
std::vector< float > getTargets () override
 Return the targets from the real dataset.
 
virtual float getSignalFraction ()
 Returns the signal fraction of the whole sample.
 
virtual unsigned int getFeatureIndex (const std::string &feature)
 Return index of feature with the given name.
 
virtual unsigned int getSpectatorIndex (const std::string &spectator)
 Return index of spectator with the given name.
 
virtual std::vector< bool > getSignals ()
 Returns the isSignal flag of every event.
 

Public Attributes

GeneralOptions m_general_options
 GeneralOptions passed to this dataset.
 
std::vector< float > m_input
 Contains all feature values of the currently loaded event.
 
std::vector< float > m_spectators
 Contains all spectator values of the currently loaded event.
 
float m_weight
 Contains the weight of the currently loaded event.
 
float m_target
 Contains the target value of the currently loaded event.
 
bool m_isSignal
 Defines if the currently loaded event is signal or background.
 

Private Attributes

double m_cutValue
 The cut value.
 
Dataset * m_childDataSet
 The real data set (our child)
 

Detailed Description

Dataset needed during the training of a regression method.

It wraps another dataset, which it receives as a pointer, and forwards every call to that dataset.

The only difference is that the isSignal variable is set according to the cut value given in the constructor:

 isSignal = target >= cutValue

This turns a regression task into a binary classification.
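
The wrapping described above can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions, not the Belle II implementation: SketchDataset, VectorDataset and ThresholdDataset are hypothetical stand-ins for Dataset, a concrete child dataset and RegressionDataSet. Only the rule isSignal = target >= cutValue is taken from the description.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Dataset base class (not the Belle II type).
struct SketchDataset {
  virtual ~SketchDataset() = default;
  virtual unsigned int getNumberOfEvents() const = 0;
  virtual void loadEvent(unsigned int iEvent) = 0;

  float m_target = 0.0f;   // target of the currently loaded event
  bool m_isSignal = false; // signal flag of the currently loaded event
};

// Hypothetical child dataset that serves regression targets from a vector.
struct VectorDataset : SketchDataset {
  explicit VectorDataset(std::vector<float> targets) : m_targets(std::move(targets)) {}

  unsigned int getNumberOfEvents() const override
  {
    return static_cast<unsigned int>(m_targets.size());
  }

  void loadEvent(unsigned int iEvent) override { m_target = m_targets.at(iEvent); }

  std::vector<float> m_targets;
};

// Wrapper: forwards calls to the child and derives isSignal from the cut value.
struct ThresholdDataset : SketchDataset {
  ThresholdDataset(SketchDataset* child, double cutValue)
    : m_child(child), m_cutValue(cutValue) {}

  unsigned int getNumberOfEvents() const override { return m_child->getNumberOfEvents(); }

  void loadEvent(unsigned int iEvent) override
  {
    m_child->loadEvent(iEvent);          // forward to the wrapped dataset
    m_target = m_child->m_target;        // copy the event state
    m_isSignal = m_target >= m_cutValue; // binary label from the regression target
  }

  SketchDataset* m_child; // not owned by the wrapper
  double m_cutValue;
};
```

Because the comparison is >=, an event whose target equals the cut value counts as signal.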

Definition at line 77 of file Regression.h.

Constructor & Destructor Documentation

◆ RegressionDataSet()

RegressionDataSet ( const GeneralOptions & general_options,
Dataset * dataSet,
double  cutValue 
)

Create a new regression data set out of the general options, a pointer to the real dataset and the cut value.

Definition at line 16 of file Regression.cc.

16 :
17  Dataset(general_options), m_cutValue(cutValue), m_childDataSet(dataSet)
18 {
19 }

Member Function Documentation

◆ getFeature()

std::vector< float > getFeature ( unsigned int  iFeature)
override virtual

Return a specific feature from the real dataset.

Reimplemented from Dataset.

Definition at line 46 of file Regression.cc.

47{
48 return m_childDataSet->getFeature(iFeature);
49}

◆ getFeatureIndex()

unsigned int getFeatureIndex ( const std::string &  feature)
virtual inherited

Return index of feature with the given name.

Parameters
feature: name of the feature

Definition at line 50 of file Dataset.cc.

51 {
52
53 auto it = std::find(m_general_options.m_variables.begin(), m_general_options.m_variables.end(), feature);
54 if (it == m_general_options.m_variables.end()) {
55 B2ERROR("Unknown feature named " << feature);
56 return 0;
57 }
58 return std::distance(m_general_options.m_variables.begin(), it);
59
60 }
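
The lookup above returns 0 both for the first feature and for an unknown name, so the return value alone cannot distinguish the two cases. The sketch below reproduces the std::find / std::distance pattern on a plain vector and shows one way a caller might check for membership first. findIndex and hasName are hypothetical helpers, not part of the Belle II API, and the B2ERROR logging is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: same search pattern as getFeatureIndex, returning 0 on a miss.
unsigned int findIndex(const std::vector<std::string>& names, const std::string& name)
{
  auto it = std::find(names.begin(), names.end(), name);
  if (it == names.end()) {
    return 0; // same fallback as the listing: indistinguishable from a hit at index 0
  }
  return static_cast<unsigned int>(std::distance(names.begin(), it));
}

// Hypothetical helper: lets a caller tell "unknown name" apart from "index 0".
bool hasName(const std::vector<std::string>& names, const std::string& name)
{
  return std::find(names.begin(), names.end(), name) != names.end();
}
```

A caller that cannot rely on the error log could call hasName before findIndex.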

◆ getNumberOfEvents()

unsigned int getNumberOfEvents ( ) const
override virtual

Return the number of events from the real dataset.

Implements Dataset.

Definition at line 36 of file Regression.cc.

37{
39}

◆ getNumberOfFeatures()

unsigned int getNumberOfFeatures ( ) const
override virtual

Return the number of features from the real dataset.

Implements Dataset.

Definition at line 31 of file Regression.cc.

32{
34}

◆ getNumberOfSpectators()

unsigned int getNumberOfSpectators ( ) const
override virtual

Return the number of spectators from the real dataset.

Implements Dataset.

Definition at line 41 of file Regression.cc.

42{
44}

◆ getSignalFraction()

float getSignalFraction ( )
virtual inherited

Returns the signal fraction of the whole sample.

Reimplemented in SPlotDataset.

Definition at line 35 of file Dataset.cc.

36 {
37
38 double signal_weight_sum = 0;
39 double weight_sum = 0;
40 for (unsigned int i = 0; i < getNumberOfEvents(); ++i) {
41 loadEvent(i);
42 weight_sum += m_weight;
43 if (m_isSignal)
44 signal_weight_sum += m_weight;
45 }
46 return signal_weight_sum / weight_sum;
47
48 }
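
The loop above computes a weighted fraction: the sum of the signal weights divided by the sum of all weights. A minimal sketch on plain vectors follows, assuming the weights and signal flags have already been collected. weightedSignalFraction is a hypothetical helper, not a Belle II function. Unlike the listing above, it returns 0 when the total weight is zero instead of dividing by zero. That guard is an addition for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: weighted signal fraction over parallel vectors.
float weightedSignalFraction(const std::vector<float>& weights,
                             const std::vector<bool>& isSignal)
{
  assert(weights.size() == isSignal.size());
  double signalWeightSum = 0.0;
  double weightSum = 0.0;
  for (std::size_t i = 0; i < weights.size(); ++i) {
    weightSum += weights[i];
    if (isSignal[i]) {
      signalWeightSum += weights[i];
    }
  }
  if (weightSum == 0.0) {
    return 0.0f; // illustrative guard, not present in the listing above
  }
  return static_cast<float>(signalWeightSum / weightSum);
}
```

With unit weights the result reduces to the plain fraction of signal events.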

◆ getSignals()

std::vector< bool > getSignals ( )
virtual inherited

Returns the isSignal flag of every event.

Reimplemented in ReweightingDataset.

Definition at line 122 of file Dataset.cc.

123 {
124
125 std::vector<bool> result(getNumberOfEvents());
126 for (unsigned int iEvent = 0; iEvent < getNumberOfEvents(); ++iEvent) {
127 loadEvent(iEvent);
128 result[iEvent] = m_isSignal;
129 }
130 return result;
131
132 }

◆ getSpectator()

std::vector< float > getSpectator ( unsigned int  iSpectator)
override virtual

Return a specific spectator from the real dataset.

Reimplemented from Dataset.

Definition at line 51 of file Regression.cc.

52{
53 return m_childDataSet->getSpectator(iSpectator);
54}

◆ getSpectatorIndex()

unsigned int getSpectatorIndex ( const std::string &  spectator)
virtual inherited

Return index of spectator with the given name.

Parameters
spectator: name of the spectator

Definition at line 62 of file Dataset.cc.

63 {
64
65 auto it = std::find(m_general_options.m_spectators.begin(), m_general_options.m_spectators.end(), spectator);
66 if (it == m_general_options.m_spectators.end()) {
67 B2ERROR("Unknown spectator named " << spectator);
68 return 0;
69 }
70 return std::distance(m_general_options.m_spectators.begin(), it);
71
72 }

◆ getTargets()

std::vector< float > getTargets ( )
override virtual

Return the targets from the real dataset.

Reimplemented from Dataset.

Definition at line 61 of file Regression.cc.

62{
63 return m_childDataSet->getTargets();
64}

◆ getWeights()

std::vector< float > getWeights ( )
override virtual

Return the weights from the real dataset.

Reimplemented from Dataset.

Definition at line 56 of file Regression.cc.

57{
58 return m_childDataSet->getWeights();
59}

◆ loadEvent()

void loadEvent ( unsigned int  iEvent)
override virtual

Load an event. Sets all internal variables, then sets the isSignal variable according to the cut value.

Implements Dataset.

Definition at line 21 of file Regression.cc.

22{
29}

Member Data Documentation

◆ m_childDataSet

Dataset* m_childDataSet
private

The real data set (our child)

Definition at line 111 of file Regression.h.

◆ m_cutValue

double m_cutValue
private

The cut value.

Definition at line 108 of file Regression.h.

◆ m_general_options

GeneralOptions m_general_options
inherited

GeneralOptions passed to this dataset.

Definition at line 122 of file Dataset.h.

◆ m_input

std::vector<float> m_input
inherited

Contains all feature values of the currently loaded event.

Definition at line 123 of file Dataset.h.

◆ m_isSignal

bool m_isSignal
inherited

Defines if the currently loaded event is signal or background.

Definition at line 127 of file Dataset.h.

◆ m_spectators

std::vector<float> m_spectators
inherited

Contains all spectator values of the currently loaded event.

Definition at line 124 of file Dataset.h.

◆ m_target

float m_target
inherited

Contains the target value of the currently loaded event.

Definition at line 126 of file Dataset.h.

◆ m_weight

float m_weight
inherited

Contains the weight of the currently loaded event.

Definition at line 125 of file Dataset.h.


The documentation for this class was generated from the following files: