Belle II Software light-2406-ragdoll
MultiDataset Class Reference

Wraps the data of multiple events into a Dataset. More...

#include <Dataset.h>


Public Member Functions

 MultiDataset (const GeneralOptions &general_options, const std::vector< std::vector< float > > &input, const std::vector< std::vector< float > > &spectators, const std::vector< float > &targets={}, const std::vector< float > &weights={})
 Constructs a new MultiDataset.
 
virtual unsigned int getNumberOfFeatures () const override
 Returns the number of features in this dataset.
 
virtual unsigned int getNumberOfSpectators () const override
 Returns the number of spectators in this dataset.
 
virtual unsigned int getNumberOfEvents () const override
 Returns the number of events in this dataset.
 
virtual void loadEvent (unsigned int iEvent) override
 Loads the event with index iEvent: copies its row of the feature matrix into m_input and, where provided, its spectators, target and weight.
 
virtual float getSignalFraction ()
 Returns the signal fraction of the whole sample.
 
virtual unsigned int getFeatureIndex (const std::string &feature)
 Return index of feature with the given name.
 
virtual unsigned int getSpectatorIndex (const std::string &spectator)
 Return index of spectator with the given name.
 
virtual std::vector< float > getFeature (unsigned int iFeature)
 Returns all values of one feature in a std::vector<float>.
 
virtual std::vector< float > getSpectator (unsigned int iSpectator)
 Returns all values of one spectator in a std::vector<float>.
 
virtual std::vector< float > getWeights ()
 Returns all weights.
 
virtual std::vector< float > getTargets ()
 Returns all targets.
 
virtual std::vector< bool > getSignals ()
 Returns the isSignal flag of every event.
 

Public Attributes

GeneralOptions m_general_options
 GeneralOptions passed to this dataset.
 
std::vector< float > m_input
 Contains all feature values of the currently loaded event.
 
std::vector< float > m_spectators
 Contains all spectator values of the currently loaded event.
 
float m_weight
 Contains the weight of the currently loaded event.
 
float m_target
 Contains the target value of the currently loaded event.
 
bool m_isSignal
 Defines if the currently loaded event is signal or background.
 

Private Attributes

std::vector< std::vector< float > > m_matrix
 Feature matrix.
 
std::vector< std::vector< float > > m_spectator_matrix
 Spectator matrix.
 
std::vector< float > m_targets
 target vector
 
std::vector< float > m_weights
 weight vector
 

Detailed Description

Wraps the data of multiple events into a Dataset.

Mostly useful if one wants to apply an Expert to a feature matrix.

Definition at line 186 of file Dataset.h.

Constructor & Destructor Documentation

◆ MultiDataset()

MultiDataset ( const GeneralOptions &  general_options,
const std::vector< std::vector< float > > &  input,
const std::vector< std::vector< float > > &  spectators,
const std::vector< float > &  targets = {},
const std::vector< float > &  weights = {} 
)

Constructs a new MultiDataset.

Parameters
general_options  defines, e.g., the number of variables
input  feature values of all events (one inner vector per event)
spectators  spectator values of all events (one inner vector per event)
targets  target value of each event (defaults to empty, because the target is often unknown when one applies an expert)
weights  weight of each event (defaults to empty)

Definition at line 145 of file Dataset.cc.

147 : Dataset(general_options), m_matrix(input),
148 m_spectator_matrix(spectators),
149 m_targets(targets), m_weights(weights)
150 {
151
152 if (m_targets.size() > 0 and m_matrix.size() != m_targets.size()) {
153 B2ERROR("Feature matrix and target vector need same number of elements in MultiDataset, got " << m_targets.size() << " and " <<
154 m_matrix.size());
155 }
156 if (m_weights.size() > 0 and m_matrix.size() != m_weights.size()) {
157 B2ERROR("Feature matrix and weight vector need same number of elements in MultiDataset, got " << m_weights.size() << " and " <<
158 m_matrix.size());
159 }
160 if (m_spectator_matrix.size() > 0 and m_matrix.size() != m_spectator_matrix.size()) {
161 B2ERROR("Feature matrix and spectator matrix need same number of elements in MultiDataset, got " << m_spectator_matrix.size() <<
162 " and " <<
163 m_matrix.size());
164 }
165 }
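
As listed, the constructor only reports a size mismatch through B2ERROR; nothing in the listing stops construction. The same check can be run on the inputs beforehand. The following is a minimal standalone sketch of that idea: countSizeMismatches is a hypothetical helper, not part of the Belle II software, and it uses only the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of basf2): counts how many of the optional
// per-event containers disagree in length with the feature matrix.
// An empty container is treated as "not provided", as in the constructor.
inline int countSizeMismatches(const std::vector<std::vector<float>>& input,
                               const std::vector<std::vector<float>>& spectators,
                               const std::vector<float>& targets = {},
                               const std::vector<float>& weights = {})
{
  const std::size_t nEvents = input.size();  // one inner vector per event
  int mismatches = 0;
  if (!targets.empty() && targets.size() != nEvents) ++mismatches;
  if (!weights.empty() && weights.size() != nEvents) ++mismatches;
  if (!spectators.empty() && spectators.size() != nEvents) ++mismatches;
  return mismatches;
}
```

A return value of 0 means the three checks in the constructor would pass silently.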
Dataset(const GeneralOptions &general_options)
Constructs a new dataset given the general options.
Definition: Dataset.cc:26

Member Function Documentation

◆ getFeature()

std::vector< float > getFeature ( unsigned int  iFeature)
virtual, inherited

Returns all values of one feature in a std::vector<float>.

Parameters
iFeature  the position of the feature to return

Reimplemented in SingleDataset, SubDataset, CombinedDataset, ROOTDataset, RegressionDataSet, ReweightingDataset, and SidebandDataset.

Definition at line 74 of file Dataset.cc.

75 {
76
77 std::vector<float> result(getNumberOfEvents());
78 for (unsigned int iEvent = 0; iEvent < getNumberOfEvents(); ++iEvent) {
79 loadEvent(iEvent);
80 result[iEvent] = m_input[iFeature];
81 }
82 return result;
83
84 }
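
The inherited implementation calls loadEvent once per event, and in MultiDataset each call copies a whole row of the feature matrix into m_input. When the matrix is at hand, the same column can be read directly. A minimal sketch, assuming a row-per-event matrix like the one passed to the constructor; extractColumn is a hypothetical free function, not part of the Belle II software.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical free function (not part of basf2): reads column iFeature
// from a row-per-event matrix without copying whole rows.
inline std::vector<float> extractColumn(const std::vector<std::vector<float>>& matrix,
                                        std::size_t iFeature)
{
  std::vector<float> column(matrix.size());
  for (std::size_t iEvent = 0; iEvent < matrix.size(); ++iEvent) {
    // The listed getFeature does not bounds-check iFeature; this sketch does.
    assert(iFeature < matrix[iEvent].size());
    column[iEvent] = matrix[iEvent][iFeature];
  }
  return column;
}
```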
virtual unsigned int getNumberOfEvents() const =0
Returns the number of events in this dataset.
virtual void loadEvent(unsigned int iEvent)=0
Load the event number iEvent.

◆ getFeatureIndex()

unsigned int getFeatureIndex ( const std::string &  feature)
virtual, inherited

Return index of feature with the given name.

Parameters
feature  name of the feature

Definition at line 50 of file Dataset.cc.

51 {
52
53 auto it = std::find(m_general_options.m_variables.begin(), m_general_options.m_variables.end(), feature);
54 if (it == m_general_options.m_variables.end()) {
55 B2ERROR("Unknown feature named " << feature);
56 return 0;
57 }
58 return std::distance(m_general_options.m_variables.begin(), it);
59
60 }
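
Note that the listing returns 0 for an unknown name, which a caller cannot tell apart from a valid index 0 except through the logged error. A hedged sketch of an alternative that makes the failure explicit is shown here; findIndex is a hypothetical function, not part of the Belle II software, and it assumes C++17 for std::optional.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <optional>
#include <string>
#include <vector>

// Hypothetical variant (not part of basf2): same std::find lookup as the
// listing, but an unknown name yields std::nullopt instead of index 0.
inline std::optional<unsigned int> findIndex(const std::vector<std::string>& names,
                                             const std::string& name)
{
  auto it = std::find(names.begin(), names.end(), name);
  if (it == names.end()) return std::nullopt;
  return static_cast<unsigned int>(std::distance(names.begin(), it));
}
```

The same pattern would apply to the spectator lookup, which uses an identical search over a different name list.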
std::vector< std::string > m_variables
Vector of all variables (branch names) used in the training.
Definition: Options.h:86

◆ getNumberOfEvents()

virtual unsigned int getNumberOfEvents ( ) const
inline, override, virtual

Returns the number of events in this dataset.

Implements Dataset.

Definition at line 214 of file Dataset.h.

214{ return m_matrix.size(); }

◆ getNumberOfFeatures()

virtual unsigned int getNumberOfFeatures ( ) const
inline, override, virtual

Returns the number of features in this dataset.

Implements Dataset.

Definition at line 204 of file Dataset.h.

204{ return m_input.size(); }

◆ getNumberOfSpectators()

virtual unsigned int getNumberOfSpectators ( ) const
inline, override, virtual

Returns the number of spectators in this dataset.

Implements Dataset.

Definition at line 209 of file Dataset.h.

209{ return m_spectators.size(); }

◆ getSignalFraction()

float getSignalFraction ( )
virtual, inherited

Returns the signal fraction of the whole sample.

Reimplemented in SPlotDataset.

Definition at line 35 of file Dataset.cc.

36 {
37
38 double signal_weight_sum = 0;
39 double weight_sum = 0;
40 for (unsigned int i = 0; i < getNumberOfEvents(); ++i) {
41 loadEvent(i);
42 weight_sum += m_weight;
43 if (m_isSignal)
44 signal_weight_sum += m_weight;
45 }
46 return signal_weight_sum / weight_sum;
47
48 }
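
The value computed above is the weighted signal fraction: the sum of the weights of signal events divided by the sum of all weights. The listing has no guard for a zero weight sum, so an empty dataset would divide by zero. A minimal standalone sketch of the same computation with such a guard follows; signalFraction is a hypothetical function, not part of the Belle II software, and returning 0 in the degenerate case is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical standalone version (not part of basf2) of the weighted
// signal fraction: sum of signal weights divided by sum of all weights.
inline float signalFraction(const std::vector<float>& weights,
                            const std::vector<bool>& isSignal)
{
  assert(weights.size() == isSignal.size());
  double signalWeightSum = 0;  // accumulate in double, as the listing does
  double weightSum = 0;
  for (std::size_t i = 0; i < weights.size(); ++i) {
    weightSum += weights[i];
    if (isSignal[i]) signalWeightSum += weights[i];
  }
  // Assumption of this sketch: return 0 for an empty or zero-weight sample.
  // The listed code has no such guard and would divide by zero.
  if (weightSum == 0) return 0.0f;
  return static_cast<float>(signalWeightSum / weightSum);
}
```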

◆ getSignals()

std::vector< bool > getSignals ( )
virtual, inherited

Returns the isSignal flag of every event.

Reimplemented in ReweightingDataset.

Definition at line 122 of file Dataset.cc.

123 {
124
125 std::vector<bool> result(getNumberOfEvents());
126 for (unsigned int iEvent = 0; iEvent < getNumberOfEvents(); ++iEvent) {
127 loadEvent(iEvent);
128 result[iEvent] = m_isSignal;
129 }
130 return result;
131
132 }

◆ getSpectator()

std::vector< float > getSpectator ( unsigned int  iSpectator)
virtual, inherited

Returns all values of one spectator in a std::vector<float>.

Parameters
iSpectator  the position of the spectator to return

Reimplemented in SingleDataset, SubDataset, CombinedDataset, ROOTDataset, RegressionDataSet, ReweightingDataset, and SidebandDataset.

Definition at line 86 of file Dataset.cc.

87 {
88
89 std::vector<float> result(getNumberOfEvents());
90 for (unsigned int iEvent = 0; iEvent < getNumberOfEvents(); ++iEvent) {
91 loadEvent(iEvent);
92 result[iEvent] = m_spectators[iSpectator];
93 }
94 return result;
95
96 }

◆ getSpectatorIndex()

unsigned int getSpectatorIndex ( const std::string &  spectator)
virtual, inherited

Return index of spectator with the given name.

Parameters
spectator  name of the spectator

Definition at line 62 of file Dataset.cc.

63 {
64
65 auto it = std::find(m_general_options.m_spectators.begin(), m_general_options.m_spectators.end(), spectator);
66 if (it == m_general_options.m_spectators.end()) {
67 B2ERROR("Unknown spectator named " << spectator);
68 return 0;
69 }
70 return std::distance(m_general_options.m_spectators.begin(), it);
71
72 }
std::vector< std::string > m_spectators
Vector of all spectators (branch names) used in the training.
Definition: Options.h:87

◆ getTargets()

std::vector< float > getTargets ( )
virtual, inherited

Returns all targets.

Reimplemented in RegressionDataSet, and ReweightingDataset.

Definition at line 110 of file Dataset.cc.

111 {
112
113 std::vector<float> result(getNumberOfEvents());
114 for (unsigned int iEvent = 0; iEvent < getNumberOfEvents(); ++iEvent) {
115 loadEvent(iEvent);
116 result[iEvent] = m_target;
117 }
118 return result;
119
120 }

◆ getWeights()

std::vector< float > getWeights ( )
virtual, inherited

Returns all weights.

Reimplemented in ROOTDataset, RegressionDataSet, and ReweightingDataset.

Definition at line 98 of file Dataset.cc.

99 {
100
101 std::vector<float> result(getNumberOfEvents());
102 for (unsigned int iEvent = 0; iEvent < getNumberOfEvents(); ++iEvent) {
103 loadEvent(iEvent);
104 result[iEvent] = m_weight;
105 }
106 return result;
107
108 }

◆ loadEvent()

void loadEvent ( unsigned int  iEvent)
override, virtual

Loads the event with index iEvent: copies its row of the feature matrix into m_input and, where provided, its spectators, target and weight.

Implements Dataset.

Definition at line 168 of file Dataset.cc.

169 {
170 m_input = m_matrix[iEvent];
171
172 if (m_spectator_matrix.size() > 0) {
174 }
175
176 if (m_targets.size() > 0) {
177 m_target = m_targets[iEvent];
179 }
180
181 if (m_weights.size() > 0)
182 m_weight = m_weights[iEvent];
183
184 }
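
The listing shows that the optional containers are consulted only when they are non-empty, so fields without data keep their previous value. This behaviour can be sketched outside the class as follows. LoadedEvent and loadRow are hypothetical stand-ins, not part of the Belle II software. The initial values of target and weight are chosen for this sketch only, and the spectator copy is an assumption, because that line does not appear in the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in (not the basf2 class) for the per-event state.
struct LoadedEvent {
  std::vector<float> input;
  std::vector<float> spectators;
  float target = 0.0f;  // initial value chosen for this sketch
  float weight = 1.0f;  // initial value chosen for this sketch
};

// Copies row iEvent into `event`. Empty optional containers are skipped,
// so the corresponding fields keep whatever value they had before.
inline void loadRow(LoadedEvent& event, std::size_t iEvent,
                    const std::vector<std::vector<float>>& matrix,
                    const std::vector<std::vector<float>>& spectatorMatrix,
                    const std::vector<float>& targets,
                    const std::vector<float>& weights)
{
  assert(iEvent < matrix.size());
  event.input = matrix[iEvent];
  // Assumption: the spectator branch copies the matching row; that line
  // is missing from the listing.
  if (!spectatorMatrix.empty()) event.spectators = spectatorMatrix[iEvent];
  if (!targets.empty()) event.target = targets[iEvent];
  if (!weights.empty()) event.weight = weights[iEvent];
}
```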
int m_signal_class
Signal class which is used as signal in a classification problem.
Definition: Options.h:88

Member Data Documentation

◆ m_general_options

GeneralOptions m_general_options
inherited

GeneralOptions passed to this dataset.

Definition at line 122 of file Dataset.h.

◆ m_input

std::vector<float> m_input
inherited

Contains all feature values of the currently loaded event.

Definition at line 123 of file Dataset.h.

◆ m_isSignal

bool m_isSignal
inherited

Defines if the currently loaded event is signal or background.

Definition at line 127 of file Dataset.h.

◆ m_matrix

std::vector<std::vector<float> > m_matrix
private

Feature matrix.

Definition at line 223 of file Dataset.h.

◆ m_spectator_matrix

std::vector<std::vector<float> > m_spectator_matrix
private

Spectator matrix.

Definition at line 224 of file Dataset.h.

◆ m_spectators

std::vector<float> m_spectators
inherited

Contains all spectator values of the currently loaded event.

Definition at line 124 of file Dataset.h.

◆ m_target

float m_target
inherited

Contains the target value of the currently loaded event.

Definition at line 126 of file Dataset.h.

◆ m_targets

std::vector<float> m_targets
private

target vector

Definition at line 225 of file Dataset.h.

◆ m_weight

float m_weight
inherited

Contains the weight of the currently loaded event.

Definition at line 125 of file Dataset.h.

◆ m_weights

std::vector<float> m_weights
private

weight vector

Definition at line 226 of file Dataset.h.


The documentation for this class was generated from the following files: