Belle II Software development
CombinedDataset Class Reference

Wraps two other Datasets, one containing signal events and the other background events. Used by the reweighting method to train MC files against data files.

#include <Dataset.h>

Inheritance diagram for CombinedDataset:
Dataset

Public Member Functions

 CombinedDataset (const GeneralOptions &general_options, Dataset &signal_dataset, Dataset &background_dataset)
 Constructs a new CombinedDataset holding a reference to the wrapped Datasets.
 
virtual unsigned int getNumberOfFeatures () const override
 Returns the number of features in this dataset, i.e. the size of the given subset of the variables.
 
virtual unsigned int getNumberOfSpectators () const override
 Returns the number of spectators in this dataset, i.e. the size of the given subset of the spectators.
 
virtual unsigned int getNumberOfEvents () const override
 Returns the number of events in the wrapped dataset.
 
virtual void loadEvent (unsigned int iEvent) override
 Load the event number iEvent from the wrapped dataset.
 
virtual std::vector< float > getFeature (unsigned int iFeature) override
 Returns all values of one feature as a std::vector<float>: signal values first, followed by background values.
 
virtual std::vector< float > getSpectator (unsigned int iSpectator) override
 Returns all values of one spectator as a std::vector<float>: signal values first, followed by background values.
 
virtual float getSignalFraction ()
 Returns the signal fraction of the whole sample.
 
virtual unsigned int getFeatureIndex (const std::string &feature)
 Return index of feature with the given name.
 
virtual unsigned int getSpectatorIndex (const std::string &spectator)
 Return index of spectator with the given name.
 
virtual std::vector< float > getWeights ()
 Returns all weights.
 
virtual std::vector< float > getTargets ()
 Returns all targets.
 
virtual std::vector< bool > getSignals ()
 Returns the isSignal flag of every event.
 

Public Attributes

GeneralOptions m_general_options
 GeneralOptions passed to this dataset.
 
std::vector< float > m_input
 Contains all feature values of the currently loaded event.
 
std::vector< float > m_spectators
 Contains all spectator values of the currently loaded event.
 
float m_weight
 Contains the weight of the currently loaded event.
 
float m_target
 Contains the target value of the currently loaded event.
 
bool m_isSignal
 Defines if the currently loaded event is signal or background.
 

Private Attributes

Dataset & m_signal_dataset
 Reference to the wrapped dataset containing signal events.
 
Dataset & m_background_dataset
 Reference to the wrapped dataset containing background events.
 

Detailed Description

Wraps two other Datasets, one containing signal events and the other background events. Used by the reweighting method to train MC files against data files.

Definition at line 294 of file Dataset.h.

Constructor & Destructor Documentation

◆ CombinedDataset()

CombinedDataset ( const GeneralOptions & general_options,
                  Dataset & signal_dataset,
                  Dataset & background_dataset
                )

Constructs a new CombinedDataset holding a reference to the wrapped Datasets.

Parameters
general_options     GeneralOptions passed on to the Dataset base-class constructor
signal_dataset      reference to the wrapped Dataset containing signal events
background_dataset  reference to the wrapped Dataset containing background events

Definition at line 273 of file Dataset.cc.

  : Dataset(general_options), m_signal_dataset(signal_dataset),
    m_background_dataset(background_dataset) { }
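
Because the constructor stores references rather than copies, both wrapped datasets have to outlive the CombinedDataset. The following is a minimal sketch of that ownership pattern only; `ToyDataset` and `ToyCombined` are hypothetical stand-ins and not part of the documented classes, and the summed `size()` is an assumption made for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a dataset; NOT the documented Dataset class.
struct ToyDataset {
  std::vector<float> values;
  unsigned int size() const { return static_cast<unsigned int>(values.size()); }
};

// Stores references only, like the constructor above: nothing is copied,
// so both wrapped objects must stay alive while the wrapper is in use.
class ToyCombined {
public:
  ToyCombined(ToyDataset& signal, ToyDataset& background)
    : m_signal(signal), m_background(background) { }

  // Assumption for this sketch: the combined size is the sum of both parts.
  unsigned int size() const { return m_signal.size() + m_background.size(); }

private:
  ToyDataset& m_signal;     // reference to the signal part
  ToyDataset& m_background; // reference to the background part
};
```

A change made to a wrapped object after construction is visible through the wrapper, which is the practical consequence of holding references.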

Member Function Documentation

◆ getFeature()

std::vector< float > getFeature ( unsigned int  iFeature)
override virtual

Returns all values of one feature as a std::vector<float>: the values from the signal dataset first, followed by those from the background dataset.

Parameters
iFeature  the position of the feature to return in the given subset

Reimplemented from Dataset.

Definition at line 296 of file Dataset.cc.

{
  auto s = m_signal_dataset.getFeature(iFeature);
  auto b = m_background_dataset.getFeature(iFeature);
  s.insert(s.end(), b.begin(), b.end());
  return s;
}
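
The concatenation step in the listing can be tried in isolation. This is a sketch with a hypothetical free function, `concatenateColumns`, that is not part of the documented API; it only reproduces the `insert` call on plain vectors.

```cpp
#include <cassert>
#include <vector>

// Appends the background values after the signal values, mirroring the
// s.insert(s.end(), b.begin(), b.end()) call in the listing above.
// The first argument is taken by value so the caller's vector is untouched.
std::vector<float> concatenateColumns(std::vector<float> signal,
                                      const std::vector<float>& background)
{
  // Reserving up front avoids a possible second reallocation during insert.
  signal.reserve(signal.size() + background.size());
  signal.insert(signal.end(), background.begin(), background.end());
  return signal;
}
```

In the result, an entry at index i below the signal size comes from the signal input, and every later entry comes from the background input.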

◆ getFeatureIndex()

unsigned int getFeatureIndex ( const std::string &  feature)
virtual inherited

Return index of feature with the given name.

Parameters
feature  name of the feature

Definition at line 50 of file Dataset.cc.

{
  auto it = std::find(m_general_options.m_variables.begin(), m_general_options.m_variables.end(), feature);
  if (it == m_general_options.m_variables.end()) {
    B2ERROR("Unknown feature named " << feature);
    return 0;
  }
  return std::distance(m_general_options.m_variables.begin(), it);
}
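
Note that the listing returns 0 for an unknown name, which is also the valid index of the first feature, so callers cannot tell the two cases apart from the return value alone. A caller that needs the distinction could check the name list itself. The sketch below uses a hypothetical helper, `findIndex`, and a hypothetical result type, `IndexLookup`; neither exists in the documented API.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical result type: 'found' tells a real index 0 from "not found".
struct IndexLookup {
  bool found;
  unsigned int index;
};

// Same std::find / std::distance lookup as in the listing above, but the
// failure case is reported explicitly instead of being folded into index 0.
IndexLookup findIndex(const std::vector<std::string>& names, const std::string& name)
{
  auto it = std::find(names.begin(), names.end(), name);
  if (it == names.end()) {
    return {false, 0u};
  }
  return {true, static_cast<unsigned int>(std::distance(names.begin(), it))};
}
```

For a list such as `{"p", "pt", "E"}` (illustrative names only), looking up `"p"` yields a found index of 0, while an unknown name yields `found == false`.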

◆ getNumberOfEvents()

virtual unsigned int getNumberOfEvents ( ) const
inline override virtual

Returns the number of events in the wrapped dataset.

Implements Dataset.

Definition at line 318 of file Dataset.h.


◆ getNumberOfFeatures()

virtual unsigned int getNumberOfFeatures ( ) const
inline override virtual

Returns the number of features in this dataset, i.e. the size of the given subset of the variables.

Implements Dataset.

Definition at line 308 of file Dataset.h.


◆ getNumberOfSpectators()

virtual unsigned int getNumberOfSpectators ( ) const
inline override virtual

Returns the number of spectators in this dataset, i.e. the size of the given subset of the spectators.

Implements Dataset.

Definition at line 313 of file Dataset.h.


◆ getSignalFraction()

float getSignalFraction ( )
virtual inherited

Returns the signal fraction of the whole sample.

Reimplemented in SPlotDataset.

Definition at line 35 of file Dataset.cc.

{
  double signal_weight_sum = 0;
  double weight_sum = 0;
  for (unsigned int i = 0; i < getNumberOfEvents(); ++i) {
    loadEvent(i);
    weight_sum += m_weight;
    if (m_isSignal)
      signal_weight_sum += m_weight;
  }
  return signal_weight_sum / weight_sum;
}
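
The quantity computed in the listing is the weighted signal fraction: the sum of the weights of signal events divided by the sum of all weights. The sketch below computes the same ratio on plain vectors. The function name `signalFraction` is hypothetical, and the guard against a zero weight sum is an addition that the listing above does not have.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Weighted signal fraction: sum of signal weights over sum of all weights.
double signalFraction(const std::vector<float>& weights,
                      const std::vector<bool>& isSignal)
{
  assert(weights.size() == isSignal.size());
  double signalWeightSum = 0.0;
  double weightSum = 0.0;
  for (std::size_t i = 0; i < weights.size(); ++i) {
    weightSum += weights[i];
    if (isSignal[i]) {
      signalWeightSum += weights[i];
    }
  }
  // Addition for this sketch only: avoid dividing by zero when the sample
  // is empty or the weights sum to zero.
  return weightSum != 0.0 ? signalWeightSum / weightSum : 0.0;
}
```

With weights 1, 1, 2 and flags signal, background, signal, the result is (1 + 2) / 4 = 0.75.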

◆ getSignals()

std::vector< bool > getSignals ( )
virtual inherited

Returns the isSignal flag of every event.

Reimplemented in ReweightingDataset.

Definition at line 122 of file Dataset.cc.

{
  std::vector<bool> result(getNumberOfEvents());
  for (unsigned int iEvent = 0; iEvent < getNumberOfEvents(); ++iEvent) {
    loadEvent(iEvent);
    result[iEvent] = m_isSignal;
  }
  return result;
}

◆ getSpectator()

std::vector< float > getSpectator ( unsigned int  iSpectator)
override virtual

Returns all values of one spectator as a std::vector<float>: the values from the signal dataset first, followed by those from the background dataset.

Parameters
iSpectator  the position of the spectator to return in the given subset

Reimplemented from Dataset.

Definition at line 306 of file Dataset.cc.

{
  auto s = m_signal_dataset.getSpectator(iSpectator);
  auto b = m_background_dataset.getSpectator(iSpectator);
  s.insert(s.end(), b.begin(), b.end());
  return s;
}

◆ getSpectatorIndex()

unsigned int getSpectatorIndex ( const std::string &  spectator)
virtual inherited

Return index of spectator with the given name.

Parameters
spectator  name of the spectator

Definition at line 62 of file Dataset.cc.

{
  auto it = std::find(m_general_options.m_spectators.begin(), m_general_options.m_spectators.end(), spectator);
  if (it == m_general_options.m_spectators.end()) {
    B2ERROR("Unknown spectator named " << spectator);
    return 0;
  }
  return std::distance(m_general_options.m_spectators.begin(), it);
}

◆ getTargets()

std::vector< float > getTargets ( )
virtual inherited

Returns all targets.

Reimplemented in RegressionDataSet, and ReweightingDataset.

Definition at line 110 of file Dataset.cc.

{
  std::vector<float> result(getNumberOfEvents());
  for (unsigned int iEvent = 0; iEvent < getNumberOfEvents(); ++iEvent) {
    loadEvent(iEvent);
    result[iEvent] = m_target;
  }
  return result;
}

◆ getWeights()

std::vector< float > getWeights ( )
virtual inherited

Returns all weights.

Reimplemented in ROOTDataset, RegressionDataSet, and ReweightingDataset.

Definition at line 98 of file Dataset.cc.

{
  std::vector<float> result(getNumberOfEvents());
  for (unsigned int iEvent = 0; iEvent < getNumberOfEvents(); ++iEvent) {
    loadEvent(iEvent);
    result[iEvent] = m_weight;
  }
  return result;
}
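
getWeights(), getTargets() and getSignals() share one pattern: load each event in turn, then copy a member of the currently loaded event into the result. The sketch below reproduces that pattern with a hypothetical stand-in class, `ToyEvents`, which is not the documented Dataset. It also makes a side effect visible: in this toy version, the last event stays loaded after the call.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in, NOT the documented Dataset: loadEvent() copies one
// event's weight into a public member, like the per-event interface above.
class ToyEvents {
public:
  explicit ToyEvents(std::vector<float> weights) : m_all(std::move(weights)) { }

  unsigned int getNumberOfEvents() const
  {
    return static_cast<unsigned int>(m_all.size());
  }

  void loadEvent(unsigned int iEvent)
  {
    assert(iEvent < m_all.size());
    m_weight = m_all[iEvent];
  }

  // Same loop shape as the listing: load each event, then read the member.
  std::vector<float> getWeights()
  {
    const unsigned int n = getNumberOfEvents();
    std::vector<float> result(n);
    for (unsigned int iEvent = 0; iEvent < n; ++iEvent) {
      loadEvent(iEvent);
      result[iEvent] = m_weight;
    }
    return result;
  }

  float m_weight = 0.0f; // weight of the currently loaded event

private:
  std::vector<float> m_all; // all weights, one per event
};
```

Code that relies on a particular event being loaded should therefore call loadEvent() again after using one of these bulk accessors.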

◆ loadEvent()

void loadEvent ( unsigned int  iEvent)
override virtual

Load the event number iEvent from the wrapped dataset.

Parameters
iEvent  event number to load

Implements Dataset.

Definition at line 277 of file Dataset.cc.

{
  if (iEvent < m_signal_dataset.getNumberOfEvents()) {
    // ...
    m_target = 1.0;
    m_isSignal = true;
    // ...
  } else {
    // ...
    m_target = 0.0;
    m_isSignal = false;
    // ...
  }
}
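
The visible part of the listing shows that event numbers below the signal dataset's event count are treated as signal (target 1.0) and all others as background (target 0.0). The sketch below illustrates that index split. `EventLocation` and `locateEvent` are hypothetical names, and the background offset `iEvent - nSignal` is an assumption, since the lines that forward the call to the wrapped datasets are not shown above.

```cpp
#include <cassert>

// Hypothetical result type for this sketch.
struct EventLocation {
  bool isSignal;           // true if the event belongs to the signal part
  unsigned int localIndex; // index inside the wrapped dataset
};

// Maps a combined event number onto one of the two wrapped datasets.
// Assumption: background events are numbered directly after signal events,
// so the local background index is iEvent - nSignal.
EventLocation locateEvent(unsigned int iEvent, unsigned int nSignal)
{
  if (iEvent < nSignal) {
    return {true, iEvent};
  }
  return {false, iEvent - nSignal};
}
```

With two signal events, combined index 1 is the second signal event and combined index 2 is the first background event.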

Member Data Documentation

◆ m_background_dataset

Dataset& m_background_dataset
private

Reference to the wrapped dataset containing background events.

Definition at line 340 of file Dataset.h.

◆ m_general_options

GeneralOptions m_general_options
inherited

GeneralOptions passed to this dataset.

Definition at line 122 of file Dataset.h.

◆ m_input

std::vector<float> m_input
inherited

Contains all feature values of the currently loaded event.

Definition at line 123 of file Dataset.h.

◆ m_isSignal

bool m_isSignal
inherited

Defines if the currently loaded event is signal or background.

Definition at line 127 of file Dataset.h.

◆ m_signal_dataset

Dataset& m_signal_dataset
private

Reference to the wrapped dataset containing signal events.

Definition at line 339 of file Dataset.h.

◆ m_spectators

std::vector<float> m_spectators
inherited

Contains all spectator values of the currently loaded event.

Definition at line 124 of file Dataset.h.

◆ m_target

float m_target
inherited

Contains the target value of the currently loaded event.

Definition at line 126 of file Dataset.h.

◆ m_weight

float m_weight
inherited

Contains the weight of the currently loaded event.

Definition at line 125 of file Dataset.h.


The documentation for this class was generated from the following files: