Belle II Software development
RawDataCollectedMinMax Class Reference

takes care of collecting raw data while staying below a RAM threshold.

#include <RawDataCollectedMinMax.h>

Public Member Functions

 RawDataCollectedMinMax (unsigned expectedSize, std::pair< double, double > quantiles, unsigned maxSizeThreshold=100000)
 constructor. quantiles is the pair [min, max], both values in the range 0-1; choose min close to 0 and max close to 1.
 
void add (double newVal)
 adds value to collector.
 
unsigned getSampleSize () const
 returns current sample size (which is not the actual size of the container).
 
std::pair< double, double > getMinMax ()
 returns current best estimates for min and max cuts.
 

Protected Attributes

unsigned m_currentSize
 the current size of the data sample.
 
unsigned m_fillIntermediateThreshold
 an internal threshold taking care of collecting intermediate results during sample collection
 
std::pair< double, double > m_minMaxQuantiles
 the quantiles to be collected in the end (defined in [0;1])
 
std::vector< std::pair< double, double > > m_intermediateValues
 collects intermediate results (min/max cut estimates) if the expected size is too big.
 
MinMaxCollector< double > m_collector
 collects raw data in a RAM-saving way.
 

Detailed Description

takes care of collecting raw data while staying below a RAM threshold.

Definition at line 27 of file RawDataCollectedMinMax.h.

Constructor & Destructor Documentation

◆ RawDataCollectedMinMax()

RawDataCollectedMinMax ( unsigned  expectedSize,
std::pair< double, double >  quantiles,
unsigned  maxSizeThreshold = 100000 
)
inline

constructor. quantiles is the pair [min, max], both values in the range 0-1; choose min close to 0 and max close to 1.

Definition at line 37 of file RawDataCollectedMinMax.h.

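RawDataCollectedMinMax(unsigned expectedSize,
                       std::pair<double, double> quantiles,
                       unsigned maxSizeThreshold = 100000) :
  m_currentSize(0), // initializer not shown in the extracted listing; a start value of 0 is assumed
  m_fillIntermediateThreshold(std::numeric_limits<unsigned>::max()),
  m_minMaxQuantiles(quantiles),
  // the collector is set up with twice the larger of the two tail fractions
  m_collector((quantiles.first > (1. - quantiles.second) ? quantiles.first * 2. : (1. - quantiles.second) * 2.))
{
  // abort if expectedSize exceeds 0.05 * maxSizeThreshold^2
  if (double(expectedSize) / (double(maxSizeThreshold) * 0.05) > double(maxSizeThreshold))
  { B2FATAL("RawDataCollectedMinMax: expected data to big, can not execute!"); }

  // intermediate results are only collected if the sample is expected to exceed maxSizeThreshold
  if (maxSizeThreshold < expectedSize) {
    m_fillIntermediateThreshold = maxSizeThreshold / 10;
  }
}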
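
The two checks in the constructor can be worked through with concrete numbers. The following is a minimal sketch, not part of the class: the helper names `exceedsSizeLimit` and `intermediateThresholdFor` are hypothetical and only mirror the expressions shown in the listing above.

```cpp
#include <limits>

// Hypothetical helper mirroring the constructor's size check:
// true if expectedSize / (0.05 * maxSizeThreshold) > maxSizeThreshold,
// i.e. if expectedSize is larger than 0.05 * maxSizeThreshold^2.
bool exceedsSizeLimit(unsigned expectedSize, unsigned maxSizeThreshold = 100000)
{
  return double(expectedSize) / (double(maxSizeThreshold) * 0.05) > double(maxSizeThreshold);
}

// Hypothetical helper mirroring how m_fillIntermediateThreshold is chosen:
// a tenth of maxSizeThreshold if the sample is expected to exceed it,
// otherwise the largest unsigned value (no intermediate results).
unsigned intermediateThresholdFor(unsigned expectedSize, unsigned maxSizeThreshold = 100000)
{
  if (maxSizeThreshold < expectedSize) {
    return maxSizeThreshold / 10;
  }
  return std::numeric_limits<unsigned>::max();
}
```

With the default maxSizeThreshold of 100000, an expected size of 1000000 passes the check and yields an intermediate threshold of 10000, whereas an expected size of 600000000 would trigger the fatal error.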

Member Function Documentation

◆ add()

void add ( double  newVal)
inline

adds value to collector.

Definition at line 54 of file RawDataCollectedMinMax.h.

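void add(double newVal)
{
  m_collector.append(newVal);
  m_currentSize++; // not shown in the extracted listing; assumed

  // if threshold reached, collect results and fill into intermediate value-container:
  if (m_collector.totalSize() > m_fillIntermediateThreshold) { // condition not shown in the extracted listing; assumed
    std::pair<double, double> results = m_collector.getMinMax(m_minMaxQuantiles.first, m_minMaxQuantiles.second);
    m_intermediateValues.push_back(std::move(results));
    m_collector.clear(); // not shown in the extracted listing; assumed
  }
}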
Referenced members of MinMaxCollector< DataType >:

std::pair< DataType, DataType > getMinMax (DataType minQuantile=0., DataType maxQuantile=1.) const
 returns the cuts (min, max) corresponding to the given pair of quantiles.
 
unsigned totalSize () const
 returns the combined size of the containers storing the values.
 
void append (DataType newVal)
 appends a new value.
 
void clear ()
 deletes all values collected so far and resets to constructor-settings.
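
The chunking idea behind add() can be illustrated in isolation. The following is a minimal sketch under stated assumptions: `ChunkedQuantileSketch` is a hypothetical stand-in, it stores every value of the current chunk in a plain `std::vector` instead of using MinMaxCollector, and its quantile estimate is a simple rank lookup rather than the collector's own method.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in: buffers values and, once a chunk is full,
// stores the chunk's quantile cuts and discards the raw values.
struct ChunkedQuantileSketch {
  std::size_t chunkSize;                                  // values per chunk
  std::pair<double, double> quantiles;                    // (min, max) quantiles in [0, 1]
  std::vector<double> buffer;                             // raw values of the current chunk
  std::vector<std::pair<double, double>> intermediate;    // one (min, max) cut pair per finished chunk
  unsigned sampleSize = 0;                                // total number of values added

  ChunkedQuantileSketch(std::size_t size, std::pair<double, double> q)
    : chunkSize(size), quantiles(q) {}

  // Simple rank-based quantile on a copy of the values (values must not be empty).
  static double quantile(std::vector<double> values, double q)
  {
    const std::size_t idx = static_cast<std::size_t>(q * double(values.size() - 1));
    std::nth_element(values.begin(), values.begin() + idx, values.end());
    return values[idx];
  }

  void add(double newVal)
  {
    buffer.push_back(newVal);
    ++sampleSize;
    // chunk full: keep only the cut estimates, free the raw values
    if (buffer.size() >= chunkSize) {
      intermediate.emplace_back(quantile(buffer, quantiles.first),
                                quantile(buffer, quantiles.second));
      buffer.clear();
    }
  }
};
```

Memory stays bounded by the chunk size plus one pair of doubles per finished chunk, which is the trade-off the class description refers to: the final cuts become estimates built from per-chunk results rather than exact quantiles of the full sample.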

◆ getMinMax()

std::pair< double, double > getMinMax ( )
inline

returns current best estimates for min and max cuts.

Definition at line 71 of file RawDataCollectedMinMax.h.

std::pair<double, double> getMinMax()
{
  if (m_intermediateValues.empty()) {
    // return statement not shown in the extracted listing; assumed to query the collector directly
    return m_collector.getMinMax(m_minMaxQuantiles.first, m_minMaxQuantiles.second);
  }

  // issue: m_collector-sample could be too small and therefore distort results for small intermediateValue-samples. Therefore neglect m_collector for that case.
  if (m_intermediateValues.size() == 1) {
    return { m_intermediateValues.at(0).first, m_intermediateValues.at(0).second};
  }
  if (m_intermediateValues.size() == 2) {
    // two intermediate results: return their mean
    return {
      0.5 * (m_intermediateValues.at(0).first + m_intermediateValues.at(1).first),
      0.5 * (m_intermediateValues.at(0).second + m_intermediateValues.at(1).second) };
  }

  // three or more intermediate results: add the collector's current estimate as one more entry
  if (!m_collector.empty()) {
    std::pair<double, double> results = m_collector.getMinMax(m_minMaxQuantiles.first, m_minMaxQuantiles.second);
    m_intermediateValues.push_back(results);
  }

  // take the middle entry (median) of the min cuts and of the max cuts independently
  unsigned index = std::floor(double(m_intermediateValues.size()) * 0.5);
  double min, max;

  std::sort(m_intermediateValues.begin(), m_intermediateValues.end(),
            [](const std::pair<double, double>& a, const std::pair<double, double>& b) -> bool { return a.first < b.first; });
  min = m_intermediateValues.at(index).first;

  std::sort(m_intermediateValues.begin(), m_intermediateValues.end(),
            [](const std::pair<double, double>& a, const std::pair<double, double>& b) -> bool { return a.second < b.second; });
  max = m_intermediateValues.at(index).second;

  return {min, max};
}
Referenced members of MinMaxCollector< DataType >:

bool empty () const
 returns whether the internal containers are empty.
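
The final step of getMinMax(), picking the middle entry of the min cuts and of the max cuts separately, can be sketched as a free function. This is an illustration only: `combineIntermediate` is a hypothetical name, and the sketch covers just the median step, not the special cases for zero, one or two intermediate results.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: combines per-chunk (min, max) cut estimates by taking
// the middle element of the sorted min cuts and of the sorted max cuts.
// Precondition: values must not be empty.
std::pair<double, double> combineIntermediate(std::vector<std::pair<double, double>> values)
{
  const std::size_t index = values.size() / 2; // upper median for even sizes

  std::sort(values.begin(), values.end(),
            [](const std::pair<double, double>& a, const std::pair<double, double>& b) { return a.first < b.first; });
  const double min = values.at(index).first;

  std::sort(values.begin(), values.end(),
            [](const std::pair<double, double>& a, const std::pair<double, double>& b) { return a.second < b.second; });
  const double max = values.at(index).second;

  return {min, max};
}
```

Because the two sorts are independent, the returned min and max may come from different chunks. Using the median rather than the mean keeps a single outlier chunk from shifting the combined cuts.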

◆ getSampleSize()

unsigned getSampleSize ( ) const
inline

returns current sample size (which is not the actual size of the container).

Definition at line 68 of file RawDataCollectedMinMax.h.

{ return m_currentSize; }

Member Data Documentation

◆ m_collector

MinMaxCollector<double> m_collector
protected

collects raw data in a RAM-saving way.

Definition at line 33 of file RawDataCollectedMinMax.h.

◆ m_currentSize

unsigned m_currentSize
protected

the current size of the data sample.

Definition at line 29 of file RawDataCollectedMinMax.h.

◆ m_fillIntermediateThreshold

unsigned m_fillIntermediateThreshold
protected

an internal threshold taking care of collecting intermediate results during sample collection

Definition at line 30 of file RawDataCollectedMinMax.h.

◆ m_intermediateValues

std::vector<std::pair<double, double> > m_intermediateValues
protected

collects intermediate results (min/max cut estimates) if the expected size is too big.

Definition at line 32 of file RawDataCollectedMinMax.h.

◆ m_minMaxQuantiles

std::pair<double, double> m_minMaxQuantiles
protected

the quantiles to be collected in the end (defined in [0;1])

Definition at line 31 of file RawDataCollectedMinMax.h.


The documentation for this class was generated from the following file:
RawDataCollectedMinMax.h