Belle II Software development
RawDataCollectedMinMax Class Reference

Collects raw data while staying below a RAM threshold.

#include <RawDataCollectedMinMax.h>

Public Member Functions

 RawDataCollectedMinMax (unsigned expectedSize, std::pair< double, double > quantiles, unsigned maxSizeThreshold=100000)
 Constructor. Pass the quantiles as a pair [min, max] with min close to 0 and max close to 1 (both within the range 0 to 1).
 
void add (double newVal)
 adds value to collector.
 
unsigned getSampleSize () const
 returns the current sample size, i.e. the number of values added so far (which is not the actual size of the container).
 
std::pair< double, double > getMinMax ()
 returns current best estimates for min and max cuts.
 

Protected Attributes

MinMaxCollector< double > m_collector
 collects raw data in a RAM-saving way.
 
std::pair< double, double > m_minMaxQuantiles
 the quantiles to be collected in the end (defined in [0;1])
 
std::vector< std::pair< double, double > > m_intermediateValues
 collects intermediate min/max results if the expected size is too big.
 
unsigned m_currentSize
 the current size of the data sample.
 
unsigned m_fillIntermediateThreshold
 an internal threshold: once the collector holds more values than this, its current result is stored as an intermediate value and the collector is cleared.
 

Detailed Description

Collects raw data while staying below a RAM threshold.

Definition at line 27 of file RawDataCollectedMinMax.h.

Constructor & Destructor Documentation

◆ RawDataCollectedMinMax()

RawDataCollectedMinMax ( unsigned expectedSize,
std::pair< double, double > quantiles,
unsigned maxSizeThreshold = 100000 )
inline

Constructor. Pass the quantiles as a pair [min, max] with min close to 0 and max close to 1 (both within the range 0 to 1).

Definition at line 37 of file RawDataCollectedMinMax.h.

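    RawDataCollectedMinMax(unsigned expectedSize,
                           std::pair<double, double> quantiles,
                           unsigned maxSizeThreshold = 100000) :
      // the collector receives twice the larger of the two tails (lower: first, upper: 1 - second)
      m_collector((quantiles.first > (1. - quantiles.second) ? quantiles.first * 2. : (1. - quantiles.second) * 2.)),
      m_minMaxQuantiles(quantiles),
      m_currentSize(0),
      m_fillIntermediateThreshold(std::numeric_limits<unsigned>::max())
    {
      // abort if the expected sample is too large to be handled at all
      if (double(expectedSize) / (double(maxSizeThreshold) * 0.05) > double(maxSizeThreshold))
      { B2FATAL("RawDataCollectedMinMax: expected data to big, can not execute!"); }

      // large samples are collected in chunks; intermediate results are stored per chunk
      if (maxSizeThreshold < expectedSize) {
        m_fillIntermediateThreshold = maxSizeThreshold / 10;
      }
    }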
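
The size checks in the constructor are easier to follow with numbers. With the default maxSizeThreshold of 100000, the B2FATAL condition reduces to expectedSize / 5000 > 100000, i.e. an expected size above 5 * 10^8, and any expected size above 100000 sets the intermediate threshold to 10000. The following minimal sketch reproduces this arithmetic in standalone C++. The function names are hypothetical and not part of the Belle II software, and reading the value handed to m_collector as "twice the larger tail" is an interpretation of the listing above, not a documented contract.

```cpp
#include <cassert>
#include <limits>
#include <utility>

// Hypothetical helpers; they only mirror the arithmetic of the constructor.

// Value passed to the m_collector constructor: twice the larger of the two tails.
double collectorQuantileCut(std::pair<double, double> quantiles)
{
  assert(quantiles.first >= 0. && quantiles.second <= 1. && quantiles.first <= quantiles.second);
  const double lowerTail = quantiles.first;
  const double upperTail = 1. - quantiles.second;
  return 2. * (lowerTail > upperTail ? lowerTail : upperTail);
}

// True if the constructor would call B2FATAL for these sizes.
bool expectedSizeTooBig(unsigned expectedSize, unsigned maxSizeThreshold = 100000)
{
  return double(expectedSize) / (double(maxSizeThreshold) * 0.05) > double(maxSizeThreshold);
}

// Value that m_fillIntermediateThreshold would end up with.
unsigned fillIntermediateThreshold(unsigned expectedSize, unsigned maxSizeThreshold = 100000)
{
  if (maxSizeThreshold < expectedSize) return maxSizeThreshold / 10;
  return std::numeric_limits<unsigned>::max(); // effectively "never flush"
}
```

For example, an expected size of 1000000 passes the fatal check and yields an intermediate threshold of 10000, while an expected size of 50000 leaves the threshold at its maximum, so no intermediate results are ever stored.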

Member Function Documentation

◆ add()

void add ( double newVal)
inline

adds value to collector.

Definition at line 54 of file RawDataCollectedMinMax.h.

    void add(double newVal)
    {
      m_collector.append(newVal);
      m_currentSize++;

      // if threshold reached, collect results and fill into intermediate value-container:
      if (m_collector.totalSize() > m_fillIntermediateThreshold) {
        std::pair<double, double> results = m_collector.getMinMax(m_minMaxQuantiles.first, m_minMaxQuantiles.second);
        m_intermediateValues.push_back(std::move(results));
        m_collector.clear();
      }
    }
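
The flushing pattern of add() can be tried out in isolation. The sketch below is a minimal standalone illustration, not the Belle II implementation: SimpleCollector is a hypothetical stand-in for MinMaxCollector that stores every value and sorts on demand (so it saves no memory), and ChunkedCollector is a hypothetical name for a stripped-down version of this class. Only the method names append, totalSize, clear and getMinMax are taken from the listing above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for MinMaxCollector (assumption): keeps all values and
// computes the requested quantiles by sorting a copy.
struct SimpleCollector {
  std::vector<double> values;

  void append(double v) { values.push_back(v); }
  std::size_t totalSize() const { return values.size(); }
  void clear() { values.clear(); }

  std::pair<double, double> getMinMax(double minQuantile, double maxQuantile) const
  {
    assert(!values.empty());
    std::vector<double> sorted(values);
    std::sort(sorted.begin(), sorted.end());
    const double last = double(sorted.size() - 1);
    return { sorted[std::size_t(minQuantile * last)], sorted[std::size_t(maxQuantile * last)] };
  }
};

// Hypothetical reduced version of the class, showing only the flushing logic of add().
struct ChunkedCollector {
  SimpleCollector collector;
  std::pair<double, double> quantiles;
  unsigned fillThreshold;
  unsigned currentSize = 0;
  std::vector<std::pair<double, double>> intermediateValues;

  ChunkedCollector(std::pair<double, double> q, unsigned threshold) :
    quantiles(q), fillThreshold(threshold) {}

  void add(double newVal)
  {
    collector.append(newVal);
    ++currentSize;
    // Flush once the buffer holds MORE than fillThreshold values (strict comparison).
    if (collector.totalSize() > fillThreshold) {
      intermediateValues.push_back(collector.getMinMax(quantiles.first, quantiles.second));
      collector.clear();
    }
  }
};
```

Because the comparison is strict, each flushed chunk contains one value more than the threshold. Feeding the values 1 to 10 into a ChunkedCollector with quantiles (0, 1) and a threshold of 3 stores the two intermediate results (1, 4) and (5, 8) and leaves 9 and 10 in the buffer, while currentSize counts all ten values.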

◆ getMinMax()

std::pair< double, double > getMinMax ( )
inline

returns current best estimates for min and max cuts.

Definition at line 71 of file RawDataCollectedMinMax.h.

    std::pair<double, double> getMinMax()
    {
      // no intermediate results stored: ask the collector directly
      if (m_intermediateValues.empty()) {
        return m_collector.getMinMax(m_minMaxQuantiles.first, m_minMaxQuantiles.second);
      }

      // issue: m_collector-sample could be too small and therefore distort results for small intermediateValue-samples. Therefore neglect m_collector for that case.
      if (m_intermediateValues.size() == 1) {
        return { m_intermediateValues.at(0).first, m_intermediateValues.at(0).second};
      }
      if (m_intermediateValues.size() == 2) {
        return {
          0.5 * (m_intermediateValues.at(0).first + m_intermediateValues.at(1).first),
          0.5 * (m_intermediateValues.at(0).second + m_intermediateValues.at(1).second) };
      }

      // three or more intermediate results: include the current collector content as one more entry
      if (!m_collector.empty()) {
        std::pair<double, double> results = m_collector.getMinMax(m_minMaxQuantiles.first, m_minMaxQuantiles.second);
        m_intermediateValues.push_back(results);
      }

      // take the median entry separately for the min and for the max values
      unsigned index = std::floor(double(m_intermediateValues.size()) * 0.5);
      double min, max;

      std::sort(m_intermediateValues.begin(), m_intermediateValues.end(),
                [](const std::pair<double, double>& a, const std::pair<double, double>& b) -> bool { return a.first < b.first; });
      min = m_intermediateValues.at(index).first;

      std::sort(m_intermediateValues.begin(), m_intermediateValues.end(),
                [](const std::pair<double, double>& a, const std::pair<double, double>& b) -> bool { return a.second < b.second; });
      max = m_intermediateValues.at(index).second;

      return {min, max};
    }
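
The way getMinMax() combines the intermediate results (single entry, mean of two entries, or element at index floor(size / 2) after sorting) can be sketched as a free function. This is a hedged illustration only: combineIntermediate and the alias MinMax are hypothetical names, the function expects a non-empty list that already contains any remainder from the collector, and it uses std::nth_element instead of the two full std::sort calls in the listing above, which selects the same element without fully ordering the vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using MinMax = std::pair<double, double>;

// Hypothetical helper mirroring the combination step of getMinMax().
// The vector is taken by value so the caller's data is not reordered.
MinMax combineIntermediate(std::vector<MinMax> values)
{
  assert(!values.empty());

  if (values.size() == 1) return values[0];

  if (values.size() == 2) {
    return { 0.5 * (values[0].first + values[1].first),
             0.5 * (values[0].second + values[1].second) };
  }

  // Integer division gives the same index as floor(size * 0.5).
  const std::size_t index = values.size() / 2;

  // Place the entry that a full sort by .first would put at `index`.
  std::nth_element(values.begin(), values.begin() + index, values.end(),
                   [](const MinMax& a, const MinMax& b) { return a.first < b.first; });
  const double min = values[index].first;

  // Same selection, now ordered by .second.
  std::nth_element(values.begin(), values.begin() + index, values.end(),
                   [](const MinMax& a, const MinMax& b) { return a.second < b.second; });
  const double max = values[index].second;

  return {min, max};
}
```

For an even number of entries this picks the upper of the two middle elements rather than their mean. Note also that in the listing above the remainder of m_collector is pushed into m_intermediateValues without clearing the collector, so repeated calls to getMinMax() appear to append the same remainder estimate more than once; the sketch avoids this by working on a copy.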

◆ getSampleSize()

unsigned getSampleSize ( ) const
inline

returns the current sample size, i.e. the number of values added so far (which is not the actual size of the container).

Definition at line 68 of file RawDataCollectedMinMax.h.

    unsigned getSampleSize() const { return m_currentSize; }

Member Data Documentation

◆ m_collector

MinMaxCollector<double> m_collector
protected

collects raw data in a RAM-saving way.

Definition at line 29 of file RawDataCollectedMinMax.h.

◆ m_currentSize

unsigned m_currentSize
protected

the current size of the data sample.

Definition at line 32 of file RawDataCollectedMinMax.h.

◆ m_fillIntermediateThreshold

unsigned m_fillIntermediateThreshold
protected

an internal threshold: once the collector holds more values than this, its current result is stored as an intermediate value and the collector is cleared.

Definition at line 33 of file RawDataCollectedMinMax.h.

◆ m_intermediateValues

std::vector<std::pair<double, double> > m_intermediateValues
protected

collects intermediate min/max results if the expected size is too big.

Definition at line 31 of file RawDataCollectedMinMax.h.

◆ m_minMaxQuantiles

std::pair<double, double> m_minMaxQuantiles
protected

the quantiles to be collected in the end (defined in [0;1])

Definition at line 30 of file RawDataCollectedMinMax.h.


The documentation for this class was generated from the following file: