Lifetime 2D Fit with BAT
This directory contains an example of how to use BAT for a 2D fit. To run this code you will need BAT installed; see the installation instructions.
The template for creating these examples was generated via the command-line tool provided with BAT:
bat-project Lifetime2DFit Lifetime2DFit
This creates the directory Lifetime2DFit; the header and source files for a model class, Lifetime2DFit; the source code for a program to instantiate and run the model, runLifetime2DFit.cxx; and a Makefile for compiling everything.
To compile the code in this example, run
make
The Data

The data to be analyzed in this example is stored in a file in a directory above this one, at ../../example-data/lifetime.root. This ROOT file contains a TTree (DtoKpipi0) with branches named Dz_t and Dz_t_err, containing the measured decay time and decay-time uncertainty for D0 mesons reconstructed via their decay to K pi pi0. This data is entirely signal with no background.
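If you want a quick look at the data before fitting, a minimal ROOT macro like the following will do; the file path and tree name come from above, everything else is illustrative:

#include <TFile.h>
#include <TTree.h>

void inspect_lifetime_data()
{
    // open the example data file and list the tree's branches,
    // including Dz_t and Dz_t_err
    TFile* file = TFile::Open("../../example-data/lifetime.root");
    TTree* tree = file->Get<TTree>("DtoKpipi0");
    tree->Print();
}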
The Model
We perform a fit to the data to learn the D0 lifetime, \(\tau\). We model the two-dimensional data distribution with the likelihood

\[\mathcal{L}(\tau, \vec\nu) = \prod_i P(t_i, \sigma_i \,|\, \tau, \vec\nu),\]
where \(i\) runs over the data points, each containing a decay time, \(t\), and an uncertainty on it, \(\sigma\); and \(\vec\nu\) is a vector of nuisance parameters that help describe our data distribution. The 2D likelihood function for an individual data point is

\[P(t, \sigma \,|\, \tau, \vec\nu) = P(t \,|\, \sigma; \tau, \vec\nu)\, P(\sigma \,|\, \vec\nu).\]
It is the product of a probability density for \(t\) given \(\sigma\) and a probability density for \(\sigma\). The probability for \(t\) is the convolution of the standard decay-time distribution, \(\exp(-t'/\tau)\), with a normal distribution for the true decay time \(t'\) given the measured \(t\) and \(\sigma\):

\[P(t \,|\, \sigma; \tau, b, s) \propto \int_0^{\infty} e^{-t'/\tau} \exp\!\left(-\frac{(t - b - t')^2}{2(s\sigma)^2}\right) \mathrm{d}t',\]
where \(b\) is a bias (offset) in \(t\) and \(s\) is a scaling of the projected uncertainty on \(t\). The integral evaluates to

\[P(t \,|\, \sigma; \tau, b, s) \propto \exp\!\left(\frac{(s\sigma)^2}{2\tau^2} - \frac{t - b}{\tau}\right) \mathrm{erfc}\!\left(\frac{s\sigma}{\sqrt{2}\,\tau} - \frac{t - b}{\sqrt{2}\,s\sigma}\right),\]
where erfc is the complementary error function; and the full function is normalized for fixed \(\sigma\) over the range of \(t\) in the data.
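For concreteness, here is a minimal sketch of this density in code. It is not the implementation shipped with the example, and the normalization over the range of \(t\) is omitted:

#include <cmath>

// convolution of exp(-t'/tau) with a Gaussian of width s*sigma and bias b,
// in the closed form given above; normalization over the t range omitted
double decay_time_density(double t, double sigma, double tau, double b, double s)
{
    const double ss = s * sigma;  // scaled uncertainty
    const double x  = t - b;      // bias-corrected decay time
    return 0.5 / tau
         * std::exp(ss * ss / (2 * tau * tau) - x / tau)
         * std::erfc(ss / (std::sqrt(2.) * tau) - x / (std::sqrt(2.) * ss));
}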
The probability for \(\sigma\) is a Johnson's \(S_U\) distribution:

\[P(\sigma \,|\, \mu, \lambda, \gamma, \delta) \propto \frac{\delta}{\lambda\sqrt{1 + z^2}} \exp\!\left(-\frac{1}{2}\left(\gamma + \delta \sinh^{-1} z\right)^2\right), \qquad z = \frac{\sigma - \mu}{\lambda},\]
which is normalized over the range of \(\sigma\) in the data.
These functions and their integrals (and some of their logarithms) are contained in Functions.h.
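For orientation, the (unnormalized) log of the Johnson's \(S_U\) density might look like the following; the name and signature here are assumptions, not the actual contents of Functions.h:

#include <cmath>

// unnormalized log of the Johnson's S_U density given above,
// dropping the constant term -log(sqrt(2 pi))
double log_johnson_su(double sigma, double mu, double lambda, double gamma, double delta)
{
    const double z = (sigma - mu) / lambda;
    const double u = gamma + delta * std::asinh(z);
    return std::log(delta / lambda) - 0.5 * std::log1p(z * z) - 0.5 * u * u;
}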
The Code
We code the above model into our class Lifetime2DFit. We will store the data in a vector of data points, each data point an array of two doubles. We will pass this data set to our constructor along with a level of parallelization (to speed up our likelihood), as sketched below.
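A hypothetical declaration consistent with this description (the actual header may differ):

#include <BAT/BCModel.h>
#include <array>
#include <string>
#include <vector>

class Lifetime2DFit : public BCModel {
public:
    // data: one {t, sigma} pair per event; nthreads: level of parallelization
    Lifetime2DFit(const std::string& name,
                  const std::vector<std::array<double, 2>>& data,
                  unsigned nthreads);

    double LogLikelihood(const std::vector<double>& pars) override;

private:
    std::vector<std::array<double, 2>> _Data; // hypothetical member name
    unsigned _NThreads;
};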
Inside the constructor (in Lifetime2DFit.cxx), we define the parameters of our model:
// define lifetime parameter
AddParameter("tau", 390e-3, 430e-3, "#tau", "[ps]");
// define decay time shape parameters
AddParameter("b", -10e-3, 10e-3, "bias", "[ps]");
AddParameter("s", 1.2, 1.45, "#sigma scaling");
AddParameter("mu", 25e-3, 33e-3, "#mu", "[ps]");
AddParameter("lambda", 7e-3, 12e-3, "#lambda", "[ps]");
AddParameter("gamma", -3.5, -2.5, "#gamma");
AddParameter("delta", 1.26, 1.56, "#delta");
For each parameter, we give it a name, lower and upper limits, a “pretty” name (in ROOT TeX) for plotting, and a units string for plotting, if applicable. After defining our parameters, we must set priors for them. For simplicity we will take flat priors for all the parameters:
SetPriorConstantAll();
In Lifetime2DFit::LogLikelihood(...) we will loop over all the data points and add their individual log-likelihoods to the total. To help this, we define the lambda function add_ll that calculates the log-likelihood for a data point (with the unnormalized Johnson's \(S_U\) function) and adds it into the sum of log-likelihoods.
We define the lambda function sum_ll, which sums up the log-likelihood for a block of data. We then divide the data set into _NThreads disjoint blocks and perform their sums in parallel using C++'s built-in threading system, as sketched below.
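The following is a minimal, standalone sketch of that parallel-sum pattern, not the actual code in Lifetime2DFit.cxx; the per-point function is only a placeholder for the add_ll step:

#include <algorithm>
#include <array>
#include <functional>
#include <thread>
#include <vector>

// placeholder for the per-point log-likelihood (the add_ll step)
static double log_likelihood_point(const std::array<double, 2>& p,
                                   const std::vector<double>& pars)
{
    return -p[0] / pars[0]; // stand-in only
}

double parallel_log_likelihood(const std::vector<std::array<double, 2>>& data,
                               const std::vector<double>& pars, unsigned nthreads)
{
    std::vector<double> partial(nthreads, 0.);

    // sum_ll: sums the log-likelihood over a block [begin, end) of the data
    auto sum_ll = [&](size_t begin, size_t end, double& sum) {
        for (size_t i = begin; i < end; ++i)
            sum += log_likelihood_point(data[i], pars);
    };

    // divide the data into nthreads disjoint blocks and sum them in parallel
    const size_t block = data.size() / nthreads + 1;
    std::vector<std::thread> threads;
    for (unsigned n = 0; n < nthreads; ++n) {
        const size_t begin = std::min<size_t>(n * block, data.size());
        const size_t end = std::min(begin + block, data.size());
        threads.emplace_back(sum_ll, begin, end, std::ref(partial[n]));
    }
    for (auto& t : threads)
        t.join();

    double ll = 0.;
    for (double p : partial)
        ll += p;
    return ll;
}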
Finally, we need to normalize our Johnson's \(S_U\) function by dividing the density for each data point by its normalization factor; that is, by subtracting the log of the integral from each data point's log-likelihood.
In runLifetime2DFit.cxx we load our data from the TTree in the file and instantiate our model with that data:
Lifetime2DFit m("Lifetime2DFit", data, 8)
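The loading step that fills data might look roughly like this; the branch names come from the Data section, while the reading details are an assumption:

#include <TFile.h>
#include <TTreeReader.h>
#include <TTreeReaderValue.h>
#include <array>
#include <vector>

// inside main(), before constructing the model:
std::vector<std::array<double, 2>> data;
TFile* file = TFile::Open("../../example-data/lifetime.root");
TTreeReader reader("DtoKpipi0", file);
TTreeReaderValue<double> t(reader, "Dz_t");        // assumed stored as doubles
TTreeReaderValue<double> t_err(reader, "Dz_t_err");
while (reader.Next())
    data.push_back({*t, *t_err});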
In the constructor call we have also specified that the likelihood be calculated using eight-fold parallelization. We then set our desired precision:
m.SetPrecision(BCEngineMCMC::kMedium);
(you can use kQuick if you are impatient). Then we run our analysis, sampling from our posterior and improving the best-fit result with Minuit:
m.MarginalizeAll();
m.FindMode();
And finally print the results to files and to the screen:
m.PrintAllMarginalized(...);
m.PrintSummary();
The Output
The output plots show the posterior probability distributions for the individual parameters in 1D and for pairs of parameters in 2D. The (textual) output summary gives information on the best-fit point and the individual marginalizations. For more details consult the documentation in the Simple1DFit example.
The member function DrawFit(...) in Lifetime2DFit draws the data set as a 2D histogram and the fit in projections of \(t\) and \(\sigma\) at the best-fit parameter point. It also gives the marginalized mean and standard deviation for \(\tau\) and a \(\chi^2/\mathrm{ndf}\) for the data as binned in the displayed 2D histogram.
Prior Selection
In principle, our prior information on the lifetime is the current state of knowledge concerning it: the PDG value. We could set the prior to that knowledge by uncommenting the relevant highlighted code in runLifetime2DFit.cxx, most importantly
m.GetParameter("tau").SetPrior(new BCGaussianPrior(0.4101, 0.0015));
If you also uncomment the block of code that prints the knowledge update, you can see how our data has changed our knowledge from the prior to the posterior. When we run the fit with a flat prior, the resulting posterior for \(\tau\) is twice as wide as the prior we have just set, so we should expect only a small improvement from the prior to the posterior. This is indeed what we see, and you can visually inspect it in Lifetime2DFit_update.pdf.
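The uncommented block presumably resembles the following; PrintKnowledgeUpdatePlots is BAT's standard mechanism for prior-vs-posterior comparison plots, but the exact lines here are a guess:

// set the PDG-based prior on tau (shown above) ...
m.GetParameter("tau").SetPrior(new BCGaussianPrior(0.4101, 0.0015));
// ... and, after running, print prior-vs-posterior comparison plots
m.PrintKnowledgeUpdatePlots(m.GetSafeName() + "_update.pdf");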
Caution: if you want your result to be averaged with the results in the PDG (that is, included in the PDG itself), you should not give it a prior based on those results, and should instead opt for the flat prior (or report both in your analysis).