3.4.1. The basics.#
The Belle II software is called basf2, an abbreviation of “Belle II Analysis Software Framework”. You may also see “BASF2” or “Basf2” in some outdated documentation, but the official spelling is basf2, using only lower-case letters. You might wonder why we didn’t choose “b2asf”, and when you get a bit further you will probably wonder why it has “analysis” in the name (it does much more than analysis). The answer is historical: Belle had BASF, we have basf2.
basf2 is used in all aspects of the data-processing chain at Belle II:
generating simulated data,
unpacking of real raw data,
reconstruction (tracking, clustering, …),
and high-level “analysis” reconstruction (such as applying cuts, vertex-fitting, …).
basf2 is not normally used for the final analysis steps (histogramming, fitting 1D distributions, …). These final steps are usually called the “offline” analysis and will be covered in the later Offline analysis lessons.
There is a citable reference for basf2:
Kuhr, T. et al. Comput Softw Big Sci 3, 1 (2019) https://doi.org/10.1007/s41781-018-0017-9
… and a logo.
Fig. 3.19 The basf2 logo.#
Pragmatically, you will encounter two separate objects named basf2. It is both a command-line executable which you can invoke, and a python module from which you import functions.
You will soon be running commands that look like:
basf2 myScript -i myInputFile.root
… and inside the scripts you might see code like:
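As a minimal sketch (the variable name mypath is only illustrative, and the availability check is an addition so that the snippet also runs outside a Belle II environment):

```python
import importlib.util

# basf2 can only be imported once a release has been set up.
HAVE_BASF2 = importlib.util.find_spec("basf2") is not None

if HAVE_BASF2:
    import basf2

    # an empty path, ready to have basf2 modules added to it
    mypath = basf2.Path()
else:
    print("basf2 is not available: set up a release first")
```

In a real script you would normally just write the import and the basf2.Path() line directly.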
Core concepts#
There are some concepts we use in basf2, which you will definitely need to understand. These are:
basf2 module,
path,
package,
steering script / steering file.
Most of the other jargon terms we use are generic software development terms (so you can search the internet). A good place to look for Belle II-specific jargon is the Belle II Glossary.
Exercise
Find the Belle II Glossary (again).
Hint
Solution
basf2 modules#
A basf2 module is a piece of (usually) C++ code that does a specific “unit” of data processing. The full documentation can be found here in this website under the section Modules and Paths.
Warning
It is an unfortunate clash of naming that python uses the word “module” for a separate concept. In these tutorials we will always specify python module (and basf2 module) if there is ambiguity.
Path#
A basf2 path is an ordered list of modules that will be used to process the data. You can think of building a path by adding modules in a chain. It is a python object: basf2.Path.
Warning
A common misconception is that adding modules to a path is processing data. This is not true: adding modules only prepares your path for data processing. The event loop starts when you process your path.
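The difference between building a path and processing it can be illustrated with a toy model in plain python. This is not how basf2 is implemented: ToyPath, toy_process, and the two "modules" are hypothetical names invented for this sketch.

```python
class ToyPath:
    """A toy stand-in for a path: an ordered list of processing steps."""

    def __init__(self):
        self.modules = []

    def add_module(self, module):
        # Adding a module only records it; no event is touched here.
        self.modules.append(module)


def toy_process(path, events):
    """The toy event loop: every event passes through every module in order."""
    processed = []
    for event in events:
        for module in path.modules:
            event = module(event)
        processed.append(event)
    return processed


calls = []  # records each time a module actually sees an event


def double(event):
    calls.append("double")
    return 2 * event


def add_one(event):
    calls.append("add_one")
    return event + 1


path = ToyPath()
path.add_module(double)
path.add_module(add_one)
print(len(calls))  # 0: nothing has been processed yet

results = toy_process(path, events=[1, 2, 3])
print(results)     # [3, 5, 7]
print(len(calls))  # 6: two modules ran on each of the three events
```

The order in which the modules were added is the order in which every event sees them.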
Exercise
Find a diagram of a path with modules in this documentation.
Hint
Solution
Package#
A package is a logical collection of code in basf2. A typical package has several modules and some python scripts which configure paths to do common things.
You will encounter some basf2 packages in these lessons. We try to give them meaningful names (tracking, reconstruction, …) or name the package after the subdetector that they are related to (ecl, klm, cdc, top, …).
During these lessons, you will mostly interact with the analysis package. You will meet this at the end of this lesson.
Exercise
Find the source code and find a list of all packages.
Hint
Solution
Steering#
A steering file or a steering script is some python code that sets up some analysis or data-processing task. A typical steering file will declare a basf2.Path, configure basf2 modules, and then add them to the path. Then it will call basf2.process and maybe print some information.
We use the word “steering” since no real data processing is done in python.
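The typical structure described above can be sketched as follows. The module names EventInfoSetter and EventInfoPrinter and the parameter evtNumList are not introduced in this lesson; they are assumed here as harmless examples, so check them with b2help-modules in your release. The function wrapper and its name are also only for illustration: a real steering file usually has these lines at the top level.

```python
def run_minimal_steering():
    """Declare a path, add configured modules, then process the path."""
    # Imported here so that defining the function works anywhere;
    # calling it requires a basf2 release to be set up.
    import basf2

    main = basf2.Path()

    # Configure a module and add it to the path: generate 10 empty events.
    main.add_module("EventInfoSetter", evtNumList=[10])
    # Print the event information as the events are processed.
    main.add_module("EventInfoPrinter")

    # Only now does the event loop start.
    basf2.process(main)

    # Print some information about what was run.
    print(basf2.statistics)
    return main
```

All of the python lines only configure and start the job; the work on each event is done inside the modules.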
Fig. 3.20 The C++ and python logos.#
Question
Why do we use both C++ and python?
Solution
Databases#
There are a couple more concepts that you might come across:
the conditions database
and the run database.
For these lessons and exercises you should not need to know too much but it’s good to be aware of the jargon.
See also
“rundb” in the glossary (no link this time, you should have it bookmarked!)
Key points
basf2 is the name of the Belle II software.
You work in basf2 by adding modules to a path.
Most basf2 modules are written in C++.
Data-processing happens when you process the path.
You do all of this configuration (of the path, etc.) in python, in a steering file.
You can navigate this online documentation.
Tip
After you’ve progressed a bit more through these lessons, you should revisit the Modules and Paths documentation page and reread the opening paragraphs.
By that stage everything should be clear.
Getting started, and getting help interactively#
Now let’s set up the environment, actually execute basf2, and navigate the command line help.
Please ssh onto your favourite site. If you do not have a preference, you should connect to login.cc.kek.jp.
Before we start though…
You shouldn’t need to install anything#
A common misconception among newcomers (and even among senior people in the collaboration) is that you need to “install” basf2 or “install a release”.
It is possible to install from scratch, but you almost certainly do not want or need to do this. At KEK (for certain) and at many other sites, basf2 is available preinstalled. It is distributed via CVMFS (the CernVM File System), which is mounted at /cvmfs.
b2setup#
To set up your environment to work with basf2 you first have to source the setup script…
source /cvmfs/belle.cern.ch/tools/b2setup
Some people like to put an alias to the setup script in their .profile (or .bashrc, .zshrc, …) file. You are welcome to do this if you like.
So now you have a Belle II environment. You might have noticed that you still don’t have the basf2 executable:
$ source /cvmfs/belle.cern.ch/tools/b2setup
Belle II software tools set up at: /cvmfs/belle.cern.ch/tools
$ basf2
command not found: basf2
Note: we only used the $ character to distinguish the commands from the expected output; it should not be typed.
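Whether the executable is available can also be checked from python with the standard library. This is a small sketch; the helper name find_executable is hypothetical and not part of the Belle II tools.

```python
import shutil


def find_executable(name="basf2"):
    """Return the full path of an executable on PATH, or None if it is absent."""
    return shutil.which(name)


location = find_executable("basf2")
if location is None:
    print("basf2 not found: no release has been set up yet")
else:
    print(f"basf2 found at {location}")
```

Before a release is set up you should see the "not found" message, matching the shell output above.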
In order to get the basf2 executable you need to choose a release (a specific version of the software). If you don’t know what release you want, you should take the latest stable full release or the latest light release (see below).
There is a command-line tool to help with this. Try:
b2help-releases --help
To set up the release of your choice, simply call b2setup again with the name of your release. Since you’ve already set up the environment, the b2setup executable itself is already in your PATH (that means we don’t need the full path /cvmfs/.../b2setup anymore):
b2setup <your choice of release>
See also
If you already know what release you want, you can do the first and second step in one go:
source /cvmfs/belle.cern.ch/tools/b2setup <your choice of release>
Note that if you set up an unsupported, old, or strange release you should see a warning:
$ b2setup release-01-02-09
Environment setup for release: release-01-02-09
Central release directory : /cvmfs/belle.cern.ch/el7/releases/release-01-02-09
Warning: The release release-01-02-09 is not supported any more. Please update to ...
Sometimes people have a good reason to use an old release, but you should know that you will get limited help and support if you are using a very old version.
You also expose yourself to strange bugs that will not be fixed in your version (because they are fixed in some later release).
It is also true that using the latest supported release makes you cool.
Exercise
There is a detailed page in this documentation describing the differences between a full release and a light release, and there is also a Belle II question on the topic. Find both.
Hint
Solution
Question
What is semantic versioning?
Hint
Solution
Question
If you have code that worked in release-AA-00-00 will it work in release-AA-01-00?
Solution
Question
If you have code that worked in release-AA-00-00 will it work in release-BB-00-00?
Solution
Question
If you have code that worked in light-5501-future will it work in light-5602-reallyfarfuture?
Solution
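The reasoning behind the two full-release questions can be sketched in python. This assumes that full release names follow the pattern release-<major>-<minor>-<patch> and that the semantic versioning rule is applied strictly; the function names and the release numbers in the example are hypothetical. Light release names do not fit this pattern, so the sketch deliberately rejects them.

```python
import re

# Assumed naming pattern for full releases: release-<major>-<minor>-<patch>
FULL_RELEASE = re.compile(r"^release-(\d+)-(\d+)-(\d+)$")


def parse_full_release(name):
    """Split a full release name into (major, minor, patch) integers."""
    match = FULL_RELEASE.match(name)
    if match is None:
        raise ValueError(f"not a full release name: {name!r}")
    return tuple(int(part) for part in match.groups())


def expected_to_still_work(written_for, running_on):
    """Apply the semantic versioning rule to two full release names."""
    old = parse_full_release(written_for)
    new = parse_full_release(running_on)
    same_major = old[0] == new[0]  # a new major version may break things
    not_older = new >= old         # older versions may lack newer features
    return same_major and not_older


print(expected_to_still_work("release-05-00-00", "release-05-01-00"))  # True
print(expected_to_still_work("release-05-00-00", "release-06-00-00"))  # False
```

The result is only an expectation based on the version numbers, not a test of your code.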
Exercise
Typically there are two supported full releases. What are they?
Hint
Solution
Exercise
Find the source code for the recommended full release.
Hint
Solution
A useful command#
If you’re ever stuck and are writing a post on questions.belle2.org or an email to an expert, they will always want to know what version you are using.
Try
basf2 --info
to check everything was set up correctly. If that worked, then paste the information at the bottom (after the ascii art) into any correspondence with experts.
Help at the command line#
There are quite a lot of standard python tools/ways to get you help at the command line or in an interactive environment. The Belle II environment supports pydoc3.
Try:
pydoc3 basf2.Path
You should notice that this is the same documentation that you will find by clicking on: basf2.Path here in this online documentation.
In addition, there are some basf2-specific commands.
Listing the basf2 modules#
To find information about a basf2 module, try:
b2help-modules # this lists all of them
b2help-modules | grep "Particle"
b2help-modules ParticleCombiner
Listing the basf2 variables#
In the next lessons, you will need to refer to physics quantities in plain text format. basf2 defines many variables for you. These variables are collected in something called the VariableManager.
To check the list of basf2 variables known to the VariableManager, run
b2help-variables
b2help-variables | grep "invariant"
There is a Variables section in this documentation which you might find more helpful than the big dump.
Listing the modular analysis convenience functions#
We have a python module full of useful shorthand functions which configure basf2 modules in the recommended way. It is called modularAnalysis. More on this later.
For now, you can list them all with:
basf2 modularAnalysis.py
basf2 particles#
Sometimes you will need to write particles’ names in plain text format. basf2 adopts the convention used by ROOT, the PDG, EvtGen, …
To show information about all the particles and properties known to basf2, there is a tool b2help-particles.
b2help-particles --pdg 313 # how should I write the K*(892)?
b2help-particles B_s # what was the pdg code of the B-sub-s meson again?
b2help-particles Sigma_b- # I've forgotten the mass of the Sigma_b- !
b2help-particles Upsi # partial names are accepted
#b2help-particles # lists them all (this is a lot of output)
Note
In the next lesson you will need to use these names.
Question
What was the luminosity collected in experiment 8?
Hint
Another hint
Are you sure you really need another hint?
Solution
Other useful features#
If you just execute basf2 without any arguments, you will start an IPython session with many basf2 functions imported. Try just:
basf2
In your IPython session, you can try the basf2 python interface to the PDG database:
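A sketch of what such a lookup can look like, wrapped in a function so that it can be pasted anywhere. It assumes the pdg python module shipped with basf2 and its pdg.get function, which is expected to return a ROOT TParticlePDG object; treat the exact calls as an assumption and check pydoc3 pdg in your release. The function name describe_particle is hypothetical.

```python
def describe_particle(name_or_code):
    """Look up a particle by name or PDG code and return (name, mass, code)."""
    # Imported inside the function: this only works in a basf2 environment.
    import pdg

    particle = pdg.get(name_or_code)
    return particle.GetName(), particle.Mass(), particle.PdgCode()
```

In the interactive session you could then try, for example, describe_particle(11) and compare the result with the output of b2help-particles --pdg 11.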
You should also make use of IPython’s built-in documentation features.
In [4]: import modularAnalysis
In [5]: modularAnalysis.reconstructDecay?
In [6]: # the question mark brings up the function documentation in IPython
In [7]: print(dir(modularAnalysis)) # the python dir() function will also show you all functions' names
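Note that dir() also shows imported names and private helpers. A small sketch using the standard inspect module narrows the list down to the public functions defined in the module itself. The helper name public_functions is hypothetical, and the standard-library json module stands in for modularAnalysis so that the example runs anywhere.

```python
import inspect
import json  # stands in for modularAnalysis so the example runs anywhere


def public_functions(module):
    """Return the sorted names of the public functions defined in a module."""
    return sorted(
        name
        for name, obj in inspect.getmembers(module, inspect.isfunction)
        if not name.startswith("_") and obj.__module__ == module.__name__
    )


print(public_functions(json))
```

Inside a basf2 environment you would pass modularAnalysis instead of json.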
You can remind yourself of the documentation for a basf2.Path in yet another way:
In [8]: import basf2
In [9]: basf2.Path?
In [10]: # the question mark brings up the function documentation in IPython
In [11]: # this is equivalent to...
In [12]: help(basf2.Path)
To leave interactive basf2 / IPython, simply:
In [13]: # exit()
In [14]: # ... or just
In [15]: exit
Other useful things in your environment#
You might notice that setting up the basf2 environment means that you also have tools like ROOT, and (an up-to-date version of) git.
These come via the Belle II externals. We call software “external” if it is not specific to Belle II but used by basf2.
See also
If you are interested, you can browse the list of everything included in the externals in this README file.
Some python packages that are useful for final offline analysis are also included in the externals for your convenience. These are tools such as numpy and pandas. You will meet them in the Offline analysis lessons.
Key points
b2setup sets up the environment.
You need to set up a specific release and you should try to keep up-to-date: use b2help-releases and b2setup <choose a release>.
b2help-particles lists the particles known to basf2 and their properties.
basf2 has a python interface. You can use python tools to find help.
basf2 without any arguments gets you into a basf2-flavoured IPython shell.
The basf2 analysis package#
The analysis package of basf2 contains python functions and C++ basf2 modules to help you perform your specific analysis on reconstructed dataobjects. It will probably become your favourite package.
The collection of “reconstructed dataobjects” is actually a well-defined list. You will hear people call these “mdst dataobjects”. The “mdst” is both a file-format and another basf2 package containing the post-reconstruction dataobjects.
Exercise
Find the documentation for the analysis package and read the first two sections.
Hint
Solution
Exercise
Find a list of mdst dataobjects.
Solution
See also
“mdst” in the glossary
Earlier we asked some questions about code backward-compatibility. We can now take a brief diversion into the second kind of backward-compatibility that is guaranteed in the software.
Mdst backward-compatibility is guaranteed for the last two major releases.
See also
The confluence page Software Backward Compatibility
Question
If you have an mdst file that was created in release-AA-00-00 will you be able to open it with release-BB-00-00?
Solution
Question
If you have an mdst file from the latest MC campaign, will you be able to open it with the latest light release?
Solution
You will use mdst data files in the next lesson.
Let’s get back to thinking about the reconstructed dataobjects. An important point to understand is that the analysis package interprets collections of these dataobjects as particle candidates.
In brief:
A track (with or without a cluster and with or without PID information) is interpreted as a charged particle (e, μ, π, K, or p).
A cluster with no track in close vicinity is interpreted as a photon or a K_L^0.
Two or more of the above particles can be combined to make composite particle candidates. For example:
Two photons can be combined to create π^0 candidates.
Two tracks can be combined to create K_S^0 candidates.
… And so on.
In fact, the analysis package mostly operates on ParticleLists. A ParticleList is just the list of all such particle candidates in each event. In the next lesson you will make your own particle lists and use analysis package tools to manipulate them.
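The idea of combining candidates from one list into composite candidates can be illustrated with a toy example in plain python, here for neutral pion candidates built from pairs of photons. This is not how the analysis package is implemented: the four-vectors, the mass window, and the function names are all invented for this sketch.

```python
import itertools
import math

PI0_MASS = 0.135  # GeV, approximate neutral pion mass


def invariant_mass(*four_vectors):
    """Invariant mass of the summed four-vectors, each given as (E, px, py, pz)."""
    energy, px, py, pz = (sum(component) for component in zip(*four_vectors))
    mass_squared = energy**2 - (px**2 + py**2 + pz**2)
    return math.sqrt(max(mass_squared, 0.0))


def combine_pairs(candidates, mass_low, mass_high):
    """Build every two-body combination and keep those inside the mass window."""
    composites = []
    for first, second in itertools.combinations(candidates, 2):
        mass = invariant_mass(first, second)
        if mass_low < mass < mass_high:
            composites.append((first, second, mass))
    return composites


# A toy "photon list" for one event: massless four-vectors in GeV.
photons = [
    (0.0675, 0.0675, 0.0, 0.0),
    (0.0675, -0.0675, 0.0, 0.0),
    (1.0, 0.0, 1.0, 0.0),
]

pi0_candidates = combine_pairs(photons, PI0_MASS - 0.015, PI0_MASS + 0.015)
print(len(pi0_candidates))  # 1: only the back-to-back pair is near the pi0 mass
```

Three photons give three possible pairs, but only one pair survives the mass window, so the toy "pi0 list" for this event has a single candidate.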
Making your life easier#
The suggested configuration of the analysis package basf2 modules is usually done for you in so-called “convenience functions”. This is certainly true for all the modules needed in these lessons.
The python module containing these functions is called modularAnalysis. You have already met the modularAnalysis convenience functions earlier in this lesson: Listing the modular analysis convenience functions.
You are encouraged to look at the source code for the modularAnalysis convenience functions that you find yourself using often.
In pseudo-python you will see they are very often of the form:
import basf2


def doAnAnalysisTask(foo, path):
    """
    A meaningful and clear docstring. Sometimes quite long-winded.
    Occasionally longer than the active code in the function.
    Details all of the function inputs...

    Parameters:
        foo (bar): some input argument
        path (basf2.Path): modules are added to this path
    """
    # register a module ("AnalysisTaskModule" is a placeholder name)...
    this_module = basf2.register_module("AnalysisTaskModule")
    # configure the parameters (placeholder parameter name)...
    this_module.param("someModuleParameter", foo)
    # add it to the path...
    path.add_module(this_module)
Question
What is the ParticleCombiner module? What does it do?
Hint
Solution
Exercise
Find the modularAnalysis convenience function that wraps the ParticleCombiner module. Read the function.
Solution
Congratulations! You are now ready to write your first steering file. Good luck.
See also
While the next sections will help you to understand the basics of steering files step by step, there are also some complete examples for steering files in the main software repository. You might want to take a look there after the starterkit.
Author of this lesson
Sam Cunliffe