9. Belle II File Format

The official standard Belle II file format is the mini data-summary table (mdst).

The mdst package of basf2 defines this file format. An mdst file contains a curated list of post-reconstruction dataobjects provided for analysis use. These dataobjects matter because they are the only information available to high-level analysis.

These mdst dataobjects are optimised for minimal disk size per event. This is important because the Belle II experiment will collect around \(5\times10^{10}\) events, so file size is a critical consideration.
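To make the scale concrete, here is a back-of-the-envelope sketch. Only the event count comes from the text above; the per-event sizes are hypothetical placeholders chosen to show the scaling, not measured mdst sizes.

```python
N_EVENTS = 5 * 10**10  # approximate number of events quoted above


def total_size_terabytes(bytes_per_event, n_events=N_EVENTS):
    """Total dataset size in TB (10**12 bytes) for a given average event size."""
    return bytes_per_event * n_events / 10**12


# Hypothetical average event sizes in bytes, for illustration only.
for size in (5_000, 10_000, 20_000):
    print(f"{size} B/event -> {total_size_terabytes(size):.0f} TB")
```

With this event count, each additional kilobyte per event adds roughly 50 TB to the total.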

9.1. Python interface

mdst.add_mdst_dump(path, print_untested=False)

Add a PrintObjectsModule to a path for printing the mDST content.

Parameters
  • path (basf2.Path) – Path to add module to

  • print_untested (bool) – If True, print the names of all methods that are not explicitly printed, so that newly added members are not missed
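A minimal sketch of how this function might be used in a steering file, assuming a working basf2 environment. The helper name build_dump_path and the input file name are hypothetical. Reading the file with the RootInput module and its inputFileName parameter is an assumption about the surrounding setup, not something this page specifies.

```python
def build_dump_path(input_file, print_untested=False):
    """Return a path that reads `input_file` and prints its mDST content.

    The imports are local because they only resolve inside a basf2 environment.
    """
    import basf2
    import mdst

    path = basf2.Path()
    # Assumption: the file is read with the RootInput module.
    path.add_module("RootInput", inputFileName=input_file)
    # Adds the printing module described above.
    mdst.add_mdst_dump(path, print_untested=print_untested)
    return path
```

In a steering file the returned path would then be handed to basf2.process, for example basf2.process(build_dump_path("mdst.root")).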

mdst.add_mdst_output(path, mc=True, filename='mdst.root', additionalBranches=[], dataDescription=None)

Add the mDST output module to a path. This function defines the mDST data format.

Parameters
  • path (basf2.Path) – Path to add module to

  • mc (bool) – Save Monte Carlo quantities? (MCParticles and corresponding relations)

  • filename (str) – Output file name.

  • additionalBranches (list) – Additional objects/arrays of event durability to save

  • dataDescription (dict or None) – Additional key->value pairs to be added as data description fields to the output FileMetaData
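The parameters above can be collected and passed on as in the following sketch. The helper names, the branch name and the description key are hypothetical and only illustrate the call. The keyword names are those of the signature documented here, and a basf2 environment is assumed for add_output.

```python
def mdst_output_options(is_mc, filename="mdst.root", extra_branches=(), description=None):
    """Collect keyword arguments for mdst.add_mdst_output (hypothetical helper)."""
    return {
        "mc": bool(is_mc),
        "filename": filename,
        "additionalBranches": list(extra_branches),
        "dataDescription": dict(description) if description else None,
    }


def add_output(path, **options):
    """Add the mDST output module to `path`; needs a basf2 environment."""
    import mdst  # local import: only available inside basf2

    mdst.add_mdst_output(path, **options)
    return path


# Example options for real data: no Monte Carlo quantities are saved.
options = mdst_output_options(
    is_mc=False,
    filename="data_mdst.root",
    extra_branches=["MyObjects"],       # hypothetical branch name
    description={"campaign": "test"},   # hypothetical description field
)
```

Building the options separately keeps the steering logic testable without basf2; add_output(path, **options) is then the only step that needs the framework.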

9.2. C++ dataobjects

The post-reconstruction dataobjects are C++ classes found in the mdst dataobjects directory:

ls $BELLE2_RELEASE_DIR/mdst/dataobjects/include

They are also described in the C++ doxygen documentation.
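The same directory listing can be produced from Python, which is convenient inside scripts. This is a sketch under two assumptions: the release directory is exported as BELLE2_RELEASE_DIR, and the dataobject headers use the .h extension. The function names are hypothetical.

```python
import os
from pathlib import Path


def list_dataobject_headers(release_dir):
    """Return sorted header file names under mdst/dataobjects/include."""
    include_dir = Path(release_dir) / "mdst" / "dataobjects" / "include"
    # Assumption: dataobject headers use the .h extension.
    return sorted(p.name for p in include_dir.glob("*.h"))


def headers_from_environment():
    """Resolve the release directory from the environment, like the shell command."""
    release_dir = os.environ.get("BELLE2_RELEASE_DIR")
    if release_dir is None:
        raise RuntimeError("BELLE2_RELEASE_DIR is not set; set up a basf2 release first")
    return list_dataobject_headers(release_dir)
```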

9.3. ROOT and compatibility guarantee

Mdst files are written by the RootOutput module and are based on the ROOT file format. However, analysis of mdst files with any software other than basf2 is neither supported nor permitted: use of the basf2 framework and the Analysis package is mandatory.

Warning

A common misconception: Opening an mdst file with standard ROOT tools (e.g. with a TBrowser) may initially “work”, but the results are not reproducible.

Many dataobject member accessors require the basf2 environment to return meaningful values. Segmentation faults and double-counting of tracks and/or clusters are very likely to be encountered.

However, backward compatibility is guaranteed for two major basf2 releases and the supported light releases.

See also

Backward compatibility on Confluence.