.. _onlinebook_fundamentals_datataking:

Data Taking
===========

.. sidebar:: Overview
:class: overview

**Teaching**: 45 min

**Prerequisites**:

* :ref:`Fundamentals Introduction <onlinebook_fundamentals_introduction>`

**Objectives**:

* Understand the different detector systems in Belle II.
* Definition of triggers and their effects.

One of the most important steps is of course to record the data we want to
analyse. In this chapter we will go through the important concepts of the Belle
II detector and how we record data. Understanding these concepts is fundamental
for any analysis.

The Experiment
--------------

If you are reading this manual, you are probably already at least partially
familiar with the general layout of the SuperKEKB accelerator and the Belle II
experiment. However, before moving on, let's very quickly review their structure.

The SuperKEKB accelerator circulates electrons and positrons through its roughly
3 km circumference tunnel in opposite directions. These beams are asymmetric in
momentum, with the electrons kept at around 7 GeV/c and the positrons at around
4 GeV/c. At a single point on the accelerator ring, the two beams are steered
into (almost) head-on collision, resulting in a center-of-mass energy of
typically around 10.58 GeV, corresponding to the :math:`\Upsilon(4S)` resonance.
The point of collision is named the "interaction region".
The center of mass energy can be changed to take data at other resonances of the
:math:`\Upsilon` family, from around 9.4 to 11 GeV, for the non-B physics part
of the physics program.

.. admonition:: Question
:class: exercise stacked

At LHC, every bunch collision generates dozens of individual particle
interactions that overlay each other in the detectors (pile-up),
considerably complicating the data analysis.
This doesn't seem to be a problem at SuperKEKB and Belle II. Why?

.. admonition:: Hint
:class: toggle xhint stacked

Start with the planned final instantaneous luminosity of SuperKEKB. How
many bunch crossings will happen per second?
Then think about the typical cross sections in :math:`e^+e^-` collisions
as discussed previously.

.. admonition:: Another hint
:class: toggle xhint stacked

The goal instantaneous luminosity of SuperKEKB is :math:`8\times 10^{35}\, \textrm
{cm}^{-2} \textrm{s}^{-1}`. It takes a beam particle bunch roughly 10 Î¼s to complete
a full revolution around the accelerator ring. Up to 2376 bunches will circulate
in each ring.

.. admonition:: Solution
:class: toggle solution

At a final design luminosity of :math:`8\times 10^{35}\, \textrm{cm}^{-2}\textrm{s}^
{-1}` at 2376 bunches per ring, each taking about 10 Î¼s to complete a revolution, the
delivered luminosity per bunch crossing is about :math:`8\times 10^{35}\, \textrm
{cm}^{-2} \textrm{s}^{-1} \cdot 10\times 10^{-6}\ \textrm{s} / 2376 = 3.4\times10^{-6}\,
(\textrm{nb})^{-1}`, so even the most likely Bhabha process at :math:`125\,
\textrm{nb}` only happens about once every
:math:`(3.4\times 10^{-6}\cdot 125)^{-1} \approx 2400` bunch crossings.

.. figure:: belle2.png
:align: center
:width: 900px
:alt: The Belle II detector.

The Belle II detector.

The detector is built around the interaction region, with the goal to
detect and measure as many of the particles produced in the SuperKEKB collisions
as possible. Belle II consists of several sub-systems, each one dedicated to a
specific task: reconstruct the trajectory of charged track, reconstruct the
energy of photons, identify the particle type or to identify muons and
reconstruct long-living hadrons. Of course some systems can be used for
multiple purposes: for example, the ECL is mainly intended as a device to
reconstruct photons, but is also used to identify electrons and hadrons.

Due to the asymmetry of the SuperKEKB collisions, the detector is
asymmetric along the beam axis. In the context of Belle II the "forward"
direction is the direction in which the high energy electron beam points, while "backward"
is the direction in which the lower energy positron beam points.

.. seealso::

There is an important document for any large HEP detector called the
**Technical Design Report** (TDR). This document contains the proposed design
of the experiment.

The Belle II TDR is `arXiv:1011.0352 <https://arxiv.org/abs/1011.0352>`_.

Unfortunately, some of the details are now outdated (that document dates 2010).
Particularly, we do not recommend you rely on this for performance studies.

Nonetheless you should know what it is, because people might mention it.
You may need to reference it in your thesis.

.. figure:: detector_labeled.jpg
:align: center
:width: 900px
:alt: Closeup of the Belle II detector.

Closeup of the Belle II detector indicating all the different sub detectors.

Beam Pipe
The beam pipe itself is not an active part of the detector, but plays the crucial
role of separating the detector from the interaction region, which is located in
the low-pressure vacuum of the SuperKEKB rings. It is a cylindrical pipe designed
to be as thin as possible in order to minimize the particle's energy loss in it,
but it is also crucial to absorb most of the soft synchrotron X-rays emitted by
the beams before they can hit the detector. Otherwise they would represent
a major source of noise for the innermost detector, the PXD.

PXD
The first active system met by the particles that emerge form the IP is the
PiXel Detector (PXD). With this Pixel detector can measure the position of
each particle going through this detector with very high precision and thus,
when combining different particles, get a very precise determination where
these particles intersect. This intersection is called vertex and is usually
where all the particles originate from: either the place of collision or
where they were created when another particle decayed.

You can think of the PXD as the inner vertex detector. The PXD is
constructed from DEPFET silicon sensors segmented into individual 8 million
pixels of down to 50 Ã— 55 Î¼mÂ² size. It consists of two layers at 14 mm and
22 mm radius from the interaction point.

SVD
The Silicon Vertex Detector (SVD) is the outer part of the vertex detector.
It comprises of double sided silicon microstrip sensors with strips widths
down to 50 Î¼m.

A double sided strip detector is similar to a pixel detector in terms of
precision but we don't have one 2D measurement of the position. Instead we
have strips along the sensor on each side and measure two 1D positions: the
horizontal and the vertical position on the sensor.

The four layers of the SVD system extend the outer radius of the vertex
detector up to 140 mm.

.. _vxd-description:

VXD
You will occasionally hear people refer to the pair of detectors: PXD+SVD as
the VerteX Detector (VXD). If you look at :numref:`fig:fundamentals_vxd` you
can see that this does make sense as both systems are closely integrated
with almost no space between them. Technically they are also installed
together as one unit.

.. _fig:fundamentals_vxd:
.. figure:: vxd.jpg
:align: center
:width: 900px
:alt: Image of the VXD

Picture of the VXD: In the middle you can see the PXD which is built around
the Beam Pipe. Outside you can see the four layers of one half of the SVD.

.. admonition:: Optional Question (hard)
:class: exercise stacked

This is not a question we expect you to be able to answer but it might be
good to think about it for a few minutes. So here goes:

Why do we have both, a pixel and a strip detector?

.. admonition:: Hint
:class: toggle xhint stacked

Think about the differences that come from one 2D measurement vs. two 1D measurements

.. admonition:: Answer
:class: toggle solution

A strip detector measures the position of a particle by measuring vertical
and horizontal position in the sensor. This means we get two measurements we
can correlate to the position of the particle.

The benefit is that we have much less readout channels: instead of :math:`X
* Y` pixels we have :math:`X + Y` strips. This means readout can be much
faster and the data size will be much much smaller.

But if we have multiple particles hitting the same sensor then this can
become problematic: we have to combine the horizontal and vertical
measurements and for :math:`N` particles passing the sensor this gives us
:math:`N^2` possible combinations.

So very close to the collision, where we expect the highest density of
particles, it is usually much better to use a pixel detector even though the
data size will be much larger.

And that's why we have a smaller pixel detector very close to the
interaction region and then a larger strip detector a bit further away.

.. _cdc-description:

CDC
The main tracking system for Belle II is the Central Drift Chamber (CDC).
It is comprised of so-called sense wires suspended in He-Câ‚‚Hâ‚† gas. Charged
particles passing through the gas cause ionisation charges, which then
drift (hence the name) to nearby sense wires, where `gas amplification
<https://en.wikipedia.org/wiki/Townsend_discharge>`_ causes signal
propagation. You will hear people refer to these ionisation signals as
"hits" in the CDC.

A charged particle passing through the CDC results in a succession of hits
following the trajectory of the particle. From the timing of each wire
signal it is possible to infer the drift time and thus the distance at which
the primary ionization was caused. You can approximate the resulting
isochrone with a "drift circle" for each wire, to which the particle
trajectory must have been tangent (see
:numref:`fig:reconstruction-trackfinding`) . This allows for a much better
point resolution than the wire spacing alone might let you assume.

.. _fig:fundamentals_CDC:
.. figure:: cdc.jpg
:align: center
:width: 900px
:alt: Picture of the CDC endplate

Testing of the inner CDC wires before installation.

TOP
The Time Of Propagation (TOP) detector provides particle identification
information in the barrel region of Belle II .
The subdetector comprises of quartz bars and works by utilizing the
`Cherenkov effect <https://en.wikipedia.org/wiki/Cherenkov_radiation>`_.
Particles passing through will cause Cherenkov photons to be emitted at an
angle that directly depends on the particle velocity. Combining this
velocity information with the particle momentum measured in the preceding
tracking detectors yields a mass measurement, which identifies the particle
species.

Emitted Cherenkov photons are captured inside the quartz bars by total
internal reflection (see :numref:`fig:fundamentals_top`). TOP reconstructs
the Cherenkov emission angle by measuring the effective propagation time of
individual Cherenkov photons from their emissions point to the TOP sensor
plane. At a given momentum, heavier particles will have lower velocities,
thus a lower Cherenkov opening angle and thus, on average, a longer photon
propagation path, causing a longer time of propagation of individual
photons. You might also hear people refer to the TOP as the iTOP (imaging
TOP).

.. _fig:fundamentals_top:
.. figure:: top.jpg
:align: center
:width: 900px
:alt: Internal reflection in TOP

Internal reflection of a laser inside one of the 16 bars of the TOP detector.

ARICH
The Aerogel Ring-Imaging Cherenkov detector is another dedicated particle
identification subdetector using aerogel as its radiator medium. It covers
the forward region of the detector.

Just as with the quartz in TOP, Cherenkov photons are emitted when a charged particle
of sufficient velocity passes through the aerogel. Contrary to the TOP quartz, the
aerogel does not capture the emitted Cherenkov photons, so they are forming a cone of
Cherenkov light around a particle track which is imaged as a ring of characteristic
radius, providing an orthogonal source of particle mass information.

.. _fig:fundamentals_arich:
.. figure:: arich.jpg
:align: center
:width: 900px
:alt: Picture of the ARICH

Picture of the ARICH before installation.

.. admonition:: Question
:class: exercise stacked

Why is it OK to only have an ARICH in forward region and why do we not need
any particle identification in the backwards direction of the detector?

.. admonition:: Hint
:class: toggle xhint stacked

Think of the beam energies at Belle II

.. admonition:: Another hint
:class: toggle xhint stacked

The nominal beam energies are 7 GeV on 4 GeV. That means the center of mass
is boosted in forward direction.

.. admonition:: Solution
:class: toggle solution

The center of mass is boosted due to the asymmetric beam energies. As such,
any particles generated in the collision are boosted in forward direction
and the fraction of particles actually hitting the backwards direction of
the detector is low.

With higher boost we would not need a backawards detector at all. And if we
could go even higher with the boost we might not even need a barrel region.

So now if you look at the LHCb detector you might understand why its only a
forward spectrometer: their boost is so high that they simply don't need a
barrel or backwards region at all and thus saved the money.

However they do loose hermiticity so they sacrifice the possibility to do
some analysis.

ECL
The Electromagnetic CaLorimeter (ECL) is chiefly tasked with measuring the
electromagnetic energy of photons and electrons produced in the collision.
In combination with tracking information, the calorimeter can distinguish, for
example, electrons from muons.

A track from an electron will stop in the calorimeter, a muon will continue
through as a minimum-ionising particle. It therefore provides further
orthogonal information to the particle-identification system.

The ECL consists of over 8000 Caesium Iodide crystsals which create
scintillalation light when a particle flies into them. The amount of
light is proportional to the energy deposited in the crystal so by measuring
it we can measure the energy of the the particle. Of course this assumes the
particle is fully stopped and deposits all of its energy in the ECL. This is
not true for most particles so we need to calibrate the light response to
energy measurements.

.. _fig:fundamentals_ecl:
.. figure:: ecl.jpg
:align: center
:width: 900px
:alt: Picture of ECL electronics

Small part of the ECL electronics in the endcap. oards do the mapping from
the preamps (which you canâ€™t see) to the cables that go to the ShaperDSPs.
You can also see the cooling lines under the boards.

KLM
Finally, there is the KLong and Muon (KLM) system.
The KLM provides muon identification information to tracks that pass
through all other subdetectors and also reconstructs :math:`K_L^0` s from
the collision.

.. seealso::

There are two more useful reference documents that you should be aware of.
Now seems like a good time to mention them.

1. Bevan, A. *et al*. The Physics of the B Factories. *Eur.Phys.J. C* **74** 3026(2014).
https://doi.org/10.1140/epjc/s10052-014-3026-9

2. Kou, E. *et al*. The Belle II physics book, *PTEP 2019* **12** 123C01,
https://doi.org/10.1093/ptep/ptz106.

The former is a book describing the previous generation B-factories (the detectors and their achievements).
The latter describes the Belle II detector and the physics goals.
It is sometimes referred to (rather opaquely) as the B2TiP report.
If you are a newcomer you should probably refer to it as it's (significantly more sane) official name.

.. admonition:: Key points
:class: key-points

* You have heard of all the sub detectors and what their primary measurement is

* You know where to find the Belle II TDR, "The Physics of the B factories", and "The Belle II physics book".

On Resonance, Continuum, Cosmics
--------------------------------

We saw that to collect :math:`B` mesons one must collide electrons and positrons at the
centre-of-mass energy of :math:`\sqrt{s} = 10.580` GeV, corresponding to the
:math:`\Upsilon(4S)` resonance mass. However this is not the only energy at
which the SuperKEKB accelerator can work, and it's not the only kind of dataset
that Belle II can collect.

On-resonance
The standard collisions at :math:`\sqrt{s} = 10.580` GeV.

Off-resonance
:math:`e^+e^- \to \Upsilon(4S) \to B\bar{B}` is not the only process that takes place at
:math:`\sqrt{s} = 10.580` GeV. The production of light and charm quark pairs in the reaction
:math:`e^+e^- \to u\bar{u}, d\bar{d}, s\bar{s}, c\bar{c}` has a total cross section of about :math:`3.7`
nb is more that three times larger than the production of :math:`B` mesons. As the quarks hadronize leaving
final states that are similar to the :math:`B\bar{B}`. This background can be studied using the Monte Carlo
simulation, but it's more effective to study it directly on data. Occasionally, 1--2 times per year, a
special dataset is collected approximately 60 MeV below the :math:`\Upsilon(4S)`. Here no :math:`B` mesons
can be produced, leaving one with a pure sample of continuum events, called *off-resonance* (or *continuum*) sample.

.. _fig:fundamentals_onresonance:
.. figure:: Uresonances.svg
:align: center
:width: 900px
:alt: Cross section plot for the Upsilon resonances

Cross section around the Î¥ resonances showing the difference between on- and off-resonance.

Cosmic
At the beginning and end of each run period Belle II acquires cosmic muons. These events are used mainly for
performance studies and for calibration, as they provide an unique sample for aligning the detectors with
each other. Usually part of this dataset is collected with the solenoid switched off, so that
muons cross the detectors on straight trajectories. If the SuperKEKB accelerator has a major
downtime of few days, a cosmic dataset is usually collected to keep the Belle II system running.

Beam
Beam runs are special, usually short data takings used to study the beam-induced background on
the inner sub-detectors.
They are taken with the beams circulating without colliding, to remove all the processes
related to :math:`e^+e^-` hard scattering.

Scan
A scan consists of rather short data taking periods (hours or few days long) performed at slightly different energies
(usually 10--50 MeV apart). The goals of a scan is to measure the line shape of the :math:`e^+e^-` cross section to either
check that data are collected on the resonance peak (short scans), or to perform real physics measurements
such as the search for exotic vector resonances (long scans above the :math:`\Upsilon(4S)` energy).

Non-4S
SuperKEKB can operate across the whole spectrum of bottomonia, from the :math:`\Upsilon(1S)` at
:math:`9.460` GeV to slightly above the :math:`\Upsilon(6S)` at around :math:`11.00` GeV. These datasets can be used for all the non-B
parts of the Belle II physics program, but are particularly interesting for the spectroscopy, hadronic physics and
dark sector studies.

.. _onlinebook_fundamentals_triggers_filters:

Triggers and filters
--------------------

SuperKEKB bunches can cross the interaction region up to every 4 ns. However, in
the vast majority of cases either no collision (more precisely: no hard
interaction) takes place at all, or the collision results are not interesting
(for example :math:`e^+e^-\to e^+e^-` events are the most common, but of
secondary interest to our physics program).

Recording and keeping all detector information for each collision would be
wasteful. Not only are most events not interesting, but the bandwidth
requirements to the offline disks would be impossible to provide. Like most HEP
experiments we therefore trigger and filter events. This is all achieved with
**the Belle II online system** (you might hear people refer to it as "online").
The online system consists of the Data Acquisition (DAQ), Level 1 Trigger (TRG,
also called L1) and the High Level Trigger (HLT). These two trigger systems are
designed to reduce the amount of data as much as possible before they reach the
first storage hard disk.

Generally, when Belle II is running and operational, each subdetector will
transmit its readout data upon receipt of an external trigger signal.
The data gathered from all subdetectors in response to a given external trigger
is what we call "one event". Generating this trigger signal for each
"interesting" collision is the task of the TRG system. The TRG system receives
what effectively amounts to a low resolution "live stream" of the readout data
of CDC, ECL and KLM (for completeness: TOP also sends stream data to TRG but it
is not used for triggering directly). The streamed data is interpreted in near
realtime using specialized fast electronics (Field Programmable Gate
Arrays, FPGAs) by continuously matching it to predefined trigger conditions. If
TRG determines an interesting collision event has just taken place, it generates
a trigger signal which is distributed to all subdetectors. The TRG system is
designed to issue up to 30 kHz of such triggers at the full SuperKEKB design
luminosity.

.. note::
The TRG system will issue a trigger decision with a fixed delay of about 4 Âµs.
In practice, all subdetector frontend electronics thus have to keep a buffer
of their readout data of the past several microseconds, so they can transmit
the measurement they took in the time slice around 4 Âµs ago.

The DAQ system makes sure that all trigger signals are synchronously delivered
to all subdetectors. It also provides the high-speed data links that are used
to read out the subdetector data for each event and forwards it to the HLT
system.

.. _fig:fundamentals_dataflow:
.. figure:: b2dataflow.svg
:align: center
:width: 900px
:alt: Diagram of Belle II Dataflow

Simplified diagram of the Belle II data flow.

The HLT system is a computing cluster of about 10000 CPU cores located right next
to the detector. It receives the full raw subdetector data for each
triggered event and performs an immediate full reconstruction using the
exact same basf2 software as is used in offline data analysis. Based on the
result of this reconstruction, events are classified and either stored to a
local offline storage hard disk drive or discarded. This high level event
selection is expected to reduce the amount of data written to the offline
storage by at least 60%.

.. note::
The HLT can operate in two modes: **filtering** and **monitoring**. In the former the
events that do not satisfy any of the HLT selections (called "lines")
are discarded and lost forever. In the latter, all the events are kept regardless of the
HLT decision (which is however stored, so analyses of the HLT filtering efficiency
can be conducted).
On top of this, the HLT also performs a first, rought skimming producing the so-called
**HLT skims**. All the events that are satisfying the HLT conditions (regardless of the
HLT operation mode), are then assigned according to a quick analysis into few
possible categories: bhabha, hadronic, tau, mumu and so on attaching a flag to them,
which can then be read out from the raw data without reconstructing the full event.
This categorization is mainly done to quickly select the events that are needed for
calibration and send them to the calibration center, however the hadronic HLT skim
(called **hlt_hadron**) has become increasingly popular to perform analysis in the
early stage of the experiment, and is being considered as a starting point
for many analysis skims (more on them later), so you will likely hear of it.

.. _fig:fundamentals_hlt:
.. figure:: hlt.jpg
:align: center
:width: 900px
:alt: Picture of the HLT

The HLT is one of the critical parts of our data taking chain. Every part of
it needs to be carefully installed, maintained, and operated.

Both the TRG system and the HLT classify events based on the data available to
them. While the decision whether to issue a trigger for a given collision (or on
HLT whether to keep the event or discard it) is of course binary, certain event
classes might be intentionally triggered at less than 100% of their occurrence.
For example, while Bhabha scattering events (:math:`e^+e^- \to e^+e^-`, often just
called "Bhabhas") are generally not very interesting for the physics program of
Belle II, keeping some of them for calibration purposes might be very useful.
Since Bhabhas are easily identified even with the limited information available
to the TRG system, the TRG system will not issue a trigger for every single
identified Bhabha event, but only for a configurable fraction. This technique of
intentionally issuing triggers only for fractions of a given event class is
named prescaling. When working on your own analysis, it is very important to
keep in mind potential prescaling of the triggers that yield the events you use
in your analysis. Since the prescaling settings can (and will) change over
the lifetime of the experiment, updated numbers for each run can be
found `here <https://confluence.desy.de/display/BI/TriggerBitTable>`_.. See also
`this question <https://questions.belle2.org/question/9437/where-to-find-pre-scaling-factors/>`_.
for more details.

Since the TRG and HLT systems are ultimately deciding which data is being kept
for offline analysis, understanding and validating their performance vs. their
intended functionality is of highest importance for the success of the experiment.

.. admonition:: Key points
:class: key-points

* The TRG system aims to recognize interesting events from the near
continuous stream of collisions.
* The HLT system uses the full readout data for each event to further decide
which events to keep for offline analysis and which ones to discard.
* Prescaling might be used to only record every n-th event (on average) that
satisfies given trigger conditions.

.. include:: ../lesson_footer.rstinclude

.. topic:: Author(s) of this lesson

Umberto Tamponi,
Martin Ritter,
Oskar Hartbrich,
Michael Eliachevitch,
Sam Cunliffe