Full Workflow Execution
We execute the dependency tree for a target task by calling b2luigi.process(Task(parameters=...), workers=<nworkers>). b2luigi will run at most <nworkers> tasks in parallel whenever possible.
It is best practice to wrap this call in an if __name__ == "__main__": guard in the script:
import b2luigi as luigi
from offlineanalysis import Plot

if __name__ == "__main__":
    # Adjust this to your own output directory.
    output_directory = "/group/belle/users/<user>"
    luigi.set_setting("result_dir", output_directory)
    # Build the dependency tree for Plot and run at most 100 tasks in parallel.
    luigi.process(Plot(), workers=100)
Calling python3 main.py --batch on KEKcc will then trigger the full workflow execution. b2luigi builds the dependency tree for the Plot task and executes only those required tasks whose output files do not yet exist in the given output directory. Do not forget to adjust output_directory and to set up basf2 beforehand; for the recommended release, use b2setup $(b2help-releases). Remember that the reconstruction task is the only task not marked as local and will therefore be submitted to the KEKcc batch system.
You can perform a dry run of a b2luigi workflow with python3 main.py --dry-run to check which tasks would be run.
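The rule that only tasks with missing output are executed, and what a dry run would report, can be illustrated with a small library-free sketch. It does not use b2luigi itself: the task names, file names and the tasks_to_run helper are hypothetical and only mimic the idea that a task whose output already exists is treated as complete, so its dependencies are not visited.

```python
import os
import tempfile

# Hypothetical toy tasks: name -> (output file name, list of required task names).
# These names are illustrative only and are not the tasks from offlineanalysis.
TASKS = {
    "Reconstruction": ("reco.root", []),
    "Merge": ("merged.root", ["Reconstruction"]),
    "Plot": ("plot.pdf", ["Merge"]),
}


def tasks_to_run(target, result_dir):
    """Return the tasks that would run for `target`, dependencies first.

    A task whose output file already exists counts as complete, and its
    dependencies are not visited at all.
    """
    scheduled = []

    def visit(name):
        output_name, requires = TASKS[name]
        if os.path.exists(os.path.join(result_dir, output_name)):
            return  # output present: nothing to do for this branch
        for dependency in requires:
            visit(dependency)
        if name not in scheduled:
            scheduled.append(name)

    visit(target)
    return scheduled


with tempfile.TemporaryDirectory() as result_dir:
    # Empty output directory: every task is required.
    print(tasks_to_run("Plot", result_dir))
    # Pretend the merge step has already produced its output.
    open(os.path.join(result_dir, "merged.root"), "w").close()
    print(tasks_to_run("Plot", result_dir))
```

With an empty directory the sketch lists all three tasks; once merged.root exists, only Plot remains. In a real workflow, the --dry-run option gives the corresponding information without running anything.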
Luigi features a dynamic directed acyclic graph that can be viewed in the Luigi Task Status. To access it, start the luigi scheduler in a tmux session on KEKcc and specify the host and port when executing the workflow:
tmux  # open a new tmux session
source /cvmfs/belle.cern.ch/tools/b2setup <release>  # set up basf2
~/.local/bin/luigid --port <ssh port>  # start the luigi scheduler
# press Ctrl+b, then d, to detach the tmux session
source /cvmfs/belle.cern.ch/tools/b2setup <release>  # set up basf2
python3 main.py --batch --scheduler-host localhost --scheduler-port <ssh port>  # start the workflow
firefox localhost:<ssh port>  # view the scheduler on your local machine
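Before starting the workflow against the scheduler, it can be useful to confirm that something is actually listening on the chosen port. The following is a minimal sketch using only the Python standard library. The helper name scheduler_is_reachable is hypothetical and not part of luigi or b2luigi, and the port 8082 in the usage line is only a placeholder for <ssh port>.

```python
import socket


def scheduler_is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    # Placeholder port: replace it with the port passed to luigid.
    print(scheduler_is_reachable("localhost", 8082))
```

A result of False usually means that luigid is not running in the tmux session or that a different port was used. The check only tests the TCP connection and does not verify that the listening process is a luigi scheduler.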