Full Workflow Execution

We execute the dependency tree for a target task by calling b2luigi.process(Task(parameters=...), workers=<nworkers>). b2luigi then runs up to <nworkers> tasks in parallel whenever possible.

It is best practice to place this call inside an if __name__ == "__main__": guard in the script:

Listing 3.6 main.py
# @cond
import b2luigi as luigi
from offlineanalysis import Plot

if __name__ == "__main__":
    output_directory = "/group/belle/users/<user>"
    luigi.set_setting("result_dir", output_directory)
    luigi.process(Plot(), workers=100)
# @endcond

Calling python3 main.py --batch on KEKcc will then trigger the full workflow execution. b2luigi builds the dependency tree for the Plot task and executes only those required tasks whose output files do not yet exist in the given output directory. Do not forget to adjust output_directory and to set up basf2 beforehand; for the recommended release, use b2setup $(b2help-releases). Remember that the reconstruction task is the only task not marked as local and will therefore be submitted to the KEKcc batch system.

To check which tasks would be run without actually executing them, start the workflow in dry-run mode with python3 main.py --dry-run.
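The selection of tasks described above can be pictured with a small toy model. The sketch below is not b2luigi's implementation. It is a plain-Python illustration, assuming that a task counts as complete when its output already exists and that the dependencies of a complete task are not inspected further. The function tasks_to_run, the Merge task and the dictionary layout are hypothetical names chosen for this example; only Plot and the reconstruction step come from the workflow above.

```python
from typing import Callable, Dict, List


def tasks_to_run(
    target: str,
    requires: Dict[str, List[str]],
    is_complete: Callable[[str], bool],
) -> List[str]:
    """Return the tasks still needed for `target`, dependencies first."""
    ordered: List[str] = []
    seen = set()

    def visit(task: str) -> None:
        # Skip tasks already scheduled or whose output already exists.
        if task in seen or is_complete(task):
            return
        seen.add(task)
        # Dependencies must be scheduled before the task itself.
        for dependency in requires.get(task, []):
            visit(dependency)
        ordered.append(task)

    visit(target)
    return ordered


# Hypothetical three-step chain: Plot needs Merge, Merge needs Reconstruction.
requires = {"Plot": ["Merge"], "Merge": ["Reconstruction"], "Reconstruction": []}

# Assume the reconstruction output is already present in the output directory.
existing_outputs = {"Reconstruction"}

pending = tasks_to_run("Plot", requires, lambda task: task in existing_outputs)
print(pending)  # ['Merge', 'Plot']
```

In this picture, the returned list corresponds to the tasks a dry run would report, and an empty list means that nothing remains to be done.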

Luigi features a dynamic directed acyclic graph of the tasks, which can be viewed in the Luigi Task Status page. To access it, start the Luigi scheduler in a tmux session on KEKcc and specify the host and port when executing the workflow:

tmux  # open a new tmux session
source /cvmfs/belle.cern.ch/tools/b2setup <release>  # set up basf2
~/.local/bin/luigid --port <ssh port>  # start the luigi scheduler
# press Ctrl+b, then d, to detach the tmux session

source /cvmfs/belle.cern.ch/tools/b2setup <release>  # set up basf2
python3 main.py --batch --scheduler-host localhost --scheduler-port <ssh port>  # start workflow
firefox localhost:<ssh port>  # view scheduler on your local machine