from time import sleep, time
Helper class to call a given (basf2) command via subprocess
and make sure the process is killed properly once a SIGINT or SIGTERM signal is
sent to the main process.
To do this, the basf2 command is started in a new session group, so all child processes
of the basf2 command will also be killed.
When the main process receives a termination request via a SIGINT or SIGTERM,
a SIGINT is sent to the started basf2 process.
If the process is still alive after a given timeout (10 s by default),
it is killed via SIGKILL, together with all the child processes it forked.
After a normal or abnormal termination, the run() function returns the exit code
and cleanup can happen afterwards.
ATTENTION: In some rare cases, e.g. when the termination request arrives during a syscall,
the process cannot be stopped (see the uninterruptible sleep process state, e.g. in
https://stackoverflow.com/questions/223644/what-is-an-uninterruptable-process).
In those cases, even a KILL signal does not help!
The class can be used in a typical main method, e.g.

from hlt.clean_execution import CleanBasf2Execution

if __name__ == "__main__":
    execution = CleanBasf2Execution()
    execution.start(["basf2", "run.py"])
    # Make sure to always do the cleanup, also in case of errors
Create a new execution with the given parameters (list of arguments)
Add the execution and terminate gracefully/hard if requested via signal.
basf2.B2INFO("Starting ", command)
process = subprocess.Popen(command, start_new_session=True)
pgid = os.getpgid(process.pid)
if pgid != process.pid:
    basf2.B2WARNING("Subprocess is not session leader. Weird")
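The effect of `start_new_session=True` can be checked in isolation. The following is a minimal sketch for POSIX systems, not the class's own code: the child command is a hypothetical stand-in for a basf2 call, and it resets SIGINT to the default action and prints a "ready" line only so that the demonstration behaves the same in every environment.

```python
import os
import signal
import subprocess
import sys

# Hypothetical stand-in for a basf2 call: reset SIGINT to its default action,
# report readiness, then sleep.
child_code = (
    "import signal, time\n"
    "signal.signal(signal.SIGINT, signal.SIG_DFL)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)

# start_new_session=True makes the child call setsid(), so it becomes the
# leader of a new session and of a new process group.
process = subprocess.Popen(
    [sys.executable, "-c", child_code],
    start_new_session=True,
    stdout=subprocess.PIPE,
)
process.stdout.readline()  # block until the child has printed "ready"

# For a session leader the process group id equals its own pid.
pgid = os.getpgid(process.pid)
is_group_leader = pgid == process.pid

# Signalling the group reaches the child and everything it has forked.
os.killpg(pgid, signal.SIGINT)
returncode = process.wait(timeout=10)
process.stdout.close()
```

Because the child is terminated by the signal itself, `returncode` is the negative signal number, following the `subprocess` convention.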
Wait until all handled calculations have finished.
returncode = process.returncode
basf2.B2INFO("The process ", command, " died with ", returncode,
             ". Killing the remaining ones.")
The signal handler called on SIGINT and SIGTERM.
Clean or hard shutdown of all processes.
It tries to kill the process gracefully but if it does not react after a certain time,
it kills it with a SIGKILL.
basf2.B2WARNING("Signal handler called without started process. This normally means something is wrong!")
basf2.B2INFO("Termination requested...")
signal.signal(signal.SIGINT, signal.SIG_IGN)
signal.signal(signal.SIGTERM, signal.SIG_IGN)
os.killpg(process.pid, signal.SIGINT)
except ProcessLookupError:
basf2.B2WARNING("Process did not react in time. Sending a SIGKILL.")
os.killpg(process.pid, signal.SIGKILL)
backtrace = subprocess.check_output(["gdb", "-q", "-batch", "-ex", "backtrace", "basf2",
                                     str(process.pid)]).decode()
basf2.B2ERROR("Could not end the process even with a KILL signal. "
              "This can happen because it is in the uninterruptible sleep state. "
              "I can not do anything about this!",
except ProcessLookupError:
basf2.B2INFO("...Process stopped")
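The escalation from SIGINT to SIGKILL can be sketched on its own as follows. This is an illustration under assumptions, not the method of the class: `stop_process_group` is a hypothetical helper, it uses `Popen.wait(timeout=...)` and is therefore meant to be called from normal code rather than from inside a signal handler, and the child is a deliberately stubborn stand-in that ignores SIGINT.

```python
import os
import signal
import subprocess
import sys


def stop_process_group(process, timeout=10.0):
    """Send SIGINT to the group; escalate to SIGKILL if it outlives the timeout."""
    try:
        os.killpg(process.pid, signal.SIGINT)
    except ProcessLookupError:
        # The group is already gone, only the exit code is left to collect.
        return process.wait()
    try:
        return process.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        try:
            os.killpg(process.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass
        return process.wait()


# Hypothetical stubborn child: ignores SIGINT, so only SIGKILL can end it.
child_code = (
    "import signal, time\n"
    "signal.signal(signal.SIGINT, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)
process = subprocess.Popen(
    [sys.executable, "-c", child_code],
    start_new_session=True,
    stdout=subprocess.PIPE,
)
process.stdout.readline()  # wait until the child ignores SIGINT

returncode = stop_process_group(process, timeout=0.5)
process.stdout.close()
```

A process in uninterruptible sleep would make the final `wait()` block, which is the rare case the ATTENTION note describes.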
Wait at most "timeout" seconds for the process to stop.
Return False if it did not end within this period.
if process_list is None:
endtime = time() + timeout
remaining = endtime - time()
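The `endtime`/`remaining` pattern above is a deadline-based polling loop. A minimal sketch of such a loop follows; `wait_for_exit` and `poll_interval` are hypothetical names, and `Popen.poll()` is used here for simplicity instead of whatever check the class itself performs.

```python
import subprocess
import sys
from time import sleep, time


def wait_for_exit(process, timeout, poll_interval=0.05):
    """Return True if the process ended within `timeout` seconds, else False."""
    endtime = time() + timeout
    while True:
        if process.poll() is not None:
            return True
        remaining = endtime - time()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(poll_interval, remaining))


# A child that ends immediately and one that outlives the timeout.
quick = subprocess.Popen([sys.executable, "-c", "pass"])
slow = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

quick_ended = wait_for_exit(quick, timeout=30.0)
slow_ended = wait_for_exit(slow, timeout=0.2)

# Clean up the long-running child of this demonstration.
slow.kill()
slow.wait()
```

Computing a fixed `endtime` once, rather than counting sleeps, keeps the total wait bounded even if individual checks take longer than expected.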
Set the signal handlers for SIGINT and SIGTERM to our own one.
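Installing one handler for both signals could look like the sketch below. The names `install_signal_handler`, `handler` and `received` are illustrative assumptions, and the handler only records the signal instead of shutting anything down. It must run in the main thread, since Python only allows `signal.signal` there.

```python
import signal

received = []


def handler(signum, frame):
    # Ignore further requests while the (here: imaginary) shutdown is running.
    signal.signal(signal.SIGINT, signal.SIG_IGN)
    signal.signal(signal.SIGTERM, signal.SIG_IGN)
    received.append(signal.Signals(signum).name)


def install_signal_handler():
    """Route SIGINT and SIGTERM to the same handler; return the old handlers."""
    previous_int = signal.signal(signal.SIGINT, handler)
    previous_term = signal.signal(signal.SIGTERM, handler)
    return previous_int, previous_term


previous_int, previous_term = install_signal_handler()

# Simulate a termination request sent to this very process.
signal.raise_signal(signal.SIGTERM)

# Restore the previous handlers so the demonstration leaves no trace.
signal.signal(signal.SIGINT, previous_int)
signal.signal(signal.SIGTERM, previous_term)
```

Setting both signals to `SIG_IGN` at the start of the handler prevents a second Ctrl-C from interrupting a shutdown that is already in progress.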
Check if the handled process has ended already.
This function does not wait.
I would rather use self._handled_process.wait() or poll(),
which do exactly the same.
However, the main process is also waiting for the return code,
so the threading.Lock in the .wait() function will never be acquired :-(
pid, sts = process._try_wait(os.WNOHANG)
assert pid == process.pid or pid == 0
process._handle_exitstatus(sts)
if pid == process.pid:
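The same non-blocking check can be written with the public `os.waitpid` call instead of the private `Popen` helpers. This is a sketch under assumptions, not a drop-in replacement: `has_exited` is a hypothetical helper, and because it reaps the child behind the back of the `Popen` object, the exit code is stored on the object by hand afterwards.

```python
import os
import subprocess
import sys
from time import sleep


def has_exited(pid):
    """Return (True, exit_code) if the child ended, else (False, None). Never blocks."""
    waited_pid, status = os.waitpid(pid, os.WNOHANG)
    if waited_pid == 0:
        # WNOHANG: the child is still running, nothing was reaped.
        return False, None
    if os.WIFSIGNALED(status):
        return True, -os.WTERMSIG(status)
    return True, os.WEXITSTATUS(status)


# Hypothetical child that blocks on stdin and exits with code 3 on EOF.
child = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdin.readline(); sys.exit(3)"],
    stdin=subprocess.PIPE,
)

first_check = has_exited(child.pid)  # the child is still blocked on stdin

child.stdin.close()  # EOF lets the child finish
while True:
    done, exit_code = has_exited(child.pid)
    if done:
        break
    sleep(0.01)

# Tell the Popen object the result, since it did not reap the child itself.
child.returncode = exit_code
```

With `os.WNOHANG`, a return pid of 0 means "still running" and the child's own pid means "ended", which is exactly the condition the assertion above checks.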