Belle II Software  release-06-02-00
HTCondor Class Reference

Classes

class  HTCondorResult
 

Public Member Functions

def get_batch_submit_script_path (self, job)
 
def can_submit (self, njobs=1)
 
def condor_q (cls, class_ads=None, job_id="", username="")
 
def condor_history (cls, class_ads=None, job_id="", username="")
 
def can_submit (self, *args, **kwargs)
 
def submit (self, job, check_can_submit=True, jobs_per_check=100)
 
def submit (self, job)
 
def get_submit_script_path (self, job)
 

Public Attributes

 global_job_limit
 The active job limit. More...
 
 sleep_between_submission_checks
 Seconds we wait before checking if we can submit a list of jobs. More...
 
 backend_args
 The backend args that will be applied to jobs unless the job specifies them itself.
 

Static Public Attributes

string batch_submit_script = "submit.sub"
 HTCondor batch script (different to the wrapper script of Backend.submit_script)
 
list submission_cmds = ["condor_submit", "-terse"]
 Batch submission commands for HTCondor.
 
int default_global_job_limit = 10000
 Default global limit on the number of jobs to have in the system at any one time.
 
dictionary default_backend_args
 Default backend args for HTCondor. More...
 
list default_class_ads = ["GlobalJobId", "JobStatus", "Owner"]
 Default ClassAd attributes to return from commands like condor_q.
 
int default_sleep_between_submission_checks = 30
 Default time between re-checks of whether the number of active jobs is below the global job limit.
 
string submit_script = "submit.sh"
 Default submission script name.
 
string exit_code_file = "__BACKEND_CMD_EXIT_STATUS__"
 Default exit code file name.
 

Private Member Functions

def _make_submit_file (self, job, submit_file_path)
 
def _add_batch_directives (self, job, batch_file)
 
def _create_cmd (self, script_path)
 
def _submit_to_batch (cls, cmd)
 
def _create_job_result (cls, job, job_id)
 
def _create_parent_job_result (cls, parent)
 
def _ (self, job, check_can_submit=True, jobs_per_check=100)
 
def _ (self, job, check_can_submit=True, jobs_per_check=100)
 
def _ (self, jobs, check_can_submit=True, jobs_per_check=100)
 
def _add_wrapper_script_setup (self, job, batch_file)
 
def _add_wrapper_script_teardown (self, job, batch_file)
 

Static Private Member Functions

def _add_setup (job, batch_file)
 

Detailed Description

Backend for submitting calibration processes to a HTCondor batch system.

Definition at line 1882 of file backends.py.

Member Function Documentation

◆ _() [1/3]

def _ (   self,
  job,
  check_can_submit = True,
  jobs_per_check = 100 
)
privateinherited
Submit method of Batch backend for a `SubJob`. Should take a `SubJob` object, create the needed directories,
create the batch script, and send it off with the batch submission command.
It should apply the correct options (default and user requested).

Should set a Result object as an attribute of the job.

Definition at line 1179 of file backends.py.

◆ _() [2/3]

def _ (   self,
  job,
  check_can_submit = True,
  jobs_per_check = 100 
)
privateinherited
Submit method of Batch backend. Should take a job object, create the needed directories, create the batch script,
and send it off with the batch submission command, applying the correct options (default and user requested).

Should set a Result object as an attribute of the job.

Definition at line 1215 of file backends.py.

◆ _() [3/3]

def _ (   self,
  jobs,
  check_can_submit = True,
  jobs_per_check = 100 
)
privateinherited
Submit method of Batch Backend that takes a list of jobs instead of just one and submits each one.

Definition at line 1266 of file backends.py.
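
The interplay of `check_can_submit`, `jobs_per_check` and `sleep_between_submission_checks` can be pictured with a minimal sketch. It is not the code in backends.py: `submit_in_batches`, `chunked`, `submit_one` and the injectable `sleep` argument are hypothetical names chosen for illustration, and the real method may order its checks differently.

```python
import time
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def submit_in_batches(
    jobs: Sequence,
    submit_one: Callable,       # hypothetical: submits a single job
    can_submit: Callable,       # hypothetical: mirrors can_submit(njobs=...)
    jobs_per_check: int = 100,
    sleep_seconds: float = 30,  # plays the role of sleep_between_submission_checks
    sleep: Callable = time.sleep,
) -> List:
    """Submit jobs in groups, re-checking the job limit before each group."""
    submitted = []
    for batch in chunked(jobs, jobs_per_check):
        # Wait until there is room for the whole group before submitting it.
        while not can_submit(njobs=len(batch)):
            sleep(sleep_seconds)
        for job in batch:
            submit_one(job)
            submitted.append(job)
    return submitted
```

Passing `sleep` as an argument keeps the sketch testable without real waiting; a caller would normally leave it at `time.sleep`.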

◆ _add_batch_directives()

def _add_batch_directives (   self,
  job,
  batch_file 
)
private
For HTCondor leave empty as the directives are already included in the submit file.

Reimplemented from Batch.

Definition at line 1929 of file backends.py.

◆ _add_setup()

def _add_setup (   job,
  batch_file 
)
staticprivateinherited
Adds setup lines to the shell script file.

Definition at line 777 of file backends.py.

◆ _add_wrapper_script_setup()

def _add_wrapper_script_setup (   self,
  job,
  batch_file 
)
privateinherited
Adds lines to the submitted script that help with job monitoring/setup. Mostly here so that we can insert
`trap` statements for Ctrl-C situations.

Definition at line 784 of file backends.py.

◆ _add_wrapper_script_teardown()

def _add_wrapper_script_teardown (   self,
  job,
  batch_file 
)
privateinherited
Adds lines to the submitted script that help with job monitoring/teardown. Mostly here so that we can write
the exit code of the job cmd out to a file. This means that we can know whether the command was
successful even if the backend server/monitoring database purges the data about our job, e.g. if PBS
removes job information too quickly we may never know whether a job succeeded or failed without some kind of exit
file.

Definition at line 809 of file backends.py.
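
As a rough illustration of the wrapper setup and teardown described above, the sketch below writes a small bash script that installs a `trap` and records the exit status of the job command in a file. The function name `write_wrapped_command` and the exact shell lines are assumptions made for this example; only the file name `__BACKEND_CMD_EXIT_STATUS__` comes from the class attributes.

```python
import io

# Value of the exit_code_file attribute listed for this class.
EXIT_CODE_FILE = "__BACKEND_CMD_EXIT_STATUS__"


def write_wrapped_command(batch_file, command):
    """Write a bash script that runs `command` and saves its exit status.

    Hypothetical helper: the real wrapper lines in backends.py may differ.
    """
    print("#!/bin/bash", file=batch_file)
    # Setup: on Ctrl-C, still leave an exit status file behind (130 = 128 + SIGINT).
    print(f"trap 'echo 130 > {EXIT_CODE_FILE}; exit 130' INT", file=batch_file)
    # The actual job command.
    print(command, file=batch_file)
    # Teardown: record the exit status of the command that just ran.
    print(f"echo $? > {EXIT_CODE_FILE}", file=batch_file)


# Usage: render the script into memory and inspect it.
buffer = io.StringIO()
write_wrapped_command(buffer, "echo hello")
script_text = buffer.getvalue()
```

A monitoring process could later read the single integer in that file to decide whether the job succeeded, independently of what the batch system still remembers.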

◆ _create_cmd()

def _create_cmd (   self,
  script_path 
)
private
 

Reimplemented from Batch.

Definition at line 1935 of file backends.py.

◆ _create_job_result()

def _create_job_result (   cls,
  job,
  job_id 
)
private
 

Reimplemented from Batch.

Definition at line 2079 of file backends.py.

◆ _create_parent_job_result()

def _create_parent_job_result (   cls,
  parent 
)
private
We want to be able to call `ready()` on the top level `Job.result`. So this method needs to exist
so that a Job.result object actually exists. It will be mostly empty and simply updates subjob
statuses and allows the use of ready().

Reimplemented from Backend.

Definition at line 2086 of file backends.py.

◆ _make_submit_file()

def _make_submit_file (   self,
  job,
  submit_file_path 
)
private
Fill HTCondor submission file.

Reimplemented from Batch.

Definition at line 1903 of file backends.py.

◆ _submit_to_batch()

def _submit_to_batch (   cls,
  cmd 
)
private
Do the actual batch submission command and collect the output to find out the job id for later monitoring.

Reimplemented from Batch.

Definition at line 1949 of file backends.py.

◆ can_submit() [1/2]

def can_submit (   self,
*  args,
**  kwargs 
)
inherited
Should be implemented in a derived class to check that submitting the next job(s) shouldn't fail.
This is initially meant to make sure that we don't go over the global limits of jobs (submitted + running).

Returns:
    bool: If the job submission can continue based on the current situation.

Definition at line 1161 of file backends.py.

◆ can_submit() [2/2]

def can_submit (   self,
  njobs = 1 
)
Checks the global number of jobs in HTCondor right now (submitted or running) for this user.
Returns True if the number is lower than the limit, False if it is higher.

Parameters:
    njobs (int): The number of jobs that we want to submit before checking again. Lets us check if we
        are sufficiently below the limit in order to (somewhat) safely submit. It is slightly dangerous to
        assume that it is safe to submit too many jobs since there might be other processes also submitting jobs.
        So njobs really shouldn't be abused when you might be getting close to the limit i.e. keep it <=250
        and check again before submitting more.

Definition at line 2089 of file backends.py.
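
The check can be sketched as a comparison between the current job count plus `njobs` and the limit. This is an illustration only: `can_submit_sketch` is a hypothetical free function, it takes the `condor_q`-style summary dictionary as an argument instead of calling HTCondor, and it assumes that landing exactly on the limit is still allowed, which the description above leaves open.

```python
def can_submit_sketch(queue_summary, global_job_limit=10000, njobs=1):
    """Return True if `njobs` more jobs fit under `global_job_limit`.

    `queue_summary` is assumed to look like the dictionary documented for
    condor_q, i.e. {"NJOBS": <int>, "JOBS": [...]}.
    """
    active_jobs = queue_summary["NJOBS"]
    # Assumption: reaching the limit exactly is acceptable, exceeding it is not.
    return active_jobs + njobs <= global_job_limit


# Usage with a made-up queue state of 9950 active jobs.
summary = {"NJOBS": 9950, "JOBS": []}
room_for_fifty = can_submit_sketch(summary, njobs=50)
room_for_hundred = can_submit_sketch(summary, njobs=100)
```

Because other processes may submit jobs between the check and the submission, a small `njobs` keeps the window for overshooting the limit short.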

◆ condor_history()

def condor_history (   cls,
  class_ads = None,
  job_id = "",
  username = "" 
)
Simplistic interface to the ``condor_history`` command. Lets you request information about all jobs matching the filters
``job_id`` and ``username``. Note that setting job_id overrides username, which is then ignored.
The result is a JSON dictionary filled from the output of the ``-json`` ``condor_history`` option.

Parameters:
    class_ads (list[str]): A list of condor_history ClassAds that you would like information about.
        By default we give {cls.default_class_ads}. Increasing the number of class_ads increases the time taken
        by the condor_history call.
    job_id (str): String representation of the Job ID given by condor_submit during submission.
        If this argument is given then the output of this function will be only information about this job.
        If this argument is not given, then all jobs matching the other filters will be returned.
    username (str): By default we return information about only the current user's jobs. By giving
        a username you can access the job information of a specific user's jobs. By giving ``username='all'`` you will
        receive job information from all known user jobs matching the other filters. This is limited to 10000 records
        and isn't recommended.

Returns:
    dict: JSON dictionary of the form:

    .. code-block:: python

      {
        "NJOBS":<number of records returned by command>,
        "JOBS":[
                {
                 <ClassAd: value>, ...
                }, ...
               ]
      }

Definition at line 2180 of file backends.py.
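
The precedence between `job_id` and `username` described above can be written down as a small decision function. This is a sketch under assumptions: `resolve_job_filter` and its return values are hypothetical and do not correspond to any HTCondor command-line flags or to the code in backends.py.

```python
import getpass


def resolve_job_filter(job_id="", username="", current_user=None):
    """Decide which single filter applies, following the documented precedence.

    Returns a (kind, value) tuple where kind is "job", "all_users" or "user".
    """
    if job_id:
        # A job ID wins; any username is ignored.
        return ("job", job_id)
    if username == "all":
        # Special value requesting records from every user.
        return ("all_users", None)
    # Fall back to the given user, else to the user running the process.
    return ("user", username or current_user or getpass.getuser())


# Usage: the username is ignored because a job ID is given.
selected = resolve_job_filter(job_id="1234.0", username="alice")
```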

◆ condor_q()

def condor_q (   cls,
  class_ads = None,
  job_id = "",
  username = "" 
)
Simplistic interface to the `condor_q` command. Lets you request information about all jobs matching the filters
'job_id' and 'username'. Note that setting job_id overrides username, which is then ignored.
The result is a JSON dictionary filled from the output of the ``-json`` condor_q option.

Parameters:
    class_ads (list[str]): A list of condor_q ClassAds that you would like information about.
        By default we give {cls.default_class_ads}. Increasing the number of class_ads increases the time taken
        by the condor_q call.
    job_id (str): String representation of the Job ID given by condor_submit during submission.
        If this argument is given then the output of this function will be only information about this job.
        If this argument is not given, then all jobs matching the other filters will be returned.
    username (str): By default we return information about only the current user's jobs. By giving
        a username you can access the job information of a specific user's jobs. By giving ``username='all'`` you will
        receive job information from all known user jobs matching the other filters. This may be a LOT of jobs
        so it isn't recommended.

Returns:
    dict: JSON dictionary of the form:

    .. code-block:: python

      {
        "NJOBS":<number of records returned by command>,
        "JOBS":[
                {
                 <ClassAd: value>, ...
                }, ...
               ]
      }

Definition at line 2113 of file backends.py.
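
To make the returned structure concrete, the sketch below turns a JSON list of ClassAd records into the `NJOBS`/`JOBS` dictionary shown above. It works on a text string rather than calling `condor_q`; the function name `summarise_records`, the handling of empty output, and the sample records are assumptions for illustration.

```python
import json


def summarise_records(json_text):
    """Build {"NJOBS": ..., "JOBS": [...]} from a JSON list of ClassAd records.

    Assumption: empty text means that no jobs matched the query.
    """
    records = json.loads(json_text) if json_text.strip() else []
    return {"NJOBS": len(records), "JOBS": records}


# Made-up sample using the default ClassAds (JobStatus 1 = idle, 2 = running).
sample_output = """
[
  {"GlobalJobId": "schedd.example#101.0#1650000000", "JobStatus": 2, "Owner": "alice"},
  {"GlobalJobId": "schedd.example#102.0#1650000050", "JobStatus": 1, "Owner": "alice"}
]
"""
summary = summarise_records(sample_output)
running = [job for job in summary["JOBS"] if job["JobStatus"] == 2]
```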

◆ get_batch_submit_script_path()

def get_batch_submit_script_path (   self,
  job 
)
Construct the Path object of the .sub file that we will use to describe the job.

Reimplemented from Batch.

Definition at line 1942 of file backends.py.

◆ get_submit_script_path()

def get_submit_script_path (   self,
  job 
)
inherited
Construct the Path object of the bash script file that we will submit. It will contain
the actual job command, wrapper commands, setup commands, and any batch directives.

Definition at line 830 of file backends.py.

◆ submit() [1/2]

def submit (   self,
  job 
)
inherited
Base method for submitting collection jobs to the backend type. This MUST be
implemented for a correctly written backend class deriving from Backend().

Reimplemented in Local.

Definition at line 770 of file backends.py.

◆ submit() [2/2]

def submit (   self,
  job,
  check_can_submit = True,
  jobs_per_check = 100 
)
inherited
 

Definition at line 1172 of file backends.py.

Member Data Documentation

◆ default_backend_args

dictionary default_backend_args
static
Initial value:
= {
"universe": "vanilla",
"getenv": "false",
"request_memory": "4 GB", # We set the default requested memory to 4 GB to maintain parity with KEKCC
"path_prefix": "", # Path prefix for file path
"extra_lines": [] # These should be other HTCondor submit script lines like 'request_cpus = 2'
}

Default backend args for HTCondor.

Definition at line 1893 of file backends.py.
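
One way to picture how these defaults combine with job-specific arguments is the sketch below, which merges two dictionaries and renders `key = value` lines in the style of an HTCondor submit description. The helpers `merge_backend_args` and `render_submit_lines` are hypothetical, and the lines actually written by `_make_submit_file` may differ; `path_prefix` is deliberately not rendered here because its use is not described on this page.

```python
default_backend_args = {
    "universe": "vanilla",
    "getenv": "false",
    "request_memory": "4 GB",
    "path_prefix": "",
    "extra_lines": [],
}


def merge_backend_args(defaults, job_args):
    """Return a new dict in which job-specific values override the defaults."""
    return {**defaults, **job_args}


def render_submit_lines(args, executable):
    """Render a minimal list of submit description lines (illustrative only)."""
    lines = [f"executable = {executable}"]
    for key in ("universe", "getenv", "request_memory"):
        lines.append(f"{key} = {args[key]}")
    # Extra lines are passed through verbatim, e.g. 'request_cpus = 2'.
    lines.extend(args["extra_lines"])
    lines.append("queue")
    return lines


# Usage: a job that asks for more memory and two CPUs.
job_args = {"request_memory": "8 GB", "extra_lines": ["request_cpus = 2"]}
merged = merge_backend_args(default_backend_args, job_args)
submit_lines = render_submit_lines(merged, "submit.sh")
```

Building a new dictionary instead of updating the defaults in place keeps the class-level defaults unchanged for the next job.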

◆ global_job_limit

global_job_limit
inherited

The active job limit.

Init method for Batch Backend. Does some basic default setup.

This is 'global' because we want to prevent ourselves from accidentally submitting too many jobs across all current and previous submission scripts.

Definition at line 1134 of file backends.py.

◆ sleep_between_submission_checks

sleep_between_submission_checks
inherited

Seconds we wait before checking if we can submit a list of jobs.

Only relevant once we hit the global limit of active jobs, which is usually high.

Definition at line 1137 of file backends.py.


The documentation for this class was generated from the following file: