condor_submit_dag(1) Manage and queue jobs within a specified DAG for execution on remote machines

Synopsis

condor_submit_dag [-help | -version]

condor_submit_dag [-no_submit] [-verbose] [-force] [-maxidle NumberOfProcs] [-maxjobs NumberOfClusters] [-dagman DagmanExecutable] [-maxpre NumberOfPreScripts] [-maxpost NumberOfPostScripts] [-notification value] [-noeventchecks] [-allowlogerror] [-r schedd_name] [-debug level] [-usedagdir] [-outfile_dir directory] [-config ConfigFileName] [-insert_sub_file FileName] [-append Command] [-autorescue 0|1] [-dorescuefrom number] [-allowversionmismatch] [-no_recurse] [-do_recurse] [-update_submit] [-import_env] [-DumpRescue] [-valgrind] [-DontAlwaysRunPost] [-priority number] [-dont_use_default_node_log] [-schedd-daemon-ad-file FileName] [-schedd-address-file FileName] [-suppress_notification] [-dont_suppress_notification] [-DoRecovery] DAGInputFile1 [DAGInputFile2 ... DAGInputFileN]

Description

condor_submit_dag is the program for submitting a DAG (directed acyclic graph) of jobs for execution under HTCondor. The program enforces the job dependencies defined in one or more DAGInputFiles. Each DAGInputFile contains commands to direct the submission of jobs implied by the nodes of a DAG to HTCondor. Extensive documentation is in the HTCondor User Manual section on DAGMan.

Some options may be specified on the command line, in the configuration, or in a node job's submit description file. Command line options and configuration take precedence over settings from a submit description file. E-mail notification is an example. When the configuration variable DAGMAN_SUPPRESS_NOTIFICATION is set to its default value of True, and a node job's submit description file contains



 notification = Complete

e-mail will not be sent upon completion, because the value of DAGMAN_SUPPRESS_NOTIFICATION is enforced.
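The precedence rule above can be sketched in Python. This is a minimal illustration and not DAGMan's implementation: the function name effective_notification is hypothetical, and modeling suppression as the value Never is an assumption.

```python
def effective_notification(suppress_notification, submit_file_value):
    """Return the notification setting a node job ends up with.

    suppress_notification stands in for DAGMAN_SUPPRESS_NOTIFICATION.
    When it is True, the value from the submit description file is
    ignored (assumed here to behave like notification = Never).
    """
    if suppress_notification:
        return "Never"
    return submit_file_value


print(effective_notification(True, "Complete"))   # Never
print(effective_notification(False, "Complete"))  # Complete
```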

Options

-help

Display usage information and exit.

-version

Display version information and exit.

-no_submit

Produce the HTCondor submit description file for DAGMan, but do not submit DAGMan as an HTCondor job.

-verbose

Cause condor_submit_dag to give verbose error messages.

-force

Require condor_submit_dag to overwrite the files that it produces, if the files already exist. Note that dagman.out will be appended to, not overwritten. If new-style rescue DAG mode is in effect, and any new-style rescue DAGs exist, the -force flag will cause them to be renamed, and the original DAG will be run. If old-style rescue DAG mode is in effect, any existing old-style rescue DAGs will be deleted, and the original DAG will be run.

-maxidle NumberOfProcs

Sets the maximum number of idle procs allowed before condor_dagman stops submitting more node jobs. Note that for this argument, each individual proc within a cluster counts as one towards the limit, which is inconsistent with -maxjobs. Once idle procs start to run, condor_dagman will resume submitting jobs when the number of idle procs falls below the specified limit. NumberOfProcs is a non-negative integer. If this option is omitted, the number of idle procs is limited by the configuration variable DAGMAN_MAX_JOBS_IDLE (see 3.3.24), which defaults to 1000. To disable this limit, set NumberOfProcs to 0. Note that submit description files that queue multiple procs can cause the NumberOfProcs limit to be exceeded. Setting queue 5000 in the submit description file, where -maxidle is set to 250, will result in a cluster of 5000 new procs being submitted to the condor_schedd, not 250. In this case, condor_dagman will resume submitting jobs when the number of idle procs falls below 250.
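The throttling behavior described for -maxidle can be modeled with a short Python sketch. It is a simplified illustration under stated assumptions, not condor_dagman's actual code: the function name submit_nodes is hypothetical, every submitted proc is assumed to stay idle during the pass, and the limit is assumed to be checked only before each node is submitted.

```python
def submit_nodes(node_proc_counts, max_idle):
    """Submit node jobs in order until the idle-proc limit is reached.

    node_proc_counts: number of procs queued by each ready node's
        submit description file.
    max_idle: the -maxidle value; 0 disables the limit.
    Returns (nodes submitted, resulting idle procs).
    """
    idle = 0
    submitted = 0
    for procs in node_proc_counts:
        # The check happens before a node is submitted, so a whole
        # cluster goes in at once even if it overshoots the limit.
        if max_idle != 0 and idle >= max_idle:
            break
        idle += procs
        submitted += 1
    return submitted, idle


print(submit_nodes([5000, 10], 250))            # (1, 5000)
print(submit_nodes([100, 100, 100, 100], 250))  # (3, 300)
print(submit_nodes([100, 100, 100, 100], 0))    # (4, 400)
```

The first call mirrors the queue 5000 case: the single cluster is submitted whole, and nothing further is submitted until the idle count drops below 250.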

-maxjobs NumberOfClusters

Sets the maximum number of clusters within the DAG that will be submitted to HTCondor at one time. Note that for this argument, each cluster counts as one job, no matter how many individual procs are in the cluster. NumberOfClusters is a non-negative integer. If this option is omitted, the number of clusters is limited by the configuration variable DAGMAN_MAX_JOBS_SUBMITTED (see 3.3.24), which defaults to 0 (unlimited).
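The difference in how -maxjobs and -maxidle count work can be shown with a small Python sketch. The function name counts_toward_limits is hypothetical, and the input is assumed to list the idle procs of each currently submitted cluster.

```python
def counts_toward_limits(idle_procs_per_cluster):
    """Return what each limit would count for the given clusters.

    idle_procs_per_cluster: one entry per submitted cluster, giving
        the number of idle procs in that cluster.
    """
    return {
        # -maxjobs counts each cluster once, whatever its size.
        "maxjobs": len(idle_procs_per_cluster),
        # -maxidle counts every idle proc individually.
        "maxidle": sum(idle_procs_per_cluster),
    }


print(counts_toward_limits([5000]))     # {'maxjobs': 1, 'maxidle': 5000}
print(counts_toward_limits([1, 1, 1]))  # {'maxjobs': 3, 'maxidle': 3}
```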

-dagman DagmanExecutable

Allows the specification of an alternate condor_dagman executable to be used instead of the one found in the user's path. This must be a fully qualified path.

-maxpre NumberOfPreScripts

Sets the maximum number of PRE scripts within the DAG that may be running at one time. NumberOfPreScripts is a non-negative integer. If this option is omitted, the number of PRE scripts is limited by the configuration variable DAGMAN_MAX_PRE_SCRIPTS (see 3.3.24), which defaults to 20.

-maxpost NumberOfPostScripts

Sets the maximum number of POST scripts within the DAG that may be running at one time. NumberOfPostScripts is a non-negative integer. If this option is omitted, the number of POST scripts is limited by the configuration variable DAGMAN_MAX_POST_SCRIPTS (see 3.3.24), which defaults to 20.

-notification value

Sets the e-mail notification for DAGMan itself. This information will be used within the HTCondor submit description file for DAGMan. This file is produced by condor_submit_dag. See the description of notification within the condor_submit manual page for a specification of value.

-noeventchecks

This argument is no longer used; it is now ignored. Its functionality is now implemented by the DAGMAN_ALLOW_EVENTS configuration variable.

-allowlogerror

This optional argument has condor_dagman try to run the specified DAG, even in the case of detected errors in the job event log specification. As of version 7.3.2, this argument has an effect only on DAGs containing Stork job nodes.

-r schedd_name

Submit condor_dagman to a remote machine, specifically the condor_schedd daemon on that machine. The condor_dagman job will not run on the local condor_schedd (the submit machine), but on the specified one. This is implemented using the -remote option to condor_submit. Note that this option does not currently cause the input files for condor_dagman, or those of the individual nodes, to be taken along. It is assumed that any necessary files will be present on the remote computer, possibly via a shared file system between the local computer and the remote computer. It is also necessary that the user has appropriate permissions to submit a job to the remote machine; the permissions are the same as those required to use condor_submit's -remote option. If other options are desired, including transfer of other input files, consider using the -no_submit option, modifying the resulting submit file for specific needs, and then using condor_submit on that.

-debug level

Passes the level of debugging output desired to condor_dagman. level is an integer, with values of 0-7 inclusive, where 7 is the most verbose output. See the condor_dagman manual page for detailed descriptions of these values. If not specified, no -debug value is passed to condor_dagman.

-usedagdir

This optional argument causes condor_dagman to run each specified DAG as if condor_submit_dag had been run in the directory containing that DAG file. This option is most useful when running multiple DAGs in a single condor_dagman. Note that the -usedagdir flag must not be used when running an old-style Rescue DAG.

-outfile_dir directory

Specifies the directory in which the .dagman.out file will be written. The directory may be specified relative to the current working directory as condor_submit_dag is executed, or specified with an absolute path. Without this option, the .dagman.out file is placed in the same directory as the first DAG input file listed on the command line.

-config ConfigFileName

Specifies a configuration file to be used for this DAGMan run. Note that the options specified in the configuration file apply to all DAGs if multiple DAGs are specified. Further note that it is a fatal error if the configuration file specified by this option conflicts with a configuration file specified in any of the DAG files, if they specify one.

-insert_sub_file FileName

Specifies a file to insert into the .condor.sub file created by condor_submit_dag. The specified file must contain only legal submit file commands. Only one file can be inserted. (If both the DAGMAN_INSERT_SUB_FILE configuration variable and -insert_sub_file are specified, -insert_sub_file overrides DAGMAN_INSERT_SUB_FILE.) The specified file is inserted into the .condor.sub file before the Queue command and before any commands specified with the -append option.

-append Command

Specifies a command to append to the .condor.sub file created by condor_submit_dag. The specified command is appended to the .condor.sub file immediately before the Queue command. Multiple commands are specified by using the -append option multiple times. Each new command is given in a separate -append option. Commands with spaces in them must be enclosed in double quotes. Commands specified with the -append option are appended to the .condor.sub file after commands inserted from a file specified by the -insert_sub_file option or the DAGMAN_INSERT_SUB_FILE configuration variable, so the -append command(s) will override commands from the inserted file if the commands conflict.
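The ordering of inserted and appended commands can be illustrated with a Python sketch. It only models the documented order and the resulting override. It is not how condor_submit_dag writes the file: the function names are hypothetical, the example commands are placeholders, and the helper assumes that a later assignment to the same command wins.

```python
def assemble_submit_lines(generated, inserted, appended):
    """Return submit-file lines in the documented order.

    generated: commands condor_submit_dag writes itself (placeholder).
    inserted:  lines from the file named by -insert_sub_file.
    appended:  commands given with -append, in command-line order.
    """
    return list(generated) + list(inserted) + list(appended) + ["queue"]


def effective_value(lines, key):
    """Return the last value assigned to key, assuming later lines win."""
    value = None
    for line in lines:
        name, sep, rest = line.partition("=")
        if sep and name.strip().lower() == key.lower():
            value = rest.strip()
    return value


lines = assemble_submit_lines(
    generated=["universe = scheduler"],
    inserted=["accounting_group = group_a"],
    appended=["accounting_group = group_b"],
)
print(lines)
print(effective_value(lines, "accounting_group"))  # group_b
```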

-autorescue 0|1

Specifies whether to automatically run the newest rescue DAG for the given DAG file, if one exists (0 = false, 1 = true).

-dorescuefrom number

Forces condor_dagman to run the specified rescue DAG number for the given DAG. A value of 0 is the same as not specifying this option. Specifying a non-existent rescue DAG is a fatal error.

-allowversionmismatch

This optional argument causes condor_dagman to allow a version mismatch between condor_dagman itself and the .condor.sub file produced by condor_submit_dag (or, in other words, between condor_submit_dag and condor_dagman). WARNING! This option should be used only if absolutely necessary. Allowing version mismatches can cause subtle problems when running DAGs. (Note that, starting with version 7.4.0, condor_dagman no longer requires an exact version match between itself and the .condor.sub file. Instead, a "minimum compatible version" is defined, and any .condor.sub file of that version or newer is accepted.)

-no_recurse

This optional argument causes condor_submit_dag to not run itself recursively on nested DAGs (this is now the default; this flag has been kept mainly for backwards compatibility).

-do_recurse

This optional argument causes condor_submit_dag to run itself recursively on nested DAGs. The default is now that it does not run itself recursively; instead the .condor.sub files for nested DAGs are generated "lazily" by condor_dagman itself. DAG nodes specified with the SUBDAG EXTERNAL keyword or with submit file names ending in .condor.sub are considered nested DAGs. The DAGMAN_GENERATE_SUBDAG_SUBMITS configuration variable may be relevant.

-update_submit

This optional argument causes an existing .condor.sub file to not be treated as an error; rather, the .condor.sub file will be overwritten, but the existing values of -maxjobs, -maxidle, -maxpre, and -maxpost will be preserved.

-import_env

This optional argument causes condor_submit_dag to import the current environment into the environment command of the .condor.sub file it generates.

-DumpRescue

This optional argument tells condor_dagman to immediately dump a rescue DAG and then exit, as opposed to actually running the DAG. This feature is mainly intended for testing. The Rescue DAG file is produced whether or not there are parse errors reading the original DAG input file. The name of the file differs if there was a parse error.

-valgrind

This optional argument causes the submit description file generated for the submission of condor_dagman to be modified. The executable becomes valgrind run on condor_dagman, with a specific set of arguments intended for testing condor_dagman. Note that this argument is intended for testing purposes only. Using the -valgrind option without the necessary valgrind software installed will cause the DAG to fail. If the DAG does run, it will run much more slowly than usual.

-DontAlwaysRunPost

This option causes the submit description file generated for the submission of condor_dagman to be modified. It causes the -DontAlwaysRunPost option to be in the arguments to condor_dagman in the submit description file, which causes condor_dagman to use the return value from a PRE script to determine whether or not a POST script will run. By default, condor_dagman runs the POST script regardless of the return value of the PRE script. Versions of condor_dagman prior to 7.7.2 did not ignore the return value and would not run the POST script if the PRE script failed.

-priority number

Sets the minimum job priority of node jobs submitted and running under the condor_dagman job submitted by this condor_submit_dag command.

-dont_use_default_node_log

This option is disabled as of HTCondor version 8.3.1. This causes a compatibility error if the HTCondor version number of the condor_schedd is 7.9.0 or older. Tells condor_dagman to use the file specified by the job ClassAd attribute UserLog to monitor job status. If this command line argument is used, then the job event log file cannot be defined with a macro.

-schedd-daemon-ad-file FileName

Specifies a full path to a daemon ad file dropped by a condor_schedd. This allows submission to a specific scheduler, if several are available, without repeatedly querying the condor_collector. The value for this argument defaults to the configuration attribute SCHEDD_DAEMON_AD_FILE.

-schedd-address-file FileName

Specifies a full path to an address file dropped by a condor_schedd. This allows submission to a specific scheduler, if several are available, without repeatedly querying the condor_collector. The value for this argument defaults to the configuration attribute SCHEDD_ADDRESS_FILE.

-suppress_notification

Causes jobs submitted by condor_dagman to not send email notification for events. The same effect can be achieved by setting configuration variable DAGMAN_SUPPRESS_NOTIFICATION to True. This command line option is independent of the -notification command line option, which controls notification for the condor_dagman job itself.

-dont_suppress_notification

Causes jobs submitted by condor_dagman to defer to content within the submit description file when deciding to send email notification for events. The same effect can be achieved by setting configuration variable DAGMAN_SUPPRESS_NOTIFICATION to False. This command line flag is independent of the -notification command line option, which controls notification for the condor_dagman job itself. If both -dont_suppress_notification and -suppress_notification are specified on the same command line, the last argument is used.
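The "last argument is used" rule for the two suppression flags can be sketched in Python. This is an illustration only and not the tool's argument parser: the function name suppress_setting is hypothetical, flags are matched exactly as spelled here, and config_default stands in for the value of DAGMAN_SUPPRESS_NOTIFICATION.

```python
def suppress_setting(argv, config_default=True):
    """Return whether node-job notification is suppressed.

    argv: command-line arguments as a list of strings.
    config_default: stands in for DAGMAN_SUPPRESS_NOTIFICATION.
    """
    suppress = config_default
    for arg in argv:
        # Each occurrence overwrites the previous one, so the flag
        # that appears last on the command line decides the result.
        if arg == "-suppress_notification":
            suppress = True
        elif arg == "-dont_suppress_notification":
            suppress = False
    return suppress


print(suppress_setting(
    ["-suppress_notification", "-dont_suppress_notification", "diamond.dag"]
))  # False
print(suppress_setting(
    ["-dont_suppress_notification", "-suppress_notification", "diamond.dag"]
))  # True
```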

-DoRecovery

Causes condor_dagman to start in recovery mode. (This means that it reads the relevant job user log(s) and "catches up" to the given DAG's previous state before submitting any new jobs.)

Exit Status

condor_submit_dag will exit with a status value of 0 (zero) upon success, and it will exit with the value 1 (one) upon failure.
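A script that wraps condor_submit_dag can rely on this exit status. The following Python sketch shows one way to do so. The function name submit_dag and its runner parameter are hypothetical. The runner defaults to subprocess.run and is replaceable only so that the logic can be exercised without HTCondor installed.

```python
import subprocess


def submit_dag(dag_file, extra_args=(), runner=subprocess.run):
    """Run condor_submit_dag and report success from its exit status.

    extra_args: additional command-line options, for example ["-force"].
    runner: callable that executes the command and returns an object
        with a returncode attribute.
    """
    command = ["condor_submit_dag", *extra_args, dag_file]
    result = runner(command)
    # Exit status 0 means success; 1 means failure.
    return result.returncode == 0
```

On a machine with HTCondor installed, submit_dag("diamond.dag") would be expected to return True when the submission succeeds.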

Examples

To run a single DAG:


% condor_submit_dag diamond.dag

To run a DAG when it has already been run and the output files exist:


% condor_submit_dag -force diamond.dag

To run a DAG, limiting the number of idle node jobs in the DAG to a maximum of five:


% condor_submit_dag -maxidle 5 diamond.dag

To run a DAG, limiting the number of concurrent PRE scripts to 10 and the number of concurrent POST scripts to five:


% condor_submit_dag -maxpre 10 -maxpost 5 diamond.dag

To run two DAGs, each of which is set up to run in its own directory:


% condor_submit_dag -usedagdir dag1/diamond1.dag dag2/diamond2.dag

Author

Center for High Throughput Computing, University of Wisconsin-Madison

Copyright

Copyright (C) 1990-2015 Center for High Throughput Computing, Computer Sciences Department, University of Wisconsin-Madison, Madison, WI. All Rights Reserved. Licensed under the Apache License, Version 2.0.