arcsub(1) ARC Submission

DESCRIPTION

The arcsub command is used for submitting jobs to Grid-enabled computing resources.

SYNOPSIS

arcsub [options] [filename ...]

OPTIONS

-c, --cluster=name
select one or more computing elements: name can be an alias for a single CE, a group of CEs or a URL
-g, --index=name
select one or more registries: name can be an alias for a single registry, a group of registries or a URL
-R, --rejectdiscovery=URL
skip the service with the given URL during service discovery
-S, --submissioninterface=InterfaceName
only use this interface for submitting (e.g. org.nordugrid.gridftpjob, org.ogf.glue.emies.activitycreation, org.ogf.bes)
-I, --infointerface=InterfaceName
the computing element specified by URL at the command line should be queried using this information interface (possible options: org.nordugrid.ldapng, org.nordugrid.ldapglue2, org.nordugrid.wsrfglue2, org.ogf.glue.emies.resourceinfo)
-e, --jobdescrstring=String
job description string describing the job to be submitted
-f, --jobdescrfile=filename
job description file describing the job to be submitted
-j, --joblist=filename
the file storing information about active jobs (default ~/.arc/jobs.xml)
-o, --jobids-to-file=filename
the IDs of the submitted jobs will be appended to this file
-D, --dryrun
submit jobs as dry run (no submission to batch system)
--direct
submit directly - no resource discovery or matchmaking
-x, --dumpdescription
do not submit - dump job description in the language accepted by the target
-P, --listplugins
list the available plugins
-t, --timeout=seconds
timeout in seconds (default 20)
-z, --conffile=filename
configuration file (default ~/.arc/client.conf)
-d, --debug=debuglevel
FATAL, ERROR, WARNING, INFO, VERBOSE or DEBUG
-b, --broker=broker
selected broker: Random (default), FastestQueue or custom. Use -P to find possible options.
-v, --version
print version information
-?, --help
print help

ARGUMENTS

filename ...
job description files describing the jobs to be submitted

EXTENDED DESCRIPTION

arcsub is the key command when submitting jobs to Grid-enabled computing resources with the ARC client. By default arcsub is able to submit jobs to A-REX, CREAM and EMI ES enabled computing elements (CEs). As always, successful submission requires that you are authenticated at the targeted computing services. Since arcsub is built on a modular library, modules can be installed which enable submission to other targets, e.g. the classic ARC CE Grid-Manager.

Job submission can be accomplished by specifying a job description file to submit as an argument. By default arcsub then performs resource discovery on the Grid, matches the discovered resources against the job description and ranks them according to the chosen broker (--broker option). If no Grid environment has been configured, please contact your system administrator, or set one up yourself in the client configuration file (see the FILES section). Another option is to explicitly specify one or more registry services to arcsub using the --index option, which accepts a URL, alias or group. Alternatively, one or more specific CEs can be targeted using the --cluster option. If the same CEs are targeted most of the time, it is worthwhile to specify them in the client configuration as default services, which makes it unnecessary to specify them as arguments. In the same manner, aliases and groups defined in the configuration file can be used as arguments to the --cluster or --index options. In all of the above scenarios arcsub obtains resource information from the services, which is then used for matchmaking against the job description. That step can be avoided by specifying the --direct option, in which case the job description is submitted directly to the first specified endpoint.

The format of a classic GridFTP-based cluster URL is:
[ldap://]<hostname>[:2135/nordugrid-cluster-name=<hostname>,Mds-Vo-name=local,o=grid]
Only the hostname part has to be specified; the rest of the URL is generated automatically.
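
As an illustration of the pattern above, the following sketch expands a bare hostname into the full classic cluster URL. The function name classic_cluster_url and the host ce.example.com are hypothetical; the sketch only mirrors the documented format and is not the code arcsub itself uses.

```shell
# Hypothetical helper: fill in the optional parts of the classic
# cluster URL pattern for a given hostname.
classic_cluster_url() {
    host=$1
    printf 'ldap://%s:2135/nordugrid-cluster-name=%s,Mds-Vo-name=local,o=grid\n' \
        "$host" "$host"
}

# Print the expanded URL for a placeholder host.
classic_cluster_url ce.example.com
```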

The format of an A-REX URL is:
[https://]<hostname>[:<port>][/<path>]
Here the port is 443 by default, but the path cannot be guessed, so if it is not specified, then the service is assumed to live on the root path.
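
The defaults described above (https scheme, port 443, root path) can be sketched as a small normalisation function. The name arex_url is hypothetical, and the sketch assumes plain hostnames (no IPv6 literals) and no scheme other than https; it illustrates the documented defaults rather than arcsub's actual parsing.

```shell
# Hypothetical helper: make the implied parts of an A-REX URL explicit.
arex_url() {
    url=$1
    # Add the scheme when it is missing.
    case $url in
        https://*) ;;
        *) url="https://$url" ;;
    esac
    rest=${url#https://}
    # Split host[:port] from the path; no path means the root path.
    case $rest in
        */*) hostport=${rest%%/*}; path=/${rest#*/} ;;
        *)   hostport=$rest; path=/ ;;
    esac
    # Add the default port when none is given.
    case $hostport in
        *:*) ;;
        *) hostport="$hostport:443" ;;
    esac
    printf 'https://%s%s\n' "$hostport" "$path"
}

arex_url example.com/arex
```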

Job descriptions can also be specified using the --jobdescrfile option, which expects the file name of the description as argument, or the --jobdescrstring option, which expects the job description itself as a string argument. Both options can be specified multiple times, and one does not exclude the other. The job description languages supported by default are xRSL, JSDL and JDL.

If the job description is successfully submitted, a job-ID is returned and printed. This job-ID uniquely identifies the job while it is being executed. It is also possible that no CE matches the constraints defined in the description, in which case no submission will be done. Upon successful submission, the job-ID along with more technical job information is stored in the job-list file (described below). The stored information enables the job management commands of the ARC client to manage jobs easily, so the job-ID need not be saved manually. By default the job-list file is stored in the .arc directory in the home directory of the user; another location can be specified using the --joblist option, which takes the location of this file as argument. If the --joblist option was used during submission, it should also be specified in the subsequent commands when managing the job. If a Computing Element has multiple job submission interfaces (e.g. gridftp, EMI-ES, BES), the brokering algorithm will choose one of them. With the --submissioninterface option the requested interface can be specified; in that case only those Computing Elements which have that specific interface will be considered, and only that interface will be used to submit the jobs.
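
A minimal sketch of keeping the job list and the job IDs of one project together, using the --joblist and --jobids-to-file options described above. The function name submit_tracked, the ARCSUB variable and the directory layout are hypothetical. ARCSUB defaults to arcsub and exists only so that the command can be replaced (for example by echo) to inspect the arguments without submitting anything.

```shell
# Command to run; override with ARCSUB=echo to only print the arguments.
ARCSUB=${ARCSUB:-arcsub}

# Hypothetical wrapper: store the job list and job IDs under one directory.
# Usage: submit_tracked <directory> <arcsub arguments...>
submit_tracked() {
    workdir=$1
    shift
    "$ARCSUB" --joblist="$workdir/jobs.xml" \
              --jobids-to-file="$workdir/jobids.txt" "$@"
}

# Example (not executed here):
#   submit_tracked "$HOME/myproject" helloworld.jsdl
```

The same --joblist value then has to be passed to the job management commands used afterwards.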

As mentioned above, registry or index services can be specified with the --index option. Specifying one or multiple index servers instructs the arcsub command to query the servers for registered CEs. The returned CEs are then matched against the job description, those matching are ranked by the chosen broker (see below), and submission is tried in ranked order until it succeeds or the end of the list is reached. It might happen that a troublesome or undesirable CE from the returned list is selected for submission. In that case it is possible to reject that CE using the --rejectdiscovery option with the URL (or just the hostname) of the CE, which removes that CE from the targets for submission.

When multiple CEs are targeted for submission, the resource broker is used to filter out CEs which do not match the job description requirements and then to rank the remaining CEs. The broker used by default ranks the CEs randomly; a different broker can be chosen with the --broker option, which takes the name of the broker as argument. The broker type can also be specified in client.conf. The available brokers can be listed using arcsub -P. By default the following brokers are available:

Random (default)
Chooses a random CE matching the job requirements.
FastestQueue
Ranks matching CEs according to the length of the job queue at the CEs, ranking those with the shortest queue highest.
Benchmark
Ranks matching CEs according to a specified benchmark, which is selected by appending ':' and the name of the benchmark to the broker name. If no benchmark is given to the Benchmark broker, CEs will be ranked according to the 'specint2000' benchmark.
Data
Ranks matching CEs according to the amount of input data cached by each CE, by querying the CE. Only CEs with the A-REX BES interface support this operation.
Null
Chooses a random CE with no filtering of CEs at all.
PythonBroker
User-defined custom brokers can be created in Python. See the example broker SampleBroker.py or ACIXBroker.py (like Data broker but uses the ARC Cache Index) that come installed with ARC for more details of how to write your own broker. A PythonBroker is specified by --broker PythonBroker:Filename.Class:args, where Filename is the file containing the class Class which implements the broker interface. The directory containing this file must be in the PYTHONPATH. args is optional and allows specifying arguments to the broker.
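
The broker argument formats described in the list above can be sketched with two small string builders. The function names benchmark_broker and python_broker are hypothetical; the resulting strings follow the Benchmark:name and PythonBroker:Filename.Class:args forms given in this section.

```shell
# Hypothetical helper: Benchmark broker argument, using the documented
# default benchmark 'specint2000' when no name is given.
benchmark_broker() {
    printf 'Benchmark:%s\n' "${1:-specint2000}"
}

# Hypothetical helper: PythonBroker argument; the third parameter (args)
# is optional and only appended when non-empty.
python_broker() {
    printf 'PythonBroker:%s.%s%s\n' "$1" "$2" "${3:+:$3}"
}

benchmark_broker
python_broker SampleBroker MyBroker

# Example (not executed here):
#   arcsub -g registry.example.com -b "$(benchmark_broker)" helloworld.jsdl
```

Here MyBroker is a placeholder class name; the actual class defined in SampleBroker.py should be looked up in the installed file.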

Before submission, arcsub performs an intelligent modification of the job description (adding or modifying attributes, even converting the description language to fit the needs of the CE) ensuring that it is valid. The modified job description can be printed by specifying the --dumpdescription option. The format, i.e. job description language, of the printed job description cannot be specified, and will be that which will be sent to and accepted by the chosen target. Further information from arcsub can be obtained by increasing the verbosity, which is done with the --debug option where the default verbosity level is WARNING. Setting the level to DEBUG will show all messages, while setting it to FATAL will only show fatal log messages.

To validate your job description without actually submitting a job, use the --dryrun option: it will capture possible syntax or other errors, but will instruct the site not to submit the job for execution. Only the grid-manager (ARC0) and A-REX (ARC1) CEs support this feature.
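
A sketch of validating a description with --dryrun before the real submission. It assumes that arcsub returns a non-zero exit status when the dry run fails, which this page does not state. The function name validate_then_submit and the ARCSUB variable are hypothetical.

```shell
# Command to run; defaults to arcsub.
ARCSUB=${ARCSUB:-arcsub}

# Hypothetical wrapper: submit only if the dry run succeeds.
# Assumption: a failed dry run yields a non-zero exit status.
# Usage: validate_then_submit <CE> <job description file>
validate_then_submit() {
    ce=$1
    desc=$2
    if "$ARCSUB" --dryrun -c "$ce" "$desc"; then
        "$ARCSUB" -c "$ce" "$desc"
    else
        echo "dry run failed; not submitting $desc" >&2
        return 1
    fi
}

# Example (not executed here):
#   validate_then_submit ce.example.com helloworld.jsdl
```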

EXAMPLES

Submission of a job description file "helloworld.jsdl" to the Grid:
arcsub helloworld.jsdl

An information index server (registry) can also be queried for CEs to submit to:
arcsub -g registry.example.com helloworld.jsdl

Submission of a job description file "helloworld.jsdl" to ce.example.com:
arcsub -c ce.example.com helloworld.jsdl

Direct submission to a CE is done as:
arcsub --direct -c ce.example.com helloworld.jsdl

The job description can also be specified directly on the command line, as shown in this example using the xRSL job description language:
arcsub -c example.com/arex -e \
'&(executable="/bin/echo")(arguments="Hello World!")'

When submitting against CEs retrieved from information index servers it might be useful to do resource brokering:
arcsub -g registry.example.com -b FastestQueue helloworld.jsdl

If the job has a large input data set, it can be useful to send it to a CE where those files are already cached. The ACIX broker can be used for this:
arcsub -g registry.example.com -b PythonBroker:ACIXBroker.ACIXBroker:https://cacheindex.ndgf.org:6443/data/index helloworld.jsdl

Disregarding a specific CE when submitting against an information index server:
arcsub -g registry.example.com -R badcomputingelement.com/arex helloworld.jsdl

Dumping the job description is done as follows:
arcsub -c example.com/arex -x helloworld.jsdl

FILES

~/.arc/client.conf
Some options can be given default values by specifying them in the ARC client configuration file. Registry and computing element services can be specified in separate sections of the configuration file. Default services are marked by adding the 'default=yes' attribute to the section of the service; when no --cluster or --index options are given, these services will be used for submission. Each service has an alias and can be a member of any number of groups. Specifying the alias or the name of the group with the --cluster or --index options selects the corresponding services. A configuration file other than the default can be used with the --conffile option. Note that some installations also have a system client configuration file; attributes in the user's client configuration file take precedence over it, and command line options take precedence over configuration file attributes.
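
The following sketch writes an example configuration to a temporary file to illustrate the description above. Only the 'default=yes' attribute is taken from this page. The section names ([registry/...], [computing/...]), the url and group attributes, and all aliases and host names are assumptions for illustration and should be checked against the client configuration documentation of the installed ARC version.

```shell
# Write an illustrative configuration to a temporary file.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Assumed layout: one section per service, named by its alias.
[registry/myindex]
url=registry.example.com
default=yes

[computing/myce]
url=https://ce.example.com/arex
group=favourites
default=yes
EOF

# Show the services marked as default.
grep -c 'default=yes' "$conf"

# Examples (not executed here):
#   arcsub --conffile="$conf" helloworld.jsdl
#   arcsub --conffile="$conf" --cluster=favourites helloworld.jsdl
```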

~/.arc/jobs.xml
This is a local list of the user's active jobs. When a job is successfully submitted it is added to this list, and when it is removed from the remote cluster it is removed from this list. This list is used as the list of all active jobs when the user specifies the --all option to the various NorduGrid ARC user interface commands. A file other than the default can be used with the --joblist option.

ENVIRONMENT VARIABLES

X509_USER_PROXY
The location of the user's Grid proxy file. Shouldn't be set unless the proxy is in a non-standard location.

ARC_LOCATION
The location where ARC is installed can be specified by this variable. If it is not specified, the install location will be determined from the path to the command being executed. If this fails, a WARNING will be given stating the location which will be used.

ARC_PLUGIN_PATH
The location of ARC plugins can be specified by this variable. Multiple locations can be specified by separating them by : (; in Windows). The default location is $ARC_LOCATION/lib/arc (\ in Windows).
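
A short sketch of setting a plugin search path with two locations on a Unix-like system. The install prefix /usr and the directory $HOME/arc-plugins are hypothetical placeholders.

```shell
# Placeholder install prefix; kept if ARC_LOCATION is already set.
ARC_LOCATION=${ARC_LOCATION:-/usr}

# Two plugin locations separated by ':' (default location first).
ARC_PLUGIN_PATH="$ARC_LOCATION/lib/arc:$HOME/arc-plugins"
export ARC_LOCATION ARC_PLUGIN_PATH

# List the configured locations, one per line.
printf '%s\n' "$ARC_PLUGIN_PATH" | tr ':' '\n'
```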

COPYRIGHT

APACHE LICENSE Version 2.0

AUTHOR

ARC software is developed by the NorduGrid Collaboration (http://www.nordugrid.org). Please consult the AUTHORS file distributed with ARC. Please report bugs and feature requests to http://bugzilla.nordugrid.org