snakemake(1)
a Python-based language and execution environment for GNU Make-like workflows
DESCRIPTION
usage: snakemake [-h] [--snakefile FILE] [--gui [PORT]] [--cores [N]]
[--resources [NAME=INT [NAME=INT ...]]]
[--config [KEY=VALUE [KEY=VALUE ...]]] [--configfile FILE]
[--list] [--list-target-rules] [--directory DIR] [--dryrun]
[--printshellcmds] [--dag] [--rulegraph] [--d3dag]
[--summary] [--detailed-summary] [--touch] [--keep-going]
[--force] [--forceall] [--forcerun TARGET [TARGET ...]]
[--prioritize TARGET [TARGET ...]] [--allow-ambiguity]
[--cluster CMD | --cluster-sync CMD | --drmaa [ARGS]]
[--cluster-config FILE] [--immediate-submit]
[--jobscript SCRIPT] [--jobname NAME] [--reason]
[--stats FILE] [--nocolor] [--quiet] [--nolock] [--unlock]
[--cleanup-metadata [FILE [FILE ...]]] [--rerun-incomplete]
[--ignore-incomplete] [--list-version-changes]
[--list-code-changes] [--list-input-changes]
[--list-params-changes] [--latency-wait SECONDS]
[--wait-for-files [FILE [FILE ...]]] [--benchmark-repeats N]
[--notemp] [--keep-target-files]
[--allowed-rules ALLOWED_RULES [ALLOWED_RULES ...]]
[--timestamp] [--greediness GREEDINESS] [--print-compilation]
[--overwrite-shellcmd OVERWRITE_SHELLCMD] [--verbose]
[--debug] [--profile FILE] [--bash-completion] [--version]
[target [target ...]]
positional arguments:
- target
-
Targets to build. May be rules or files.
optional arguments:
- -h, --help
-
show this help message and exit
- --snakefile FILE, -s FILE
-
The workflow definition in a snakefile.
- --gui [PORT]
-
Serve an HTML-based user interface on the given port
(default: 8000). If possible, a browser window is
opened.
- --cores [N], --jobs [N], -j [N]
-
Use at most N cores in parallel (default: 1). If N is
omitted, the limit is set to the number of available
cores.
- --resources [NAME=INT [NAME=INT ...]], --res [NAME=INT [NAME=INT ...]]
-
Define additional resources that constrain the
scheduling analogously to threads (see above). A
resource is defined as a name and an integer value,
e.g. --resources gpu=1. Rules can use resources by
defining the resource keyword, e.g. resources: gpu=1.
If two rules each require 1 of the resource 'gpu' and
only 1 is available, the scheduler will not run them
in parallel.
- --config [KEY=VALUE [KEY=VALUE ...]]
-
Set or overwrite values in the workflow config object.
The workflow config object is accessible as variable
config inside the workflow. Default values can be set
by providing a JSON file (see Documentation).
- --configfile FILE
-
Specify or overwrite the config file of the workflow
(see the docs). Values specified in JSON or YAML
format are available in the global config dictionary
inside the workflow.
- --list, -l
-
Show available rules in given Snakefile.
- --list-target-rules, --lt
-
Show available target rules in given Snakefile.
- --directory DIR, -d DIR
-
Specify working directory (relative paths in the
snakefile will use this as their origin).
- --dryrun, -n
-
Do not execute anything.
- --printshellcmds, -p
-
Print out the shell commands that will be executed.
- --dag
-
Do not execute anything and print the directed acyclic
graph of jobs in the dot language. Recommended use on
Unix systems: snakemake --dag | dot | display
- --rulegraph
-
Do not execute anything and print the dependency graph
of rules in the dot language. This will be less
crowded than above DAG of jobs, but also show less
information. Note that each rule is displayed once,
hence the displayed graph will be cyclic if a rule
appears in several steps of the workflow. Use this if
above option leads to a DAG that is too large.
Recommended use on Unix systems: snakemake --rulegraph
| dot | display
- --d3dag
-
Print the DAG in D3.js compatible JSON format.
- --summary, -S
-
Print a summary of all files created by the workflow.
The summary has the following columns: filename,
modification time, rule version, status, plan. Rule
version contains the version the file was created with
(see the version keyword of rules), and status denotes
whether the file is missing, its input files are
newer, or the version or implementation of the rule
changed since file creation. The last column denotes
whether the file will be updated or created during the
next workflow execution.
- --detailed-summary, -D
-
Print a summary of all files created by the workflow.
The summary has the following columns: filename,
modification time, rule version, input file(s), shell
command, status, plan. Rule version contains the
version the file was created with (see the version
keyword of rules), and status denotes whether the file
is missing, its input files are newer, or the version
or implementation of the rule changed since file
creation. The input file and shell command columns are
self-explanatory. The last column denotes whether the
file will be updated or created during the next
workflow execution.
- --touch, -t
-
Touch output files (mark them up to date without
really changing them) instead of running their
commands. This is used to pretend that the rules were
executed, in order to fool future invocations of
snakemake. Fails if a file does not yet exist.
- --keep-going, -k
-
Go on with independent jobs if a job fails.
- --force, -f
-
Force the execution of the selected target or the
first rule regardless of already created output.
- --forceall, -F
-
Force the execution of the selected (or the first)
rule and all rules it is dependent on regardless of
already created output.
- --forcerun TARGET [TARGET ...], -R TARGET [TARGET ...]
-
Force the re-execution or creation of the given rules
or files. Use this option if you changed a rule and
want to have all its output in your workflow updated.
- --prioritize TARGET [TARGET ...], -P TARGET [TARGET ...]
-
Tell the scheduler to assign creation of given targets
(and all their dependencies) highest priority.
(EXPERIMENTAL)
- --allow-ambiguity, -a
-
Don't check for ambiguous rules and simply use the
first if several can produce the same file. This
allows the user to prioritize rules by their order in
the snakefile.
- --cluster CMD, -c CMD
-
Execute snakemake rules with the given submit command,
e.g. qsub. Snakemake compiles jobs into scripts that
are submitted to the cluster with the given command,
once all input files for a particular job are present.
The submit command can be decorated to make it aware
of certain job properties (input, output, params,
wildcards, log, threads and dependencies (see the
argument below)), e.g.: $ snakemake --cluster 'qsub
-pe threaded {threads}'.
- --cluster-sync CMD
-
The cluster submission command will block, returning
the remote exit status upon remote termination (for
example, this should be used if the cluster command is
'qsub -sync y' (SGE)).
- --drmaa [ARGS]
-
Execute snakemake on a cluster accessed via DRMAA.
Snakemake compiles jobs into scripts that are
submitted to the cluster with the given command, once
all input files for a particular job are present. ARGS
can be used to specify options of the underlying
cluster system, thereby using the job properties
input, output, params, wildcards, log, threads and
dependencies, e.g.: --drmaa ' -pe threaded {threads}'.
Note that ARGS must be given in quotes and with a
leading whitespace.
- --cluster-config FILE, -u FILE
-
A JSON or YAML file that defines the wildcards used in
'cluster' for specific rules, instead of having them
specified in the Snakefile. For example, for rule 'job'
you may define: { 'job' : { 'time' : '24:00:00' } } to
specify the time for rule 'job'.
- --immediate-submit, --is
-
Immediately submit all jobs to the cluster instead of
waiting for input files to be present. This will fail
unless you make the cluster aware of job dependencies,
e.g. via: $ snakemake --cluster 'sbatch --dependency
{dependencies}'. Assuming that your submit script (here
sbatch) outputs the generated job id to the first
stdout line, {dependencies} will be filled with the
space-separated ids of the jobs this job depends on.
- --jobscript SCRIPT, --js SCRIPT
-
Provide a custom job script for submission to the
cluster. The default script resides as 'jobscript.sh'
in the installation directory.
- --jobname NAME, --jn NAME
-
Provide a custom name for the jobscript that is
submitted to the cluster (see --cluster). NAME is
"snakejob.{rulename}.{jobid}.sh" per default. The
wildcard {jobid} has to be present in the name.
- --reason, -r
-
Print the reason for each executed rule.
- --stats FILE
-
Write stats about Snakefile execution in JSON format
to the given file.
- --nocolor
-
Do not use a colored output.
- --quiet, -q
-
Do not output any progress or rule information.
- --nolock
-
Do not lock the working directory.
- --unlock
-
Remove a lock on the working directory.
- --cleanup-metadata [FILE [FILE ...]], --cm [FILE [FILE ...]]
-
Cleanup the metadata of given files. That means that
snakemake removes any tracked version info, and any
marks that files are incomplete.
- --rerun-incomplete, --ri
-
Re-run all jobs the output of which is recognized as
incomplete.
- --ignore-incomplete, --ii
-
Ignore any incomplete jobs.
- --list-version-changes, --lv
-
List all output files that have been created with a
different version (as determined by the version
keyword).
- --list-code-changes, --lc
-
List all output files for which the rule body (run or
shell) has changed in the Snakefile.
- --list-input-changes, --li
-
List all output files for which the defined input
files have changed in the Snakefile (e.g. new input
files were added in the rule definition or files were
renamed). For listing input file modification in the
filesystem, use --summary.
- --list-params-changes, --lp
-
List all output files for which the defined params
have changed in the Snakefile.
- --latency-wait SECONDS, --output-wait SECONDS, -w SECONDS
-
Wait given seconds if an output file of a job is not
present after the job finished. This helps if your
filesystem suffers from latency (default 5).
- --wait-for-files [FILE [FILE ...]]
-
Wait --latency-wait seconds for these files to be
present before executing the workflow. This option is
used internally to handle filesystem latency in
cluster environments.
- --benchmark-repeats N
-
Repeat a job N times if marked for benchmarking
(default 1).
- --notemp, --nt
-
Ignore temp() declarations. This is useful when
running only a part of the workflow, since temp()
would lead to deletion of files that are probably
needed by other parts of the workflow.
- --keep-target-files
-
Do not adjust the paths of given target files relative
to the working directory.
- --allowed-rules ALLOWED_RULES [ALLOWED_RULES ...]
-
Only use given rules. If omitted, all rules in
Snakefile are used.
- --timestamp, -T
-
Add a timestamp to all logging output.
- --greediness GREEDINESS
-
Set the greediness of scheduling. This value between 0
and 1 determines how carefully jobs are selected for
execution. The default value (1.0) provides the best
speed and still acceptable scheduling quality.
- --print-compilation
-
Print the python representation of the workflow.
- --overwrite-shellcmd OVERWRITE_SHELLCMD
-
Provide a shell command that shall be executed instead
of those given in the workflow. This is for debugging
purposes only.
- --verbose
-
Print debugging output.
- --debug
-
Allow debugging of rules with e.g. PDB. This flag
allows breakpoints to be set in run blocks.
- --profile FILE
-
Profile Snakemake and write the output to FILE. This
requires yappi to be installed.
- --bash-completion
-
Output code to register bash completion for snakemake.
Put the following in your .bashrc (including the
backticks): `snakemake --bash-completion` or issue it
in an open terminal session.
- --version, -v
-
show program's version number and exit
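EXAMPLES
The effect of --resources on scheduling can be pictured with a small sketch. This is not Snakemake's scheduler; it is a simplified greedy selection written only to illustrate the rule stated under --resources, namely that two jobs each requiring gpu=1 are not run in parallel when only gpu=1 is available. The function name select_jobs and the job names are hypothetical.

```python
def select_jobs(jobs, limits):
    """Greedily pick jobs that can run together within the resource limits.

    jobs   -- list of (name, {resource: amount}) pairs, in priority order
    limits -- {resource: available amount}, as given via --resources
    Resources that have no entry in limits are treated as unconstrained
    (an assumption of this sketch).
    """
    remaining = dict(limits)
    selected = []
    for name, needs in jobs:
        # A job fits if every constrained resource it needs is still available.
        fits = all(
            amount <= remaining[res]
            for res, amount in needs.items()
            if res in remaining
        )
        if fits:
            for res, amount in needs.items():
                if res in remaining:
                    remaining[res] -= amount
            selected.append(name)
    return selected


# Two hypothetical jobs each need one GPU; a third needs no resources.
jobs = [("train_a", {"gpu": 1}), ("train_b", {"gpu": 1}), ("plot", {})]
print(select_jobs(jobs, {"gpu": 1}))  # ['train_a', 'plot']
print(select_jobs(jobs, {"gpu": 2}))  # ['train_a', 'train_b', 'plot']
```

With gpu=1, train_b has to wait until train_a releases the resource; raising the limit to gpu=2 lets both run in the same batch.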