SYNOPSIS
velvetoptimiser [options] -f 'velveth input line'
DESCRIPTION
VelvetOptimiser is a multi-threaded Perl script for automatically optimising the three primary parameter options (K, -exp_cov, -cov_cutoff) for the Velvet de novo sequence assembler.
OPTIONS
- --help
- This help.
- --version!
- Print version to stdout and exit. (default '0').
- --v|verbose+
- Verbose logging, includes all velvet output in the logfile. (default '0').
- --s|hashs=i
- The starting (lower) hash value (default '19').
- --e|hashe=i
- The end (higher) hash value (default '31').
- --x|step=i
- The step in the hash search. Minimum 2, must be an even number (default '2').
- --f|velvethfiles=s
- The file section of the velveth command line. (default '0').
- --a|amosfile!
- Turn on velvet's read tracking and amos file output. (default '0').
- --o|velvetgoptions=s
- Extra velvetg options to pass through, e.g. -long_mult_cutoff, -max_coverage etc. (default '').
- --t|threads=i
- The maximum number of simultaneous velvet instances to run. (default '4').
- --g|genomesize=f
- The approximate size of the genome to be assembled, in megabases. Only used in memory use estimation. If not specified, memory use estimation will not occur. If memory use is estimated, the results are shown and then the program exits. (default '0').
- --k|optFuncKmer=s
- The optimisation function used for k-mer choice. (default 'n50').
- --c|optFuncCov=s
- The optimisation function used for cov_cutoff optimisation. (default 'Lbp').
- --m|minCovCutoff=f
- The minimum cov_cutoff to be used. (default '0').
- --p|prefix=s
- The prefix for the output filenames, the default is the date and time in the format DD-MM-YYYY-HH-MM_. (default 'auto').
- --d|dir_final=s
- The name of the directory to put the final output into. (default '.').
- --z|upperCovCutoff=f
- The maximum coverage cutoff to consider as a multiplier of the expected coverage. (default '0.8').
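As an illustration of how the hash options combine, the following Python sketch builds a command line from the options listed above and enumerates the hash values implied by a start, end and step. It assumes that the search sweeps inclusively from the start value to the end value in increments of the step, and that the single-letter option aliases can be given with a single dash as in the SYNOPSIS; neither is confirmed by this page. The input file name reads.fastq and the helper names hash_values and build_command are hypothetical.

```python
import shlex


def hash_values(start=19, end=31, step=2):
    """Candidate hash values, ASSUMING an inclusive start..end sweep."""
    # The step rule comes from the option text: minimum 2, no odd numbers.
    if step < 2 or step % 2 != 0:
        raise ValueError("step must be an even number >= 2")
    if start > end:
        raise ValueError("start must not exceed end")
    return list(range(start, end + 1, step))


def build_command(velveth_files, start=19, end=31, step=2, threads=4):
    """Return an argument list that uses only options documented above."""
    hash_values(start, end, step)  # reuse the validation
    return [
        "velvetoptimiser",
        "-s", str(start),
        "-e", str(end),
        "-x", str(step),
        "-t", str(threads),
        "-f", velveth_files,  # passed as one argument, quoted by the shell
    ]


# Hypothetical input: one file of short reads in FASTQ format.
cmd = build_command("-short -fastq reads.fastq", start=21, end=29)
print(shlex.join(cmd))       # shell-quoted command line
print(hash_values(21, 29))   # hash values the sweep would visit
```

The list returned by build_command can be passed to subprocess.run unchanged, which avoids quoting problems with the velveth file section.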
The VelvetOptimiser assembly optimisation function can be built from the following variables.
- LNbp = The total number of Ns in large contigs
- Lbp = The total number of base pairs in large contigs
- Lcon = The number of large contigs
- max = The length of the longest contig
- n50 = The n50
- ncon = The total number of contigs
- tbp = The total number of base pairs in contigs
Examples are:
- 'Lbp' = Just the total base pairs in contigs longer than 1kb
- 'n50*Lcon' = The n50 times the number of long contigs.
- 'n50*Lcon/tbp+log(Lbp)' = The n50 times the number of long contigs divided by the total bases in all contigs plus the log of the number of bases in long contigs.
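
To make the arithmetic of these expressions concrete, the following Python sketch evaluates the three example functions against one set of assembly statistics. It is not how VelvetOptimiser itself computes scores (the tool is a Perl script). The statistics are invented for illustration, the helper name score is hypothetical, and log is assumed to be the natural logarithm.

```python
import math


def score(expression, stats):
    """Evaluate an optimisation expression against a dict of statistics.

    Illustration only: eval() is acceptable here because the expression
    is trusted input written by the user, and builtins are disabled.
    """
    names = dict(stats)
    names["log"] = math.log  # ASSUMPTION: natural logarithm
    return eval(expression, {"__builtins__": {}}, names)


# Invented statistics for a single hypothetical assembly.
stats = {
    "LNbp": 0,         # Ns in large contigs
    "Lbp": 4_500_000,  # base pairs in large contigs
    "Lcon": 120,       # number of large contigs
    "max": 310_000,    # length of the longest contig
    "n50": 85_000,     # the n50
    "ncon": 450,       # total number of contigs
    "tbp": 4_650_000,  # total base pairs in contigs
}

for expr in ("Lbp", "n50*Lcon", "n50*Lcon/tbp+log(Lbp)"):
    print(expr, "=", score(expr, stats))
```

In the third expression the usual operator precedence applies: the product n50*Lcon is divided by tbp first, and log(Lbp) is then added to the result.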