velvetg(1) de Bruijn graph construction, error removal and repeat resolution

DESCRIPTION

Usage: ./velvetg directory [options]
directory
: working directory name

Standard options:

-cov_cutoff <floating-point|auto>
: removal of low coverage nodes AFTER tour bus or allow the system to infer it
(default: no removal)
-ins_length <integer>
: expected distance between two paired end reads (default: no read pairing)
-read_trkg <yes|no>
: tracking of short read positions in assembly (default: no tracking)
-min_contig_lgth <integer>
: minimum contig length exported to contigs.fa file (default: hash length * 2)
-amos_file <yes|no>
: export assembly to AMOS file (default: no export)
-exp_cov <floating point|auto>
: expected coverage of unique regions or allow the system to infer it
(default: no long or paired-end read resolution)
-long_cov_cutoff <floating-point>: removal of nodes with low long-read coverage AFTER tour bus
(default: no removal)

Advanced options:

-ins_length* <integer>
: expected distance between two paired-end reads in the respective short-read dataset (default: no read pairing)
-ins_length_long <integer>
: expected distance between two long paired-end reads (default: no read pairing)
-ins_length*_sd <integer>
: est. standard deviation of respective dataset (default: 10% of corresponding length)
[replace '*' by nothing, '2' or '_long' as necessary]
-scaffolding <yes|no>
: scaffolding of contigs used paired end information (default: on)
-max_branch_length <integer>
: maximum length in base pair of bubble (default: 100)
-max_divergence <floating-point>: maximum divergence rate between two branches in a bubble (default: 0.2)
-max_gap_count <integer>
: maximum number of gaps allowed in the alignment of the two branches of a bubble (default: 3)
-min_pair_count <integer>
: minimum number of paired end connections to justify the scaffolding of two long contigs (default: 5)
-max_coverage <floating point>
: removal of high coverage nodes AFTER tour bus (default: no removal)
-coverage_mask <int>
: minimum coverage required for confident regions of contigs (default: 1)
-long_mult_cutoff <int>
: minimum number of long reads required to merge contigs (default: 2)
-unused_reads <yes|no>
: export unused reads in UnusedReads.fa file (default: no)
-alignments <yes|no>
: export a summary of contig alignment to the reference sequences (default: no)
-exportFiltered <yes|no>
: export the long nodes which were eliminated by the coverage filters (default: no)
-clean <yes|no>
: remove all the intermediary files which are useless for recalculation (default : no)
-very_clean <yes|no>
: remove all the intermediary files (no recalculation possible) (default: no)
-paired_exp_fraction <double>
: remove all the paired end connections which less than the specified fraction of the expected count (default: 0.1)
-shortMatePaired* <yes|no>
: for mate-pair libraries, indicate that the library might be contaminated with paired-end reads (default no)
-conserveLong <yes|no>
: preserve sequences with long reads in them (default no)

Output:

directory/contigs.fa
: fasta file of contigs longer than twice hash length
directory/stats.txt
: stats file (tab-spaced) useful for determining appropriate coverage cutoff
directory/LastGraph
: special formatted file with all the information on the final graph
directory/velvet_asm.afg
: (if requested) AMOS compatible assembly file