DBdust(1) description

SYNOPSIS

DBdust [-b] [-wint(64)] [-tdouble(2.)] [-mint(10)] path:db|dam

DESCRIPTION

Runs the symmetric DUST algorithm over the reads in the untrimmed DB path.db or path.dam, producing a track .path.dust[.anno,.data] that marks all intervals of low-complexity sequence. The scan window is of size -w, the threshold for being a low-complexity interval is -t, and only perfect intervals of size greater than -m are recorded. If the -b option is set, then the definition of low complexity takes into account the frequency of a given base. It is important to set this flag for genomes with a strong AT/GC bias, albeit the code is a tad slower. The command is incremental: if given a DB to which new data has been added since it was last run on that DB, it extends the track to include the new reads. The dust track, if present, is understood and used by DBshow(1), DBstats(1), and daligner(1).
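To give a feel for what the -t threshold is compared against, here is a minimal sketch of the triplet-based score from the published description of symmetric DUST. It is an assumption that DBdust scores sequence this way internally; the function name dust_score is hypothetical, and the sketch models neither the -w windowing, the search for perfect intervals, nor the -b base-frequency adjustment.

```python
from collections import Counter


def dust_score(seq: str) -> float:
    """Score a sequence by how often its overlapping triplets repeat.

    Assumed formula (published symmetric DUST description, not taken
    from the DBdust source): sum over triplets t of c_t*(c_t-1)/2,
    divided by (number of triplets - 1).
    """
    # All overlapping 3-mers of the sequence.
    triplets = [seq[i:i + 3] for i in range(len(seq) - 2)]
    if len(triplets) < 2:
        return 0.0  # too short to contain a repeated triplet
    counts = Counter(triplets)
    # Each triplet seen c times contributes c*(c-1)/2 repeated pairs.
    repeats = sum(c * (c - 1) // 2 for c in counts.values())
    return repeats / (len(triplets) - 1)


if __name__ == "__main__":
    for s in ("AAAAAAAAAA", "ACGTACGT", "ACGTTGCA"):
        print(s, dust_score(s))
```

Under this scoring a homopolymer run scores high and a sequence with no repeated triplet scores zero, so a larger -t would mark fewer intervals.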

DBdust can also be run over an untrimmed DB block, in which case it outputs a track whose file names contain the block number, e.g. .FOO.3.dust.anno and .FOO.3.dust.data, given FOO.3 on the command line. We call this a block track. This permits job parallelism in block-sized chunks, and the resulting sequence of block tracks can then be merged into a track for the entire untrimmed DB with Catrack.
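The naming convention for whole-DB and block tracks can be sketched as follows. The helper dust_track_files is hypothetical and only builds names following the pattern described above; placing the hidden track files in the same directory as the DB is an assumption, and the sketch expects the DB name without a block number when one is passed separately.

```python
import os
from typing import List, Optional


def dust_track_files(db_path: str, block: Optional[int] = None) -> List[str]:
    """Return the expected .anno/.data names of a dust track.

    db_path may be given with or without a .db/.dam extension.
    If block is given, the names are those of a block track.
    Assumption: track files sit next to the DB in the same directory.
    """
    directory, root = os.path.split(db_path)
    # Strip a trailing .db or .dam extension if present.
    for ext in (".db", ".dam"):
        if root.endswith(ext):
            root = root[:-len(ext)]
            break
    # Block tracks carry the block number after the DB root name.
    stem = root if block is None else f"{root}.{block}"
    return [os.path.join(directory, f".{stem}.dust.{kind}")
            for kind in ("anno", "data")]


if __name__ == "__main__":
    print(dust_track_files("FOO", block=3))
    print(dust_track_files("path.db"))
```

A job script could use such a list to check that every block track exists before merging them with Catrack.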